Background

Types of stress tests

Traffic Record

Record live traffic and replay it directly in the live region. This works only for “GET” endpoints, i.e., endpoints that perform no DB writes.
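As a minimal sketch of the restriction above, a replay tool could filter the recorded requests down to read-only ones before playing them back. The `RecordedRequest` type, its fields, and the `replayable` helper are hypothetical names used for illustration only; they do not come from any specific recording tool.

```python
from dataclasses import dataclass

# Assumption: only GET is treated as safe to replay, matching the rule above.
REPLAYABLE_METHODS = {"GET"}


@dataclass(frozen=True)
class RecordedRequest:
    method: str      # HTTP method as captured, e.g. "GET"
    path: str        # request path as captured
    offset_s: float  # seconds since the recording started


def replayable(requests):
    """Return the read-only requests, preserving the recorded order."""
    return [r for r in requests if r.method.upper() in REPLAYABLE_METHODS]


recorded = [
    RecordedRequest("GET", "/item/1", 0.0),
    RecordedRequest("POST", "/order", 0.5),   # writes to the DB, so it is dropped
    RecordedRequest("get", "/shop/2", 1.2),   # method case is normalised
]
to_replay = replayable(recorded)
```

Keeping `offset_s` on each request would let a replayer reproduce the original pacing, although the pacing logic itself is not shown here.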

Pros

Reproduces real user behaviour in the live environment on normal days exactly, because the requests are real
Covers the full chain of services triggered by the stressed endpoints

Cons

Cannot cover POST endpoints
Cannot cover services that the stressed endpoints do not pass through
Cannot simulate real user behaviour in the live environment during a campaign

Note that user behaviour during a campaign may differ from behaviour on normal days, e.g.,

during a campaign, heavily promoted items/shops become much hotter, while on normal days views are spread more evenly across items/shops
during a campaign, more vouchers are dispatched
during a campaign, QPS usually spikes because of user behaviour, e.g., when a Flash Sale starts or while a TV show is starting or ongoing

Stress Test by Service

Pros

Simple

Cons

May not simulate real user behaviour in the live environment, depending on how the requests (and their parameters) are simulated
Cannot cover POST endpoints
Cannot cover services that the stressed endpoints do not pass through
Cannot simulate real user behaviour in the live environment during a campaign

Why Full Chain Stress Test (FCST)

The objective of FCST is to enable stress testing in the live environment, in real regions, so that big-campaign conditions such as campaign traffic patterns can be simulated and stability improved.

How

In general, the stress test engine attaches a specific flag (the shadow flag) to the traffic it generates. Shadow traffic is handled exactly like live traffic, except that its data is stored in a different table/topic.
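The shadow-flag idea can be sketched as follows, assuming the flag travels as an HTTP header and is kept in a context variable for the duration of a request. The header name `X-Shadow-Traffic`, the `_shadow` table suffix, and all function names are assumptions made for this example, not the actual convention of any particular system.

```python
import contextvars

# Assumption: hypothetical header name carrying the shadow flag.
SHADOW_HEADER = "X-Shadow-Traffic"

# Per-request flag; defaults to live traffic.
_is_shadow = contextvars.ContextVar("is_shadow", default=False)


def is_shadow_request(headers):
    """Identify shadow traffic from the incoming request headers."""
    return headers.get(SHADOW_HEADER) == "1"


def handle(headers, handler):
    """Run handler with the shadow flag set for this request only."""
    token = _is_shadow.set(is_shadow_request(headers))
    try:
        return handler()
    finally:
        _is_shadow.reset(token)  # restore the previous value


def table_for(base):
    """Route shadow traffic to a separate table (suffix is an assumption)."""
    return f"{base}_shadow" if _is_shadow.get() else base


def outbound_headers():
    """Headers to forward downstream so the flag follows the full chain."""
    return {SHADOW_HEADER: "1"} if _is_shadow.get() else {}


shadow_table = handle({SHADOW_HEADER: "1"}, lambda: table_for("orders"))
live_table = handle({}, lambda: table_for("orders"))
```

Forwarding the flag on every outbound call is what lets downstream services make the same live-versus-shadow decision without knowing where the request originated.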

Components involved

Service
- Identify shadow traffic
Cache
- Remote Cache, e.g., Redis
- Local Cache, i.e., in-process cache
DB
Message Queue, e.g., Kafka
Cronjob
Library
- ORM: add logic so that shadow traffic reads from and writes to the shadow DB
- Cache Library: add logic so that shadow traffic reads from and writes to the shadow cache
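The library-level changes listed above could look roughly like the sketch below, which isolates shadow data in a cache by key prefix and in a message queue by topic name. The class `ShadowAwareCache`, the `shadow:` prefix, the `.shadow` topic suffix, and the in-memory dict standing in for a remote cache are all assumptions for illustration; a separate cache instance or cluster for shadow data would be an equally valid design.

```python
class ShadowAwareCache:
    """Cache wrapper that keeps shadow entries apart from live entries."""

    def __init__(self, backend=None, shadow_prefix="shadow:"):
        # A plain dict stands in for a remote cache such as Redis.
        self._backend = {} if backend is None else backend
        self._prefix = shadow_prefix

    def _key(self, key, shadow):
        # Shadow traffic gets a prefixed key, so it never touches live keys.
        return f"{self._prefix}{key}" if shadow else key

    def get(self, key, shadow=False):
        return self._backend.get(self._key(key, shadow))

    def set(self, key, value, shadow=False):
        self._backend[self._key(key, shadow)] = value


def topic_for(base, shadow):
    """Pick the message queue topic; shadow messages go to a separate topic."""
    return f"{base}.shadow" if shadow else base


cache = ShadowAwareCache()
cache.set("item:1", "live-value")
cache.set("item:1", "test-value", shadow=True)
```

Putting the routing inside the shared libraries means individual services only need to identify shadow traffic and pass the flag along, rather than reimplementing the isolation themselves.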

