Currently, Membase uses load testing tools such as memcachetest, brutis, and memslap both to test Membase's functionality and to simulate customer data patterns. The problem with these tools is that each is useful only in particular load testing scenarios, which makes testing a variety of customer scenarios difficult. Memcachetest and memslap, for example, are most useful as diagnostic tools for fundamental problems with either Membase or memcached. These two tools only test gets and sets, and they cannot be configured to work on subsets of the key space in memcached or Membase; for example, we cannot tell them to perform operations on 10% of the key space 99% of the time. Brutis can handle scenarios that memcachetest and memslap cannot, but its use of PHP introduces significant overhead on the client side and therefore skews the results. Moreover, although brutis can simulate customer data patterns that memcachetest and memslap cannot, it is still not a fully general tool and cannot simulate all customer data patterns.
As a result of the shortcomings of these tools, we found it necessary to design a new load testing framework that can simulate any customer data scenario we have now or will have in the future. Our framework will be built on top of YCSB (the Yahoo! Cloud Serving Benchmark), will be able to test any customer data scenario against Membase, and will provide a variety of metrics that describe how well Membase performed under its data load.
The tool should support load tests with different variations, including:
- Increased item count over time, for example, every 1 hour add 10K new items
- Define working set size, for example, working set for load test is 50K items out of the total item count
- Define what percentage of the working set churns (i.e., is replaced with non-working-set items) at what time interval, for example, a working set of 10K items with 1% churn every 6 hours
- Increased working set over time, for example, the working set should start at 10K and end at 50K, increasing by 1K every hour
- Controlled throughput, for example, hold throughput at 10K ops/sec as long as the server can sustain it
- Increased throughput, for example, increase throughput from 1K to 30K ops/sec, by 1K ops/sec every hour
- Fully memcapable and vBucket aware
- Define key prefix
- Define mix of verbs for the test
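As a sketch of how the working-set variations above might be driven, the following Python snippet picks keys mostly from a churning working set. All names and parameters here are hypothetical (the real framework would live inside a YCSB workload class), and churn is triggered by operation counts rather than the wall-clock intervals in the requirements, purely to keep the sketch simple:

```python
import random

class WorkingSetKeyChooser:
    """Illustrative sketch: pick keys mostly from a working set that
    churns over time. Names and parameters are hypothetical, not part
    of YCSB or Membase."""

    def __init__(self, total_items, working_set_size,
                 churn_fraction, churn_interval_ops, seed=None):
        self.total_items = total_items
        self.rng = random.Random(seed)
        # Start the working set as the first N items.
        self.working_set = list(range(working_set_size))
        self.churn_fraction = churn_fraction
        self.churn_interval_ops = churn_interval_ops
        self.ops = 0

    def churn(self):
        # Replace a fraction of the working set with items outside it.
        n = max(1, int(len(self.working_set) * self.churn_fraction))
        outside = sorted(set(range(self.total_items)) - set(self.working_set))
        replacements = self.rng.sample(outside, n)
        slots = self.rng.sample(range(len(self.working_set)), n)
        for slot, item in zip(slots, replacements):
            self.working_set[slot] = item

    def next_key(self, hot_probability=0.99):
        # e.g. operate on the working set 99% of the time.
        self.ops += 1
        if self.ops % self.churn_interval_ops == 0:
            self.churn()
        if self.rng.random() < hot_probability:
            return self.rng.choice(self.working_set)
        return self.rng.randrange(self.total_items)
```

Growing the item count or working set over time would follow the same pattern: a scheduler periodically raises `total_items` or appends to `working_set`.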
The tool should provide meaningful stats in two forms:
- Summary stats on stdout (total ops/sec, 95th and 99th latency percentiles)
- Detailed stats written to a file, including:
  - Total and per-verb ops/sec
  - Latency histograms for total and per-verb operations (at the 10%, 50%, 90%, 95%, 99%, and 99.9% percentiles)
  - The file should be grepable so that results can easily be aggregated and imported into Excel for further analysis
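A minimal sketch of the grepable output, assuming a nearest-rank percentile calculation and a hypothetical `STAT <verb> <metric> <value>` line layout (the actual file format is not specified here):

```python
import math

def percentile(sorted_vals, pct):
    """Nearest-rank percentile over an ascending list of latencies."""
    if not sorted_vals:
        return 0.0
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_vals)))
    return sorted_vals[rank - 1]

def stats_lines(verb, latencies_ms, elapsed_sec):
    """One grepable line per metric: 'STAT <verb> <metric> <value>'."""
    lats = sorted(latencies_ms)
    lines = ["STAT %s ops_per_sec %.1f" % (verb, len(lats) / elapsed_sec)]
    for pct in (10, 50, 90, 95, 99, 99.9):
        lines.append("STAT %s latency_p%s %.3f"
                     % (verb, pct, percentile(lats, pct)))
    return lines
```

One metric per line means a single `grep "STAT get"` pulls every per-verb number for pasting into Excel.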
Item and operation selection should support the standard YCSB distributions:
- Uniform: Choose an item uniformly at random. For example, when choosing a record, all records in the database are equally likely.
- Zipfian: Choose an item according to the Zipfian distribution. For example, when choosing a record, some records will be extremely popular (the head of the distribution) while most of the records will be unpopular (the tail of the distribution).
- Latest: Like the Zipfian distribution, except the most recently inserted records are at the head of the distribution.
- Multinomial: Probabilities for each item can be specified. For example, we might assign a probability of 0.95 to the Read operation, a probability of 0.05 to the Update operation, and a probability of 0 to Scan and Insert. The result would be a read-heavy workload.
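The four distributions above could be sketched as follows. This is illustrative Python, not YCSB's actual implementation (YCSB computes its Zipfian generator analytically rather than from an explicit CDF table as done here):

```python
import bisect
import random

def uniform_choice(n, rng):
    # All n items are equally likely.
    return rng.randrange(n)

def zipfian_cdf(n, theta=0.99):
    """CDF over items 0..n-1 with Zipfian weights 1/(rank+1)^theta."""
    weights = [1.0 / (i + 1) ** theta for i in range(n)]
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    return cdf

def zipfian_choice(cdf, rng):
    # Item 0 is the most popular (the head of the distribution).
    return min(bisect.bisect_left(cdf, rng.random()), len(cdf) - 1)

def latest_choice(cdf, num_inserted, rng):
    # Like Zipfian, but the most recently inserted record is the head.
    # Assumes len(cdf) == num_inserted.
    return num_inserted - 1 - zipfian_choice(cdf, rng)

def multinomial_choice(weighted_ops, rng):
    """weighted_ops: list of (operation, probability) pairs summing to 1."""
    r = rng.random()
    acc = 0.0
    for op, p in weighted_ops:
        acc += p
        if r < acc:
            return op
    return weighted_ops[-1][0]
```

For example, `multinomial_choice([("read", 0.95), ("update", 0.05)], rng)` yields the read-heavy verb mix described above, while `zipfian_choice` concentrates accesses on the low-numbered (head) keys.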