Cache Miss Ratio in Case of a Low Resident Ratio

Hi Team,

We populated 24 million records (--key-prefix a) into one of the CB buckets (couchstore storage engine); the resident ratio was 79 percent. We then populated another 22 million records (--key-prefix b) so that all of the data with --key-prefix a would be pushed down to disk, and the resident ratio dropped to 34 percent. When performing get operations for --key-prefix a, every get should be a disk read and the cache miss ratio in the CB UI should be close to 100 percent, but we can see that the cache miss ratio is very low (0.251 percent). Because of this we are not able to see meaningful results for the r/s and rkB/s metrics through the iostat OS command.

Could you please advise why we are seeing such a low cache miss ratio?

We are using the pillowfight commands below to populate the data.

/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 100 --batch-size 1 --num-items 1000000000 --sequential --num-threads 20 --rate-limit 8000 --key-prefix a;
/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 100 --batch-size 1 --num-items 1000000000 --sequential --num-threads 20 --rate-limit 8000 --key-prefix b;
/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 0 --batch-size 1 --num-items 24000000 --num-threads 20 --rate-limit 8000 --key-prefix a --no-population;

Thanks,
Debasis

We kept these commands running for almost 45 minutes to populate the data.

So you killed the first command after it inserted 24M documents, and killed the second command after it inserted 22M documents. OK.

What appears to be happening is all 20 threads are fetching the same documents. Use --num-threads 1 and you should see a much higher cache-miss rate. With 0.009% resident ratio, I was seeing 10% cache-miss with 20 threads, 100% cache-miss with 1 thread.

Also - omitting --rate-limit might save you some time.

Edit: there is a very recent option --rand-space-per-thread that will give each thread a different random sequence when --sequential is not specified. If your cbc-pillowfight recognizes that option, you can use it.
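For example (just a sketch, reusing the placeholder host, bucket, and credentials from your own commands), the read run with a single thread would look something like:

/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 0 --batch-size 1 --num-items 24000000 --num-threads 1 --key-prefix a --no-population;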

Thanks @mreiche. I need help with the two questions below.

  1. You say that my command uses all 20 threads to fetch the same docs. Is there any option in pillowfight so that the 20 threads fetch different docs?

"What appears to be happening is all 20 threads are fetching the same documents"

  2. While doing random access with the pillowfight command below, can we give a --start-at option like the one the sequential scan provides?

/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 0 --batch-size 1 --num-items 24000000 --num-threads 20 --rate-limit 8000 --key-prefix a --no-population;

Thanks,
Debasis

See my previous response for the option.

  2. While doing random access with the pillowfight command below, can we give a --start-at option like the one the sequential scan provides?

Not according to --help.

   --start-at                    For sequential access, set the first item [Default=0]

So you mean to say that we should use --num-threads 1 to fetch docs, and if I want 20 concurrent readers fetching different docs, then I need to run 20 instances of pillowfight, each with --num-threads 1. Please confirm.

Thanks,
Debasis

Hi @mreiche ,

While going through the docs for pillowfight, the --num-threads definition says that each thread is assigned its own client object. Is there any way, in the CB UI or via CLI commands, to verify how Couchbase behaves with respect to --num-threads while performing a pillowfight test on the CB cluster? Please suggest.

https://docs.couchbase.com/sdk-api/couchbase-c-client-2.4.8/md_doc_cbc-pillowfight.html

  • -t, --num-threads=NTHREADS: Set the number of threads (and thus the number of client instances) to run concurrently. Each thread is assigned its own client object.

Why not get the latest version which has the --rand option?

Or write your own load driver client.

Here’s the source for pillowfight. Knock yourself out.

If you need more assistance open a case with customer support.

We are using pillowfight version 3.2.5 and we did not find any option such as --rand. The CB version we are using is 7.1.

Could you please let us know in which version of pillowfight the mentioned option is available?

Thanks,
Debasis

Hello @mreiche , may I ask a question?
When we are generating a load using pillowfight, we see a very high number of get_misses. My theory is as follows:
pillowfight generates documents with keys of 20 characters. With prefix "a", the first document will be a + 19 zeroes and the last document will be a + 19 nines. Hope that is correct. Let's call this the "FULL RANGE".
Let's say 20m documents have been generated by pillowfight using --sequential. Then the key range existing in the database will be the first 20m docs from the FULL RANGE. Let's call this the "20m RANGE".
After that, if pillowfight is doing only reads (GETs) and --sequential is not used, then it will do a GET for any key from the FULL RANGE and not just the 20m RANGE. That would explain the huge number of get_misses, because most of the keys being fetched do not exist in the database.

If the above theory is correct, do you know if there is a way to specify a limited range (like the 20m RANGE) to fetch from?

Many Thanks

Apparently not released yet. So either use 1 thread (in multiple cbc-pillowfight executions if desired), or write your own load driver: GitHub - mikereiche/loaddriver

" Build couchbase-server-7.2.0-5304 contains libcouchbase commit 8af01cb with commit message:

CCBC-1546: Add arg to allow threads to work from different rand numbers"
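As a rough sketch only (again reusing the placeholders from the commands in the first post), several single-thread executions could be started in the background from a shell:

for i in 1 2 3 4; do
/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 0 --batch-size 1 --num-items 24000000 --num-threads 1 --key-prefix a --no-population &
done
wait

Whether the separate processes actually end up with different random sequences is something you'd have to confirm from the cache-miss ratio.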

Yes @mreiche, due to the high number of get_misses only a few records are getting read from disk. Thus the cache miss ratio in the UI shows a very low percentage. The commands we executed are mentioned in the initial post. Thanks for your valuable time.

Below is the get_misses series captured while executing one of the test loads.

"get_misses":[7497.4,7457.6,7445.8,7445.8,7445.8,7445.8,7445.8,7445.8,7445.8,7455,7455,7501.400000000001,7457,7457,7457,7457,7457,7457,7457,7545,7545,7532.5,7555.3,7555.3,7555.3,7555.3,7555.3,7555.3,7555.3,7530.400000000001,7530.400000000001,7482.5,7481,7481,7481,7481,7481,7481,7481,7461.4,7461.4,7468.7,7449.9,7449.9,7449.9,7449.9,7449.9,7449.9,7449.9,7450.9,7450.9,7471.1,7496,7496,7496,7496,7496,7496,7496,7443.1]

Thanks,
Debasis

"Yes @mreiche, due to the high number of get_misses only a few records are getting read from disk. Thus the cache miss ratio in the UI shows a very low percentage."

get_misses and cache_misses are mutually exclusive. A get_miss is a get on a document that does not exist. A cache_miss is a get on a document that exists, but is not in RAM.

  1. cache-hits as described by the OP will result in documents not being read from disk. Use --num-threads 1. If multiple concurrent reads are required, try executing multiple copies of cbc-pillowfight (the separate cbc-pillowfight executions might use different random sequences - I don’t know, you’ll be able to tell from the cache-miss ratio). And there is --rand-space-per-thread coming in a future release.

  2. get_misses - asking pillowfight to read documents that you haven’t written (why would you do that??) will also result in documents not being read from disk.

There is no magical “FULL RANGE” that pillowfight uses. If you ask pillowfight to write one million (num-items) sequential documents (with num-cycles at least num-items), it will write one million sequential documents (unless you kill it before it is done). If you ask pillowfight to randomly read one million (num-items) documents, it will not attempt to read documents outside that one million. If you generate one million (num-items) documents with pillowfight, and then ask pillowfight to randomly read from ten million (num-items) documents, nine out of ten of those reads are going to be a get_miss - that’s on you.
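As an illustration (a sketch only, using the same placeholder host, bucket, and credentials as the commands earlier in the thread), writing one million documents to completion and then reading randomly with the same --num-items keeps every read inside the range that was actually written; note the --num-cycles on the write so it stops after one million items rather than having to be killed:

/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 100 --batch-size 1 --num-items 1000000 --num-cycles 1000000 --sequential --num-threads 1 --key-prefix a;
/opt/couchbase/bin/cbc-pillowfight -U couchbase://xx.xx.xx.xx/data1 -u -P --min-size 1000 --max-size 1000 --json --set-pct 0 --batch-size 1 --num-items 1000000 --num-threads 1 --key-prefix a --no-population;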

Here’s a handy curl command to get get_misses and the cache miss rate. There are 60 entries for each, as these are the numbers for the last 60 seconds.

curl -k -s -u Administrator:password http://localhost:8091/pools/default/buckets/my_bucket/stats | jq .op.samples.ep_cache_miss_rate,.op.samples.get_misses
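If you just want a single number rather than the 60 one-second samples, the same output can be averaged with a slightly different jq filter (a minor variation on the command above):

curl -k -s -u Administrator:password http://localhost:8091/pools/default/buckets/my_bucket/stats | jq '.op.samples.ep_cache_miss_rate | add / length'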

We are using pillowfight version 3.2.5 and we did not find any option such as --rand. The CB version we are using is 7.1.

See my previous reply:

Apparently not released yet. So either use 1 thread (in multiple cbc-pillowfight executions if desired), or write your own load driver: GitHub - mikereiche/loaddriver

" Build couchbase-server-7.2.0-5304 contains libcouchbase commit 8af01cb with commit message:

CCBC-1546: Add arg to allow threads to work from different rand numbers"

Could you please let us know in which version of pillowfight the mentioned option is available?


See the post immediately above yours.
