Performance of retrieving keys that miss the cache

Hi all,
I’m evaluating Couchbase right now and I’m not very satisfied with the getBulk performance for values stored on the hard drive.
The tests are run on a machine with 10 GB of RAM.
I’m trying to speed up retrieval of 1200 keys from a 60 GB bucket that is stored mostly on the hard drive.
As the bucket grows, the retrieval rate for keys that miss the cache drops from 300-400 to 30-40 keys per second. One value is approximately 6000 bytes.
Am I correct that retrieval performance for keys that miss the cache should not decrease significantly as the total number of keys in the bucket grows?
What would you suggest to speed up fetching data from HDD/SSD?
Thank you!

Are you having problems only when getting the documents the first time, when the keys have to come from the hard drive? Do you get faster results the second time, when they are served from memory?
What method are you using to get the data from CB? An SDK? The memcached protocol?

Yes, I have problems with the hard drive only. The second time, when the keys are served from memory, it works fast. I understand that Couchbase was designed as an in-memory database, but as my data grows I can’t keep all of it in memory.
Could you confirm or deny that the delay of retrieving from the hard drive should not grow significantly as the total data size grows?
If you confirm, I will try to identify the problem with my hard drive.
I use the getBulk method from the CB Java client.
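
For anyone reproducing the cold-versus-warm comparison discussed above, here is a minimal sketch of timing a bulk read in batches. It does not call the Couchbase client directly: the `bulkGet` function is a hypothetical stand-in for whatever bulk call you use (for example a wrapper around the client's getBulk), and the class name, method names, and batch size are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class BulkReadTimer {

    /** Splits the keys into consecutive batches of at most batchSize keys. */
    static List<List<String>> partition(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    /**
     * Fetches all keys batch by batch and returns the observed rate in
     * retrieved keys per second. bulkGet is a stand-in for the real bulk call.
     */
    static double keysPerSecond(List<String> keys, int batchSize,
                                Function<Collection<String>, Map<String, Object>> bulkGet) {
        long found = 0;
        long start = System.nanoTime();
        for (List<String> batch : partition(keys, batchSize)) {
            found += bulkGet.apply(batch).size();
        }
        // Guard against a zero interval on very fast (in-memory) runs.
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return found * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        // Hypothetical keys; replace with keys that exist in your bucket.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            keys.add("key-" + i);
        }
        // Fake fetcher so the sketch runs without a cluster.
        Function<Collection<String>, Map<String, Object>> fake = batch -> {
            Map<String, Object> result = new HashMap<>();
            for (String k : batch) {
                result.put(k, new byte[6000]);
            }
            return result;
        };
        // The first pass would be "cold" against a real cluster, the second "warm".
        System.out.println("pass 1 keys/s: " + keysPerSecond(keys, 100, fake));
        System.out.println("pass 2 keys/s: " + keysPerSecond(keys, 100, fake));
    }
}
```

Running the same key set twice against a real bucket and comparing the two rates should show how much of the delay comes from disk reads rather than from the client or the network.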

Yes. If xx% of your calls for documents are served from the hard drive, those calls will be delayed no matter how big your cluster is.
If you are concerned about cache misses and the delay from the hard drive, I would recommend increasing the memory available for your working set.
This link talks more about it.
Here is a link that explains how the working set works.

To add some information to HouseHippo’s comment: most of the time, when a Couchbase cluster starts to see a high rate of cache misses (what counts as high depends on your application), it is necessary to add more memory so that it can hold your working set. You can do that in two ways:

  • add more RAM quota to your bucket
  • add new nodes to your cluster (this has additional benefits: you add more RAM and also distribute the reads/writes across more queues)
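
To put rough numbers on the figures from this thread, here is a back-of-the-envelope sketch. It assumes the full 10 GB of RAM is available to the bucket (an upper bound, since real quotas are smaller and metadata takes space) and that the 1200 keys are picked uniformly at random. The class and method names are made up for illustration.

```java
class WorkingSetEstimate {

    /** Upper bound on the fraction of the data that can be resident in RAM. */
    static double residentFraction(long ramBytes, long dataBytes) {
        if (dataBytes <= 0) {
            throw new IllegalArgumentException("dataBytes must be positive");
        }
        return Math.min(1.0, (double) ramBytes / dataBytes);
    }

    /** Expected number of cache misses for uniformly random key access. */
    static long expectedMisses(int requestedKeys, double residentFraction) {
        return Math.round(requestedKeys * (1.0 - residentFraction));
    }

    /** Seconds needed to serve the misses at a given disk-bound rate. */
    static double secondsForMisses(long misses, double keysPerSecond) {
        return misses / keysPerSecond;
    }

    public static void main(String[] args) {
        long gb = 1_000_000_000L;
        // Figures from the thread: 10 GB RAM, 60 GB bucket, 1200 keys per request.
        double resident = residentFraction(10 * gb, 60 * gb);
        long misses = expectedMisses(1200, resident);
        System.out.printf("resident fraction (upper bound): %.3f%n", resident);
        System.out.printf("expected misses out of 1200: %d%n", misses);
        // Miss rates reported in the thread: 30-40 and 300-400 keys per second.
        System.out.printf("time at 40 keys/s: %.1f s%n", secondsForMisses(misses, 40));
        System.out.printf("time at 400 keys/s: %.1f s%n", secondsForMisses(misses, 400));
    }
}
```

Under these assumptions at most about one sixth of the data fits in memory, so most of a random 1200-key request has to come from disk. That is why adding RAM quota or nodes, as suggested above, has such a large effect.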