Should I split my data into multiple buckets?

Hi folks,

I have two document types linked by a common key (customer id). They have quite different usage patterns, though, so I’m wondering whether I’d benefit from provisioning a separate bucket for each.

  • Customer Record - this will be read frequently but updated infrequently. I will probably use strong consistency so I can read my own writes (see the first sketch after this list). The average doc size will probably be about 10 KB, and let’s say I expect 1 million+ records

  • Customer Event - this will be read infrequently but will require fast writes. I anticipate these documents being immutable, so each write is an insert, not an update. The average doc size will probably be about 1-2 KB, and let’s say we get 1 million+ records per day and retain one year’s worth (see the second sketch below). It is possible this data will be synchronised with Apache Spark for real-time analysis. Some events will necessitate an update to the related Customer Record
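For the Customer Record case, here is a minimal sketch of what I have in mind, using the Java SDK 2.x (the bucket name, key, and fields are just placeholders I made up). Key-value reads always go to the active node, so a plain get sees the write; the `PersistTo.MASTER` requirement additionally waits for it to reach disk:

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.PersistTo;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class CustomerRecordExample {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("localhost");
        Bucket records = cluster.openBucket("customer-records"); // hypothetical bucket name

        // Upsert the record and wait until it is persisted on the active node.
        JsonObject content = JsonObject.create()
                .put("name", "Jane Doe")
                .put("tier", "gold");
        records.upsert(JsonDocument.create("customer::1001", content), PersistTo.MASTER);

        // KV reads are served by the active node, so this get reflects the write above.
        JsonDocument doc = records.get("customer::1001");
        System.out.println(doc.content());

        cluster.disconnect();
    }
}
```

And for the Customer Event case, insert-only writes with an expiry could cover the one-year retention without a separate purge job. One wrinkle in the 2.x SDK: an expiry larger than 30 days is interpreted as an absolute Unix timestamp, so a one-year TTL has to be passed as a timestamp (again, bucket and field names are illustrative):

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class CustomerEventExample {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("localhost");
        Bucket events = cluster.openBucket("customer-events"); // hypothetical bucket name

        // Expiries beyond 30 days must be absolute Unix timestamps (in seconds).
        int oneYearFromNow = (int) (System.currentTimeMillis() / 1000 + 365L * 24 * 3600);

        JsonObject event = JsonObject.create()
                .put("customerId", "customer::1001")
                .put("type", "login")
                .put("ts", System.currentTimeMillis());

        // insert() fails if the key already exists, which suits immutable, append-only events.
        events.insert(JsonDocument.create("event::1001::20150901T120000", oneYearFromNow, event));

        cluster.disconnect();
    }
}
```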

There are some use cases where I will need to join across the two document types. They are not performance-critical, so a union (or two separate queries and an in-app join, sketched below) would probably be OK.
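Something like this is what I mean by an in-app join: one KV get for the record, one N1QL query for the events, paired up in application code. This assumes the bucket and field names from the sketches above, plus a secondary index on customerId (N1QL needs the query and index services enabled on 4.0):

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.N1qlQueryRow;

public class InAppJoinExample {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("localhost");
        Bucket records = cluster.openBucket("customer-records");
        Bucket events = cluster.openBucket("customer-events");

        // Query 1: fetch the customer record straight from KV.
        JsonDocument customer = records.get("customer::1001");

        // Query 2: N1QL lookup of that customer's events.
        // Assumes: CREATE INDEX idx_events_customer ON `customer-events`(customerId);
        N1qlQueryResult result = events.query(N1qlQuery.parameterized(
                "SELECT e.* FROM `customer-events` e WHERE e.customerId = $1",
                JsonArray.from("customer::1001")));

        // The "join" happens in the application: pair the record with its events.
        System.out.println("Customer: " + customer.content());
        for (N1qlQueryRow row : result) {
            System.out.println("Event: " + row.value());
        }

        cluster.disconnect();
    }
}
```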

What do you think? Will the different usage patterns mean that the performance of both document types is compromised if they share a bucket?

I’m using Enterprise Edition 4.0, but I’d be interested to know whether the advice would differ for the Community Edition, since it scales differently.

thanks

Nozzer

Hi Nozzer,
You probably want two buckets so that you can manage them separately as they grow. The resource quotas can be different for each bucket, which might be useful if you expect very different sizes, growth rates, and traffic patterns.
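If it helps, here is a rough sketch of creating the two buckets with different RAM quotas from the Java SDK’s cluster manager (credentials, bucket names, and quota numbers are placeholders; you can do the same from the admin UI or couchbase-cli):

```java
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.cluster.ClusterManager;
import com.couchbase.client.java.cluster.DefaultBucketSettings;

public class CreateBucketsExample {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("localhost");
        ClusterManager manager = cluster.clusterManager("Administrator", "password");

        // Smaller quota for the read-heavy, slow-growing record set...
        manager.insertBucket(DefaultBucketSettings.builder()
                .name("customer-records")
                .quota(1024)   // MB
                .replicas(1)
                .build());

        // ...and a larger one for the high-volume event stream.
        manager.insertBucket(DefaultBucketSettings.builder()
                .name("customer-events")
                .quota(4096)   // MB
                .replicas(1)
                .build());

        cluster.disconnect();
    }
}
```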

BTW, I’m the PM for the Couchbase Spark Connector, so I would be interested to hear more about your experience with it once you start using it. Let me know if you’re willing to share feedback.
Best,
-Will

Thanks Will.

I’ll let you know how I get on with the Spark connector.

Nozzer