I have a 3-node cluster running on m5a.8xlarge instances on AWS (32 cores, 128 GB each). There are about 1.35 billion documents loaded into Couchbase in a single bucket.
I ran into a few issues while loading the data (Couchbase kept using too much memory when set to 90% of memory). I’m currently running with an 82 GB memory quota for the bucket.
I’m using the Couchbase Community 7.0 beta to try things out.
I defined a couple of indexes on the data:
- Primary index (CREATE PRIMARY INDEX ON BucketName)
- Secondary index (CREATE INDEX users_by_email_and_name ON BucketName (email, username) as u WHERE u.$meta.$type = 'Users')
The indexes have been taking forever to build.
Indexing seems to run at under 4K documents/sec at most, and usually at about half that.
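For scale, here is a rough estimate of what those rates mean for the whole data set. This is only a back-of-the-envelope sketch: it assumes the rate stays constant, that "4K" means 4,000 documents per second, and that the indexer has to process every document in the bucket. The helper name `build_days` is made up for illustration.

```python
# Back-of-the-envelope build time for one index at the observed rates.
# Assumptions: constant rate, every document in the bucket is processed.
TOTAL_DOCS = 1_350_000_000  # about 1.35 billion documents in the bucket
SECONDS_PER_DAY = 86_400


def build_days(docs: int, docs_per_sec: float) -> float:
    """Days needed to index `docs` documents at a constant rate."""
    return docs / docs_per_sec / SECONDS_PER_DAY


for rate in (4_000, 2_000):  # observed peak rate and the more typical half rate
    print(f"{rate} docs/sec -> {build_days(TOTAL_DOCS, rate):.1f} days")
```

Under those assumptions a single index would need roughly 4 days at the peak rate and about 8 days at the typical rate.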
The nodes aren’t really busy; CPU usage peaks at around 12%.
I tried giving the indexing process more memory (20 GB), but it didn’t seem to help.
I also tried creating the indexes while I was inserting the data, but the memory usage caused node failures. After I sorted that out, the index was stuck in the warmup state for over 6 hours, so I dropped and re-created it.
Is there something I’m missing that would make it go faster?
Is this behavior expected? Am I doing something wrong?