Couchbase server High CPU with sync gateway

Hi,

I have the following installation :
1 server with Couchbase server EE 6.0.3 and sync gateway 2.5.0.
16 Gb RAM and 8 CPUs.

Since yesterday, Sync Gateway seems to have a heavy impact on Couchbase Server. I started to see huge CPU usage on one of the cbq-engine processes (500-600%), and there are around 20-30K ops per second on the statistics dashboard.

It finally calmed down, I don’t know why.
I stopped Sync Gateway this morning in order to start an import (7k documents), and then started it again. And boom: huge Couchbase Server CPU usage, huge response times, and no sync on the mobile side (it takes too long).

I don’t know if there’s something wrong with my configuration (everything on the same machine for now), or if it is normal and I just have to wait for SG to finish its job, or if there is an SG configuration I have to set up…

Here’s the SG sync function :

function (doc, oldDoc) {
    if (doc.sysSyncActive && doc.sysSyncChannel) {
        channel(doc.sysSyncChannel);
        access("sync_g_user", doc.sysSyncChannel);
    }
}

Any hints would be welcome
Thanks
Jérémie

It will “calm down” once Sync Gateway has finished the import processing (applying access policies and adding the relevant sync metadata to the documents). What response times are you seeing? How long does it take to calm down?

I noticed that you are on the EE version of Couchbase Server. I am not sure whether you are also on the EE version of Sync Gateway. Please try the latest 2.6.1 EE version of Sync Gateway and let us know what you observe.

We have been making incremental improvements to the import processing. Our recommendation is to try and batch the imports to 3K/second. At that rate, Sync Gateway should be able to keep up with the import rate while ensuring replication latencies are consistent at 3-5 seconds.

You should expect to see a lot of improvements in import throughput in our upcoming release of Sync Gateway.

I don’t know if there’s something wrong with my configuration (everything on the same machine for now),

Couple of things to note

  • Would recommend to assign access grants to the user via the REST API. It seems like you are probably reassigning the user to the same channel within the sync function. This is an expensive operation. More on this here
  • In prod, ensure you have one import node
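
A minimal sketch of the first point, assuming the channel name is known outside the sync function. The sync function keeps only the channel() routing, and the grant is made once through the Sync Gateway Admin REST API (PUT /{db}/_user/{name} with an admin_channels list). The database name "mydb", the channel "channel-1" and the admin address localhost:4985 are placeholders, and buildUserGrant is a hypothetical helper that only describes the request, it does not send it.

```javascript
// Sync function without the per-document access() call.
// channel() is provided by Sync Gateway when the function runs there.
var syncFn = function (doc, oldDoc) {
    if (doc.sysSyncActive && doc.sysSyncChannel) {
        channel(doc.sysSyncChannel);
    }
};

// Hypothetical helper: builds the Admin REST request that grants
// channel access to a user once, instead of on every document write.
function buildUserGrant(adminUrl, db, user, channels) {
    return {
        method: "PUT",
        url: adminUrl + "/" + encodeURIComponent(db) +
             "/_user/" + encodeURIComponent(user),
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ admin_channels: channels })
    };
}

// Placeholder values, adjust to your deployment.
var grant = buildUserGrant("http://localhost:4985", "mydb",
                           "sync_g_user", ["channel-1"]);
```

The resulting object can be passed to any HTTP client. As far as I know, the admin_channels list sent this way replaces the user's existing admin channels, so it should contain every channel the user needs.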

Hi Priya,
Thanks for your answers.

It will “calm down” once Sync Gateway has finished the import processing (applying access policies and adding the relevant sync metadata to the documents). What response times are you seeing? How long does it take to calm down?

Yesterday, it took something like an hour to calm down (at least). Today, I stopped SG before it could calm down (before 2h).
We are indeed using CB and SG EE. I’ll upgrade SG to 2.6.1 and see if there are improvements.

Would recommend to assign access grants to the user via the REST API. It seems like you are probably reassigning the user to the same channel within the sync function. This is an expensive operation. More on this here

Yes, I saw that too. I’ll skip the “access” part in my sync function.

One last question : when I start SG, does it read all documents again ? I have 2930000 docs in my bucket, I was wondering if that would be another reason for my problems.

Thanks again,
Jérémie

It wouldn’t reimport unless the documents have changed. Let us know what throughput you observe with 2.6.1, but I won’t be surprised if the results are not significantly different. You are running CBS and SGW together on a single 8 core / 16 GB RAM machine. On an appropriately sized machine with sufficient cores and RAM, you should expect much better results.

If possible I would recommend batch loading the initial data into the bucket so it’s imported in chunks.
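
As a rough sketch of what batching the load could look like on the client side: the documents are split into fixed-size chunks and written one chunk at a time with a pause in between. writeBatch is a hypothetical callback standing in for whatever actually writes to the bucket (SDK, REST, import tool), and the 3000 / 1000 ms values in the usage note simply mirror the 3K/second figure mentioned above.

```javascript
// Split an array of documents into batches of at most `size` items.
function chunk(docs, size) {
    if (!(size > 0)) { throw new Error("size must be positive"); }
    var batches = [];
    for (var i = 0; i < docs.length; i += size) {
        batches.push(docs.slice(i, i + size));
    }
    return batches;
}

// Resolve after `ms` milliseconds.
function sleep(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Write the batches one after another, pausing between them so that
// the import processing has time to keep up.
// writeBatch(batch) is a hypothetical function returning a Promise.
async function loadInBatches(docs, size, pauseMs, writeBatch) {
    var batches = chunk(docs, size);
    for (var i = 0; i < batches.length; i++) {
        await writeBatch(batches[i]);
        if (i < batches.length - 1) { await sleep(pauseMs); }
    }
    return batches.length;
}
```

For the 7k document import from the first post, loadInBatches(docs, 3000, 1000, writeBatch) would issue three writes (3000, 3000 and 1000 documents) roughly one second apart.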

I upgraded SG to 2.6.1 and changed the way I use the access function in the sync function. I could still see CB was under load but much less than before. So far so good for now, thanks a lot.

Another question, a little off topic: when SG updates the metadata of a document, does it trigger the eventing functions? I think it does, but I’m asking to be sure.

Thanks anyway for all your answers.
Jérémie

AFAIK, yes, it does. That implies you should take care to avoid recursive processing. For instance: a doc is added or updated, and eventing is triggered. Sync Gateway then imports the doc and updates its metadata, and eventing is triggered again for the same doc. Your eventing function should ignore that second trigger.

Yes, I’m trying to skip them, but I’m not sure I’m doing it right. Is there a way, in the eventing function, to know what triggered the change on the document ? Like a change in the document itself or a change in the metadata ?

Thanks !

Tagging @Siri who may be able to help with your eventing question.

Hi @jeremieburtin - yes, eventing triggers when any change happens on the document, either in the body or in the extended attributes. Unfortunately, we don’t have the old value available, so it’s not directly possible to tell what changed. It is possible to save the document as seen to another bucket using eventing, and to retrieve that saved value the next time the document changes, thereby obtaining both the old and the new value. This incurs the overhead of saving the old value, but if you are okay with that overhead, please let me know and I can post an illustrative sample event handler.
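
A rough sketch of that idea, under several assumptions: saved_docs is a hypothetical read-write bucket binding alias pointing at a second bucket, a missing key is assumed to read as undefined, and Sync Gateway is assumed to keep its sync metadata in extended attributes so that an import leaves the document body untouched. The comparison uses JSON.stringify, which is sensitive to key order, so it is only an approximation of "the body changed".

```javascript
// True when there is no saved copy or the body differs from it.
// Note: JSON.stringify is key-order sensitive.
function bodyChanged(savedDoc, doc) {
    if (savedDoc === undefined || savedDoc === null) { return true; }
    return JSON.stringify(savedDoc) !== JSON.stringify(doc);
}

// Core logic, kept separate from the handler so it can be tested
// with a plain object standing in for the bucket binding.
function handleUpdate(doc, meta, store, onBodyChange) {
    var saved = store[meta.id];
    if (!bodyChanged(saved, doc)) {
        return false;            // metadata-only mutation, skip it
    }
    store[meta.id] = doc;        // remember the body for next time
    onBodyChange(doc, meta);
    return true;
}

// Eventing entry point. `saved_docs` is the hypothetical bucket
// binding alias; log() is the Eventing logging built-in.
function OnUpdate(doc, meta) {
    handleUpdate(doc, meta, saved_docs, function (d, m) {
        log("body changed", m.id);
    });
}
```

With this shape, the second trigger caused by a metadata-only update finds an identical saved body and returns without running the business logic.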

Thanks,
Siri

Hey @Siri, thanks for your answers !
Saving a history of the documents is what we are currently doing; this way we can control when we trigger what we want in the eventing action. So glad to see we were not wrong in doing so!

Thanks again :)

Jérémie