I have tested different sync_gateway configurations and found that, in our use case, the winning configuration by far is to use separate SGs for pull and push, configured so that the push SG process runs at high OS priority and the pull SG at low priority.
We are seeing a 10x performance improvement with this setup, because push events are now handled at higher priority than pulls, and they stop causing more “entropy” in the whole system later on, when every client is just pulling. I am testing with about 500 clients.
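For reference, a minimal sketch of how the two-process, two-priority setup could be launched on a POSIX host (on Windows the rough equivalent would be `start /low` / `start /high`). The binary name and config file names here are placeholders, not taken from my actual setup, and only the pull side is de-prioritized because raising priority (negative niceness) needs root:

```python
import os
import subprocess

def launch(cmd, niceness):
    # preexec_fn runs in the child before exec, so os.nice() only
    # affects the spawned process, not the launcher.
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(niceness))

# Hypothetical usage -- binary and config names are placeholders:
#   push_sg = launch(["sync_gateway", "push-config.json"], 0)   # normal priority
#   pull_sg = launch(["sync_gateway", "pull-config.json"], 10)  # de-prioritized
```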
I am wondering whether there is already a configuration option in SG that could be used to adjust push and pull priority according to the load the application's use case generates?
One other interesting observation from this test is that sync_gateway takes about 80% of CPU the whole time, while Couchbase Server is doing very little work (around 7% CPU)! This also makes me wonder: why is sync_gateway such a bottleneck?
Hi there. This is a really interesting post! Am I right in thinking you have 2 separate sync_gateway processes? Are they on 2 different machines? And for replication, does each Couchbase Lite client connect to both sync gateways, one for outbound changes and one for inbound changes?
What kind of throughput are you testing with (i.e. what’s the read/write load for each of your 500 clients?)
It’s true that push and pull functionality are basically distinct code paths within Sync Gateway, and when running as a single process they are going to be competing for CPU/memory. Based on that, splitting the load to two different machines is going to show improvement. However, I’m surprised that you’re seeing a significant benefit when running multiple sync_gateway instances on a single machine. If you can share some more specifics, I’d like to dig into this a bit. Can you share:
The machine specifications
What version of Sync Gateway you’re running
The read/write load (writes/second, reads/second)
What metric you’re using for the “10x performance improvement”
Clients: OS X. Server side: a single Windows 10 machine with an Intel i7 4770K, 16 GB RAM, and an SSD.
(have to check this later)
I measure the time it takes to run the whole test, waiting at the end until all 500 DBs have replicated to an equal state.
The test generates 1 doc for each of the 500 DBs, replicates everything to every node, and then every node makes a change. So there will be a ton of conflicts and lots of replication traffic.
My current understanding is that setting the process priority is the key here, and the reason the 2-SG setup is so much faster. Every push that reaches the server causes 499 more pulls. So if all the pushes are done first, at the highest priority, the whole process finishes much faster… The main reason for this post was that I started wondering whether this would be a generally useful setup. Is it the case that, in many use cases, a single push causes the whole distributed system to ripple?
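The ripple is easy to quantify with plain arithmetic, assuming only the 500-client figure from the test and one change per client:

```python
# Back-of-the-envelope fan-out: N clients each push one change, and every
# change must then be pulled by the other N - 1 clients.
N = 500
pushes = N            # one push per client
pulls = N * (N - 1)   # each push ripples to every other client
print(pushes, pulls)  # 500 pushes turn into 249,500 pulls
```

With pull traffic outnumbering push traffic by a factor of N - 1, any CPU the pulls steal from the pushes delays the point at which the system can settle, which is consistent with the push-priority setup winning.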
That helps clarify the scenario - a few more follow-up questions, if you don’t mind. I’m trying to sort out whether there’s something else going on in the sequence of updates that’s causing the difference in performance, aside from just the Sync Gateway split.
When you say the pushes are ‘done first as highest priority’ - are you actually giving the push replication any priority from the client side? Or do you just mean that there’s a SG node dedicated solely to push replications?
Are all your clients making the same change to the docs they pull (so that duplicates get ignored), or is the expected result 500 docs with ~498 conflicts per doc?