We have run some tests comparing the performance of Couchbase and MongoDB.
The results were:
Test 1: Getting documents by their ID. In this case Couchbase performed around 21K operations per second, and MongoDB performed around 9K operations per second.
Test 2: Searching documents by one of their fields, with an index on that field. In this case Couchbase 4.5.1 Enterprise Edition performed around 2K op/sec, Couchbase 4.1.0 Community Edition performed around 0.5K op/sec, and MongoDB 3.2.10 Community Server performed around 9K op/sec.
To improve the performance of Couchbase in Test 2, I have set adhoc=false (prepared statements) and increased the number of query (N1QL) endpoints to 100.
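For reference, here is roughly how those two settings are applied with the Java SDK 2.x. This is a sketch, not my exact setup: the hostname and bucket name are placeholders.

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;
import com.couchbase.client.java.query.N1qlParams;

// Increase the number of sockets the SDK opens to the query service.
CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .queryEndpoints(100)   // query (N1QL) endpoints
        .build();

// "localhost" and "default" are placeholders for the real cluster/bucket.
Cluster cluster = CouchbaseCluster.create(env, "localhost");
Bucket bucket = cluster.openBucket("default");

// adhoc(false): the SDK prepares the statement once and reuses the plan.
N1qlParams params = N1qlParams.build().adhoc(false);
```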
This is the index I’m using: CREATE INDEX firstname_index ON default(firstname) USING GSI
The query I’m running, using the Java SDK, is:
final N1qlQuery query = N1qlQuery.parameterized(
        "SELECT firstname FROM default WHERE firstname = $1",
        JsonArray.from("Walter " + id));
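For completeness, executing that query and reading the rows looks roughly like this (assuming `bucket` is an open `Bucket` on the SDK 2.x API):

```java
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.N1qlQueryRow;

// Synchronous execution; each row is a JSON object with the projected fields.
N1qlQueryResult result = bucket.query(query);
for (N1qlQueryRow row : result) {
    System.out.println(row.value().getString("firstname"));
}
```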
Regarding the index setting, I have tested both Standard Global Secondary Indexes and Memory-Optimized Global Secondary Indexes, without noticing a major difference.
The Data RAM Quota is 2048 MB, of which only 72.2 MB are in use.
For example, below are a few points to highlight the importance and non-trivial nature of this problem:
The query service is CPU/memory intensive, whereas the index service is also disk intensive.
If your data is sized to, say, 4 nodes, but you want to scale your queries, then you need to plan for more query (N1QL) nodes.
If you carefully design your queries to use covering indexes, then you can avoid unnecessary load on your data nodes. You can then scale index nodes based on the number of indexes you have (and the number of replica/duplicate indexes, etc.).
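As it happens, the query above only references `firstname`, so the existing `firstname_index` should already cover it. To illustrate the general idea: if the query also projected another field, say a hypothetical `lastname`, a covering index would include that field too, so the query service could answer entirely from the index without touching the data nodes:

```sql
-- Hypothetical covering index: includes every field the query references,
-- e.g. for SELECT firstname, lastname FROM default WHERE firstname = $1
CREATE INDEX firstname_lastname_cover ON default(firstname, lastname) USING GSI
```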
And don’t ignore the load that an index node (and every index) adds to the whole system (and the data nodes), especially when you have high mutation rates on your documents.
Running all services on all nodes is simple, but it’s not hard to see how it hurts scaling and causes unnecessary resource contention between the various services. For example: if you have 4 important indexes, what’s the point of running the index service on all 10 nodes? Note that, typically, one index node can easily serve multiple query (N1QL) nodes.
Do you mean that it’s fine to use regular AWS EC2 (shared hardware) VMs?