Doc inserts suddenly slow down after 14 million records
Hello,
I was trying out this dev preview and ran into some problems when loading a lot of data.
I used the default bucket, set it to the disk-persistent type (the Couchbase type, if I remember correctly), and allotted 2 GB of RAM. I have only 1 node and I turned off replication. I was trying to load around 68 million records into Couchbase Server 2.0, but around the 14M mark it starts to slow down. At this point the size of the data is around 6+ GB.

Interestingly, the ops/sec does not drop all the way to 0, but it is noticeably lower.

I tried to see what the memcached server is doing through strace, but it just loops through this:
[root@hp2u test_tracking_dbs]# strace -p 19924
Process 19924 attached - interrupt to quit
clock_gettime(CLOCK_MONOTONIC, {5636250, 808253985}) = 0
epoll_wait(3, {}, 32, 829) = 0
clock_gettime(CLOCK_MONOTONIC, {5636251, 637424985}) = 0
epoll_wait(3, {}, 32, 1000) = 0
clock_gettime(CLOCK_MONOTONIC, {5636252, 637523985}) = 0
epoll_wait(3, {}, 32, 1000) = 0
clock_gettime(CLOCK_MONOTONIC, {5636253, 636993985}) = 0
epoll_wait(3, {}, 32, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {5636253, 637960985}) = 0
epoll_wait(3, {}, 32, 1000) = 0
clock_gettime(CLOCK_MONOTONIC, {5636254, 638464985}) = 0
epoll_wait(3, {}, 32, 1000) = 0
clock_gettime(CLOCK_MONOTONIC, {5636255, 638872985}) = 0
epoll_wait(3, <unfinished ...>
Other than adding more nodes, I am wondering what else I could try to tweak on the server.
From my understanding there should be some setting that could help, since I've tested loading this data into a couchbase-single-server 1.1 and was able to load the complete data set at a rate of ~4k docs/sec.
Thanks in advance!
Update: I've attached some sample charts from the server to this topic, but they don't display when I hit save.
When I click edit, they're back again.
Unfortunately, the problem with your strace is that you're likely following only one thread. With strace on Linux, you have to specify each thread's OS LWP pid to see what they're all doing.
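To illustrate the per-thread point, here is a minimal sketch (not an official tool) that lists a process's thread ids from `/proc/<pid>/task` on Linux and builds an strace command line with one `-p` per thread. The helper names `thread_ids` and `strace_command` and the `proc_root` parameter are made up for this example.

```python
import os

def thread_ids(pid, proc_root="/proc"):
    """Return the LWP ids of every thread of `pid`, read from <proc_root>/<pid>/task.

    Linux-specific: each thread of a process appears as a directory named
    after its thread id. `proc_root` is a hypothetical knob for testing.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())

def strace_command(pid, proc_root="/proc"):
    """Build an strace argument list that attaches to each thread explicitly."""
    cmd = ["strace"]
    for tid in thread_ids(pid, proc_root):
        cmd += ["-p", str(tid)]  # strace accepts multiple -p options
    return cmd
```

For the process in the question, `" ".join(strace_command(19924))` would give a command you could paste into a root shell. Recent strace man pages also describe `-f` combined with `-p` as attaching to all threads of the process, which may be simpler if your strace version supports it.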
With regard to your workload observations, my guess is that you got that far because, during the data load, the system was also pushing data out to disk. Once available memory was used up, the system would eject items that had already been persisted. This is why you were able to get well beyond the 2 GB of RAM before seeing a slowdown (Couchbase was doing things in parallel for you!). But once the load eventually overwhelms RAM, that is, once items arrive faster than they can be persisted and ejected, the system starts sending temporary out-of-memory errors to your client. You may be able to see that in the graphs.
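If the loader does receive temporary out-of-memory errors, one common client-side response is to back off and retry rather than keep pushing at full speed. The following is only a sketch of that idea: `TemporaryFailure` and the `store` callable are hypothetical stand-ins, not the API of any particular Couchbase client library, and the delay values are arbitrary.

```python
import time

class TemporaryFailure(Exception):
    """Hypothetical stand-in for a client's temporary out-of-memory error."""

def store_with_backoff(store, key, value, max_attempts=8,
                       base_delay=0.05, max_delay=2.0, sleep=time.sleep):
    """Call store(key, value), retrying with exponential backoff on TemporaryFailure.

    `store` is whatever function your client uses to write one document
    (an assumption here). The last failure is re-raised after `max_attempts`.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return store(key, value)
        except TemporaryFailure:
            if attempt == max_attempts:
                raise
            sleep(delay)                       # give the server time to persist and eject
            delay = min(delay * 2, max_delay)  # double the wait, capped at max_delay
```

Backing off this way lets the load proceed at roughly the rate the disk can absorb, instead of failing or spinning on errors.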
As for the comparison to Couchbase Single Server 1.1, we're still working on some performance-related tasks in Couchbase Server 2.0 (which is different from Couchbase Single Server 2.0). I bet we'll close that gap and you'll see a slightly different pattern, but similar or superior throughput from Couchbase Server 2.0 once it is released.
P.S.: Sorry for the trouble posting charts. We'll look into that.