Strange behavior when one node is down
I'm testing our PHP app against Couchbase as a drop-in replacement for memcached. I didn't change the code, so it still uses the pecl/memcache extension (note: the memcache extension, without the "d").
I'm currently seeing an issue when a node goes down.
This is the setup:
- Couchbase 1.8.1 Enterprise Edition, 3 node cluster (A,B,C), automatic failover (30 seconds).
- Data is stored in a Couchbase-type bucket with replication (1 copy) on port 11214
- Client: PHP 5.3.2 with PECL/memcache 2.2.5
- Test case:
* php script:
+ Set 10 key/values. I use this code to connect (I use addServer as I did with memcache; I know this is not necessary):
(...)
$memcache = new Memcache;
$memcache->addServer('A', 11214);
$memcache->addServer('B', 11214);
$memcache->addServer('C', 11214);
for ($i = 1; $i <= 10; $i++) {
    $memcache->set('key_' . $i, 'content of key ' . $i, false, 3600) or die("Fail to set");
}
(...)
+ Loop 1000 times with a one-second sleep:
(...)
for ($a = 1; $a <= 1000; $a++) {
    $memcache_r = new Memcache;
    $memcache_r->addServer('A', 11214);
    for ($i = 1; $i <= 10; $i++) {
        // note: the read uses $memcache (A, B, C); $memcache_r only opens and closes a connection to A
        echo $memcache->get('key_' . $i);
    }
    $memcache_r->close();
    unset($memcache_r);
    sleep(1);
}
(...)
I start running this code and see all the keys printed every second.
I let the script run for several seconds (making sure that all the sets succeeded and that I got all the keys at least once) and then shut down Couchbase on server B.
Now, the strange behavior:
- Once Couchbase on server B is shut down, I get two types of results:
* result 1: all gets fail
* result 2: only one key is returned, and the rest of the keys fail.
I get these results randomly; for example, 3 times "result 1", then 1 "result 2", then 2 "result 1", and so on.
- 30 seconds later, Couchbase marks server B as down, performs the automatic failover, and all the keys are printed again.
Is this normal behavior?
Is it because I am using the memcache library?
When server B is down, why do I sometimes get a result and sometimes not? I expected some keys to be unavailable (the keys that live on server B), but sometimes one get succeeds and sometimes all the gets fail, and I don't know why.
Thanks in advance
Fernando.-
But until autofailover kicks in, the results I get are not consistent. Here is the code:
(...)
for ($a = 1; $a <= 1000; $a++) {
    $memcache_r = new Memcache;
    $memcache_r->addServer('A', 11214);
    for ($i = 1; $i <= 10; $i++) {
        // note: the read uses $memcache (A, B, C); $memcache_r only opens and closes a connection to A
        echo $memcache->get('key_' . $i);
    }
    $memcache_r->close();
    unset($memcache_r);
    sleep(1);
}
(...)
In this test, the cluster is servers A, B, and C.
For test purposes only, when I do the sets I connect to the whole cluster: addServer('A',11214), addServer('B',11214), addServer('C',11214)
For test purposes only, I loop and do all the gets once per second. For the gets, I connect to server A only and make all the get requests. Everything works.
Once I have checked that everything is fine, I shut down server B and start to see this strange behavior: sometimes I get only one key, sometimes no keys at all.
I would understand it if I got only one key and that result were consistent (the same single key every time until autofailover). But that is not what I see; the results are not consistent.
Of course, once autofailover kicks in after 30 seconds, everything works again.
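One way to see exactly which keys come back on each pass is to log a per-key status instead of echoing the values. A minimal sketch: probeKeys() and formatPass() are hypothetical helpers (not part of pecl/memcache), and a stand-in getter replaces the real connection so the snippet runs on its own.

```php
<?php
// Sketch only: probeKeys(), formatPass() and the 'ok'/'fail' labels are
// hypothetical helpers, not part of pecl/memcache.

// Call $get for every key and record whether a value came back.
// Memcache::get() returns false when a key cannot be fetched.
function probeKeys($get, $keys)
{
    $results = array();
    foreach ($keys as $key) {
        $value = call_user_func($get, $key);
        $results[$key] = ($value === false) ? 'fail' : 'ok';
    }
    return $results;
}

// Format one pass as a single timestamped log line.
function formatPass($results)
{
    $parts = array();
    foreach ($results as $key => $status) {
        $parts[] = $key . '=' . $status;
    }
    return date('H:i:s') . ' ' . implode(' ', $parts);
}

// Stand-in getter so the sketch runs without a cluster:
// it pretends that only key_3 can be fetched.
$fakeGet = function ($key) {
    return ($key === 'key_3') ? 'content of key 3' : false;
};

$keys = array();
for ($i = 1; $i <= 10; $i++) {
    $keys[] = 'key_' . $i;
}
echo formatPass(probeKeys($fakeGet, $keys)), "\n";
```

In the real test, array($memcache_r, 'get') could be passed as $get in place of $fakeGet. Comparing the log lines from successive passes would show whether the same key succeeds each time or a different one.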
I'm not sure I follow your code. Why do you connect to the memcache server inside the loop? Shouldn't that be done outside of it?
Also, do you use Moxi to connect to the cluster?
I connect in the loop because this is a test and I'm trying to simulate connects/disconnects, like in our production environment with 10+ web servers.
Yes, I use moxi on server side (couchbase default installation). No moxi setup on client/webserver.
Yes, this seems fairly normal. Since only some of the items are mapped to server B, if your get() happens to try to fetch items that are only on server A and C, then you'd get results just fine. Only when you hit server B will you have failures... until autofailover kicks in and then all data should be available-- as long as it was replicated.
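To illustrate the mapping idea: Couchbase assigns each key to a vBucket, and each vBucket is active on one node. The sketch below is a simplified model, not the cluster's real layout. The vBucket count, the CRC32-based hash, and the round-robin vBucket map are all assumptions made for illustration, so the node it prints for each key will not match a real cluster.

```php
<?php
// Simplified illustration: the vBucket count, the hash, and the
// round-robin vBucket map below are assumptions, not values read
// from a real cluster.
define('NUM_VBUCKETS', 1024);

// Map a key to a vBucket id using a CRC32-based hash.
function vbucketFor($key)
{
    return ((crc32($key) >> 16) & 0x7fff) % NUM_VBUCKETS;
}

// Hypothetical vBucket map: vBucket i is active on node (i mod node count).
function nodeFor($key, $nodes)
{
    return $nodes[vbucketFor($key) % count($nodes)];
}

$nodes = array('A', 'B', 'C');
$down  = 'B';

// Show, for the ten test keys, which node would serve each one in this model.
for ($i = 1; $i <= 10; $i++) {
    $key  = 'key_' . $i;
    $node = nodeFor($key, $nodes);
    $note = ($node === $down) ? 'fails until failover' : 'readable';
    echo $key . ' -> node ' . $node . ' (' . $note . ")\n";
}
```

In this model a given key always lands on the same node, so while B is down the set of failing keys stays the same from one pass to the next.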