Enyim's Membase client not able to reconnect to Membase servers
Hi,
I'm seeing some strange behaviour with the Membase client from Enyim. When using a single instance of the client, the client is not able to reconnect to the server once the connection is lost. It is easy to reproduce: run work on the client while restarting the Membase server service. The client never reconnects (at least not within the 20 minutes I left it running).
When using a new instance of the client on each run, it will reconnect, but that is not the recommended setup according to [URL=http://github.com/enyim/EnyimMemcached/wiki/MembaseClient-Usage]http://github.com/enyim/EnyimMemcached/wiki/MembaseClient-Usage[/URL]
Membase server version is 1.0.3
Enyim Membase client is the official 2.4 from enyim.com.
I'm running one instance of the Membase server locally, but I see the same behaviour in our web farm with a Membase cluster of two servers.
I also tested the Enyim memcached client against the same server, and it is able to reconnect after some time (a few minutes?).
-Kenneth
I observed the same behaviour. I believe previous versions back to 2.0b have the same issue. One big problem this may cause: when the host server is restarted (let's say after OS updates) and the cache client makes a request before the cache server can handle it, the client can never reach the server again until the client itself is restarted. If the cache is being used in a critical manner (i.e. not as a side-cache), this becomes a serious problem.
If the client threw an exception when the server is unreachable, it would be easy to handle that and initialize a new client. But it simply returns null and, as far as I know, there is no way to distinguish this from a cache miss.
Would love to hear a workaround or have this behaviour changed to raise an exception.
Ilhan
In testing, it's been shown that the client will reconnect after 2 minutes (this is configurable, but 2 minutes is the default). The default is being lowered to 10 seconds, which we believe makes much more sense.
Ilhan, can you confirm whether this is the same behavior that you are seeing? And if you lower the reconnect time, does that help mitigate the issue?
Perry
Thanks for your reply, Perry. I wasn't aware of that setting and will look further into how we can use the connectionTimeout and deadTimeout settings more effectively.
Does NorthScaleClient support "deadTimeout" as well? It's not mentioned here: [url]http://github.com/enyim/EnyimMemcached/wiki/NorthScaleClient-configuration[/url]
I'm not able to test and confirm whether this is the same behaviour we are seeing at the moment, but I will let you know as soon as I get back to this subject.
Yes, the NorthScaleClient does support "deadTimeout". It's relatively new (it was implemented after 2.4) and will be available in the next release.
When a pool URL fails, the client tries to connect to the next one in the list to get up-to-date config information. When all URLs fail*, the config listener sleeps for the deadTimeout period, then tries to reconnect to the first URL in the list again.
* this does not mean that all servers are dead, just the ones the client tries to get the config from
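To make the retry order described above concrete, here is a rough sketch of that loop. It is written in Python purely for illustration and is not the client's actual code: the names poll_config, fetch and max_rounds are hypothetical, and max_rounds exists only so the sketch terminates (a real listener would presumably keep retrying).

```python
import time
from typing import Callable, Optional, Sequence


def poll_config(
    pool_urls: Sequence[str],
    fetch: Callable[[str], Optional[dict]],
    dead_timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    max_rounds: int = 3,
) -> Optional[dict]:
    """Try each pool URL in order; if all fail, wait dead_timeout and start over.

    All names here are illustrative assumptions, not the Enyim client's API.
    """
    for _ in range(max_rounds):
        for url in pool_urls:
            try:
                config = fetch(url)
            except OSError:
                # Treat a connection error as "this URL failed" and move on.
                config = None
            if config is not None:
                return config
        # Every URL failed: back off, then retry from the first URL in the list.
        sleep(dead_timeout)
    # Gave up after max_rounds full passes over the list.
    return None
```

With a lower dead_timeout the loop gets back to the first URL sooner after a full failure, which is why lowering the default should shorten the time a client stays disconnected after a server restart.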
Hope that clears things up, let me know if there's anything else I can do.
Perry
Thanks Kenneth, I'll take a look at this and get back to you.
Perry