Membase Bucket - Performance Concern with multi node setup
I was looking for a distributed key-value database to use as a cache for our existing RDBMS project (which currently lacks performance) and found Membase to be a reliable solution. I started using Membase 1.7. But when I compared the performance of the Membase cache with the existing RDBMS cache by storing and getting large records, I got poor performance with the Membase cache. The testing scenario follows.
I configured the servers to form one cluster (one master server and one replica server, which was added using "Join Cluster"; call them server1 and server2, both with a Membase bucket named "testbucket"). The configuration file is :
When I store 500 records (say with key "testkey1"), they are always stored on server2 (the replica server), and when I store 1000 records (say with key "testkey2"), they are always stored on server1. I checked this many times by swapping the URIs (server2 first, then server1), clearing the cache, and restarting the Membase servers, but all in vain. The 1000+ records are still always stored on server1 regardless of the configuration settings.
Using the existing RDBMS cache:
Step1: Clear the RDBMS cache.
Step2: Get(testkey1) - the first time, as there are no items in the RDBMS cache, it fetches the 500 items from my RDBMS database.
Step3: Call Get again - as the items are now in the RDBMS cache, it fetches them from the cache. Time for getting from the cache: say 50ms.
Step4: Clear the RDBMS cache.
Step5: Get(testkey2) - the first time, as there are no items in the RDBMS cache, it fetches the 1000 items from my RDBMS database.
Step6: Call Get again - as the items are now in the RDBMS cache, it fetches them from the cache. Time for getting from the cache: say 100ms.
Using the Membase cache:
Step1: Clear the Membase cache.
Step2: Get(testkey1) - the first time, as there are no items in the Membase cache, it fetches the 500 items from my RDBMS database and then stores this data in the Membase cache.
Step3: Call Get again - as the items are now in the Membase cache, it fetches them from the cache. Time for getting from the cache: say 100ms.
Step4: Clear the Membase cache.
Step5: Get(testkey2) - the first time, as there are no items in the Membase cache, it fetches the 1000 items from my RDBMS database and then stores this data in the Membase cache.
Step6: Call Get again - as the items are now in the Membase cache, it fetches them from the cache. Time for getting from the cache: say 50ms.
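The Membase steps above follow a cache-aside pattern: look in the cache first, and on a miss load from the database and store the result. A minimal sketch in Python, assuming a plain dict as a stand-in for the Membase client and a hypothetical fake_db_load function in place of the real RDBMS query (neither is part of the original test application, which is written in C#):

```python
import time


class CacheAside:
    """Check the cache first; on a miss, load from the database and store the result."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # stand-in for the Membase client (assumption)
        self.load_from_db = load_from_db  # stand-in for the RDBMS query (assumption)

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: fetch from the database, then populate the cache.
            value = self.load_from_db(key)
            self.cache[key] = value
        return value


def timed_get(store, key):
    """Return (value, elapsed milliseconds) for a single get."""
    start = time.perf_counter()
    value = store.get(key)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return value, elapsed_ms


def fake_db_load(key):
    # Hypothetical loader: "testkey1" returns 500 rows, any other key 1000 rows.
    count = 500 if key == "testkey1" else 1000
    return [f"row-{i}" for i in range(count)]


store = CacheAside(cache={}, load_from_db=fake_db_load)
first, miss_ms = timed_get(store, "testkey1")   # Step2: miss, loads 500 rows
second, hit_ms = timed_get(store, "testkey1")   # Step3: hit, served from the cache
print(f"miss: {miss_ms:.3f} ms, hit: {hit_ms:.3f} ms")
```

A single timed call per step, as in the scenario above, is sensitive to warm-up and connection setup, so one measurement on its own is not a reliable figure.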
In both cases, my performance application runs on server1 (the master machine).
For large records (1000), the Membase cache gives a significant performance improvement (over 50%), but for 500 records it performs worse than the RDBMS cache. When I checked the Membase console, I found that testkey2 (the 1000-item key) is always stored on server1 and testkey1 (500 records) is always stored on server2 (the replica server). So the performance gain may come from fetching data from the same machine (server1) that my performance application runs on, and the degradation from fetching data over the network (my application is on server1 and the 500 records are stored on server2). I checked this many times by swapping the URIs (server2 first, then server1), clearing the cache, and restarting the Membase servers, but all in vain: the 500-record key is still always stored on server2 regardless of the configuration settings.
So how can I gain, or trust, high performance in a distributed environment if I have more cluster nodes and replica nodes? I have heard that the internal logic of the Membase server decides which node stores each item (based on which parameters?). Could you please reply?
Hi Perry, please find the answers inline.
>>What kind of cache is your "RDBMS cache"?
We are using Oracle now. By "cache" I mean that Oracle maintains a buffer cache for recently executed queries (for particular records), so the next time the same query is executed for the same number of records, it fetches from this buffer cache.
>>Are the Membase servers sitting on the same LAN as your clients? >>Are you in "the cloud"?
Our Membase servers are connected via LAN, and my client app resides on one of the Membase servers (as mentioned earlier).
>>What language is this written in and are you using a client-side Moxi: http://www.couchbase.org/wiki/display/membase/Moxi
We are not using Moxi. Our client app is written in C#, and we store and fetch data using the Enyim 2.8 library (using the MembaseClient class).
>>I suspect that there is something wrong with your Membase configuration since the data (and its replica) should be evenly distributed across both nodes. Membase does not have a concept of a "master" or "replica" node, they are all equal.
As mentioned in my earlier post, we are currently using 2 Membase servers (one just as a replica server, to evaluate the fault tolerance and failover capacity of Membase). But as I understand it, Membase will utilize both of these servers.
>>Can you post a screenshot of your Membase UI when browsing to the "manage servers" screen?
I am attaching screen shots of my Membase UI
http://www.couchbase.org/forums/sites/default/files/u1001411/membase_servers.jpg
http://www.couchbase.org/forums/sites/default/files/u1001411/membase_servernodes.jpg
http://www.couchbase.org/forums/sites/default/files/u1001411/membase_clusteroverview.jpg
Please check and reply.
Could you please check it?
Thanks Prasad, the screenshots look correct...interesting that you only have 1 item in Membase, but that should still be fine.
Can you please post your Enyim configuration? You might also want to consider upgrading to the latest version (2.11 I believe).
Perry
http://www.couchbase.org/forums/sites/default/files/u1001411/app.config.jpg
For some reason I'm having problems replying within my own forum :-(
Can you confirm for me which configuration section you're using? If you're not using the "membase" one, can you please try that?
Also, if it's still not working, can you take a brief packet capture of this test? I'd like to see how fast the Membase server is actually responding with data to your requests to try and pinpoint where the issue might be.
Thanks
Perry
I am using the "membase" config section. By the way, how can I take a packet capture? Could you please guide me?
I really suspect that there's something else going wrong with your application, there's no reason that a Membase get would take >2ms, let alone 50-100.
Considering these are on the same machine, there shouldn't be any network delay. You can take a packet capture with wireshark: http://www.wireshark.org/
You may also want to upgrade to the latest Enyim client which has some performance monitoring capabilities: https://github.com/enyim/EnyimMemcached/wiki/Configure-the-Performance-M...
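Before reaching for a packet capture, it can help to time the fetch separately from the decoding of the value, since a 500-item value also has to be deserialized on the client. A minimal Python sketch of that kind of measurement, assuming a dict as a stand-in for the cache and pickle as a stand-in for the client's serializer (both are assumptions for illustration, not what Enyim does internally):

```python
import pickle
import statistics
import time


def measure_ms(operation, repeats=200):
    """Run operation repeatedly and summarise the latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "median": statistics.median(samples),
        "p95": samples[int(0.95 * (len(samples) - 1))],
        "max": samples[-1],
    }


# Hypothetical payload: 500 small records stored under one key as a single blob.
records = [{"id": i, "name": f"item-{i}"} for i in range(500)]
blob = pickle.dumps(records)
cache = {"testkey1": blob}  # stand-in for the real cache (assumption)

fetch_only = measure_ms(lambda: cache["testkey1"])
fetch_and_decode = measure_ms(lambda: pickle.loads(cache["testkey1"]))

print("fetch only       :", fetch_only)
print("fetch and decode :", fetch_and_decode)
```

If the gap between the two figures is large in the real application, the time is going into serialization on the client rather than into the server's response.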
Perry
Hi Prasad, happy to help here. What kind of cache is your "RDBMS cache"?
In general, you should not be seeing anywhere near 50-100ms response times from Membase so I'm sure we can get it to be much better.
Can you please also describe your environment a bit more:
-Are the Membase servers sitting on the same LAN as your clients?
-Are you in "the cloud"?
-What language is this written in and are you using a client-side Moxi: http://www.couchbase.org/wiki/display/membase/Moxi
-I suspect that there is something wrong with your Membase configuration since the data (and its replica) should be evenly distributed across both nodes. Membase does not have a concept of a "master" or "replica" node, they are all equal. Can you post a screenshot of your Membase UI when browsing to the "manage servers" screen?
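To illustrate why distribution is per key rather than per node: as I understand it, a Membase client hashes each key to a vBucket and then looks up which server owns that vBucket, so a given key always lands on the same node no matter what order the server URIs are listed in. The Python sketch below is a simplified illustration under stated assumptions: a CRC32-based hash, 1024 vBuckets, and a made-up map that alternates vBuckets between two servers. The real map is supplied by the cluster, and the names server1 and server2 are taken from this thread.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024                # assumed number of vBuckets
SERVERS = ["server1", "server2"]   # node names from this thread

# Hypothetical vBucket map: alternate ownership between the two servers.
# In a real cluster the map is issued by the servers, not computed like this.
VBUCKET_MAP = [SERVERS[vb % len(SERVERS)] for vb in range(NUM_VBUCKETS)]


def vbucket_for_key(key: str) -> int:
    """Map a key to a vBucket id using a CRC32-based hash (simplified assumption)."""
    crc = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % NUM_VBUCKETS


def server_for_key(key: str) -> str:
    """Look up the server that owns the key's vBucket."""
    return VBUCKET_MAP[vbucket_for_key(key)]


# One key always resolves to the same server, however often it is looked up.
print("testkey1 ->", server_for_key("testkey1"))
print("testkey2 ->", server_for_key("testkey2"))

# Many distinct keys are needed before the spread across nodes becomes visible.
counts = Counter(server_for_key(f"testkey{i}") for i in range(10000))
print(counts)
```

With only one or two keys in the bucket, each key's node is fixed by its hash, so a test of this kind says little about how evenly a large key set would be spread.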
Perry