Does Membase GET performance improve with more nodes?
I'm currently evaluating Membase on parameters such as scalability, availability, and performance.
While measuring how performance changes as the cluster scales, I found some performance degradation.
My scenario is as follows:
First, one Membase server and multiple client requests (30 requests simulated in multiple threads).
I measured the total time to finish all the requests (each request is a get operation on a key that is already stored).
It is about 2400 ms.
Next, I repeated the scenario with a Membase cluster with one more node (scalability). I expected
a cluster with multiple nodes to handle these requests faster than in the earlier case, since the requests would be
distributed across both nodes. Both nodes should be able to serve the request, because both will contain the requested key
due to replication.
But the result is negative: it takes 2800 ms, sometimes more.
What could be the reason for this? How can I improve the performance?
I have some more questions:
Does the Membase server handle concurrent requests? Do I have to do anything
to handle concurrent client requests? I'm using this setup on Windows Server 2008 with the Enyim client.
What is Moxi? Does the Membase server support Moxi by default?
The configuration is as follows:
<?xml version="1.0"?>
<configuration>
<configSections>
<sectionGroup name="enyim.com">
<section name="membase" type="Membase.Configuration.MembaseClientSection, Membase"/>
<section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler,log4net" />
</sectionGroup>
</configSections>
<enyim.com>
<membase>
<!--<servers userName="Administrator" password="123456" bucket="default" bucketPassword="">-->
<servers>
<add uri="http://10.1.27.36:8091/pools/default"/>
<add uri="http://10.1.27.113:8091/pools/default"/>
<add uri="http://10.1.26.178:8091/pools/default"/>
</servers>
<locator factory="Membase.VBucketAwareOperationFactory,Membase"/>
<performanceMonitor factory="Membase.Configuration.DefaultPerformanceMonitorFactory, Membase" />
<keyTransformer type="Enyim.Caching.Memcached.TigerHashKeyTransformer, Enyim.Caching"/>
<socketPool minPoolSize="2" maxPoolSize="100" connectionTimeout="10:00:00" deadTimeout="00:02:00"/>
<!--<locator type="Enyim.Caching.Memcached.VBucketNodeLocator, Enyim.Caching" />-->
<!--<authentication type="Enyim.Caching.Memcached.PlainTextAuthenticator, Enyim.Caching" userName="Administrator" password="123456" zone=""/>-->
</membase>
<log4net debug="true">
<appender name="LogFileAppender" type="log4net.Appender.FileAppender,log4net" >
<param name="File" value="c:\\error-log.txt" />
<param name="AppendToFile" value="true" />
<layout type="log4net.Layout.PatternLayout,log4net">
<param name="ConversionPattern" value="%d [%t] %-5p %c [%x] &lt;%X{auth}&gt; - %m%n" />
</layout>
</appender>
<root>
<priority value="ALL" />
<appender-ref ref="LogFileAppender" />
</root>
<category name="testApp.LoggingExample">
<priority value="ALL" />
</category>
</log4net>
</enyim.com>
<startup>
<supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
</startup>
</configuration>
Can you post your app.config or other configuration for the client?
Perry
It seems your spam filter is blocking the XML code. Can I send you the code by email? Please confirm.
How many concurrent requests can a cluster with one node process? Does that number increase as more nodes are added to the cluster?
Sujina, I tried to push your post through, but it came up blank for some reason.
I'll add your account to our whitelist, but you can also email that to perry@couchbase.com
Perry
[EDIT] - Never mind, I see it there and will respond.
Well, the configuration looks fine, I think we need to look more closely at the overall environment.
Can you look at the ping times between your client and the Membase servers? Also, how many requests are you doing? If it's only a few thousand, you probably don't have a large enough sampling size to get an accurate measurement.
Perry
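To illustrate why sample size matters, here is a minimal Python sketch. The latency values are made up for illustration and are not measurements from this thread; `summarize` is a hypothetical helper name. It shows how one slow request dominates the average of a 30-request run but barely moves the average of a 10,000-request run.

```python
import statistics


def summarize(latencies_ms):
    """Return mean, median and 99th-percentile latency for a list of samples."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: ceil(0.99 * n) gives the 1-based rank.
    rank = max(1, -(-99 * len(ordered) // 100))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }


# Hypothetical data: 29 fast requests plus one slow outlier
# (for example, the first request paying for connection setup).
small_run = [1.0] * 29 + [300.0]

# The same single outlier inside a run of 10,000 requests.
large_run = [1.0] * 9999 + [300.0]

if __name__ == "__main__":
    print("30 requests:    ", summarize(small_run))
    print("10,000 requests:", summarize(large_run))
```

With only 30 samples the mean is roughly ten times the typical latency, while the median stays at the typical value. Reporting the median or a high percentile over many requests, repeated a few times, gives a steadier picture than the total time of one short run.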
As far as concurrent requests go, Membase (and the clients) are very multi-threaded and so can handle many requests at once, and all of those requests should be responded to very quickly...allowing for even higher scale.
And yes, this increases with more nodes in the cluster since each one is only doing its own work.
Perry
Thanks for your quick reply...
>>Can you look at the ping times between your client and the Membase servers?
Below shows the ping status,
C:\Users\sujina.a>ping 10.1.27.53
Pinging 10.1.27.53 with 32 bytes of data:
Reply from 10.1.27.53: bytes=32 time=22ms TTL=127
Reply from 10.1.27.53: bytes=32 time=1ms TTL=127
Reply from 10.1.27.53: bytes=32 time=1ms TTL=127
Reply from 10.1.27.53: bytes=32 time=1ms TTL=127
Ping statistics for 10.1.27.53:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 1ms, Maximum = 22ms, Average = 6ms
C:\Users\sujina.a>ping 10.1.27.36
Pinging 10.1.27.36 with 32 bytes of data:
Reply from 10.1.27.36: bytes=32 time=5ms TTL=127
Reply from 10.1.27.36: bytes=32 time=1ms TTL=127
Reply from 10.1.27.36: bytes=32 time=1ms TTL=127
Reply from 10.1.27.36: bytes=32 time=1ms TTL=127
Ping statistics for 10.1.27.36:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 1ms, Maximum = 5ms, Average = 2ms
C:\Users\sujina.a>ping 10.1.26.39
Pinging 10.1.26.39 with 32 bytes of data:
Reply from 10.1.26.39: bytes=32 time=31ms TTL=128
Reply from 10.1.26.39: bytes=32 time=4ms TTL=128
Reply from 10.1.26.39: bytes=32 time<1ms TTL=128
Reply from 10.1.26.39: bytes=32 time<1ms TTL=128
Ping statistics for 10.1.26.39:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 31ms, Average = 8ms
C:\Users\sujina.a>ping 10.1.26.124
Pinging 10.1.26.124 with 32 bytes of data:
Reply from 10.1.26.124: bytes=32 time=1ms TTL=128
Reply from 10.1.26.124: bytes=32 time=8ms TTL=128
Reply from 10.1.26.124: bytes=32 time<1ms TTL=128
Reply from 10.1.26.124: bytes=32 time<1ms TTL=128
Ping statistics for 10.1.26.124:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 8ms, Average = 2ms
please check it...
>>how many requests are you doing?
We are making only 100 requests.
>>If it's only a few thousand, you probably don't have a large enough sampling size to get an accurate measurement.
We don't understand this point. Could you please explain what this sampling size is?
During our analysis we noticed the following:
Initially we had a cluster with one node (10.1.27.36) holding 'x' items. We added one more node (10.1.26.39) to this cluster. After rebalancing, the Membase console shows both the active item and replica item counts as 0 on node 10.1.26.39, while the active item and replica item counts on node 10.1.27.36 are 'x' and 'y'.
Is there a problem with the rebalancing or with the Membase console?
We need one more clarification. We are requesting an item from 100 clients; the item is an active item on 10.1.27.36, within a cluster made up of nodes 10.1.26.39 and 10.1.27.36. Our assumption is that the requests will be handled by both nodes (let us say 50 requests by 10.1.26.39 and 50 requests by 10.1.27.36). Is our assumption correct?
But when we checked the Membase console, we found that all the GET operations were performed on only one node. In this scenario, do we get any performance benefit from multiple nodes?
Sujina, those ping times look "okay" but not great. Something in the network is causing 4-8ms of delay which is going to greatly slow down Membase's performance. Considering we are expecting to return the information in 1-2ms, this is a 4x slowdown.
To your other questions,
-By sampling size, I am referring to the proper benchmarking technique of running the test across many thousands of requests, repeating it multiple times, and averaging the results. If you only have a small number of requests, a single "outlier" will cause the overall result to look much higher and be incorrect. Membase is designed to provide 1-2ms latency at the 99th percentile of requests.
-The item count information there sounds quite incorrect. Did rebalancing actually happen (did you see the progress bars in the UI)? Are you using any 'flush' commands? We've resolved a few bugs with that (specifically around replication) in the latest version (1.7.1), so I would try with that. It also seems that these nodes might be on separate subnets. Do you have firewalls or routers that might be interfering with the traffic being sent over? You need to ensure that the correct ports are open (http://www.couchbase.org/wiki/display/membase/Membase+Behind+a+2nd+Firewall)
-A given key is only "active" on one node at a time, regardless of replication. The performance and scale benefit that you get with Membase is from accessing many keys and all of those requests will be spread across multiple nodes.
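The last point can be sketched in Python. This is a simplified illustration, not the actual client implementation: the CRC32-modulo hash, the 1024 vBucket count, the round-robin map, and the names `vbucket_for`, `build_map`, and `node_for` are all assumptions made for the example. A real client uses the vBucket map published by the cluster.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # assumed vBucket count, for illustration only


def vbucket_for(key):
    """Map a key to a vBucket id (simplified: CRC32 modulo the vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(nodes):
    """Hypothetical vBucket map: assign vBuckets to nodes round-robin."""
    return [nodes[i % len(nodes)] for i in range(NUM_VBUCKETS)]


def node_for(key, vbucket_map):
    """Return the single node that holds the active copy of this key."""
    return vbucket_map[vbucket_for(key)]


nodes = ["10.1.27.36", "10.1.26.39"]
vmap = build_map(nodes)

# 100 GETs of the same key: every request lands on the same node.
same_key = Counter(node_for("item-42", vmap) for _ in range(100))

# 1000 GETs of different keys: requests can be shared between the nodes.
many_keys = Counter(node_for("item-%d" % i, vmap) for i in range(1000))

if __name__ == "__main__":
    print("same key: ", dict(same_key))
    print("many keys:", dict(many_keys))
```

Under these assumptions, a test that repeatedly reads one key exercises only one node no matter how many nodes are in the cluster. A test that reads many different keys is what lets additional nodes share the load.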
Perry, Thank you very much for the reply. Again I need some more clarifications.
1. I have multiple nodes in a Membase cluster (say 4 nodes). How do these nodes communicate with each other?
2. If I issue multiple store operations simultaneously, some of them fail. Is this due to a configuration mistake? (For example, of 10 requests only 7-8 are executed, and of 50 requests only 40-43.)
3. Our assumption is that the cache hit ratio should increase as more nodes are added to a cluster. Is our understanding correct? Could you please suggest a way to verify this? We stored 1000 items in Membase with 1 node and requested the same 1000 items. In the Membase web console, we noticed graphs for cache miss ratio and disk reads per second. When we tried this with multiple nodes, the cache miss ratio was roughly the same as before, but disk reads per second showed some variation. Could you please help us calculate the cache hit ratio from this?
What does "ep_bg_fetches" mean?
4. When we requested 100 items simultaneously, our test application sometimes failed with the error message "Membase has stopped working" and sometimes with the error "VBucketAwareOperationFactory does not have a default constructor". Why is this happening?
1. I have multiple nodes in a Membase cluster (say 4 nodes). How do these nodes communicate with each other?
[pk] - There's a layer of "management" communication on port 8091 for the Erlang processes...very low bandwidth. There will also be replication traffic which will account for much more of the actual network usage. This is based somewhat on your incoming write rate multiplied by the number of replicas you have. Gets are not replicated. Does that answer your question?
2. If I issue multiple store operations simultaneously, some of them fail. Is this due to a configuration mistake? (For example, of 10 requests only 7-8 are executed, and of 50 requests only 40-43.)
[pk] - This would not be expected if you're using the standard 'store' command (just a memcached 'set'). Can you post any error message you're getting from those?
3. Our assumption is that the cache hit ratio should increase as more nodes are added to a cluster. Is our understanding correct? Could you please suggest a way to verify this? We stored 1000 items in Membase with 1 node and requested the same 1000 items. In the Membase web console, we noticed graphs for cache miss ratio and disk reads per second. When we tried this with multiple nodes, the cache miss ratio was roughly the same as before, but disk reads per second showed some variation. Could you please help us calculate the cache hit ratio from this?
[pk] - I would expect that to be correct. Cache hit ratio describes how many requests are being serviced from RAM as opposed to disk...more RAM should increase that. However, if you're comparing 1 node to 2 nodes, you've likely not actually increased the amount of RAM available. This is because 2 nodes would make use of replication (effectively doubling the amount of data you're storing) and so not actually increase any capacity. To verify this, you can either a) add a third node or more, b) increase the RAM quota of the bucket instead of just adding nodes, or c) disable replication and compare 1 node to 2 nodes. Keep in mind that with 'b', any data already on disk would have to be read into RAM first...so you should repeat the test multiple times to load the cache.
What does "ep_bg_fetches" mean?
[pk] - That is the total count of items that were serviced from disk.
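As a rough sketch of the arithmetic behind these two answers, the Python below estimates a cache miss ratio from a disk-fetch count and a GET count, and models how much RAM is left for unique data once replicas are stored. Both function names are hypothetical, the inputs are example numbers rather than values from this cluster, and the capacity model is a simplification (it ignores metadata and other overhead).

```python
def cache_miss_ratio(bg_fetches, gets):
    """Fraction of GETs that had to be serviced from disk.

    Assumes bg_fetches is a disk-fetch count such as ep_bg_fetches and
    gets is the total number of GET requests over the same period.
    """
    if gets == 0:
        return 0.0
    return bg_fetches / gets


def ram_for_unique_data_mb(nodes, quota_mb_per_node, replicas):
    """Simplified model: total RAM quota divided by the copies kept per item.

    A replica needs a different node than the active copy, so a single-node
    cluster keeps only one copy even if one replica is configured.
    """
    copies = 1 + min(replicas, nodes - 1)
    return nodes * quota_mb_per_node / copies


if __name__ == "__main__":
    miss = cache_miss_ratio(bg_fetches=50, gets=1000)
    print("miss ratio:", miss, "hit ratio:", 1 - miss)
    for n in (1, 2, 3):
        print(n, "node(s):", ram_for_unique_data_mb(n, 1024, replicas=1), "MB")
```

In this model, going from 1 node to 2 nodes with one replica leaves the RAM available for unique data unchanged, and the third node is the first one that adds capacity. That lines up with the suggestion to test with three or more nodes, a larger RAM quota, or replication disabled.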
4. When we requested 100 items simultaneously, our test application sometimes failed with the error message "Membase has stopped working" and sometimes with the error "VBucketAwareOperationFactory does not have a default constructor". Why is this happening?
[pk] - That sounds like an error in the client, can you ensure that you are using the latest version and post logs if it still exists?
Thanks!
Perry
Definitely seems like something is off there. When using the latest Enyim client (and the Membase configuration section), you don't need to be concerned with Moxi at all.
Can you post your app.config or other configuration for the client?
Perry