Single node membase

Tue, 05/24/2011 - 11:30
arash

Hi,
We have a small infrastructure and I am looking for a key-value database solution; the first thing that comes to mind is Membase. I understand that this solution will let us scale much more easily in the future, when we have more boxes for this purpose. My question is: does Membase provide any other advantage (speed, throughput, etc.) on a single node compared with an RDBMS?
Thanks,
arash

Tue, 05/24/2011 - 15:41
perry

Performance, scalability and schema-free development are the main advantages over an RDBMS. I can go into more detail, but if a high performance, highly available, scalable database is what you're looking for, Membase is definitely the right choice.

Can you go into some more details about your use case and what your requirements are in terms of sizing and performance?

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Wed, 05/25/2011 - 03:03
sujina

I went through the answer. It mentions that Membase is scalable, but scalable in terms of memory or in terms of number of servers? When we add more servers the total bucket size increases, but the extra memory is used to store the same data again to support replication. Is my understanding correct?

I want to use Membase to get high performance and availability. Initially I wanted to use memcached, which gives high performance since the data is retrieved from cache, but I cannot store data items larger than 1MB. I tried the latest version, memcached 1.4.5, but it failed. I have now chosen Membase since it allows storing data items of up to 20MB, but I'm confused about its memory consumption and performance.

Could you please tell me how it achieves high performance?

What should I do to avoid persistent storage? I'm using an Oracle database as the back-end storage system and Membase as a cache in front of it, so that the next read is served from the cache and speed increases. Does Membase meet my requirement?

Also, while using Membase I get an error saying that VbucketNodeLocater does not have a default constructor when I try to retrieve data items from the slave server after stopping the service on the master. Is there any fix?

Please help.

Fri, 05/27/2011 - 11:14
perry

Answers inline:

> I went through the answer. It mentions that Membase is scalable, but scalable in terms of memory or in terms of number of servers? When we add more servers the total bucket size increases, but the extra memory is used to store the same data again to support replication. Is my understanding correct?
[pk] - I mean both memory and number of servers. As you add more servers, the total amount of RAM available to the cluster (and each bucket) will increase. When you go from 1 to 2 servers, replication will kick in (if it is configured) and so you won't really gain much space. Adding more and more servers will spread all the data (both active and replica) across the cluster which increases your overall capacity.
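
As a rough illustration of the capacity point above, here is a minimal sketch of the arithmetic. It is not Membase code: the function name and the 8 GB per node figure are made up for the example, and it assumes every copy (active and replica) is held fully in RAM, which is a simplification.

```python
def usable_ram_gb(nodes: int, ram_per_node_gb: float, replicas: int) -> float:
    """Approximate RAM left for unique (active) data in a cluster.

    Hypothetical helper for illustration only; assumes all copies stay in RAM.
    """
    # A single node has nowhere to place a replica, so no replica is kept.
    effective_replicas = min(replicas, nodes - 1)
    total_ram_gb = nodes * ram_per_node_gb
    # Each item is held once as active data plus once per replica copy.
    return total_ram_gb / (1 + effective_replicas)


for n in (1, 2, 3, 4):
    print(n, "node(s):", usable_ram_gb(n, ram_per_node_gb=8, replicas=1), "GB")
```

Under these assumptions, going from 1 to 2 nodes leaves the space for unique data unchanged (the new RAM holds the replica), while each node added after that increases it.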

When I referred to "scalability", I really meant that the software in general is much more scalable than an RDBMS. With Oracle/MySQL/etc you don't have the ability to simply add more servers to increase capacity...you have to go through a potentially painful sharding process that requires taking down the application and making code changes. Membase removes all of this pain.

> I want to use Membase to get high performance and availability. Initially I wanted to use memcached, which gives high performance since the data is retrieved from cache, but I cannot store data items larger than 1MB. I tried the latest version, memcached 1.4.5, but it failed. I have now chosen Membase since it allows storing data items of up to 20MB, but I'm confused about its memory consumption and performance.
> Could you please tell me how it achieves high performance?
[pk] - I'm not sure exactly what your question is here. You are correct that Membase allows you to store items larger than 1MB (a limitation of memcached), but this doesn't really change the performance it provides. We have made a number of improvements to memory usage (removing memcached's slab allocator is a major one) but continue to serve data in much the same way as memcached, which is where our high performance comes from.

In general, I would recommend trying to keep your items as small as possible. This will decrease the amount of time it takes to transfer them over the network. While we support larger items, you will get better performance from smaller ones.

> What should I do to avoid persistent storage? I'm using an Oracle database as the back-end storage system and Membase as a cache in front of it, so that the next read is served from the cache and speed increases. Does Membase meet my requirement?
[pk] - Yes, it sounds like it does meet your requirements. Membase can be used both as a persistent database and as a persistent cache. If you don't NEED to store the data in Oracle, then you may be able to use Membase only. Since we keep as much data in RAM as possible, you can still use Membase as a cache.
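
For the "cache in front of Oracle" setup described in the question, the usual read path is a cache-aside lookup. The sketch below shows only the pattern: the `CacheAside` class is hypothetical, a plain dict stands in for the Membase bucket, and `load_from_db` stands in for whatever query you run against Oracle. No real client API is used.

```python
class CacheAside:
    """Read-through helper: try the cache first, fall back to the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # dict stand-in for a Membase bucket
        self.load_from_db = load_from_db  # hypothetical function that queries Oracle

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value                  # cache hit: the database is not touched
        value = self.load_from_db(key)    # cache miss: go to the database
        if value is not None:
            self.cache[key] = value       # populate the cache for next time
        return value


# Usage with in-memory stand-ins for both systems.
db_rows = {"user:1": "arash"}
db_calls = []

def fake_oracle_lookup(key):
    db_calls.append(key)
    return db_rows.get(key)

store = CacheAside(cache={}, load_from_db=fake_oracle_lookup)
store.get("user:1")   # first read goes to the database
store.get("user:1")   # second read is served from the cache
print(len(db_calls))  # number of database lookups performed
```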

> Also, while using Membase I get an error saying that VbucketNodeLocater does not have a default constructor when I try to retrieve data items from the slave server after stopping the service on the master. Is there any fix?
> Please help.
[pk] - The current version does not perform "automatic failover" so when you stop one of the servers, you need to press the "failover" button in the UI. Future versions will implement this automatically.

Let me also say that Membase does not have a "master" and "slave". All nodes are equal and all are serving their own dataset while also storing replica data from other servers.
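
To make the "all nodes are equal" and failover points concrete, here is a toy model. It is not the server's implementation: the `fail_over` function and the map layout are assumptions made for the example, with each vbucket recorded as an (active node, replica node) pair.

```python
def fail_over(vbucket_map, failed_node):
    """Return a new map in which the failed node no longer appears.

    Toy model only: each entry is (active_node, replica_node).
    """
    new_map = []
    for active, replica in vbucket_map:
        if active == failed_node:
            # The replica copy is promoted to active; no replica remains.
            new_map.append((replica, None))
        elif replica == failed_node:
            # The active copy is unaffected but has lost its replica.
            new_map.append((active, None))
        else:
            new_map.append((active, replica))
    return new_map


# Both nodes are active for some vbuckets and hold replicas for the others.
vbucket_map = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]
after = fail_over(vbucket_map, "A")
print(after)
```

In this model neither node is a "master": each one is active for half of the vbuckets, and failing over one node simply makes the other active for all of them.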

Perry

Thu, 06/09/2011 - 05:22
prasad

Edit: Double-posted at: http://www.couchbase.org/forums/thread/membase-bucket-performance-concer...

See above thread for response

Perry Krug

Tue, 07/05/2011 - 07:09
prasad

>> ...Adding more and more servers will spread all the data (both active and replica) across the cluster which increases your overall capacity.

How does it increase the overall capacity (both RAM and disk)?
When I joined two nodes into a cluster, the overall RAM quota and disk usage increased. However, whenever new store operations come in, Membase keeps the new data in both RAM and disk on both nodes. So, as I understand it, there is no benefit in memory usage.

If you can suggest a better configuration to get the benefit of memory and performance, that would be useful.
I mean, is there any way to free the memory used by a replica once the data has been saved to disk, without data loss?

>> ...Also there is no partial sharing of data between nodes such that, when a get operation comes in, data can be retrieved in parallel from both. So there is no performance gain from adding more nodes.

How can I get a performance benefit, rather than just replication, from a joined cluster?

or
Is the performance benefit only there when multiple client requests come in? Does it choose a free node when a client request comes in? Which algorithm is used to choose the server when a request comes in? If store and replication happen simultaneously, all nodes are always busy. Is that right? Or, if a node is busy, is the replication task kept as a pending task?

Which is better: a joined cluster configuration, or multiple single-node clusters?

Could you please reply?

Tue, 07/05/2011 - 15:48
steve

>> ...Adding more and more servers will spread all the data (both active and replica) across the cluster which increases your overall capacity.
> How does it increase the overall capacity (both RAM and disk)?
> When I joined two nodes into a cluster, the overall RAM quota and disk usage increased.
> However, whenever new store operations come in, Membase keeps the new data in both RAM and disk on both nodes.
> So, as I understand it, there is no benefit in memory usage.

Hi Prasad,

Certainly when you grow from a single node to two nodes, and have a replica count of 1, it may initially seem like nothing's changed. However, with the additional RAM & disk that the new node brings, you'll gain the benefits of actual replication kicking in.

In addition, requests for items are spread across two nodes. Node A, for example, would be the primary server for half of the vbuckets, so requests for those vbuckets would go to Node A. And, Node B (the node you just added to the cluster) would be the primary server for the other half of the vbuckets. So requests for those vbuckets would go to Node B.

Imagine for a second that your application has a read-heavy access pattern. The GETs are now split across twice the number of CPUs, memory channels, and network interfaces, so you'll get the benefit of that extra hardware.
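
A small sketch of that split, purely for illustration: the `assign_vbuckets` helper, the round-robin layout, and the count of 1024 vbuckets are assumptions made for the example, not how the cluster actually lays out its map.

```python
from collections import Counter


def assign_vbuckets(num_vbuckets, nodes):
    """Assign each vbucket a primary node, round-robin (illustrative layout only)."""
    return [nodes[v % len(nodes)] for v in range(num_vbuckets)]


one_node = Counter(assign_vbuckets(1024, ["A"]))
two_nodes = Counter(assign_vbuckets(1024, ["A", "B"]))

print(dict(one_node))   # every vbucket, and so every request, lands on A
print(dict(two_nodes))  # A and B are each primary for half of the vbuckets
```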

> If you can suggest a better configuration to get the benefit of memory and performance, that would be useful.
> I mean, is there any way to free the memory used by a replica once the data has been saved to disk, without data loss?

Membase handles this automatically for you. That is, there is no API call for telling Membase to free the memory used by replicas after the replicated items have been persisted to disk.

>> ...Also there is no partial sharing of data between nodes such that, when a get operation comes in,
>> data can be retrieved in parallel from both.
>> So there is no performance gain from adding more nodes.

Just to clarify, I hope it's clear there is performance gain by adding more nodes.

> How can I get a performance benefit, rather than just replication, from a joined cluster?
> or
> Is the performance benefit only there when multiple client requests come in?

The performance benefit is certainly clearer when there are lots of concurrent requests on multiple connections, all making demands of a Membase cluster. That is, if you have 50+ web app servers, each with 20 to 200+ worker processes, then all that concurrent client demand would benefit from hitting a Membase cluster of 2 nodes rather than a cluster of 1 node.

> Does it choose a free node when a client request comes in?
> Which algorithm is used to choose the server when a request comes in?

Membase uses a hashing algorithm that's described here...

http://dustin.github.com/2010/06/29/memcached-vbuckets.html

Also, more info here...

http://www.couchbase.org/wiki/display/membase/vBuckets

In that, you'll see that Membase clients (and the moxi proxy) do not choose a free node when a client request comes in. Instead, the client (or moxi proxy) chooses the primary server that owns the vbucket where the item should live.
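
In outline, the lookup goes key -> vbucket -> server. The sketch below illustrates that idea only: it uses a plain CRC32 modulo the vbucket count, which is not necessarily the exact hash a real client applies, and the vbucket count, node names, and map layout are made up for the example.

```python
import zlib

NUM_VBUCKETS = 1024  # illustrative value; the real count comes from the cluster


def vbucket_for(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vbucket id (simplified: CRC32 modulo the vbucket count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def server_for(key: str, vbucket_map) -> str:
    """Look up the primary server for a key; the node's load is never considered."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


# Hypothetical two-node map: even vbuckets on node A, odd vbuckets on node B.
vbucket_map = ["A" if v % 2 == 0 else "B" for v in range(NUM_VBUCKETS)]

for key in ("user:1001", "user:1002", "session:abc"):
    print(key, "-> vbucket", vbucket_for(key), "-> node", server_for(key, vbucket_map))
```

The same key always maps to the same vbucket. When nodes are added, it is the vbucket-to-server table that changes, not the hash of the key.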

> If store and replication happen simultaneously, all nodes are always busy. Is that right?
> Or, if a node is busy, is the replication task kept as a pending task?

Yes, if I understand the question correctly -- when you put load onto a Membase cluster, it will have asynchronous persistence and asynchronous replication activity in the background.
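
As a loose model of what "asynchronous" means here, consider the toy class below. It is not how the server is implemented: the class and attribute names are invented, and an explicit `drain()` call stands in for the background work so that the example stays deterministic.

```python
from collections import deque


class WriteBehindNode:
    """Toy model: a set is acknowledged from RAM; disk and replica catch up later."""

    def __init__(self):
        self.ram = {}           # what clients read and write
        self.disk = {}          # stand-in for the persistence layer
        self.replica_ram = {}   # stand-in for RAM on the replica node
        self.pending = deque()  # keys waiting to be persisted and replicated

    def set(self, key, value):
        self.ram[key] = value
        self.pending.append(key)  # queued, not yet on disk or on the replica
        return "STORED"           # the client is answered immediately

    def drain(self):
        """Do the queued background work (the real thing runs continuously)."""
        while self.pending:
            key = self.pending.popleft()
            self.disk[key] = self.ram[key]
            self.replica_ram[key] = self.ram[key]


node = WriteBehindNode()
node.set("k", "v")
print(len(node.pending), "k" in node.disk)  # work is queued, nothing persisted yet
node.drain()
print(len(node.pending), "k" in node.disk)  # queue is empty, data is on disk
```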

> Which is better: a joined cluster configuration, or multiple single-node clusters?

Membase was designed to scale by having its nodes clustered. That is a key feature of Membase and how it was meant to be used, as opposed to running individual nodes that are not clustered.

You could always have more than one cluster, of course. Some users deploy an independent cluster of Membase for each one of their online properties, which is a useful and pragmatic approach.

Cheers,
Steve
