Scaling index reads (Performance)

Fri, 08/26/2011 - 00:32
oliver
Joined: 08/26/2011

Hi,

I'm very excited about Couchbase 2.0, and I'm investigating the possibility of replacing Cassandra with Membase in our current stack. After running Cassandra in production for some time, we've seen its shortcomings firsthand and want to find a better solution.

The new indexing and query features in Couchbase are very appealing. However, I'm curious about the performance implications of querying the indexes, and I haven't been able to find any detailed documentation or descriptions of a) the implementation or b) its current performance/scaling characteristics.

I've read somewhere that the system uses scatter-gather, but without any further detail, which leads me to believe that the implementation works something like this:

- Each server in the cluster maintains an index of the objects residing on that server (built incrementally by the underlying CouchDB store).
- A query is sent to all servers (either by a smart client library or from inside the server); this is the scatter phase. The results are then gathered on one machine and sent back to the client with irrelevant data removed.
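
The two steps above can be sketched roughly as follows. This is only a toy model of the scatter-gather idea as described in this post, not Couchbase's actual implementation; the node names, the `NODE_INDEXES` data and the functions `query_node` and `scatter_gather` are all made up for illustration.

```python
import heapq
from itertools import islice

# Hypothetical per-node indexes: each node holds a sorted list of
# (index_key, doc_id) pairs covering only the documents stored on that node.
NODE_INDEXES = {
    "node-a": [("apple", "doc1"), ("kiwi", "doc7"), ("pear", "doc4")],
    "node-b": [("banana", "doc2"), ("mango", "doc9")],
    "node-c": [("cherry", "doc3"), ("lemon", "doc5"), ("plum", "doc8")],
}


def query_node(index, start_key, end_key):
    """Work done on one node: return its entries with start_key <= key <= end_key."""
    return [(key, doc_id) for (key, doc_id) in index if start_key <= key <= end_key]


def scatter_gather(start_key, end_key, limit):
    """Send the range query to every node, then merge the sorted partial results."""
    # Scatter phase: every node in the cluster does some work for every query.
    partials = [
        query_node(index, start_key, end_key) for index in NODE_INDEXES.values()
    ]
    # Gather phase: k-way merge of the already sorted per-node results.
    merged = heapq.merge(*partials)
    # Rows beyond the requested limit are dropped before replying to the client.
    return list(islice(merged, limit))


result = scatter_gather("b", "m", limit=3)
print(result)
```

Note that each node may return up to `limit` rows that the gathering machine then throws away, which is where the "irrelevant data removed" part comes from.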

So, actual questions:
a) Is that how it works?
b) If yes, does that not imply that we will never be able to scale index reads beyond the capabilities of the smallest box in the cluster (and also that index reads will use processing on ALL machines in the cluster)?
c) If yes, is there a roadmap or similar for moving to some sort of distributed indexes, where index reads don't go across the entire cluster but scale linearly when adding servers, like the underlying key/value system?
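
To make the concern in b) concrete, here is a back-of-the-envelope model. It rests on simplifying assumptions that are not from any Couchbase documentation: keys are hashed evenly across nodes, every index query costs every node one unit of work regardless of cluster size, and the capacity numbers are invented.

```python
# Hypothetical per-node capacities, in requests per second.
node_capacity = {"node-a": 1000, "node-b": 1000, "node-c": 400}


def kv_read_ceiling(capacity):
    """Key/value reads: each read hits one node, and evenly hashed keys give
    every node 1/N of the traffic, so the weakest node saturates at N * min."""
    return len(capacity) * min(capacity.values())


def scatter_gather_ceiling(capacity):
    """Index reads: each query hits every node, so the weakest node
    caps the whole cluster at its own capacity."""
    return min(capacity.values())


before = (kv_read_ceiling(node_capacity), scatter_gather_ceiling(node_capacity))

# Add a fourth node and compare again.
bigger = dict(node_capacity, **{"node-d": 1000})
after = (kv_read_ceiling(bigger), scatter_gather_ceiling(bigger))

print(before)  # (1200, 400)
print(after)   # (1600, 400)
```

Under these assumptions, adding a node raises the key/value ceiling but leaves the index-read ceiling where it was. In practice the per-node index shrinks as nodes are added, so the real picture may be less bleak than this model suggests.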

Thank you,
Oliver

Fri, 08/26/2011 - 06:57
jan
Joined: 02/15/2011

Hi Oliver,

thanks for writing. You've got it mostly right. In detail:

a) Currently, yes.
b) Correct. I think we recommend not under-provisioning cluster nodes, so that you get a good baseline speed there.
b.1) Correct again.
c) Totally, we are currently researching different ways to improve the current system. We are focusing on making views as simple, fast and elastic (excuse the marketing spiel, I think it fits nicely here :) as you'd expect from us. Areas of improvement include caching results, caching view ranges (does the scatter-gather have to ask a node at all?), smarter and more efficient per-node data aggregation, and moving some query-time computations to index time.

Let us know if you have any other questions.

Cheers
Jan
--

Fri, 08/26/2011 - 12:07
oliver

Hey Jan,

Thanks for the detailed reply (and subreplies; you're a programmer, right?)

Are there any benchmarks/performance tests available that can give us a sense of the performance limits of the current index query system?

With regard to c), what about distributing each index across vbuckets (or similar), with some sort of overall understanding of which range of index data is in each vbucket?
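
A rough sketch of what I mean, assuming index entries are placed by key range instead of living next to their documents. The split points, the partition-to-node mapping and the function names are all hypothetical, and this is not how Couchbase works today:

```python
from bisect import bisect_right

# Hypothetical split points: partition 0 holds keys < "g", partition 1 holds
# keys in ["g", "n"), partition 2 holds ["n", "t"), partition 3 holds >= "t".
SPLITS = ["g", "n", "t"]
PARTITION_OWNER = ["node-a", "node-b", "node-c", "node-d"]


def partition_for(key):
    """Return the number of the partition whose key range contains `key`."""
    return bisect_right(SPLITS, key)


def nodes_for_range(start_key, end_key):
    """Return only the nodes whose partitions overlap [start_key, end_key]."""
    first = partition_for(start_key)
    last = partition_for(end_key)
    return PARTITION_OWNER[first:last + 1]


print(nodes_for_range("h", "k"))  # ['node-b']
print(nodes_for_range("e", "p"))  # ['node-a', 'node-b', 'node-c']
```

A narrow range query would then touch a single node instead of the whole cluster. The obvious trade-off is that index updates have to travel to whichever node owns the key range, which may not be the node holding the document.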

Again, thanks for the reply!

Best,
Oliver

Mon, 08/29/2011 - 05:19
jan

Hi Oliver,

> Thanks for the detailed reply (and subreplies; you're a programmer, right?)

Yea :)

> Are there any benchmarks/performance tests available that can give us a sense of the performance limits of the current index query system?

Not currently, but we are working on them. You can make your own, too :)

> With regard to c), what about distributing each index across vbuckets (or similar), with some sort of overall understanding of which range of index data is in each vbucket?

We are currently exploring bigger-bang-for-the-buck opportunities, but I'm sure we'll look into this as well at some point ;)

Cheers
Jan
--
