How does indexing the documents impact the other async operations in Couchbase?

Thu, 02/28/2013 - 16:25
shahshi15

Greetings,

I have a couple of questions regarding indexing in Couchbase buckets.

1) How is indexing related/proportional to the number of documents in the bucket?
2) Whenever the indexing happens (on a regular time interval or on stale=false when querying views), does Couchbase index ALL the documents in the bucket or only the documents that are recently added/updated/deleted?

Thanks for the input in advance!
Shivang

Sat, 03/02/2013 - 13:33
mikew

In Couchbase 2.0 the indexer and Couchbase engine are separate components. Let's go through two scenarios of how indexing would take place and I think that should give a good answer to your questions.

In the first scenario, let's say you have just created a new Couchbase cluster and have not put any documents in the database yet, but you have already created a view. When you add your first document to Couchbase, that document must hit disk before it is eligible to be read by the indexer. When the document is written it receives a unique sequence number. These sequence numbers start at 0 and increase monotonically each time a new or updated document is written to disk. The sequence number is also passed to the indexer, and this is how the indexer knows whether or not it has indexed the latest documents written to disk. So in this example, when your cluster contained no documents but had a single view created, the indexer had sequence number 0 as its latest indexed position. If you called the view with stale=false at this point, it would return an empty result. When you add your first document and it is written to disk, the indexer is notified that the latest sequence number is 1. If you call the view with stale=false now, the indexer will index all of the items that it has not indexed yet, so in this case it will index the single document you added. Indexing is incremental, so you only index the documents that have been added or updated since the last time the indexer ran.
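For reference, a stale=false query is an ordinary HTTP GET against the view. The following is only a minimal sketch of building such a URL in Python. The host, bucket name (`default`), design document name (`users`), and view name (`by_name`) are hypothetical placeholders, and port 8092 is assumed to be the view port of your node.

```python
from urllib.parse import urlencode


def view_url(host, bucket, design_doc, view, stale="false", port=8092):
    """Build a view query URL; all argument values are illustrative placeholders."""
    # stale=false asks the indexer to catch up before the result is returned.
    query = urlencode({"stale": stale})
    return f"http://{host}:{port}/{bucket}/_design/{design_doc}/_view/{view}?{query}"


# Hypothetical names, only to show the shape of the request.
print(view_url("localhost", "default", "users", "by_name"))
```

Passing `stale="ok"` instead would return whatever is already in the index without triggering an update first.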

Now for the second scenario, let's say you've been running your cluster for a while, you have thousands of documents, and you create a new view. In this case the indexer will start out at sequence number 0, but it will know that the latest sequence number is, say, 100,000. The indexer will then have to index everything written up to sequence number 100,000 before it is caught up.
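The two scenarios can be illustrated with a toy model. This is not how Couchbase is implemented internally; it is just a sketch of the sequence-number idea, and the class names `Bucket` and `ViewIndex` are made up for this example. A write here is treated as already persisted to disk.

```python
class Bucket:
    """Toy stand-in for on-disk storage: each write gets the next sequence number."""

    def __init__(self):
        self.latest_seq = 0
        self.by_seq = {}   # sequence number -> (doc_id, doc)
        self.seq_of = {}   # doc_id -> sequence number of its latest write

    def write(self, doc_id, doc):
        # An update replaces the old entry, so each document appears once,
        # under the sequence number of its most recent write.
        old_seq = self.seq_of.get(doc_id)
        if old_seq is not None:
            del self.by_seq[old_seq]
        self.latest_seq += 1
        self.by_seq[self.latest_seq] = (doc_id, doc)
        self.seq_of[doc_id] = self.latest_seq

    def changes_since(self, seq):
        """Return (seq, doc_id, doc) for everything written after `seq`, in order."""
        return [(s, *self.by_seq[s]) for s in sorted(self.by_seq) if s > seq]


class ViewIndex:
    """Toy index that remembers the last sequence number it has processed."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.last_indexed_seq = 0
        self.rows = {}     # doc_id -> key emitted by map_fn

    def update(self, bucket):
        """Index only the changes after last_indexed_seq; return how many were processed."""
        changes = bucket.changes_since(self.last_indexed_seq)
        for seq, doc_id, doc in changes:
            self.rows[doc_id] = self.map_fn(doc)
            self.last_indexed_seq = seq
        return len(changes)


bucket = Bucket()
early_view = ViewIndex(lambda doc: doc["name"])   # view created before any writes

bucket.write("user1", {"name": "alice"})
print(early_view.update(bucket))   # scenario 1: processes the 1 new document

for i in range(1000):
    bucket.write(f"doc{i}", {"name": f"name{i}"})

late_view = ViewIndex(lambda doc: doc["name"])    # view created after the writes
print(late_view.update(bucket))    # scenario 2: processes all 1001 documents
print(early_view.update(bucket))   # only the 1000 documents it had not seen
```

The point of the sketch is that the work done by `update` depends on how far the index is behind the latest sequence number, not on the total size of the bucket.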

So now to address your questions.
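1) Indexing is incremental. Couchbase only indexes documents that have not yet passed through a view's index. Updating an index over millions of documents is no more complex than updating an index over hundreds of documents, because the work depends on what has changed since the last update rather than on the total number of documents in the bucket.
2) Only the documents added or updated since the last index update are indexed.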

Wed, 03/06/2013 - 11:07
shahshi15

Mike!

This is as clear as it can get! Thank you so much for the response. I really appreciate the input.

Shivang
