How can I tell couchbase to purge old data that has not expired yet?

Wed, 07/04/2012 - 19:56
ckilborn
Offline
Joined: 03/29/2012
Groups: None

I need to clean out my data set and I want to get rid of old data that has not expired yet.

Thu, 07/05/2012 - 13:13
mikew
Offline
Joined: 03/14/2011
Groups:

Without an expiration time you will have to do this manually by deleting all unwanted items. If you don't need any of the data in your cluster then you can do a flush_all.
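
A minimal sketch of what "deleting manually" could look like, assuming the application keeps its own record of each key and when it was last written (nothing in this thread says the server can list items by age). `KeyValueClient`, `InMemoryClient`, and `purge_older_than` are hypothetical names used only for illustration; a real client library's delete call would take the place of the stand-in.

```python
import time
from typing import Dict, List, Optional, Protocol


class KeyValueClient(Protocol):
    """Hypothetical interface: anything that exposes delete(key)."""

    def delete(self, key: str) -> None: ...


class InMemoryClient:
    """Stand-in for a real client so the sketch runs on its own."""

    def __init__(self, data: Dict[str, str]) -> None:
        self.data = dict(data)

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op here.
        self.data.pop(key, None)


def purge_older_than(
    client: KeyValueClient,
    last_written: Dict[str, float],
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Delete every key whose recorded write time is older than the cutoff.

    last_written is the application's own key -> Unix timestamp index
    (an assumption of this sketch). Returns the keys that were deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    stale = [key for key, ts in last_written.items() if ts < cutoff]
    for key in stale:
        client.delete(key)
        del last_written[key]
    return stale


# Usage with made-up data: purge anything older than 7 days.
DAY = 86_400
client = InMemoryClient({"a": "1", "b": "2", "c": "3"})
index = {"a": 0.0, "b": 5 * DAY, "c": 9 * DAY}
removed = purge_older_than(client, index, max_age_seconds=7 * DAY, now=10 * DAY)
print(removed, sorted(client.data))
```

The cost of this approach is the extra bookkeeping: the index has to be updated on every write and stored somewhere that survives restarts.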

Thu, 07/05/2012 - 13:21
ckilborn
Offline
Joined: 03/29/2012
Groups: None

Thanks for the reply.

I don't want to delete all my items.

My items have an expiration date. Is there any way to tell Couchbase to delete items that are scheduled to expire after a certain date?

Sun, 07/08/2012 - 20:02
mikew
Offline
Joined: 03/14/2011
Groups:

There isn't a command that allows you to do this in Couchbase at the moment.

Wed, 07/11/2012 - 21:04
ckilborn
Offline
Joined: 03/29/2012
Groups: None

This really needs to happen - I can't just keep throwing hardware at Couchbase.

I understand that items are removed when they expire, but there has to be another way to control the number of items in Couchbase.

Wed, 07/11/2012 - 21:25
mikew
Offline
Joined: 03/14/2011
Groups:

What is your use case for needing a command like this? Can you also explain what hardware you are adding to your cluster? And why did you set such a high expiration time in the first place?

Thu, 07/12/2012 - 08:50
ckilborn
Offline
Joined: 03/29/2012
Groups: None

We have a very diverse set of data that we are caching. Our document set is growing very large because of increased traffic. We want to delete items in Couchbase that are x days old or, ideally, haven't been accessed in x days. Currently we have to keep throwing more hardware at Couchbase in order to avoid OOM errors and high cache miss ratios.

If we limited the amount of disk storage for each node, would this help us? Would Couchbase then evict data that was old and inactive but not expired?

Thu, 07/12/2012 - 10:06
mikew
Offline
Joined: 03/14/2011
Groups:

Your old and inactive data will be evicted to disk if it has not been used recently. We currently require every item to keep its metadata in memory, and the metadata for an evicted item is 72 bytes plus the key size, so the number of items you store does carry a permanent memory overhead. When memory usage approaches 80%, we evict 15% of the items to disk, which makes room for newly used items.

Can you describe your workload and cluster setup? How many items? Number of gets/sets per second? Memory allocated for your bucket? Number of servers and number of replicas?
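
To make the permanent overhead concrete, here is a small sketch of the arithmetic, using the 72 bytes plus key size figure from the post above. The item count and key size are illustrative values, not numbers from this thread, and the function name is hypothetical.

```python
# Per-item metadata that stays in memory even after the value is evicted,
# per the figure quoted in this post.
METADATA_BYTES_PER_ITEM = 72


def resident_metadata_bytes(item_count: int, avg_key_bytes: int) -> int:
    """Memory that cannot be reclaimed by eviction: metadata plus the key."""
    return item_count * (METADATA_BYTES_PER_ITEM + avg_key_bytes)


# Illustrative numbers only.
items = 10_000_000
avg_key_bytes = 40
total = resident_metadata_bytes(items, avg_key_bytes)
print(f"{total / 2**30:.2f} GiB stays in memory regardless of eviction")
```

Because the key is part of the resident cost, shorter keys lower this floor, and only deleting or expiring items removes it entirely.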

Thu, 07/12/2012 - 10:39
ckilborn
Offline
Joined: 03/29/2012
Groups: None

number of items 65,000,000
overhead (bytes) 150
average key size (bytes) 200
average object size (bytes) 1024

data needed (GiB) 83.18
working set (%) 30.00%
working set memory needed (GiB) 24.95

Nodes 8
Sizes per Node 4GB
Total 32GB

Replicas 1

Avg # of ops/sec - 3k
Avg # of gets/sec 2.7k
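
For anyone checking the sizing above, this sketch reproduces the "data needed" and "working set" lines from the listed figures, and also applies the 72 bytes plus key size metadata figure quoted earlier in the thread to the same item count. Two assumptions are made here: "4GB" is read as 4 GiB of memory per node, and the metadata line counts active items only (whether replica copies add the same amount again is not stated in the thread).

```python
GIB = 2**30

# Figures as listed in this post.
items = 65_000_000
overhead_bytes = 150
avg_key_bytes = 200
avg_value_bytes = 1024
working_set_fraction = 0.30
nodes = 8
memory_per_node_gib = 4  # assumption: "4GB" read as 4 GiB of memory per node

# Total data: every item pays overhead + key + value.
data_gib = items * (overhead_bytes + avg_key_bytes + avg_value_bytes) / GIB
working_set_gib = data_gib * working_set_fraction

# Resident metadata for active items only (72 bytes + key size per item,
# the figure quoted earlier in the thread).
metadata_gib = items * (72 + avg_key_bytes) / GIB

total_memory_gib = nodes * memory_per_node_gib

print(f"data needed:       {data_gib:.2f} GiB")
print(f"working set:       {working_set_gib:.2f} GiB")
print(f"resident metadata: {metadata_gib:.2f} GiB")
print(f"cluster memory:    {total_memory_gib} GiB")
```

Under these assumptions the metadata alone takes up roughly half of the cluster's memory before any values are cached, which is worth comparing against the working-set estimate.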

Thu, 07/12/2012 - 18:31
mikew
Offline
Joined: 03/14/2011
Groups:

It looks like your setup is OK. If your working set does fit into memory, then cache misses should only happen when newly used data is cycling into Couchbase. Also, removing old items is unlikely to vastly improve performance, since the memory reclaimed would be relatively small. The only way to improve performance here would be to add more memory, but as you mentioned, your working set already fits into memory, so this probably isn't necessary at the moment.

Also, how high is your cache miss ratio?

Thu, 07/12/2012 - 20:40
ckilborn
Offline
Joined: 03/29/2012
Groups: None

Cache miss ratio is approx 3

We are very frustrated that we need to schedule downtime every time we have to rebalance, whether after adding or removing a node. All our boxes have SSDs and the process still takes over 30 minutes, and even after that there is still a huge disk write queue.

Fri, 07/13/2012 - 13:49
ckilborn
Offline
Joined: 03/29/2012
Groups: None

It would be awesome if Couchbase 1.8.1 had a community edition, since it fixes so many rebalancing issues.

When is that planned?
