CouchDB metadata
CouchDB metadata is about 120 bytes per document
Since we have a lot of documents to index (1000 per second), we build up about 10GB of CouchDB metadata per bucket *per day* (not counting the real data), all of which must be kept in memory. Originally we thought that small documents were in the comfort zone of Couchbase, but while memcached handles small documents well, CouchDB does not.
This means that we need to manually compact many documents into a single one to reduce the storage overhead on Couchbase. Unfortunately, while CouchDB does fine with large documents, memcached does not; in particular, the 20MB limit means that if we want to go bigger we fall outside the spec of current CouchDB. What would be the best way of getting the best of both worlds?
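For anyone checking the numbers, here is a rough sketch of the arithmetic behind the 10GB-per-day figure, and of how much grouping several records into one stored document would cut it. The function name and the `records_per_doc` parameter are made up for illustration; the 1000 documents per second and 120 bytes per document come from the post above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def metadata_bytes_per_day(docs_per_second, metadata_bytes_per_doc=120, records_per_doc=1):
    """Metadata accumulated in one day.

    records_per_doc is a hypothetical batching factor: how many incoming
    records are merged into a single stored document.
    """
    stored_docs = docs_per_second * SECONDS_PER_DAY / records_per_doc
    return stored_docs * metadata_bytes_per_doc


# Compare no batching with two illustrative batching factors.
for n in (1, 10, 100):
    gb = metadata_bytes_per_day(1000, records_per_doc=n) / 1e9
    print(f"{n:>4} records per document: {gb:.2f} GB of metadata per day")
```

With no batching this comes to a little over 10GB per day, and the overhead shrinks in proportion to the number of records merged into each document.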
Hi natalinobusa
Before I answer your question, I just wanted to clarify whether you are asking about CouchDB or Couchbase. They are two different open source projects and products.
Read about the differences here: http://www.couchbase.com/couchdb
Couchbase does have metadata of around 120 bytes per document. Many of our users who need throughput in the hundreds of thousands of operations per second store binary data (so that they can compress it).
In Couchbase Server 2.0, you have two options: you can store blob data or JSON documents. JSON documents can also be queried using views, etc.
So, depending on your use case, I would recommend using application-side compressed BLOB data in one bucket for pure K/V access that you don't need views on.
For data that you need to query, store it as JSON and build indexes on it.
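As a rough illustration of the application-side compression suggested above, here is a minimal sketch that packs a batch of small records into one compressed blob and unpacks it again. It uses only the Python standard library; `pack`, `unpack`, and the sample records are hypothetical, and the 20MB check simply reuses the limit mentioned in the question. Writing the blob to a bucket would go through whichever client SDK you use, which is not shown here.

```python
import json
import zlib

# Assumption: the 20MB value limit quoted in the question.
MAX_VALUE_BYTES = 20 * 1024 * 1024


def pack(records):
    """Serialize a list of records to compact JSON and compress it with zlib."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    blob = zlib.compress(raw, 6)
    if len(blob) > MAX_VALUE_BYTES:
        raise ValueError("compressed batch exceeds the value size limit")
    return blob


def unpack(blob):
    """Reverse of pack(): decompress and parse back into a list of records."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Illustrative data only: 1000 small records stored under a single key.
records = [{"sensor": "s1", "t": i, "value": i % 7} for i in range(1000)]
blob = pack(records)
assert unpack(blob) == records
print(len(blob), "bytes would be stored under one key")
```

The trade-off is that a blob stored this way is opaque to views, so it only suits data you read back by key.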
In future releases, compression is something we will look into. There are various ways to compress (client side or server side; page level, document level, de-duping attributes, etc.), and we will try to see what works best, but there are no plans to support this in the short term.
Hope this helps.