Migration from relational

Hi
I hope someone can help since I'm getting lost.
A bit of background might help:
We're building a new release of our software and are looking into NoSQL, after years of working with SQL Server.
We use a Master-Model approach where our Model is our standard database with some base data and all the various data structures (tables, views, stored procedures, foreign keys, etc.), which is 'copied' for each new client (a business-to-business environment, used by the business' customers). It's a simple set-up with two replicated SQL Server instances.
We're used to having a separate database for each customer, about 60 now. We want to be able to scale faster with the new version without the pain, and to grow to 200 customers easily.

I need to make sure that no data of one customer ever comes up when another customer works with their data (through our software). When I read about buckets, that looked good, but I can only add 10. OK, views then, maybe those can separate my data; they are also limited. Maybe more instances, like Raven does; there is only 1 instance.

I'm fairly new to NoSQL, so maybe I'm making things too difficult or I'm missing some information, but I hope someone can help me understand how I would set that up in Couchbase (which performs amazingly, by the way, in our simple insert / read test!).

Regards

1 Answer

Shared buckets vs. separate buckets
Many buckets vs. few buckets

BEST - Few Buckets and Separate Buckets.
Often the choice comes down to design first, then budget. If budget were not a factor, each user would get their own cluster with only 1-3 buckets.
Benefit:
1. Each user is safe in their own environment and doesn't affect other users.
2. Metrics reflect that user's patterns only.

WORSE - Shared Buckets and Many Buckets.
I call this situation "Rent-A-Ferrari". In this case you have lots of users with a small data set (100MB to 20GB), but they want the reliability of Couchbase. You can give each user an account number, e.g. 0001 to 9999. When you insert their documents into a bucket, prefix each key with the account number, e.g. 0789_their-key. It's best to hash their key too, e.g. 0789_ + SHA-1('their-key'). This way keys are always the same size.
Benefit:
1. Users can get up and running with Couchbase very fast.
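
A minimal sketch of that key scheme in Python, assuming a 4-digit account number and a hex-encoded SHA-1 digest. The function name `tenant_key` is made up for illustration and is not part of any Couchbase SDK:

```python
import hashlib


def tenant_key(account: int, user_key: str) -> str:
    """Return '<4-digit account>_<SHA-1 hex digest of user_key>'.

    Hypothetical helper: the account range 1..9999 follows the
    example numbering above (0001 to 9999).
    """
    if not 1 <= account <= 9999:
        raise ValueError("account number must be between 1 and 9999")
    digest = hashlib.sha1(user_key.encode("utf-8")).hexdigest()
    # 4 digits + underscore + 40 hex characters = 45 characters, always.
    return f"{account:04d}_{digest}"


if __name__ == "__main__":
    print(tenant_key(789, "their-key"))
```

Because the digest is always 40 hex characters, every key comes out at 45 characters regardless of how long the customer's own key is. The trade-off is that the original key can no longer be read back from the stored key, so keep it inside the document if you need it.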

To convert your data from an RDBMS to NoSQL, look into Talend: http://blog.couchbase.com/mysql-couchbase-using-talend-etl

Hi HouseHippo, I just found your response in my spam box. Thanks for the reply.
OK, so it's like one big database with all customer data in one store, separated by a key of some sort. Buckets and views are not meant for that.
Security thus needs to be handled in my application, since I cannot segregate data access in the data store (e.g. Couchbase). That makes it clearer.
It also implies that I cannot allocate different resources to different clients like I'm used to, although given the performance that might not be necessary. The same goes for backups, which differ per client in my current environment; here they would be the same for all.
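
One way to keep that segregation in one place in the application is a small wrapper that adds the account prefix on every read and write. The sketch below is only an illustration: `TenantStore` is a made-up class, and a plain Python dict stands in for the shared bucket rather than a real Couchbase client.

```python
class TenantStore:
    """Key-value access scoped to one account by a fixed key prefix.

    Hypothetical wrapper: 'backend' is any dict-like object and is
    only a stand-in for a shared bucket.
    """

    def __init__(self, backend: dict, account: int):
        self._backend = backend
        self._prefix = f"{account:04d}_"

    def _key(self, key: str) -> str:
        # Callers never supply the prefix, so they cannot address
        # another account's keys through this object.
        return self._prefix + key

    def set(self, key: str, value) -> None:
        self._backend[self._key(key)] = value

    def get(self, key: str, default=None):
        return self._backend.get(self._key(key), default)


if __name__ == "__main__":
    shared = {}  # one store holding every customer's data
    alice = TenantStore(shared, 789)
    bob = TenantStore(shared, 790)
    alice.set("invoice-1", {"total": 100})
    print(alice.get("invoice-1"))
    print(bob.get("invoice-1"))
```

The rest of the application only ever receives a `TenantStore` for the logged-in customer, so the isolation rule lives in one class instead of being repeated at every call site.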

I understand it a lot better now (also from reading up much more). Thanks.

You can have more than one bucket; 2-3 is fine:
1. The first bucket as a Memcached bucket, to store data with a low TTL (2 hours or less) or data that really is not important.
2. The second bucket as a Couchbase bucket, for items with a TTL of hours to weeks.
3. The third bucket as a Couchbase bucket, for items with no TTL.
You do not want to mix TTL and non-TTL items in the same bucket. You can run a map-reduce job over the account number in the keys and count the items a user has inserted, to enforce caps on document count and size: http://blog.couchbase.com/calculating-average-document-size-documents-st.... This works well for items with a TTL, as they will disappear and the view will adjust. You can also run cbbackup with a regular expression (i.e. filter by key prefix).
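
To illustrate the per-account counting idea, here is a rough local simulation in Python. In Couchbase itself this would be a view with a map and a reduce function; the sketch only mimics the aggregation over an in-memory dict. The function name `usage_by_account` and the assumption that every key starts with the account number followed by an underscore are illustrative.

```python
import json
from collections import defaultdict


def usage_by_account(docs: dict) -> dict:
    """Return {account: {"count": n, "bytes": total}} grouped by key prefix.

    Assumes keys look like '<account>_<rest>'; document size is
    approximated by the length of its JSON encoding in UTF-8.
    """
    totals = defaultdict(lambda: {"count": 0, "bytes": 0})
    for key, doc in docs.items():
        account, _, _ = key.partition("_")  # text before the first underscore
        totals[account]["count"] += 1
        totals[account]["bytes"] += len(json.dumps(doc).encode("utf-8"))
    return dict(totals)


if __name__ == "__main__":
    docs = {
        "0001_a": {"x": 1},
        "0001_b": {"x": 2},
        "0002_a": {"y": "z"},
    }
    for account, usage in sorted(usage_by_account(docs).items()):
        print(account, usage["count"], usage["bytes"])
```

The application can compare these totals against each customer's cap before accepting new inserts.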