
Use Case Questions

Sun, 12/18/2011 - 13:56
joshdmiller
Offline
Joined: 12/18/2011
Groups: None

Greetings.

I, like many, am confused about the Couchbase product offerings. I recently read the blog post on 2012 changes to expect and, also like many, I find the product simplification and sharper focus a very welcome change. But in the shorter term, I have a few specific questions as they relate to an open-source, web-based SaaS we are developing. I have read the Couchbase Server 2.0 manual, much of the wiki, and some of the blog posts, so I do hope these are not stupid questions.

Some general information about the project: our specific use case involves a moderate amount of per-user data (varying blocks of free-form text and unstructured citations) in addition to a shared pool of searchable data which will initially number around 300GB and grow significantly over time. The shared data is integrated from API calls, targeted web scrapers, and licensed data and is most amenable to bulk uploads/inserts that occur irregularly, but will be accessed frequently by users and forms the basis of most of the per-user data. Our environment is EC2. We feel our use case is well suited to document-oriented data stores and we were leaning more toward CouchDB than MongoDB for horizontal scalability. Now we're interested in CouchDB versus Couchbase Server.

1) As I understand it, Couchbase Server is fundamentally Memcached with some additional features from CouchDB, so it would be fair to say Couchbase Server is more like Memcached with persistence than it is like CouchDB with in-memory caching. It also seems the product is converging more toward the latter as releases progress. Am I correct here?

2) While per-user data would do well cached in memory in a cluster as the data is relatively small, it would not necessarily be feasible in the short term to cache in memory the 300GB of shared data. Understanding the performance trade-off when the data on disk exceeds memory, is it people's experience here that this works acceptably? Need it be cached in memory? Or is this just not a strong use case?

3) The large pool of shared data is the basis of a search feature (location of citations) where, as I understand it, multiple views would be needed: one to search by author, another by source, etc., which are simple enough. There will also need to be the ability to search by multiple keywords (eventually with boolean operations). Combining some of these would be an important feature. To be honest, I am unsure how to approach the creation of the complex views. Can someone point me in the right direction? Or is this just not a good use case?

4) If there are five different ways of obtaining the same document, i.e. five different views, is that document stored in memory and on disk five times?

5) As I understand it, the default object size is 1 MB, meaning any key's value must be relatively small. We will be storing text-based documents for our users that, with metadata, could exceed that number. Is this a flexible number and, if so, what are the consequences to increasing this number? We can also store the text files in another location (like S3) with the value containing a reference, but this increases latency.

6) Similarly, I read that bucket size is limited to 25MB, but I am not sure I understand what this really means. If we have 300GB of data, do we need to create 12,000 buckets and somehow programmatically partition the data? I'm sure I'm misunderstanding something here.

7) On CouchDB, we liked the idea of incorporating some of the business logic on the database through the use of validation functions and even custom query servers. Is this possible with Couchbase Server?

8) Similarly, CouchDB supports the elimination of middleware for many use cases through creative use of design documents. I understand Couchbase Server does not support attachments, so a "CouchApp" would not be possible at this time. Is Couchbase Server intended to sit exclusively behind the middleware in a web app, or could a JavaScript client communicate effectively and securely with the Couchbase Server?

9) We will be creating native mobile applications, so Couchbase Mobile intrigued us. The data replicated between the mobile and the cloud will obviously be just the user's data, which I assume is just a filtered replication. Can someone confirm?

10) Lastly, by making Couchbase Mobile an "add-on" to Couchbase Server in the future, will it adopt a similar architecture and feature set, or will it still be like CouchDB on a mobile?

I appreciate you bearing with me as this was a long post. In advance, many thanks for the help!

Josh

Sun, 12/18/2011 - 23:18
ingenthr
Offline
Joined: 03/16/2010
Groups:

Indeed it is a long post. I'll try to address everything with a similar numeric order.

1) I would say that is correct, but it retains all of the programming model of Membase and maintains most of the important bits of the REST interface with Couch.

2) That's absolutely a good use case. I'll be pretty honest though that the biggest challenges have frequently been situations where the systems are, in my humble opinion, pretty unbalanced. Running a 72GB memory EC2 instance which can only crank out 70 or so IOPS off of an EBS volume isn't very balanced. To be fair, you can stripe those EBS volumes, but the costs change a bit. As long as the IO on the system supports pulling ejected data in as needed, it should work well.

3) The view interface Couchbase Server provides maps almost exactly to the CouchDB view API. The only exceptions are things like list functions, which are less common and more appropriate for couchapps. I know that doesn't address the specifics, but range queries for author or source, and handling JSON array responses for the keywords, are the approaches I can think of.

4) Only stored in memory and on disk once. We work hard to be very efficient in both disk storage and memory usage.

5) Actually, object sizes are up to 20MB. That should be tunable in the final 2.0 release. There's no architectural limit technically; it's more about keeping things reasonable.

6) Bucket size isn't limited. The RAM allocation to a bucket is limited by the amount of memory on the system, but anything past that overflows to disk.

7) That is not currently possible with Couchbase Server. Validation and custom queries, when building large distributed apps, are best moved to another tier. They work well in couchapps, but they're also a hindrance if you have lots of writes and need to scale. In some online game cases, we even look at creative ways of pushing validation down to the client in a trusted way.

8) Couchbase Server most certainly does support attachments. It's the lack of list/validation functions that may make it less suitable for CouchDB-style couchapps.

9) That's our intent, yes. Right now the way this filtering is done needs some enhancement, but that's what we're working on.

10) At the moment, the intent is to provide high-level APIs (currently through both CouchCocoa and Ektorp), with the lower-level HTTP REST interface still available to users. The feature set, at the moment, is closer to CouchDB than anything else. The mobile community in general (which is quite vibrant) and those of us at Couchbase are trying to figure out the right balance here. There's always a tension with mobile, as the full CouchDB feature set and startup time or footprint are two ends of a spectrum. I'd encourage you to join the mobile-couchbase list and get your opinions out there.

I hope that helps. I'd be glad to answer any other questions which arise.
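
To make the "balanced system" remark in point 2 concrete, here is a rough back-of-the-envelope sketch in Python. It uses the figures mentioned in this thread (300GB of shared data, a 72GB instance, roughly 70 IOPS per EBS volume). It assumes, as simplifications, that every read of an ejected item costs exactly one disk IO, that striped volumes add up linearly, and that access is uniformly random. The function names are made up for illustration.

```python
def resident_ratio(ram_gb: float, data_gb: float) -> float:
    """Fraction of the data set that fits in RAM, capped at 1.0."""
    return min(1.0, ram_gb / data_gb)


def max_disk_fetches_per_sec(iops_per_volume: float, volumes: int = 1) -> float:
    """Upper bound on fetches of ejected items per second.

    Assumption: one disk IO per fetch, and striping scales linearly.
    """
    return iops_per_volume * volumes


def sustainable_reads_per_sec(ram_gb: float, data_gb: float,
                              iops_per_volume: float, volumes: int = 1) -> float:
    """Total read rate the disks can keep up with.

    Assumption: uniformly random access, so the share of reads that
    miss memory is (1 - resident ratio).
    """
    miss_rate = 1.0 - resident_ratio(ram_gb, data_gb)
    if miss_rate == 0:
        return float("inf")  # everything is resident, disk is not the limit
    return max_disk_fetches_per_sec(iops_per_volume, volumes) / miss_rate


if __name__ == "__main__":
    # One EBS volume versus four striped volumes, same 72GB of RAM.
    print(sustainable_reads_per_sec(72, 300, 70, volumes=1))
    print(sustainable_reads_per_sec(72, 300, 70, volumes=4))
```

With a single volume the sustainable read rate under these assumptions comes out at roughly 92 reads per second. Real workloads usually have a hot working set, so the uniform-access assumption is pessimistic, but the sketch shows why disk IO rather than memory tends to be the limiting factor.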

Mon, 12/19/2011 - 14:41
joshdmiller
Offline
Joined: 12/18/2011
Groups: None

Wow, thank you for the detailed answers - I really appreciate it. Also, I apologize for a couple of the questions that I can see now were flat-out wrong.

I am still unsure about views and cloud topology in our use case. More research is required here.
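
A minimal sketch of the view idea from point 3, written as a small Python simulation rather than real Couchbase code. Actual views are JavaScript map functions that call emit(key, value); this only mimics the resulting sorted index in memory. The sample documents, their field names (author, source, keywords), and every function name here are assumptions for illustration. One "view" emits the author as the key, another emits one row per keyword. A range query is a scan over the sorted keys, and a boolean AND across keywords is done on the client by intersecting document IDs.

```python
import bisect

# Made-up citation documents; the field names are assumptions.
DOCS = {
    "c1": {"author": "Knuth", "source": "TAOCP",
           "keywords": ["sorting", "algorithms"]},
    "c2": {"author": "Kernighan", "source": "K&R",
           "keywords": ["c", "programming"]},
    "c3": {"author": "Knuth", "source": "Literate Programming",
           "keywords": ["programming", "algorithms"]},
}


def map_by_author(doc_id, doc):
    """Stand-in for a map function that emits the author as the key."""
    yield (doc["author"], doc_id)


def map_by_keyword(doc_id, doc):
    """Stand-in for a map function that emits one row per keyword."""
    for keyword in doc["keywords"]:
        yield (keyword, doc_id)


def build_view(docs, map_fn):
    """Run map_fn over every document and return rows sorted by key."""
    rows = [row for doc_id, doc in docs.items() for row in map_fn(doc_id, doc)]
    return sorted(rows)


def query_range(view, startkey, endkey):
    """Return doc IDs whose key lies in [startkey, endkey], inclusive."""
    keys = [key for key, _ in view]
    lo = bisect.bisect_left(keys, startkey)
    hi = bisect.bisect_right(keys, endkey)
    return [doc_id for _, doc_id in view[lo:hi]]


def query_all_keywords(view, keywords):
    """Boolean AND: IDs of documents that carry every given keyword."""
    id_sets = [set(query_range(view, kw, kw)) for kw in keywords]
    return set.intersection(*id_sets) if id_sets else set()


by_author = build_view(DOCS, map_by_author)
by_keyword = build_view(DOCS, map_by_keyword)

print(query_range(by_author, "Knuth", "Knuth"))
print(query_all_keywords(by_keyword, ["programming", "algorithms"]))
```

Combining an author filter with keyword filters follows the same pattern: query each view separately and intersect the resulting ID sets. Whether that scales to the 300GB data set, or whether a dedicated full-text search tier is a better fit for the boolean keyword queries, is a separate question this sketch does not answer.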

Tue, 04/17/2012 - 13:40
Jeannine89
Offline
Joined: 04/17/2012
Groups: None

Making Couchbase Mobile sounds like a great idea! Make sure it is perfectly compatible with Android as it gets more and more popular now.
I guess the majority of programs these days have mobile versions, so I wouldn't want to lag behind either.
I'm working on the mobile version of PDF Viewer and it's a valuable experience for me as I'm planning to develop some more interesting and useful tools.
