Multiple data center/availability zone thoughts
We're running in multiple availability zones on EC2. Our use case for Couchbase is a distributed, high-availability memcached replacement.
1. I'd like replicas to be stored in a separate availability zone. There's no location awareness currently.
2. I'd like a whole availability zone to be able to go down without impacting Couchbase availability beyond 30 seconds. This means tolerating the failure of a third of the nodes, plus maybe a couple of others. We're not at the point where we need more than 3 nodes, so the current restrictions aren't impacting us.
3. I'd also like Moxi to support locality. That is, I'd like it to retrieve data from the same availability zone if possible.
As a follow-up to #3, what I'm trying to avoid is both the latency and the cost of transferring data between availability zones, when possible. The first use case will be PHP sessions, so an in-process cache won't work (we already make use of in-process caching where possible). Because multiple PHP requests are asynchronous anyway, a slight delay in cache coherency isn't a big deal. For all our other data, the cache is never treated as authoritative.
I have a similar requirement too.
On EC2 we're running in multiple availability zones (like most serious apps will be doing), and I have the same issues of data transfer latency and cost across zones.
Location awareness would be a possible solution to this problem.
In addition, we run a Membase instance on each web server. If clients were location-aware, so that they updated the local instance and always went to that instance, it would again be faster, but obviously consistency would be an issue. Having said that, in this case we simply avoid the cache, query the DB or service directly, and then put the result in the cache.
Anyway, just chipping in...
To define this idea even further, I'd like the following:
- One copy of the data in each availability zone. This shouldn't be too hard: define the AZ in the config file for each instance of Couchbase and Moxi. If I'm using three AZs, and replicas are stored in another AZ where possible, I set replicas to 2 and I'm done.
- Reading preferentially from the local availability zone
- Writing to all availability zones (as it does already)
Ideally, the implicit relaxed coherency would be an option when creating a bucket. It would be a local-reads versus consistency toggle.
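To make the idea concrete, here is a rough sketch in Python of the placement and read rules described above. It is not how Couchbase or Moxi actually work; the function names, the node-to-zone map, and the local_reads flag are all made up for illustration, on the assumption that each node's AZ is known from its config.

```python
def choose_replicas(master, node_zones, num_replicas):
    """Pick replica nodes for `master`, preferring zones with no copy yet.

    node_zones is a hypothetical {node_name: zone_name} map, standing in
    for the per-instance AZ setting proposed above.
    """
    used_zones = {node_zones[master]}
    chosen = []
    candidates = sorted(n for n in node_zones if n != master)

    # First pass: at most one replica per zone that has no copy yet.
    for node in candidates:
        if len(chosen) == num_replicas:
            break
        if node_zones[node] not in used_zones:
            chosen.append(node)
            used_zones.add(node_zones[node])

    # Second pass: if there are fewer zones than copies, fall back to
    # any remaining node rather than storing fewer replicas.
    for node in candidates:
        if len(chosen) == num_replicas:
            break
        if node not in chosen:
            chosen.append(node)

    return chosen


def choose_read_node(master, replicas, node_zones, client_zone, local_reads):
    """Return the node a client should read from.

    local_reads is the hypothetical per-bucket toggle: when it is off,
    reads always go to the master (consistent); when it is on, a copy in
    the client's own zone is preferred (possibly stale).
    """
    if not local_reads:
        return master
    for node in [master] + replicas:
        if node_zones[node] == client_zone:
            return node
    return master
```

With three nodes in three AZs and replicas set to 2, every zone ends up with one copy, and a client with local reads enabled always finds one in its own zone. Writes would still go to the master and fan out to the replicas as they do today.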
We are looking at ways of aligning underlying failures to the system's tolerance for failure (i.e. master and replica across AZs), but don't have any functionality for that yet. Architecturally speaking, it's pretty straightforward to do but we've not added the ability to describe that configuration or the UI for it yet.
That applies to questions #1 and #2.
For question #3, we intentionally do not allow inconsistent reads at the moment. Moxi always goes to the master node for any given data item. By design, data is not multi-mastered. We think this is the correct approach for many, many applications, and if you're looking for better locality and are willing to give up some consistency, you may want to have a small in-process cache. The fastest IO is the one you don't have to do. :)
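As a rough illustration of the small in-process cache idea, here is a minimal sketch in Python. The LocalCache class, the fetch callable, and the TTL value are all hypothetical and not part of Moxi or any Couchbase client; fetch stands in for whatever call reads the item from the master node.

```python
import time


class LocalCache:
    """Tiny in-process read cache with a per-entry time-to-live.

    Entries may be stale for up to ttl_seconds, which is the consistency
    being traded away for locality.
    """

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: key -> value from the cluster
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no network round trip
        value = self._fetch(key) # missing or expired: go to the cluster
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, e.g. right after this process writes it."""
        self._entries.pop(key, None)
```

A short TTL bounds how stale a read can be, and calling invalidate after a local write keeps a process from reading its own old data. This sketch never evicts entries, so a real version would also need a size limit.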
I hope that helps. If you have any follow-up questions, please let me know.
p.s.: You may want to try contacting sales@couchbase to have a more detailed conversation. It sounds like you have some specific things you're trying to achieve, and they can probably help.