Membase Community Edition - Ready for production environments?
hey guys,
sorry, had difficulty getting my password for my other account (the @tombola.com one) so had to create another account.
We're currently using the community edition of membase in our product rewrite, which thankfully hasn't yet gone to production.
We have had any number of problems with random crashes, most notably along the same lines as:
http://forums.membase.org/thread/membase-fall-down-every-2-days
http://forums.membase.org/thread/membase-application-fault-ever-8-seconds
http://forums.membase.org/thread/service-memcached-exited-error-messages
http://forums.membase.org/thread/membase-crashes-every-few-seconds
I tweeted yesterday that I was about to switch from Membase to AppFabric and Matt Ingenthron (@ingenthr) replied asking if there was anything you guys could do to help.
I guess my question is: is the Membase community edition ready for production environments? The bugs above are all valid bugs, and as yet there don't appear to have been any rapid fixes unless you go commercial, and that really isn't an option for us at present. I understand (and support) the business model you guys are taking, and understand you want to have a revenue stream, and I like the way you've placed limits on the way you scale with the community edition. But a product whose bug fixes are significantly delayed after the bugs are found isn't really an option for us.
I'm attempting a final re-install of the enterprise edition today in the hopes that at least in our dev environment I can get something that works for more than a day or two, but it'd be useful if you guys could address the 'delayed hotfixes' issue on your community product.
Keen to continue using this product, but obviously can't roll something into production that crashes every 2 days.
Thanks for any and all updates on this.
Cheers,
Terry
very impressed with the level of reply here guys.
I think in the first instance, I shall do a clean re-install today, and as soon as I get any issues I shall reply here and attempt to document exactly what is happening, and we'll work from there.
I guess I was somewhat negative in the initial mail. When the software was running I was over the moon with the performance gains we were seeing, so hopefully, if and when the issues rear their heads and we can get them resolved, I'll be another of your very happy users.
Cheers,
Terry
Excellent, thanks Terry!
As an update to this, apologies for the delay in replying - I was at a dev conference in London last week so didn't get a chance to monitor the services.
I've now managed to generate a diagnostic report which is unfortunately 153 MB - I'm more than happy to send this on to someone if we can find an agreeable way of sharing?
The first lines of reporting (which happen for us around line 6800 of the file) are:
2011-03-08 17:28:05.039 ns_memcached:1:info:message - Bucket "default" loaded on node 'ns_1@192.168.1.171' in 0 seconds.
2011-03-11 14:00:55.508 ns_node_disco:2:info:cookie update - Node 'ns_1@192.168.1.171' synchronized otp cookie ouynfheyfhdfvngb from cluster
2011-03-11 14:00:55.617 menelaus_app:1:info:web start ok - Membase Server has started on web port 8091 on node 'ns_1@192.168.1.171'.
2011-03-11 14:01:12.164 ns_port_server:0:info:message - Port server memcached on node 'ns_1@192.168.1.171' exited with status 3. Restarting. Messages: File: stored-value.hh, Line 974
Expression: v->isDirty() == isDirty
2011-03-11 14:01:12.164 supervisor_cushion:1:warning:port exited too soon after restart - Service memcached exited on node 'ns_1@192.168.1.171' in 0.00s
2011-03-11 14:01:12.164 supervisor_cushion:1:warning:port exited too soon after restart - Service memcached exited on node 'ns_1@192.168.1.171' in 0.00s
2011-03-11 14:01:17.180 supervisor_cushion:1:warning:port exited too soon after restart - Service memcached exited on node 'ns_1@192.168.1.171' in 0.00s
2011-03-11 14:02:45.171 ns_node_disco:2:info:cookie update - Node 'ns_1@192.168.1.171' synchronized otp cookie ouynfheyfhdfvngb from cluster
2011-03-11 14:02:45.375 menelaus_app:1:info:web start ok - Membase Server has started on web port 8091 on node 'ns_1@192.168.1.171'.
2011-03-11 14:03:06.359 ns_port_server:0:info:message - Port server memcached on node 'ns_1@192.168.1.171' exited with status 255. Restarting. Messages:
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
2011-03-11 14:03:08.093 supervisor_cushion:1:warning:port exited too soon after restart - Service memcached exited on node 'ns_1@192.168.1.171' in 1.73s
2011-03-11 14:03:13.845 supervisor_cushion:1:warning:port exited too soon after restart - Service memcached exited on node 'ns_1@192.168.1.171' in 0.75s
though obviously there's a lot more in the file.
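For a report that size, it can help to pull out just the crash lines before sharing. Below is a minimal Python sketch, assuming every relevant line follows the same "Port server ... exited with status N" format as the excerpt above. The function names and the idea of passing a report path are hypothetical, not part of Membase's tooling.

```python
import re
from collections import Counter

# Matches lines like the ns_port_server entries in the excerpt above.
# Assumption: all exit lines in the report share this exact layout.
EXIT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"ns_port_server:\d+:\w+:message - Port server (?P<service>\S+) "
    r"on node '(?P<node>[^']+)' exited with status (?P<status>\d+)"
)


def find_exits(lines):
    """Yield (timestamp, service, node, status) for each port-server exit line."""
    for line in lines:
        m = EXIT_RE.match(line)
        if m:
            yield m["ts"], m["service"], m["node"], int(m["status"])


def summarize(lines):
    """Count exits per (service, exit status) pair."""
    return Counter((service, status) for _, service, _, status in find_exits(lines))


def summarize_file(path):
    """Stream a large report line by line instead of loading it all at once."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return summarize(fh)


# Sample lines copied from the excerpt above.
SAMPLE = [
    "2011-03-11 14:00:55.617 menelaus_app:1:info:web start ok - Membase Server "
    "has started on web port 8091 on node 'ns_1@192.168.1.171'.",
    "2011-03-11 14:01:12.164 ns_port_server:0:info:message - Port server memcached "
    "on node 'ns_1@192.168.1.171' exited with status 3. Restarting. Messages: "
    "File: stored-value.hh, Line 974",
    "2011-03-11 14:03:06.359 ns_port_server:0:info:message - Port server memcached "
    "on node 'ns_1@192.168.1.171' exited with status 255. Restarting. Messages:",
]

if __name__ == "__main__":
    for (service, status), count in sorted(summarize(SAMPLE).items()):
        print(f"{service} exited with status {status}: {count} time(s)")
```

Reading the file as a stream keeps memory use flat regardless of report size, and the resulting summary is small enough to paste into a forum post.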
Help?
Cheers,
Terry
Hi Terry,
No problem about the delay in replying. Thanks for your tenacity and for giving us another shot.
That is actually enough info, thanks for digging into the logs. I think I know what this is.
As we'd said earlier, we want to address these kinds of issues as they come up. Let me find the right way to get a fix to you, and I'll be back in touch with you shortly.
Thanks,
Matt
ahh fantastic, I look forward to that.
As an aside to this, is this a problem we'd have run into in the commercial edition too? If I can add any weight to the case to go commercial (and it gets me software that 'just works') then I'm happy to do the sell at this end.
Cheers,
Terry
It is an issue that you would have run into with the Enterprise Edition, but we've fixed that in the 1.6.5.3 release, which will be out shortly.
If you email me at perry -at- couchbase -dot- com I can provide it to you directly until we've posted it "officially".
Perry
Thanks for your feedback Terry, and thanks for giving us the chance to
"make it right" with you.
First, let me say that if you're seeing regular, persistent crashes,
that is certainly not what we intend anyone to run into with community
edition. We looked back at those forum postings, and yes those folks
were running into issues which will be addressed in the next community
release, but we do not believe they're the kinds of things that /most
people/ are running into. It's possible that we've gotten this wrong,
so if you're seeing these on a regular basis, we want to get more info
and try to address it. As you can see from the forums, we followed up
on all of those postings. In some cases it was isolated issues, and in
other cases we have requested more information. We believe, and have
lots of evidence, that many, many people are downloading and running
Membase without ever seeing an issue like this.
It's hard to say definitively whether the Community Edition is ready for
production. This is partially because the definition of, and service
levels required for, production are so very different from user to user
and environment to environment. You'll have to make that evaluation for
yourself.
Community Edition contains the latest changes, including features and
bugfixes, and doesn't go through the stringent QA and regression testing
that we put the Enterprise Edition through. That said, there are a few
companies who are using it in production but they build up their own
expertise to manage and maintain it since we do not provide support
(other than through the community) for it. We will definitely release
updates to the Community Edition, but our primary focus is on making
Enterprise Edition customers successful.
I would say that if you are planning on using Membase in production, and
are getting value out of it, then it makes sense to use the Enterprise
Edition and allow us to provide you support and guidance to help make it
(and you) successful. If something happens to Membase in your
production environment in the middle of the night (whether it's the
software or not), you may need the escalation path we provide for quick
resolution of issues. If you don't need that quick resolution and
it meets your needs after testing, then perhaps Community Edition is
right for you.
I totally understand that you're wary of deploying something that may
crash (whether it be 2 days or 2 weeks), but we all know software has
bugs. For a growing number of customers, the value of Membase outweighs
the potential hiccups in operation and the Enterprise Edition allows us
to guarantee you the best software and expertise available.
By the way, the current "official" release of the Enterprise Edition
does not have any of the above bug fixes in it yet. There are a number
of happy customers with that release in production. We've provided
fixes as needed to some customers, but the version that you download off
of the web site does not yet have those included. We're looking to
release 1.6.5.3 Enterprise in the next week or two with a roll-up of all
the various bug fixes.
I'm eager to continue the conversation with you, Terry, and I really
appreciate your continued interest in Membase.
-Perry and Matt