Membase service monitoring

Mon, 09/26/2011 - 14:25
yluppo
Joined: 08/04/2011

Hi,

We are using Membase Server 1.7.1.1 and Enyim client version 2.11.

Using the client property:

Enyim.Caching.Memcached.IMemcachedNode.IsAlive

We've created a monitoring web page that we ping every minute to see whether any Membase service is either shut down or in a funky state.

We made that page because simply monitoring whether the Windows service was up didn't catch some of the issues we've seen, where a Membase server goes into a weird state and stops answering requests although it is still technically running.

We have a thread on this:

http://www.couchbase.org/forums/thread/membase-service-needs-be-restarte...

This hasn't happened in a while, which is why that thread has been quiet recently.

At any rate, our issue is that hitting this page on each server doesn't return a consistent result.

Consider a Membase cluster with 8 servers, each of which also serves our web app. If, for example, we turn off Membase on machine 'F' and then hit our monitoring web page on all servers, only one box shows the ISALIVE property as false for machine 'F', and it is generally not machine 'F' itself.

All the other machines show that everything is OK, ISALIVE = True.

I believe Membase is built using Erlang, and at first I thought maybe it takes a minute for the Mnesia database to propagate to all servers. But after 10 minutes, still only that one server shows that machine 'F' is not serving Membase requests.

That machine is nothing special: a couple of days ago we turned off the Membase service on the same server, and a different box picked up that Membase was down on box 'F', but none of the others did.

Is there some caching issue with the cluster state? Is there something like a "NameNode" that is the only box managing server state in the cluster? Anything of that nature that would explain the behavior observed?

Thanks.

Sat, 10/15/2011 - 15:31
ingenthr
Joined: 03/16/2010

I think you need an updated Enyim client. There were some recent fixes that may explain this. When the connection was momentarily dropped, the server was getting marked automatically as "dead" by Enyim until restart. That's now been fixed and there is a reconnect, though there's possibly a better reconnect routine coming.

The current stable client can be downloaded here: http://www.couchbase.org/code/couchbase/net
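
To make the difference between "marked dead until restart" and "reconnect" concrete, here is a minimal, language-neutral sketch in Python. It is not Enyim's code or API: the class `NodeHealth`, its methods, and the `retry_after` parameter are all hypothetical names chosen for illustration, and the sketch assumes the liveness flag is kept locally by each client process rather than shared across the cluster.

```python
import time
from typing import Callable


class NodeHealth:
    """Client-local liveness flag for one server node.

    Hypothetical illustration only; this is not the Enyim client's API.
    """

    def __init__(self, connect: Callable[[], bool], retry_after: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._connect = connect          # returns True if the node answers
        self._retry_after = retry_after  # seconds to wait before re-probing a dead node
        self._clock = clock              # injectable so the logic can be tested
        self._alive = True
        self._dead_since = 0.0

    def mark_failed(self) -> None:
        """Record a failed request, for example a dropped connection."""
        self._alive = False
        self._dead_since = self._clock()

    def is_alive(self) -> bool:
        """Return the cached flag, re-probing a dead node once the wait has elapsed."""
        if not self._alive and self._clock() - self._dead_since >= self._retry_after:
            if self._connect():
                self._alive = True
            else:
                self._dead_since = self._clock()  # still down: restart the wait
        return self._alive
```

Without the re-probe branch in `is_alive`, a single `mark_failed` call would leave the flag false for the life of the process, which matches the "dead until restart" behavior described above. Because each process holds its own flag in this sketch, two web servers could report different states for the same node.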
