Couchbase
  • Why NoSQL?
  • Couchbase Server
  • Download
  • Resources
  • Careers
Home | Forums | Membase | Membase Server 1.6.x

Memcached process stuck and timeout while moxi up

4 replies [Last post]
  • Login or register to post comments
Mon, 04/25/2011 - 12:10
orieg
Offline
Joined: 03/14/2011
Groups: None

Hi, we use membase from the Git repo on Amazon EC2 with EBS volumes. Occasionnaly when adding/removing node and doing a rebalance we start to see timeout errors on some servers, rebalance fail. After some troubleshooting we figured out that the memcached process is running but timeout any request on its binary port. the process seems running, doing few IOs. Moxi still respond to the STAT command but will timeout when doing get/set. The only way we found to work around the issue is to kill the memcached process, it get automatically restarted, then the rebalance will work.

IOs on EBS are really bad, would it be the persistence that stuck the memcached process ? Why does the webinterface and moxi still report the node as up if all requests to memcached timeout ?

Regards,
Nicolas

Top
  • Login or register to post comments
Mon, 04/25/2011 - 12:39
perry
Offline
Joined: 10/11/2010
Groups:

Hi Nicholas, the memcached process should not be timing out regardless of what is going on underneath.

What version of Membase are you using?

Do the logs show any errors or output from the memcached process?

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Top
  • Login or register to post comments
Mon, 04/25/2011 - 13:14
orieg
Offline
Joined: 03/14/2011
Groups: None

In such case the error we see is a badmatch / error timeout for the failing node. So, it definitely report it in the log but the node is still part of the cluster and considered up so clients seem to continue to try to connect to it.

We are running the tip of tree from the git repo as of couple of weeks ago. I could get you the specifics if you need too.

In the following log, you could see the issue. When it's happen we where removing some vbucket. We now only use the default bucket as the vbucket aware client i java was showing some really bad perf and overhead on our client side. We now stick to the default bucket but still can see the issue of the memcached process being stuck occasionally.

{"list":[{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1300907968844.0,"shortText":"message","text":"Created bucket \"user-db\" of type: membase\n"},{"node":"ns_1@dev-rtb-membase01.us-east-
1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300907970208.0,"shortText":"message","text":"Bucket \"user-db\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1300911042628.0,"shortText":"message","text":"Shutting down bucket \"default\" on 'ns_1@dev-rtb-membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1300911042637.0,"shortText":"message","text":"Deleted bucket \"default\"\n"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1300911056018.0,"shortText":"message","text":"Created bucket \"rtb\" of type: membase\n"},{"node":"ns_1@dev-rtb-membase01.us-east-
1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300911057387.0,"shortText":"message","text":"Bucket \"rtb\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":4,"module":"ns_node_disco","tstamp":1300911075194.0,"shortText":"node up","text":"Node 'ns_1@dev-rtb-
membase01.us-east-1b.private' saw that node 'ns_1@dev-rtb-membase02.us-east-1b.private' came up."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"menelaus_app","tstamp":1300911075494.0,"shortText":"web start ok","text":"Membase Server has started on web port 8091 on node 'ns_1@dev-rtb-membase02.us-east-1b.private'."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":3,"module":"ns_cluster","tstamp":1300911075517.0,"shortText":"message","text":"Node ns_1@dev-rtb-membase02.us-east-1b.private joined cluster"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300911077125.0,"shortText":"message","text":"Bucket \"rtb\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 0 seconds."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"warning","code":1,"module":"supervisor_cushion","tstamp":1300911077376.0,"shortText":"port exited too soon after restart","text":"Service moxi_stats_collector exited on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1.86s\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"warning","code":1,"module":"supervisor_cushion","tstamp":1300911077376.0,"shortText":"port exited too soon after restart","text":"Service moxi_stats_collector exited on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1.86s\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300911078140.0,"shortText":"message","text":"Bucket \"user-db\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":4,"module":"ns_orchestrator","tstamp":1300911078899.0,"shortText":"message","text":"Starting rebalance, KeepNodes = ['ns_1@dev-rtb-membase01.us-east-1b.private',\n       
                          'ns_1@dev-rtb-membase02.us-east-1b.private'], EjectNodes = []\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1300911078963.0,"shortText":"message","text":"Shutting down bucket \"user-db\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for server shutdown"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":1,"module":"ns_orchestrator","tstamp":1300911126669.0,"shortText":"message","text": "Rebalance completed successfully.\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300911435187.0,"shortText":"message","text":"Bucket \"user-db\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds. (repeated 1 times)"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1300928749751.0,"shortText":"message","text":"Shutting down bucket \"user-db\" on 'ns_1@dev-rtb-membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1300928749778.0,"shortText":"message","text":"Deleted bucket \"user-db\"\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1300928749799.0,"shortText":"message","text":"Shutting down bucket \"user-db\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1300928801202.0,"shortText":"message","text":"Created bucket \"user_info\" of type: membase\n"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info"
,"code":1,"module":"ns_memcached","tstamp":1300928802508.0,"shortText":"message","text":"Bucket \"user_info\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1300928802567.0,"shortText":"message","text":"Bucket \"user_info\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1301366631336.0,"shortText":"message","text":"Shutting down bucket \"rtb\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1301366631346.0,"shortText":"message","text":"Shutting down bucket \"rtb\" on 'ns_1@dev-rtb-membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1301366631358.0,"shortText":"message","text":"Deleted bucket \"rtb\"\n"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1301366644563.0,"shortText":"message","text":"Created bucket \"ad_info\" of type: membase\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1301366645919.0,"shortText":"message","text":"Bucket \"ad_info\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1301366645972.0,"shortText":"message","text":"Bucket \"ad_info\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1301366654618.0,"shortText":"message","text":"Created bucket \"stats\" of type:membase\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1301366656017.0,"shortText":"message","text":"Bucket \"stats\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1301366656034.0,"shortText":"message","text":"Bucket \"stats\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":12,"module":"menelaus_web","tstamp":1302401411611.0,"shortText":"message",
"text":"Created bucket \"default\" of type: membase\n"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1302401413253.0,"shortText":"message","text":"Bucket \"default\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1302401413545.0,"shortText":"message","text":"Bucket \"default\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637166604.0,"shortText":"message","text":"Shutting down bucket \"ad_info\" on 'ns_1@dev-rtb-
membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637166635.0,"shortText":"message","text":"Shutting down bucket \"ad_info\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1302637166667.0,"shortText":"message","text":"Deleted bucket \"ad_info\"\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637177910.0,"shortText":"message","text":"Shutting down bucket \"stats\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637177925.0,"shortText":"message","text":"Shutting down bucket \"stats\" on 'ns_1@dev-rtb-membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1302637177939.0,"shortText":"message","text":"Deleted bucket \"stats\"\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637187882.0,"shortText":"message","text":"Shutting down bucket \"user_info\" on 'ns_1@dev-rtb-membase02.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":2,"module":"ns_memcached","tstamp":1302637187891.0,"shortText":"message","text":"Shutting down bucket \"user_info\" on 'ns_1@dev-rtb-membase01.us-east-1b.private' for deletion"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":11,"module":"menelaus_web","tstamp":1302637187912.0,"shortText":"message","text":"Deleted bucket \"user_info\"\n"},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":4,"module":"ns_memcached","tstamp":1302637193351.0,"shortText":"message","text":"Control connection to memcached on 'ns_1@dev-rtb-membase02.us-east-1b.private' disconnected: {{badmatch,\n                                  
                                                             {error,\n                                                        
                                        timeout}},\n                                                                          
                    [{mc_client_binary,\n                                                                                     
           stats_recv,\n                                                                                                4},\n 
                                                                                              {mc_client_binary,\n            
                                                                                    stats,\n                                  
                                                              4},\n                                                           
                                    {ns_memcached,\n                                                                          
                      handle_call,\n                                                                                          
      3},\n                                                                                               {gen_server,\n      
                                                                                          handle_msg,\n                       
                                                                         5},\n                                                
                                               {proc_lib,\n                                                                   
                             init_p_do_apply,\n                                                                               
                 3}]}"},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":4,"module":"ns_memcached","ts
tamp":1302637193384.0,"shortText":"message","text":"Control connection to memcached on 'ns_1@dev-rtb-membase01.us-east-1b.priv
ate' disconnected: {{badmatch,\n                                                                                              
 {error,\n                                                                                                timeout}},\n        
                                                                                      [{mc_client_binary,\n                   
                                                                             stats_recv,\n                                    
                                                            4},\n                                                             
                                  {mc_client_binary,\n                                                                        
                        stats,\n                                                                                              
  4},\n                                                                                               {ns_memcached,\n        
                                                                                        handle_call,\n                        
                                                                        3},\n                                                 
                                              {gen_server,\n                                                                  
                              handle_msg,\n                                                                                   
             5},\n                                                                                               {proc_lib,\n 
                                                                                               init_p_do_apply,\n             
                                                                                   3}]}"},{"node":"ns_1@dev-rtb-membase01.us-e
ast-1b.private","type":"info","code":0,"module":"ns_port_server","tstamp":1302637237748.0,"shortText":"message","text":"Port s
erver memcached on node 'ns_1@dev-rtb-membase01.us-east-1b.private' exited with status 134. Restarting. Messages: Flusher being destroyed in state stopping\nFlusher being destroyed in state stopping\nmemcached: queueditem.hh:62: QueuedItem::~QueuedItem(): Assertion `stats->memOverhead.get() < ((size_t)1<<(sizeof(size_t)*8-1))' failed."},{"node":"ns_1@dev-rtb-membase02.us-east
-1b.private","type":"info","code":0,"module":"ns_port_server","tstamp":1302637245762.0,"shortText":"message","text":"Port server memcached on node 'ns_1@dev-rtb-membase02.us-east-1b.private' exited with status 134. Restarting. Messages: Flusher being destroyed in state stopping\nmemcached: queueditem.hh:62: QueuedItem::~QueuedItem(): Assertion `stats->memOverhead.get() < ((size_t)1<<(sizeof(size_t)*8-1))' failed."},{"node":"ns_1@dev-rtb-membase02.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1302637248005.0,"shortText":"message","text":"Bucket \"default\" loaded on node 'ns_1@dev-rtb-membase02.us-east-1b.private' in 1 seconds."},{"node":"ns_1@dev-rtb-membase01.us-east-1b.private","type":"info","code":1,"module":"ns_memcached","tstamp":1302637248110.0,"shortText":"message","text":"Bucket \"default\" loaded on node 'ns_1@dev-rtb-membase01.us-east-1b.private' in 1 seconds."}]}

Top
  • Login or register to post comments
Mon, 04/25/2011 - 14:13
perry
Offline
Joined: 10/11/2010
Groups:

It would be good if you can install and test with the latest stable version (1.6.5.3) as the tip of the tree is not going to be as fully tested or stable.

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Top
  • Login or register to post comments
Mon, 04/25/2011 - 15:21
orieg
Offline
Joined: 03/14/2011
Groups: None

Thanks Perry. I will give a try later this week and will let you know.

Top
  • Login or register to post comments
  • Login or register to post comments
  • Login
  • Register

Company

  • About Us
  • Leadership
  • Customers
  • Partners
  • Contact Us

Product

  • Couchbase Server
  • Couchbase SDKs
  • Use Cases
  • Documentation
  • Forums

Open Source

  • Couchbase Project
  • Couchbase vs. CouchDB

Commercial

  • Subscriptions & Support
  • Training & Services

News

  • Blog
  • Newsletter
  • Press Releases
  • Buzz

Follow Us

    
  • Customer Login
  • Terms of Service
  • Privacy Policy
  • Trademark Policy
  • Site Map

© 2013 COUCHBASE All rights reserved.

Sign in to Couchbase Community

close
  • Create new account
  • Request new password
You are logging into the Forums, Wiki and Issue Tracker