[MB-4334] moxi assert during rebalance/multiget Created: 06/Oct/11 Updated: 13/May/12 Resolved: 13/May/12 |
|
| Status: | Closed |
| Project: | Couchbase Server |
| Component/s: | moxi |
| Affects Version/s: | None |
| Fix Version/s: | 1.8.1 |
| Security Level: | Public |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Keith Batten | Assignee: | Steve Yen |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 1.8.1-release-notes | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
client side moxi version 1.7.0
cluster version 1.7.1 moxi command line: 3 nodes centos 5.4 64bit |
||
| Attachments: |
|
| Description |
|
1) Set up 3 node cluster
2) Start up moxi /opt/moxi/bin/moxi -v http://10.1.5.41:8091/pools/default/bucketsStreaming/default,http://10.1.5.42:8091/pools/default/bucketsStreaming/default,http://10.1.5.134:8091/pools/default/bucketsStreaming/default -Z port_listen=11211 -u root -t 16 3) Start up set/get load against cluster 4) Start up rebalance in a loop (continually add/remove node) 5) load 1000 keys to be used for multiget 6) run ascii multiget in a loop moxi: cproxy.c:927: cproxy_reserve_downstream: Assertion `d->timeout_tv.tv_sec == 0' failed. Thread 19 (process 24286): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f3baf0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f3baf0, flags=0) at event.c:1558 #3 0x0000000000408da6 in main (argc=9, argv=0x7fff83c83bf8) at memcached.c:5061 Thread 18 (process 24287): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f40640, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f40640, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 17 (process 24288): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f44c30, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f44c30, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 16 (process 24289): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f49220, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f49220, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 15 (process 24290): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f4d910, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f4d910, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 14 (process 24291): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f52000, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f52000, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 13 (process 24292): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f566f0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f566f0, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 12 (process 24293): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f5b150, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f5b150, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 11 (process 24294): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f5f8d0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f5f8d0, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 10 (process 24295): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f641c0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f641c0, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 9 (process 24296): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f68ab0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f68ab0, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 8 (process 24297): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f6d3a0, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f6d3a0, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 7 (process 24298): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f71c90, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f71c90, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 6 (process 24299): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f76580, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f76580, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 5 (process 24301): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f7fb60, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f7fb60, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 4 (process 24302): #0 0x0000003e5f4d4018 in epoll_wait () from /lib64/libc.so.6 #1 0x0000000000447ab4 in epoll_dispatch (base=0x8f84850, tv=<value optimized out>) at epoll.c:404 #2 0x0000000000441f95 in event_base_loop (base=0x8f84850, flags=0) at event.c:1558 #3 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 3 (process 24303): #0 0x0000003e6000ab99 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x0000000000412a50 in assoc_maintenance_thread (arg=<value optimized out>) at assoc.c:220 #2 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #3 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 2 (process 24304): #0 0x0000003e5f4cae46 in poll () from /lib64/libc.so.6 #1 0x000000000044fc56 in Curl_socket_ready () #2 0x000000000046671b in Curl_do_perform () #3 0x000000000044993b in run_rest_conflate (arg=0x8f8b930) at rest.c:293 #4 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #5 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 Thread 1 (process 24300): #0 0x0000003e5f430265 in raise () from /lib64/libc.so.6 #1 0x0000003e5f431d10 in abort () from /lib64/libc.so.6 #2 0x0000003e5f4296e6 in __assert_fail () from /lib64/libc.so.6 #3 0x000000000041a5ad in cproxy_reserve_downstream (ptd=0x90bbe00) at cproxy.c:927 #4 0x000000000041a7ae in cproxy_assign_downstream (ptd=0x90bbe00) at cproxy.c:1783 #5 0x000000000040c86b in drive_machine (c=0x92f8600) at memcached.c:1988 #6 0x0000000000442321 in event_base_loop (base=0x8f7ae70, flags=0) at event.c:1308 #7 0x00000000004137e5 in worker_libevent (arg=<value optimized out>) at thread.c:272 #8 0x0000003e600064a7 in start_thread () from /lib64/libpthread.so.0 #9 0x0000003e5f4d3c2d in clone () from /lib64/libc.so.6 |
| Comments |
| Comment by Steve Yen [ 21/Mar/12 ] |
| Did another code review. Don't see still (yet?) how d->timeout_tv can be non-zero at that point since any release downstream has its downstream_tv cleared. |
| Comment by Alex Ma [ 23/Apr/12 ] |
|
Another user saw the same assert with multiset/multiget Server 1.8/moxi 1.8
Ascii protocol |
| Comment by Steve Yen [ 23/Apr/12 ] |
|
Confirming, the crash happened during rebalance?
Were they using ascii or binary moxi protocol? If it wasn't a rebalance, can I get a stats proxy from a relatively long-running moxi (privatized and/or scrubbed, of course). Thanks, Steve |
| Comment by Alex Ma [ 24/Apr/12 ] |
|
STAT basic:version 1.7.2_5_g3119473
STAT basic:nthreads 5 STAT basic:hostname labs6 STAT memcached:settings:maxbytes 67108864 STAT memcached:settings:maxconns 1024 STAT memcached:settings:tcpport 0 STAT memcached:settings:udpport -2 STAT memcached:settings:inter NULL STAT memcached:settings:verbosity 0 STAT memcached:settings:oldest 0 STAT memcached:settings:evictions on STAT memcached:settings:domain_socket NULL STAT memcached:settings:umask 700 STAT memcached:settings:growth_factor 1.25 STAT memcached:settings:chunk_size 48 STAT memcached:settings:num_threads 5 STAT memcached:settings:stat_key_prefix : STAT memcached:settings:detail_enabled no STAT memcached:settings:reqs_per_event 20 STAT memcached:settings:cas_enabled yes STAT memcached:settings:tcp_backlog 1024 STAT memcached:settings:binding_protocol auto-negotiate STAT memcached:stats:pid 26795 STAT memcached:stats:uptime 83477 STAT memcached:stats:time 1335240838 STAT memcached:stats:version 1.7.2_5_g3119473 STAT memcached:stats:pointer_size 64 STAT memcached:stats:rusage_user 5940.055230 STAT memcached:stats:rusage_system 8618.746638 STAT memcached:stats:curr_connections 123 STAT memcached:stats:total_connections 36761379 STAT memcached:stats:connection_structures 304 STAT memcached:stats:cmd_get 0 STAT memcached:stats:cmd_set 0 STAT memcached:stats:cmd_flush 0 STAT memcached:stats:get_hits 0 STAT memcached:stats:get_misses 0 STAT memcached:stats:delete_misses 0 STAT memcached:stats:delete_hits 0 STAT memcached:stats:incr_misses 0 STAT memcached:stats:incr_hits 0 STAT memcached:stats:decr_misses 0 STAT memcached:stats:decr_hits 0 STAT memcached:stats:cas_misses 0 STAT memcached:stats:cas_hits 0 STAT memcached:stats:cas_badval 0 STAT memcached:stats:bytes_read 447531408952 STAT memcached:stats:bytes_written 260809934767 STAT memcached:stats:limit_maxbytes 67108864 STAT memcached:stats:accepting_conns 1 STAT memcached:stats:listen_disabled_num 0 STAT memcached:stats:threads 5 STAT memcached:stats:conn_yields 0 STAT proxy_main:conf_type dynamic STAT proxy_main:behavior:cycle 200 STAT proxy_main:behavior:downstream_max 1024 STAT proxy_main:behavior:downstream_conn_max 128 STAT proxy_main:behavior:downstream_weight 0 STAT proxy_main:behavior:downstream_retry 1 STAT proxy_main:behavior:downstream_protocol 8 STAT proxy_main:behavior:downstream_timeout 5000 STAT proxy_main:behavior:downstream_conn_queue_timeout 200 STAT proxy_main:behavior:connect_timeout 400 STAT proxy_main:behavior:auth_timeout 100 STAT proxy_main:behavior:wait_queue_timeout 200 STAT proxy_main:behavior:time_stats 0 STAT proxy_main:behavior:connect_max_errors 5 STAT proxy_main:behavior:connect_retry_interval 30000 STAT proxy_main:behavior:front_cache_max 200 STAT proxy_main:behavior:front_cache_lifespan 0 STAT proxy_main:behavior:front_cache_spec STAT proxy_main:behavior:front_cache_unspec STAT proxy_main:behavior:key_stats_max 4000 STAT proxy_main:behavior:key_stats_lifespan 0 STAT proxy_main:behavior:key_stats_spec STAT proxy_main:behavior:key_stats_unspec STAT proxy_main:behavior:optimize_set STAT proxy_main:behavior:usr STAT proxy_main:behavior:host STAT proxy_main:behavior:port 0 STAT proxy_main:behavior:bucket STAT proxy_main:behavior:port_listen 11212 STAT proxy_main:behavior:default_bucket_name default STAT proxy_main:stats:stat_configs 108 STAT proxy_main:stats:stat_config_fails 0 STAT proxy_main:stats:stat_proxy_starts 2 STAT proxy_main:stats:stat_proxy_start_fails 0 STAT proxy_main:stats:stat_proxy_existings 107 STAT proxy_main:stats:stat_proxy_shutdowns 0 STAT 11212:default:info:port 11212 STAT 11212:default:info:config_ver 108 STAT 11212:default:info:behaviors_num 4 STAT 11212:default:behavior:downstream_max 1024 STAT 11212:default:behavior:downstream_conn_max 128 STAT 11212:default:behavior:downstream_weight 0 STAT 11212:default:behavior:downstream_retry 1 STAT 11212:default:behavior:downstream_protocol 8 STAT 11212:default:behavior:downstream_timeout 5000 STAT 11212:default:behavior:downstream_conn_queue_timeout 200 STAT 11212:default:behavior:connect_timeout 400 STAT 11212:default:behavior:auth_timeout 100 STAT 11212:default:behavior:wait_queue_timeout 200 STAT 11212:default:behavior:time_stats 0 STAT 11212:default:behavior:connect_max_errors 5 STAT 11212:default:behavior:connect_retry_interval 30000 STAT 11212:default:behavior:front_cache_max 200 STAT 11212:default:behavior:front_cache_lifespan 0 STAT 11212:default:behavior:front_cache_spec STAT 11212:default:behavior:front_cache_unspec STAT 11212:default:behavior:key_stats_max 4000 STAT 11212:default:behavior:key_stats_lifespan 0 STAT 11212:default:behavior:key_stats_spec STAT 11212:default:behavior:key_stats_unspec STAT 11212:default:behavior:optimize_set STAT 11212:default:behavior:usr default STAT 11212:default:behavior:host STAT 11212:default:behavior:port 0 STAT 11212:default:behavior:bucket STAT 11212:default:behavior:port_listen 11212 STAT 11212:default:behavior:default_bucket_name default STAT 11212:default:behavior-0:downstream_weight 0 STAT 11212:default:behavior-0:downstream_retry 1 STAT 11212:default:behavior-0:downstream_protocol 8 STAT 11212:default:behavior-0:downstream_timeout 5000 STAT 11212:default:behavior-0:downstream_conn_queue_timeout 200 STAT 11212:default:behavior-0:connect_timeout 400 STAT 11212:default:behavior-0:auth_timeout 100 STAT 11212:default:behavior-0:usr default STAT 11212:default:behavior-0:host 172.17.18.153 STAT 11212:default:behavior-0:port 11210 STAT 11212:default:behavior-0:bucket STAT 11212:default:behavior-1:downstream_weight 0 STAT 11212:default:behavior-1:downstream_retry 1 STAT 11212:default:behavior-1:downstream_protocol 8 STAT 11212:default:behavior-1:downstream_timeout 5000 STAT 11212:default:behavior-1:downstream_conn_queue_timeout 200 STAT 11212:default:behavior-1:connect_timeout 400 STAT 11212:default:behavior-1:auth_timeout 100 STAT 11212:default:behavior-1:usr default STAT 11212:default:behavior-1:host 172.17.18.151 STAT 11212:default:behavior-1:port 11210 STAT 11212:default:behavior-1:bucket STAT 11212:default:behavior-2:downstream_weight 0 STAT 11212:default:behavior-2:downstream_retry 1 STAT 11212:default:behavior-2:downstream_protocol 8 STAT 11212:default:behavior-2:downstream_timeout 5000 STAT 11212:default:behavior-2:downstream_conn_queue_timeout 200 STAT 11212:default:behavior-2:connect_timeout 400 STAT 11212:default:behavior-2:auth_timeout 100 STAT 11212:default:behavior-2:usr default STAT 11212:default:behavior-2:host 172.17.18.152 STAT 11212:default:behavior-2:port 11210 STAT 11212:default:behavior-2:bucket STAT 11212:default:behavior-3:downstream_weight 0 STAT 11212:default:behavior-3:downstream_retry 1 STAT 11212:default:behavior-3:downstream_protocol 8 STAT 11212:default:behavior-3:downstream_timeout 5000 STAT 11212:default:behavior-3:downstream_conn_queue_timeout 200 STAT 11212:default:behavior-3:connect_timeout 400 STAT 11212:default:behavior-3:auth_timeout 100 STAT 11212:default:behavior-3:usr default STAT 11212:default:behavior-3:host 172.17.18.58 STAT 11212:default:behavior-3:port 11210 STAT 11212:default:behavior-3:bucket STAT 11212:default:stats:listening 2 STAT 11212:default:stats:listening_failed 0 STAT 11212:default:frontcache:max 0 STAT 11212:default:frontcache:oldest_live 0 STAT 11212:default:frontcache:tot_get_hits 0 STAT 11212:default:frontcache:tot_get_expires 0 STAT 11212:default:frontcache:tot_get_misses 0 STAT 11212:default:frontcache:tot_get_bytes 0 STAT 11212:default:frontcache:tot_adds 0 STAT 11212:default:frontcache:tot_add_skips 0 STAT 11212:default:frontcache:tot_add_fails 0 STAT 11212:default:frontcache:tot_add_bytes 0 STAT 11212:default:frontcache:tot_deletes 0 STAT 11212:default:frontcache:tot_evictions 0 STAT 11212:default:pstd_stats:num_upstream 13 STAT 11212:default:pstd_stats:tot_upstream 36668702 STAT 11212:default:pstd_stats:num_downstream_conn 11224 STAT 11212:default:pstd_stats:tot_downstream_conn 92673 STAT 11212:default:pstd_stats:tot_downstream_conn_acquired 95418023 STAT 11212:default:pstd_stats:tot_downstream_conn_released 94408123 STAT 11212:default:pstd_stats:tot_downstream_released 59123847 STAT 11212:default:pstd_stats:tot_downstream_reserved 59123792 STAT 11212:default:pstd_stats:tot_downstream_reserved_time 0 STAT 11212:default:pstd_stats:max_downstream_reserved_time 0 STAT 11212:default:pstd_stats:tot_downstream_freed 0 STAT 11212:default:pstd_stats:tot_downstream_quit_server 81449 STAT 11212:default:pstd_stats:tot_downstream_max_reached 0 STAT 11212:default:pstd_stats:tot_downstream_create_failed 0 STAT 11212:default:pstd_stats:tot_downstream_connect_started 92673 STAT 11212:default:pstd_stats:tot_downstream_connect_wait 92673 STAT 11212:default:pstd_stats:tot_downstream_connect 83678 STAT 11212:default:pstd_stats:tot_downstream_connect_failed 146 STAT 11212:default:pstd_stats:tot_downstream_connect_timeout 118 STAT 11212:default:pstd_stats:tot_downstream_connect_interval 928451 STAT 11212:default:pstd_stats:tot_downstream_connect_max_reached 0 STAT 11212:default:pstd_stats:tot_downstream_waiting_errors 0 STAT 11212:default:pstd_stats:tot_downstream_auth 83678 STAT 11212:default:pstd_stats:tot_downstream_auth_failed 28 STAT 11212:default:pstd_stats:tot_downstream_bucket 83678 STAT 11212:default:pstd_stats:tot_downstream_bucket_failed 0 STAT 11212:default:pstd_stats:tot_downstream_propagate_failed 439407 STAT 11212:default:pstd_stats:tot_downstream_close_on_upstream_close 0 STAT 11212:default:pstd_stats:tot_downstream_conn_queue_timeout 0 STAT 11212:default:pstd_stats:tot_downstream_conn_queue_add 0 STAT 11212:default:pstd_stats:tot_downstream_conn_queue_remove 0 STAT 11212:default:pstd_stats:tot_downstream_timeout 31811 STAT 11212:default:pstd_stats:tot_wait_queue_timeout 0 STAT 11212:default:pstd_stats:tot_auth_timeout 28 STAT 11212:default:pstd_stats:tot_assign_downstream 59123792 STAT 11212:default:pstd_stats:tot_assign_upstream 59123792 STAT 11212:default:pstd_stats:tot_assign_recursion 0 STAT 11212:default:pstd_stats:tot_reset_upstream_avail 0 STAT 11212:default:pstd_stats:tot_multiget_keys 417202790 STAT 11212:default:pstd_stats:tot_multiget_keys_dedupe 0 STAT 11212:default:pstd_stats:tot_multiget_bytes_dedupe 0 STAT 11212:default:pstd_stats:tot_optimize_sets 0 STAT 11212:default:pstd_stats:tot_retry 0 STAT 11212:default:pstd_stats:tot_retry_time 0 STAT 11212:default:pstd_stats:max_retry_time 0 STAT 11212:default:pstd_stats:tot_retry_vbucket 0 STAT 11212:default:pstd_stats:tot_upstream_paused 59123792 STAT 11212:default:pstd_stats:tot_upstream_unpaused 59123792 STAT 11212:default:pstd_stats:err_oom 0 STAT 11212:default:pstd_stats:err_upstream_write_prep 0 STAT 11212:default:pstd_stats:err_downstream_write_prep 0 STAT 11212:default:pstd_stats:tot_cmd_time 8514072310757 STAT 11212:default:pstd_stats:tot_cmd_count 59123792 STAT 11212:default:pstd_stats:tot_local_cmd_time 1397715416206 STAT 11212:default:pstd_stats:tot_local_cmd_count 12464580 STAT 11212:default:pstd_stats_cmd:regular_get:seen 40974976 STAT 11212:default:pstd_stats_cmd:regular_get:hits 0 STAT 11212:default:pstd_stats_cmd:regular_get:misses 0 STAT 11212:default:pstd_stats_cmd:regular_get:read_bytes 10804703637 STAT 11212:default:pstd_stats_cmd:regular_get:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_get:cas 28956854 STAT 11212:default:pstd_stats_cmd:regular_get_key:seen 417202790 STAT 11212:default:pstd_stats_cmd:regular_get_key:hits 439615684 STAT 11212:default:pstd_stats_cmd:regular_get_key:misses 1994688 STAT 11212:default:pstd_stats_cmd:regular_get_key:read_bytes 10339072333 STAT 11212:default:pstd_stats_cmd:regular_get_key:write_bytes 175962018687 STAT 11212:default:pstd_stats_cmd:regular_get_key:cas 0 STAT 11212:default:pstd_stats_cmd:regular_set:seen 18148539 STAT 11212:default:pstd_stats_cmd:regular_set:hits 0 STAT 11212:default:pstd_stats_cmd:regular_set:misses 0 STAT 11212:default:pstd_stats_cmd:regular_set:read_bytes 44070039206 STAT 11212:default:pstd_stats_cmd:regular_set:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_set:cas 0 STAT 11212:default:pstd_stats_cmd:regular_add:seen 0 STAT 11212:default:pstd_stats_cmd:regular_add:hits 0 STAT 11212:default:pstd_stats_cmd:regular_add:misses 0 STAT 11212:default:pstd_stats_cmd:regular_add:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_add:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_add:cas 0 STAT 11212:default:pstd_stats_cmd:regular_replace:seen 0 STAT 11212:default:pstd_stats_cmd:regular_replace:hits 0 STAT 11212:default:pstd_stats_cmd:regular_replace:misses 0 STAT 11212:default:pstd_stats_cmd:regular_replace:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_replace:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_replace:cas 0 STAT 11212:default:pstd_stats_cmd:regular_delete:seen 278 STAT 11212:default:pstd_stats_cmd:regular_delete:hits 0 STAT 11212:default:pstd_stats_cmd:regular_delete:misses 0 STAT 11212:default:pstd_stats_cmd:regular_delete:read_bytes 7739 STAT 11212:default:pstd_stats_cmd:regular_delete:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_delete:cas 0 STAT 11212:default:pstd_stats_cmd:regular_append:seen 0 STAT 11212:default:pstd_stats_cmd:regular_append:hits 0 STAT 11212:default:pstd_stats_cmd:regular_append:misses 0 STAT 11212:default:pstd_stats_cmd:regular_append:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_append:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_append:cas 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:seen 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:hits 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:misses 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_prepend:cas 0 STAT 11212:default:pstd_stats_cmd:regular_incr:seen 0 STAT 11212:default:pstd_stats_cmd:regular_incr:hits 0 STAT 11212:default:pstd_stats_cmd:regular_incr:misses 0 STAT 11212:default:pstd_stats_cmd:regular_incr:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_incr:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_incr:cas 0 STAT 11212:default:pstd_stats_cmd:regular_decr:seen 0 STAT 11212:default:pstd_stats_cmd:regular_decr:hits 0 STAT 11212:default:pstd_stats_cmd:regular_decr:misses 0 STAT 11212:default:pstd_stats_cmd:regular_decr:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_decr:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_decr:cas 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:seen 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:hits 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:misses 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_flush_all:cas 0 STAT 11212:default:pstd_stats_cmd:regular_cas:seen 0 STAT 11212:default:pstd_stats_cmd:regular_cas:hits 0 STAT 11212:default:pstd_stats_cmd:regular_cas:misses 0 STAT 11212:default:pstd_stats_cmd:regular_cas:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_cas:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_cas:cas 0 STAT 11212:default:pstd_stats_cmd:regular_stats:seen 0 STAT 11212:default:pstd_stats_cmd:regular_stats:hits 0 STAT 11212:default:pstd_stats_cmd:regular_stats:misses 0 STAT 11212:default:pstd_stats_cmd:regular_stats:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_stats:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_stats:cas 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:seen 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:hits 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:misses 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_stats_reset:cas 0 STAT 11212:default:pstd_stats_cmd:regular_version:seen 0 STAT 11212:default:pstd_stats_cmd:regular_version:hits 0 STAT 11212:default:pstd_stats_cmd:regular_version:misses 0 STAT 11212:default:pstd_stats_cmd:regular_version:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_version:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_version:cas 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:seen 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:hits 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:misses 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_verbosity:cas 0 STAT 11212:default:pstd_stats_cmd:regular_quit:seen 36668411 STAT 11212:default:pstd_stats_cmd:regular_quit:hits 0 STAT 11212:default:pstd_stats_cmd:regular_quit:misses 0 STAT 11212:default:pstd_stats_cmd:regular_quit:read_bytes 146673644 STAT 11212:default:pstd_stats_cmd:regular_quit:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_quit:cas 0 STAT 11212:default:pstd_stats_cmd:regular_getl:seen 0 STAT 11212:default:pstd_stats_cmd:regular_getl:hits 0 STAT 11212:default:pstd_stats_cmd:regular_getl:misses 0 STAT 11212:default:pstd_stats_cmd:regular_getl:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_getl:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_getl:cas 0 STAT 11212:default:pstd_stats_cmd:regular_unl:seen 0 STAT 11212:default:pstd_stats_cmd:regular_unl:hits 0 STAT 11212:default:pstd_stats_cmd:regular_unl:misses 0 STAT 11212:default:pstd_stats_cmd:regular_unl:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_unl:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_unl:cas 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:seen 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:hits 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:misses 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:read_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:write_bytes 0 STAT 11212:default:pstd_stats_cmd:regular_ERROR:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_get:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_get:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_get:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_get:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_get:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_get:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_get_key:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_set:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_set:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_set:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_set:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_set:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_set:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_add:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_add:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_add:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_add:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_add:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_add:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_replace:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_delete:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_append:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_append:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_append:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_append:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_append:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_append:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_prepend:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_incr:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_decr:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_flush_all:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_cas:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_stats:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_stats_reset:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_version:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_version:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_version:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_version:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_version:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_version:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_verbosity:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_quit:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_getl:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_unl:cas 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:seen 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:hits 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:misses 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:read_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:write_bytes 0 STAT 11212:default:pstd_stats_cmd:quiet_ERROR:cas 0 |
| Comment by Steve Yen [ 24/Apr/12 ] |
|
total connections look large...
STAT memcached:stats:total_connections 36761379 probably due to client app closing conns rather than reusing them... STAT 11212:default:pstd_stats:num_upstream 13 STAT 11212:default:pstd_stats:tot_upstream 36668702 although num downstream conns also looks large (but probably less likely to be the problem right now)... STAT 11212:default:pstd_stats:num_downstream_conn 11224 STAT 11212:default:pstd_stats:tot_downstream_conn 92673 across 4 server nodes, across 4 worker threads, each worker thread has 701 conns. Seems large. also, some downstream_timeouts, perhaps due to rebalance slowness... STAT 11212:default:pstd_stats:tot_downstream_timeout 31811 until i figure out the underlying reason for the crash, here's some workaround ideas... - change the client app config to using persistent client-side memcached connections. I recall another user did this and was able to avoid some moxi crash (possibly the same crash?) - increase downstream_timeout |
| Comment by Steve Yen [ 25/Apr/12 ] |
|
Replicated crash on OSX moxi 1.8.0-3-g5f01fa8...
Process: moxi [9936] Path: /Users/steveyen/work/membase/dev/voltron.couchbase/build/moxi/moxi Identifier: moxi Version: ??? (???) Code Type: X86-64 (Native) Parent Process: bash [8289] Date/Time: 2012-04-25 15:38:34.254 -0700 OS Version: Mac OS X 10.6.8 (10K549) Report Version: 6 Exception Type: EXC_CRASH (SIGABRT) Exception Codes: 0x0000000000000000, 0x0000000000000000 Crashed Thread: 4 Application Specific Information: Assertion failed: (d->timeout_tv.tv_sec == 0), function cproxy_reserve_downstream, file cproxy.c, line 927. Thread 0: Dispatch queue: com.apple.main-thread 0 libSystem.B.dylib 0x00007fff87c41c0a kevent + 10 1 moxi 0x000000010004be07 kq_dispatch + 423 2 moxi 0x0000000100046080 event_base_loop + 576 3 moxi 0x0000000100005ac3 main + 3891 (memcached.c:5063) 4 moxi 0x0000000100001504 start + 52 Thread 1: 0 libSystem.B.dylib 0x00007fff87c3b272 write + 10 1 moxi 0x00000001000314cd log_error_write + 749 (log.c:230) 2 moxi 0x000000010000a282 drive_machine + 242 (memcached.c:3456) 3 moxi 0x00000001000462c2 event_base_loop + 1154 4 moxi 0x00000001000128c0 worker_libevent + 128 (thread.c:274) 5 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 6 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 2: 0 libSystem.B.dylib 0x00007fff87c3b272 write + 10 1 moxi 0x00000001000314cd log_error_write + 749 (log.c:230) 2 moxi 0x0000000100001c40 conn_set_state + 160 (memcached.c:668) 3 moxi 0x0000000100026c5d a2b_process_downstream_response + 2397 (cproxy_protocol_a2b.c:764) 4 moxi 0x000000010000ac0c drive_machine + 2684 (memcached.c:1990) 5 moxi 0x00000001000462c2 event_base_loop + 1154 6 moxi 0x00000001000128c0 worker_libevent + 128 (thread.c:274) 7 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 8 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 3: 0 libSystem.B.dylib 0x00007fff87c3b272 write + 10 1 moxi 0x00000001000314cd log_error_write + 749 (log.c:230) 2 moxi 0x0000000100027a0e cproxy_process_a2b_downstream_nread + 654 (cproxy_protocol_a2b.c:541) 3 moxi 0x000000010000ac0c drive_machine + 2684 (memcached.c:1990) 4 moxi 0x00000001000462c2 event_base_loop + 1154 5 moxi 0x00000001000128c0 worker_libevent + 128 (thread.c:274) 6 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 7 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 4 Crashed: 0 libSystem.B.dylib 0x00007fff87c770b6 __kill + 10 1 libSystem.B.dylib 0x00007fff87d179f6 abort + 83 2 libSystem.B.dylib 0x00007fff87d049bc __pthread_markcancel + 0 3 moxi 0x000000010001a5b7 cproxy_reserve_downstream + 839 (cproxy.c:926) 4 moxi 0x000000010001a774 cproxy_assign_downstream + 196 (cproxy.c:1785) 5 moxi 0x0000000100021905 cproxy_process_upstream_ascii + 901 (cproxy_protocol_a.c:150) 6 moxi 0x0000000100008083 try_read_command + 835 (memcached.c:3205) 7 moxi 0x000000010000a560 drive_machine + 976 (memcached.c:3536) 8 moxi 0x00000001000462c2 event_base_loop + 1154 9 moxi 0x00000001000128c0 worker_libevent + 128 (thread.c:274) 10 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 11 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 5: 0 libSystem.B.dylib 0x00007fff87c63a6a __semwait_signal + 10 1 libSystem.B.dylib 0x00007fff87c67881 _pthread_cond_wait + 1286 2 moxi 0x0000000100011941 assoc_maintenance_thread + 305 (assoc.c:223) 3 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 4 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 6: 0 libSystem.B.dylib 0x00007fff87c6c932 select$DARWIN_EXTSN + 10 1 libconflate.0.dylib 0x00000001000c2996 Curl_socket_ready + 438 2 libconflate.0.dylib 0x00000001000b67f0 Curl_do_perform + 496 3 libconflate.0.dylib 0x0000000100096719 run_rest_conflate + 801 (rest.c:293) 4 libSystem.B.dylib 0x00007fff87c61fd6 _pthread_start + 331 5 libSystem.B.dylib 0x00007fff87c61e89 thread_start + 13 Thread 4 crashed with X86 Thread State (64-bit): rax: 0x0000000000000000 rbx: 0x0000000100053f05 rcx: 0x0000000100509ac8 rdx: 0x0000000000000000 rdi: 0x00000000000026d0 rsi: 0x0000000000000006 rbp: 0x0000000100509ae0 rsp: 0x0000000100509ac8 r8: 0x0000000000000003 r9: 0x0000000000000000 r10: 0x00007fff87c730fa r11: 0x0000000000000206 r12: 0x000000000000039f r13: 0x000000010005230f r14: 0x0000000100052e36 r15: 0x00007fff70c88790 rip: 0x00007fff87c770b6 rfl: 0x0000000000000206 cr2: 0x0000000102997000 Binary Images: 0x100000000 - 0x100064fef +moxi ??? (???) <64AE645C-76DA-41FB-A00C-C4EB4DDFE5D9> /Users/steveyen/work/membase/dev/voltron.couchbase/build/moxi/moxi 0x100093000 - 0x1000d9fef +libconflate.0.dylib ??? (???) <A2698F74-7010-74E0-F8BF-AB40EA051667> /opt/couchbase/lib/libconflate.0.dylib 0x1000e7000 - 0x1000e9ff7 +libhashkit.0.dylib ??? (???) <EAF5A8E0-FD91-4E2E-07AC-E891EAE0F003> /opt/couchbase/lib/libhashkit.0.dylib 0x1000ed000 - 0x1000f3fef +libvbucket.1.dylib 3.0.0 (compatibility 3.0.0) <2581DCBA-8D50-E90A-ECAF-372830B80C0D> /opt/couchbase/lib/libvbucket.1.dylib 0x1000f9000 - 0x100109fef +libmemcached.6.dylib 7.0.0 (compatibility 7.0.0) <58E35902-BFC2-DF19-5802-7FAC3E201773> /opt/couchbase/lib/libmemcached.6.dylib 0x7fff5fc00000 - 0x7fff5fc3bdef dyld 132.1 (???) <69130DA3-7CB3-54C8-ABC5-423DECDD2AF7> /usr/lib/dyld 0x7fff813a2000 - 0x7fff813a6ff7 libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib 0x7fff85634000 - 0x7fff85645ff7 libz.1.dylib 1.2.3 (compatibility 1.0.0) <97019C74-161A-3488-41EC-A6CA8738418C> /usr/lib/libz.1.dylib 0x7fff87c28000 - 0x7fff87de9fef libSystem.B.dylib 125.2.11 (compatibility 1.0.0) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib 0x7fff8835d000 - 0x7fff8839afff com.apple.LDAPFramework 2.0 (120.1) <54A6769E-D7E2-DBE2-EA61-87B9EA355DA4> /System/Library/Frameworks/LDAP.framework/Versions/A/LDAP 0x7fff88499000 - 0x7fff884abfe7 libsasl2.2.dylib 3.15.0 (compatibility 3.0.0) <76B83C8D-8EFE-4467-0F75-275648AFED97> /usr/lib/libsasl2.2.dylib 0x7fff884ef000 - 0x7fff88510fff libresolv.9.dylib 41.1.0 (compatibility 1.0.0) <9410EC7F-4D24-6740-AFEE-90405750FAD7> /usr/lib/libresolv.9.dylib 0x7fff8a461000 - 0x7fff8a580fe7 libcrypto.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <14115D29-432B-CF02-6B24-A60CC533A09E> /usr/lib/libcrypto.0.9.8.dylib 0x7fff8a6b3000 - 0x7fff8a6b4ff7 com.apple.TrustEvaluationAgent 1.1 (1) <5952A9FA-BC2B-16EF-91A7-43902A5C07B6> /System/Library/PrivateFrameworks/TrustEvaluationAgent.framework/Versions/A/TrustEvaluationAgent 0x7fff8a6f7000 - 0x7fff8a734ff7 libssl.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <F743389F-F25A-A77D-4FCA-D6B01AF2EE6D> /usr/lib/libssl.0.9.8.dylib 0x7fffffe00000 - 0x7fffffe01fff libSystem.B.dylib ??? (???) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib |
| Comment by Steve Yen [ 25/Apr/12 ] |
| Added scripts from Keith B. and moxi -vvv output leading up to the crash |
| Comment by Steve Yen [ 25/Apr/12 ] |
| http://review.couchbase.org/15232 |
| Comment by Steve Yen [ 26/Apr/12 ] |
| Test using Keith's scripts has been running for several hours now with the fix, without a crash. It only stopped because somebody reset the cluster. |
| Comment by Steve Yen [ 10/May/12 ] |
| reopening to get it out of the current sprint |