Couchbase
  • Why NoSQL?
  • Couchbase Server
  • Download
  • Resources
  • Careers
Home | Forums | SDKs | SDKs

Timed out while reading configuration over HTTP

3 replies [Last post]
  • Login or register to post comments
Sun, 02/10/2013 - 06:00
uvmarko
Offline
Joined: 07/19/2012
Groups: None

Hi I am struggling with an issue connecting to cb.
I have a cluster of 3 couchbase 2.0 community servers running on amazon ec2s. client servers are connected to them using the Java API through Tomcat
today I got a Crash on on of the client machines :

The system is going down for power off NOW!
...
INFO: Closing Spring root WebApplicationContext
Feb 10, 2013 8:35:53 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [Memcached IO over {MemcachedConnection to ip-10-196-218-47.ec2.internal/10.196.218.47:11210 domU-12-31-39-02-31-09.compute-1.internal/10.248.54.247:11210 ip-10-46-217-81.ec2.internal/10.46.217.81:11210}] but has failed to stop it. This is very likely to create a memory leak.
Feb 10, 2013 8:35:53 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [New I/O client worker #1-1] but has failed to stop it. This is very likely to create a memory leak.
...

I've terminated this instance and created a new one but it can't seem to connect to the cluster , I get this error in Java :

013-02-10 09:43:25.247 WARN com.couchbase.client.vbucket.ConfigurationProviderHTTP:  Timed out while reading configuration over HTTP
java.net.SocketTimeoutException: connect timed out

I've got two machines that are connected and working, and one that fails to connect, I tried it with new machines and they all fail. I've tried connecting through different servers in the cb cluster and still fail.
CBStats all on the bucket and node gives me :

accepting_conns:                    1
 auth_cmds:                          2831
 auth_errors:                        0
 bucket_active_conns:                1
 bucket_conns:                       18
 bytes:                              1382987584
 bytes_read:                         18683928043
 bytes_written:                      295454749194
 cas_badval:                         21308
 cas_hits:                           1137248
 cas_misses:                         0
 cmd_flush:                          0
 cmd_get:                            11812704
 cmd_set:                            329016
 conn_yields:                        24850
 connection_structures:              5000
 curr_connections:                   37
 curr_conns_on_port_11209:           27
 curr_conns_on_port_11210:           8
 curr_items:                         289612
 curr_items_tot:                     572649
 curr_temp_items:                    0
 daemon_connections:                 4
 decr_hits:                          0
 decr_misses:                        0
 delete_hits:                        38951
 delete_misses:                      4611
 ep_access_scanner_last_runtime:     0
 ep_access_scanner_num_items:        33276
 ep_access_scanner_task_time:        2013-02-11 10:00:00
 ep_allow_data_loss_during_shutdown: 1
 ep_alog_block_size:                 4096
 ep_alog_path:                       /mnt/ebs/PoolPartyBucket/access.log
 ep_alog_sleep_time:                 1440
 ep_alog_task_time:                  10
 ep_backend:                         couchdb
 ep_bg_fetch_delay:                  0
 ep_bg_fetched:                      169
 ep_bg_load:                         861623
 ep_bg_load_avg:                     5098
 ep_bg_max_load:                     79133
 ep_bg_max_wait:                     3817
 ep_bg_meta_fetched:                 0
 ep_bg_min_load:                     96
 ep_bg_min_wait:                     42
 ep_bg_num_samples:                  169
 ep_bg_remaining_jobs:               0
 ep_bg_wait:                         33946
 ep_bg_wait_avg:                     200
 ep_chk_max_items:                   5000
 ep_chk_period:                      1800
 ep_chk_persistence_remains:         0
 ep_chk_persistence_timeout:         10
 ep_chk_remover_stime:               5
 ep_commit_num:                      2515908
 ep_commit_time:                     11
 ep_commit_time_total:               26399696
 ep_concurrentDB:                    1
 ep_config_file:                     
 ep_couch_bucket:                    PoolPartyBucket
 ep_couch_host:                      localhost
 ep_couch_port:                      11213
 ep_couch_reconnect_sleeptime:       250
 ep_couch_response_timeout:          180000
 ep_data_age:                        1
 ep_data_age_highwat:                37
 ep_data_traffic_enabled:            0
 ep_dbinit:                          0
 ep_dbname:                          /mnt/ebs/PoolPartyBucket
 ep_degraded_mode:                   0
 ep_diskqueue_drain:                 2927055
 ep_diskqueue_fill:                  2926454
 ep_diskqueue_items:                 0
 ep_diskqueue_memory:                0
 ep_diskqueue_pending:               0
 ep_exp_pager_stime:                 3600
 ep_expired_access:                  0
 ep_expired_pager:                   0
 ep_expiry_window:                   3
 ep_failpartialwarmup:               0
 ep_flush_all:                       false
 ep_flush_duration_total:            33458
 ep_flushall_enabled:                0
 ep_flusher_state:                   running
 ep_flusher_todo:                    0
 ep_getl_default_timeout:            15
 ep_getl_max_timeout:                30
 ep_ht_locks:                        5
 ep_ht_size:                         3079
 ep_inconsistent_slave_chk:          0
 ep_initfile:                        
 ep_io_num_read:                     169
 ep_io_num_write:                    2921439
 ep_io_read_bytes:                   213771
 ep_io_write_bytes:                  17503118437
 ep_item_begin_failed:               0
 ep_item_commit_failed:              0
 ep_item_flush_expired:              0
 ep_item_flush_failed:               0
 ep_item_num_based_new_chk:          1
 ep_items_rm_from_checkpoints:       1770140
 ep_keep_closed_chks:                0
 ep_klog_block_size:                 4096
 ep_klog_compactor_queue_cap:        500000
 ep_klog_compactor_stime:            3600
 ep_klog_flush:                      commit2
 ep_klog_max_entry_ratio:            10
 ep_klog_max_log_size:               2147483647
 ep_klog_path:                       
 ep_klog_sync:                       commit2
 ep_kv_size:                         1268463867
 ep_max_checkpoints:                 2
 ep_max_data_size:                   2097152000
 ep_max_item_size:                   20971520
 ep_max_size:                        2097152000
 ep_max_txn_size:                    10000
 ep_max_vbuckets:                    1024
 ep_mem_high_wat:                    1572864000
 ep_mem_low_wat:                     1258291200
 ep_mem_tracker_enabled:             true
 ep_min_data_age:                    0
 ep_mlog_compactor_runs:             0
 ep_mutation_mem_threshold:          0
 ep_num_access_scanner_runs:         23
 ep_num_eject_failures:              105711
 ep_num_expiry_pager_runs:           548
 ep_num_non_resident:                199322
 ep_num_not_my_vbuckets:             5
 ep_num_ops_del_meta:                0
 ep_num_ops_get_meta:                319
 ep_num_ops_set_meta:                0
 ep_num_pager_runs:                  2
 ep_num_value_ejects:                199759
 ep_oom_errors:                      0
 ep_overhead:                        18173688
 ep_pager_active_vb_pcnt:            40
 ep_pager_unbiased_period:           60
 ep_pending_ops:                     0
 ep_pending_ops_max:                 0
 ep_pending_ops_max_duration:        0
 ep_pending_ops_total:               0
 ep_postInitfile:                    
 ep_queue_age_cap:                   900
 ep_queue_size:                      0
 ep_startup_time:                    1358522939
 ep_storage_age:                     0
 ep_storage_age_highwat:             35
 ep_store_max_concurrency:           10
 ep_store_max_readers:               9
 ep_store_max_readwrite:             1
 ep_stored_val_type:                 
 ep_tap_ack_grace_period:            300
 ep_tap_ack_initial_sequence_number: 1
 ep_tap_ack_interval:                1000
 ep_tap_ack_window_size:             10
 ep_tap_backfill_resident:           0.9
 ep_tap_backlog_limit:               5000
 ep_tap_backoff_period:              5
 ep_tap_bg_fetch_requeued:           0
 ep_tap_bg_fetched:                  3043860
 ep_tap_bg_max_pending:              500
 ep_tap_keepalive:                   300
 ep_tap_noop_interval:               20
 ep_tap_requeue_sleep_time:          0.1
 ep_tap_throttle_cap_pcnt:           10
 ep_tap_throttle_queue_cap:          1000000
 ep_tap_throttle_threshold:          90
 ep_tmp_oom_errors:                  0
 ep_too_old:                         0
 ep_too_young:                       0
 ep_total_cache_size:                1261592079
 ep_total_del_items:                 121353
 ep_total_enqueued:                  2926454
 ep_total_new_items:                 608274
 ep_total_persisted:                 2921439
 ep_uncommitted_items:               0
 ep_value_size:                      1205492644
 ep_vb0:                             0
 ep_vb_snapshot_total:               403443
 ep_vb_total:                        682
 ep_vbucket_del:                     0
 ep_vbucket_del_fail:                0
 ep_version:                         2.0.0r_145_g4ac7f92
 ep_waitforwarmup:                   0
 ep_warmup:                          1
 ep_warmup_batch_size:               1000
 ep_warmup_dups:                     0
 ep_warmup_min_items_threshold:      100
 ep_warmup_min_memory_threshold:     100
 ep_warmup_oom:                      0
 ep_warmup_thread:                   complete
 ep_warmup_time:                     17646
 get_hits:                           11774820
 get_misses:                         37884
 incr_hits:                          0
 incr_misses:                        0
 libevent:                           2.0.11-stable
 limit_maxbytes:                     67108864
 listen_disabled_num:                0
 max_conns_on_port_11209:            1000
 max_conns_on_port_11210:            9000
 mem_used:                           1382987584
 pid:                                1512
 pointer_size:                       64
 rejected_conns:                     0
 rusage_system:                      4168.646269
 rusage_user:                        28050.627656
 tap_checkpoint_end_received:        247989
 tap_checkpoint_end_sent:            246888
 tap_checkpoint_start_received:      249839
 tap_checkpoint_start_sent:          247791
 tap_connect_received:               246
 tap_delete_received:                82679
 tap_delete_sent:                    3032897
 tap_mutation_received:              1350278
 tap_mutation_sent:                  16131847
 tap_opaque_received:                6656
 tap_opaque_sent:                    46796
 tap_vbucket_set_received:           682
 threads:                            4
 time:                               1360499327
 total_connections:                  4740
 uptime:                             1976414
 vb_active_curr_items:               289612
 vb_active_eject:                    62755
 vb_active_expired:                  0
 vb_active_ht_memory:                8563192
 vb_active_itm_memory:               758064076
 vb_active_meta_data_memory:         29914947
 vb_active_num:                      341
 vb_active_num_non_resident:         62586
 vb_active_num_ref_ejects:           53
 vb_active_num_ref_items:            43915
 vb_active_ops_create:               325000
 vb_active_ops_delete:               55701
 vb_active_ops_reject:               0
 vb_active_ops_update:               1170615
 vb_active_perc_mem_resident:        78
 vb_active_queue_age:                0
 vb_active_queue_drain:              1553967
 vb_active_queue_fill:               1553797
 vb_active_queue_memory:             0
 vb_active_queue_pending:            0
 vb_active_queue_size:               0
 vb_dead_num:                        0
 vb_pending_curr_items:              0
 vb_pending_eject:                   0
 vb_pending_expired:                 0
 vb_pending_ht_memory:               0
 vb_pending_itm_memory:              0
 vb_pending_meta_data_memory:        0
 vb_pending_num:                     0
 vb_pending_num_non_resident:        0
 vb_pending_num_ref_ejects:          0
 vb_pending_num_ref_items:           0
 vb_pending_ops_create:              0
 vb_pending_ops_delete:              0
 vb_pending_ops_reject:              0
 vb_pending_ops_update:              0
 vb_pending_perc_mem_resident:       0
 vb_pending_queue_age:               0
 vb_pending_queue_drain:             0
 vb_pending_queue_fill:              0
 vb_pending_queue_memory:            0
 vb_pending_queue_pending:           0
 vb_pending_queue_size:              0
 vb_replica_curr_items:              283037
 vb_replica_eject:                   137004
 vb_replica_expired:                 0
 vb_replica_ht_memory:               8563192
 vb_replica_itm_memory:              503528003
 vb_replica_meta_data_memory:        29262613
 vb_replica_num:                     341
 vb_replica_num_non_resident:        136736
 vb_replica_num_ref_ejects:          145
 vb_replica_num_ref_items:           1939
 vb_replica_ops_create:              283274
 vb_replica_ops_delete:              65652
 vb_replica_ops_reject:              0
 vb_replica_ops_update:              1021197
 vb_replica_perc_mem_resident:       51
 vb_replica_queue_age:               0
 vb_replica_queue_drain:             1373088
 vb_replica_queue_fill:              1372657
 vb_replica_queue_memory:            0
 vb_replica_queue_pending:           0
 vb_replica_queue_size:              0
 version:                            1.4.4_600_g7ea975a

so I know the cluster can accept new connections
I can telnet to all ports without a problem from the client to any CB node.
Everything was working fine except inconsistent number of replicated Items that started in a previous crash. but other than that for the last 3 weeks I haven't experience any trouble with the CB cluster.

Can anyone provide any insight for the root of this problem?

Thanks

Top
  • Login or register to post comments
Sun, 02/10/2013 - 14:34
ingenthr
Offline
Joined: 03/16/2010
Groups:

For the first issue, you'll want to be calling the .shutdown() method on the client object at the appropriate time. That should make that issue go away.

For the second, I know you have connectivity, but there could be something wrong when trying to get the bucket configuration. Are there any unusual entries in the logs on the cluster?

You may also try getting the configuration for a bucket like a client would. During bootstrap, it'll go to a URL like:
http://(node):8091/pools/default/bucketsStreaming/(bucketname)

You may want to try that with some other HTTP client to see if it's seeing the same thing.

If you want to see this from the Java client, turn the debug to either CONFIG or DEBUG level on com.couchbase.client.vbucket.* (depending on whether you're using JDK logging or Log4J-- see the docs). That would be good info on what exact URL it's trying to retrieve when timing out.

Top
  • Login or register to post comments
Mon, 02/11/2013 - 01:44
uvmarko
Offline
Joined: 07/19/2012
Groups: None

Thanks for reply, I created a new cluster and replicated all the data to it via XDCR. then I switched the client to connect to it. but the same issues reappeared, initial connection was fine on two servers but the third one failed, and after a while a server closed the connection which is weird since everything was working fine before.
I replaced the Java API to version 1.1 and used Connection Factory to create the connection and now things appears to work fine, I haven't seen a problem connecting since the update but not enough time has past to tell if its reliable.
I want to implement connection shutdown like you suggested, but I am not sure whats the correct way to do it, since my application needs to be constantly connected to the CB servers, should I do it with Connection Manager like the beer example?
Whats the point of shutting down a client when its useless without a connection, I mean a shutdown might be needed by the nodes, but the client is killed in my system if a connection fails so is there still a point for a client shutdown?
Why is the memcached thread suspended? for a month it worked fine and now in 2 days it happend every few hours, could this be related to the configuration fetching problem?

Thanks a lot for the help!
Yuval

Top
  • Login or register to post comments
Thu, 03/28/2013 - 05:55
uvmarko
Offline
Joined: 07/19/2012
Groups: None

Hi,
So I am still having a problem when a new instance tries to connect to the couchbase cluster, other issues seems to be solved.
Again when I have 2 instances running and connected to the cb cluster, I am starting a new instance and I receive this warning after enabling logging

2013-03-28 12:39:08 DEBUG ConfigurationProviderHTTP:? - Attempting to read configuration from URI: http://xx.xxx.xxx.xxx:8091/pools
2013-03-28 12:39:09 WARN ConfigurationProviderHTTP:? - Timed out while reading configuration over HTTP
java.net.SocketTimeoutException: connect timed out

when I try to get http://xx.xxx.xxx.xxx:8091/pools using curl, I am getting the file correctly. from the same server.
what should I do next to solve this issue?
Thanks

Top
  • Login or register to post comments
  • Login or register to post comments
  • Login
  • Register

Company

  • About Us
  • Leadership
  • Customers
  • Partners
  • Contact Us

Product

  • Couchbase Server
  • Couchbase SDKs
  • Use Cases
  • Documentation
  • Forums

Open Source

  • Couchbase Project
  • Couchbase vs. CouchDB

Commercial

  • Subscriptions & Support
  • Training & Services

News

  • Blog
  • Newsletter
  • Press Releases
  • Buzz

Follow Us

    
  • Customer Login
  • Terms of Service
  • Privacy Policy
  • Trademark Policy
  • Site Map

© 2013 COUCHBASE All rights reserved.

Sign in to Couchbase Community

close
  • Create new account
  • Request new password
You are logging into the Forums, Wiki and Issue Tracker