[MB-7479] No data returned when using 2 or more nodes via gui as well as server crash Created: 02/Jan/13  Updated: 05/Apr/13  Resolved: 27/Feb/13

Status: Closed
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 2.0
Fix Version/s: 2.1.0
Security Level: Public

Type: Bug Priority: Major
Reporter: makeawish Assignee: Xiaoqin Ma (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:  running version 2.0.0 enterprise edition (build-1976) on 2 Servers Ubuntu 11.10

Attachments: File ns-diag-20130103182922.rar     JPEG File screenshot-1.jpg     JPEG File screenshot-2.jpg     JPEG File screenshot-3.jpg     JPEG File screenshot-4.jpg    

 Description   
Installed 2 servers with the latest version, created the initial server, and added some basic docs (index like post_picture_1 and so on); the docs are in JSON format.
As long as it is only a single server, all works fine. When I add a 2nd server and finish the rebalance, I see the docs are distributed among both servers. When I go
to Buckets, it shows an activity indicator and comes back with "There are currently no documents in this bucket." If I search for the doc by index, it either
returns nothing or sometimes reports that it lost communication with the server. Going back, in the OS, to the server the GUI was run on, I can verify
that Couchbase is no longer running and I need to start it manually again.
So I am not certain whether the 2 issues are related, or why there is no data in the document screen in the GUI once I use a cluster. All works normally
with a single server. I was also able to reproduce the same issues with a 3-server cluster.

 Comments   
Comment by Mike Wiederhold [ 02/Jan/13 ]
Can you attach the server logs so I can take a look? They can be generated by running /opt/couchbase/bin/collectinfo. You can also get them by clicking on the Logs tab in the web console and then clicking the link that says "generate diagnostic report".
Comment by makeawish [ 03/Jan/13 ]
Uploaded diag file
If you need anything else, let me know
Comment by makeawish [ 03/Jan/13 ]
Shows both servers and 5 docs
Comment by makeawish [ 03/Jan/13 ]
Data Bucket shows also 5 docs in default
Comment by makeawish [ 03/Jan/13 ]
Screen shows searching for docs for about 10 to 15 sec
Comment by makeawish [ 03/Jan/13 ]
Returns no docs in bucket even though there are docs in the bucket
Comment by Mike Wiederhold [ 04/Jan/13 ]
How much memory does each server have? Can you also run top on one of the machines and paste the output here? From looking at the logs, it appears that one of the processes in the server is being killed by the OS. This information will help me understand whether this is the case.
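One possible reason for the OS killing a process on a low-memory host is the Linux out-of-memory (OOM) killer. That is an assumption here, not something the logs in this ticket confirm. A minimal sketch for scanning kernel log text (for example, saved output of dmesg) for OOM kills follows; the "Killed process <pid> (<name>)" wording is assumed from typical Linux kernel messages, and the sample lines are hypothetical, not taken from the attached diagnostics.

```python
import re

# Assumed kernel wording: "Killed process <pid> (<name>) ..."
KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")

def find_oom_kills(kernel_log):
    """Return (pid, process name) pairs for every kill line found in the log text."""
    return [(int(m.group(1)), m.group(2)) for m in KILLED.finditer(kernel_log)]

# Hypothetical sample lines, for illustration only.
sample = """\
[1234.5] Out of memory: Kill process 4321 (memcached) score 900 or sacrifice child
[1234.6] Killed process 4321 (memcached) total-vm:600000kB, anon-rss:400000kB, file-rss:0kB
"""

print(find_oom_kills(sample))
```

If this returns an entry for memcached on the affected node, memory pressure would be the likely cause; an empty list means the kill came from somewhere else or the log has rotated.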
Comment by makeawish [ 09/Jan/13 ]
Right now they run with 512 MB, but I thought it was a memory issue and moved the VMs to GB, and it still behaved the same way. The single server (non-cluster) behaves just fine with 512 MB or 8 GB.
Comment by Chiyoung Seo [ 15/Feb/13 ]
For the bug distributions in the engine team.
Comment by Xiaoqin Ma (Inactive) [ 20/Feb/13 ]
@makeawish, what is the size of your document? Do you have log files? Thanks!
Comment by Mike Wiederhold [ 20/Feb/13 ]
Xiaoqin,

The logs for this issue are already attached.
Comment by Mike Wiederhold [ 20/Feb/13 ]
I looked through the logs and memcached looks like it is continually killed by the OS for some reason. See the message below where sigkill (137) is reported. I don't think there is anything we can do about this since a non-Couchbase process is killing us.

=========================CRASH REPORT=========================
  crasher:
    initial call: ns_port_server:init/1
    pid: <0.881.0>
    registered_name: ns_port_memcached
    exception exit: {abnormal,137}
      in function gen_server:terminate/6
    ancestors: [<0.879.0>,ns_port_sup,ns_server_sup,ns_server_cluster_sup,
                  <0.58.0>]
    messages: [{'EXIT',#Port<0.7456>,normal}]
    links: [<0.879.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 75025
    stack_size: 24
    reductions: 1342314
  neighbours:

[error_logger:error,2012-12-20T0:58:59.107,ns_1@10.181.2.115:error_logger<0.5.0>:ale_error_logger_handler:log_msg:76]** Generic server <0.879.0> terminating
** Last message in was {die,{abnormal,137}}
** When Server state == {state,memcached,5000,
                               {1355,962940,896183},
                               undefined,infinity}
** Reason for termination ==
** {abnormal,137}
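
The reading of 137 as SIGKILL above follows the common Unix convention that an exit status above 128 means "128 + signal number", so 137 = 128 + 9. A minimal sketch of that decoding, assuming the status reported by ns_port_server follows this convention (the helper name describe_exit_status is hypothetical):

```python
import signal

def describe_exit_status(status):
    """Interpret a shell-style exit status; values above 128 conventionally mean death by signal."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "signal %d" % signum
        return "terminated by %s" % name
    return "exited with code %d" % status

print(describe_exit_status(137))
```

SIGKILL cannot be caught or handled by the target process, which is consistent with memcached leaving no error output of its own before the port server reported the exit.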
Comment by Mike Wiederhold [ 27/Feb/13 ]
I'm closing this issue since the logs show that the memcached process appears to be killed by the OS.
Comment by Maria McDuff (Inactive) [ 05/Apr/13 ]
sigkill
Generated at Thu Oct 23 00:26:09 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.