[MB-5651] modify ec2 security group and open traffic to/from port 11209 (Replication doesn't work on EC2.) Created: 22/Jun/12  Updated: 19/Feb/14  Resolved: 12/Sep/12

Status: Closed
Project: Couchbase Server
Component/s: documentation
Affects Version/s: None
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Major
Reporter: Pavel Paulau Assignee: MC Brown (Inactive)
Resolution: Fixed Votes: 0
Labels: 1.8.1-release-notes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: EC2, Ubuntu 12.04

ec2-174-129-82-112.compute-1.amazonaws.com (Administrator:password, root:couchbase)

Attachments: GZip Archive ns-diag-20120622171727.tar.gz    

 Description   
Build:
-- 2.0.0-1351-rel

Security group:
-- couchbase

Steps:
-- create 4 nodes cluster (nodes use private ip addresses)
-- create default bucket with 1 replica enabled
-- insert arbitrary number of items (2M in my case)

Actual result:
-- no doc is replicated

Notice:
-- It's not about XDCR.

 Comments   
Comment by Chiyoung Seo [ 22/Jun/12 ]
Mike, please do the initial investigation. Thanks.
Comment by Mike Wiederhold [ 25/Jun/12 ]
Ebucketmigrator is timing out while connecting to memcached. It looks like memcached might have crashed or become unresponsive after the initial vbucket creation. This appears to be the case since we no longer see stats updates from ns_server in the logs. There are no crash reports from memcached, though.

=========================CRASH REPORT=========================
  crasher:
    initial call: ebucketmigrator_srv:init/1
    pid: <0.3854.0>
    registered_name: []
    exception error: no match of right hand side value {error,timeout}
      in function ebucketmigrator_srv:connect/4
      in call from ebucketmigrator_srv:init/1
    ancestors: ['ns_vbm_new_sup-default','ns_memcached_sup-default',
                  'single_bucket_sup-default',<0.1187.0>]
    messages: []
    links: [#Port<0.8470>,<0.1200.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 610
    stack_size: 24
    reductions: 7063
  neighbours:

[error_logger:error] [2012-06-22 16:46:33] [ns_1@10.121.13.111:error_logger:ale_error_logger_handler:log_report:72]
Comment by Mike Wiederhold [ 25/Jun/12 ]
Lowering priority since I don't have much information to address further.
Comment by Mike Wiederhold [ 25/Jun/12 ]
Assigning back to Pavel. Is this reproducible or do you have a cluster I can look at?
Comment by Karan Kumar (Inactive) [ 25/Jun/12 ]
@Pavel:

Please put the build number on the ticket.
Comment by Pavel Paulau [ 25/Jun/12 ]
try this cluster:

ec2-174-129-98-20.compute-1.amazonaws.com
ec2-23-20-111-39.compute-1.amazonaws.com
ec2-23-22-235-210.compute-1.amazonaws.com
ec2-107-20-81-224.compute-1.amazonaws.com
Comment by Mike Wiederhold [ 25/Jun/12 ]
The security group doesn't have port 11209 open, and this port is used by ns_server for replication. Karan has just edited the security group to open this port.
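
For anyone hitting the same symptom, a quick way to confirm this kind of problem is to test TCP reachability of port 11209 between nodes. The following is only a minimal sketch using the Python standard library; the function names `port_is_open` and `unreachable_nodes` are made up for illustration and are not part of any Couchbase tooling.

```python
import socket

# Port named in this ticket as the one ns_server uses for replication.
REPLICATION_PORT = 11209


def port_is_open(host, port=REPLICATION_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_nodes(hosts, port=REPLICATION_PORT, timeout=3.0):
    """Return the subset of hosts that do not accept connections on port."""
    return [h for h in hosts if not port_is_open(h, port, timeout)]
```

Run `unreachable_nodes([...])` from each node against the private addresses of the other nodes. A port blocked by a security group will typically show up as a timeout rather than an immediate refusal, which would be consistent with the `{error,timeout}` in the crash report above.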
Comment by Pavel Paulau [ 25/Jun/12 ]
I think we also should fix this:

http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-network-ports.html
Comment by Farshid Ghods (Inactive) [ 25/Jun/12 ]
Add port 11209 to the 2.0 and 1.8 manuals:
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-network-ports.html
Comment by MC Brown (Inactive) [ 12/Sep/12 ]
Documentation has been updated with port 11209.
Generated at Tue Jul 29 09:55:44 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.