[MB-7238] ns_server is still validating ip address in ip file even if erlang already has node name defined (was: 2.0 Build 1941: Couchbase Server does not start after a change in IP, server is looking for the old IP even after the hostname resolves to the new one.) Created: 21/Nov/12 Updated: 29/Nov/12 Resolved: 29/Nov/12 |
|
| Status: | Resolved |
| Project: | Couchbase Server |
| Component/s: | ns_server |
| Affects Version/s: | 2.0-beta-2, 2.0 |
| Fix Version/s: | 2.0 |
| Security Level: | Public |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Bala Keelapudi | Assignee: | Aliaksey Artamonau |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
Couchbase Server does not start after a change in IP: the server keeps looking for the old IP even after the hostname resolves to the new one. We followed the best-practices information to configure the hostname in the couchbase-server file, and the issue is reproducible in the 1941 2.0 build.

Error messages from the log:
[ns_server:info,2012-11-16T13:16:45.502,ns_1@FQDN:dist_manager<0.2732.0>:dist_manager:read_address_config:55]Reading ip config from "/opt/couchbase/var/lib/couchbase/ip"
[ns_server:warn,2012-11-16T13:16:45.522,ns_1@FQDN:dist_manager<0.2732.0>:dist_manager:is_good_address:81]Cannot listen on address `OLD IP`: eaddrnotavail

The logs are available at the link below:
https://s3.amazonaws.com/customers.couchbase.com/jawfishgames/couch14-build-1914.zip

Update: apparently, as part of setting up the node name, folks just left the original /opt/couchbase/var/lib/couchbase/ip in place. The ns_server bug is that it attempts to validate the address in that file even though the address won't actually be used. |
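The check described in the update can be illustrated with a small sketch. This is not ns_server's code (ns_server is written in Erlang); it is a Python approximation of the idea, and the names `can_listen_on` and `address_to_validate` are hypothetical. It assumes the "Cannot listen on address" warning comes from trying to bind a socket to the address read from the ip file, and it shows the behaviour the ticket title asks for: skip that validation when a node name is already defined.

```python
import errno
import socket


def can_listen_on(address):
    """Try to bind a TCP socket to (address, any free port).

    Returns (True, "ok") on success, otherwise (False, reason), where
    reason is the symbolic errno name (e.g. "EADDRNOTAVAIL") if known.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((address, 0))
        return True, "ok"
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc))


def address_to_validate(node_name_configured, ip_file_contents):
    """Hypothetical decision rule: which address, if any, should be checked.

    If a node name is already configured, the ip file is not going to be
    used, so there is nothing to validate. An empty file also yields None.
    """
    if node_name_configured:
        return None
    address = ip_file_contents.strip()
    return address or None
```

With a rule like this, a leftover ip file containing the old address would be ignored once the hostname is configured, instead of blocking startup with eaddrnotavail.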
| Comments |
| Comment by Steve Yen [ 21/Nov/12 ] |
| abhinav now attempting to reproduce |
| Comment by Steve Yen [ 21/Nov/12 ] |
|
please also get DNS diagnostic info... like ping (ask alk).
it could be as simple as DNS propagation delay. |
| Comment by Aleksey Kondratenko [ 21/Nov/12 ] |
|
From the error message it appears that the DNS resolver still thinks the old IP is assigned to this hostname. To help diagnose this I need both cbcollect_info (or just the output of ifconfig -a) and information about which IP this hostname resolves to. A simple way to get the latter is to ping the hostname and send me the output. |
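As an alternative to reading ping output, the resolution check can be scripted. The following is a minimal sketch using only the Python standard library; the helper name `resolve_ipv4` is hypothetical. It asks the system resolver (which consults /etc/hosts and DNS according to the host's configuration) which IPv4 addresses a name maps to, so the result can be compared with the address shown by ifconfig -a.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    Uses the system resolver via getaddrinfo, so the answer reflects
    /etc/hosts and DNS as configured on the machine running the script.
    """
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, "->", ", ".join(resolve_ipv4(name)))
```

If the printed address is the old IP, the resolver is stale; if it is the new IP and the server still fails to start, the old address is coming from somewhere else.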
| Comment by Abhinav Dangeti [ 21/Nov/12 ] |
|
- Started with 10.1.3.235 and 10.1.3.236 (build 1954)
- set the host IPs in /etc/hosts
- stopped couchbase-server on 10.1.3.236
- changed the IP of 10.1.3.236 to 10.1.3.222
- updated /etc/hosts to resolve to the new IP
- started couchbase-server back up on 10.1.3.222; the server never comes back up.

[ Servers available as is ]

10.1.3.222>> ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:50:56:97:02:D2
  inet addr:10.1.3.222 Bcast:10.255.255.255 Mask:255.0.0.0
  inet6 addr: fe80::250:56ff:fe97:2d2/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
  RX packets:109556236 errors:0 dropped:0 overruns:0 frame:0
  TX packets:108558164 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:83579290130 (77.8 GiB) TX bytes:86321632329 (80.3 GiB)
lo Link encap:Local Loopback
  inet addr:127.0.0.1 Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING MTU:16436 Metric:1
  RX packets:72816990 errors:0 dropped:0 overruns:0 frame:0
  TX packets:72816990 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:120374350544 (112.1 GiB) TX bytes:120374350544 (112.1 GiB)
sit0 Link encap:IPv6-in-IPv4
  NOARP MTU:1480 Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

<<Attaching cbcollectinfo_10_1_3_222.zip>> |
| Comment by Aleksey Kondratenko [ 21/Nov/12 ] |
|
a) I cannot access .222. b) I don't have the ping output that I need in order to understand more. |
| Comment by Abhinav Dangeti [ 21/Nov/12 ] |
|
If we use ifconfig eth0 10.1.3.222 to change the IP, we see the issue.
However, the issue doesn't occur when the IP is changed this way:

vim /etc/sysconfig/network-scripts/ifcfg-eth0 (comment out BOOTPROTO=dhcp and set it to static)
..
# Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
DEVICE=eth0
#BOOTPROTO=dhcp
BOOTPROTO=static
ONBOOT=yes
IPADDR=10.1.3.222
GATEWAY=10.1.0.1
NETMASK=255.255.0.0
...
sudo /etc/init.d/network restart

This worked because /opt/couchbase/var/lib/couchbase/ip was empty. |
| Comment by Aleksey Kondratenko [ 21/Nov/12 ] |
| Looks like /opt/couchbase/var/lib/.../ip is still being used somehow. I recommend manually deleting it. It's still ns_server's bug if we try to do anything with it when a hostname is specified. |
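Before deleting the file by hand, it may help to confirm that it really holds a leftover address. The following is a minimal sketch, not part of Couchbase; the helper name `stale_ip_entry` is hypothetical, and the default path is the full ip file path quoted in the description. It only reads the file and changes nothing.

```python
from pathlib import Path

# Full path as quoted in the ticket description.
IP_FILE = Path("/opt/couchbase/var/lib/couchbase/ip")


def stale_ip_entry(path=IP_FILE):
    """Return the address stored in the ip file, or None.

    None means the file is missing or contains only whitespace, which is
    the state in which the thread reports the restart worked.
    """
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return None
    return text or None


if __name__ == "__main__":
    entry = stale_ip_entry()
    if entry is None:
        print("ip file is missing or empty")
    else:
        print("ip file still contains:", entry)
```

A non-empty result that matches the old IP is consistent with the eaddrnotavail warning quoted in the description.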
| Comment by Steve Yen [ 26/Nov/12 ] |
|
The ip file was/is being used in the cbupgrade situation. Please see...
http://www.couchbase.com/issues/browse/MB-7241 |
| Comment by Aleksey Kondratenko [ 28/Nov/12 ] |
| Approved for 2.0. Be careful to use the right branch. |
| Comment by Steve Yen [ 29/Nov/12 ] |
| i think this fix was merged? -- http://review.couchbase.org/#/c/22895/ |