[MB-7337] [system test] node shown as pending for a long time after index path change Created: 03/Dec/12 Updated: 05/Dec/12 Resolved: 05/Dec/12 |
|
| Status: | Closed |
| Project: | Couchbase Server |
| Component/s: | ns_server |
| Affects Version/s: | 2.0 |
| Fix Version/s: | 2.0.1 |
| Security Level: | Public |
| Type: | Bug | Priority: | Critical |
| Reporter: | Thuan Nguyen | Assignee: | Ketaki Gangal |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | system-test | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Windows 2008 R2 64bit | ||
| Attachments: |
|
| Description |
|
Online upgrade of a 5-node cluster from 1.8.1.
The cluster has one default bucket with 20 million items. Data path is set to c:/data.

10.3.2.11
10.3.2.12
10.3.2.16
10.3.2.10
10.3.2.75

to 2.0.0-1971

10.3.2.11 (data path and index path are set to the default path when 2.0.0-1971 is installed)
10.3.2.16
10.3.2.75
10.3.2.76
10.3.2.77

Change the index path on node 11 to a new path (c:/index); Couchbase Server on node 11 restarts.

curl -i -v --data "index_path=c:/index" "http://Administrator:password@10.3.2.11:8091/nodes/self/controller/settings"
* About to connect() to 10.3.2.11 port 8091 (#0)
* Trying 10.3.2.11... Connection refused
* couldn't connect to host
* Closing connection #0
curl: (7) couldn't connect to host

I tried again with a Cygwin-style path:

curl -i -v --data "index_path=/cygdrive/c/index" "http://Administrator:password@10.3.2.11:8091/nodes/self/controller/settings"
* About to connect() to 10.3.2.11 port 8091 (#0)
* Trying 10.3.2.11... connected
* Connected to 10.3.2.11 (10.3.2.11) port 8091 (#0)
* Server auth using Basic with user 'Administrator'
> POST /nodes/self/controller/settings HTTP/1.1
> Authorization: Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==
> User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18
> Host: 10.3.2.11:8091
> Accept: */*
> Content-Length: 28
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 400 Bad Request
HTTP/1.1 400 Bad Request
< Server: Couchbase Server 2.0.0-1971-rel-enterprise
Server: Couchbase Server 2.0.0-1971-rel-enterprise
< Pragma: no-cache
Pragma: no-cache
< Date: Mon, 03 Dec 2012 21:14:06 GMT
Date: Mon, 03 Dec 2012 21:14:06 GMT
< Content-Type: application/json
Content-Type: application/json
< Content-Length: 47
Content-Length: 47
< Cache-Control: no-cache
Cache-Control: no-cache
<
* Connection #0 to host 10.3.2.11 left intact
* Closing connection #0
["An absolute path is required for index_path"]

The log page shows Couchbase Server restarting on node 11:

Couchbase Server has started on web port 8091 on node 'ns_1@10.3.2.11'.
menelaus_sup001 ns_1@10.3.2.11 13:12:38 - Mon Dec 3, 2012

Shutting down bucket "default" on 'ns_1@10.3.2.11' for server shutdown
ns_memcached002 ns_1@10.3.2.11 13:09:28 - Mon Dec 3, 2012

Setting database directory path to c:/Program Files/Couchbase/Server/var/lib/couchbase/data and index directory path to c:/index
ns_storage_conf000 ns_1@10.3.2.11 13:09:28 - Mon Dec 3, 2012

Trying to connect to memcached on node 11 hangs:

thuan@ubu-1604:/opt/couchbase/bin$ ./cbstats 10.3.2.11:11210 raw warmup |
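Because the node restarts after the path change, a test script may want to wait until the REST port accepts connections again before issuing further requests (the first curl attempt above was refused). The following is only a sketch of such a wait loop; the function name `wait_for_port`, the timeout, and the polling interval are assumptions made for illustration and are not part of Couchbase or its test tooling.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the bucket has finished warming up.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal or connect timeout.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the node in this report the call would be `wait_for_port("10.3.2.11", 8091)`. A successful connect says nothing about memcached on port 11210, which is the part that hung here, so the same check could be repeated for that port.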
| Comments |
| Comment by Thuan Nguyen [ 03/Dec/12 ] |
|
Reproduced on Ubuntu 11.04 64-bit with Couchbase Server 2.0.0-1971.
Install Couchbase Server 2.0.0-1971 on node 10.3.2.4 and set the data and index paths to the default path. Create the default bucket. Change the index path from the default path to /data using the curl command:

thuan@ubu-1604:/opt/couchbase/bin$ curl -i -v --data "index_path=/data" "http://Administrator:password@10.3.2.4:8091/nodes/self/controller/settings"
* About to connect() to 10.3.2.4 port 8091 (#0)
* Trying 10.3.2.4... connected
* Connected to 10.3.2.4 (10.3.2.4) port 8091 (#0)
* Server auth using Basic with user 'Administrator'
> POST /nodes/self/controller/settings HTTP/1.1
> Authorization: Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==
> User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18
> Host: 10.3.2.4:8091
> Accept: */*
> Content-Length: 16
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Couchbase Server 2.0.0-1971-rel-enterprise
Server: Couchbase Server 2.0.0-1971-rel-enterprise
< Pragma: no-cache
Pragma: no-cache
< Date: Mon, 03 Dec 2012 23:50:40 GMT
Date: Mon, 03 Dec 2012 23:50:40 GMT
< Content-Length: 0
Content-Length: 0
< Cache-Control: no-cache
Cache-Control: no-cache
<
* Connection #0 to host 10.3.2.4 left intact
* Closing connection #0

Couchbase Server shuts down, as shown in the log below.

Couchbase Server has started on web port 8091 on node 'ns_1@127.0.0.1'.
menelaus_sup001 ns_1@127.0.0.1 15:50:39 - Mon Dec 3, 2012

I'm the only node, so I'm the master.
mb_master000 ns_1@127.0.0.1 15:50:39 - Mon Dec 3, 2012

Shutting down bucket "default" on 'ns_1@127.0.0.1' for server shutdown
ns_memcached002 ns_1@127.0.0.1 15:50:30 - Mon Dec 3, 2012

Setting database directory path to /opt/couchbase/var/lib/couchbase/data and index directory path to /data
ns_storage_conf000 ns_1@127.0.0.1 15:50:30 - Mon Dec 3, 2012 |
| Comment by Farshid Ghods [ 03/Dec/12 ] |
|
This is not a blocker bug because it does not destroy any data.
We can add a note to the documentation that resetting the index path will restart Couchbase Server. |
| Comment by Aliaksey Artamonau [ 03/Dec/12 ] |
| It's not really caused by the index path change. The problem is a regression we introduced that kills the memcached port (not memcached itself) only after a 60-second wait. In some scenarios this could cause data loss. For instance, if someone shuts Couchbase Server down and then reboots the machine, a memcached process may still be alive at the moment of reboot, writing to the databases. |
| Comment by Farshid Ghods [ 04/Dec/12 ] |
|
Aliaksey,
can you confirm the expected behavior (after your fix):
1- Should Couchbase Server itself restart?
2- Should memcached restart?
3- Does this restart mccouch?
4- Are the current index files wiped out or kept as they are?
5- What happens to the ddoc definitions? Do they get copied over from the original to the new location?
6- Does this API change the index path for all nodes in the cluster, or is it per node? |
| Comment by Aliaksey Artamonau [ 04/Dec/12 ] |
|
1-3. Yes, to apply path changes ns_server restarts itself entirely, including memcached and mccouch.
4. Current index files are kept intact.
5. Design document definitions are stored in the master database, which is stored together with the other databases (i.e. in the database directory).
6. The API is per node. |
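Since the API is per node, changing the index path across a cluster means sending the same POST to each node in turn. The sketch below builds (but does not send) the request that the curl commands in this ticket issue, using only the Python standard library. The function names and the idea of looping over a host list are illustrative assumptions; only the endpoint, the `index_path` form field, and basic authentication come from the ticket.

```python
import base64
import urllib.parse
import urllib.request

# Endpoint taken from the curl commands in this ticket.
SETTINGS_PATH = "/nodes/self/controller/settings"


def build_index_path_request(host, index_path, user, password, port=8091):
    """Build the POST that changes index_path on a single node (not sent here)."""
    # Form-encode the body; "/" is percent-encoded, which is still a valid form value.
    body = urllib.parse.urlencode({"index_path": index_path}).encode("ascii")
    request = urllib.request.Request(
        f"http://{host}:{port}{SETTINGS_PATH}", data=body, method="POST"
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


def build_requests_for_nodes(hosts, index_path, user, password):
    """Hypothetical helper: one request per node, because the API is per node."""
    return [build_index_path_request(h, index_path, user, password) for h in hosts]
```

Each request could then be sent with `urllib.request.urlopen(request)`. Per the answers above, every node that receives it restarts ns_server, memcached, and mccouch, so sending to one node at a time and waiting for it to come back is probably the safer order.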
| Comment by Steve Yen [ 04/Dec/12 ] |
| http://review.couchbase.org/#/c/23020/ |
| Comment by Steve Yen [ 04/Dec/12 ] |
| moved to 2.0.1 per bug-scrub. |
| Comment by Andrei Baranouski [ 05/Dec/12 ] |
|
Build 1974, CentOS 5.7.
Observation when changing the index path: Couchbase restarts and the bucket is deleted.

Couchbase Server has started on web port 8091 on node 'ns_1@127.0.0.1'.
menelaus_sup001 ns_1@127.0.0.1 16:11:24 - Wed Dec 5, 2012

I'm the only node, so I'm the master.
mb_master000 ns_1@127.0.0.1 16:11:24 - Wed Dec 5, 2012

Shutting down bucket "default" on 'ns_1@127.0.0.1' for deletion
ns_memcached002 ns_1@127.0.0.1 16:11:16 - Wed Dec 5, 2012

Setting database directory path to /opt/couchbase/var/lib/couchbase/data and index directory path to /tmp
ns_storage_conf000 ns_1@127.0.0.1 16:11:16 - Wed Dec 5, 2012

Bucket "default" loaded on node 'ns_1@127.0.0.1' in 0 seconds.
ns_memcached001 ns_1@127.0.0.1 16:10:01 - Wed Dec 5, 2012 |
| Comment by Aliaksey Artamonau [ 05/Dec/12 ] |
| Bucket should not be deleted when only index path is changed. I cannot reproduce it on my system. Could you please attach logs? |
| Comment by Farshid Ghods [ 05/Dec/12 ] |
|
Ketaki, please reproduce and update logs or pass the cluster to Aliaksey A. |
| Comment by Ketaki Gangal [ 05/Dec/12 ] |
|
Hi Aliaksey,
I can repro this every time in my tests.
- Create a 3-node cluster with 2 buckets.
- Load 10k items.
- Create 1 view.
- Change the index path:
curl -i -v --data "index_path=/data" "http://Administrator:password@10.1.3.176:8091/nodes/self/controller/settings"
The index path change above was made on the master node.
- After the index path change, there is no data/bucket on the cluster.
- ls -a on the nodes shows an empty @indexes file and an empty data dir.

[root@grape-003 couchbase]# cd data/
[root@grape-003 data]# ls
@indexes isasl.pw ns_log _replicator.couch.1 _users.couch.1 |
| Comment by Ketaki Gangal [ 05/Dec/12 ] |
| Adding logs here. |
| Comment by Ketaki Gangal [ 05/Dec/12 ] |
|
Opened another bug to track the behaviour: http://www.couchbase.com/issues/browse/MB-7368
Not seeing the above in current testing. |
| Comment by Aliaksey Artamonau [ 05/Dec/12 ] |
| We found that the issue was that we didn't wait for memcached termination correctly, so ns_server would start memcached again while the previous instance was still shutting down. Probably because it's Windows, no eaddrinuse errors were reported; ns_server was simply unable to connect to memcached. When the old memcached instance finally died, the node returned to a good state. |
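The general pattern described here, waiting until the old process has really exited before starting its replacement, can be sketched as follows. This is a generic Python illustration and does not reproduce the actual ns_server change; `stop_and_wait`, `restart`, and the grace period are hypothetical names and values.

```python
import subprocess
import sys


def stop_and_wait(process, grace_seconds=5.0):
    """Ask a child process to exit and block until it is really gone."""
    if process.poll() is None:
        process.terminate()
        try:
            process.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: force it and wait again.
            process.kill()
            process.wait()
    return process.returncode


def restart(command, old_process, grace_seconds=5.0):
    """Start a replacement only after the old instance has fully exited."""
    stop_and_wait(old_process, grace_seconds)
    return subprocess.Popen(command)


if __name__ == "__main__":
    # Stand-in for a long-running server process.
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    old = subprocess.Popen(cmd)
    new = restart(cmd, old)
    print("old exited:", old.poll() is not None)
    stop_and_wait(new)
```

The point of the design is that the new instance is never spawned while the old one may still hold its listening port or data files, which is the overlap the comment above identifies as the cause of the pending node.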
| Comment by Aliaksey Artamonau [ 05/Dec/12 ] |
| fix merged |