DCP upgrade failing after upgrading Couchbase cluster from 2.5.1 to 3.0.3

Hi,

I had a 5-node cluster with XDCR replication to a remote node. All 6 hosts were running Couchbase Enterprise 2.5.1. We started moving to 3.0.3. On our staging environment all went well, so we moved on.

First I upgraded the remote standalone host, which is the target for XDCR replication. Then I started on the main cluster, upgrading hosts one by one in three steps:

  1. remove node from cluster and rebalance
  2. update couchbase server to 3.0.3
  3. add server with new version to couchbase cluster and rebalance.

Right after I removed the last 2.5.1 node from the cluster and the rebalance ended, I got an error saying: “DCP upgrade failed”. I added the upgraded node to the cluster and rebalanced again, but after each successful rebalance I receive this error.

The cluster is working right now, but the web UI shows this message: “Fail Over Warning: Rebalance recommended, some data does not have the desired replicas configuration!”

After each rebalance finishes successfully, I get this:

DCP upgrade exited with reason {{badmatch,
                                 {error,
                                  {failed_nodes,['ns_1@cb2.xxx.com']}}},
                                [{dcp_upgrade,handle_call,3,
                                  [{file,"src/dcp_upgrade.erl"},{line,65}]},
                                 {gen_server,handle_msg,5,
                                  [{file,"gen_server.erl"},{line,585}]},
                                 {proc_lib,init_p_do_apply,3,
                                  [{file,"proc_lib.erl"},{line,239}]}]}

The node reported there is random; it changes from run to run.
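
Since the failing node differs between runs, it may help to tally which nodes show up across all of these messages. Below is a minimal sketch, assuming the messages have been copied out of the log into one string; the function names are hypothetical and the regular expressions only cover the `{failed_nodes,[...]}` shape shown in the error above.

```python
import re
from collections import Counter

# Matches the {failed_nodes,[...]} tuple inside the Erlang error term.
FAILED_NODES_RE = re.compile(r"\{failed_nodes,\s*\[([^\]]*)\]\}")
# Node names are single-quoted Erlang atoms such as 'ns_1@cb2.xxx.com'.
NODE_RE = re.compile(r"'([^']+)'")


def failed_nodes(log_text):
    """Return every node named in a failed_nodes tuple, in order of appearance."""
    nodes = []
    for match in FAILED_NODES_RE.finditer(log_text):
        nodes.extend(NODE_RE.findall(match.group(1)))
    return nodes


def tally_failed_nodes(log_text):
    """Count how often each node appears across all failed_nodes tuples."""
    return Counter(failed_nodes(log_text))


if __name__ == "__main__":
    sample = """DCP upgrade exited with reason {{badmatch,
                                 {error,
                                  {failed_nodes,['ns_1@cb2.xxx.com']}}},"""
    print(tally_failed_nodes(sample))
```

If one node dominates the tally, that node's own logs are probably the place to look first; an even spread would point at something cluster-wide instead.
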
Right before this error, info.log shows a timeout:

    [ns_server:info,2015-06-19T11:39:38.383,ns_1@cb2.xxx.com:<0.32698.1266>:janitor_agent:process_apply_config_rv:211]userretailers:Some janitor state change requests (apply_new_config_replicas_phase) have failed:
[{'ns_1@cb2.xxx.com',
     {'EXIT',
         {timeout,
             {gen_server,call,
                 [{'janitor_agent-userretailers','ns_1@cb2.xxx.com'},
                  {apply_new_config_replicas_phase,
                      [{deltaRecoveryMap,undefined},
                       {num_threads,5},
                       {flush_enabled,false},
                       {purge_interval,undefined},
                       {autocompaction,false},
                       {sasl_password,"a"},
                       {auth_type,sasl},
                       {ram_quota,2621440000},
                       {num_replicas,1},
                       {uuid,<<"42aa61917324606d4577e81cbabb27c2">>},
                       {replica_index,false},
                       {type,membase},
                       {num_vbuckets,1024},
                       {servers,
                           ['ns_1@cb0.xxx.com','ns_1@cb1.xxx.com',
                            'ns_1@cb2.xxx.com','ns_1@cb6.xxx.com',
                            'ns_1@cb7.xxx.com']},
                       {map,
                           [['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                            ['ns_1@cb2.xxx.com'],
                                ...........
                            ['ns_1@cb2.xxx.com'],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            ['ns_1@cb6.xxx.com',
                             'ns_1@cb2.xxx.com'],
                                .........
                           ['ns_1@cb1.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            ['ns_1@cb1.xxx.com',
                             'ns_1@cb2.xxx.com'],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            [undefined],
                            ['ns_1@cb1.xxx.com',
                             'ns_1@cb2.xxx.com']]},
                       {map_opts_hash,133465355},
                       {fastForwardMap,undefined},
                       {repl_type,
                           {dcp,
                               [375,376,377,378,385,450,451,452,453,454,455,
                                456,457,458,459,476,477,480,481,482,483,484,
                                485,486,525,526,527,528,529,530,531,532,533,
                                534,535,536,537,538,539,540,541,542,543,544,
                                545,546,560,561,562,563,564,565,566,567,568,
                                569,570,571,575,576,577,578,579,580,581,582,
                                583,584,585,586,587,588,590,591,592,593,594,
                                595,596,597,598,599,600,601,602,603,604,605,
                                617,618,619,620,621,622,623,624,625,626,627,
                                628,629,630,631,632,633,634,635,636,637,638,
                                639,640,641,642,643,644,645,646,647,648,649,
                                650,651,652,653,654,655,660,661,662,663,664,
                                674,675,684,685,689,690,691,692,707,708,724,
                                725,726,727,728,729,730,731,732,733,734,735,
                                736,737,738,739,740,745,746,747,748,749,750,
                                751,752,753,754,755,757,758,759,760,761,774,
                                775,776,777,778,779,780,781,782,783,784,785,
                                786,787,788,789,790,791,792,793,794,795,796,
                                797,798,799,800,801,802,803,804,805,806,808,
                                809,810,811,812,813,814,815,816,817,818,819,
                                820,839,840,841,842,843,844,845,846,847,856,
                                857,858,859,860,861,862,863,864,865,866,867,
                                868,869,884,885,886,897,898,899,900,901,902,
                                903,904,905,906,907,908,909,910,911,912,913,
                                914,915,916,921,922,923,928,929,930,931,932,
                                933,934,935,936,937,938,939,940,941,942,943,
                                944,945,946,947,948,949,950,951,952,953,954,
                                955,956,957,958,959,960,961,962,963,964,965,
                                966,967,968,969,970,971,972,973,974,975,976,
                                977,978,979,980,981,982,983,984,985,986,987,
                                988,989,990,991,992,993,994,995,996,997,998,
                                999,1000,1001,1002,1003,1004,1005,1006,1007,
                                1008,1009,1010,1011,1012,1013,1014,1015,1016,
                                1017,1018,1019,1020,1021,1022,1023]}}],
                      []},
                  30000]}}}}]
[]
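
The map in that log entry is long, so counting the chain shapes by eye is error-prone. The following is a rough sketch that pulls the chains out of pasted log text and counts how many contain `undefined` and how many have one or two entries. It assumes the map sits between `{map,` and `{map_opts_hash` as in the entry above; the function names are hypothetical, and whether these shapes are related to the replica warning in the UI is only a guess on my part.

```python
import re
from collections import Counter

# Innermost bracketed lists, i.e. one chain per vBucket.
CHAIN_RE = re.compile(r"\[([^\[\]]+)\]")


def extract_chains(log_text):
    """Return the chains found between '{map,' and '{map_opts_hash' as lists of names."""
    start = log_text.find("{map,")
    end = log_text.find("{map_opts_hash", start)
    if start == -1 or end == -1:
        return []
    section = log_text[start:end]
    chains = []
    for match in CHAIN_RE.finditer(section):
        # Split on commas and drop the single quotes around Erlang atoms.
        entries = [e.strip().strip("'") for e in match.group(1).split(",")]
        chains.append([e for e in entries if e])
    return chains


def summarize(chains):
    """Count chains by shape: containing 'undefined', or by number of entries."""
    summary = Counter()
    for chain in chains:
        if "undefined" in chain:
            summary["has_undefined"] += 1
        else:
            summary["length_%d" % len(chain)] += 1
    return summary
```

Lines of dots that mark elided parts of the map are skipped because they sit outside any brackets, so the counts only describe the chains that were actually pasted.
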

I couldn’t find any advice on this. Does anybody know if there is anything I can tweak/change/restart to get rid of this DCP issue?

thank you,

Marek

Hi Marek, thanks for reporting this issue. Based on what you described, the rebalance operation does complete but reports errors in the user log and log files, correct?

Hi Anil,

Thank you for the reply. The situation has now changed. I dug deeper into the logs and found a few errors about “Weird buckets”. They came from a bucket called proximic2, which we haven’t used for a while. I deleted this bucket and the DCP upgrade then succeeded, but disk queues for all buckets started to grow and I cannot rebalance. I get this message:

Rebalance exited with reason {buckets_shutdown_wait_failed,
[{'ns_1@cb6.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb2.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb1.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb0.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}}]}
	ns_orchestrator002 	ns_1@cb2.xxx.com 	20:42:59 - Fri Jun 19, 2015
Failed to wait deletion of some buckets on some nodes: [{'ns_1@cb6.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb2.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb1.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1@cb0.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}}]

But the bucket proximic2 doesn’t exist any more. Do you have any advice? Can you help? I can’t rebalance.
When I tried to create this bucket again, I was told that it already exists.
If it helps, Couchbase runs on Debian 7.8.
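
One way to check whether proximic2 is still listed anywhere is to ask each node for its bucket list over the REST API (`GET /pools/default/buckets` on port 8091). This is a minimal sketch, assuming HTTP basic auth with the admin credentials; the helper names are hypothetical. It only shows the configuration view, so a bucket that is still shutting down in the background may be missing from this list and still block a rebalance.

```python
import base64
import json
import urllib.request


def bucket_names(buckets_json):
    """Extract bucket names from the JSON body of GET /pools/default/buckets."""
    return [bucket["name"] for bucket in json.loads(buckets_json)]


def fetch_bucket_names(host, user, password, port=8091):
    """Ask one node for its bucket list over the REST API (HTTP basic auth)."""
    url = "http://%s:%d/pools/default/buckets" % (host, port)
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request, timeout=10) as response:
        return bucket_names(response.read().decode("utf-8"))


def nodes_still_listing(bucket, names_by_node):
    """Return the nodes whose bucket list still contains the given bucket."""
    return sorted(node for node, names in names_by_node.items() if bucket in names)
```

Usage would be to call `fetch_bucket_names` once per host (cb0, cb1, cb2, cb6, cb7), collect the results in a dict keyed by host, and pass that dict to `nodes_still_listing("proximic2", ...)`.
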

thanks

Marek