[MB-8377] CBBackup - Need to ignore deleted items which are unnecessary in backup Created: 30/May/13  Updated: 17/Sep/13  Resolved: 17/Sep/13

Status: Closed
Project: Couchbase Server
Component/s: tools
Affects Version/s: 2.0, 2.0.1, 2.1.0
Fix Version/s: 2.2.0
Security Level: Public

Type: Bug Priority: Critical
Reporter: Anil Kumar Assignee: Shashank Gupta
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates to
relates to MB-9075 cbtransfer shows incorrect count(afte... Closed

 Description   
http://www.couchbase.com/issues/browse/MB-7149

When items are deleted, the tool currently captures only the snapshot of 'active items' and does not account for items that have been deleted. As a result, it transfers not only the current active items but also the deleted items, which is unnecessary. Fixing this requires changes on the EP-Engine side to provide stats on deleted items so that the tool can ignore them. Given the release timeframe, this will not make it into 2.1.0, but we will have documentation explaining this to users.

[Bala]:
There are 2 issues here. One has been addressed in 2.1 (considering DGM in the backup % calculation). The other (ignoring tombstones in backup) has not yet been addressed. I see Anil's comment that it will not make it into 2.1, but should the ticket still be kept open to avoid losing visibility?
Also, while ignoring tombstones in backup may be involved, how involved is it to just include the # of tombstones in the % calculation? At least we would not confuse customers if we included that count in the denominator, so that the completion displayed is never more than 100% (or at most marginally above 100%).
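
To illustrate the denominator suggestion above, here is a minimal sketch of the percentage calculation with and without tombstones counted. It is not cbbackup's actual code: the function name `progress_percent`, its parameters, and the item counts are hypothetical and chosen only for illustration.

```python
def progress_percent(transferred, active_items, tombstones=0):
    """Return completion as a percentage of the messages expected.

    Hypothetical helper: `tombstones` is the number of deleted items
    that will still be sent during the transfer.
    """
    expected = active_items + tombstones
    if expected <= 0:
        return 0.0
    return 100.0 * transferred / expected


# Hypothetical counts: 1000 active items plus 250 tombstones, all transferred.
active = 1000
tombstones = 250
sent = active + tombstones

# Denominator ignores tombstones: the display overshoots 100%.
print("%.1f%%" % progress_percent(sent, active))              # 125.0%

# Denominator includes tombstones: the display ends at 100%.
print("%.1f%%" % progress_percent(sent, active, tombstones))  # 100.0%
```

With the tombstone count in the denominator, the displayed value stays at or below 100% as long as that count is accurate.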

 Comments   
Comment by Perry Krug [ 31/May/13 ]
Definitely agree with the last sentence here... it's better to show that too many items need to be transferred than to go over 100%, where the user would have no idea or expectation of when it will stop.
Comment by Bin Cui [ 05/Aug/13 ]
http://review.couchbase.org/#/c/27920/
Comment by Shashank Gupta [ 11/Sep/13 ]
I tried the following scenario:

1. Loaded 199982 items in a bucket.
2. Deleted some items. Items remaining : 163757

3. Took backup :

a) With build 2.1.1 : output :

# ./cbbackup http://10.3.3.66:8091 /tmp/backup/ -b default -u Administrator -p password

[########################] 122.1% (199982/163757 msgs)
bucket: default, msgs transferred...
       : total | last | per sec
 batch : 2712 | 2712 | 83.7
 byte : 204864653 | 204864653 | 6320891.2
 msg : 199982 | 199982 | 6170.2



b) With build 2.2.0-821 : output :

# ./cbbackup http://10.3.3.66:8091 /tmp/backup/ -b default -u Administrator -p password

 [#############################] 144.2% (236207/163757 msgs)
bucket: default, msgs transferred...
       : total | last | per sec
 batch : 2729 | 2729 | 88.4
 byte : 204864653 | 204864653 | 6634171.2
 msg : 236207 | 236207 | 7649.1


So the only difference I found is that with 2.1.1, cbbackup ran up to 122.1% and then terminated successfully, whereas with 2.2.0, cbbackup ran up to 144.2% and then terminated successfully. So the user still cannot predict how far the process will run.
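
The percentages in the two runs can be checked with a short arithmetic sketch. It uses only the counts reported in this comment; the variable names are illustrative, and the deleted count assumes every deletion removed one of the loaded items. It shows what the numbers are, not why the extra messages were sent.

```python
loaded = 199982      # items loaded in step 1
remaining = 163757   # items remaining after the deletes in step 2
deleted = loaded - remaining  # assumption: each delete removed one loaded item

sent_211 = 199982    # msgs transferred with build 2.1.1
sent_220 = 236207    # msgs transferred with build 2.2.0-821


def percent(transferred, total):
    """Percentage as cbbackup displays it, rounded to one decimal place."""
    return round(100.0 * transferred / total, 1)


print(deleted)                         # 36225
print(percent(sent_211, remaining))    # 122.1
print(percent(sent_220, remaining))    # 144.2

# 2.1.1 sent exactly the remaining items plus the deleted count;
# 2.2.0-821 sent that same deleted count once more on top of it.
print(sent_211 - remaining == deleted)  # True
print(sent_220 - sent_211 == deleted)   # True
```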
Comment by Bin Cui [ 16/Sep/13 ]
That's the bug you filed as MB-9075. A fix has been pushed for review. It should be part of the 2.2.1 hot fix release.
Comment by Maria McDuff [ 17/Sep/13 ]
closing as dupe.
Comment by Maria McDuff [ 17/Sep/13 ]
MB-9075.
Generated at Sun Apr 20 15:26:32 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.