Write failures during live cbbackup

heezy · October 7, 2014, 11:27pm

We are testing our application with Couchbase and currently have a 5-node cluster on Amazon AWS holding ~200GB of data in total. We ran a backup using cbbackup (which failed at 97%, a separate issue), but more importantly we had some critical write failures during the ~2 hour backup.

The documentation for cbbackup says you can back up a live cluster, but obviously write failures during a backup are not ideal. The machine we ran the backup on is not in the cluster; for performance reasons it is a remote machine that connected to the cluster over HTTP. We did notice that the TAP queue graph spiked dramatically during this time period, which makes sense because replication is apparently TAP-intensive (from the Couchbase blog on backup).

I’m looking into the logs for more information related to the failures, but I’m wondering: has anyone else had similar issues backing up what I’m assuming is a large Couchbase instance?

asingh · October 13, 2014, 7:31am

Hi,

Write failures typically point to some issue with the underlying I/O subsystem. I would check /var/log/messages around the time the failure happened. Are these instances running in a virtual environment? We have seen cases where VMware marked disks as read-only because of I/O controller saturation caused by the backup script.

Also, could you confirm that the filesystem you’re backing up to isn’t NFS? SQLite files are known to have issues with NFS implementations; see the SQLite NFS FAQ.

–
Abhishek
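
One way to answer the NFS question on a Linux backup host is to look up which mount the backup directory lives on. The following is only a minimal sketch, assuming Linux and the usual `/proc/mounts` layout (device, mount point, filesystem type, ...). The function names and the `/backups/couchbase` path are hypothetical and not part of any Couchbase tooling.

```python
import posixpath

# Filesystem types treated as NFS here (an assumption; extend as needed).
NFS_TYPES = {"nfs", "nfs4"}


def fs_type_for_path(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point that
    contains `path`, or None if no mount entry matches."""
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # /proc/mounts escapes spaces in mount points as \040.
        mount_point = fields[1].replace("\\040", " ")
        fs_type = fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_on_nfs(path, mounts_text):
    """True if the mount containing `path` has an NFS filesystem type."""
    match = fs_type_for_path(path, mounts_text)
    return match is not None and match[1] in NFS_TYPES


if __name__ == "__main__":
    backup_dir = "/backups/couchbase"  # hypothetical backup target
    with open("/proc/mounts") as f:
        mounts = f.read()
    print(backup_dir, "is on NFS:", is_on_nfs(backup_dir, mounts))
```

The lookup takes the longest matching mount point, so a directory under an NFS mount nested inside a local filesystem is still reported as NFS. It does not resolve symlinks, so pass the real path of the backup directory.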
