Corrupted files preventing me from doing anything

Poofcakes · March 12, 2018, 10:49am

Hi,

I had two virtual cloud machines with Couchbase installed on them, with the data shared between the two machines. They had to be shut down for maintenance. The data was saved, and new machines were started using that data.

I logged into one of them, and I tried “/etc/init.d/couchbase-server start” to get Couchbase running again. It gave me this error:

cp: cannot stat ‘/opt/couchbase/var/lib/couchbase/ip’: No such file or directory

So then I tried “sudo apt install couchbase-server”. It recognized that I already have an installation (see 3rd screenshot) and tried to update Couchbase to the latest version, but it errored out because it could not find ‘stats.json’ and ‘stats.json.old’. These files apparently got corrupted.

Same story for the other VM, except that ‘stats.json.old’ got corrupted for a different bucket, and ‘stats.json’ is fine on that one.

So yeah, I've got 3 corrupted files: 2 on one machine and 1 on the other. I also tried just removing everything, but these files prevent even that.

Does anyone have any idea what I can do?

Couple examples

avsej · March 12, 2018, 11:01am

@Poofcakes have you tried running fsck on that filesystem while it is unmounted on both virtual machines?

Poofcakes · March 12, 2018, 11:10am

I’m not entirely sure how to do that. This was all set up by my supervisor. The Couchbase data with the corrupted files is saved on the mounted drive though, so if I unmount it, then it wouldn’t scan the corrupted files, right?

avsej · March 12, 2018, 11:18am

You have to unmount the filesystem before running fsck. Otherwise it will not check and fix the errors. What I would do is stop all machines, run only one node that can use that filesystem, log into that machine, unmount the filesystem with the corrupted files, and run fsck.

Another option might be to stop everything, mount the filesystem on the host machine, and run fsck there.
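
Roughly, the first approach could look like the sketch below. The device `/dev/sdb1` and the mount point `/mnt/couchbase-data` are placeholders I made up, so substitute your own. The `run` helper and the `DRY_RUN` switch are only there for illustration: by default nothing is executed and the commands are just printed, so you can review them first.

```shell
#!/bin/sh
# Hypothetical values: replace with the real device and mount point.
DEVICE="${DEVICE:-/dev/sdb1}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/couchbase-data}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_filesystem() {
    # Stop Couchbase so nothing keeps files open on the drive.
    run /etc/init.d/couchbase-server stop || return 1
    # fsck must not run on a mounted filesystem.
    run umount "$MOUNT_POINT" || return 1

    # fsck exit codes: 0 = clean, 1 = errors corrected,
    # 4 and above = errors left uncorrected or the check itself failed.
    status=0
    run fsck "$DEVICE" || status=$?
    if [ "$status" -ge 4 ]; then
        echo "fsck exited with $status; not remounting" >&2
        return "$status"
    fi

    run mount "$DEVICE" "$MOUNT_POINT" || return 1
    run /etc/init.d/couchbase-server start
}
```

Calling `check_filesystem` with the default `DRY_RUN=1` prints the five steps in order. Setting `DRY_RUN=0` would actually run them, which needs root and should only be done on the one node that still has the drive attached.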

Poofcakes · March 12, 2018, 8:37pm

Thank you. We managed to clear the broken files by unmounting the drive and running the xfs_repair command on the device.
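
For anyone else who lands here: as far as I understand it, plain fsck does nothing useful on XFS, which is presumably why xfs_repair was the tool that worked for us. Below is a small sketch of picking the checker based on the filesystem type. The device `/dev/sdb1` is a placeholder and the function names are made up for this example. It only prints a suggested command and does not run any repair.

```shell
#!/bin/sh
# Hypothetical device name; replace with the device that holds the Couchbase data.
DEVICE="${DEVICE:-/dev/sdb1}"

# Map a filesystem type to the check/repair command that handles it.
repair_command_for() {
    case "$1" in
        xfs)            echo "xfs_repair" ;;
        ext2|ext3|ext4) echo "e2fsck -f" ;;
        *)              echo "fsck" ;;
    esac
}

# Print the suggested command for DEVICE; the device must be unmounted
# before the command is actually run. blkid usually needs root to probe.
suggest_repair() {
    fstype="$(blkid -o value -s TYPE "$DEVICE")" || return 1
    echo "$(repair_command_for "$fstype") $DEVICE"
}
```

Running `xfs_repair -n` on the device first is a cautious option, since `-n` only reports problems without modifying anything.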

Now we have issues with getting the buckets back as they were. The cluster is gone, the nodes are gone, the buckets are gone. The data still exists, but Couchbase doesn’t recognize it, even when a new cluster is created with its data location pointing straight at the existing data. Is it possible to salvage this, or are we doomed to remake the buckets and re-import everything?

EDIT:
Made a new thread for this issue:

EDIT2: Managed to resolve it by just remaking the buckets. This time it didn’t overwrite and remove all the data, but instead the buckets started slowly filling with the data again.
