Deletion of Attachment _sync:att: Binary Blobs

Hello!
We use Couchbase as a temporary document store (every doc has a TTL of less than a day), and for a while I’ve noticed that, despite frequent manual compactions of the bucket, it continues to grow in size. Recently it has gotten to the point where it’s frequently bumping up against the hard-drive space threshold and triggering “low hard-drive space” alerts.

After searching numerous other threads (including the Sync Gateway Issue #1648 that everyone ultimately refers to), I’m stumped. Deleting a document and its “attachments” seems to “work”, but there are a lot of binary SHA-1 items that persist through compactions and, over time, take up the vast majority of the hard-drive space used by Couchbase Server.

The only way I’ve found to delete them is to search for them manually with something like meta().id LIKE '_sync:att:%', then MANUALLY delete them by clicking the icon in the Couchbase Server Console. Given that 200K+ of them have accumulated in our production environment, this seems undoable without some sort of automation tool.
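
(For context, this is roughly the kind of lookup I’ve been running by hand before clicking through the console; the bucket name below is just a placeholder for our actual bucket:)

/* `bucket` below is a placeholder for the real bucket name. */
/* List a small sample of the leftover attachment keys. */
SELECT META().id
FROM `bucket`
WHERE META().id LIKE '_sync:att:%'
LIMIT 10;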

I’ve searched the Sync Gateway Public and Admin REST API documentation, and even those docs say that once the attachment is “removed”, it’ll still exist in the bucket (see PUT /attachment/).

Does anyone have a more elegant solution to this problem that doesn’t require simulating mouse clicks or page-scraping?

Deleting a document won’t free up the storage used by its attachments. Attachment storage is de-duplicated, so there could be other documents using the attachment.

The issue you linked to is about a garbage-collection process to remove attachment data that isn’t being used by any documents. That hasn’t been implemented yet.

For now, I think your best bet is to delete the attachments yourself; as you’ve found, they are identified by keys starting with _sync:att: (the rest of the key is a SHA-1 digest of the attachment’s contents).
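
(If I remember the format right, a full key looks something like _sync:att:sha1-<base64 digest>, i.e. the prefix followed by a base64-encoded SHA-1 of the attachment body; don’t rely on the exact shape, though, since the _sync:att: prefix is the only part you need to match on.)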

Given that 200K+ of them have accumulated in our production environment, this seems undoable without some sort of automation tool.

A simple N1QL query should do the job… something like DELETE FROM bucket WHERE META().id LIKE '_sync:att:%';
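
Spelled out a bit more, here’s a rough sketch (bucket stands in for your actual bucket name, and you’ll likely need an index that can serve the META().id predicate; a primary index is the simplest):

/* `bucket` is a placeholder for your actual bucket name. */

/* Sanity check: how many attachment blobs are sitting in the bucket? */
SELECT COUNT(*) AS att_blobs
FROM `bucket`
WHERE META().id LIKE '_sync:att:%';

/* Then remove them all in one statement. */
DELETE FROM `bucket`
WHERE META().id LIKE '_sync:att:%';

Running the SELECT first lets you confirm the count matches what you see in the console before committing to the bulk delete.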

@jens Thank you! I really appreciate your response!

I don’t know if there’s an issue with my indexes, but even simple queries (like the one you sent) time out after 5 minutes with an error like:

All indexer replica is down or unavailable or unable to process request - cause: queryport.client.noHost

^ That’s the result of running this query:
DELETE FROM bucket WHERE META().id LIKE '_sync:att:%' LIMIT 1;
I only have 275K documents in the database, and even when I had fewer than 200 (not 200 thousand, just 200), it would still time out.

Here’s what my indexes look like:
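
(The same information can also be pulled in text form from the system catalog, when the query service responds at all:)

/* Show each index's name, the keyspace it covers, and its state. */
SELECT name, keyspace_id, state, `using`
FROM system:indexes;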

Sorry, I don’t know much about the server side, and this issue isn’t related to mobile; you may need to re-post it on the N1QL or Couchbase Server board here to get an answer from the server team.