During programmatic database creation, the bucket reaches a bugged state in which new scopes and collections are unreachable

Hello!

During testing, we programmatically create the bucket/scope/collection structure and the indexes on Couchbase.

We use Storage objects to initialize the structure and indexes in an ad hoc way:

  • Create the bucket if it does not exist
  • Create the scope if it does not exist
  • Create the collection if it does not exist
  • Create the indexes for the collection
    (Each stage handles concurrent situations and retries appropriately. E.g., when a create fails with an "already exists" error, we retry by retrieving the resource instead of attempting to create it again.)
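
The create-or-retrieve fallback described above can be sketched roughly as follows. This is only an illustration using the Scala standard library, not the actual Storage code: `AlreadyExists`, `getOrCreate`, and the `create`/`fetch` parameters are hypothetical stand-ins for the SDK calls and for whatever "already exists" error the SDK reports.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical stand-in for the "already exists" error reported by the SDK.
final case class AlreadyExists(name: String)
    extends RuntimeException(s"$name already exists")

object GetOrCreate {
  // Try to create the resource. If a concurrent storage created it first
  // (the returned Future fails with AlreadyExists), retrieve it instead of
  // attempting the create again. Any other failure is passed through.
  def getOrCreate[A](create: () => Future[A], fetch: () => Future[A])(
      implicit ec: ExecutionContext): Future[A] =
    create().recoverWith { case _: AlreadyExists => fetch() }
}
```

The same helper would be applied at each stage (bucket, scope, collection), with `create` and `fetch` wrapping the corresponding async calls.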

This method worked fine before 7.2.0, but now we encounter the following problem:

  1. The first storage creates its bucket/scope/collection and its indexes.
  2. The following storages, which share the same bucket/scope, also work fine.
  3. The first storage that uses a different bucket creates its bucket/scope/collection and its indexes.
  4. The next storage that shares the same bucket/scope as the first one CAN create a new collection, but index creation fails (error code 12003: “Keyspace not found in CB datastore: default:<bucketName>.<scopeName>.<collectionName>”).

At this point, the bucket is in a permanent bugged state:

  • Index creation fails on collections created after step 3 in the bugged bucket.
  • Index creation works on collections created before step 3 in the bugged bucket.
  • On the web UI under the “Buckets” page, I CAN find scopes/collections created after step 3.
  • On the web UI under the “Query” page, under “Explore Your Data”, I CANNOT find scopes/collections created after step 3.
  • Any index creation on collections created after step 3 fails from the “Query” page as well, with the same error.
  • I was unable to recreate this scenario manually (e.g., by clicking through the Web UI), so there must be some race condition involved in reaching this bugged state.
  • If I create a new bucket during step 2 in the Web UI, I can trigger the same bug immediately.

Technical details:

  • We use the latest Scala SDK.
  • The bucket/scope/collection/index creation uses the async Scala API.
  • The bug appeared after upgrading to 7.2.0 build 5325, on both the “Community” and “Enterprise” editions.
  • The Couchbase instance runs on Kubernetes in a Docker instance.
  • Creating a fresh Docker instance with flushed persistent storage did not fix it.

Thanks @horvath-martin, we will need to look into this. Can you share the code you’re using to automate this via Scala? And have you tried restarting the node after it reaches this state?

@perry

After restarting the node, the bucket works correctly again:

  • The missing scopes/collections appear under “Explore Your Data”.
  • The index creation query works again on these collections.

Here is the Scala code used:

Bucket creation:

Compact operation: (Why is this here? It was added a couple of years ago. We used, and still use, 100-200 indexes over 120+ collections, and at the time index creation did not handle these numbers well. Running the compact operation after bucket creation seemed to improve index creation performance.)

Scope creation:

Await for creation:

Index creation:

Watching indexes:

Await for indexes and watches:

Thank you @horvath-martin, glad to hear that a restart fixes the issue…though still definitely something we need to look into.

My apologies for adding more work for you, but would it be possible to send over that code in text format so that we could test it on our end rather than retyping everything from the images?

I’d also suggest that you open a ticket in issues.couchbase.com and share the code there so that we can track it as a bug.


Creation and propagation of scopes, collections and indexes is not synchronous. So when a createXyz() returns, that does not mean that the scope/collection/index will be accessible immediately. Also - if the scope/collection/index is accessible from one node, it is not necessarily accessible from another node.

1. The delayed creation is not a problem, since the index creation has a retry mechanism for exactly this reason.
2. The bug was found on a single-node setup, so delayed propagation cannot be the cause.
3. The bugged bucket is stuck in this state permanently (at least until the node is restarted).
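
For reference, a retry mechanism of the kind mentioned in point 1 might look like the sketch below. It is not the code from our project: `Retry`, `retry`, `attempts`, and `delay` are hypothetical names, and `op` stands in for the async index-creation call. It uses only the Scala and Java standard libraries.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object Retry {
  // Single daemon thread used only to schedule the pauses between attempts.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(
    new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "retry-scheduler")
        t.setDaemon(true)
        t
      }
    })

  // Completes after `delay` without blocking a thread.
  private def after(delay: FiniteDuration): Future[Unit] = {
    val p = Promise[Unit]()
    scheduler.schedule(
      new Runnable { def run(): Unit = p.success(()) },
      delay.toMillis,
      TimeUnit.MILLISECONDS)
    p.future
  }

  // Runs `op`; on failure waits `delay` and tries again, for at most
  // `attempts` tries in total. The last failure is returned unchanged.
  def retry[A](attempts: Int, delay: FiniteDuration)(op: () => Future[A])(
      implicit ec: ExecutionContext): Future[A] =
    op().recoverWith {
      case _ if attempts > 1 =>
        after(delay).flatMap(_ => retry(attempts - 1, delay)(op))
    }
}
```

A bounded retry like this covers the normal propagation delay, but in the stuck state described above every attempt fails, so the 12003 error eventually surfaces once the attempts are exhausted.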


“open a ticket at issues.couchbase.com”

I had a similar issue in testing, and it did indeed involve a race condition: [link not preserved]. That bug points to [link not preserved], which says they used a workaround of sleeping for 10 seconds between creating a collection and creating an index on that collection. And MB-46643 is not marked as fixed in 7.x. It’s marked “unresolved”, with a fix version set to a future version(?).

My “solution” was to create and delete a dummy bucket/scope/collection (or maybe just a bucket(?)) between creating the first bucket/scope/collection and attempting to create the index. I suspect a delay would have worked just as well.

“2. The bug was found on a single-node setup, so delayed propagation cannot be the cause.”

I didn’t describe that very well. Even in a “single node” configuration, the “data” and “indexer” services are separate processes. The data process needs to propagate the creation of the bucket/scope/collection to the indexer process.

@mreiche @perry

I have opened an issue linking this forum post and provided a Scala test file with which I was able to reproduce the bug. ([link not preserved])


Thanks very much for filing that ticket, I’ve shared it with our engineering team. Just to set the right expectations, we will be investigating this with “best effort” and can’t at this time make any guarantee of when it might be fixed. Hopefully @mreiche’s workarounds will help unblock you but please let me know if there is anything else that we can help you with.
