Couchbase N1QL service fails randomly on setup (single cluster, local machine)

I am writing a script for my Couchbase deployment that I have to test. Writing the tests requires destroying and recreating a local cluster a few times.

At some point the script runs a few N1QL queries (using the Go SDK) to manage indexes. However, the test fails at this step with a “service not available” error. This is especially problematic because it happens seemingly at random, in roughly half of the test runs.

I have confirmed the error comes from the following instruction:

res, err := cluster.Query("SELECT * FROM system:indexes", &gocb.QueryOptions{Timeout: 60 * time.Second})

This is, by the way, the only place where I use the Query method in the whole script.

I noticed that the error rarely occurs on the first run, and that it became less frequent once I added a long cooldown after destroying the cluster (going above a 60-second cooldown did not give significantly better results, and the test still fails quite often even then).

Is there a way to ensure the Couchbase Server app is completely shut down and can be safely restarted? Or any other way to prevent this failure from happening?

My environment:

  • macOS Big Sur 11.1
  • Couchbase Server Community Edition 6.6.0-7909 (app, not Docker)
  • Go SDK 2.2 with Go 1.15
  • Single Cluster

This is, by the way, how I create my cluster:

1. I run the following command once the app is started:

sh -c "couchbase-cli cluster-init -c 127.0.0.1 --cluster-username Administrator --cluster-password password --cluster-ramsize 1024 --cluster-fts-ramsize 256 --cluster-index-ramsize 256 --services data,query,index,fts" 
2. I set up my buckets with the Go SDK (the handler loops over a custom config and essentially makes the following call):

cluster.Buckets().CreateBucket(gocb.CreateBucketSettings{
	BucketSettings: gocb.BucketSettings{
		Name:           bucketName,
		FlushEnabled:   bucketData.Flush,
		RAMQuotaMB:     bucketData.RamSize,
		BucketType:     gocb.BucketType(bucketData.Type),
		EvictionPolicy: gocb.EvictionPolicyType(bucketData.EvictionPolicy),
	},
}, nil) // nil: default CreateBucketOptions

The failing query runs shortly after those steps. I have also confirmed that everything works fine up to that point (the cluster is up, with the correct credentials/resources/buckets, and accessible through the web UI).

I found the solution: pass options to the cluster.WaitUntilReady call so that it explicitly waits for the query, management, and search services to be online.

cluster.WaitUntilReady(
	time.Duration(c.Parameters.Timeout)*time.Second,
	&gocb.WaitUntilReadyOptions{
		DesiredState: gocb.ClusterStateOnline,
		// ServiceTypes is a slice: list every service that must be up.
		ServiceTypes: []gocb.ServiceType{
			gocb.ServiceTypeQuery,
			gocb.ServiceTypeManagement,
			gocb.ServiceTypeSearch,
		},
	},
)

Works.