Lifetimes of Cluster and Buckets

Couchbase’s documentation suggests that applications should keep Cluster and Bucket instances as singletons.

I am currently redesigning an ASP.NET Web API application (.NET Framework 4.6.1) in which every HTTP request has to read data from our Couchbase database. The application must be able to serve many HTTP requests in parallel.

I used to keep our Cluster and Bucket instances as singletons, but the application experienced an abnormally high lock contention rate (~200-300 contentions/sec with ~30 parallel HTTP requests) on the singletons under moderate-to-high load.

What is the recommended approach to manage lifetimes of Cluster and Bucket instances in this situation?

Hi @rex_psa

We still advocate singleton lifetimes for the Cluster and Bucket objects. Bucket lifetimes are managed automatically by the cluster, so you don’t need to manage them directly.
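For reference, a minimal sketch of that pattern using the 2.x SDK’s ClusterHelper, which initialises the Cluster once at application start and caches bucket instances for you. The server URI below is illustrative:

```csharp
// Global.asax.cs: bootstrap the Cluster once for the whole application.
using System;
using System.Collections.Generic;
using Couchbase;
using Couchbase.Configuration.Client;

public class WebApiApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        ClusterHelper.Initialize(new ClientConfiguration
        {
            // Illustrative endpoint; replace with your cluster nodes.
            Servers = new List<Uri> { new Uri("http://localhost:8091/") }
        });
    }

    protected void Application_End()
    {
        // Disposes the Cluster and every bucket it opened.
        ClusterHelper.Close();
    }
}
```

ClusterHelper.GetBucket("bucket-name") can then be called from any request handler; repeated calls with the same name return the same cached IBucket.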

Can you explain more about “high lock rate”?

If you’re building an ASP.NET web application, it is better to use the async/await actions, e.g. GetAsync, UpsertAsync, QueryAsync, etc. The async operations do not block the current thread; instead they release the thread back to the host process for reuse until a response is received.
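As an illustration (a hypothetical controller; the bucket name and document type are assumptions), an async Web API action that awaits the SDK rather than blocking the request thread:

```csharp
using System.Threading.Tasks;
using System.Web.Http;
using Couchbase;

public class DocumentsController : ApiController
{
    // GET api/documents/{id}
    public async Task<IHttpActionResult> Get(string id)
    {
        // Assumes ClusterHelper was initialised at application start.
        var bucket = ClusterHelper.GetBucket("default");

        // The thread is returned to the pool while the network
        // round-trip is in flight.
        var result = await bucket.GetAsync<dynamic>(id);

        return result.Success
            ? (IHttpActionResult)Ok(result.Value)
            : NotFound();
    }
}
```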

Also, can you confirm what SDK version and cluster version you’re using, and which services (Key/Value, Views, N1QL)? There may be some additional steps you could take to improve concurrency.

Thanks

Can you explain more about “high lock rate”?

My lock contention rate was obtained through the Visual Studio load-testing framework and from Windows Performance Monitor (.NET CLR LocksAndThreads → Contention Rate / sec).

If you’re building an ASP.NET web application, it is better to use the async/await actions, e.g. GetAsync, UpsertAsync, QueryAsync, etc. The async operations do not block the current thread; instead they release the thread back to the host process for reuse until a response is received.

Let me clarify: I am not talking about the execution of the Couchbase SDK methods, but about the object lifetimes of the Cluster and (indirectly) Bucket instances. Since there is only one Cluster instance for the entire lifecycle of the ASP.NET application, the construction of every application service class that depends on Cluster, whether or not those services are themselves singletons, must compete for that single instance, which is what causes the lock contention.

Also, can you confirm what SDK version and cluster version you’re using, and which services (Key/Value, Views, N1QL)? There may be some additional steps you could take to improve concurrency.

  • .NET SDK version: 2.4.5
  • Cluster version: 4.6.2-3905-enterprise
  • Services in use: Key/Value, Index, N1QL

Thanks for looking into my issue.

Internally we do not take any locks; contention comes from executing requests in parallel against limited resources: TCP connections for Key/Value and HTTP connections for Views/N1QL. Requests are queued on a best-effort basis, but if a resource does not become available before the operation timeout (which depends on the request type and is tunable), the operation fails.

Maintaining the cluster as a singleton is efficient because bootstrapping involves a number of configuration steps, and repeating them per HTTP request would add noticeable slow-down. If you’re seeing timeouts because of the number of concurrent requests, you can increase the available resources, for example the Key/Value TCP connection pool size.

What application behaviour are you trying to address? It would help to know:

  • whether you’re seeing operation timeouts
  • which areas your perf monitoring shows as ‘hot’
  • which services are affected

Also, I suggested using the async versions of the SDK requests because they are built on the .NET async/await framework and are based around Tasks. Tasks are better at resource management because a task releases its thread back to the host process to do other work while waiting on an external resource, such as network I/O.

It’s also worth noting that, as you’re on SDK client 2.4.5, you’re probably limited to a single multiplexing connection if using the standard IOService configuration. In 2.4.7 we introduced a pool of connections that is tunable to increase throughput.

The most recent version of the SDK is 2.5.0; I would recommend upgrading to that if possible. The connection pool size is managed with the PoolConfiguration.MaxSize property, which defaults to 2.

Programmatically:

var config = new ClientConfiguration
{
    PoolConfiguration = new PoolConfiguration
    {
        MaxSize = 5 // default is 2
    }
};

Configuration:

<couchbase>
  <servers>
    <add uri="http://localhost:8091/" />
  </servers>
  <buckets>
    <add name="default">
      <connectionPool name="default" maxSize="5"/>
    </add>
  </buckets>
</couchbase>

@rex_psa -

In addition to what @MikeGoldsmith wrote, could you post an example of your usage? I may be wrong, but I suspect you may be explicitly synchronizing on the bucket object in your application. Note that IBucket and ICluster implementations are thread-safe and are in fact designed to work under heavy contention.

-Jeff

@jmorris, sorry, I’m afraid I can’t share my usage as it is my company’s proprietary code. The gist of it is that we created custom wrappers of DocumentBucket and Cluster to perform automatic retries (in case of connection failure, for example) and automatic version management of Documents.
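Not your code, of course, but for comparison: a retry wrapper can stay lock-free if it is written against the async API, yielding between attempts rather than sleeping or locking. A hedged sketch (the method name, backoff policy, and retry-on-any-failure condition are all illustrative, not the SDK’s or the poster’s implementation):

```csharp
using System;
using System.Threading.Tasks;
using Couchbase;
using Couchbase.Core;

public static class BucketRetryExtensions
{
    // Retries a KV get on failure with exponential backoff.
    // No locks are taken; Task.Delay frees the thread between attempts.
    public static async Task<IOperationResult<T>> GetWithRetryAsync<T>(
        this IBucket bucket, string key, int maxAttempts = 3)
    {
        IOperationResult<T> result = null;
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            result = await bucket.GetAsync<T>(key);
            if (result.Success)
                return result;

            // Illustrative backoff: 50 ms, 100 ms, 200 ms, ...
            await Task.Delay(TimeSpan.FromMilliseconds(50 << attempt));
        }
        return result; // last failed result
    }
}
```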

@MikeGoldsmith I have just performed the upgrade from 2.4.5 to 2.5.0, but the lock contention is still the same. The PoolConfiguration.MaxSize property is set to 10 and was not changed during the upgrade.

Can you please let me know if there are other settings I can try to reduce lock contention?

@rex_psa do you perform any lock() behaviour in your custom DocumentBucket and Cluster objects?

@MikeGoldsmith we do not use any locks in the custom Bucket/Cluster wrappers.