Frequent TimeoutException from Python 3.x SDK

Joey_Wang · April 10, 2021, 3:17am

Hi there,

We got very frequent TimeoutException on Flask application, according to the log below, it just threw timeout but actually from the cb.get sent till it fails, it didn’t exceed the KVtimeout threshold yet.

2021-04-10 02:39:52,589 - app.modules.db_helper - INFO - key to fetch in get() of CouchbaseCluster [64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema]
2021-04-10 02:39:52,589 - app.modules.db_helper - ERROR - try again to get 64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema due to exception
2021-04-10 02:39:52,668 - app.modules.db_helper - ERROR - <Key=‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, RC=0xC9[LCB_ERR_TIMEOUT (201)], Operational Error, Results=1, C Source=(src/multiresult.c,316), Context={‘status_code’: 4, ‘opaque’: 6, ‘cas’: 0, ‘key’: ‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, ‘bucket’: ‘workflows’, ‘collection’: ‘’, ‘scope’: ‘’, ‘context’: ‘’, ‘ref’: ‘’, ‘endpoint’: ‘10.244..:11210’, ‘type’: ‘KVErrorContext’}, Tracing Output={“64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema”: {“s”: “kv:Unknown”, “i”: 14219769572458419854, “b”: “workflows”, “r”: “10.244..:11210”, “t”: 30000000, “debug_info”: {“FILE”: “src/callbacks.c”, “FUNC”: “value_callback”, “LINE”: 852}}}>
Traceback (most recent call last):
File “/var/myapp/dbaas/app/modules/db_helper.py”, line 418, in get
rv = self.collection.get(key, quiet=True)
File “/var/myapp/dbaas/venv/lib/python3.7/site-packages/couchbase/collection.py”, line 258, in wrapped
return func(self, args, **kwargs)
File “/var/myapp/dbaas/venv/lib/python3.7/site-packages/couchbase/result.py”, line 471, in wrapped
x, options = func(args, **kwargs)
File “/var/myapp/dbaas/venv/lib/python3.7/site-packages/couchbase/collection.py”, line 479, in get
return self._get_generic(key, kwargs, options)
File “/var/myapp/dbaas/venv/lib/python3.7/site-packages/couchbase/collection.py”, line 443, in _get_generic
x = CoreClient.get(self.bucket, key, **opts)
File “/var/myapp/dbaas/venv/lib/python3.7/site-packages/couchbase_core/client.py”, line 406, in get
return super(Client, self).get(args,**kwargs)
couchbase.exceptions.TimeoutException: <Key=‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, RC=0xC9[LCB_ERR_TIMEOUT (201)], Operational Error, Results=1, C Source=(src/multiresult.c,316), Context={‘status_code’: 4, ‘opaque’: 6, ‘cas’: 0, ‘key’: ‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, ‘bucket’: ‘workflows’, ‘collection’: ‘’, ‘scope’: ‘’, ‘context’: ‘’, ‘ref’: ‘’, ‘endpoint’: '10.244..:11210’, ‘type’: ‘KVErrorContext’}, Tracing Output={“64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema”: {“s”: “kv:Unknown”, “i”: 14219769572458419854, “b”: “workflows”, “r”: "10.244..*:11210", “t”: 30000000, “debug_info”: {“FILE”: “src/callbacks.c”, “FUNC”: “value_callback”, “LINE”: 852}}}>
2021-04-10 02:39:52,732 - app.modules.db_helper - INFO - Item [64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema] is found
2021-04-10 02:39:52,733 - app.modules.db_helper - INFO - value to return in get(64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema) of CouchbaseCluster is not empty

and here is the code for connections:

           logger.info(
                "Attempting to connect Couchbase: {}, on bucket: {}".format(self._db_hosts, self._bucket_name))

            if current_app.config['DB_PASS_ENCRYPTED']:
                self._my_password = DBFernetHelper().decrypt(db_pass).decode('utf-8')
            else:
                self._my_password = db_pass

            self._cluster = Cluster.connect(
                'couchbase://{}?query_timeout=75.0&operation_timeout=30.0&views_timeout=75.0'.format(
                    self._db_hosts),
                ClusterOptions(
                    PasswordAuthenticator(self._username, self._my_password),
                    lockmode=LOCKMODE_WAIT,
                    timeout_options=ClusterTimeoutOptions(query_timeout=timedelta(seconds=75),
                                                          kv_timeout=timedelta(seconds=30),
                                                          views_timeout=timedelta(seconds=75),
                                                          # config_total_timeout=timedelta(seconds=75)
                                                          )
                ), lockmode=LOCKMODE_WAIT
            )
            # self._bucket = self._cluster.bucket(self._bucket_name)
            self._bucket = self._cluster._cluster.open_bucket(self._bucket_name,
                                                              lockmode=LOCKMODE_WAIT)  # work around to set lockmode
            self._collection = self._bucket.default_collection()

Does anyone else experience the similar issue on Python SDK on Flask application? It’s really frustrating to see timeout but cannot find the root cause.

Joey_Wang · April 10, 2021, 3:29am

this is the get method:

def get(self, key):
    logger.info("key to fetch in get() of CouchbaseCluster [%s] " % key)
    val = None
    try:
        rv = self.collection.get(key, quiet=True)
        if rv.success:
            val = rv.content
            logger.info("Item [%s] is found" % key)
        else:
            logger.info("Item [%s] does not exist" % key)
    except CouchbaseInternalException as e:
        try:
            rv = self.collection.get(key, quiet=True, replica=True)
            if rv.success:
                val = rv.content
                logger.info("Item [%s] is found in replica" % key)
            else:
                logger.info("Item [%s] does not exist in replica" % key)
        except (CouchbaseException, Exception) as e:
            logger.exception(e)
            raise
    except (TimeoutException, CouchbaseException, Exception) as e:
        logger.error("try again to get {} due to exception".format(key))
        logger.exception(e)
        try:
            rv = self.collection.get(key, quiet=True)
            if rv.success:
                val = rv.content
                logger.info("Item [%s] is found" % key)
            else:
                logger.info("Item [%s] does not exist" % key)
        except (CouchbaseException, Exception) as e:
            logger.error("2nd get {} failed again due to exception".format(key))
            logger.exception(e)
            raise
    finally:
        if val:
            logger.info("value to return in get({}) of CouchbaseCluster is not empty".format(key))
        else:
            logger.info("value to return in get({}) of CouchbaseCluster is empty".format(key))

        return val

Joey_Wang · April 12, 2021, 4:08pm

Joey_Wang:

couchbase.exceptions.TimeoutException: <Key=‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, RC=0xC9[LCB_ERR_TIMEOUT (201)], Operational Error, Results=1, C Source=(src/multiresult.c,316), Context={‘status_code’: 4, ‘opaque’: 6, ‘cas’: 0, ‘key’: ‘64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema’, ‘bucket’: ‘workflows’, ‘collection’: ‘’, ‘scope’: ‘’, ‘context’: ‘’, ‘ref’: ‘’, ‘endpoint’: '10.244. . :11210’, ‘type’: ‘KVErrorContext’}, Tracing Output={“64c023fd-e4c4-4136-85cd-f4c6cdc15ded_schema”: {“s”: “kv:Unknown”, “i”: 14219769572458419854, “b”: “workflows”, “r”: "10.244. .*:11210", “t”: 30000000, “debug_info”: {“FILE”: “src/callbacks.c”, “FUNC”: “value_callback”, “LINE”: 852}}}>

although the exception message says the t(timeout) is 30000000, I don’t think the cmd actually took 30s long, that means the kvtimeout setting doesn’t take effective.

AV25242 · April 12, 2021, 7:16pm

@jcasey / @davidkelly can you help ?

Joey_Wang · April 13, 2021, 6:50am

I think I figure out what’s the problem.

My connection pool was not actually singleton, after implementing a thread-safe singleton class to ensure there’s only one couchbase connection instance, the timeout values seems working.

Joey_Wang · April 16, 2021, 6:40pm

It seems still happens after applying the singleton on connection instance, but timeout setting seems working.

Joey_Wang · April 21, 2021, 10:57pm

is there any way to test current cluster connection and if it’s closed, and do a reconnect?
cannot find any doc for Couchbase reconnect…

Topic		Replies	Views
Python client sdk timeout error Python SDK	1	1764	July 15, 2021
Couchbase Python SDK - How to enforce a custom couchbase connection timeout + Retry connections? Python SDK connections , timeout	1	4347	May 8, 2017
Random, unexplained timeout on get()? Python SDK	13	5775	March 31, 2017
Client-Side Timeout exceeded Python SDK	20	2882	October 15, 2018
TimeoutException in CouchbaseBucket.get. Please help ！ Java SDK	4	1635	September 21, 2016

Frequent TimeoutException from Python 3.x SDK

Related topics