Python SDK 0.8.0 unusable because of poor performance - takes about 2.5 seconds to add or get a key - is this correct?
Hi,
I must be doing something wrong. I downloaded, installed the latest couchbase server and configured in the browser. I'm getting unusable performances.
ADD and GET calls via the python module takes 2.65 seconds per call.
ADD and GET calls via REST takes about 0.18 seconds per call.
I have attached the code and performance report. Please let me know if this is "normal"
environment
Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) Couchbase Version 2.0.0-1723-rel Couchbase Python Module 0.8.0 OSX 10.7.4 (macbook air)
couchbase config
OS: i386-apple-darwin11.4.0 Version: 2.0.0 community edition (build-1723-rel) Couchbase Quota (2.58 GB)Total (2.92 GB) In Use (59.5 MB) Single couchbase instance on my laptop
test_add_cb.py
from couchbase.client import Couchbase def insert_doc(): couchbase = Couchbase("localhost:8091", "account", "") bucket = couchbase["account"] key = "test_key" val = "test_val" bucket.set(key, 0, 0, val) insert_doc()
$ time python test_add_cb.py real 0m2.961s user 0m0.422s sys 0m0.201s
test_get_cb.py
from couchbase.client import Couchbase def get_doc(): couchbase = Couchbase("localhost:8091", "account", "") bucket = couchbase["account"] bucket.get("test_key") get_doc()
$ time python test_get_cb.py real 0m2.760s user 0m0.432s sys 0m0.202s
test_add_rest.py
import httplib2 import json def insert_doc(): http = httplib2.Http() http.add_credentials("Administrator", "password") doc_name = "mydoc" url = 'http://127.0.0.1:8092/account/%s' % doc_name doc = {"doc_foo": "foo", "doc_bar": "bar"} body = json.dumps(doc, sort_keys=True, indent=4) response, content = http.request( url, 'PUT', body, headers={'Content-type': 'application/json'} ) insert_doc()
$ time python test_add_rest.py real 0m0.133s user 0m0.068s sys 0m0.052s
Code on github: https://gist.github.com/3746989
Hi,
Thank you for your response. Could you send the code used in your test. I just want to make sure I'm not doing anything wrong because my code runs way slower then that.
Thanks!
A
Any time you new up an instance of a Couchbase client, there will be overhead. So when you call
couchbase = Couchbase("localhost:8091", "account", "")
before each test, it's going to take a second or two to create the client. So when you time the script, you're timing the creation of the client. To see the time of sets, try something like:
from couchbase.client import Couchbase
from datetime import datetime
couchbase = Couchbase("localhost:8091", "default", "")
def insert_doc():
bucket = couchbase["default"]
key = "test_key"
val = "test_val"
a = datetime.now()
bucket.set(key, 0, 0, val)
b = datetime.now()
c = b - a
return c
for i in range(100):
time = insert_doc()
print timeMy results are:
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.004000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.004000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.004000
0:00:00.003000
0:00:00.004000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.004000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.004000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.003000
0:00:00.002000
0:00:00.003000
0:00:00.003000
0:00:00.003000
yep, I just figured this out. Thanks for sending your test example.
I profiled the couchbase code and found out about the client lag/issue
import cProfile import pstats import cStringIO from couchbase.client import Couchbase import logging logger = logging.getLogger('cbtest') logger.setLevel(logging.INFO) ch = logging.StreamHandler() ch.setLevel(logging.INFO) formatter = logging.Formatter( '%(asctime)s \ - %(name)s \ - %(levelname)s \ - %(message)s' ) ch.setFormatter(formatter) logger.addHandler(ch) def profile_app(function_name): pr = cProfile.Profile() pr = pr.run("%s()" % function_name) stream = cStringIO.StringIO() ps = pstats.Stats(pr, stream=stream) ps.sort_stats("cumulative") ps.print_stats("couchbase") log = stream.getvalue() stream.close() logger.info("Profile data:\n%s", log) def profile_insert_doc(): couchbase = Couchbase( "localhost:8091", "default", "" ) bucket = couchbase["default"] key = "test_key" val = "test_val" bucket.set(key, 0, 0, val) def main(): profile_app("profile_insert_doc") if __name__ == '__main__': main()
And the profiling results on my laptop
2012-09-28 14:58:57,308 - cbtest - INFO - Profile data:
91054 function calls (90937 primitive calls) in 2.477 seconds
Ordered by: cumulative time
List reduced from 500 to 50 due to restriction <'couchbase'>
ncalls tottime percall cumtime percall filename:lineno(function)
2 0.185 0.093 2.280 1.140 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:54(wait)
1 0.000 0.000 2.276 2.276 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:204(__del__)
1 0.000 0.000 2.276 2.276 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:493(done)
1 0.000 0.000 2.276 2.276 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:62(shutdown)
1 0.000 0.000 0.200 0.200 couchbase-db.py:35(profile_insert_doc)
1 0.000 0.000 0.147 0.147 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:140(__getitem__)
1 0.000 0.000 0.147 0.147 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:96(bucket)
1 0.000 0.000 0.147 0.147 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:191(__init__)
1 0.000 0.000 0.103 0.103 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:295(__init__)
5 0.001 0.000 0.082 0.016 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:689(get_bucket)
1 0.000 0.000 0.049 0.049 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:40(__init__)
5 0.000 0.000 0.046 0.009 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:282(_http_request)
2 0.000 0.000 0.042 0.021 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:160(__init__)
5 0.000 0.000 0.035 0.007 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:1043(parse_get_bucket_response)
1 0.001 0.001 0.035 0.035 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:400(init_vbucket_connections)
1024 0.005 0.000 0.034 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:408(start_vbucket_connection)
15007 0.034 0.000 0.034 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:30(__iadd__)
1 0.001 0.001 0.032 0.032 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:445(reconfig_vbucket_map)
2 0.000 0.000 0.031 0.016 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:703(get_vbuckets)
5 0.016 0.003 0.030 0.006 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:1047(parse_get_bucket_json)
1 0.000 0.000 0.025 0.025 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:146(_rest)
1 0.000 0.000 0.016 0.016 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:623(direct_client)
1 0.000 0.000 0.015 0.015 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:89(vbucket_map_ready)
15008 0.010 0.000 0.010 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:38(check)
5120 0.008 0.000 0.008 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:978(__init__)
1 0.000 0.000 0.004 0.004 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:219(set)
1 0.000 0.000 0.004 0.004 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:551(set)
1 0.000 0.000 0.004 0.004 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:503(_respond)
1 0.000 0.000 0.001 0.001 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:39(__init__)
3 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/logger.py:21(logger)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:259(sasl_auth_plain)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:255(sasl_auth_start)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:112(_doCmd)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:41(__init__)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:107(_handleSingleResponse)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:99(_handleKeyedResponse)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:72(_recvMsg)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/client.py:156(_rest_info)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:49(close)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:55(_sendCmd)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/memcachedclient.py:59(_sendMsg)
5 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:930(__init__)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/couchbaseclient.py:55(put)
5 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:915(__init__)
5 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:967(__init__)
5 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:275(_create_headers)
2 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:24(__init__)
2 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:45(__init__)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/rest_client.py:36(__init__)
1 0.000 0.000 0.000 0.000 /Users/alex/Envs/batch/lib/python2.7/site-packages/couchbase/event.py:51(is_set)the following code demonstrate what I'm seeing:
from couchbase import Couchbase from datetime import datetime couchbase = Couchbase("localhost:8091", "Administrator", "password") def insert_doc(): bucket = couchbase["default"] key = "test_key" val = "test_val" a = datetime.now() bucket.set(key, 0, 0, val) b = datetime.now() c = b - a return c for i in range(10): a = datetime.now() time = insert_doc() b = datetime.now() c = b - a print time print "--------------" print c
0:00:00.007368 0:00:02.502617 -------------- 0:00:00.004223 0:00:02.556080 -------------- 0:00:00.003832 0:00:02.649199 -------------- 0:00:00.006645 0:00:02.450640 -------------- 0:00:00.005339 0:00:02.413144 -------------- 0:00:00.004135 0:00:02.490352 -------------- 0:00:00.006837 0:00:02.670817 -------------- 0:00:00.004362 0:00:02.554126 -------------- 0:00:00.004003 0:00:02.609413 -------------- 0:00:00.001727 0:00:02.424202
So it does take over two seconds per "call". Maybe not the actual add call but the process of adding a key takes over 2.5 seconds. Am I missing anything here?
The "problem" is not the actual client connection overhead but the bucket. I re read the documentation and there is a detail that I missed:
"The construction of a Couchbase instance does require some overhead as the cluster topology is discovered, so as a best practice the client should not be created more than once per bucket, per application."
http://www.couchbase.com/develop/python/next
from couchbase import Couchbase from datetime import datetime couchbase = Couchbase("localhost:8091", "Administrator", "password") bucket = couchbase["default"] def insert_doc(bucket): key = "test_key" val = "test_val" a = datetime.now() bucket.set(key, 0, 0, val) b = datetime.now() c = b - a return c for i in range(10): a = datetime.now() time = insert_doc(bucket) b = datetime.now() c = b - a print time print c
Your timing will include python 'startup' time and the connection time. The actual get/put element will be faster.
However you are quite correct; we also find that the python lib v 0.8.0 is significantly slower than 0.7.2.
One obvious change within the library seems to be that it uses the 'requests' library rather than 'httplib' directly.
I did a single-threaded test of 50,000 sets and gets which took ~80seconds with 0.8.0 lib, compared to ~20seconds using 0.7.2 lib