Bad performance with SDK

Hi guys

I’m working on a Java application which uses Couchbase and I’m having some issues that I wanted to share with you. Basically, I note that there is a huge difference in the execution/response time of queries being executed from the workbench vs using the SDK.

Just to make sure the issue is independent from how we designed our own bucket, I installed the sample ‘beer-sample’ thus we are all in the same page.

I have created the following index (which I designed to be a COVERING INDEX) on ‘brewery’ documents:

CREATE INDEX beers ONbeer-sample(type, address, city, code, name) WHERE type = 'brewery';

This is the query I use:

SELECT type, address, city, code, name
FROM `beer-sample` USE INDEX (beers)
WHERE type = 'brewery' and city = 'Albany';

The EXPLAIN output is:
[
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexScan”,
“covers”: [
“cover((meta(beer-sample).id))”,
“cover((beer-sample.type))”,
“cover((beer-sample.address))”,
“cover((beer-sample.city))”,
“cover((beer-sample.code))”,
“cover((beer-sample.name))”
],
“index”: “beers”,
“keyspace”: “beer-sample”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“successor(“brewery”)”
],
“Inclusion”: 1,
“Low”: [
"“brewery”"
]
}
}
],
“using”: “gsi”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “((cover((beer-sample.type)) = “brewery”) and (cover((beer-sample.city)) = “Albany”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “cover((beer-sample.type))”
},
{
“expr”: “cover((beer-sample.address))”
},
{
“expr”: “cover((beer-sample.city))”
},
{
“expr”: “cover((beer-sample.code))”
},
{
“expr”: “cover((beer-sample.name))”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
}
]

Executed from the workbench, this query takes around 20 ms but it doubles the execution time when executed from the SDK. I’ve been playing around with both asynchronous and synchronous meethods:

		System.setProperty("com.couchbase.queryEnabled", "true");
		// Through the builder
		Cluster cluster = CouchbaseCluster.create(DefaultCouchbaseEnvironment
				.create());
		Bucket bucket = cluster.openBucket("beer-sample");
		for (int x = 0; x < 10; x++) {

			List<Object> params = new ArrayList<Object>();
			StringBuilder statement = new StringBuilder();
			statement.append("SELECT type, address, city, code, name ");
			statement.append("FROM %s ");
			params.add("`beer-sample`");

			statement.append("USE INDEX (%s) ");
			params.add("beers");
			statement.append("WHERE ");
			statement.append(" type = '%s' ");
			statement.append(" AND city = '%s' ");
			params.add("brewery");
			params.add("Albany");

			String query = String.format(statement.toString(),
					(Object[]) params.toArray(new Object[] {}));
			SimpleN1qlQuery n1ql = N1qlQuery.simple(query);

			final long ta = System.currentTimeMillis();
			Observer<? super AsyncN1qlQueryResult> obs = new Observer() {

				@Override
				public void onCompleted() {
					System.out.println(String.format(
							" - - - - - BEERS QUERY ASYNC %sms",
							System.currentTimeMillis() - ta));
				}

				@Override
				public void onError(Throwable arg0) {
					// TODO Auto-generated method stub

				}

				@Override
				public void onNext(Object arg0) {
					// TODO Auto-generated method stub

				}
			};
			bucket.async().query(n1ql).subscribe(obs);
			long t = System.currentTimeMillis();
			bucket.query(n1ql);
			System.out.println(String.format(" - - - - - BEERS QUERY SYNC %sms",
					System.currentTimeMillis() - t));
			System.out.println(String.format(" "));

			Thread.sleep(60000);
		}

This is the output:

 - - - - - BEERS QUERY ASYNC 65ms
 - - - - - BEERS QUERY SYNC 69ms
 
 - - - - - BEERS QUERY ASYNC 26ms
 - - - - - BEERS QUERY SYNC 43ms
 
 - - - - - BEERS QUERY ASYNC 28ms
 - - - - - BEERS QUERY SYNC 55ms
 
 - - - - - BEERS QUERY ASYNC 52ms
 - - - - - BEERS QUERY SYNC 78ms
 
 - - - - - BEERS QUERY ASYNC 21ms
 - - - - - BEERS QUERY SYNC 40ms
 
 - - - - - BEERS QUERY ASYNC 21ms
 - - - - - BEERS QUERY SYNC 39ms
 
 - - - - - BEERS QUERY ASYNC 26ms
 - - - - - BEERS QUERY SYNC 45ms
 
 - - - - - BEERS QUERY ASYNC 25ms
 - - - - - BEERS QUERY SYNC 53ms
 
 - - - - - BEERS QUERY ASYNC 22ms
 - - - - - BEERS QUERY SYNC 42ms
 
 - - - - - BEERS QUERY ASYNC 25ms
 - - - - - BEERS QUERY SYNC 44ms

Based on the process’ output, I assume that the async-like method works better but I can’t understand why is slower as the sync method the first time it gets executed.

I’d like you to review this and let me know if there is something I’m doing wrong or if you have other suggestions.

Couchbase version: 4.1.1-5914 Enterprise Edition (build-5914)
SDK version: 2.2.3

The first few iterations may have a higher latency because JIT and JVM are warming up. The other additional latency comes up due to the client parsing the query response as row.

In this case, the ideal latency you would expect to see is around 20ms. Try increasing the query endpoints to 2 as you are doing both async and sync query. Also, using a countdown latch is a better approach than thread.sleep.

@fbonecco the sync code just wraps the async one, but it waits until the full payload is received while the async one streams you the data as it becomes available.

Also note that there is one thing which is probably a misunderstanding: on your async call the, subscribe() is non-blocking so the next line will be called immediately. If you are mixing sync and async code you need to be very careful with proper synchronization.

subhashni and @daschl many thanks for your reply.
I have recently created a new ticket at http://support.couchbase.com/hc/en-us/requests/14803 that has some relationship with this guy. Would you mind taking a look?

FYI @brian_williams has been working with on some other issues.

Thanks.

Franco

@fbonecco lets follow up in the ticket with support. I quickly checked the code in there and I think you need to set the queryEndpoints in the environment to the number of callables you have (10) because with the default setting you are only having 1 socket open, since you fire 10 in parallel and they take long time they all just get queued up. If you run more nodes in the cluster things should improve as well, but for now set the queryEndpoints to the number of callables.