Hi there,
I’m experiencing performance problem with using TONUMBER()
type-conversion function. It might be that I’m doing something wrong and maybe someone can point me to the right direction.
Here is a very typical task, I store user sessions and each session document has userid
attribute in it, an example would be:
{
"session": "jk34h51jk3h4r5kj1hrkjqwhefkjqwhefkljqh34jrh134kljhrkj3",
"userid": "123456789",
"ip": "127.0.0.1",
"created": "2016-05-07 20:26:12",
"last_updated": "2016-05-14 20:54:02",
"expires": "2016-05-14 20:26:12"
}
To be able to query user sessions I created the only index in the default
bucket:
CREATE INDEX `user-session-idx` ON `default`(userid, meta().id) USING GSI;
Now I’m able to run such queries:
select meta().id from default where userid="123456789";
With 50k documents in the database these queries run in about 6ms, all cool.
But then I noticed that if there are session documents in the database where userid
attribute is a JSON number, not string, then the query would not return those documents. This was quite a catch for me, coming from MySQL, where it doesn’t matter if you query as:
select * from sessions where userid=123456789;
OR
select * from sessions where userid="123456789";
Anyway, to be on the safe side I decided to use TONUMBER()
function. And the immediate problem I got was “No primary index on keyspace default” error:
cbq> select meta().id from default where TONUMBER(`userid`)=123456789;
{
"requestID": "ea0393c6-603c-409b-9343-85518f1e0797",
"errors": [
{
"code": 4000,
"msg": "No primary index on keyspace default. Use CREATE PRIMARY INDEX to create one."
}
],
"status": "fatal",
"metrics": {
"elapsedTime": "5.015269ms",
"executionTime": "4.907528ms",
"resultCount": 0,
"resultSize": 0,
"errorCount": 1
}
}
And this is my first question:
- Why do I have to create primary index to be able to call
TONUMBER()
function?
After creating primary index I tried to run the same query again.
select meta().id from default where TONUMBER(`userid`)=123456789;
The query found all documents but took over 4 sec comparing to 6-7 ms when searching without type conversion. The difference is really huge!
Clearly, something is not right here.
Why am I experiencing this problem and how would you recommend me to filter by attributes which could be JSON strings and numbers at the same time?
System details:
- OS: Debian wheezy
- Couchbase server: couchbase-server-community-4.0.0-4051-1