N1ql array index performance

Hi,

I’m using CB 4.5.1-2844 Enterprise Edition and trying to compare 2 approaches to query my data below:

Case:
We have groups and users, which a group have multiple users and a user will belong to multiple groups.
Will need to list by 2 ways, (1) by group, (2) by user


Approach 1:
Store the information inside the group object, like:
{ “groupId”: 1,
“users”: [
{“userId”:11},
{“userId”:12}
]}

using array index, it took in average 90ms to query by “userId”


Approach 2:
Store information in a “row-by-row” way, like:
{ "groupId: 1, “userId”: 11}
{ "groupId: 1, “userId”: 12}

using 2 indexes, 1 for groupId and 1 for userId, it took me in average 45ms


For document databases, I believe approach 1 is better in data structure, but test result shows that approach 2 has better performance. Are there anyway to improve the performance? Please let me know if I did anything wrong.

Thanks.

If you only want to query performance for list by 2 ways, (1) by group, (2) by user,you can store group list inside the user object,and store user list inside the group object,such as
group object:

{ "id": 1,
"users": [
{"userId":11},
{"userId":12}
]}

user object:

{ "id": 11,
"groups": [
{"groupId":1}
]}

and

{ "id": 12,
"groups": [
{"groupId":1}
]}

all query only need one hit by keyScan,and you only need create Primary index.

but I think you should also need balance the cost of update the document (also with updating index)/query document.

thx, this is what we are doing currently without applying n1ql

The problem here is whenever there’s any disaster (e.g. overloaded, reboot), if not all the updates are executed successfully, there will be an issue that the result is different when list by group and user. That’s why we are switching to use n1ql.

I think N1QL can not solve this problem.
and FYI

as long as I keep the data in 1 single object, then no transaction will be needed, it can just be a success or fail update

if updating two documents, then your application will need to make sure either both docs are updated, or none.

yes

can you post the query. Are you using array indexing?

-Prasad

Yes, I’m using array indexing:

CREATE INDEX group_user ON dev ( DISTINCT ARRAY x.userId FOR x IN d.users END ) where t = “group”;

SELECT meta(dev).id from dev WHERE t = “group” AND ANY x IN d.users SATISFIES x.userId = 11 END;

the exact data structure is:
{
“t”: “group”,
“d”: {
“users”: [
{“userId”: 11},
{“userId”: 12}
]
}
}

Hi @littlewitchanita,

What is wrong with your original approach 2? It looks good to me, especially if you are getting desired performance. It seems to address the transaction issue.

For performance, approach 2 seems to be better, it’s just violating the best practice of nosql, so I’m wondering if there’ll be a better solution

Your approach #2 is completely consistent with the best practices of NoSQL. No issue there.

NoSQL gives you flexibility. It does not mean you must always do the opposite of relational.