Large Documents + Slowness

I'm observing some slowness in my query and I think it has to do with the way N1QL works. I have my CBQ instance running locally and am connecting to a remote cluster. If I select a doc which is small in size, the response is immediate. A large doc takes several seconds, even if I do a small subselection.

Am I correct in assuming that if you use select object.subsection.thing, the CBQ engine will get the full object from the cluster and then do the subselection in the N1QL engine itself? My thinking is that the slowness comes from the engine pulling back the full document and then parsing it for me in CBQ-Engine.

What would you recommend in terms of deploying CBQ-Engine for best performance? On one node of the cluster? Can we load balance in some way?

2 Answers

Yes, you are correct about how N1QL works in the developer previews. It fetches the whole document. Changing this will be a significant undertaking because it impacts subsystems outside of the query engine (e.g. the data engine). We're aware of the issue and can provide more specific info when ready.

The query engine will live in cluster nodes (and can right now), but because we're a distributed system, the query engine and data node can always be on separate nodes for a particular document fetch.
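
To make the fetch-then-project behaviour described above concrete, here is a minimal sketch in plain Python. It is not Couchbase or CBQ code: the in-memory STORE, the "large"/"small" keys, the padding field, and every function name are hypothetical stand-ins, and it looks documents up by key rather than by a WHERE clause. It only illustrates why the bytes crossing the network track the document size and not the size of the selected field.

```python
import json


def make_doc(objectid, clientid, padding_bytes):
    # Hypothetical document shape loosely modelled on the queries in this thread;
    # "payload" is filler used only to reach the desired document size.
    return {
        "entity": {
            "objectid": objectid,
            "clientid": clientid,
            "payload": "x" * padding_bytes,
        }
    }


# Hypothetical in-memory stand-in for the remote data node (not a Couchbase API).
STORE = {
    "large": make_doc(552910680, 552901576, 2_200_000),
    "small": make_doc(552394365, 552901576, 30_000),
}


def fetch_full_document(key):
    # Simulate the data node shipping the entire serialized document.
    return json.dumps(STORE[key]).encode("utf-8")


def project(wire, path):
    # Parse the whole document, then walk the dotted path on the query-engine side.
    value = json.loads(wire)
    for part in path.split("."):
        value = value[part]
    return value


def run_query(key, path):
    # Return the projected value, the bytes fetched, and the bytes in the result.
    wire = fetch_full_document(key)
    result = project(wire, path)
    result_bytes = len(json.dumps(result).encode("utf-8"))
    return result, len(wire), result_bytes


if __name__ == "__main__":
    for key in ("large", "small"):
        result, fetched, returned = run_query(key, "entity.clientid")
        print(key, "result:", result, "fetched bytes:", fetched, "result bytes:", returned)
```

In this model both lookups return the same tiny value, but the large one moves a little over 2.2 MB to get it, which would be consistent with a network spike on the machine running the query engine.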

Further to my last post, here are the queries.

Large (2.2 MB document)
select buc.entity.clientid from buc where buc.entity.objectid = 552910680
Creates a huge network spike on the local machine where CBQ is running, even for a small response.

Small (30 KB document)
select buc.entity.clientid from buc where buc.entity.objectid = 552394365
Creates no discernible network spike.

The field being selected is one small number, and the query yields this response in both instances:

{
  "resultset": [
    {
      "clientid": 552901576
    }
  ]
}

Hence I guess all the JSON is returned from Couch to CBQ for processing. If this is the case, I'm wondering whether there are plans for N1QL to live 'in' the cluster. It would make sense from an IO perspective.