N1QL architecture questions

@cihangirb

Let's assume I have 3 query service nodes (q1, q2, q3), 3 index service nodes (i1, i2, i3), and 3 data service nodes (d1, d2, d3), all on different machines.
I also have a bucket named db, and I want to use the REST API to query data with N1QL (PREPARE/EXECUTE statements).

Question 1: How do I load balance and fail over the Query Service?
A PREPARE/EXECUTE statement can run on any query service node. Does Couchbase Server provide a load balancing feature, or should I implement it myself (for example, with nginx)?
If Couchbase Server provides load balancing, which load balancing algorithm does it use?
If I have to implement it myself, is there an API that reports the current load of each query service node, so that queries can be distributed evenly across the query service nodes?

Question 2: How does the Query Service load balance across the Index Service?
While handling a query, the query service may call the index service and the data service. Which index service node does the query service call if I don't specify the index?
How does the query service distribute work evenly across index service nodes and data service nodes?

Question 3: What is the best practice for CREATE INDEX if I have more than one index service node?
According to the docs (http://developer.couchbase.com/documentation/server/4.1/n1ql/n1ql-language-reference/createprimaryindex.html), if nodes is not specified, one of the nodes running the index service is randomly picked for the index.
Does that mean I should use WITH in the CREATE INDEX statement to make sure every index service node has the index, if I have more than one index service node?

wrt question (1): you’ll have to do the load balancing & failover yourself. You can get the list of active query endpoints from ns_server, which knows & maintains that list. You can build distributed prepare/execute support using the encoded plan returned by the PREPARE statement. This is how the Couchbase SDKs implement it as well. You can look at the Java SDK (2.2.4) implementation, since it’s all open source :smile:
For a simple implementation, randomized or round-robin selection would work. Getting the “current workload” of each node and then distributing query execution based on it is quite involved for a first implementation. Check out the query monitoring feature in the upcoming (soon!) Couchbase Watson developer preview.
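To make the round-robin plus failover idea concrete, here is a minimal Python sketch. It assumes you already have the list of query endpoints (hard-coded below; in practice you would refresh it from ns_server) and that each query node accepts a POST to /query/service on port 8093 with a `statement` form parameter. The names `post_query` and `RoundRobinQueryClient` are made up for illustration and are not part of any Couchbase SDK.

```python
import itertools
import json
import urllib.parse
import urllib.request


def post_query(endpoint, statement, timeout=5.0):
    """POST one N1QL statement to a query node and return the parsed JSON reply."""
    body = urllib.parse.urlencode({"statement": statement}).encode("utf-8")
    request = urllib.request.Request(endpoint + "/query/service", data=body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class RoundRobinQueryClient:
    """Hypothetical helper: rotates over query nodes and skips unreachable ones."""

    def __init__(self, endpoints, send=post_query):
        if not endpoints:
            raise ValueError("at least one query endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._send = send  # injectable so the rotation can be tested offline

    def query(self, statement):
        last_error = None
        # Try each node at most once per call, starting from the next in rotation.
        for _ in range(len(self._endpoints)):
            endpoint = next(self._cycle)
            try:
                return self._send(endpoint, statement)
            except OSError as exc:  # connection refused, timeout, URLError, ...
                last_error = exc
        raise RuntimeError("all query nodes failed") from last_error


if __name__ == "__main__":
    client = RoundRobinQueryClient(
        ["http://q1:8093", "http://q2:8093", "http://q3:8093"]
    )
    print(client.query("SELECT * FROM db LIMIT 1"))
```

This only fails over on connection-level errors. A fuller version would also refresh the endpoint list periodically and re-prepare a statement on a node that does not yet know the plan.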

wrt question (2): You only need to load balance among the query services. Index nodes are load balanced automatically if there are duplicate indexes (multiple indexes with the same signature). Ditto for data services. The query service orchestrates the use of index and data.

wrt question (3): Yes. Right now, if you want duplicate indexes, you should create each one explicitly with the WITH clause.
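As a rough sketch of what that looks like, the Python helper below generates one CREATE INDEX statement per index node, each with a different index name but the same key, so the indexes share a signature. The field name `type`, the index name prefix, and the `host:8091` node addresses are placeholders I made up for the example; check the WITH {"nodes": [...]} form against the CREATE INDEX docs for your server version before running the output.

```python
import json


def duplicate_index_statements(bucket, field, index_nodes, prefix="idx"):
    """Build one CREATE INDEX statement per index node (hypothetical helper)."""
    statements = []
    for position, node in enumerate(index_nodes, start=1):
        # Each duplicate needs its own name; the indexed key stays the same.
        name = "{}_{}_{}".format(prefix, field, position)
        with_clause = json.dumps({"nodes": [node]})
        statements.append(
            "CREATE INDEX `{}` ON `{}`(`{}`) USING GSI WITH {}".format(
                name, bucket, field, with_clause
            )
        )
    return statements


if __name__ == "__main__":
    for statement in duplicate_index_statements(
        "db", "type", ["i1:8091", "i2:8091", "i3:8091"]
    ):
        print(statement)
```

Each generated statement can then be sent to any query node, for example through the same REST endpoint you use for your regular queries.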

Thank you very much.