GSI: different sporadic bugs + fragmentation up to 98%

@cihangirb,
you know, I was unable to reproduce :open_mouth:
Even more, after a “clean reinstall” (the hardware stayed the same: 3 nodes, 4 CPUs x 4 GB RAM; by the way, I saved the full contents of the logs folder from the “stuck installation”, ~15 MB per node as .bz2), everything looks predictable:




… ? hmm.

UPDATE: oh-oh:

Is it possible at all?

UPDATE2: oops №2:

… and my test-script died around here.

@cihangirb, @siri
… and the beautiful message at the end:
2016-06-02T07:18:46.427+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 165390 Milliseconds Vbucket 125
2016-06-02T07:18:46.457+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 165420 Milliseconds Vbucket 125
2016-06-02T07:18:46.487+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 165450 Milliseconds Vbucket 125
2016-06-02T07:18:46.517+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 165480 Milliseconds Vbucket 125

Also:





So:

  1. Unpredictable behavior with even 2 GSI indexes (what would happen if there were 6, for example?)
  2. This is one of the strangest and least configurable things I’ve ever seen
  3. Too resource-intensive to reproduce (how many times do I need to run the code to catch it again?)
  4. There is a similar related bug, and it’s CLOSED. And do you know the reason for closing? Right, “Closing this issue, could not reproduce with latest build”:

    No analysis, no investigation, just “I don’t see this anymore”. OK, but did you try to ask “Why”?

Imho, the GSI implementation is buggy, at least in 4.0/4.1.X. The bugs are really unpredictable and sporadic, which makes it easy to say “oh, nothing happens, this is just your particular configuration”.
Also imho: GSI needs a deep implementation review (and probably an architectural one too).

Below is the code; anyone can reproduce with it (I needed 2 runs):

//package highcpuafterload;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;
import com.couchbase.client.java.query.N1qlQuery;
import java.util.LinkedList;
import java.util.Scanner;
import java.util.concurrent.Phaser;
import java.util.concurrent.ThreadLocalRandom;

public class BombardaMaxima extends Thread {

private static final CouchbaseEnvironment ce;
private static final Cluster cluster;
private static final String bucket = "default";
// configure here
private static final int indexableInsertions = 1;
private static final int nonIndexableInsertions = 1;
private static final int totalRuns = 200000;
private static final int threads = 20;
private static final int threadPauseEachNRuns = 1000;

private static final Phaser p = new Phaser(threads  + 1);

static {
    ce = DefaultCouchbaseEnvironment.create();
    final LinkedList<String> nodes = new LinkedList<String>();
    nodes.add("A.node");
    nodes.add("B.node");
    nodes.add("C.node");
    cluster = CouchbaseCluster.create(ce, nodes);
    final Bucket b = cluster.openBucket(bucket);
    final String iQA = "CREATE INDEX iQA ON `default`(a, b) WHERE a is valued USING GSI";
    final String iQX = "CREATE INDEX iQX ON `default`(a, c) WHERE a is valued USING GSI";
    
    b.query(N1qlQuery.simple(iQA));        
    b.query(N1qlQuery.simple(iQX));        
    
}

public final void run() {
    Bucket b = null;
    synchronized(cluster) { b = cluster.openBucket(bucket); }
    for(int k = 0; k< totalRuns; k++) {
        // optional pause between batches (disabled):
        // if(k % threadPauseEachNRuns == 0) try { Thread.sleep(10000); } catch(Exception e) { e.printStackTrace(); }
        for(int i = 0; i < nonIndexableInsertions; i++) b.upsert(makeDoc("x", "y", "z"));
        for(int i = 0; i < indexableInsertions; i++)  b.upsert(makeDoc("a", "b", "c"));
    }    
    p.arriveAndAwaitAdvance();
}


public final JsonDocument makeDoc(String a, String b, String c) {
   /*
    final long ctm = System.currentTimeMillis();
    return JsonDocument
            .create(
                String.valueOf(ThreadLocalRandom.current().nextInt() + "" + ctm),
                JsonObject.empty()
                    .put(a, ThreadLocalRandom.current().nextInt() + "" + ctm)
                    .put(b, ThreadLocalRandom.current().nextInt() + "" + ctm)
                    .put(c, ThreadLocalRandom.current().nextInt() + "" + ctm)
            );
 */
    return JsonDocument
            .create(
                String.valueOf(ThreadLocalRandom.current().nextInt()),
                JsonObject.empty()
                    .put(a, ThreadLocalRandom.current().nextInt())
                    .put(b, ThreadLocalRandom.current().nextInt())
                    .put(c, ThreadLocalRandom.current().nextInt())
            );
}
public static void main(String[] args) {
    for(int i = 0; i< threads; i++) new BombardaMaxima().start();
    p.arriveAndAwaitAdvance();
}

}

And more (but this one with 6 indexes, 1 copy per node of 2 identical indexes + small load):




Hi all:

I’m facing the same issue, with both 4.1.x and 4.5.x.
The build of GSI forestdb indexes never finishes, and indexer.log continuously shows:

2016-10-26T16:59:45.704+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49500 Milliseconds Vbucket 566
2016-10-26T16:59:45.734+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49530 Milliseconds Vbucket 566
2016-10-26T16:59:45.764+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49560 Milliseconds Vbucket 566
2016-10-26T16:59:45.794+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49590 Milliseconds Vbucket 566
2016-10-26T16:59:45.824+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49620 Milliseconds Vbucket 566
2016-10-26T16:59:45.854+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49650 Milliseconds Vbucket 566
2016-10-26T16:59:45.884+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49680 Milliseconds Vbucket 566
2016-10-26T16:59:45.914+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49710 Milliseconds Vbucket 566
2016-10-26T16:59:45.944+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49740 Milliseconds Vbucket 566
2016-10-26T16:59:45.974+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49770 Milliseconds Vbucket 566
2016-10-26T16:59:46.004+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49800 Milliseconds Vbucket 566
2016-10-26T16:59:46.034+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49830 Milliseconds Vbucket 566
2016-10-26T16:59:46.064+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49860 Milliseconds Vbucket 566
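
For triage, a small sketch like the one below can condense these warnings into the longest reported wait per vbucket. It assumes the message format is exactly as quoted above (other indexer versions may log differently). The class and method names are made up for illustration and are not part of any Couchbase tool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NodeAllocWaits {

    // Assumption: matches only the warning format quoted in this thread.
    private static final Pattern WAIT = Pattern.compile(
            "Waiting for Node Alloc for (\\d+) Milliseconds Vbucket (\\d+)");

    /** Returns the longest reported wait in milliseconds, keyed by vbucket id. */
    public static Map<Integer, Long> maxWaitPerVbucket(List<String> lines) {
        Map<Integer, Long> maxWait = new TreeMap<>();
        for (String line : lines) {
            Matcher m = WAIT.matcher(line);
            if (m.find()) {
                long waitMs = Long.parseLong(m.group(1));
                int vbucket = Integer.parseInt(m.group(2));
                // keep the larger of the stored and the newly seen wait
                maxWait.merge(vbucket, waitMs, Math::max);
            }
        }
        return maxWait;
    }

    public static void main(String[] args) {
        // Two lines copied from the log excerpt above.
        List<String> sample = Arrays.asList(
                "2016-10-26T16:59:45.704+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49500 Milliseconds Vbucket 566",
                "2016-10-26T16:59:46.064+00:00 [Warn] Indexer::MutationQueue Waiting for Node Alloc for 49860 Milliseconds Vbucket 566");
        maxWaitPerVbucket(sample).forEach((vb, ms) ->
                System.out.println("vbucket " + vb + ": waited up to " + ms + " ms"));
    }
}
```

To run it against a real file, the sample list could be replaced with the lines read from indexer.log (the location of that file depends on the installation). In the excerpt above, the reported wait grows by 30 ms between consecutive warnings, so one stuck vbucket produces dozens of lines per second and a per-vbucket summary is much easier to read.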

@cihangirb, any clue?

Thanks,

@jfrodriguez, please switch to circular write mode and 4.5.1 version if not already done.
http://developer.couchbase.com/documentation/server/4.5/indexes/gsi-for-n1ql.html

Hi!

Does anyone know whether the issues with building standard GSI indexes over millions of documents are already solved in 4.6.x? Maybe @keshav_m, @geraldss?
Thanks

@jfrodriguez, with a reasonable memory quota and circular write mode in 4.5.1, you shouldn’t be facing the issues mentioned in this thread. Let us know if you are having problems and we can review your sizing.