Batch insertion - Strange behavior in ops/sec

manusyone · November 2, 2016, 9:38am

I’ve batch inserted 10k documents into the database to populate it. Since that insertion, I see periodic spikes in the number of operations per second (on the order of 5k ops/sec) in the bucket, which slow the application massively (a single query that usually takes <1 second took more than 54 seconds).

I then flushed the bucket, and even with an empty bucket those spikes still occur. I then deleted the bucket and recreated it under the same name, only to find that the number of operations per second still rises from time to time.

Since this is a small dev environment, we currently have a single node with 2 GB of RAM and 80 GB of disk. Here’s a picture of the current status of the server:

What causes this behavior? How can I solve it? Let me know if you need more information.

keshav_m · November 2, 2016, 4:17pm

The graph indicates insertion of more than 10K documents.

Can you post the explain for your queries?
Also, if you’re using 4.5.1, post: SELECT * FROM system:completed_requests;

manusyone · November 2, 2016, 4:52pm

Hi @keshav_m. We only inserted 10k documents (as you can see in the second image); I don’t know what the subsequent spikes mean.

The query above returns 4000 results with a success status in 1.36s. Here’s the first one:

{
  "completed_requests": {
    "ElapsedTime": "13.313917777s",
    "ErrorCount": 0,
    "PhaseCounts": {
      "Fetch": 9841,
      "PrimaryScan": 9841,
      "Sort": 10
    },
    "PhaseOperators": {
      "Fetch": 1,
      "PrimaryScan": 1,
      "Sort": 1
    },
    "RequestId": "da8f0981-a3b0-4a0e-8745-9fe693a30938",
    "ResultCount": 5,
    "ResultSize": 2726,
    "ServiceTime": "13.312824699s",
    "State": "completed",
    "Statement": "*removed for privacy*",
    "Time": "2016-10-25 17:30:59.906734729 +0000 UTC"
  }
},

geraldss · November 2, 2016, 5:07pm

Which are you using to insert the documents, N1QL INSERT or key-value?

manusyone · November 2, 2016, 5:11pm

Key-value, we’re using an ODM that does that for us. We insert one document at a time. Is it a bad approach?

geraldss · November 2, 2016, 5:17pm

I just wanted @keshav_m to have that context as he is helping you.

keshav_m · November 2, 2016, 5:35pm

@manusyone

From the completed_requests, I see your queries have the following:

PRIMARY Scan – that’ll scan the entire data set (10K) to produce 5 documents.
There is a SORT operation.

Please see the N1QL articles in DZone to design the right indexes and avoid ORDER BY, if possible.
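
As a rough illustration of that reading, the sketch below takes the counters from the completed_requests entry posted earlier in the thread and computes how many documents were fetched for each document returned. The helper name `scan_report` is hypothetical, and only fields visible in that entry are used; this is a minimal Python sketch for inspecting such entries, not a Couchbase tool.

```python
import json

# Counters copied from the completed_requests entry posted in this thread.
# The Statement field is left out, as it was removed in the original post.
ENTRY = json.loads("""
{
  "ElapsedTime": "13.313917777s",
  "PhaseCounts": {"Fetch": 9841, "PrimaryScan": 9841, "Sort": 10},
  "ResultCount": 5
}
""")


def scan_report(entry):
    """Summarize how much work a request did per returned document.

    Hypothetical helper: it only reads PhaseCounts and ResultCount.
    """
    counts = entry.get("PhaseCounts", {})
    fetched = counts.get("Fetch", 0)
    results = entry.get("ResultCount", 0)
    return {
        "uses_primary_scan": "PrimaryScan" in counts,
        "has_sort": "Sort" in counts,
        "fetched": fetched,
        "results": results,
        # Documents fetched per document returned; None when nothing returned.
        "fetched_per_result": fetched / results if results else None,
    }


report = scan_report(ENTRY)
print(report)
```

For the posted entry this works out to roughly 1,968 documents fetched per result, which is the pattern a suitable index is meant to avoid.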

manusyone · November 2, 2016, 6:18pm

@keshav_m And you believe the lack of indexes is the cause of such an enormous impact with such a low number of documents?

Regarding ORDER BY, this will be required for some operations, as the results populate a listing that allows sorting. I’ll look into creating indexes for my data.

manusyone · November 3, 2016, 9:18am

@keshav_m, @geraldss, just to give you some more context, I’ve inserted 10k documents containing info about cities (name, country, coordinates…). All other documents concern other things, like Users. Do you have any suggestions on how I can keep those 10k city documents from being scanned when I run operations on the other documents?
