Insert JSON bulk data via N1QL query

Hi Folks,

I want to insert bulk JSON data into Couchbase via a N1QL query.

I am able to insert bulk JSON data using the Python SDK in this way:

import time

from couchbase.cluster import Cluster, PasswordAuthenticator

def writehandler(event, context):
    cluster = Cluster('couchbase://h1,h2')
    authenticator = PasswordAuthenticator('abc', 'abc')
    cluster.authenticate(authenticator)
    cb = cluster.open_bucket('firstbucket')

    start_time = time.time()
    list1 = {}
    a = 1
    while a <= 50000:
        list1['doc_id_' + str(a)] = {'airlines': a}
        a = a + 1
    # print(list1)
    cb.upsert_multi(list1)

In this I am using a while loop.

I want to do the same thing using a N1QL query: insert 20,000 JSON records at a time.

Thank you.

What you have in Python is 90% of the way there - the reason you're getting a timeout is that it's taking longer than the default timeout to prepare, send and read the responses for a batch of up to 50,000 items.

I suggest you either reduce the batch size, or increase the timeout from the default (which IIRC is 2.5s). See the client settings section in the SDK documentation for increasing the timeout.
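
For example, a minimal sketch, assuming the 2.x Python SDK and the connection details from the original post (the 30-second value is just illustrative):

from couchbase.cluster import Cluster, PasswordAuthenticator

# operation_timeout can be raised via the connection string (a libcouchbase
# client setting), or via the bucket's timeout property after opening it.
cluster = Cluster('couchbase://h1,h2?operation_timeout=30')
cluster.authenticate(PasswordAuthenticator('abc', 'abc'))
cb = cluster.open_bucket('firstbucket')
cb.timeout = 30  # per-operation timeout, in seconds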

Hi @drigby,

My problem is not the timeout.

I want to insert 20,000 JSON records into Couchbase via a N1QL query, just like the Python SDK example above.

Thank you.

Hi @ImPurshu,

This is the syntax in N1QL:

INSERT INTO `travel-sample` (KEY,VALUE) 
VALUES ( "airline_4444", 
    { "callsign": "MY-AIR",
      "country": "United States",
      "iata": "Z1",
      "icao": "AQZ",
      "name": "80-My Air",
      "id": "4444",
      "type": "airline"} ),
VALUES ( "4445", { "callsign": "AIR-X",
      "country": "United States",
      "iata": "X1",
      "icao": "ARX",
      "name": "10-AirX",
      "id": "4445",
      "type": "airline"} ) ;

See the docs at: https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/insert.html
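
If you want to drive this from the Python SDK rather than hand-writing 20,000 VALUES clauses, one rough sketch (assuming the 2.x SDK's N1QLQuery/n1ql_query API and the bucket and credentials from the original post) is to build the statement programmatically and submit it in chunks:

from couchbase.cluster import Cluster, PasswordAuthenticator
from couchbase.n1ql import N1QLQuery

cluster = Cluster('couchbase://h1,h2')
cluster.authenticate(PasswordAuthenticator('abc', 'abc'))
cb = cluster.open_bucket('firstbucket')

def n1ql_bulk_insert(bucket, total=20000, chunk=1000):
    # Build one INSERT statement per chunk, each with many VALUES clauses,
    # and execute it through the query service.
    for start in range(1, total + 1, chunk):
        values = []
        for a in range(start, min(start + chunk, total + 1)):
            values.append('VALUES ("doc_id_%d", {"airlines": %d})' % (a, a))
        statement = 'INSERT INTO `firstbucket` (KEY, VALUE) ' + ', '.join(values)
        # n1ql_query returns a lazy result; iterating it runs the statement.
        for _ in bucket.n1ql_query(N1QLQuery(statement)):
            pass

n1ql_bulk_insert(cb)

Keeping each statement to a reasonable chunk size avoids very large query payloads and request timeouts.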

Hi @keshav_m

Thank you for your reply.

The solution you have given is correct, and I had already seen it.

But if you want to insert 20,000 records at a time, this is not a good solution.

Please help me if you have another solution.

What's the purpose of trying to insert 20K documents in a single statement?

If you're generating or already have the data and want to load it quickly, there are many approaches:

  1. Direct KV inserts
  2. cbimport (see the example command after this list)
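
For the cbimport route, an illustrative invocation (assuming the data has been written out as a JSON array file; adjust the host, credentials, bucket and path to your environment):

cbimport json -c couchbase://h1 -u abc -p abc -b firstbucket \
  -d file:///tmp/docs.json -f list -g doc_id_#MONO_INCR# -t 4

Here -f list means the file contains a single JSON array of documents, and -g generates keys like doc_id_1, doc_id_2, ... using the monotonic-increment generator.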

Note that the documentation has a discussion on batching operations, which is often necessary when optimizing for throughput, including a Python example.
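
Along those lines, a minimal batching sketch (assuming the cb bucket connection from the original post; the 1,000-item chunk size is just an illustrative starting point):

def upsert_in_batches(bucket, docs, batch_size=1000):
    # Split the key -> document dict into smaller multi-operations so each
    # network round trip stays well under the operation timeout.
    keys = list(docs)
    for i in range(0, len(keys), batch_size):
        batch = {k: docs[k] for k in keys[i:i + batch_size]}
        bucket.upsert_multi(batch)

docs = {'doc_id_' + str(a): {'airlines': a} for a in range(1, 20001)}
upsert_in_batches(cb, docs)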