We observe very slow write performance using the Couchbase Spark Connector.
Connector 2.2.0 currently uses the async bucket and inserts documents one by one. We use the Connector with Spark Streaming, where 1000-5000 documents are supposed to be inserted per second. The documents go through expensive models before insertion, but the write step nevertheless dominates the run time of the whole micro-batch.
We have very few indexes on documents, practically one index on the document type (cardinality of 2).
Are there any tips for improving write throughput? Would rewriting the insertion to use bulk operations help?
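
To make the last question concrete, this is roughly what we mean by "bulk": issue a window of async writes at once and wait for the whole window, instead of waiting on each document in turn. Below is a minimal Java sketch of that idea. The names `BulkWriteSketch`, `writeInWindows`, `upsertAsync` and `windowSize` are hypothetical and ours; `upsertAsync` is only a stand-in for whatever async write call the SDK offers. This is not the connector's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper: not part of the Couchbase Spark Connector or SDK.
final class BulkWriteSketch {

    // Writes docs in windows of at most windowSize concurrent operations.
    // upsertAsync is an assumed placeholder for an async write call.
    // Results are returned in the same order as the input documents.
    static <D, R> List<R> writeInWindows(List<D> docs,
                                         Function<D, CompletableFuture<R>> upsertAsync,
                                         int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<R> results = new ArrayList<>(docs.size());
        for (int start = 0; start < docs.size(); start += windowSize) {
            int end = Math.min(start + windowSize, docs.size());

            // Fire off every write in the window without waiting in between.
            List<CompletableFuture<R>> inFlight = new ArrayList<>(end - start);
            for (D doc : docs.subList(start, end)) {
                inFlight.add(upsertAsync.apply(doc));
            }

            // Only now block until the whole window has completed.
            // join() rethrows a failed write as a CompletionException.
            for (CompletableFuture<R> future : inFlight) {
                results.add(future.join());
            }
        }
        return results;
    }
}
```

In a Spark job we would expect to call something like this once per partition (for example inside `foreachPartition`), so that each executor keeps a bounded number of writes in flight. Whether this actually beats the current one-by-one path is exactly what we are unsure about.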