Couchbase Spark connector using the Java API: N1QL LIMIT option

Hello,

I built my application following the Java API documentation, but when I run the query, the LIMIT is not taken into account. By default it should be 1000, but I get the whole database. If I use

.option(QueryOptions.InferLimit(), "1");

the result is the same: I still get the whole database.

When I run SELECT * FROM system:completed_requests in Couchbase, I can see my query, but it has no LIMIT clause…

Do you have an idea why the LIMIT does not work?

Couchbase version 6.6.1
spark-core_2.12 version 3.3.0
spark-connector_2.12 version 3.3.1

I hope my question is understandable enough.

Hi @MisterQ. That setting is only used when inferring the schema. If you want to apply a LIMIT to a regular query statement, you just add it to the statement itself, e.g. "SELECT default.* FROM default WHERE <…> LIMIT 1000"

Thank you for the answer.
How can I write a fully custom query?

I used the example from the Java API documentation:

        DataFrameReader sources = spark.read()
                .format("couchbase.query")
                .option(QueryOptions.Filter(), "type = 'airline'")
                .option(QueryOptions.Bucket(), "mybucket")
                .option(QueryOptions.InferLimit(), "1");

Ah I see, you’re working with DataFrames. Can you try .limit(1000) at the end of the DataFrame chain?
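In Java that would look something like the sketch below. This is untested and assumes the same "couchbase.query" source and "mybucket" placeholder from the example earlier in this thread, plus a reachable Couchbase cluster and the connector on the classpath:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Sketch only: mirrors the DataFrameReader example above; requires
// spark-connector_2.12 on the classpath and a running Couchbase cluster.
Dataset<Row> airlines = spark.read()
        .format("couchbase.query")
        .option(QueryOptions.Filter(), "type = 'airline'")
        .option(QueryOptions.Bucket(), "mybucket")
        .load()       // load() returns a Dataset<Row>, so limit() is available here
        .limit(1000); // limit applied on the DataFrame, after the read is defined

airlines.show();
```

Note that limit() here is a plain Spark DataFrame transformation, not a connector option.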

The .limit(1000) is only available after the .load() (not after the read()), but I don't want to load the whole database first and only then limit it to 1000 rows: the read() returns far too many results and takes too long.
I want the limit applied within the .read() itself, for performance.

But if you know how to write a fully custom query with the Java API, I am listening.

I see, so it’s not pushed down. In that case your other option would be to use RDDs directly:

        spark
          .sparkContext
          .couchbaseQuery[JsonObject]("select `travel-sample`.* from `travel-sample` LIMIT 1000")
          .collect()
          .foreach(println)

In the Java API there is no "couchbaseQuery" or "query" function available after sparkContext().

Do you know the name of the function I should use?

Ah - this is because in Scala it's an implicit method, pulled in with import com.couchbase.spark._. It should compile down to an ordinary static method though, so you should be able to call it from Java if you explore that namespace.
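For illustration: a Scala implicit conversion defined in a package object is compiled to a method on the generated package$ singleton, which Java can reach through its MODULE$ field. A very rough, hypothetical sketch of what the call could look like is below. The names toSparkContextFunctions and SparkContextFunctions are assumptions, not taken from the connector's documented Java API; check the com.couchbase.spark package object in the 3.3.1 sources for the real member names:

```java
import org.apache.spark.SparkContext;

// Hypothetical sketch -- wrapper class and conversion method names are
// assumptions; verify them against the connector's Scala sources.
SparkContext sc = spark.sparkContext();

// A Scala package-object member is exposed to Java on the generated
// `package$` class via its MODULE$ singleton instance:
com.couchbase.spark.SparkContextFunctions fns =
        com.couchbase.spark.package$.MODULE$.toSparkContextFunctions(sc);

// From here the query method should be callable directly, though Scala
// type parameters and implicit ClassTag arguments may need to be passed
// explicitly from Java.
```

Decompiling the connector jar (or browsing it in an IDE) is the quickest way to see the exact Java-visible signatures.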