
Talk to Your Data: A UDF That Speaks Your Language

The query above provides valuable insights from your data stored in Couchbase about the top five users who generated the most completed orders within the past 30 days. But what if you’re not an advanced SQL++ developer and need the answers by 11 p.m. for a report? You would then have to wait for a developer to write the SQL++ query and get you the answers.

Alternatively, consider a case where you need to do some ad hoc debugging to address questions like:

  • Are there any documents where the date the order was delivered is missing?
  • Does that mean the order was cancelled? Or did we misplace the order and it never got delivered? Or was everything fine, and we simply missed adding the order_delivered value to the field?

In this case, you not only need to search the order_delivered field, but also look at order_cancelled or investigate comments to figure out whether the order was misplaced. So the query to be written isn’t simple or straightforward.

In such cases, it would help if you had a reliable assistant available 24×7 to get all these answers. The UDF described in this blog is such an assistant. It accepts your questions in the most natural way and returns results in JSON. Behind the scenes, it connects to a model of your choice, along with your API key, to convert your thoughts into SQL++ and then executes it. And all you need to invoke this assistant is to use the UDF.

How It Works

1. Set up the library.
You first create a JavaScript library used by the UDF.

Library:
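Here is a minimal illustrative sketch of what such a library might look like (not the exact code from this post). The function names inferencer and nl2sql match those described later; N1QL() is used to run SQL++ from the JavaScript UDF, while httpPost() is a hypothetical placeholder for however your deployment reaches the model endpoint:

    // Illustrative sketch only – not the original library from the post.
    // The httpPost() helper and the exact request/response shapes are
    // assumptions to adapt to whatever outbound-call mechanism your
    // Couchbase version exposes to JavaScript UDFs.

    // Retrieve a collection's schema with Couchbase's INFER statement.
    function inferencer(keyspace) {
      var schema = [];
      var q = N1QL("INFER " + keyspace + " WITH {\"sample_size\": 1000};");
      for (const row of q) {
        schema.push(row);
      }
      return schema;
    }

    // Hypothetical helper standing in for an HTTP POST to the model endpoint.
    function httpPost(endpoint, apikey, body) {
      // ...call the OpenAI-compatible chat completions endpoint here...
    }

    // Main entry point: natural language in, SQL++ (and results) out.
    function nl2sql(keyspaces, prompt, apikey, endpoint, model) {
      // 1. Collect the schema of every keyspace the caller pointed us at.
      var schemas = {};
      for (var i = 0; i < keyspaces.length; i++) {
        schemas[keyspaces[i]] = inferencer(keyspaces[i]);
      }

      // 2. Build a chat completions request that nudges the LLM toward SQL++.
      var request = {
        model: model,
        messages: [
          { role: "system",
            content: "Generate a Couchbase SQL++ query. Schemas: " + JSON.stringify(schemas) },
          { role: "user", content: prompt }
        ]
      };

      // 3. Send it to the LLM and extract the generated SQL++.
      var response = httpPost(endpoint, apikey, request);
      var statement = response.choices[0].message.content;

      // 4. Execute SELECT statements only; anything else is returned for review.
      if (statement.trim().toUpperCase().startsWith("SELECT")) {
        var rows = [];
        for (const row of N1QL(statement)) {
          rows.push(row);
        }
        return { query: statement, results: rows };
      }
      return { query: statement, results: "Non-SELECT statement – review and run it manually." };
    }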

2. Upload the library.
Run the curl command after copying the provided library code into a file, i.e., usingailib.js.
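A sketch of that upload, assuming the Query service’s JavaScript library endpoint on localhost:8093, a library named ailib, and placeholder credentials:

    curl -X POST http://localhost:8093/evaluator/v1/libraries/ailib \
      -u Administrator:password \
      -H 'content-type: application/json' \
      --data-binary @usingailib.js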

3. Create the UDF.
Once you have created the library, use the CREATE FUNCTION statement below to create the UDF:
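For example, assuming the library was uploaded as ailib and its entry point is the JavaScript function nl2sql, the statement might look like this:

    CREATE OR REPLACE FUNCTION NL2SQL(keyspaces, prompt, apikey, endpoint, model)
    LANGUAGE JAVASCRIPT AS "nl2sql" AT "ailib";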

NL2SQL() now acts as your multilingual translator between human language and Couchbase’s query engine. You simply give it some context and a natural language request, and it returns a response.

How the UDF Thinks

Under the hood, the UDF uses your preferred model to understand your intent and generate a query that Couchbase can execute.

The advantage of using the chat completions API is that you can simply plug in a model from any other provider that is compliant with the same API spec. You can use your own private LLM or well-known ones from OpenAI, Gemini, Claude, etc.

The invoked UDF requires the following information from you:

  1. keyspaces – An array of strings, each representing a Couchbase keyspace (bucket.scope.collection). Use backticks where needed to escape special names (like `travel-sample`.inventory.route). This tells the UDF where to look for your data.
  2. prompt – Your request in plain English (or any other language).
    Example: “Show me all users who made a purchase in the last 24 hours.”
  3. apikey – Your API key used for authenticating with the model endpoint.
  4. endpoint – The model endpoint, e.g., an OpenAI-compliant chat completions URL.
  5. model – The name of the model you want to use from the provider.
    e.g., “gpt-4o-2024-05-13”
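Putting those five pieces together, an invocation might look like the following (the API key is a placeholder and the keyspace and prompt are illustrative):

    EXECUTE FUNCTION NL2SQL(
      ["`travel-sample`.inventory.route"],
      "Show me all routes operated by United Airlines",
      "...your-openai-api-key...",
      "https://api.openai.com/v1/chat/completions",
      "gpt-4o-2024-05-13"
    );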

There are also several available functions in the library:

inferencer()

Before generating a query, the UDF first tries to understand your data. The inferencer() helper function calls Couchbase’s INFER statement to retrieve a collection’s schema:
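For reference, the statement it issues looks something like this (the keyspace and the sample_size option are illustrative):

    INFER `travel-sample`.inventory.route WITH {"sample_size": 1000};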

This schema is used to help the AI understand what kind of data lives inside each collection.

The main function: nl2sql()

  • Collects all schemas for the given keyspaces using inferencer().
  • Constructs a prompt that includes the inferred schema, your natural language query, and a Couchbase-specific prompt to nudge the LLM.
  • Sends it to the LLM.
  • Extracts the generated SQL++ from the model’s response.
  • Executes it directly if it’s a SELECT statement and returns both the generated SQL++ statement and the query results.

The reason for not executing non-SELECT statements is that you don’t want this UDF to insert, update, or delete documents in a collection without you verifying it. Instead, the generated SQL++ statement is returned so you can execute it yourself after it’s been verified.

Example use case:
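A hypothetical invocation for the debugging scenario described earlier, assuming a made-up `ecommerce`.sales.orders keyspace and an OpenAI endpoint:

    EXECUTE FUNCTION NL2SQL(
      ["`ecommerce`.sales.orders"],
      "Find orders with no order_delivered date that were not cancelled",
      "...your-openai-api-key...",
      "https://api.openai.com/v1/chat/completions",
      "gpt-4o-2024-05-13"
    );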

Experimenting with models from other providers

The next example uses Gemini’s OpenAI-compatible API. You simply change the model provider’s URL from the previous OpenAI URL to Gemini’s, change the model parameter to a model Gemini recognizes, and update the apikey from OpenAI’s key to Gemini’s key.
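Keeping everything else the same, such a call might look like the following; the Gemini model name is illustrative, so check the provider’s documentation for current names:

    EXECUTE FUNCTION NL2SQL(
      ["`travel-sample`.inventory.route"],
      "Show me all routes operated by United Airlines",
      "...your-gemini-api-key...",
      "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
      "gemini-2.0-flash"
    );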


Conclusion

This blog provides a glimpse into how you can leverage AI to interact with your data in Couchbase. With this UDF, natural language querying becomes a reality – no SQL++ expertise required. It is model-agnostic, and because only SELECT statements are executed automatically, your data is never modified without your review.

And this is just the beginning. In the future, we hope to extend it to:

  • Image → SQL++
  • Voice → SQL++
  • Agent-like pipelines

… all running inside Couchbase workflows.

References
Capella IQ: https://docs.couchbase.com/cloud/get-started/capella-iq/get-started-with-iq.html
Chat completions APIs: https://platform.openai.com/docs/api-reference/chat and https://ai.google.dev/gemini-api/docs/openai#rest

Author

Posted by Gaurav Jayaraj - Software Engineer

Gaurav Jayaraj is an intern on the Query team in Couchbase R&D. Gaurav is pursuing a bachelor's degree in Computer Science at PES University, Bangalore.
