카우치베이스 서버

Talk to Your Data: A UDF That Speaks Your Language

The query above provides valuable insights from your data that’s stored in Couchbase about your top five users who generated the most completed orders within the past 30 days. But what if you’re not an advanced SQL++ developer and need the answers by 11 p.m. for a report? You then need to wait for a developer to write a SQL++ query and get you the answers.

Alternatively, consider a case where you need to do some ad hoc debugging to address questions like:

  • Are there any documents where the date the order was delivered is missing?
  • Does that mean that the order was cancelled? Or did we misplace the order and the order never got delivered? Or was everything ok, but we simply missed adding the order_delivered value in the field?

In this case, you not only need to search the  order_delivered field, but also look at order_cancelled or investigate comments to figure out if it was misplaced, etc. So the query to be written isn’t simple or straightforward. 

In such cases, it would help if you had a reliable assistant available 24×7 to get all these answers. The UDF described in this blog is such an assistant. It accepts your questions in the most natural way and returns results in JSON. Behind the scenes, it connects to a model of your choice, along with your API key, to convert your thoughts into SQL++ and then executes it. And all you need to invoke this assistant is to use the UDF.

작동 방식

1. Set up the library.
You first create a JavaScript library used by the UDF.

Library:

2. Upload the library.
Run the curl command after copying the provided library code into a file, i.e., usingailib.js.

3. Create the UDF.
Use the create function statement below to create the UDF once you have created the library:

NL2SQL() now acts as your multilingual translator between human language and Couchbase’s query engine. You simply give it some context and a natural language request, and it returns a response.

How the UDF Thinks

Under the hood, it uses your preferred model when invoking the UDF to understand your intent and generate a query that Couchbase can execute.

The advantage of using the chat completions API means you could simply plug in a model from other providers that are compliant with the same API spec. You can use your own private LLM or known ones from Open AI, Gemini, Claude, etc.

The invoked UDF requires the following information from you:

  1. 키 스페이스 – An array of strings, each representing a Couchbase keyspace (bucket.scope.collection).Use grave accent quotes where needed to escape special names (like 여행 샘플.inventory.route). This tells the UDF where to look for your data.
  2. 프롬프트 – Your request in plain English (or any other language).
    Example: “Show me all users who made a purchase in the last 24 hours.”
  3. apikey – Your API key used for authenticating with the model endpoint.
  4. model endpoint – e.g., Open AI compliant chat completions URL.
  5. 모델 – The name of the model you want to use from the provider.
    e.g., “gpt-4o-2024-05-13”

There are also several available functions in the library:

inferencer()

Before generating a query, the UDF first tries to understand your data. The inferencer() helper function calls Couchbase’s INFER statement to retrieve a collection’s schema:

This schema is used to help the AI understand what kind of data lives inside each collection.

The main function: nl2sql()

  • Collects all schemas for the given keyspaces using the inferencer(). Constructs a prompt that includes: the inferred schema, your natural language query, and a Couchbase prompt to nudge the LLM.
  • Sends it to the LLM.
  • Extracts the generated SQL++ from the model’s response.
  • Executes it directly if it’s a SELECT statement and returns both the generated SQL++ statement and the query results.

The reason for not executing non-select statements is that you don’t want this UDF to insert, update, or delete documents in a collection without you verifying it. So the SQL++ statement lets you execute it after it’s been verified.

Example use case:

Experimenting with models from other providers

The next example uses Gemini’s Open AI-compatible API. You simply change the model provider’s URL from the previous Open AI API to Gemini’s API. Also, be sure to change the model parameter to a model it recognizes. Of course, you need to also update the api-key from Open AI’s key to Gemini’s key.

The following illustrates the result:

결론

This blog provides a glimpse into how you can leverage AI to interact with your data in Couchbase. With this UDF, natural language querying becomes a reality – no SQL++ expertise required. It is model-agnostic and safe for production queries.

And this is just the beginning. In the future, we hope to extend it to:

  • Image → SQL++
  • Voice → SQL++
  • Agent-like pipelines

… all running inside Couchbase workflows.

참조
Capella IQ: https://docs.couchbase.com/cloud/get-started/capella-iq/get-started-with-iq.html
Chat completions APIs:
https://platform.openai.com/docs/api-reference/chat
https://ai.google.dev/gemini-api/docs/openai#rest

이 문서 공유하기
받은 편지함에서 카우치베이스 블로그 업데이트 받기
이 필드는 필수 입력 사항입니다.

Author

Posted by 가우라브 자야라즈 - 소프트웨어 엔지니어

가우라브 자야라즈는 카우치베이스 R&D의 쿼리 팀에서 인턴으로 근무하고 있습니다. Gaurav는 방갈로르의 PES 대학에서 컴퓨터 공학 학사 학위를 취득했습니다.

댓글 남기기

카우치베이스 카펠라를 시작할 준비가 되셨나요?

구축 시작

개발자 포털에서 NoSQL을 살펴보고, 리소스를 찾아보고, 튜토리얼을 시작하세요.

카펠라 무료 사용

클릭 몇 번으로 Couchbase를 직접 체험해 보세요. Capella DBaaS는 가장 쉽고 빠르게 시작할 수 있는 방법입니다.

연락하기

카우치베이스 제품에 대해 자세히 알고 싶으신가요? 저희가 도와드리겠습니다.