Generative AI (GenAI)

CodeLab: Building a RAG Application With Couchbase Capella Model Services and LangChain

In this tutorial, you will learn how to build a retrieval-augmented generation (RAG) application using Couchbase AI Services to store data, generate embeddings with embedding models, and run LLM inference. We will create a RAG system that:

  1. Ingests news articles from the BBC News dataset.
  2. Generates vector embeddings using the NVIDIA NeMo Retriever model via Capella Model Services.
  3. Stores and indexes these vectors in Couchbase Capella.
  4. Performs semantic search to retrieve relevant context.
  5. Generates answers using the Mistral-7B LLM hosted on Capella.

You can find the notebook source code for this CodeLab here.

Why Couchbase AI Services?

Couchbase AI Services provide:

  • LLM inference and embeddings API: Access popular LLMs (e.g., Llama 3) and embedding models directly through Capella, without managing external API keys or infrastructure.
  • Unified platform: Leverage the database, vectorization, search, and model in one place.
  • Integrated vector search: Perform semantic search directly on your JSON data with millisecond latency.

Setting Up Couchbase AI Services

Create a Cluster in Capella

  1. Log in to Couchbase Capella.
  2. Create a new cluster or use an existing one. Note that the cluster must run Couchbase Server 8.0 or later and include the Data, Query, Index, and Eventing services.
  3. Create a bucket.
  4. Create a scope and collection for the data.

Enable AI Services

  1. Navigate to Capella’s AI Services section on the UI.
  2. Deploy the embeddings and LLM models.
    • You need to deploy an embedding model and an LLM for this demo in the same region as the Capella cluster where the data will be stored.
    • For this demo to work well, deploy an LLM with tool-calling capabilities, such as mistralai/mistral-7b-instruct-v0.3. For embeddings, you can choose a model such as nvidia/llama-3.2-nv-embedqa-1b-v2.
  3. Write down the endpoint URL and generate API keys.

For more details on launching AI models, you can read the official documentation.

Prerequisites

Before we begin, ensure you have Python 3.10+ installed.

Step 1: Install Dependencies

We need the Couchbase SDK, LangChain integrations, and the datasets library.
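A minimal install command for these dependencies; package names follow the current Couchbase SDK and LangChain integration packages (pin versions as needed for your environment):

```shell
pip install couchbase langchain langchain-couchbase langchain-openai datasets
```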

Step 2: Configuration & Connection

We’ll start by connecting to our Couchbase cluster. We also need to configure the endpoints for Capella Model Services.

Note: Capella Model Services are compatible with the OpenAI API format, so we can use the standard langchain-openai library by pointing it to our Capella endpoint.
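The sketch below shows this wiring, with placeholder connection strings, credentials, and endpoint names that you should replace with your own Capella values. The network call is kept inside a function so the configuration can be reviewed without a live cluster:

```python
import os

# Placeholder values -- substitute your own Capella credentials and endpoints.
CB_CONN_STR = os.getenv("CB_CONN_STR", "couchbases://cb.example.cloud.couchbase.com")
CB_USERNAME = os.getenv("CB_USERNAME", "your-database-user")
CB_PASSWORD = os.getenv("CB_PASSWORD", "your-password")

# Capella Model Services expose an OpenAI-compatible API under /v1.
CAPELLA_MODEL_ENDPOINT = os.getenv("CAPELLA_MODEL_ENDPOINT", "https://ai.example.cloud.couchbase.com")
CAPELLA_API_KEY = os.getenv("CAPELLA_API_KEY", "your-api-key")


def openai_base_url(endpoint: str) -> str:
    """Normalize a Capella Model Services endpoint into an OpenAI-style base URL."""
    return endpoint.rstrip("/") + "/v1"


def connect_cluster():
    """Open an authenticated connection to the Capella cluster (requires network access)."""
    from datetime import timedelta

    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions

    opts = ClusterOptions(PasswordAuthenticator(CB_USERNAME, CB_PASSWORD))
    cluster = Cluster(CB_CONN_STR, opts)
    cluster.wait_until_ready(timedelta(seconds=10))  # fail fast if unreachable
    return cluster


if __name__ == "__main__":
    print(openai_base_url(CAPELLA_MODEL_ENDPOINT))
```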

Step 3: Set Up the Database Structure

We need to ensure our bucket, scope, and collection exist to store the news data.
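One way to do this from code, assuming the bucket was already created in the Capella UI and using illustrative names (`news`, `rag`, `articles`) for the keyspace:

```python
# Illustrative names -- match them to the bucket/scope/collection you created.
BUCKET = "news"
SCOPE = "rag"
COLLECTION = "articles"


def keyspace(bucket: str, scope: str, collection: str) -> str:
    """Fully qualified keyspace name, handy for SQL++ queries."""
    return f"`{bucket}`.`{scope}`.`{collection}`"


def ensure_structure(cluster):
    """Create the scope and collection if they do not already exist."""
    from couchbase.exceptions import (
        CollectionAlreadyExistsException,
        ScopeAlreadyExistsException,
    )

    bucket = cluster.bucket(BUCKET)
    mgr = bucket.collections()
    try:
        mgr.create_scope(SCOPE)
    except ScopeAlreadyExistsException:
        pass  # idempotent: reuse the existing scope
    try:
        mgr.create_collection(SCOPE, COLLECTION)
    except CollectionAlreadyExistsException:
        pass  # idempotent: reuse the existing collection
    return bucket.scope(SCOPE).collection(COLLECTION)
```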

Step 4: Load the Couchbase Vector Search Index

Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where Couchbase Vector Search, part of the Search Service (formerly known as Full-Text Search, or FTS), comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured: the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
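The sketch below shows the general shape of such a definition and a loader for it. The index name, field names, and the 2048-dimension vector size are assumptions (2048 matches the llama-3.2-nv-embedqa-1b-v2 embedding model); adjust them to your own JSON file:

```python
import json

# Minimal Search index definition of the shape the tutorial's JSON file uses.
# Names and dims below are illustrative assumptions -- edit to match your setup.
index_definition = {
    "name": "news_index",
    "type": "fulltext-index",
    "sourceType": "gocbcore",
    "sourceName": "news",  # the bucket holding the articles
    "params": {
        "doc_config": {"mode": "scope.collection.type_field"},
        "mapping": {
            "types": {
                "rag.articles": {  # scope.collection being indexed
                    "enabled": True,
                    "properties": {
                        "embedding": {
                            "fields": [{
                                "name": "embedding",
                                "type": "vector",
                                "dims": 2048,  # must match the embedding model
                                "similarity": "dot_product",
                                "index": True,
                            }]
                        },
                        "text": {
                            "fields": [{"name": "text", "type": "text", "store": True}]
                        },
                    },
                }
            }
        },
    },
}


def load_index_definition(path: str) -> dict:
    """Read a Vector Search index definition from a JSON file."""
    with open(path) as f:
        return json.load(f)
```

The loaded definition can then be imported through the Capella UI or the Search index management API, whichever your workflow prefers.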

Step 5: Initialize AI Models

Here is the magic: we initialize the embedding model using LangChain’s OpenAIEmbeddings class but point it at Capella. Couchbase AI Services provide OpenAI-compatible endpoints, so we can use the standard langchain-openai package, which works hand in hand with the LangChain Couchbase integration.
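A sketch of that initialization, using the model names deployed earlier in this tutorial; the exact constructor arguments may vary slightly with your langchain-openai version:

```python
def capella_base_url(endpoint: str) -> str:
    """Capella Model Services serve an OpenAI-compatible API under /v1."""
    return endpoint.rstrip("/") + "/v1"


def init_models(endpoint: str, api_key: str):
    """Point LangChain's OpenAI-compatible clients at Capella Model Services."""
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    embeddings = OpenAIEmbeddings(
        model="nvidia/llama-3.2-nv-embedqa-1b-v2",
        base_url=capella_base_url(endpoint),
        api_key=api_key,
        # Skip OpenAI-specific token counting, which assumes OpenAI models.
        check_embedding_ctx_length=False,
    )
    llm = ChatOpenAI(
        model="mistralai/mistral-7b-instruct-v0.3",
        base_url=capella_base_url(endpoint),
        api_key=api_key,
        temperature=0,  # deterministic answers for RAG
    )
    return embeddings, llm
```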

Step 6: Ingest Data

We load the BBC News dataset and ingest it into Couchbase. The CouchbaseSearchVectorStore automatically handles generating embeddings using our defined model and storing them.
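A sketch of the ingestion step. The dataset identifier, its configuration, and the row field names (`content`, `title`) are assumptions about the BBC News dataset on Hugging Face, and the bucket/scope/collection/index names match the illustrative ones used earlier:

```python
def to_texts_and_metadata(rows, limit=500):
    """Convert dataset rows into parallel lists of texts and metadata dicts.

    Field names ('content', 'title') are assumptions about the dataset schema.
    """
    texts, metadatas = [], []
    for row in rows[:limit]:
        texts.append(row["content"])
        metadatas.append({"title": row.get("title", "")})
    return texts, metadatas


def ingest(cluster, embeddings):
    """Load BBC News articles and store them, with embeddings, in Couchbase."""
    from datasets import load_dataset
    from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore

    # Assumed dataset id/config -- substitute the one used in the notebook.
    dataset = load_dataset("RealTimeData/bbc_news_alltime", "2024-12", split="train")
    texts, metadatas = to_texts_and_metadata(list(dataset))

    vector_store = CouchbaseSearchVectorStore(
        cluster=cluster,
        bucket_name="news",
        scope_name="rag",
        collection_name="articles",
        embedding=embeddings,
        index_name="news_index",
    )
    # add_texts calls the embedding model in batches and upserts JSON documents.
    vector_store.add_texts(texts=texts, metadatas=metadatas)
    return vector_store
```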

Step 7: Build the RAG Chain

Now we create the RAG pipeline. We initialize the LLM (again pointing to Capella) and connect it to our vector store retriever.
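One way to assemble that pipeline with the LangChain expression language; the prompt wording and the `k=4` retrieval depth are illustrative choices, not values from the notebook:

```python
def format_docs(docs):
    """Concatenate retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_rag_chain(vector_store, llm):
    """Wire retriever -> prompt -> LLM -> string output into one runnable chain."""
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough

    prompt = ChatPromptTemplate.from_template(
        "Answer the question using only the context below.\n\n"
        "Context:\n{context}\n\n"
        "Question: {question}"
    )
    retriever = vector_store.as_retriever(search_kwargs={"k": 4})
    return (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```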

Step 8: Run Queries

Let’s test our RAG pipeline.
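A small wrapper for running a question through the chain built in the previous step (the question below matches the example output that follows):

```python
def ask(chain, question: str) -> str:
    """Run one question through the RAG chain and return the generated answer."""
    return chain.invoke(question)


if __name__ == "__main__":
    # chain = build_rag_chain(vector_store, llm)  # from the previous step
    # print(ask(chain, "What has Pep Guardiola said about Manchester City's recent form?"))
    pass
```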

Example Output:

Answer: Pep Guardiola has expressed concern and frustration about Manchester City’s recent form. He stated, “I am not good enough. I am the boss… I have to find solutions.” He acknowledged the team’s defensive issues and lack of confidence.

Conclusion

In this tutorial, you learned how to:

  1. Vectorize data using Couchbase.
  2. Use Couchbase AI Services for embeddings and LLM.
  3. Implement RAG with Couchbase Vector Search.

With Couchbase’s unified database platform, you can build powerful AI applications that generate high-quality, contextually aware content.


Author

Posted by Laurent Doguin

Laurent is a metalhead nerd living in Paris. He mostly writes code in Java and structured text in AsciiDoc, and often speaks about data, reactive programming, and other trendy topics. He was also a Developer Advocate for Clever Cloud and Nuxeo, where he dedicated his time and expertise to helping those communities grow stronger. He currently runs Developer Relations at Couchbase.
