CodeLab: Building a RAG Application With Couchbase Capella Model Services and LangChain

In this tutorial, you will learn how to build a retrieval-augmented generation (RAG) application using Couchbase AI Services to store data, generate embeddings with embedding models, and run LLM inference. We will create a RAG system that:

  1. Ingests news articles from the BBC News dataset.
  2. Generates vector embeddings using the NVIDIA NeMo Retriever model via Capella Model Services.
  3. Stores and indexes these vectors in Couchbase Capella.
  4. Performs semantic search to retrieve relevant context.
  5. Generates answers using the Mistral-7B LLM hosted on Capella.

You can find the notebook source code for this CodeLab here.

Why Couchbase AI Services?

Couchbase AI Services provide:

  • LLM inference and embeddings API: Access popular LLMs (e.g., Llama 3) and embedding models directly through Capella, without managing external API keys or infrastructure.
  • Unified platform: Leverage the database, vectorization, search, and model in one place.
  • Integrated vector search: Perform semantic search directly on your JSON data with millisecond latency.

Setting Up Couchbase AI Services

Create a Cluster in Capella

  1. Sign in to Couchbase Capella.
  2. Create a new cluster or use an existing one. Note that the cluster needs to run the latest version of Couchbase Server 8.0 that includes the Data, Query, Index, and Eventing services.
  3. Create a bucket.
  4. Create a scope and collection for the data.

Enable AI Services

  1. Navigate to Capella’s AI Services section on the UI.
  2. Deploy the embeddings and LLM models.
    • You need to launch an embedding model and an LLM for this demo in the same region as the Capella cluster where the data will be stored.
    • For this demo to work well, you need to deploy an LLM with tool-calling capabilities, such as mistralai/mistral-7b-instruct-v0.3. For embeddings, you can choose a model like nvidia/llama-3.2-nv-embedqa-1b-v2.
  3. Write down the endpoint URL and generate API keys.

For more details on launching AI models, you can read the official documentation.

Prerequisites

Before we begin, ensure you have Python 3.10+ installed.

Step 1: Install Dependencies

We need the Couchbase SDK, LangChain integrations, and the datasets library.
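A typical install might look like the following; the exact package list is an assumption based on the libraries this tutorial references (the Couchbase Python SDK, LangChain with its Couchbase and OpenAI integrations, and Hugging Face datasets):

```shell
# Couchbase Python SDK, LangChain core plus the Couchbase and OpenAI
# integrations, and the datasets library for the BBC News data.
pip install couchbase langchain langchain-couchbase langchain-openai datasets
```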

Step 2: Configuration & Connection

We’ll start by connecting to our Couchbase cluster. We also need to configure the endpoints for Capella Model Services.

Note: Capella Model Services are compatible with the OpenAI API format, so we can use the standard langchain-openai library by pointing it to our Capella endpoint.

Step 3: Set Up the Database Structure

We need to ensure our bucket, scope, and collection exist to store the news data.

Step 4: Load the Couchbase Vector Search Index

Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where Couchbase Vector Search, part of the Search service (formerly known as Full-Text Search, or FTS), comes into play. In this step, we load the Vector Search index definition from a JSON file, which specifies how the index should be structured: the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes vector-similarity queries.
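As an illustration, a minimal index definition could look like the sketch below, shown as the Python dict that the JSON file would deserialize to. The index name, field names, and the 2048-dimension figure are assumptions; the dimension must match whatever your embedding model actually outputs:

```python
SCOPE, COLLECTION = "news", "articles"  # placeholder names from Step 3
EMBEDDING_DIMS = 2048                   # must match the embedding model's output size

# Minimal vector search index definition for the Search service.
# The tutorial loads the same structure with json.load(open("index.json")).
index_definition = {
    "name": "news-vector-index",
    "type": "fulltext-index",
    "sourceType": "gocbcore",
    "sourceName": "rag-demo",
    "params": {
        "doc_config": {"mode": "scope.collection.type_field"},
        "mapping": {
            "default_mapping": {"enabled": False},
            "types": {
                f"{SCOPE}.{COLLECTION}": {
                    "enabled": True,
                    "properties": {
                        # the vector field holding each document's embedding
                        "embedding": {
                            "fields": [{
                                "name": "embedding",
                                "type": "vector",
                                "dims": EMBEDDING_DIMS,
                                "similarity": "dot_product",
                                "index": True,
                            }]
                        },
                        # the raw article text, stored for retrieval
                        "text": {
                            "fields": [{"name": "text", "type": "text",
                                        "store": True}]
                        },
                    },
                }
            },
        },
    },
}
```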

Step 5: Initialize AI Models

Here is the magic: we initialize the embedding model using OpenAIEmbeddings but point it at Capella. Couchbase AI Services expose OpenAI-compatible endpoints, so the standard LangChain OpenAI package works seamlessly alongside the LangChain Couchbase integration.

Step 6: Ingest Data

We load the BBC News dataset and ingest it into Couchbase. The CouchbaseSearchVectorStore automatically handles generating embeddings using our defined model and storing them.

Step 7: Build the RAG Chain

Now we create the RAG pipeline. We initialize the LLM (again pointing to Capella) and connect it to our vector store retriever.

Step 8: Run Queries

Let’s test our RAG.

Example Output:

Answer: Pep Guardiola has expressed concern and frustration about Manchester City’s recent form. He stated, “I am not good enough. I am the boss… I have to find solutions.” He acknowledged the team’s defensive issues and lack of confidence.

Conclusion

In this tutorial, you learned how to:

  1. Vectorize data using Couchbase.
  2. Use Couchbase AI Services for embeddings and LLM.
  3. Implement RAG with Couchbase Vector Search.

Couchbase’s unified database platform lets you build powerful AI applications that generate high-quality, contextually aware content.


Author

Posted by Laurent Doguin

Laurent is a nerdy metalhead living in Paris. He mostly writes code in Java and structured text in AsciiDoc, and often talks about data, reactive programming, and other trendy topics. He was previously a Developer Advocate for Clever Cloud and Nuxeo, where he devoted his time and expertise to helping those communities grow stronger. He now leads developer relations at Couchbase.
