Artificial Intelligence (AI)

Codelab: Building an AI Agent With Couchbase AI Services & Agent Catalog

In this CodeLab, you will learn how to build a Hotel Search Agent using LangChain, Couchbase AI Services, and Agent Catalog. We will also incorporate Arize Phoenix for observability and evaluation to ensure our agent performs reliably.

This tutorial takes you from zero to a fully functional agent that can search for hotels, filter by amenities, and answer natural language queries using real-world data.

Note: You can find the full Google Colab notebook for this CodeLab here.

What Are Couchbase AI Services?

Building AI applications often involves juggling multiple services: a vector database for memory, an inference provider for LLMs (like OpenAI or Anthropic), and separate infrastructure for embedding models.

Couchbase AI Services streamlines this by providing a unified platform where your operational data, vector search, and AI models live together. It offers:

  • LLM inference and embeddings API: Access popular LLMs (like Llama 3) and embedding models directly within Couchbase Capella, with no external API keys, no extra infrastructure, and no data egress. Your application data stays inside Capella: queries, vectors, and model inference all happen where the data lives. This enables secure, low-latency AI experiences while meeting privacy and compliance requirements. The key value: data and AI together, without sending sensitive information outside your system.
  • Unified platform: Database + Vectorization + Search + Model
  • Integrated vector search: Perform semantic search directly on your JSON data with millisecond latency.

Why Is This Needed?

As we move from simple chatbots to agentic workflows, where AI models autonomously use tools, latency and setup complexity become bottlenecks. By co-locating your data and AI services, you reduce operational overhead and latency. Furthermore, tools like the Agent Catalog help manage hundreds of agent prompts and tools and provide built-in logging for your agents.

Prerequisites

Before we begin, ensure you have:

  • A Couchbase Capella account.
  • Python 3.10+ installed.
  • Basic familiarity with Python and Jupyter notebooks.

Create a Cluster in Couchbase Capella

  1. Log into Couchbase Capella.
  2. Create a new cluster or use an existing one. Note that the cluster needs to run the latest version of Couchbase Server (8.0) with the Data, Query, Index, and Eventing services enabled.
  3. Create a bucket.
  4. Create a scope and collection for your data.

Step 1: Install Dependencies

We’ll start by installing the necessary packages. This includes the couchbase-infrastructure helper for setup, the agentc CLI for the catalog, and the LangChain integration packages.
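In a notebook, this is a single pip cell. The exact package set below is an illustrative sketch: couchbase-infrastructure and agentc come from this CodeLab, while the LangChain and Phoenix package names are the commonly published ones and may differ from the notebook's pinned versions.

```text
# requirements sketch for this CodeLab (versions unpinned; check the
# notebook for the exact list)
couchbase-infrastructure
agentc
langchain
langchain-couchbase
langchain-openai
arize-phoenix
```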

Step 2: Infrastructure as Code

Instead of manually clicking through the UI, we use the couchbase-infrastructure package to programmatically provision our Capella environment. This ensures a reproducible setup.

We will:

  1. Create a Project and Cluster.
  2. Deploy an Embedding Model (nvidia/llama-3.2-nv-embedqa-1b-v2) and an LLM (meta/llama3-8b-instruct).
  3. Load the travel-sample dataset.

Couchbase AI Services provides OpenAI-compatible endpoints that are used by the agents.
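Because the endpoints are OpenAI-compatible, any client that speaks the standard chat-completions schema can talk to them. The sketch below builds such a request body with the standard library; the endpoint URL is a placeholder for the one shown in your Capella AI Services UI.

```python
import json

# Placeholder: substitute the endpoint from your Capella AI Services UI.
CAPELLA_CHAT_ENDPOINT = "https://<your-capella-host>/v1/chat/completions"

def build_chat_request(user_message: str,
                       model: str = "meta/llama3-8b-instruct") -> str:
    """Serialize an OpenAI-compatible chat-completions request body."""
    body = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a helpful hotel search assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.0,
    }
    return json.dumps(body)

payload = build_chat_request("Find hotels in Giverny with free breakfast")
print(json.loads(payload)["model"])  # meta/llama3-8b-instruct
```

The same body can be POSTed to the embeddings endpoint's sibling path with an OpenAI SDK or plain HTTP client, since the wire format is identical to OpenAI's.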

Make sure to follow the steps to set up the root certificate. Secure connections to Couchbase Capella require a root certificate for TLS verification. You can find these steps in the Root Certificate Setup section of the Google Colab notebook.

Step 3: Integrating Agent Catalog

The Agent Catalog is a powerful tool for managing the lifecycle of your agent’s capabilities. Instead of hardcoding prompts and tool definitions in your Python files, you manage them as versioned assets. You can centralize and reuse your tools across your development teams. You can also examine and monitor agent responses with the Agent Tracer.

Initialize and Download Assets

First, we initialize the catalog and download our pre-defined prompts and tools.

Index and Publish

We use agentc to index our local files and publish them to Couchbase. This stores the metadata in your database, making it searchable and discoverable by the agent at runtime.
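The flow typically looks like the commands below, run from the project root. Treat this as an illustrative sketch: subcommand arguments and defaults may vary by agentc version, so check `agentc --help` for your install.

```shell
agentc init          # create the local catalog structure
agentc index .       # scan this directory for prompt and tool definitions
agentc publish       # push the indexed catalog metadata to Couchbase
```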

Step 4: Preparing the Vector Store

To enable our agent to search for hotels semantically (e.g., “cozy place near the beach”), we need to generate vector embeddings for our hotel data.

We define a helper to format our hotel data into a rich text representation, prioritizing location and amenities.
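A minimal version of such a helper might look like this. The field names follow the travel-sample hotel schema (name, city, country, free_breakfast, free_internet, free_parking, description); the exact formatting choices are an assumption.

```python
def format_hotel_text(hotel: dict) -> str:
    """Turn a travel-sample hotel document into a rich text string for
    embedding, leading with location and amenities."""
    parts = [
        f"{hotel.get('name', 'Unknown hotel')} in "
        f"{hotel.get('city', '')}, {hotel.get('country', '')}".strip()
    ]
    amenities = []
    if hotel.get("free_breakfast"):
        amenities.append("free breakfast")
    if hotel.get("free_internet"):
        amenities.append("free internet")
    if hotel.get("free_parking"):
        amenities.append("free parking")
    if amenities:
        parts.append("Amenities: " + ", ".join(amenities))
    if hotel.get("description"):
        parts.append(hotel["description"])
    return ". ".join(parts)

doc = {
    "name": "Le Clos Fleuri", "city": "Giverny", "country": "France",
    "free_breakfast": True, "free_internet": True, "free_parking": True,
}
print(format_hotel_text(doc))
```

The resulting string is what gets passed to the embedding model, so putting amenities and location early gives semantic queries like "cozy place near the beach" more signal to match against.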

Step 5: Building the LangChain Agent

We use the Agent Catalog to fetch our tool definitions and prompts dynamically. The code remains generic, while your capabilities (tools) and personality (prompts) are managed separately. We will also create our ReAct agent.
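Conceptually, a ReAct agent alternates between reasoning ("Thought"), tool use ("Action"), and observing results before answering. The framework-free sketch below illustrates one such cycle; in the real agent, LangChain and the LLM make the decision that the keyword check stands in for here, and the tool names are illustrative.

```python
from typing import Callable

def react_step(query: str, tools: dict[str, Callable[[str], str]]) -> str:
    """One simplified ReAct cycle: decide on a tool, act, then answer.
    A keyword check stands in for the LLM's reasoning step."""
    # Thought: does this query need a database search?
    if "hotel" in query.lower():
        # Action: call the search tool with the query.
        observation = tools["search_vector_database"](query)
        # Final answer: synthesize the observation into a response.
        return f"I found: {observation}"
    return "I can answer that directly without a tool."

tools = {"search_vector_database": lambda q: "Le Clos Fleuri in Giverny"}
print(react_step("Find a hotel with free breakfast", tools))
```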

Step 6: Running the Agent

With the agent initialized, we can perform complex queries. The agent will:

  1. Receive the user input.
  2. Decide it needs to use the search_vector_database tool.
  3. Execute the search against Capella.
  4. Synthesize the results into a natural language response.

Example Output:

Agent: I found a hotel in Giverny that offers free breakfast called Le Clos Fleuri. It is located at 5 rue de la Dîme, 27620 Giverny. It offers free internet and parking as well.

Note: In Capella Model Services, model outputs can be cached (both semantic and standard caching). Caching enhances the agent’s efficiency and speed, particularly when dealing with repeated or similar queries. When a query is first processed, the LLM generates a response and stores it in Couchbase; when a similar query comes in later, the cached response is returned. The caching duration can be configured in Capella Model Services.

Adding Semantic Caching

Caching is particularly valuable in scenarios where users may submit similar queries multiple times or where certain pieces of information are frequently requested. By storing these in a cache, we can significantly reduce the time it takes to respond to these queries, improving the user experience.

Step 7: Observability With Arize Phoenix

In production, you need to know why an agent gave a specific answer. We use Arize Phoenix to trace the agent’s “thought process” (the ReAct chain).
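Phoenix instruments the LangChain chain automatically, but conceptually tracing reduces to recording a timed span for each step. The standard-library sketch below illustrates that idea; it is not the Phoenix API, and the span names are made up.

```python
import time
from contextlib import contextmanager

spans = []  # collected (name, duration_seconds) records

@contextmanager
def span(name: str):
    """Record how long a named step of the agent's chain takes,
    mimicking what a tracer captures for each span."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append((name, time.perf_counter() - start))

with span("agent_run"):
    with span("tool:search_vector_database"):
        time.sleep(0.01)  # stand-in for the vector search
    with span("llm:synthesize_answer"):
        time.sleep(0.01)  # stand-in for the LLM call

for name, seconds in spans:
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Note that inner spans close before the outer one, which is exactly the nesting a trace viewer like Phoenix reconstructs into a waterfall of tool calls and their latencies.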

We can also run evaluations to check for hallucinations or relevance.

By inspecting the Phoenix UI, you can visualize the exact sequence of tool calls and see the latency of each step in the chain.

Conclusion

We have successfully built a robust Hotel Search Agent. This architecture leverages:

  1. Couchbase AI Services: For a unified, low-latency data and AI layer.
  2. Agent Catalog: For organized, versioned management of agent tools and prompts. Agent Catalog also provides tracing, letting you query traces with SQL++, leverage the performance of Couchbase, and inspect the details of prompts and tools in the same platform.
  3. LangChain: For flexible orchestration.
  4. Arize Phoenix: For observability.

This approach scales well for teams building complex, multi-agent systems where data management and tool discovery are critical challenges.


Author

Posted by Laurent Doguin

Laurent is a nerdy metal head who lives in Paris. He mostly writes code in Java and structured text in AsciiDoc, and often talks about data, reactive programming and other buzzwordy stuff. He is also a former Developer Advocate for Clever Cloud and Nuxeo where he devoted his time and expertise to helping those communities grow bigger and stronger. He now runs Developer Relations at Couchbase.
