{"id":16002,"date":"2024-07-04T09:49:03","date_gmt":"2024-07-04T16:49:03","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=16002"},"modified":"2025-06-13T16:36:50","modified_gmt":"2025-06-13T23:36:50","slug":"accelerate-rag-ai-couchbase-nvidia","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/","title":{"rendered":"Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Today, we&#8217;re excited to announce our new integration with NVIDIA NIM\/NeMo. In this blog post, we present a solution concept for an interactive chatbot based on a <em>Retrieval Augmented Generation<\/em> (RAG)<\/span><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">architecture with Couchbase Capella as the vector database. The retrieval and generation phases of the RAG pipeline are accelerated by NVIDIA NIM\/NeMo with <\/span><span style=\"font-weight: 400;\">just a few lines of code.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Enterprises across various verticals strive to offer their customers the best service. To achieve this, they are arming their frontline workers, such as ER nurses, store sales associates, and help desk representatives, with AI-powered interactive question-and-answer (QA) chatbots that retrieve relevant and up-to-date information quickly. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Chatbots are usually based on <\/span><a href=\"https:\/\/www.couchbase.com\/blog\/an-overview-of-retrieval-augmented-generation\/\"><span style=\"font-weight: 400;\">RAG<\/span><\/a><span style=\"font-weight: 400;\">, an AI framework used for retrieving facts from the enterprise\u2019s knowledge base to ground LLM responses in the most accurate and recent information. 
It involves three distinct phases: retrieval of the most relevant context using <\/span><a href=\"https:\/\/www.couchbase.com\/products\/vector-search\/\"><span style=\"font-weight: 400;\">vector search<\/span><\/a><span style=\"font-weight: 400;\">, augmentation of the user\u2019s query with that context, and, finally, generation of relevant responses using an LLM.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The problem with existing RAG pipelines is that calls to the embedding service in the retrieval phase for converting user prompts into vectors can add significant latency, slowing down applications that require interactivity. Vectorizing a document corpus consisting of millions of PDFs, docs, and other knowledge bases can take a long time, increasing the likelihood of using stale data for RAG. Further, users find it challenging to accelerate inference (tokens\/sec) cost-efficiently to reduce the response time of their chatbot applications.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Figure 1 depicts a performant stack that will enable you to easily develop an <\/span><span style=\"font-weight: 400;\">interactive customer service chatbot. 
It consists of the Streamlit application framework, LangChain for orchestration, Couchbase Capella for indexing and searching vectors, and NVIDIA NIM\/NeMo for accelerating the retrieval and generation stages.<\/span><\/p>\n<div id=\"attachment_16003\" style=\"width: 910px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2024\/07\/image1-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16003\" class=\"wp-image-16003 size-large\" style=\"border: solid black 1px;\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2024\/07\/image1-1-1024x518.png\" alt=\"NVIDIA NIM\/NeMo and LangChain\" width=\"900\" height=\"455\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1-1024x518.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1-300x152.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1-768x389.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1-1320x668.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png 1345w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16003\" class=\"wp-caption-text\">Figure 1: Conceptual Architecture of a QA Chatbot built using Capella and NVIDIA NIM\/NeMo<\/p><\/div>\n<p><span style=\"font-weight: 400;\">Couchbase Capella, a high-performance database-as-a-service (DBaaS), allows you to get started quickly with storing, indexing, and querying operational, vector, text, time series, and geospatial data while leveraging the flexibility of JSON. 
You can use Capella for <\/span><a href=\"https:\/\/www.couchbase.com\/products\/vector-search\/\"><span style=\"font-weight: 400;\">vector search<\/span><\/a><span style=\"font-weight: 400;\"> or semantic search without the need for a separate vector database by integrating an orchestration framework such as <\/span><a href=\"https:\/\/www.langchain.com\/\"><span style=\"font-weight: 400;\">LangChain<\/span><\/a><span style=\"font-weight: 400;\"> or <\/span><a href=\"https:\/\/www.llamaindex.ai\/\"><span style=\"font-weight: 400;\">LlamaIndex<\/span><\/a><span style=\"font-weight: 400;\"> into your production RAG pipeline. It offers a <\/span><a href=\"https:\/\/www.couchbase.com\/blog\/hybrid-search\/\"><span style=\"font-weight: 400;\">hybrid search<\/span><\/a><span style=\"font-weight: 400;\"> capability, which blends vector search with traditional search to improve search performance significantly. Further, you can extend vector search to the edge using Couchbase Mobile for edge AI use cases.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once you have configured Capella Vector Search, you can proceed to choose a performant model from the <\/span><a href=\"https:\/\/build.nvidia.com\/explore\/discover\"><span style=\"font-weight: 400;\">NVIDIA API Catalog<\/span><\/a><span style=\"font-weight: 400;\">, which offers a broad spectrum of foundation models that span open-source, NVIDIA AI foundation, and custom models, optimized to deliver the best performance on NVIDIA accelerated infrastructure. These models are deployed as <\/span><a href=\"https:\/\/developer.nvidia.com\/blog\/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale\/?ref=blog.langchain.dev\"><span style=\"font-weight: 400;\">NVIDIA NIM<\/span><\/a><span style=\"font-weight: 400;\"> either on-prem or in the cloud using easy-to-use prebuilt containers via a single command. 
NeMo Retriever, <\/span><span style=\"font-weight: 400;\">a part of NVIDIA NeMo,<\/span><span style=\"font-weight: 400;\"> offers information retrieval with the lowest latency, highest throughput, and maximum data privacy.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The chatbot that we have developed using the aforementioned stack allows you to <\/span><span style=\"font-weight: 400;\">upload your PDF documents and ask questions interactively. It uses <em>NV-Embed-QA<\/em>, a GPU-accelerated text embedding model used for question-answer retrieval, and <\/span><a href=\"https:\/\/build.nvidia.com\/meta\/llama3-70b\"><span style=\"font-weight: 400;\">Llama 3 &#8211; 70B<\/span><\/a><span style=\"font-weight: 400;\">, which is packaged as a NIM and accelerated on NVIDIA infrastructure. The <\/span><a href=\"https:\/\/python.langchain.com\/v0.2\/docs\/integrations\/chat\/nvidia_ai_endpoints\/\"><span style=\"font-weight: 400;\">langchain-nvidia-ai-endpoints<\/span><\/a><span style=\"font-weight: 400;\"> package contains LangChain integrations for building applications with models on NVIDIA NIM. 
<\/span><span style=\"font-weight: 400;\">Although we have used NVIDIA-hosted endpoints for prototyping, we recommend using self-hosted NIM for production deployments; refer to the <\/span><a href=\"https:\/\/docs.nvidia.com\/nim\/large-language-models\/latest\/introduction.html?nvid=nv-int-tblg-432774\"><span style=\"font-weight: 400;\">NIM documentation<\/span><\/a><span style=\"font-weight: 400;\"> for details.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">You can use this solution to support use cases that require quick information retrieval, such as:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Enabling ER nurses to speed up triage through quick access to relevant healthcare information, helping alleviate overcrowding, long waits for care, and poor patient satisfaction.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Helping customer service agents discover relevant knowledge quickly via an internal knowledge-base chatbot to reduce caller wait times. This not only helps boost CSAT scores but also allows agents to manage high call volumes.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Helping in-store sales associates quickly discover and recommend catalog items similar to the picture or description of an item that a shopper requests but that is currently out of stock (a stockout), improving the shopping experience.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">In conclusion, you can develop an interactive GenAI application, like a chatbot, with grounded and relevant responses using Couchbase Capella-based RAG and accelerate it using NVIDIA NIM\/NeMo. <\/span><span style=\"font-weight: 400;\">This combination provides scalability, reliability, and ease of use. 
In addition to deploying alongside Capella for a DBaaS experience, NIM\/NeMo can be deployed with on-prem or self-managed Couchbase in public clouds within your VPC for use cases with stricter security and privacy requirements. Additionally, you can use <\/span><a href=\"https:\/\/developer.nvidia.com\/blog\/building-safer-llm-apps-with-langchain-templates-and-nvidia-nemo-guardrails\/\"><span style=\"font-weight: 400;\">NeMo Guardrails<\/span><\/a><span style=\"font-weight: 400;\"> to keep your LLM from producing content that your company deems objectionable. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">The details of the chatbot application can be found on the Couchbase <\/span><a href=\"https:\/\/github.com\/couchbase-examples\/couchbase-tutorials\/blob\/141424e68c18233c4ed47cc6321d38540ab4ca54\/tutorial\/markdown\/python\/nvidia-nim-llama3-pdf-chat\/nvidia-nim-llama3-pdf-chat.md\"><span style=\"font-weight: 400;\">Developer Portal<\/span><\/a><span style=\"font-weight: 400;\"> along with the <\/span><a href=\"https:\/\/github.com\/couchbase-examples\/nvidia-rag-demo\/blob\/main\/chat_with_pdf.py\"><span style=\"font-weight: 400;\">complete code<\/span><\/a><span style=\"font-weight: 400;\">. Sign up for a <\/span><a href=\"https:\/\/cloud.couchbase.com\/sign-up\"><span style=\"font-weight: 400;\">Capella trial account<\/span><\/a><span style=\"font-weight: 400;\"> and a free <\/span><a href=\"https:\/\/build.nvidia.com\/explore\/discover?signin_corporate=false&amp;signin=false\"><span style=\"font-weight: 400;\">NVIDIA NIM account<\/span><\/a><span style=\"font-weight: 400;\">, and start developing your GenAI application.\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today, we&#8217;re excited to announce our new integration with NVIDIA NIM\/NeMo. 
In this blog post, we present a solution concept of an interactive chatbot based on a Retrieval Augmented Generation (RAG)\u00a0architecture with Couchbase Capella as a Vector database. The retrieval [&hellip;]<\/p>\n","protected":false},"author":84768,"featured_media":16003,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[10122,2242,2225,1816,7666,9973,2389,9937],"tags":[9963,9989],"ppma_author":[9977,9981],"class_list":["post-16002","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence-ai","category-connectors","category-cloud","category-couchbase-server","category-edge-computing","category-generative-ai-genai","category-solutions","category-vector-search","tag-langchain","tag-nvidia"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.8 (Yoast SEO v25.8) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"Develop an interactive GenAI application with grounded and relevant responses using Couchbase Capella-based RAG and accelerate it using NVIDIA NIM\/NeMo\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain\" \/>\n<meta property=\"og:description\" content=\"Develop an interactive GenAI application with grounded and relevant responses using Couchbase Capella-based RAG and 
accelerate it using NVIDIA NIM\/NeMo\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2024-07-04T16:49:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-13T23:36:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1345\" \/>\n\t<meta property=\"og:image:height\" content=\"681\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Lokesh Goel, Software Engineer, Kiran Matty, Lead Product Manager AI\/ML\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Lokesh Goel, Software Engineer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\"},\"author\":{\"name\":\"Lokesh Goel, Developer Experience Engineer\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/a918924898a24c1cbcf6712bb6d62b4e\"},\"headline\":\"Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain\",\"datePublished\":\"2024-07-04T16:49:03+00:00\",\"dateModified\":\"2025-06-13T23:36:50+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\"},\"wordCount\":859,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png\",\"keywords\":[\"langchain\",\"NVIDIA\"],\"articleSection\":[\"Artificial Intelligence (AI)\",\"Connectors\",\"Couchbase Capella\",\"Couchbase Server\",\"Edge computing\",\"Generative AI (GenAI)\",\"Solutions\",\"Vector Search\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\",\"name\":\"Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain - The Couchbase 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png\",\"datePublished\":\"2024-07-04T16:49:03+00:00\",\"dateModified\":\"2025-06-13T23:36:50+00:00\",\"description\":\"Develop an interactive GenAI application with grounded and relevant responses using Couchbase Capella-based RAG and accelerate it using NVIDIA NIM\/NeMo\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/07\/image1-1.png\",\"width\":1345,\"height\":681,\"caption\":\"NVIDIA NIM\/NeMo and LangChain\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/accelerate-rag-ai-couchbase-nvidia\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Accelerate Couchbase-Powered RAG AI Application With NVIDIA NIM\/NeMo and LangChain\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL 
Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/a918924898a24c1cbcf6712bb6d62b4e\",\"name\":\"Lokesh Goel, Developer Experience Engineer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/28f42fa6eaa9ec33a742151714d1f0cb\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/8f2cb3333278f50e81806e9c068732cf57d7268c2b1ed80cc3dc9645151df405?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/8f2cb3333278f50e81806e9c068732cf57d7268c2b1ed80cc3dc9645151df405?s=96&d=mm&r=g\",\"caption\":\"Lokesh Goel, Developer Experience Engineer\"},\"url\":\"https:\/\/www.couchbase.com\/blog\/author\/lokeshgoel\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","authors":[{"term_id":9977,"user_id":84768,"is_guest":0,"slug":"lokeshgoel","display_name":"Lokesh Goel, Software Engineer","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/8f2cb3333278f50e81806e9c068732cf57d7268c2b1ed80cc3dc9645151df405?s=96&d=mm&r=g","author_category":"","last_name":"Goel, Software 
Engineer","first_name":"Lokesh","job_title":"","user_url":"","description":""},{"term_id":9981,"user_id":85346,"is_guest":0,"slug":"kiranmatty","display_name":"Kiran Matty, Lead Product Manager AI\/ML","avatar_url":{"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/06\/T024FJS4M-U064W1AETPD-456e21a66cf5-512.png","url2x":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/06\/T024FJS4M-U064W1AETPD-456e21a66cf5-512.png"},"author_category":"","last_name":"Matty, Lead Product Manager AI\/ML","first_name":"Kiran","job_title":"Lead Product Manager AI\/ML","user_url":"","description":""}]}