{"id":5073,"date":"2026-01-08T12:56:16","date_gmt":"2026-01-08T20:56:16","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/"},"modified":"2026-01-08T12:56:16","modified_gmt":"2026-01-08T20:56:16","slug":"filtered-ann-search-with-composite-vector-indexes","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/","title":{"rendered":"Filtered ANN Search With Composite Vector Indexes (Part 1)"},"content":{"rendered":"\n<p><span>This post kicks off a multi-part series on composite vector indexing in Couchbase. We will start by building intuition, then progressively dive into internals, execution optimizations, and performance.<\/span><\/p>\n\n\n\n<p>The series will cover:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><span>Why composite vector indexes matter, including concepts, terminology, and developer motivation. A Smart Grocery Recommendation System will be used as a running example.<\/span><\/li>\n\n\n<li><a href=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes-2\/\"><span>How composite vector indexes are implemented inside the Couchbase Indexing Service.<\/span><\/a><\/li>\n\n\n<li><a href=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-ive-composite-vector-indexes\/\"><span>How ORDER BY pushdown works for composite vector queries.<\/span><\/a><\/li>\n\n\n<li><a href=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes-part-4\/\"><span>Real-world performance behavior and benchmarking results.<\/span><\/a><\/li>\n\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span>Smart Grocery Recommendation System With Filtered ANN<\/span><\/h2>\n\n\n\n<p><span>Imagine you&#8217;re building a grocery-recommendation app. 
<\/span><\/p>\n\n\n\n<p><span>A user opens it on a Sunday morning and types:<\/span><\/p>\n\n\n\n<p><span>\u201cI love dark chocolate spread, but I\u2019m trying to cut sugar and add more protein. What else should I buy?\u201d<\/span><\/p>\n\n\n\n<p><span>At this moment, your system needs to understand the user\u2019s intent, compare food items semantically, and apply strict nutritional filters.<\/span><\/p>\n\n\n\n<p><span>This is exactly where Filtered Approximate Nearest Neighbor (Filtered ANN) comes in:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Your ANN layer first finds semantically similar items\/foods that \u201cfeel like\u201d dark chocolate spread in flavor profile, texture, or category.\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Then your filtering layer steps in to remove anything with high sugar, keep items above a certain protein threshold,\u00a0 and maybe enforce dietary preferences (vegan, keto, nut-free).<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>The result? <\/span><span>A recommendation engine that understands both meaning and constraints just like a smart store associate who knows your taste and considers your goals.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Before We Get to FANN, Let\u2019s Build Intuition<\/span><\/h3>\n\n\n\n<p><b>NN (Nearest Neighbor):<\/b><span> Finding the <\/span><b>most similar thing<\/b><span> to what you have. It&#8217;s like asking, \u201cWhich food in my list tastes most like this chocolate spread?\u201d<\/span><\/p>\n\n\n\n<p><b>ANN (Approximate Nearest Neighbor):<\/b><span> Finding <\/span><b>something very similar<\/b><span>, but faster. 
It&#8217;s like saying, \u201cI don\u2019t need the <\/span><i><span>perfect<\/span><\/i><span> match, just something that\u2019s <\/span><i><span>close enough<\/span><\/i><span> quickly.\u201d<\/span><span><br>\n<\/span><\/p>\n\n\n\n<p><b>FANN (Filtered Approximate Nearest Neighbor):<\/b><span> Finding <\/span><b>something close enough<\/b><span> but <\/span><b>only among items that meet certain rules<\/b><span>. It&#8217;s like saying, \u201cShow me foods similar to chocolate spread, but only the ones that are low in sugar and high in protein.\u201d<\/span><\/p>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17802\" src=\"https:\/\/www.couchbase.com\/wp-content\/uploads\/sites\/5\/2026\/05\/Screenshot-2026-01-07-at-2.41.04-PM.png\" alt=\"\" width=\"1214\" height=\"672\"><\/p>\n\n\n\n<p><span>ANN algorithms trade a bit of <\/span><i><span>effectiveness<\/span><\/i><span> (accuracy) for much greater <\/span><i><span>efficiency<\/span><\/i><span> (speed and memory).<\/span><\/p>\n\n\n\n<p><span>A <\/span><b>composite index<\/b><span> is an index built on <\/span><b>multiple fields (columns)<\/b><span> together, not just one. For e<\/span><span>xample, it&#8217;s like sorting a spreadsheet first by Category, then by Sugar, then by Protein. <\/span><span>This ordering method groups all chocolate spreads together first. 
<\/span><span>Within that group, you can quickly find low-sugar, high-protein products without scanning everything.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Why Traditional Indexes Fail<\/span><\/h3>\n\n\n\n<p><span>Assume you have a small subset of the World Food Facts dataset loaded into memory as:<\/span><\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]type Food struct {<br \/>\n    ID            string<br \/>\n    ProductName   string<br \/>\n    Category      string<br \/>\n    Description   string<br \/>\n    Sugars100g    float64<br \/>\n    Proteins100g  float64<br \/>\n    Tags          []string<br \/>\n    Ingredients   []string<br \/>\n    &#8230;<br \/>\n    Country       string<br \/>\n}<br \/>\n[\/crayon]<\/p>\n\n\n\n<p><span>To find foods like dark chocolate spreads that are low in sugar and high in protein, you can use a query like the one below:<\/span><\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]SELECT product_name<br \/>\nFROM food<br \/>\nWHERE category = &quot;chocolate_spread&quot;<br \/>\n  AND sugars_100g &lt; 20 AND proteins_100g &gt; 10;<br \/>\n[\/crayon]<\/p>\n\n\n\n<p><span>To speed up the query, you can use a composite secondary index like the one below:<\/span><\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]CREATE INDEX idx_food ON food(category, sugars_100g, proteins_100g, product_name);<br \/>\n[\/crayon]<\/p>\n\n\n\n<p><span>Composite secondary indexes can be viewed as sorted lists of concatenated keys that enable faster lookups for specific values or iteration across a range of low to high values (i.e., range scan). 
These lookup values, as well as the high and low values, are constructed at query time using the query predicates.<\/span><\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]&#8230;<br \/>\n(&#8220;almond_butter&#8221;, 15, 20, &#8220;Almond butter with chocolate chips&#8221;)<br \/>\n(&#8220;chocolate_spread&#8221;, 19, 7, &#8220;Chocolate spread with nuts&#8221;)<br \/>\n(&#8220;chocolate_spread&#8221;, 20, 4, &#8220;Creamy chocolate spread&#8221;)<br \/>\n(&#8220;chocolate_spread&#8221;, 23, 6, &#8220;Chocolate spread with honey&#8221;)<br \/>\n(&#8220;chocolate_spread&#8221;, 25, 5, &#8220;Coffee chocolate spread&#8221;)<br \/>\n(&#8220;milk_chocolate&#8221;, 4, 6, &#8220;Milk chocolate spread&#8221;)<br \/>\n(&#8220;peanut_butter&#8221;, 19, 30, &#8220;Chocolate flavored peanut butter&#8221;)<br \/>\n&#8230;<\/p>\n<p>[\/crayon]<\/p>\n\n\n\n<p><span>Composite indexes work great for structured lookups.<\/span><\/p>\n\n\n\n<p><span>But a category filter can never find:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>chocolate-flavored nut butters<\/span><\/li>\n\n\n<li><span>chocolate-protein spreads<\/span><\/li>\n\n\n<li><span>hazelnut cocoa blends<\/span><\/li>\n\n\n<li><span>chocolate protein bars<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>\u2026even though a human instantly knows they are relatives of chocolate spreads.<\/span><\/p>\n\n\n\n<p><span>Traditional indexes only match structure, not meaning. 
This is why category-based range scans fail.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>How Filtered ANN Works<\/span><\/h3>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17803\" src=\"https:\/\/www.couchbase.com\/wp-content\/uploads\/sites\/5\/2026\/05\/Screenshot-2026-01-07-at-2.48.11-PM.png\" alt=\"\" width=\"1272\" height=\"228\">You can convert the query and data into vectors<\/p>\n\n\n\n<p><span>The user\u2019s sentence is fed into an embedding model (e.g., OpenAI, Cohere, or a domain-specific model).\u00a0\u00a0<\/span><\/p>\n\n\n\n<p><span>The result is a dense vector that captures concepts like:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>chocolate-like flavor\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>spreadable texture\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>dessert\/snack category\u00a0\u00a0<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>This vector represents what the user wants as opposed to just the literal words.<\/span><\/p>\n\n\n\n<p><span>Next, you can find nearest neighbors (semantic similarity).<\/span><\/p>\n\n\n\n<p><span>Candidates might include:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Hazelnut cocoa spread\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Chocolate almond butter\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Cocoa protein spread\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Chocolate tahini\u00a0\u00a0<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>But not all are healthy options, and the user specifically asked for low sugar and high protein.<\/span><\/p>\n\n\n\n<p><span>You can apply strict filters, which <\/span><span>is the \u201cFiltered\u201d part of Filtered ANN. 
<\/span><\/p>\n\n\n\n<p><span>You can filter out items:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Sugar &gt; threshold (e.g., &gt;5g per serving)\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Protein &lt; threshold (e.g., &lt;8g per serving)\u00a0\u00a0<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>Your system may also combine metadata filters:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Only vegan\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>No palm oil\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>No nuts\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>Under $10\u00a0\u00a0<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>What remains is a set of items that match both meaning and constraints.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Why Solely Using Filters Does Not Work<\/span><\/h3>\n\n\n\n<p><span>Using only filters, you would get:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Any high\u2011protein, low\u2011sugar product\u00a0\u00a0<\/span><\/li>\n\n\n<li><span>As well as items unrelated to chocolate (like tofu, Greek yogurt, chicken breast)<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>But the user wants something &#8220;similar to chocolate spread.&#8221;<\/span><\/p>\n\n\n\n<p><span>Filtered ANN = Personalization + Constraints.<\/span> <span>It mimics how a human store associate would answer the request: <\/span><span>\u201cIf you want something like chocolate spread but healthier, try this\u2026\u201d<\/span><\/p>\n\n\n\n<p><span>Behind the scenes, however, your recommendation engine faces a subtle but serious problem. <\/span><span>Modern vector databases say they can do \u201chybrid search,\u201d but they usually keep scalar fields like sugar or protein off to the side, as plain metadata. 
The ANN index has no idea how to use them.<\/span><\/p>\n\n\n\n<p><span>So what happens?<\/span><\/p>\n\n\n\n<p><span>The system first pulls in a huge batch of vector-similar candidates\u2026 and only then starts checking nutrition rules like sugars_100g &lt; 20 or proteins_100g &gt; 10.<\/span><\/p>\n\n\n\n<p><span>It\u2019s like a store employee bringing out every chocolate-related product from the back room, placing them on the counter, and then saying:<\/span><\/p>\n\n\n\n<p><span>\u201cOh wait, you wanted low-sugar? High-protein? Let me throw most of these away.\u201d<\/span><\/p>\n\n\n\n<p><span>Some vector systems try to filter earlier during graph traversal, but they still can\u2019t do real range filtering or prefix pruning. They must fetch and decode every candidate before deciding whether to throw it out.<\/span><\/p>\n\n\n\n<p><span>What does this mean for your app?<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>More disk reads<\/span><\/li>\n\n\n<li><span>More distance calculations<\/span><\/li>\n\n\n<li><span>More latency<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>&#8230;And a lot of wasted work for results the user will never see.<\/span><\/p>\n\n\n\n<p><span>This is exactly why a composite vector index that merges vector similarity and scalar pruning into the same index is a game-changer.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Composite Vector Indexes \u2013 Overview<\/span><\/h3>\n\n\n\n<p><span>Step 1: Embeddings Layer \u2013 Create Vector Embeddings<\/span><\/p>\n\n\n\n<p><span>Each product&#8217;s text description (tags, product name, category, ingredients) is converted into a high-dimensional vector using a language model. 
Products with similar meanings will have similar vectors.<\/span><\/p>\n\n\n\n<p><span>For example, embeddings for product names:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>&#8220;dark chocolate spread&#8221; \u2192 [0.23, -0.15, 0.87, &#8230;] (384 dimensions)<\/span><\/li>\n\n\n<li><span>&#8220;chocolate hazelnut butter&#8221; \u2192 [0.25, -0.12, 0.85, &#8230;] (similar vector)<\/span><\/li>\n\n\n<li><span>&#8220;chocolate protein bar&#8221; \u2192 [0.18, -0.08, 0.79, &#8230;] (somewhat similar)<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>Step 2: FANN Index \u2013 Build Composite Vector Index<\/span><\/p>\n\n\n\n<p><span>Create a vector index (e.g., Couchbase Vector Index, FAISS) that can quickly find nearest neighbors in the embedding space.<\/span><\/p>\n\n\n\n<p><span>How are vectors different from other datatypes in a composite vector index?<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Vectors do not have a natural total order, so the sort order of a vector field cannot be fixed when the index is built.<\/span>\n<ul>\n<li><span>Vector fields do not support conventional comparison predicates (such as equality or range filters) in the WHERE clause.<\/span><\/li>\n<li><span>Instead, vector fields are used in ORDER BY with vector distance functions, and may participate in query planning via those expressions.<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>Ordering is done at scan time using similarity to a query vector. 
The similarity function is chosen by the user as needed for the data and application.<\/span>\n<ul>\n<li><span>APPROX_VECTOR_DISTANCE can be used in the ORDER BY clause and is efficiently supported when a compatible vector index exists; otherwise, it results in a full scan.<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>Because an individual dimension of a vector has no standalone meaning, the only useful question is \u201chow similar are two vectors?\u201d So you can only search for the nearest neighbors, that is, the most similar elements.<\/span>\n<ul>\n<li><span>The similarity function and the query vector need to be provided as input at query time.<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>Nearest neighbor search is a computationally intensive problem that only gets harder as the number of vector dimensions grows. So you need a time- and space-efficient solution that returns approximate results.<\/span>\n<ul>\n<li><span>Quantization methods are specified in the description parameter of the DDL.<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>You will have to reduce the number of comparisons at query time for faster querying.<\/span>\n<ul>\n<li><span>The number of centroids and the nprobes value help reduce the search space.<\/span><\/li>\n<\/ul>\n<\/li>\n\n<\/ul>\n\n\n\n<p>A composite vector index is an index in which one of the keys carries the VECTOR attribute, while parameters such as dimension, similarity, and description qualify that vector.<\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]CREATE INDEX idx_vec ON food(sugars_100g, proteins_100g, text_vector VECTOR, product_name) WITH { &quot;dimension&quot;: 384, &quot;similarity&quot;: &quot;L2&quot;, &quot;description&quot;: &quot;IVF,SQ8&quot; };<br \/>\n[\/crayon]<\/p>\n\n\n\n<p><span>In this definition, the VECTOR keyword explicitly marks text_vector as a vector attribute. This is necessary because, at the JSON level, a vector embedding is stored as a simple array of floating-point numbers. 
Without the vector annotation, GSI (the Global Secondary Index service) would treat the field as an ordinary array and apply standard indexing semantics.<\/span><\/p>\n\n\n\n<p><span>By declaring a field as a vector, the user establishes an explicit contract with the GSI service that:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>The index will contain a single vector key, and that key represents the embedding used for vector similarity search in this index.<\/span><\/li>\n\n\n<li><span>The application is responsible for generating the vector embedding (for example, using an external embedding model) and persisting it in the specified document field.<\/span><\/li>\n\n\n<li><span>The GSI service must interpret the field semantically as a vector embedding and build vector-aware index structures optimized for Approximate Nearest Neighbor (ANN) search, rather than using conventional scalar or array indexing logic.<\/span><\/li>\n\n<\/ul>\n\n\n\n<p>In the vector index DDL, the user must specify a few extra parameters:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Dimension: the length of the vector embeddings<\/span><\/li>\n\n\n<li><span>Similarity: the distance metric used for ANN search<\/span><\/li>\n\n\n<li><span>Description: a FAISS-style index description string that sets the accuracy vs. speed trade-off<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>In the above example:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>We created 384-dimensional embeddings from the tags, product name, category, and ingredients fields using the sentence-transformers\/all-MiniLM-L6-v2 model and stored them in the text_vector field of each document.<\/span><\/li>\n\n\n<li><span>We used an IVF coarse quantizer with the default number of centroids and SQ8 quantization.<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>Step 3: Filtered ANN Query<\/span><\/p>\n\n\n\n<p><span>Instead of filtering by exact category, we:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Generate an embedding for the query &#8220;dark chocolate 
spread.&#8221;<\/span>\n<ul>\n<li><span>query_text = &#8220;dark chocolate spread&#8221;<\/span><\/li>\n<li><span>query_embedding = [0.23, -0.15, 0.87, 0.42, &#8230;, -0.31] # 384-dimensional vector<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>Use ANN search to find the top-k most similar products (e.g., top 10) among those that meet our criteria (sugars_100g &lt; 20 AND proteins_100g &gt; 10).<\/span><\/li>\n\n\n<li><span>Return the top matches.<\/span><\/li>\n\n<\/ul>\n\n\n\n<p>SQL++ Example (Couchbase):<\/p>\n\n\n<p>[crayon lang=&#8221;default&#8221; decode=&#8221;true&#8221;]SELECT product_name<br \/>\nFROM food<br \/>\nWHERE sugars_100g &lt; 20 AND proteins_100g &gt; 10<br \/>\nORDER BY APPROX_VECTOR_DISTANCE(text_vector, [query_embedding], &#39;L2&#39;)<br \/>\nLIMIT 10;<br \/>\n[\/crayon]<\/p>\n\n\n\n<p><span>Key Advantages<\/span><\/p>\n\n\n\n<p><span>This approach finds products that:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>Are semantically similar to &#8220;dark chocolate spread&#8221; (using vector search).<\/span><\/li>\n\n\n<li><span>Meet the nutritional filters (low sugar, high protein).<\/span><\/li>\n\n\n<li><span>May come from different categories, such as &#8220;chocolate protein bars,&#8221; &#8220;nut butter spreads,&#8221; or &#8220;chocolate-flavored snacks,&#8221; which are similar in meaning but would not pass a &#8220;chocolate spreads&#8221; category filter.<\/span><\/li>\n\n<\/ul>\n\n\n\n<p><span>Learn more about composite vector indexes in the next part of this series, where we will answer practical questions such as:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span>How are vector embeddings stored and organized efficiently inside the index layer?<\/span><\/li>\n\n\n<li><span>Can a composite vector index answer scalar-only queries without reading the full document?<\/span><\/li>\n\n\n<li><span>Does the order of scalar fields and vector fields in the index definition 
matter?<\/span><\/li>\n\n<\/ul>\n\n\n\n<p>Dive deeper into the mechanics of composite vector indexing by checking out the <a href=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes-2\/\">second post<\/a> in this series, where we explore its implementation within Couchbase.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post kicks off a multi-part series on composite vector indexing in Couchbase. We will start by building intuition, then progressively dive into internals, execution optimizations, and performance. The series will cover: Smart Grocery Recommendation System With Filtered ANN Imagine you&#8217;re building a grocery-recommendation app. A user opens it on a Sunday morning and types: [&hellip;]<\/p>\n","protected":false},"author":85690,"featured_media":5070,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[715],"tags":[],"ppma_author":[1024],"class_list":["post-5073","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-vector-search"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.6 (Yoast SEO v27.6) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Filtered ANN Search With Composite Vector Indexes (Part 1) - The Couchbase Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Filtered ANN Search With Composite Vector Indexes (Part 1)\" \/>\n<meta property=\"og:description\" content=\"This post kicks off a multi-part series on composite vector 
indexing in Couchbase. We will start by building intuition, then progressively dive into internals, execution optimizations, and performance. The series will cover: Smart Grocery Recommendation System With Filtered ANN Imagine you&#8217;re building a grocery-recommendation app. A user opens it on a Sunday morning and types: [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2026-01-08T20:56:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes-1024x536.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"536\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Sai Kommaraju, Senior Software Engineer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sai Kommaraju, Senior Software Engineer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/\"},\"author\":{\"name\":\"Sai Kommaraju, Senior Software Engineer\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/8fb575d74280ff3d0f044904277a8076\"},\"headline\":\"Filtered ANN Search With Composite Vector Indexes (Part 1)\",\"datePublished\":\"2026-01-08T20:56:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/\"},\"wordCount\":1981,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png\",\"articleSection\":[\"Vector Search\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/\",\"name\":\"Filtered ANN Search With Composite Vector Indexes (Part 1) - The Couchbase 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png\",\"datePublished\":\"2026-01-08T20:56:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png\",\"width\":2400,\"height\":1256},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/filtered-ann-search-with-composite-vector-indexes\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Filtered ANN Search With Composite Vector Indexes (Part 
1)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/06\\\/logo.svg\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/06\\\/logo.svg\",\"width\":\"1024\",\"height\":\"1024\",\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/8fb575d74280ff3d0f044904277a8076\",\"name\":\"Sai Kommaraju, Senior Software 
Engineer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g35c0f86ae57fcc6e389d9889b832eed3\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g\",\"caption\":\"Sai Kommaraju, Senior Software Engineer\"},\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/author\\\/saikommaraju\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Filtered ANN Search With Composite Vector Indexes (Part 1) - The Couchbase Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/","og_locale":"en_US","og_type":"article","og_title":"Filtered ANN Search With Composite Vector Indexes (Part 1)","og_description":"This post kicks off a multi-part series on composite vector indexing in Couchbase. We will start by building intuition, then progressively dive into internals, execution optimizations, and performance. The series will cover: Smart Grocery Recommendation System With Filtered ANN Imagine you&#8217;re building a grocery-recommendation app. 
A user opens it on a Sunday morning and types: [&hellip;]","og_url":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/","og_site_name":"The Couchbase Blog","article_published_time":"2026-01-08T20:56:16+00:00","og_image":[{"width":1024,"height":536,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes-1024x536.png","type":"image\/png"}],"author":"Sai Kommaraju, Senior Software Engineer","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Sai Kommaraju, Senior Software Engineer","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/"},"author":{"name":"Sai Kommaraju, Senior Software Engineer","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/8fb575d74280ff3d0f044904277a8076"},"headline":"Filtered ANN Search With Composite Vector Indexes (Part 1)","datePublished":"2026-01-08T20:56:16+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/"},"wordCount":1981,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png","articleSection":["Vector 
Search"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/","url":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/","name":"Filtered ANN Search With Composite Vector Indexes (Part 1) - The Couchbase Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png","datePublished":"2026-01-08T20:56:16+00:00","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Filtered-ANN-Search-with-Composite-Vector-Indexes.png","width":2400,"height":1256},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/filtered-ann-search-with-composite-vector-indexes\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"
@type":"ListItem","position":2,"name":"Filtered ANN Search With Composite Vector Indexes (Part 1)"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/06\/logo.svg","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/06\/logo.svg","width":"1024","height":"1024","caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/8fb575d74280ff3d0f044904277a8076","name":"Sai Kommaraju, Senior Software Engineer","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g35c0f86ae57fcc6e389d9889b832eed3","url":"https:\/\/secure.gravatar.com\/avatar\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b8a56cb763c8e864050caaf5444778b8d7c93bd2960edfc078ca1e10d4fa3b51?s=96&d=mm&r=g","caption":"Sai Kommaraju, Senior Software 
Engineer"},"url":"https:\/\/www.couchbase.com\/blog\/author\/saikommaraju\/"}]}},"acf":[],"authors":[{"term_id":1024,"user_id":85690,"is_guest":0,"slug":"saikommaraju","display_name":"Sai Kommaraju, Senior Software Engineer","avatar_url":{"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Sai-Kommaraju-7.jpeg","url2x":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/Sai-Kommaraju-7.jpeg"},"0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/5073","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/85690"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=5073"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/5073\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/5070"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=5073"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=5073"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=5073"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=5073"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}