{"id":3588,"date":"2024-05-15T15:19:30","date_gmt":"2024-05-15T22:19:30","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"},"modified":"2024-05-15T15:19:30","modified_gmt":"2024-05-15T22:19:30","slug":"vector-search-indexing-recall-faiss","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/ko\/vector-search-indexing-recall-faiss\/","title":{"rendered":"Vector Search Performance: The Rise of Recall"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><span>Introducing vector search (KNN), with its distance-based similarity scoring, into the existing Search paradigm necessitated a shift in how we thought about \u201crelevant\u201d results and how to measure them.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Text based indexes use<\/span> <a href=\"https:\/\/docs.couchbase.com\/server\/current\/fts\/fts-scoring.html#scoring-td-idf\"><i><span>tf-idf<\/span><\/i><\/a><span> as their scoring mechanism with the results remaining the same across searches, given a fixed corpus of words (here, the documents in a partition).<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>In contrast, a KNN search does not guarantee the same level of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Idempotence\">idempotency<\/a>. The results are <\/span><i><span>approximate<\/span><\/i><span>, often differing between queries. This article is about the Search team pivoting from exact to approximate. 
Along the way, we answer questions about why <\/span><i><span>approximate results can become the new normal<\/span><\/i><span> and how much approximation is acceptable.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Setting the Stage\u00a0<\/span><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Each Search index partition is a <\/span><a href=\"https:\/\/blevesearch.com\/\"><span>Bleve<\/span><\/a><span> index comprising multiple zap segments (<\/span><i><span>segmented architecture<\/span><\/i><span>), with each segment containing a vector index. These are periodically compacted by a merger routine for the Bleve index.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Search uses <\/span><a href=\"https:\/\/github.com\/facebookresearch\/faiss\"><span>FAISS<\/span><\/a><span> for the vector index creation, training, searching and some more related functionality.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The two broad classes of indexes currently used by Couchbase Search are:<\/span><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><span>Flat indexes &#8211; perform exhaustive search, akin to storing the vectors in an array.<\/span><\/li>\n\n\n<li><a href=\"https:\/\/github.com\/facebookresearch\/faiss\/wiki\/Faiss-indexes#cell-probe-methods-indexivf-indexes\"><span>IVF indexes<\/span><\/a><span> &#8211; centroid-based indexes which involve clustering (KMeans in this case) the dataset and then populating those clusters.<\/span><\/li>\n\n<\/ol>\n\n\n\n<h4 class=\"wp-block-heading\"><span>A Brief about KNN\u00a0<\/span><\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Like I touched upon earlier, the testware (and equally important, our thinking!) was predisposed to exact scoring. Text-based search is <\/span><i><span>fundamentally exhaustive<\/span><\/i><span> in that an inverted index includes all the tokens in the partition\u2019s documents. 
All the documents in an inverted index are <\/span><i><span>eligible <\/span><\/i><span>for the search and will be searched through.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>A centroid-based vector index, by comparison, <\/span><i><span>limits the pool of eligible vectors<\/span><\/i><span> right off the bat &#8211; by only searching through specific clusters, which may or may not be the same for each query. This means that for a given query, potentially time consuming, exhaustive search is traded off for approximation.\u00a0\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>(If the word \u2018recall\u2019 threw you off for a bit, sit tight &#8211; we will be coming to that in just a bit).<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Considering that we limit our search space right at the beginning, it\u2019s important to \u201ccluster right\u201d.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>One of the first steps in a search involves picking how many and which clusters to search through. Too few and you end up missing out on some potentially similar vectors. Too many and search latency increases significantly for a relatively small increase in search quality.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The metric used for search quality is <\/span><b>recall<\/b><span> &#8211; what percentage of the returned results are objectively the closest to the query vectors. The set of the vectors closest to the query vector are called the ground truth and are used as a baseline when measuring recall. 
Since the KNN score is the distance between two vectors, it is <\/span><i><span>independent of the other documents<\/span><\/i><span> in the partition (unlike in tf-idf) and more importantly, this helps in an objective comparison between independently evaluated ground truth and the result.\u00a0<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>How Much (Approximation) is Too Much (Approximation): Driving Recall from 0.06 to 90+<\/span><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Armed with all this knowledge, we decided to start testing for recall. Our early tests showed a surprisingly low recall of close to 0 &#8211; 0.06 to be precise.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>While we had offloaded the vector index search to FAISS, there were some aspects we needed to handle from our end. One of them was mapping document IDs to vectors. Search maps each document number to a unique vector hash. These hashes are then passed as <\/span><a href=\"https:\/\/github.com\/blevesearch\/go-faiss\/blob\/master\/index.go#L45\"><span>custom IDs to FAISS<\/span><\/a><span> to leverage its support for mapping vectors to custom IDs, so that searches return those custom IDs. Since the vectors (all of the same size) are concatenated into one large array, and the IDs into another, the relative ordering of the two determines which ID each vector maps to. Internally, FAISS uses a hashmap to store vectors and their IDs.<\/span><\/p>\n\n\n<p>[crayon nums=&#8221;false&#8221; lang=&#8221;default&#8221; decode=&#8221;true&#8221;][&lt;vec1&gt;,&lt;vec2&gt;,&#8230;&lt;vec_n&gt;],[id1,id2,&#8230;id_n] =&gt; vec1 \u2192 id1, vec2 \u2192 id2, \u2026 vec_n \u2192 id_n[\/crayon]<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>A closer look showed that we were mapping randomly ordered IDs to vectors when rebuilding indexes during a merge, resulting in the result set being essentially random. 
This affected both flat and IVF indexes since they both relied on the <\/span><i><span>ordering <\/span><\/i><span>of the IDs when retrieving results.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Once the ordering issue was solved, along with some other merge path fixes, the recall jumped to around 70. We were now on the right track &#8211; we didn\u2019t have any fundamental bugs plaguing us. We started taking a look at the knobs we could tune.\u00a0<\/span><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span>Turning Knobs &#8211; Centroids and Nprobe<\/span><\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The initial strategy used a fixed number (100) of centroids for all vector indexes with more than 10k vectors. In essence, this was treating 1M vectors the same as 20k vectors.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>FAISS clustering defaults have a minimum (39) and maximum (256) number of points per cluster. Any training points beyond the maximum are dropped by subsampling. 100 centroids may have been enough for at most 100 * 256 = 25,600 vectors, but for anything over that, there was <\/span><i><span>excessive<\/span><\/i><span> subsampling taking place, as reflected in the recall. What we needed was a formula for centroids which scaled with the dataset.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>What we\u2019re looking to optimise for: <em>Recall@K<\/em>, without indexing and search latency taking too much of a hit, if possible.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span><strong>Setup<\/strong><\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The setup was fairly simple &#8211; scripts that create a FAISS index (training it and adding IDs) and query it, with the ground truth results known beforehand. 
I used the <\/span><span>SIFT10K and SIFT1M<\/span><span> datasets from the <\/span><span>original paper<\/span><span> since they provided ground truth vectors using Euclidean distance. Recall@K was the mean recall over 100 queries for SIFT10K and 10,000 queries for SIFT1M.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Increasing centroids<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The first phase involved tweaking the number of clusters.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Sift1M results &#8211; <span>10,000 queries<\/span><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<tbody>\n<tr>\n<td><strong># centroids<\/strong><\/td>\n<td><strong>Training time (s)<\/strong><\/td>\n<td><strong>Search time (s)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span>100<\/span><\/td>\n<td><span>1.83<\/span><\/td>\n<td><span>20.72<\/span><\/td>\n<td><span>0.61<\/span><\/td>\n<\/tr>\n<tr>\n<td><span>200<\/span><\/td>\n<td><span>1.856<\/span><\/td>\n<td><span>14.423<\/span><\/td>\n<td><span>0.558<\/span><\/td>\n<\/tr>\n<tr>\n<td><span>500<\/span><\/td>\n<td><span>4.75<\/span><\/td>\n<td><span>4.101<\/span><\/td>\n<td><span>0.4833<\/span><\/td>\n<\/tr>\n<tr>\n<td><span>1000<\/span><\/td>\n<td><span>15.13<\/span><\/td>\n<td><span>2.4113<\/span><\/td>\n<td><span>0.43<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Sift10k results &#8211; <\/span><span>100 queries<\/span><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<tbody>\n<tr>\n<td><strong># centroids<\/strong><\/td>\n<td><strong>Training time (ms)<\/strong><\/td>\n<td><strong>Search time 
(ms)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span>10<\/span><\/td>\n<td><span>30.78<\/span><\/td>\n<td><span>1370<\/span><\/td>\n<td><span>0.82<\/span><\/td>\n<\/tr>\n<tr>\n<td><span>50<\/span><\/td>\n<td><span>103<\/span><\/td>\n<td><span>368.9<\/span><\/td>\n<td><span>0.69<\/span><\/td>\n<\/tr>\n<tr>\n<td><span>100<\/span><\/td>\n<td><span>100<\/span><\/td>\n<td><span>188.48<\/span><\/td>\n<td><span>0.6<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Insights<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><span>The current baseline (100 centroids on the 1M dataset) shows a recall of 0.61, which can definitely be improved.<\/span><\/li>\n\n\n<li><span>The recall <\/span><i><span>decreases<\/span><\/i><span> with an increase in the number of centroids.\u00a0<\/span><\/li>\n\n\n<li><span>Search time decreases due to<\/span> <i><span>increasing localization<\/span><\/i><span> even as training time increases.<\/span><\/li>\n\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Conversely, with fewer centroids, search time increases despite the lower training time, since that entails searching larger cells containing more vectors.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Now that it\u2019s been established that increasing centroids on its own has a negative impact on recall, let\u2019s try to intuitively understand why that\u2019s so.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>With a fixed-size dataset, increasing the number of centroids decreases the average number of vectors in each cluster. 
With smaller clusters, we <\/span><i><span>search fewer vectors overall<\/span><\/i><span> and pick the K closest vectors.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Hence, an increase in the number of clusters should be accompanied by a <\/span><i><span>corresponding increase in the number of clusters searched.<\/span><\/i><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Increasing nprobe<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Sift1M &#8211; <\/span>10,000 queries<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<tbody>\n<tr>\n<td><strong>nlist<\/strong><\/td>\n<td><strong>nprobe<\/strong><\/td>\n<td><strong>Training time(s)<\/strong><\/td>\n<td><strong>Total Search time(s)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span>100 (current baseline)<\/span><\/td>\n<td><span>1<\/span><\/td>\n<td><span>1.43<\/span><\/td>\n<td><span>21.24\u00a0<\/span><\/td>\n<td><span><b>0.61<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>100<\/span><\/td>\n<td><span>10<\/span><\/td>\n<td><span>0.778<\/span><\/td>\n<td><span>119.5<\/span><\/td>\n<td><span><b>0.993<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>200<\/span><\/td>\n<td><span>14<\/span><\/td>\n<td><span>1.12<\/span><\/td>\n<td><span>84.54<\/span><\/td>\n<td><span><b>0.99<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>500<\/span><\/td>\n<td><span>22<\/span><\/td>\n<td><span>3.23<\/span><\/td>\n<td><span>52.80<\/span><\/td>\n<td><span><b>0.988<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>1000<\/span><\/td>\n<td><span>31<\/span><\/td>\n<td><span>10.033<\/span><\/td>\n<td><span>37.79<\/span><\/td>\n<td><span><b>0.988<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>2000<\/span><\/td>\n<td><span>44<\/span><\/td>\n<td><span>36.36<\/span><\/td>\n<td><span>27.61<\/span><\/td>\n<td><span><b>0.985<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>3000<\/span><\/td>\n<td><span>54<\/span><\/td>\n<td><span>80.94<\/span><\/td>\n<td><span>22.74<\/span><\/td>\n<td><span><b>0.985<\/b><\/
span><\/td>\n<\/tr>\n<tr>\n<td><span>3906 (1M\/256)<\/span><\/td>\n<td><span>62<\/span><\/td>\n<td><span>134.61<\/span><\/td>\n<td><span>20<\/span><\/td>\n<td><span><b>0.984<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>4000<\/span><\/td>\n<td><span>32<\/span><\/td>\n<td><span>136.71<\/span><\/td>\n<td><span>10.09<\/span><\/td>\n<td><span><b>0.956<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>4000<\/span><\/td>\n<td><span>64<\/span><\/td>\n<td><span>138.57<\/span><\/td>\n<td><span>20.36<\/span><\/td>\n<td><span><b>0.987<\/b><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Sift10k &#8211; <\/span>100 queries<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<tbody>\n<tr>\n<td><strong>nlist<\/strong><\/td>\n<td><strong>nprobe<\/strong><\/td>\n<td><strong>Training time<\/strong><\/td>\n<td><strong>Total Search time<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span>10<\/span><\/td>\n<td><span>1<\/span><\/td>\n<td><span>33.85ms<\/span><\/td>\n<td><span>1.52s<\/span><\/td>\n<td><span><b>0.82<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>39 
(10000\/256)<\/span><\/td>\n<td><span>6<\/span><\/td>\n<td><span>70.6ms<\/span><\/td>\n<td><span>1.91s<\/span><\/td>\n<td><span><b>0.96<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>50<\/span><\/td>\n<td><span>7<\/span><\/td>\n<td><span>70.26ms<\/span><\/td>\n<td><span>1.68s<\/span><\/td>\n<td><span><b>0.99<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>100<\/span><\/td>\n<td><span>5<\/span><\/td>\n<td><span>91.5ms<\/span><\/td>\n<td><span>677.14ms<\/span><\/td>\n<td><span><b>0.9<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>100<\/span><\/td>\n<td><span>10<\/span><\/td>\n<td><span>99ms<\/span><\/td>\n<td><span>1.317s<\/span><\/td>\n<td><span><b>0.96<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span>200<\/span><\/td>\n<td><span>14<\/span><\/td>\n<td><span>133.26ms<\/span><\/td>\n<td><span>930.2ms<\/span><\/td>\n<td><span><b>1.0<\/b><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Insights<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><span>For the 1M dataset, the baseline recall is low due to <em>nprobe = 1<\/em>.<\/span>\n<ul>\n<li aria-level=\"2\"><span>Once that is fixed, the recall increases rapidly and is no longer a concern, despite subsampling.<\/span><\/li>\n<li aria-level=\"2\"><span>However, <\/span><i><span>with great recall comes greater search latency.<\/span><\/i><span>\u00a0<\/span><\/li>\n<\/ul>\n<\/li>\n\n\n<li><span>For the 10k dataset, baseline recall, while not as low as that of the 1M dataset, is still a matter of concern.<\/span>\n<ul>\n<li aria-level=\"2\"><span>This too, is easily remedied by increasing <em>nprobe<\/em>.\u00a0<\/span><\/li>\n<\/ul>\n<\/li>\n\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The defaults for determining nlist and nprobe were then changed to:<\/span><\/p>\n\n\n<p>[crayon nums=&#8221;false&#8221; lang=&#8221;default&#8221; decode=&#8221;true&#8221;]if nVecs &gt;= 1M:<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0nlist = 4 * \u221anVecs<br 
\/>\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ Above a certain number of vectors,<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ increasing nlist does not increase recall.<br \/>\nelse if nVecs &gt;= 1000:<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0nlist = nVecs\/100<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ 100 points per cluster seems like a<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ reasonable midpoint between the minimum<br \/>\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ and maximum points per cluster.<br \/>\nnprobe = \u221anlist[\/crayon]<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Suffice it to say, this is how the team felt once we\u2019d successfully found a solution to the recall issue:<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-15721\" src=\"https:\/\/www.couchbase.com\/wp-content\/uploads\/sites\/5\/2026\/05\/Screenshot-2024-05-15-at-3.49.17-PM-732x1024-1.png\" alt=\"\" width=\"349\" height=\"488\"><br><br><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span>Tuning a Vector Index<\/span><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Tuning a Couchbase vector index definitely isn\u2019t something to sweat about.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-15722\" src=\"https:\/\/www.couchbase.com\/wp-content\/uploads\/sites\/5\/2026\/05\/Screenshot-2024-05-15-at-3.55.07-PM.png\" alt=\"\" width=\"278\" height=\"385\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean. 
Beyond user-facing considerations such as being easy to understand and intuitive, future-proofing (forward compatibility) and the segmented architecture imposed some constraints on the API.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Each segment is a vector index with a different number of vectors. At query time, a <\/span><i><span>user isn\u2019t aware of the data distribution<\/span><\/i><span> at a partition level, let alone at the segment level. Depending on the nature of mutations (a large number of deletes, for example), the <\/span><i><span>number of vectors can vary quite a bit<\/span><\/i><span> when merging segments. Hence, a <\/span><i><span>one-size-fits-all(-segments) approach cannot be applied<\/span><\/i><span>.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>In addition, forward compatibility necessitates that the<\/span><i><span> knobs aren\u2019t specific to a FAISS index type<\/span><\/i><span> since this is an area that should be, for the most part, abstracted away from the user, even if the index types used in the implementation change. For example, the user specifying <em>nprobe<\/em> or <em>nlist<\/em> would be specific to IVF indexes and would become a breaking change if the index types powering vector search changed.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>Allowing the user to <\/span><i><span>toggle which metric to optimise for <\/span><\/i><span>(recall\/latency) fits the bill. While tuning for recall is done differently for an IVF index and, say, an HNSW index, recall is applicable to both and can be tuned for at the segment level. In the data above, doubling nprobe (from 32 to 64 at nlist = 4000) roughly doubles search time (10.09 s to 20.36 s), while recall rises by only about three points (0.956 to 0.987). 
Hence, when optimising for latency, halving <em>nprobe<\/em> reduces latency without too great a hit to recall.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>The index definition setting that lets the user toggle between recall and latency is <em>vector_index_optimized_for<\/em>. This setting is documented in the official docs.\u00a0<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span>For more such deep dives and developments in the Couchbase Vector Search space, stay tuned!<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introducing vector search (KNN), with its distance-based similarity scoring, into the existing Search paradigm necessitated a shift in how we thought about \u201crelevant\u201d results and how to measure them.\u00a0 Text based indexes use tf-idf as their scoring mechanism with the results remaining the same across searches, given a fixed corpus of words (here, the documents [&hellip;]<\/p>\n","protected":false},"author":85141,"featured_media":3587,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"_acf":"","footnotes":""},"categories":[598,179,301,54,17,441,715],"tags":[834,192],"ppma_author":[835],"class_list":["post-3588","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence-ai","category-couchbase-architecture","category-cloud","category-couchbase-server","category-performance","category-search","category-vector-search","tag-faiss","tag-indexing"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.6 (Yoast SEO v27.6) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Vector Search Performance: The Rise of Recall - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"Since the recall\/latency 
tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/ko\/vector-search-indexing-recall-faiss\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Vector Search Performance: The Rise of Recall\" \/>\n<meta property=\"og:description\" content=\"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/ko\/vector-search-indexing-recall-faiss\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2024-05-15T22:19:30+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"625\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Aditi Ahuja, Software Engineer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Aditi Ahuja, Software Engineer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/\"},\"author\":{\"name\":\"Aditi Ahuja, Software Engineer\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/efea1290226a380570a7a86c651090a0\"},\"headline\":\"Vector Search Performance: The Rise of Recall\",\"datePublished\":\"2024-05-15T22:19:30+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/\"},\"wordCount\":1656,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/latency-no-sweat-1.png\",\"keywords\":[\"FAISS\",\"Indexing\"],\"articleSection\":[\"Artificial Intelligence (AI)\",\"Couchbase Architecture\",\"Couchbase Capella\",\"Couchbase Server\",\"High Performance\",\"Search\",\"Vector Search\"],\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/\",\"name\":\"Vector Search Performance: The Rise of Recall - The Couchbase 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/latency-no-sweat-1.png\",\"datePublished\":\"2024-05-15T22:19:30+00:00\",\"description\":\"Since the recall\\\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/latency-no-sweat-1.png\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/05\\\/latency-no-sweat-1.png\",\"width\":1200,\"height\":625},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/vector-search-indexing-recall-faiss\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Vector Search Performance: The Rise of Recall\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"name\":\"The Couchbase 
Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/06\\\/logo.svg\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/5\\\/2026\\\/06\\\/logo.svg\",\"width\":\"1024\",\"height\":\"1024\",\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/efea1290226a380570a7a86c651090a0\",\"name\":\"Aditi Ahuja, Software Engineer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=ga3eb898818ce7bdfc1b89af35c10b1f5\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"caption\":\"Aditi Ahuja, Software 
Engineer\"},\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ko\\\/author\\\/aditi\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Vector Search Performance: The Rise of Recall - The Couchbase Blog","description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/ko\/vector-search-indexing-recall-faiss\/","og_locale":"ko_KR","og_type":"article","og_title":"Vector Search Performance: The Rise of Recall","og_description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","og_url":"https:\/\/www.couchbase.com\/blog\/ko\/vector-search-indexing-recall-faiss\/","og_site_name":"The Couchbase Blog","article_published_time":"2024-05-15T22:19:30+00:00","og_image":[{"width":1200,"height":625,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png","type":"image\/png"}],"author":"Aditi Ahuja, Software Engineer","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Aditi Ahuja, Software Engineer","Est. 
reading time":"8\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"},"author":{"name":"Aditi Ahuja, Software Engineer","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0"},"headline":"Vector Search Performance: The Rise of Recall","datePublished":"2024-05-15T22:19:30+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"},"wordCount":1656,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png","keywords":["FAISS","Indexing"],"articleSection":["Artificial Intelligence (AI)","Couchbase Architecture","Couchbase Capella","Couchbase Server","High Performance","Search","Vector Search"],"inLanguage":"ko-KR","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","url":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","name":"Vector Search Performance: The Rise of Recall - The Couchbase 
Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png","datePublished":"2024-05-15T22:19:30+00:00","description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"]}]},{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/05\/latency-no-sweat-1.png","width":1200,"height":625},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Vector Search Performance: The Rise of Recall"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL 
Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/06\/logo.svg","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/5\/2026\/06\/logo.svg","width":"1024","height":"1024","caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0","name":"Aditi Ahuja, Software Engineer","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=ga3eb898818ce7bdfc1b89af35c10b1f5","url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","caption":"Aditi Ahuja, Software Engineer"},"url":"https:\/\/www.couchbase.com\/blog\/ko\/author\/aditi\/"}]}},"acf":[],"authors":[{"term_id":835,"user_id":85141,"is_guest":0,"slug":"aditi","display_name":"Aditi Ahuja, Software 
Engineer","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","author_category":"","first_name":"Aditi","last_name":"Ahuja, Software Engineer","user_url":"","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/posts\/3588","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/users\/85141"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/comments?post=3588"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/posts\/3588\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/media\/3587"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/media?parent=3588"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/categories?post=3588"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/tags?post=3588"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/ko\/wp-json\/wp\/v2\/ppma_author?post=3588"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}