{"id":15720,"date":"2024-05-15T15:19:30","date_gmt":"2024-05-15T22:19:30","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=15720"},"modified":"2025-06-13T16:36:57","modified_gmt":"2025-06-13T23:36:57","slug":"vector-search-indexing-recall-faiss","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","title":{"rendered":"Vector Search Performance: The Rise of Recall"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Introducing vector search (KNN), with its distance-based similarity scoring, into the existing Search paradigm necessitated a shift in how we thought about \u201crelevant\u201d results and how to measure them.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Text based indexes use<\/span> <a href=\"https:\/\/docs.couchbase.com\/server\/current\/fts\/fts-scoring.html#scoring-td-idf\"><i><span style=\"font-weight: 400;\">tf-idf<\/span><\/i><\/a><span style=\"font-weight: 400;\"> as their scoring mechanism with the results remaining the same across searches, given a fixed corpus of words (here, the documents in a partition).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In contrast, a KNN search does not guarantee the same level of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Idempotence\">idempotency<\/a>. The results are <\/span><i><span style=\"font-weight: 400;\">approximate<\/span><\/i><span style=\"font-weight: 400;\">, often differing between queries. This article is about the Search team pivoting from exact to approximate. Along the way, we answer questions about why <\/span><i><span style=\"font-weight: 400;\">approximate results can become the new normal<\/span><\/i><span style=\"font-weight: 400;\"> and how much approximation is acceptable.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Setting the Stage\u00a0<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Each Search index partition is a <\/span><a href=\"https:\/\/blevesearch.com\/\"><span style=\"font-weight: 400;\">Bleve<\/span><\/a><span style=\"font-weight: 400;\"> index comprising multiple zap segments (<\/span><i><span style=\"font-weight: 400;\">segmented architecture<\/span><\/i><span style=\"font-weight: 400;\">), with each segment containing a vector index. These are periodically compacted by a merger routine for the Bleve index.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Search uses <\/span><a href=\"https:\/\/github.com\/facebookresearch\/faiss\"><span style=\"font-weight: 400;\">FAISS<\/span><\/a><span style=\"font-weight: 400;\"> for the vector index creation, training, searching and some more related functionality.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The two broad classes of indexes currently used by Couchbase Search are:<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Flat indexes &#8211; perform exhaustive search, akin to storing the vectors in an array.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/github.com\/facebookresearch\/faiss\/wiki\/Faiss-indexes#cell-probe-methods-indexivf-indexes\"><span style=\"font-weight: 400;\">IVF indexes<\/span><\/a><span style=\"font-weight: 400;\"> &#8211; centroid-based indexes which involve clustering (KMeans in this case) the dataset and then populating those clusters.<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h4><span style=\"font-weight: 400;\">A Brief about KNN\u00a0<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Like I touched upon earlier, the testware (and equally important, our thinking!) was predisposed to exact scoring. Text-based search is <\/span><i><span style=\"font-weight: 400;\">fundamentally exhaustive<\/span><\/i><span style=\"font-weight: 400;\"> in that an inverted index includes all the tokens in the partition\u2019s documents. All the documents in an inverted index are <\/span><i><span style=\"font-weight: 400;\">eligible <\/span><\/i><span style=\"font-weight: 400;\">for the search and will be searched through.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A centroid-based vector index, by comparison, <\/span><i><span style=\"font-weight: 400;\">limits the pool of eligible vectors<\/span><\/i><span style=\"font-weight: 400;\"> right off the bat &#8211; by only searching through specific clusters, which may or may not be the same for each query. This means that for a given query, potentially time consuming, exhaustive search is traded off for approximation.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">(If the word \u2018recall\u2019 threw you off for a bit, sit tight &#8211; we will be coming to that in just a bit).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Considering that we limit our search space right at the beginning, it\u2019s important to \u201ccluster right\u201d.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One of the first steps in a search involves picking how many and which clusters to search through. Too few and you end up missing out on some potentially similar vectors. Too many and search latency increases significantly for a relatively small increase in search quality.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The metric used for search quality is <\/span><b>recall<\/b><span style=\"font-weight: 400;\"> &#8211; what percentage of the returned results are objectively the closest to the query vectors. The set of the vectors closest to the query vector are called the ground truth and are used as a baseline when measuring recall. Since the KNN score is the distance between two vectors, it is <\/span><i><span style=\"font-weight: 400;\">independent of the other documents<\/span><\/i><span style=\"font-weight: 400;\"> in the partition (unlike in tf-idf) and more importantly, this helps in an objective comparison between independently evaluated ground truth and the result.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">How Much (Approximation) is Too Much (Approximation): Driving Recall from 0.06 to 90+:<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Armed with all this knowledge, we decided to start testing for recall. Our early tests showed a surprisingly low recall of close to 0 &#8211; 0.06 to be precise.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">While we had offloaded the vector index search to FAISS, there were some aspects we needed to handle from our end. One of them being mapping document IDs to the vectors. Search maps each document number to a unique vector hash. These hashes are then passed as <\/span><a href=\"https:\/\/github.com\/blevesearch\/go-faiss\/blob\/master\/index.go#L45\"><span style=\"font-weight: 400;\">custom IDs to FAISS<\/span><\/a><span style=\"font-weight: 400;\"> to leverage the support for mapping vectors to custom ID and searches return the custom ID. Considering that the vectors (each of which is of the same size) are concatenated into a large vector and so are the IDs, getting the ordering right determines the mapping of the vector to the ID. Internally, FAISS uses a hashmap to store vectors and their IDs.<\/span><\/p>\n<pre class=\"nums:false lang:default decode:true \">[&lt;vec1&gt;,&lt;vec2&gt;,...&lt;vec_n&gt;],[id1,id2,...id_n] =&gt; vec1 \u2192 id1, vec2 \u2192 id2, \u2026 vec_n \u2192 id_n<\/pre>\n<p><span style=\"font-weight: 400;\">A closer look showed that we were mapping randomly ordered IDs to vectors when rebuilding indexes during a merge, resulting in the result set being essentially random. This was impacting both flat and IVF indexes since they both relied on the <\/span><i><span style=\"font-weight: 400;\">ordering <\/span><\/i><span style=\"font-weight: 400;\">of the IDs when retrieving results.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once the ordering issue was solved, along with some other merge path fixes, the recall jumped to around 70. We were now on the right track &#8211; we didn\u2019t have any fundamental bugs plaguing us. We started taking a look at the knobs we could tune.\u00a0<\/span><\/p>\n<h4><span style=\"font-weight: 400;\">Turning Knobs &#8211; Centroids and Nprobe<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">The initial strategy used a fixed number (100) of centroids for all vector indexes with more than 10k vectors. In essence, this was treating 1M vectors the same as 20k vectors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">FAISS clustering defaults have a minimum (39) and maximum (256) number of points per cluster. The remaining points are subsampled. 100 centroids may have been enough for 100 * 256 = 25600 vectors at most but for anything over that, there was <\/span><i><span style=\"font-weight: 400;\">excessive<\/span><\/i><span style=\"font-weight: 400;\"> subsampling taking place, as reflected in the recall.\u00a0 What we needed was a formula for centroids which scaled with the dataset.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What we\u2019re looking to optimise for: <em>R<\/em><\/span><span style=\"font-weight: 400;\"><em>ecall@K<\/em>, without indexing and search latency taking too much of a hit, if possible.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><strong>Setup<\/strong><\/span><\/p>\n<p><span style=\"font-weight: 400;\">The setup was fairly simple &#8211; scripts creating a FAISS index (training and adding IDs) and querying them, with the ground truth results known beforehand. I used the <\/span><span style=\"font-weight: 400;\">SIFT10K and SIFT1M<\/span><span style=\"font-weight: 400;\"> datasets from the <\/span><span style=\"font-weight: 400;\">original paper<\/span><span style=\"font-weight: 400;\"> since they provided groundtruth vectors using Euclidean distance. The recall@K was the mean recall over 100\/10k respectively queries.<\/span><\/p>\n<p><strong>Increasing centroids<\/strong><\/p>\n<p>The first phase involved tweaking the number of clusters.<\/p>\n<p>Sift1M results &#8211; <span style=\"font-weight: 400;\">10,000 queries<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><strong># centroids<\/strong><\/td>\n<td><strong>Training time(s)<\/strong><\/td>\n<td><strong>Search time(s)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.83<\/span><\/td>\n<td><span style=\"font-weight: 400;\">20.72<\/span><\/td>\n<td><span style=\"font-weight: 400; color: #ff0000;\">0.61<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">200<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.856<\/span><\/td>\n<td><span style=\"font-weight: 400;\">14.423<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.558<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">500<\/span><\/td>\n<td><span style=\"font-weight: 400;\">4.75<\/span><\/td>\n<td><span style=\"font-weight: 400;\">4.101<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.4833<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">1000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">15.13<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2.4113<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.43<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">Sift10k results &#8211; <\/span><span style=\"font-weight: 400;\">100 queries<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><strong># centroids<\/strong><\/td>\n<td><strong>Training time (ms)<\/strong><\/td>\n<td><strong>Search time (ms)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">30.78<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1370<\/span><\/td>\n<td><span style=\"font-weight: 400; color: #ff0000;\">0.82<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">50<\/span><\/td>\n<td><span style=\"font-weight: 400;\">103<\/span><\/td>\n<td><span style=\"font-weight: 400;\">368.9<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.69<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">188.48<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.6<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Insights<\/strong><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The current baseline shows a recall of 0.61, which can definitely be improved.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The recall <\/span><i><span style=\"font-weight: 400;\">decreases<\/span><\/i><span style=\"font-weight: 400;\"> with an increase in the number of centroids.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Search time decreases due to<\/span> <i><span style=\"font-weight: 400;\">increasing localization<\/span><\/i><span style=\"font-weight: 400;\"> even as training time increases.<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">The converse being search time increasing, despite low training time, for a lower number of centroids since that entails searching in larger cells with greater number of vectors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now that it\u2019s been established that increasing centroids has a negative impact on recall, let\u2019s try to intuitively understand why that\u2019s so.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With a fixed size dataset, increasing the number of centroids could decrease the number of documents in each cluster. With smaller clusters, we <\/span><i><span style=\"font-weight: 400;\">search fewer vectors overall<\/span><\/i><span style=\"font-weight: 400;\"> and pick the K closest vectors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Hence, an increase in the number of clusters should be accompanied by a <\/span><i><span style=\"font-weight: 400;\">corresponding increase in the number of clusters searched.<\/span><\/i><\/p>\n<p><strong>Increasing nprobe<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">Sift1M &#8211; <\/span>10,000 queries<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>nlist<\/strong><\/td>\n<td><strong>nprobe<\/strong><\/td>\n<td><strong>Training time(s)<\/strong><\/td>\n<td><strong>Total Search time(s)<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100 (current baseline)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.43<\/span><\/td>\n<td><span style=\"font-weight: 400;\">21.24\u00a0<\/span><\/td>\n<td><span style=\"color: #ff0000;\"><b>0.61<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">0.778<\/span><\/td>\n<td><span style=\"font-weight: 400; color: #ff0000;\">119.5<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.993<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">200<\/span><\/td>\n<td><span style=\"font-weight: 400;\">14<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.12<\/span><\/td>\n<td><span style=\"font-weight: 400;\">84.54<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.99<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">500<\/span><\/td>\n<td><span style=\"font-weight: 400;\">22<\/span><\/td>\n<td><span style=\"font-weight: 400;\">3.23<\/span><\/td>\n<td><span style=\"font-weight: 400;\">52.80<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.988<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">1000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">31<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10.033<\/span><\/td>\n<td><span style=\"font-weight: 400;\">37.79<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.988<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">2000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">44<\/span><\/td>\n<td><span style=\"font-weight: 400;\">36.36<\/span><\/td>\n<td><span style=\"font-weight: 400;\">27.61<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.985<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">3000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">54<\/span><\/td>\n<td><span style=\"font-weight: 400;\">80.94<\/span><\/td>\n<td><span style=\"font-weight: 400;\">22.74<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.985<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">3906 (1M\/256)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">62<\/span><\/td>\n<td><span style=\"font-weight: 400;\">134.61<\/span><\/td>\n<td><span style=\"font-weight: 400;\">20s<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.984<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">4000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">32<\/span><\/td>\n<td><span style=\"font-weight: 400;\">136.71<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10.09<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.956<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">4000<\/span><\/td>\n<td><span style=\"font-weight: 400;\">64<\/span><\/td>\n<td><span style=\"font-weight: 400;\">138.57<\/span><\/td>\n<td><span style=\"font-weight: 400;\">20.36<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.987<\/b><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">Sift10k &#8211; <\/span>100 queries<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>nlist<\/strong><\/td>\n<td><strong>nprobe<\/strong><\/td>\n<td><strong>Training time<\/strong><\/td>\n<td><strong>Total Search time<\/strong><\/td>\n<td><strong>recall@100<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">33.85ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.52s<\/span><\/td>\n<td><span style=\"color: #99cc00;\"><b>0.82<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">39 (10000\/256)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">6<\/span><\/td>\n<td><span style=\"font-weight: 400;\">70.6ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.91s<\/span><\/td>\n<td><span style=\"color: #99cc00;\"><b>0.96<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">50<\/span><\/td>\n<td><span style=\"font-weight: 400;\">7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">70.26ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.68s<\/span><\/td>\n<td><span style=\"color: #008000;\"><b>0.99<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">91.5ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">677.14ms<\/span><\/td>\n<td><span style=\"color: #339966;\"><b>0.9<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">100<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">99ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1.317s<\/span><\/td>\n<td><span style=\"color: #99cc00;\"><b>0.96<\/b><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">200<\/span><\/td>\n<td><span style=\"font-weight: 400;\">14<\/span><\/td>\n<td><span style=\"font-weight: 400;\">133.26ms<\/span><\/td>\n<td><span style=\"font-weight: 400;\">930.2ms<\/span><\/td>\n<td><span style=\"color: #339966;\"><b>1.0<\/b><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Insights<\/strong><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">For the 1M dataset, the baseline recall is low due to <em>nprobe = 1<\/em>.<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Once that is fixed, the recall increases rapidly and is no longer a concern, despite subsampling.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">However, <\/span><i><span style=\"font-weight: 400;\">with great recall comes greater search latency.<\/span><\/i><span style=\"font-weight: 400;\">\u00a0<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">For the 10k dataset, baseline recall, while not as low as that of the 1M dataset, is still a matter of concern.<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">This too, is easily remedied by increasing <em>nprobe<\/em>.\u00a0<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">The defaults for determining nlist and nprobe were then changed to:<\/span><\/p>\n<pre class=\"nums:false lang:default decode:true\">if nVecs &gt;= 1M:\r\n\u00a0\u00a0\u00a0\u00a0\u00a0nlist = 4 * \u221anVecs\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ Above a certain number of vectors,\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ increasing nlist does not increase recall.\r\nif nVecs &gt;= 1000:\r\n\u00a0\u00a0\u00a0\u00a0\u00a0nlist = nVecs\/100\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ 100 points per cluster seems like a\u00a0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ reasonable midpoint between the minimum\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ and maximum points per cluster.\r\nnprobe = \u221anlist<\/pre>\n<p><span style=\"font-weight: 400;\">Suffice to say, this is how the team felt once we\u2019d successfully found a solution to the recall issue:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-15721\" style=\"border: solid 1px;\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM-732x1024.png\" alt=\"\" width=\"349\" height=\"488\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM-732x1024.png 732w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM-214x300.png 214w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM-768x1074.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM-300x420.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.49.17\u202fPM.png 862w\" sizes=\"auto, (max-width: 349px) 100vw, 349px\" \/><br style=\"font-weight: 400;\" \/><br style=\"font-weight: 400;\" \/><\/p>\n<h3><span style=\"font-weight: 400;\">Tuning a Vector Index<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Tuning a Couchbase vector index definitely isn\u2019t something to sweat about.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-15722\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-15-at-3.55.07\u202fPM.png\" alt=\"\" width=\"278\" height=\"385\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.55.07\u202fPM.png 582w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.55.07\u202fPM-217x300.png 217w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/Screenshot-2024-05-15-at-3.55.07\u202fPM-300x415.png 300w\" sizes=\"auto, (max-width: 278px) 100vw, 278px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean. Along with user-facing considerations such as easy to understand and intuitive, future proofing (forward compatibility) and the segmented architecture entailed some limitations in the API.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Each segment is a vector index with a different number of vectors. At query time, a <\/span><i><span style=\"font-weight: 400;\">user isn\u2019t aware of the data distribution<\/span><\/i><span style=\"font-weight: 400;\"> at a partition level, let alone at the segment level. Depending on the nature of mutations (large number of deletes for eg.), the <\/span><i><span style=\"font-weight: 400;\">number of vectors can vary quite a bit<\/span><\/i><span style=\"font-weight: 400;\"> when merging segments. Hence, a <\/span><i><span style=\"font-weight: 400;\">one-size-fits-all(-segments) approach cannot be applied<\/span><\/i><span style=\"font-weight: 400;\">.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In addition, forward compatibility necessitates that the<\/span><i><span style=\"font-weight: 400;\"> knobs aren\u2019t specific to a FAISS index type<\/span><\/i><span style=\"font-weight: 400;\"> since this is an area that should be, for the most part, abstracted away from the user, despite changing the index types being used in the implementation. For example, the user specifying <em>nprobe<\/em> or <em>nlist<\/em> would be very IVF index specific and would be a breaking change if the index types powering vector search changed.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Allowing the user to <\/span><i><span style=\"font-weight: 400;\">toggle which metric to optimise for <\/span><\/i><span style=\"font-weight: 400;\">(recall\/latency) fits the bill. While tuning for recall is done differently for an IVF index and say, a HNSW index, recall is applicable to both and can be tuned for at the segment level. In the data above, doubling nprobe leads to a corresponding doubling of search time but with the corresponding increase in recall being a few points. Hence, when optimising for latency, halving the <em>nprobe<\/em> yields gains in latency without too great a hit on recall.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The index definition setting where the user can toggle between recall and latency can be modified is <em>vector_index_optimized_for<\/em>. This setting has been documented in the official docs.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For more such deep dives and developments in the Couchbase Vector Search space, stay tuned!<\/span><\/p>\n<p><br style=\"font-weight: 400;\" \/><br style=\"font-weight: 400;\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introducing vector search (KNN), with its distance-based similarity scoring, into the existing Search paradigm necessitated a shift in how we thought about \u201crelevant\u201d results and how to measure them.\u00a0 Text based indexes use tf-idf as their scoring mechanism with the [&hellip;]<\/p>\n","protected":false},"author":85141,"featured_media":15725,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[10122,1821,2225,1816,9417,9936,9937],"tags":[9961,1696],"ppma_author":[9962],"class_list":["post-15720","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence-ai","category-couchbase-architecture","category-cloud","category-couchbase-server","category-performance","category-search","category-vector-search","tag-faiss","tag-indexing"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.8 (Yoast SEO v25.8) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Vector Search Performance: The Rise of Recall - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Vector Search Performance: The Rise of Recall\" \/>\n<meta property=\"og:description\" content=\"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2024-05-15T22:19:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-13T23:36:57+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"625\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Aditi Ahuja, Software Engineer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Aditi Ahuja, Software Engineer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\"},\"author\":{\"name\":\"Aditi Ahuja, Software Engineer\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0\"},\"headline\":\"Vector Search Performance: The Rise of Recall\",\"datePublished\":\"2024-05-15T22:19:30+00:00\",\"dateModified\":\"2025-06-13T23:36:57+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\"},\"wordCount\":1576,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png\",\"keywords\":[\"FAISS\",\"Indexing\"],\"articleSection\":[\"Artificial Intelligence (AI)\",\"Couchbase Architecture\",\"Couchbase Capella\",\"Couchbase Server\",\"High Performance\",\"Search\",\"Vector Search\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\",\"name\":\"Vector Search Performance: The Rise of Recall - The Couchbase Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png\",\"datePublished\":\"2024-05-15T22:19:30+00:00\",\"dateModified\":\"2025-06-13T23:36:57+00:00\",\"description\":\"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png\",\"width\":1200,\"height\":625},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Vector Search Performance: The Rise of Recall\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0\",\"name\":\"Aditi Ahuja, Software Engineer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/a3eb898818ce7bdfc1b89af35c10b1f5\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"caption\":\"Aditi Ahuja, Software Engineer\"},\"url\":\"https:\/\/www.couchbase.com\/blog\/author\/aditi\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Vector Search Performance: The Rise of Recall - The Couchbase Blog","description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","og_locale":"en_US","og_type":"article","og_title":"Vector Search Performance: The Rise of Recall","og_description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","og_url":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","og_site_name":"The Couchbase Blog","article_published_time":"2024-05-15T22:19:30+00:00","article_modified_time":"2025-06-13T23:36:57+00:00","og_image":[{"width":1200,"height":625,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png","type":"image\/png"}],"author":"Aditi Ahuja, Software Engineer","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Aditi Ahuja, Software Engineer","Est. reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"},"author":{"name":"Aditi Ahuja, Software Engineer","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0"},"headline":"Vector Search Performance: The Rise of Recall","datePublished":"2024-05-15T22:19:30+00:00","dateModified":"2025-06-13T23:36:57+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"},"wordCount":1576,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png","keywords":["FAISS","Indexing"],"articleSection":["Artificial Intelligence (AI)","Couchbase Architecture","Couchbase Capella","Couchbase Server","High Performance","Search","Vector Search"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","url":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/","name":"Vector Search Performance: The Rise of Recall - The Couchbase Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png","datePublished":"2024-05-15T22:19:30+00:00","dateModified":"2025-06-13T23:36:57+00:00","description":"Since the recall\/latency tradeoff is a crucial one in vector search, we wanted to allow the user some flexibility in which way they wanted to lean.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/05\/latency-no-sweat-1.png","width":1200,"height":625},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/vector-search-indexing-recall-faiss\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Vector Search Performance: The Rise of Recall"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0","name":"Aditi Ahuja, Software Engineer","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/a3eb898818ce7bdfc1b89af35c10b1f5","url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","caption":"Aditi Ahuja, Software Engineer"},"url":"https:\/\/www.couchbase.com\/blog\/author\/aditi\/"}]}},"authors":[{"term_id":9962,"user_id":85141,"is_guest":0,"slug":"aditi","display_name":"Aditi Ahuja, Software Engineer","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","author_category":"","last_name":"Ahuja, Software Engineer","first_name":"Aditi","job_title":"","user_url":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/15720","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/85141"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=15720"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/15720\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/15725"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=15720"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=15720"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=15720"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=15720"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}