{"id":17138,"date":"2025-05-22T09:54:28","date_gmt":"2025-05-22T16:54:28","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=17138"},"modified":"2025-06-13T22:45:04","modified_gmt":"2025-06-14T05:45:04","slug":"semantic-similarity-with-focused-selectivity","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/pt\/semantic-similarity-with-focused-selectivity\/","title":{"rendered":"Semantic Similarity with Focused Selectivity"},"content":{"rendered":"<h2>Why does semantic search need selectivity?<\/h2>\n<p>Up until now, we've viewed a vector embedding as a complete, stand-alone entity, focused entirely on the meaning it encodes. While this enables semantic search, often with a <a target=\"_blank\" href=\"https:\/\/www.couchbase.com\/blog\/pt\/vector-search-indexing-recall-faiss\/\" rel=\"noopener\">high degree of similarity<\/a>, it remains limited to the resemblance between the query and the embeddings in the dataset.<\/p>\n<p>Vector similarity search on its own cannot be relied upon to satisfy exact predicates. Pre-filtering aims to address exactly this gap by searching for similar vectors only among those that satisfy some filtering criteria.<\/p>\n<p>It is the embedding equivalent of narrowing your search, whether for a job or a property, to a location. Say you want a waterfront property in a specific state. You also want to limit your search to those with three or more bedrooms. 
Combing through the listings without a way to filter on these criteria is nearly infeasible given the sheer number of listings.<\/p>\n<p>With pre-filtering, you can limit your search to a specific location, restricting the search space to eligible properties through geospatial and numeric queries. A vector similarity search for \"beachfront\", \"beachside\" and \"waterfront\" properties is then performed over this limited subset.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-17139\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-1024x289.png\" alt=\"\" width=\"900\" height=\"254\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-1024x289.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-300x85.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-768x217.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-1536x434.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1-1320x373.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image4-1.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<p>Pre-filtering lets users specify filter queries as part of the kNN attribute in the query, so that only the documents matching the filter are considered eligible to be returned by the kNN query. 
Put simply, users can now use the familiar FTS query syntax to <i>restrict the documents over which a kNN search is performed<\/i>.<\/p>\n<h2>When to apply pre-filtering?<\/h2>\n<p>As the name suggests, the embeddings are filtered by metadata <i>before<\/i> the similarity search. This differs from post-filtering, where a kNN search is followed by metadata filtering. Pre-filtering offers a much higher chance of returning k hits, assuming at least that many documents pass the filtering stage.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-17141\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-1024x291.png\" alt=\"\" width=\"900\" height=\"256\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-1024x291.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-300x85.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-768x219.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-1536x437.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1-1320x376.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image2-1.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-17140\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-1024x246.png\" alt=\"\" width=\"900\" height=\"216\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-1024x246.png 1024w, 
https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-300x72.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-768x185.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-1536x370.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3-1320x318.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image1-3.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<h2>How it works<\/h2>\n<p>Before we get into the essence of pre-filtering with kNN, let's understand how a vector index and a full-text index are co-located within a Search service index. Each Search index has one or more partitions, each of which has one or more segments. Each of these segments is a file, and the file is divided into separate sections, one per index type. 
Viewing a segment as a self-contained unit holding the indexed text and vector content of a <i>batch of documents<\/i> is helpful for understanding how pre-filtering works at the segment level.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-17142 alignleft\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1-1024x658.png\" alt=\"Semantic Similarity with Focused Selectivity\" width=\"399\" height=\"257\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1-1024x658.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1-300x193.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1-768x494.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1-1320x848.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image6-1.png 1444w\" sizes=\"auto, (max-width: 399px) 100vw, 399px\" \/>An important consideration when building this feature was that it should be <i>predicate-agnostic<\/i> at search time. Essentially, this means that filtering in the vector index must work the same way regardless of the full-text predicate. A text predicate over a text field should be no different from a numeric predicate over another.<\/p>\n<p>To that end, document numbers, the unique identifiers of documents, are used to mark which documents are eligible for the kNN query. All FTS queries, be they text, numeric or geospatial, boil down to identifying hits by document number. 
Using document numbers means we do not need to change our indexing strategy for vectors; the change is confined to search time.<\/p>\n<h3>Phase 1: Metadata filtering<\/h3>\n<p>Since a segment is essentially an immutable batch of documents, with their text and vector content indexed, the full-text metadata query returns all eligible documents <i>at the segment level<\/i>. Their document numbers are then passed to the vector index to retrieve the nearest eligible vectors.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-17143 size-large\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-1024x500.png\" alt=\"Semantic Similarity with Focused Selectivity\" width=\"900\" height=\"439\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-1024x500.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-300x146.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-768x375.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-1536x750.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1-1320x644.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image3-1.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<h3>Phase 2: kNN search<\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-17144\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1-707x1024.png\" alt=\"Semantic Similarity with Focused Selectivity\" width=\"400\" height=\"579\" 
srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1-707x1024.png 707w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1-207x300.png 207w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1-768x1112.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1-300x434.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/image5-1.png 804w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/>The existing algorithm, which picks the <em>nprobe<\/em> clusters closest to the query vector in an IVF index, is <i>essentially flying blind<\/i>, since it does not account for nearby clusters that hold few or no eligible documents. At search time, the selection of clusters to probe now needs to <i>account for the distribution of metadata-filtered hits<\/i> across the index.<\/p>\n<p>A cluster <em>closer<\/em> to the query vector can have far fewer eligible hits than one much farther away.<\/p>\n<p>Accounting for the distribution of filter hits while keeping recall high means we cannot choose which clusters to scan based on filter hit density alone. In practice, we try to minimize distance computations for ineligible vectors, even as we scan as many clusters as needed to return the <em>k nearest neighbors<\/em>.<\/p>\n<p>Previously, <em>nprobe<\/em> was the <i>absolute limit<\/i> on the number of clusters scanned. Now, it acts more like a minimum number of clusters to scan, provided at least that many clusters contain eligible vectors. 
When filter hits are sparsely distributed, with each cluster holding relatively few eligible vectors, our search for the k nearest neighbors can lead us to <i>scan far more than nprobe clusters<\/i>. Both filtered and unfiltered kNN scan clusters in increasing order of distance from the query vector; the difference is that filtered kNN scans a subset of the vectors in a potentially larger number of clusters.<\/p>\n<pre class=\"lang:c++ decode:true\">eligible_clusters: clusters with at least 1 filtered hit\r\ntotal_hits(1,n): cumulative count of eligible hits in the closest 'n' eligible clusters\r\nif count(eligible_clusters) &lt; nprobe:\r\n    scan all eligible clusters\r\nelse:\r\n    if total_hits(1,nprobe) &gt;= k:\r\n        scan nprobe eligible clusters\r\n    else:\r\n        n = nprobe\r\n        while total_hits(1,n) &lt; k and n &lt; count(eligible_clusters):\r\n            n++\r\n        scan n eligible clusters<\/pre>\n<p>Once the vector index returns the most similar vectors at the segment level, they are aggregated at the global index level, just as is done for kNN without filtering.<\/p>\n<h3>How to use it<\/h3>\n<p>Let's take a bucket, <em>landmarks<\/em>. Here is a sample document:<\/p>\n<pre class=\"lang:js decode:true\">{\r\n  \"title\": \"Los Angeles\/Northwest\",\r\n  \"name\": \"El Cid Theatre\",\r\n  \"alt\": null,\r\n  \"address\": \"4212 W Sunset Blvd\",\r\n  \"directions\": null,\r\n  \"phone\": null,\r\n  \"tollfree\": null,\r\n  \"email\": null,\r\n  \"url\": null,\r\n  \"hours\": null,\r\n  \"image\": null,\r\n  \"price\": null,\r\n  \"content\": \"Built around turn of the century and, after several reincarnations, offers one of the only dinner theater options left in Los Angeles. 
The menu is heavily Spanish and the shows differ depending on the night and range from flamenco performances to tongue-in-cheek burlesque.\",\r\n  \"geo\": {\r\n    \"accuracy\": \"ROOFTOP\",\r\n    \"lat\": 34.0939,\r\n    \"lon\": -118.2822\r\n  },\r\n  \"activity\": \"do\",\r\n  \"type\": \"landmark\",\r\n  \"id\": 35034,\r\n  \"country\": \"United States\",\r\n  \"city\": \"Los Angeles\",\r\n  \"state\": \"California\",\r\n  \"embedding_crc\": \"fa6edfd97ffa665b\",\r\n  \"embedding\": [-0.003134159604087472, -0.020280055701732635,....   -0.014541691169142723]\r\n}<\/pre>\n<p>Create an index, <em>test<\/em>, that indexes the <em>embedding<\/em>, <em>id<\/em> and <em>city<\/em> fields.<\/p>\n<p>My first query is an embedding of the Royal Engineers Museum in Gillingham:<\/p>\n<pre class=\"lang:js decode:true\">{\r\n  \"knn\": [\r\n    {\r\n      \"k\": 10,\r\n      \"field\": \"embedding\",\r\n      \"vector\": [\r\n        0.0022478399332612753,\r\n        ....\r\n      ]\r\n    }\r\n  ],\r\n  \"explain\": true,\r\n  \"size\": 10,\r\n  \"from\": 0\r\n}<\/pre>\n<p>The closest matches are the London Fire Brigade Museum, the Verulamium Museum in St Albans and the RAF Museum in London.<\/p>\n<p>Now we want to search for similar museums in Glasgow, that is, the city field must have the value <em>Glasgow<\/em>.<\/p>\n<p>Here is what the filtered query looks like, with the <strong>filter<\/strong> clause added:<\/p>\n<pre class=\"lang:js decode:true\">{\r\n  \"knn\": 
[\r\n    {\r\n      \"k\": 10,\r\n      \"field\": \"embedding\",\r\n      \"vector\": [\r\n        0.0022478399332612753,\r\n        ....\r\n      ],\r\n      \"filter\": {\r\n        \"match\": \"Glasgow\",\r\n        \"field\": \"city\"\r\n      }\r\n    }\r\n  ],\r\n  \"explain\": true,\r\n  \"size\": 10,\r\n  \"from\": 0\r\n}<\/pre>\n<p>As expected, the results are now limited to those in Glasgow, the closest being the Kelvingrove Art Gallery and Museum and the Riverside Museum.<\/p>\n<p>As seen in this example, a filtered kNN query offers the advantage of selecting the documents for a similarity search using good old FTS queries.<\/p>\n<h2>Keep learning<\/h2>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Read the blog to learn more about FAISS and vector indexing, <a target=\"_blank\" href=\"https:\/\/www.couchbase.com\/blog\/pt\/vector-search-indexing-recall-faiss\/\">Vector Search Performance: The Rise of Recall<\/a><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/www.couchbase.com\/blog\/pt\/faster-llm-apps-semantic-cache-langchain-couchbase\/\">Build Faster and Cheaper LLM Apps with Couchbase<\/a><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/www.couchbase.com\/blog\/pt\/what-is-semantic-search\/\">What is Semantic Search?<\/a><\/li>\n<li>Get started with Couchbase Capella today, <a target=\"_blank\" href=\"https:\/\/cloud.couchbase.com\/sign-up?ref=blog\">for free<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<hr \/>\n<p><a target=\"_blank\" href=\"https:\/\/cloud.couchbase.com\/sign-up?ref=blog\"><img loading=\"lazy\" 
decoding=\"async\" class=\"aligncenter wp-image-16409\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-1024x835.png\" alt=\"Free Cloud NoSQL DBaaS\" width=\"613\" height=\"500\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-1024x835.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-300x245.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-768x626.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-1536x1252.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-2048x1670.png 2048w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2024\/10\/capella-cloud-dbaas-couchbase-signup-free-1320x1076.png 1320w\" sizes=\"auto, (max-width: 613px) 100vw, 613px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>Why does semantic search need selectivity? Up until now, we\u2019ve viewed a vector embedding as a complete, stand-alone entity &#8211; focused entirely on the meaning it encodes. 
While this enables semantic search, often with a high degree of similarity, it [&hellip;]<\/p>","protected":false},"author":85141,"featured_media":17148,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[1815,2165,8683,9936,9937],"tags":[10117],"ppma_author":[9962],"class_list":["post-17138","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-best-practices-and-tutorials","category-full-text-search","category-geospatial","category-search","category-vector-search","tag-knn"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.1 (Yoast SEO v26.1.1) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Semantic Similarity with Focused Selectivity - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"Discover how selective pre-filtering enhances semantic similarity search in Couchbase by combining vector and metadata queries.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/pt\/semantic-similarity-with-focused-selectivity\/\" \/>\n<meta property=\"og:locale\" content=\"pt_BR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Semantic Similarity with Focused Selectivity\" \/>\n<meta property=\"og:description\" content=\"Discover how selective pre-filtering enhances semantic similarity search in Couchbase by combining vector and metadata queries.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/pt\/semantic-similarity-with-focused-selectivity\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2025-05-22T16:54:28+00:00\" \/>\n<meta 
property=\"article:modified_time\" content=\"2025-06-14T05:45:04+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search-1024x536.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"536\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Aditi Ahuja, Software Engineer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Aditi Ahuja, Software Engineer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/\"},\"author\":{\"name\":\"Aditi Ahuja, Software Engineer\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0\"},\"headline\":\"Semantic Similarity with Focused Selectivity\",\"datePublished\":\"2025-05-22T16:54:28+00:00\",\"dateModified\":\"2025-06-14T05:45:04+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/\"},\"wordCount\":1005,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png\",\"keywords\":[\"knn\"],\"articleSection\":[\"Best 
Practices and Tutorials\",\"Full-Text Search\",\"Geospatial\",\"Search\",\"Vector Search\"],\"inLanguage\":\"pt-BR\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/\",\"name\":\"Semantic Similarity with Focused Selectivity - The Couchbase Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png\",\"datePublished\":\"2025-05-22T16:54:28+00:00\",\"dateModified\":\"2025-06-14T05:45:04+00:00\",\"description\":\"Discover how selective pre-filtering enhances semantic similarity search in Couchbase by combining vector and metadata 
queries.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#breadcrumb\"},\"inLanguage\":\"pt-BR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-BR\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png\",\"width\":2400,\"height\":1256,\"caption\":\"blog-semantic-search selectivity\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Semantic Similarity with Focused Selectivity\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"pt-BR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase 
Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-BR\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0\",\"name\":\"Aditi Ahuja, Software Engineer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-BR\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/a3eb898818ce7bdfc1b89af35c10b1f5\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g\",\"caption\":\"Aditi Ahuja, Software Engineer\"},\"url\":\"https:\/\/www.couchbase.com\/blog\/pt\/author\/aditi\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Semantic Similarity with Focused Selectivity - The Couchbase Blog","description":"Descubra como a pr\u00e9-filtragem seletiva aprimora a pesquisa de similaridade sem\u00e2ntica no Couchbase, combinando consultas de vetores e metadados.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/pt\/semantic-similarity-with-focused-selectivity\/","og_locale":"pt_BR","og_type":"article","og_title":"Semantic Similarity with Focused Selectivity","og_description":"Discover how selective pre-filtering enhances semantic similarity search in Couchbase by combining vector and metadata queries.","og_url":"https:\/\/www.couchbase.com\/blog\/pt\/semantic-similarity-with-focused-selectivity\/","og_site_name":"The Couchbase Blog","article_published_time":"2025-05-22T16:54:28+00:00","article_modified_time":"2025-06-14T05:45:04+00:00","og_image":[{"width":1024,"height":536,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search-1024x536.png","type":"image\/png"}],"author":"Aditi Ahuja, Software Engineer","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Aditi Ahuja, Software Engineer","Est. 
reading time":"6 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/"},"author":{"name":"Aditi Ahuja, Software Engineer","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0"},"headline":"Semantic Similarity with Focused Selectivity","datePublished":"2025-05-22T16:54:28+00:00","dateModified":"2025-06-14T05:45:04+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/"},"wordCount":1005,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png","keywords":["knn"],"articleSection":["Best Practices and Tutorials","Full-Text Search","Geospatial","Search","Vector Search"],"inLanguage":"pt-BR","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/","url":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/","name":"Semantic Similarity with Focused Selectivity - The Couchbase 
Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png","datePublished":"2025-05-22T16:54:28+00:00","dateModified":"2025-06-14T05:45:04+00:00","description":"Descubra como a pr\u00e9-filtragem seletiva aprimora a pesquisa de similaridade sem\u00e2ntica no Couchbase, combinando consultas de vetores e metadados.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#breadcrumb"},"inLanguage":"pt-BR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/"]}]},{"@type":"ImageObject","inLanguage":"pt-BR","@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/05\/blog-semantic-search.png","width":2400,"height":1256,"caption":"blog-semantic-search selectivity"},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/semantic-similarity-with-focused-selectivity\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Semantic Similarity with Focused Selectivity"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"Blog do Couchbase","description":"Couchbase, o banco de dados 
NoSQL","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"pt-BR"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"Blog do Couchbase","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"pt-BR","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/efea1290226a380570a7a86c651090a0","name":"Aditi Ahuja, engenheira de software","image":{"@type":"ImageObject","inLanguage":"pt-BR","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/a3eb898818ce7bdfc1b89af35c10b1f5","url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","caption":"Aditi Ahuja, Software Engineer"},"url":"https:\/\/www.couchbase.com\/blog\/pt\/author\/aditi\/"}]}},"authors":[{"term_id":9962,"user_id":85141,"is_guest":0,"slug":"aditi","display_name":"Aditi Ahuja, Software Engineer","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/c7e80b3dd70704a52cc5d032f55449eb2bc253009a8495c7a53ea50a14a014a8?s=96&d=mm&r=g","author_category":"","last_name":"Ahuja, Software 
Engineer","first_name":"Aditi","job_title":"","user_url":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/posts\/17138","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/users\/85141"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/comments?post=17138"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/posts\/17138\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/media\/17148"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/media?parent=17138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/categories?post=17138"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/tags?post=17138"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/pt\/wp-json\/wp\/v2\/ppma_author?post=17138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}