Vector Search

Filtered ANN Search With Composite Vector Indexes (Part 4)

This post is the fourth part of a multi-part series exploring composite vector indexing in Couchbase. If you missed the previous posts, be sure to catch up on Part 1, Part 2, and Part 3.

The series covers the following topics:

  1. Why composite vector indexes matter, including concepts, terminology, and developer motivation. A smart grocery recommendation system is used as a running example.
  2. How composite vector indexes are implemented inside the Couchbase Indexing Service.
  3. How ORDER BY pushdown works for composite vector queries.
  4. Real-world behavior and benchmark results.

Part 4: Performance Analysis of Composite Vector Indexes

Agentic applications and AI workloads increasingly require efficient vector search. Traditional approximate nearest neighbor (ANN) search systems can struggle at scale, with challenges such as memory consumption, index build times, and real-time update mechanisms.

Composite Vector Indexes (CVI) are designed for filtered ANN workloads, where scalar predicates reduce the candidate set before approximate vector search. For pure vector workloads at very large scale, Couchbase also provides Hyperscale Vector Indexes. For best practices, see our documentation here.

This post focuses on the performance behavior of Composite Vector Indexes for filtered ANN workloads. Building on the concepts and execution model introduced in Parts 1 through 3, we now look at how throughput and p95 latency change as scalar selectivity varies on large-scale datasets.

In this post, selectivity % refers to how much of the dataset remains relevant after the scalar portion of the query constrains the search space. Lower selectivity means a narrower slice of the dataset qualifies, which in turn reduces the amount of vector work the system must perform.
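To make the definition concrete, here is a minimal sketch that measures selectivity as the percentage of documents passing a scalar predicate. The document shape, the `category` field, and the helper name `selectivity_percent` are assumptions made for illustration; they are not taken from the benchmark.

```python
def selectivity_percent(docs, predicate):
    """Return the percentage of docs that satisfy the scalar predicate."""
    if not docs:
        return 0.0
    matching = sum(1 for doc in docs if predicate(doc))
    return 100.0 * matching / len(docs)


# Hypothetical catalog: 1 in every 100 products is in the "snacks" category.
docs = [{"id": i, "category": "snacks" if i % 100 == 0 else "other"}
        for i in range(10_000)]

print(selectivity_percent(docs, lambda d: d["category"] == "snacks"))  # 1.0
print(selectivity_percent(docs, lambda d: True))                       # 100.0
```

At 1% selectivity only 100 of the 10,000 documents remain candidates for the vector comparison, while at 100% every document does.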

Build Performance

In an internal build benchmark, CVI built an index over 1 billion 128-dimensional vectors in about 7 hours. The result reflects both the indexing architecture and its use of modern hardware.

The build performance was measured on the following infrastructure:

Processor: 32-core AMD-EPYC-7643

Memory: 128GB RAM

Storage: Samsung PM1743 Enterprise SSD 15.36TB

Dataset: SIFT benchmark data

This shows that indexing billions of vectors for production workloads is practical.
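As a rough back-of-the-envelope check, the figures above can be turned into an average ingest rate. The sketch below treats the build time as exactly 7 hours and assumes all 32 cores contributed evenly, which is a simplification and not something the benchmark states.

```python
vectors = 1_000_000_000   # 1 billion vectors, from the benchmark above
build_hours = 7           # assumption: treated as exactly 7 hours
cores = 32                # from the hardware list above

vectors_per_second = vectors / (build_hours * 3600)
vectors_per_second_per_core = vectors_per_second / cores  # assumes even use of all cores

print(f"{vectors_per_second:,.0f} vectors/s overall")            # 39,683 vectors/s overall
print(f"{vectors_per_second_per_core:,.0f} vectors/s per core")  # 1,240 vectors/s per core
```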

Query Performance: Speed and Precision Combined

CVI is built to keep recall steady while queries get faster. Using the 100M SIFT dataset with SQ8 quantization and one leading scalar field, CVI achieved 75% recall@10 across the tested selectivity percentages, with throughput and p95 latency measured at each.
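For readers unfamiliar with the metric, recall@10 compares the ten results an approximate search returns against the ten true nearest neighbors from an exact search. The following is a minimal sketch of that calculation; the function name and the sample ID lists are made up for illustration and do not come from the benchmark harness.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors that the ANN result also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    hits = len(truth & set(approx_ids[:k]))
    return hits / len(truth)


# Illustrative IDs only: the approximate result misses two true neighbors.
exact = [3, 8, 15, 21, 34, 42, 57, 63, 78, 91]
approx = [3, 8, 15, 21, 34, 42, 57, 100, 101, 63]

print(recall_at_k(approx, exact))  # 0.8
```

In a benchmark, this value is typically averaged over many queries.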

Throughput improves as selectivity narrows

The throughput and latency curves tell the same story from two angles. Narrower scalar constraints reduce the amount of work flowing through the execution path, which improves both system throughput and tail behavior. For applications that naturally include hard constraints such as category, brand, tenant, region, language, or compliance boundary, this behavior is exactly where Composite Vector Indexes become compelling.
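The two metrics discussed here can be computed from raw per-query timings. The sketch below uses the nearest-rank definition of a percentile, which is one common convention; the benchmark's exact percentile method is not stated, and the latency samples are synthetic.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def throughput_qps(completed_queries, elapsed_seconds):
    """Queries completed per second of wall-clock time."""
    return completed_queries / elapsed_seconds


# Synthetic latencies in milliseconds (illustrative only, not benchmark data).
latencies_ms = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20,
                21, 22, 22, 23, 25, 27, 30, 34, 41, 90]

print(percentile_nearest_rank(latencies_ms, 95))  # 41
print(percentile_nearest_rank(latencies_ms, 50))  # 20
print(throughput_qps(len(latencies_ms), 0.5))     # 40.0
```

Reporting p95 rather than the mean or the maximum keeps the focus on tail behavior without letting a single outlier (the 90 ms sample here) dominate.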

Test Setup

Definition

Query

The scalar field is populated in the data as needed for the selectivity, and <nprobes> is adjusted to reach the expected recall.
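The two knobs described above can be sketched in a few lines. This is an illustration under stated assumptions, not the benchmark harness: `assign_scalar` and `tune_nprobes` are hypothetical helpers, the field name `scalar` is invented, and the recall table is a synthetic stand-in for running real queries.

```python
def assign_scalar(doc_id, selectivity_pct):
    """Tag a document so an equality filter on 1 matches selectivity_pct% of ids."""
    return 1 if doc_id % 100 < selectivity_pct else 0


def tune_nprobes(measure_recall, target_recall, candidates):
    """Return the smallest candidate nprobes whose measured recall meets the target."""
    for nprobes in sorted(candidates):
        if measure_recall(nprobes) >= target_recall:
            return nprobes
    return None  # no candidate reached the target


# Populate the scalar field for a 10% selectivity run.
docs = [{"id": i, "scalar": assign_scalar(i, 10)} for i in range(1_000)]
print(sum(d["scalar"] for d in docs))  # 100

# Synthetic stand-in for a real recall measurement (NOT benchmark data).
fake_recall = {1: 0.40, 2: 0.55, 4: 0.68, 8: 0.77, 16: 0.85}
print(tune_nprobes(fake_recall.get, 0.75, fake_recall))  # 8
```

Picking the smallest nprobes that meets the recall target keeps the comparison fair across selectivities, since probing more centroids than needed would cost throughput without changing the reported recall level.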

Why the Curves Look This Way

CVI’s performance is influenced by several architectural features:

  1. Order-aware scanning
    • CVI uses an order-aware scan pipeline that leverages scalar predicates combined with vector similarity search, enabling efficient access patterns and minimizing I/O operations.
  2. Parallel processing architecture
    • The system uses parallelism across centroids, allowing multiple scan workers to operate concurrently on different partitions of the vector space.
  3. SIMD-accelerated distance computation
    • CVI uses SIMD operations through the FAISS library to accelerate similarity evaluations and minimize computational overhead.
  4. HNSW routing layer
    • The Hierarchical Navigable Small World (HNSW) routing layer enables identification of relevant centroids, reducing the search space.

Example Applications

CVI’s performance characteristics are applicable to a range of use cases:

  1. E-commerce and product recommendations
    • Product similarity search with price, brand, and category filters
  2. Content discovery and search
    • Document and media similarity search with metadata constraints
  3. Fraud detection and risk assessment
    • Anomaly detection in transaction patterns with temporal constraints
  4. Personalized marketing
    • Customer segmentation and targeted recommendations

Conclusion

The first three parts of this series explained why Composite Vector Indexes matter, how they are implemented, and how they enable flexible ORDER BY pushdown for mixed scalar-plus-vector queries. This final part shows the performance payoff of that design.

On the 100M SIFT benchmark with SQ8 quantization, throughput increased from 800 QPS at 100% selectivity to 2853 QPS at 1% selectivity, while p95 latency improved from 66 ms to 17 ms. In a separate internal build benchmark, Composite Vector Indexes built an index over 1 billion 128-dimensional vectors in about 7 hours on modern commodity server hardware.
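Expressed as ratios, the reported endpoints work out as follows. The snippet only restates the numbers from the paragraph above; the variable names are ours.

```python
qps_full, qps_narrow = 800, 2853        # 100% vs 1% selectivity
p95_full_ms, p95_narrow_ms = 66, 17     # 100% vs 1% selectivity

throughput_gain = qps_narrow / qps_full
latency_reduction_pct = (1 - p95_narrow_ms / p95_full_ms) * 100

print(f"throughput gain: {throughput_gain:.2f}x")              # throughput gain: 3.57x
print(f"p95 latency reduction: {latency_reduction_pct:.0f}%")  # p95 latency reduction: 74%
```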

For filtered ANN workloads, that is the core value proposition of Composite Vector Indexes: they let applications combine scalar constraints and semantic similarity in one index structure, while still delivering strong throughput and low tail latency at scale.


Author

Posted by Sai Kommaraju
