Filtered ANN Search With Composite Vector Indexes (Part 4)

This post is the fourth part of a multi-part series exploring composite vector indexing in Couchbase. If you missed the previous posts, be sure to catch up on Part 1, Part 2 and Part 3.

The series will cover:

Why composite vector indexes matter, including concepts, terminology, and developer motivation. A Smart Grocery Recommendation System will be used as a running example.
How composite vector indexes are implemented inside the Couchbase Indexing Service.
How ORDER BY pushdown works for composite vector queries.
Real-world performance behavior and benchmarking results.

Part 4: Performance Analysis of Composite Vector Indexes

Agentic applications and AI workloads increasingly require efficient vector search. Traditional approximate nearest neighbor (ANN) search systems can struggle at scale, with challenges such as memory consumption, index build times, and real-time update mechanisms.

Composite Vector Indexes (CVI) are designed for filtered ANN workloads, where scalar predicates reduce the candidate set before approximate vector search. For pure vector workloads at very large scale, Couchbase also provides Hyperscale Vector Indexes. For best practices, see the Couchbase documentation.

This post focuses on the performance behavior of Composite Vector Indexes for filtered ANN workloads. Building on the concepts and execution model introduced in Parts 1 through 3, we now look at how throughput and p95 latency change as scalar selectivity varies on large-scale datasets.

In this post, selectivity % refers to how much of the dataset remains relevant after the scalar portion of the query constrains the search space. Lower selectivity means a narrower slice of the dataset qualifies, which in turn reduces the amount of vector work the system must perform.
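As a rough illustration of this definition, the Python sketch below computes selectivity % for an equality predicate over a small in-memory list. The helper name `selectivity_percent` and the sample values are hypothetical; they only stand in for the scalar field of a real dataset.

```python
def selectivity_percent(values, predicate):
    """Return the percentage of values that satisfy the scalar predicate."""
    if not values:
        raise ValueError("values must not be empty")
    matching = sum(1 for v in values if predicate(v))
    return 100.0 * matching / len(values)


# Hypothetical scalar column: 1 in every 100 documents is 'eligible'.
scalars = ["eligible" if i % 100 == 0 else "other" for i in range(10_000)]

pct = selectivity_percent(scalars, lambda v: v == "eligible")

# Only the qualifying documents need vector distance work.
candidates = int(len(scalars) * pct / 100)
print(f"selectivity: {pct:.1f}%  candidate vectors: {candidates}")
```

At 1% selectivity, only 100 of the 10,000 sample documents remain candidates for the vector portion of the query.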

Build Performance

In an internal build benchmark, CVI built an index over 1 billion 128-dimensional vectors in about 7 hours. The result reflects both the indexing architecture and its use of modern hardware.

The build performance was measured on the following infrastructure:

Processor: 32-core AMD-EPYC-7643

Memory: 128GB RAM

Storage: Samsung PM1743 Enterprise SSD 15.36TB

Dataset: SIFT benchmark data

This result suggests that indexing on the order of a billion vectors is practical for production workloads.
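To put the build figure in perspective, the following back-of-the-envelope Python calculation converts it into an average ingest rate. It uses only the numbers quoted above; the per-core figure is a naive division that assumes all 32 cores were busy for the whole build, which the benchmark description does not state.

```python
# Figures quoted in the post: 1 billion vectors indexed in about 7 hours.
NUM_VECTORS = 1_000_000_000
BUILD_HOURS = 7
CORES = 32

build_seconds = BUILD_HOURS * 3600
vectors_per_second = NUM_VECTORS / build_seconds

# Naive per-core rate; assumes every core is fully used for the entire build.
vectors_per_second_per_core = vectors_per_second / CORES

print(f"average build rate: {vectors_per_second:,.0f} vectors/s")
print(f"naive per-core rate: {vectors_per_second_per_core:,.0f} vectors/s")
```

That works out to an average of a little under 40,000 vectors indexed per second.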

Query Performance: Speed and Precision Combined

CVI combines strong query throughput with a consistent recall level. Using the 100M SIFT dataset with SQ8 quantization and one leading scalar field, CVI held 75% recall@10 across the tested selectivity percentages. Throughput rose from 800 QPS at 100% selectivity to 2853 QPS at 1% selectivity, while p95 latency fell from 66 ms to 17 ms.
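For readers unfamiliar with the metric, recall@10 compares the 10 results returned by the approximate search with the true 10 nearest neighbors from an exact search. The Python sketch below shows one common way to compute it; the function name `recall_at_k` and the document IDs are hypothetical and are not taken from the benchmark harness.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors that the approximate top-k contains."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids[:k]) & truth) / len(truth)


# Hypothetical results for two queries (ground truth vs. ANN output).
exact_q1 = [f"doc-{i}" for i in range(10)]
approx_q1 = exact_q1[:8] + ["doc-41", "doc-77"]            # finds 8 of 10

exact_q2 = [f"doc-{i}" for i in range(10, 20)]
approx_q2 = exact_q2[:7] + ["doc-52", "doc-63", "doc-94"]  # finds 7 of 10

per_query = [recall_at_k(approx_q1, exact_q1), recall_at_k(approx_q2, exact_q2)]
mean_recall = sum(per_query) / len(per_query)
print(f"mean recall@10: {mean_recall:.2f}")
```

A benchmark normally reports the mean over many queries, as in the last two lines.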

Throughput improves as selectivity narrows

The throughput and latency curves tell the same story from two angles. Narrower scalar constraints reduce the amount of work flowing through the execution path, which improves both system throughput and tail latency. For applications that naturally include hard constraints such as category, brand, tenant, region, language, or compliance boundary, this behavior is exactly where Composite Vector Indexes become compelling.
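The two metrics discussed here can be derived from the same set of per-query measurements. The Python sketch below computes p95 latency with the nearest-rank method and throughput as completed queries per second of wall-clock time. The latency samples and the elapsed time are made up for illustration, and the post does not say which percentile method its benchmark harness uses.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds.
latencies_ms = [12, 14, 15, 15, 16, 17, 17, 18, 19, 20,
                21, 22, 24, 25, 27, 30, 34, 41, 58, 90]

# Hypothetical wall-clock time for the run (queries were issued concurrently).
elapsed_seconds = 0.5

p95_ms = percentile_nearest_rank(latencies_ms, 95)
qps = len(latencies_ms) / elapsed_seconds
print(f"p95 latency: {p95_ms} ms  throughput: {qps:.0f} QPS")
```

Because p95 ignores the slowest 5% of queries, a single outlier such as the 90 ms sample does not dominate the reported tail latency.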

Test Setup

Definition

CREATE INDEX `vector-idx` ON `bucket-1`.`_default`.`_default` (scalar, emb VECTOR) WITH {'dimension':128, 'similarity':'L2', 'description':'IVF,SQ8'}

Query

SELECT meta().id FROM `bucket-1`.`_default`.`_default`
WHERE scalar = 'eligible'
ORDER BY ANN_DISTANCE(emb, <query_vector>, 'L2', <nprobes>)
LIMIT 10

The scalar field is populated so that the predicate matches the target selectivity percentage, and <nprobes> is adjusted to reach the expected recall.
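One simple way to populate such a field is to label documents deterministically by position, so that a chosen percentage of them carries the value used in the WHERE clause. The Python sketch below assumes integer selectivity percentages; the helper name `scalar_for_doc` and the non-matching value 'other' are hypothetical. The post does not describe how the benchmark data was actually generated.

```python
def scalar_for_doc(doc_index, selectivity_pct):
    """Return 'eligible' for selectivity_pct out of every 100 documents."""
    if not 0 < selectivity_pct <= 100:
        raise ValueError("selectivity_pct must be in (0, 100]")
    return "eligible" if doc_index % 100 < selectivity_pct else "other"


# Label a hypothetical 100,000-document dataset at several selectivity levels.
NUM_DOCS = 100_000
matches = {}
for target_pct in (1, 10, 50, 100):
    labels = [scalar_for_doc(i, target_pct) for i in range(NUM_DOCS)]
    matches[target_pct] = labels.count("eligible")
    print(f"target {target_pct:>3}%: {matches[target_pct]} eligible documents")
```

The query stays the same at every selectivity level; only the data distribution changes.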

Why the Curves Look This Way

CVI’s performance is influenced by several architectural features:

Order-aware scanning
- CVI uses an order-aware scan pipeline that leverages scalar predicates combined with vector similarity search, enabling efficient access patterns and minimizing I/O operations.
Parallel processing architecture
- The system uses parallelism across centroids, allowing multiple scan workers to operate concurrently on different partitions of the vector space.
SIMD-accelerated distance computation
- CVI uses SIMD operations through the FAISS library to accelerate similarity evaluations and minimize computational overhead.
HNSW routing layer
- The Hierarchical Navigable Small World (HNSW) routing layer enables identification of relevant centroids, reducing the search space.
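To make the interaction between scalar filtering, centroid probing, and distance computation concrete, here is a toy filtered IVF search in pure Python. It is a teaching sketch, not Couchbase's implementation: it has no quantization, SIMD, HNSW routing, or parallelism, it ranks centroids by brute force, and every name in it (`build_ivf`, `filtered_ann`, the document layout) is hypothetical. It only shows why a selective scalar predicate reduces the number of distance computations for a given nprobes.

```python
import heapq
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(docs, centroids):
    """Assign every document to the inverted list of its nearest centroid."""
    lists = {cid: [] for cid in range(len(centroids))}
    for doc in docs:
        nearest = min(range(len(centroids)),
                      key=lambda cid: l2_squared(doc["emb"], centroids[cid]))
        lists[nearest].append(doc)
    return lists


def filtered_ann(lists, centroids, query, scalar, nprobes, k):
    """Probe the nprobes closest centroids.

    Returns (top-k document ids, number of distance computations).
    Pass scalar=None to disable the scalar filter.
    """
    probe_order = sorted(range(len(centroids)),
                         key=lambda cid: l2_squared(query, centroids[cid]))
    scored = []
    for cid in probe_order[:nprobes]:
        for doc in lists[cid]:
            # Apply the scalar predicate first so rejected docs cost no distance work.
            if scalar is not None and doc["scalar"] != scalar:
                continue
            scored.append((l2_squared(query, doc["emb"]), doc["id"]))
    top = heapq.nsmallest(k, scored)
    return [doc_id for _, doc_id in top], len(scored)


# Hypothetical data: 2,000 two-dimensional vectors around 4 centroids, 5% eligible.
random.seed(7)
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
docs = []
for i in range(2000):
    cx, cy = centroids[i % 4]
    docs.append({
        "id": f"doc-{i}",
        "emb": (cx + random.uniform(-2, 2), cy + random.uniform(-2, 2)),
        "scalar": "eligible" if i % 80 < 4 else "other",
    })

lists = build_ivf(docs, centroids)
query = (1.0, 1.0)
ids, work_filtered = filtered_ann(lists, centroids, query, "eligible", nprobes=2, k=10)
_, work_unfiltered = filtered_ann(lists, centroids, query, None, nprobes=2, k=10)
print(f"distance computations: filtered={work_filtered} unfiltered={work_unfiltered}")
```

With the same nprobes, the filtered query evaluates far fewer distances than the unfiltered one, which is the effect behind the throughput and latency trends described earlier.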

Example Applications

CVI’s performance characteristics are applicable to a range of use cases:

E-commerce and product recommendations
- Product similarity search with price, brand, and category filters
Content discovery and search
- Document and media similarity search with metadata constraints
Fraud detection and risk assessment
- Anomaly detection in transaction patterns with temporal constraints
Personalized marketing
- Customer segmentation and targeted recommendations

Conclusion

The first three parts of this series explained why Composite Vector Indexes matter, how they are implemented, and how they enable flexible ORDER BY pushdown for mixed scalar-plus-vector queries. This final part shows the performance payoff of that design.

On the 100M SIFT benchmark with SQ8 quantization, throughput increased from 800 QPS at 100% selectivity to 2853 QPS at 1% selectivity, while p95 latency improved from 66 ms to 17 ms. In a separate internal build benchmark, Composite Vector Indexes built an index over 1 billion 128-dimensional vectors in about 7 hours on modern commodity server hardware.
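The relative gains implied by these figures can be checked with a few lines of Python. The calculation uses only the numbers quoted in the paragraph above; the variable names are illustrative.

```python
# Figures quoted in the post for the 100M SIFT benchmark with SQ8 quantization.
qps_at_100pct, qps_at_1pct = 800, 2853
p95_ms_at_100pct, p95_ms_at_1pct = 66, 17

throughput_gain = qps_at_1pct / qps_at_100pct      # how many times more QPS
latency_gain = p95_ms_at_100pct / p95_ms_at_1pct   # how many times lower p95

print(f"throughput: {throughput_gain:.2f}x higher")
print(f"p95 latency: {latency_gain:.2f}x lower")
```

Going from 100% to 1% selectivity therefore gives roughly 3.6 times the throughput and roughly 3.9 times lower p95 latency in this benchmark.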

For filtered ANN workloads, that is the core value proposition of Composite Vector Indexes: they let applications combine scalar constraints and semantic similarity in one index structure, while still delivering strong throughput and low tail latency at scale.


Author

Posted by Sai Kommaraju, Senior Software Engineer
