Question 1

What is the difference between HNSW and IVF indexes for approximate nearest neighbor search?

Accepted Answer

HNSW (Hierarchical Navigable Small World) is a graph-based index that builds a layered proximity graph. Search starts at the top layer (few nodes, long-range connections) and greedily descends through the layers, finishing with a wider search on the dense bottom layer for precision. In practice query time scales roughly logarithmically with N while keeping recall high. HNSW requires no training step and handles incremental inserts well, but it consumes significant memory because each node stores up to M neighbor links per layer (implementations commonly allow up to 2M on the bottom layer) in addition to the vector itself. IVF (Inverted File) partitions the vector space into K clusters using k-means, stores one posting list of vectors per cluster, and at query time scans only the posting lists of the nprobe clusters whose centroids are nearest to the query. IVF has lower memory overhead and is typically cheaper to build, which makes it attractive for very large, mostly static datasets, but it requires a training pass (k-means) on representative data, and its recall degrades when the query's true neighbors are spread across more clusters than nprobe covers.
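To make the IVF half of this answer concrete, here is a minimal NumPy sketch of the cluster-then-probe idea. The function names (build_ivf, ivf_search) and the fixed iteration count are assumptions made for illustration; a real system would use a library index rather than this toy k-means.

```python
import numpy as np

def _nearest_centroid(vectors, centroids):
    # Squared L2 distance from every vector to every centroid.
    d = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def build_ivf(vectors, n_clusters, n_iters=10, seed=0):
    """Toy IVF build: k-means centroids plus one posting list per cluster."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(vectors), n_clusters, replace=False)
    centroids = vectors[start].copy()
    for _ in range(n_iters):
        assign = _nearest_centroid(vectors, centroids)
        for c in range(n_clusters):
            members = vectors[assign == c]
            if len(members):  # keep the old centroid if a cluster empties out
                centroids[c] = members.mean(axis=0)
    assign = _nearest_centroid(vectors, centroids)
    posting_lists = [np.flatnonzero(assign == c) for c in range(n_clusters)]
    return centroids, posting_lists

def ivf_search(query, vectors, centroids, posting_lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    centroid_dist = ((centroids - query) ** 2).sum(axis=1)
    probe = np.argsort(centroid_dist)[:nprobe]
    candidates = np.concatenate([posting_lists[c] for c in probe])
    dist = ((vectors[candidates] - query) ** 2).sum(axis=1)
    return candidates[np.argsort(dist)[:k]]
```

Setting nprobe equal to the number of clusters scans every posting list and so matches an exact search; lowering nprobe trades recall for speed, which is the degradation described above.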

Question 2

What is the difference between pre-filter and post-filter hybrid search?

Accepted Answer

In hybrid search (vector ANN plus a metadata filter), pre-filter applies the metadata filter first and then searches only the matching subset. Every returned result satisfies the filter, and an exact (brute-force) scan of the subset gives full recall within it. Running the ANN index over a small subset works poorly, though: the index was built on the full dataset, and when most nodes are excluded the graph traversal has few valid paths to follow. Post-filter runs ANN on the full index first (retrieving the top-K candidates) and then discards candidates that fail the filter. Post-filter is fast but may return fewer than K results if many candidates fail the filter. A common pattern is to use pre-filter with a brute-force scan when the filter is highly selective (for example, matching less than 1% of the data), and post-filter with over-fetching (retrieve top-K * multiplier candidates) when the filter matches a large share of the data.
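The two strategies and the selectivity-based choice between them can be sketched as follows. This is an illustration under stated assumptions, not a production design: the exact top_k helper stands in for a real ANN call, the filter is a boolean mask over rows, and the names, the 1% threshold parameter, and the default multiplier are all hypothetical.

```python
import numpy as np

def top_k(query, vectors, ids, k):
    """Exact top-k by dot product over the given row ids (ANN stand-in)."""
    scores = vectors[ids] @ query
    return ids[np.argsort(-scores)[:k]]

def pre_filter_search(query, vectors, mask, k):
    # Filter first, then brute-force over only the matching rows.
    return top_k(query, vectors, np.flatnonzero(mask), k)

def post_filter_search(query, vectors, mask, k, multiplier=4):
    # Search everything first, over-fetching k * multiplier candidates,
    # then drop the candidates that fail the metadata filter.
    all_ids = np.arange(len(vectors))
    candidates = top_k(query, vectors, all_ids, k * multiplier)
    return candidates[mask[candidates]][:k]

def hybrid_search(query, vectors, mask, k, threshold=0.01):
    # Fraction of rows passing the filter; a small fraction means pre-filter.
    if mask.mean() < threshold:
        return pre_filter_search(query, vectors, mask, k)
    return post_filter_search(query, vectors, mask, k)
```

With a mask that matches only a handful of rows, post_filter_search can come back with fewer than k ids, while pre_filter_search returns up to k matching ids as long as that many exist.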

Question 3

When should you use cosine similarity versus dot product versus L2 distance?

Accepted Answer

Cosine similarity measures the angle between vectors, ignoring magnitude. Use it when embedding magnitude is not meaningful, which is the case for most text embedding models (OpenAI, Cohere), whose embeddings are either returned at unit length or are commonly L2-normalized before storage. Dot product is equivalent to cosine similarity when vectors are unit-normalized, but when vectors are NOT normalized it rewards both direction alignment and magnitude. Recommendation models sometimes use un-normalized dot product to encode item popularity in magnitude. L2 (Euclidean) distance measures absolute spatial distance and is appropriate for embeddings where magnitude encodes quantity (e.g., some image embeddings or embeddings from models that do not normalize output). On unit-length vectors all three metrics produce the same ranking, so the choice only matters for un-normalized data. When in doubt, normalize embeddings and use cosine or dot product.
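A small numeric check makes the difference visible. The vectors below are made up for illustration: b points in the same direction as a but is ten times longer, so cosine treats them as identical while dot product and L2 both react to the magnitude.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def l2_distance(a, b):
    return float(np.linalg.norm(a - b))

def normalize(v):
    return v / np.linalg.norm(v)

a = np.array([1.0, 2.0, 2.0])
b = 10.0 * a  # same direction, ten times the magnitude

cos_ab = cosine_similarity(a, b)  # about 1.0: direction is identical
dot_ab = float(a @ b)             # 90.0: grows with magnitude
l2_ab = l2_distance(a, b)         # 27.0: non-zero despite identical direction

# After unit-normalization, dot product and cosine similarity agree.
u = normalize(np.array([3.0, 4.0]))
v = normalize(np.array([4.0, 3.0]))
dot_uv = float(u @ v)
cos_uv = cosine_similarity(u, v)
```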

Question 4

How do you handle re-indexing when the embedding model changes?

Accepted Answer

When an embedding model changes, all existing vectors are incompatible with new query vectors because the embedding space is different. The correct approach is: (1) create a new namespace or index for the new model version; (2) re-embed all entities using the new model in batches (this is the expensive step); (3) write new vectors into the new namespace; (4) once re-embedding is complete and verified, update the query routing to use the new namespace; (5) delete the old namespace once you no longer need it as a rollback target. Model version metadata should be stored alongside each vector record so you can identify which records still need re-embedding. For large corpora, run both indexes in parallel during the transition and keep serving queries from the old index, embedded with the old model, until the new index is complete. A query vector must always come from the same model as the index it searches, because similarity scores across two embedding spaces are not meaningful.
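The five steps can be sketched with an in-memory stand-in for the vector store. Everything here is hypothetical scaffolding (the Namespace and Router classes, the migrate function, the embed_new callable); it only illustrates the ordering of the steps, and the completeness check stands in for whatever verification a real migration would run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Vector = List[float]

@dataclass
class Namespace:
    model_version: str  # stored with the records so stale vectors are identifiable
    records: Dict[str, Vector] = field(default_factory=dict)

@dataclass
class Router:
    namespaces: Dict[str, Namespace] = field(default_factory=dict)
    active: str = ""  # model version that queries are currently routed to

def migrate(router: Router, entities: Dict[str, str],
            embed_new: Callable[[List[str]], List[Vector]],
            new_version: str, batch_size: int = 2) -> None:
    old_version = router.active
    # (1) Create a namespace for the new model version.
    target = Namespace(model_version=new_version)
    router.namespaces[new_version] = target
    # (2) + (3) Re-embed in batches and write into the new namespace.
    ids = list(entities)
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        vectors = embed_new([entities[i] for i in batch])
        for entity_id, vector in zip(batch, vectors):
            target.records[entity_id] = vector
    # (4) Verify, then switch routing. Queries stay on the old index until here.
    if set(target.records) != set(entities):
        raise RuntimeError("new namespace is incomplete; keeping old routing")
    router.active = new_version
    # (5) Drop the old namespace (a real system might keep it for rollback first).
    router.namespaces.pop(old_version, None)
```

Because routing only flips after the completeness check, a failed or partial re-embedding run leaves queries on the old, fully populated namespace.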

Question 5

How does the HNSW algorithm enable approximate nearest neighbor search?

Accepted Answer

HNSW builds a layered graph where upper layers have sparse, long-range connections for fast navigation and the bottom layer contains every node with dense local connections for precision. A search starts at the top layer and greedily descends, using the closest node found on each layer as the entry point to the next; on the bottom layer it runs a beam search and returns the K closest nodes found.
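The search procedure can be sketched over a prebuilt graph. This is a simplified illustration, not a full HNSW implementation: it omits graph construction, assumes each layer is a dict from node id to neighbor ids with layers[0] as the bottom layer, and uses ef as the name for the beam width (a common convention, assumed here).

```python
import heapq
import numpy as np

def greedy_step(query, vectors, graph, start):
    """Move to the closest neighbor repeatedly until no neighbor improves."""
    current = start
    current_dist = float(np.linalg.norm(vectors[current] - query))
    improved = True
    while improved:
        improved = False
        for n in graph.get(current, []):
            d = float(np.linalg.norm(vectors[n] - query))
            if d < current_dist:
                current, current_dist, improved = n, d, True
    return current

def beam_search(query, vectors, graph, entry, ef):
    """Best-first search that keeps the ef closest nodes seen so far."""
    def dist(i):
        return float(np.linalg.norm(vectors[i] - query))
    visited = {entry}
    candidates = [(dist(entry), entry)]  # min-heap: closest candidate first
    best = [(-dist(entry), entry)]       # max-heap via negation: worst kept first
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break  # closest open candidate is worse than the worst kept result
        for n in graph.get(node, []):
            if n in visited:
                continue
            visited.add(n)
            dn = dist(n)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, n))
                heapq.heappush(best, (-dn, n))
                if len(best) > ef:
                    heapq.heappop(best)
    return [i for _, i in sorted((-neg, i) for neg, i in best)]

def hnsw_search(query, vectors, layers, entry_point, k, ef=10):
    current = entry_point
    for graph in reversed(layers[1:]):  # upper layers: greedy descent only
        current = greedy_step(query, vectors, graph, current)
    return beam_search(query, vectors, layers[0], current, max(ef, k))[:k]
```

A larger ef widens the bottom-layer search, which generally raises recall at the cost of more distance computations.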

Question 6

What is the difference between pre-filter and post-filter in hybrid search?

Accepted Answer

Pre-filter applies metadata filters first, reducing the vector search space to the qualifying subset; post-filter performs ANN search first and then filters the results by metadata. Pre-filter never returns a non-matching result, but traversing the ANN index over a very small filtered set may miss matches, so small subsets are better scanned exhaustively. Post-filter may need to fetch more than K candidates to still have K results after filtering.

Question 7

Why is cosine similarity preferred over L2 for text embeddings?

Accepted Answer

Cosine similarity measures the angle between vectors, making it invariant to vector magnitude; text embeddings from different sentences may have different norms but similar directions; L2 distance conflates directional similarity with magnitude differences.
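The way L2 mixes direction and magnitude follows from expanding the squared distance, where \theta is the angle between a and b:

```latex
\|a - b\|^2 = \|a\|^2 + \|b\|^2 - 2\,\|a\|\,\|b\|\cos\theta
```

The norms appear in every term, so two vectors with the same direction (\cos\theta = 1) still have a non-zero distance whenever their norms differ. When both vectors have unit length the expression reduces to 2 - 2\cos\theta, so L2 and cosine similarity rank neighbors identically.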

Question 8

How is re-indexing handled when the embedding model changes?

Accepted Answer

All entities must be re-embedded with the new model and re-indexed; a new index namespace is created, populated, and validated before traffic is switched; the old namespace is kept as a fallback during migration.

Vector Database Low-Level Design: Embeddings Storage, ANN Search, and Hybrid Filtering

Embedding Storage

HNSW Index: Structure and Parameters

IVF Index: Cluster Centroids and Posting Lists

Hybrid Search: Pre-filter vs Post-filter

Distance Metrics

SQL DDL

Python: Core Operations

Design Considerations Summary