-
Notifications
You must be signed in to change notification settings - Fork 289
Description
When embedding results cluster in vector space, KNN returns redundant near-duplicates. For example, querying a diary
index for "architectural patterns" returns five paraphrases of the same insight instead of covering different topics.
Maximal Marginal Relevance (MMR) is the standard solution from information retrieval. It balances relevance against
redundancy by penalizing candidates too similar to already-selected results:
MMR(d) = λ · relevance(d, query) - (1 - λ) · max_similarity(d, selected)
λ = 1.0→ pure relevance (identical to standard KNN)λ = 0.0→ pure diversityλ = 0.5–0.7→ typical balance
Originally described in Carbonell & Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and
Producing Summaries" (SIGIR 1998).
MMR is widely used in RAG pipelines, search engines, and recommendation systems to improve result diversity.
Proposed API
A new mmr_lambda hidden column on vec0, following the same pattern as k and distance:
-- Existing behavior (unchanged when mmr_lambda is absent)
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10;
-- MMR reranking: over-fetch candidates, then greedy-select k diverse results
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10 AND mmr_lambda = 0.7;- mmr_lambda accepts [0.0, 1.0]
- distance column still reports original distance to query vector (not the MMR score)
- Internally over-fetches min(k * 5, 4096) candidates, then greedy-selects k using MMR
- No effect on non-KNN queries or when mmr_lambda is omitted
Why it belongs in vec0
MMR reranking requires access to raw candidate vectors to compute pairwise similarities. Application-level
post-processing would need to re-fetch vectors by rowid and recompute distances, which is both slow and defeats the
purpose of having a vector index. Inside vec0Filter_knn, the vectors are already loaded during the KNN search phase —
the reranking step adds negligible overhead.
Implementation approach
- Add mmr_lambda as a hidden column (same pattern as k and distance)
- Recognize mmr_lambda = ? as a constraint in xBestIndex
- In xFilter KNN path: if mmr_lambda is present, over-fetch candidates then apply greedy MMR selection
- The greedy selection loop is O(k_target × k_internal × dims) — for k=10, k_internal=50, dims=768: microseconds
- Support all vector types (float, int8, bit) and all distance metrics (cosine, L2, L1)
Composition
- Works with distance constraints from vec0: Support constraints on
distancecolumn for pagination/custom thresholds #165 / WIP: Support constaints ondistancecolumn in KNN queries, for pagination and range queries #166 — distance filtering happens during KNN, MMR reranks the filtered
results - Works with partition key constraints — partitioning narrows candidates before KNN+MMR
- mmr_lambda = 1.0 is a no-op (pure relevance = standard KNN), so the feature degrades gracefully
This implementation is already pushed to fork, which I will send a PR from shortly.