
vec0: Support MMR (Maximal Marginal Relevance) reranking for KNN diversity #266

@MayCXC

Description

When embedding results cluster in vector space, KNN returns redundant near-duplicates. For example, querying a diary
index for "architectural patterns" returns five paraphrases of the same insight instead of covering different topics.

Maximal Marginal Relevance (MMR) is the standard solution from information retrieval. It balances relevance against
redundancy by penalizing candidates too similar to already-selected results:

MMR(d) = λ · relevance(d, query) - (1 - λ) · max_similarity(d, selected)

  • λ = 1.0 → pure relevance (identical to standard KNN)
  • λ = 0.0 → pure diversity
  • λ = 0.5–0.7 → typical balance
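The trade-off the formula encodes can be sketched in plain Python (function and parameter names here are illustrative, not part of vec0):

```python
def mmr_score(lam, relevance, max_sim_to_selected):
    """Maximal Marginal Relevance score for a single candidate.

    lam: trade-off in [0.0, 1.0]
    relevance: similarity of the candidate to the query
    max_sim_to_selected: highest similarity of the candidate to any
        already-selected result (0.0 when nothing is selected yet)
    """
    return lam * relevance - (1.0 - lam) * max_sim_to_selected

# lam = 1.0 reduces to pure relevance ranking:
assert mmr_score(1.0, 0.9, 0.8) == 0.9
# lam = 0.0 scores purely by (negated) redundancy:
assert mmr_score(0.0, 0.9, 0.8) == -0.8
```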

Originally described in Carbonell & Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries" (SIGIR 1998).
MMR is widely used in RAG pipelines, search engines, and recommendation systems to improve result diversity.

Proposed API

A new mmr_lambda hidden column on vec0, following the same pattern as k and distance:

-- Existing behavior (unchanged when mmr_lambda is absent)
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10;

-- MMR reranking: over-fetch candidates, then greedy-select k diverse results
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10 AND mmr_lambda = 0.7;

  • mmr_lambda accepts values in [0.0, 1.0]
  • The distance column still reports the original distance to the query vector (not the MMR score)
  • Internally over-fetches min(k * 5, 4096) candidates, then greedy-selects k using MMR
  • No effect on non-KNN queries or when mmr_lambda is omitted

Why it belongs in vec0

MMR reranking requires access to raw candidate vectors to compute pairwise similarities. Application-level
post-processing would need to re-fetch vectors by rowid and recompute distances, which is both slow and defeats the
purpose of having a vector index. Inside vec0Filter_knn, the vectors are already loaded during the KNN search phase —
the reranking step adds negligible overhead.

Implementation approach

  1. Add mmr_lambda as a hidden column (same pattern as k and distance)
  2. Recognize mmr_lambda = ? as a constraint in xBestIndex
  3. In xFilter KNN path: if mmr_lambda is present, over-fetch candidates then apply greedy MMR selection
  4. The greedy selection loop is O(k_target × k_internal × dims); for k=10, k_internal=50, dims=768 that is roughly 384,000 multiply-accumulates, i.e. microseconds on modern hardware
  5. Support all vector types (float, int8, bit) and all distance metrics (cosine, L2, L1)
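Steps 3–4 can be sketched in pure Python (this is an illustration of the greedy MMR loop, not the actual vec0 C code; cosine similarity stands in for whichever metric the index uses, and the over-fetched candidate list is assumed to be given):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr_select(query, candidates, k, lam):
    """Greedy MMR: pick up to k diverse results from over-fetched candidates.

    candidates: list of (rowid, vector) pairs from the KNN over-fetch.
    Returns the selected rowids in selection order.
    """
    relevance = [cosine_sim(query, vec) for _, vec in candidates]
    selected = []                      # indices into candidates
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_i, best_score = None, -math.inf
        for i in remaining:
            # Redundancy penalty: max similarity to anything already picked.
            max_sim = max(
                (cosine_sim(candidates[i][1], candidates[j][1]) for j in selected),
                default=0.0,
            )
            score = lam * relevance[i] - (1.0 - lam) * max_sim
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        remaining.remove(best_i)
    return [candidates[i][0] for i in selected]

# Rowids 1 and 2 are exact duplicates; rowid 3 is equally relevant but
# diverse. With lam = 0.5, MMR skips the duplicate in favor of rowid 3.
cands = [(1, [0.8, 0.6]), (2, [0.8, 0.6]), (3, [0.8, -0.6])]
print(mmr_select([1.0, 0.0], cands, 2, 0.5))  # → [1, 3]
```

With lam = 1.0 the penalty term vanishes and selection degenerates to relevance order, matching the "identical to standard KNN" bullet above.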

Composition

This implementation is already pushed to my fork; I will open a PR from it shortly.
