
vec0: Support MMR (Maximal Marginal Relevance) reranking for KNN diversity #266

@MayCXC

Description

When embedding results cluster in vector space, KNN returns redundant near-duplicates. For example, querying a diary
index for "architectural patterns" returns five paraphrases of the same insight instead of covering different topics.

Maximal Marginal Relevance (MMR) is the standard solution from information retrieval. It balances relevance against
redundancy by penalizing candidates too similar to already-selected results:

MMR(d) = λ · relevance(d, query) - (1 - λ) · max_similarity(d, selected)

  • λ = 1.0 → pure relevance (identical to standard KNN)
  • λ = 0.0 → pure diversity
  • λ = 0.5–0.7 → typical balance
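The trade-off the formula encodes can be sketched in plain Python (function and parameter names here are illustrative, not part of vec0):

```python
def mmr_score(lam, relevance, max_sim_to_selected):
    """Maximal Marginal Relevance score for a single candidate.

    lam: trade-off in [0.0, 1.0]
    relevance: similarity of the candidate to the query
    max_sim_to_selected: highest similarity of the candidate to any
        already-selected result (0.0 when nothing is selected yet)
    """
    return lam * relevance - (1.0 - lam) * max_sim_to_selected

# lam = 1.0 reduces to pure relevance ranking:
assert mmr_score(1.0, 0.9, 0.8) == 0.9
# lam = 0.0 scores purely by (negated) redundancy:
assert mmr_score(0.0, 0.9, 0.8) == -0.8
```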

Originally described in Carbonell & Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries" (SIGIR 1998).
MMR is widely used in RAG pipelines, search engines, and recommendation systems to improve result diversity.

Proposed API

A new mmr_lambda hidden column on vec0, following the same pattern as k and distance:

-- Existing behavior (unchanged when mmr_lambda is absent)
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10;

-- MMR reranking: over-fetch candidates, then greedy-select k diverse results
SELECT rowid, distance FROM vec_items
WHERE embedding MATCH ? AND k = 10 AND mmr_lambda = 0.7;

  • mmr_lambda accepts values in [0.0, 1.0]
  • The distance column still reports the original distance to the query vector (not the MMR score)
  • Internally over-fetches min(k * 5, 4096) candidates, then greedy-selects k using MMR
  • No effect on non-KNN queries or when mmr_lambda is omitted

Why it belongs in vec0

MMR reranking requires access to raw candidate vectors to compute pairwise similarities. Application-level
post-processing would need to re-fetch vectors by rowid and recompute distances, which is both slow and defeats the
purpose of having a vector index. Inside vec0Filter_knn, the vectors are already loaded during the KNN search phase —
the reranking step adds negligible overhead.

Implementation approach

  1. Add mmr_lambda as a hidden column (same pattern as k and distance)
  2. Recognize mmr_lambda = ? as a constraint in xBestIndex
  3. In xFilter KNN path: if mmr_lambda is present, over-fetch candidates then apply greedy MMR selection
  4. The greedy selection loop is O(k_target × k_internal × dims); for k=10, k_internal=50, dims=768 that is roughly 384,000 multiply-accumulates, i.e. microseconds on modern hardware
  5. Support all vector types (float, int8, bit) and all distance metrics (cosine, L2, L1)
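Steps 3–4 can be sketched in pure Python (this is an illustration of the greedy MMR loop, not the actual vec0 C code; cosine similarity stands in for whichever metric the index uses, and the over-fetched candidate list is assumed to be given):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr_select(query, candidates, k, lam):
    """Greedy MMR: pick up to k diverse results from over-fetched candidates.

    candidates: list of (rowid, vector) pairs from the KNN over-fetch.
    Returns the selected rowids in selection order.
    """
    relevance = [cosine_sim(query, vec) for _, vec in candidates]
    selected = []                      # indices into candidates
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_i, best_score = None, -math.inf
        for i in remaining:
            # Redundancy penalty: max similarity to anything already picked.
            max_sim = max(
                (cosine_sim(candidates[i][1], candidates[j][1]) for j in selected),
                default=0.0,
            )
            score = lam * relevance[i] - (1.0 - lam) * max_sim
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        remaining.remove(best_i)
    return [candidates[i][0] for i in selected]

# Rowids 1 and 2 are exact duplicates; rowid 3 is equally relevant but
# diverse. With lam = 0.5, MMR skips the duplicate in favor of rowid 3.
cands = [(1, [0.8, 0.6]), (2, [0.8, 0.6]), (3, [0.8, -0.6])]
print(mmr_select([1.0, 0.0], cands, 2, 0.5))  # → [1, 3]
```

With lam = 1.0 the penalty term vanishes and selection degenerates to relevance order, matching the "identical to standard KNN" bullet above.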

Composition

This implementation is already pushed to my fork; I will open a PR from it shortly.
