Replace aggregated payload RocksDB tables with in-memory circular buffers #182

pablodeymo wants to merge 1 commit into main from
Conversation
…fers and add slot-age pruning for attestation data

Aggregated payloads (new and known) are now stored in VecDeque-based circular buffers with a 4096-entry hard cap and FIFO eviction, matching Lantern's approach. This prevents unbounded memory growth when finalization stalls (~11.5 GB OOM after ~27 min). The two RocksDB tables (LatestNewAggregatedPayloads and LatestKnownAggregatedPayloads) are removed entirely, since aggregated payloads are ephemeral and rebuilt from gossip on restart. Additionally, GossipSignatures and AttestationDataByRoot are now pruned by slot age (64-slot retention window) every slot at interval 4, independent of finalization progress.
🤖 Kimi Code Review

Review Summary

This PR replaces RocksDB-backed aggregated payload storage with in-memory circular buffers (…)

Key Strengths

Minor Issues

1. Potential race condition in `promote_new_aggregated_payloads`:

```rust
pub fn promote_new_aggregated_payloads(&mut self) {
    // Use drain + extend to make this atomic
    let drained = self.new_payloads.entries.drain(..).collect::<Vec<_>>();
    self.known_payloads.entries.extend(drained);
}
```

2. Missing capacity check in `push_batch`:

```rust
fn push_batch(&mut self, entries: Vec<(SignatureKey, StoredAggregatedPayload)>) {
    for (key, payload) in entries {
        // This ensures we never exceed capacity even temporarily
        self.push(key, payload);
    }
}
```

3. Documentation inconsistency (blockchain/src/store.rs:118)

Consensus Safety

The changes maintain all critical consensus properties:

Performance Impact

Positive: Eliminates RocksDB I/O for aggregated payloads (hot path). The 4096-entry cap (~30 min at 1 attestation/slot) is reasonable for the 9-validator testnet.

Recommendation: ✅ Approve - The PR is ready to merge after addressing the minor issues above.

Automated review by Kimi (Moonshot AI) · custom prompt
🤖 Codex Code Review

Could not run tests locally in this environment because …

Automated review by OpenAI Codex · custom prompt
🤖 Claude Code Review

PR Review: Replace Aggregated Payload Tables with In-Memory Circular Buffers

The motivation and overall approach are solid — replacing unbounded RocksDB append-only tables with fixed-size FIFO buffers is the right call for ephemeral data that doesn't need durability. The slot-age pruning is also a good complementary fix for the finalization-stall case. Findings below in descending severity.

Bug:
Greptile Summary

This PR eliminates the primary source of unbounded memory growth during finalization stalls by replacing two RocksDB aggregated-payload tables with in-memory …

Key changes:

The overall approach is sound and directly targets the root cause. Two findings:

Confidence Score: 4/5

Last reviewed commit: 626b966
```rust
pub fn prune_attestation_data_by_age(&mut self, current_slot: u64) -> (usize, usize) {
    let cutoff_slot = current_slot.saturating_sub(ATTESTATION_RETENTION_SLOTS);
    if cutoff_slot == 0 {
        return (0, 0);
    }
    let pruned_sigs = self.prune_gossip_signatures(cutoff_slot);
    let pruned_att_data = self.prune_attestation_data_by_root(cutoff_slot);
    (pruned_sigs, pruned_att_data)
}
```
During normal finalization, both prune_attestation_data_by_age (line 574-582) and update_checkpoints (line 456-457) call the same prune functions (prune_gossip_signatures and prune_attestation_data_by_root), causing redundant full-table scans when finalization advances.
While idempotent and functionally correct, you could optimize by skipping age-based pruning when finalization has already advanced past the age cutoff:
Suggested change (adds a skip when finalization already covers the age window):

```rust
pub fn prune_attestation_data_by_age(&mut self, current_slot: u64) -> (usize, usize) {
    let cutoff_slot = current_slot.saturating_sub(ATTESTATION_RETENTION_SLOTS);
    if cutoff_slot == 0 {
        return (0, 0);
    }
    // Skip if finalization-based pruning already covers this window
    if self.latest_finalized().slot >= cutoff_slot {
        return (0, 0);
    }
    let pruned_sigs = self.prune_gossip_signatures(cutoff_slot);
    let pruned_att_data = self.prune_attestation_data_by_root(cutoff_slot);
    (pruned_sigs, pruned_att_data)
}
```
```rust
#[derive(Clone, Default)]
struct PayloadBuffer {
    entries: VecDeque<(SignatureKey, StoredAggregatedPayload)>,
    capacity: usize,
}

impl PayloadBuffer {
    fn new(capacity: usize) -> Self {
        Self {
            entries: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Insert one entry, FIFO-evicting the oldest if at capacity.
    fn push(&mut self, key: SignatureKey, payload: StoredAggregatedPayload) {
        if self.entries.len() >= self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((key, payload));
    }
}
```
#[derive(Default)] creates a broken buffer with capacity: 0. The push() method doesn't guard against this: when capacity == 0, the condition self.entries.len() >= self.capacity evaluates to 0 >= 0 == true, so pop_front() is called (no-op on empty deque), then push_back() inserts the entry, leaving len = 1 despite capacity = 0—violating the invariant entries.len() <= capacity.
While PayloadBuffer::default() is not called in current code (all instances use explicit PayloadBuffer::new(...)), the derived impl is a latent footgun. Remove the Default derive since Store initializes both buffers explicitly:
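The footgun described above can be reproduced in isolation. This is a minimal standalone sketch, with the key/payload types replaced by `u64` so it compiles on its own; the real `PayloadBuffer` holds `(SignatureKey, StoredAggregatedPayload)`:

```rust
use std::collections::VecDeque;

// Stand-in for the PR's PayloadBuffer, keeping only the parts relevant
// to the Default-derive issue.
#[derive(Default)]
struct Buffer {
    entries: VecDeque<u64>,
    capacity: usize,
}

impl Buffer {
    fn push(&mut self, v: u64) {
        // With capacity == 0 this condition is 0 >= 0 == true, so we
        // pop_front (a no-op on an empty deque) and then push anyway.
        if self.entries.len() >= self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(v);
    }
}

fn main() {
    let mut b = Buffer::default(); // capacity: 0
    b.push(42);
    // The invariant `entries.len() <= capacity` is violated: 1 > 0.
    assert!(b.entries.len() > b.capacity);
}
```

Removing the `Default` derive, as suggested, makes this state unconstructible.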
Suggested change:

```rust
#[derive(Clone)]
struct PayloadBuffer {
    entries: VecDeque<(SignatureKey, StoredAggregatedPayload)>,
    capacity: usize,
}
```
This is a valid concern
```rust
///
/// Independent of finalization — prevents unbounded growth when finalization stalls.
/// Returns (pruned_sigs, pruned_att_data).
pub fn prune_attestation_data_by_age(&mut self, current_slot: u64) -> (usize, usize) {
```
I think we don't need this. The circular buffer should be enough
```rust
new_payloads: PayloadBuffer,
known_payloads: PayloadBuffer,
```
These should go behind an Arc<Mutex<...>>, right?
Also, we should add a test for that:
1. create store
2. clone store
3. modify first store
4. check modification holds in cloned store
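The reviewer's four steps can be sketched as a standalone test. This is a hedged sketch with a stand-in `Store` holding only the shared buffer (the real `Store` has more fields, and the element type here is simplified to `u64`); it demonstrates that an `Arc<Mutex<...>>` field makes a modification through one clone visible in the other:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-in Store: wrapping the buffer in Arc<Mutex<...>> means clones
// share one underlying deque instead of diverging on clone().
#[derive(Clone)]
struct Store {
    shared_payloads: Arc<Mutex<VecDeque<u64>>>,
}

fn main() {
    // 1. create store
    let store = Store {
        shared_payloads: Arc::new(Mutex::new(VecDeque::new())),
    };
    // 2. clone store
    let cloned = store.clone();
    // 3. modify first store
    store.shared_payloads.lock().unwrap().push_back(7);
    // 4. check modification holds in cloned store
    assert_eq!(cloned.shared_payloads.lock().unwrap().len(), 1);
}
```

With a plain `VecDeque` field (no `Arc<Mutex<...>>`), step 4 would fail: the clone would keep its own independent, empty deque.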
```diff
-        Self { backend }
+        Self {
+            backend,
+            new_payloads: PayloadBuffer::new(AGGREGATED_PAYLOAD_CAP),
```
I think we should use a lower cap for new payloads
```rust
new_payloads: PayloadBuffer::new(AGGREGATED_PAYLOAD_CAP),
known_payloads: PayloadBuffer::new(AGGREGATED_PAYLOAD_CAP),
```
Why do we need these changes? I think we should refactor this in another PR
Code review

Found 1 issue:

The two constants that define the mismatch:

ethlambda/crates/storage/src/store.rs, lines 94 to 100 in 626b966

The silent skip in:

ethlambda/crates/storage/src/store.rs, lines 841 to 848 in 626b966

The fix should either align the two retention windows (prune …

🤖 Generated with Claude Code
Motivation
When finalization stalls, ethlambda's attestation-related tables grow without bound, causing OOM crashes. In our devnet observations, memory climbs to ~11.5 GB and crashes in ~27-minute cycles. The root causes are:
- `LatestKnownAggregatedPayloads` and `LatestNewAggregatedPayloads` grow without bound — they are append-only with no eviction. Each gossip message with 9 validators creates 9 copies of the same ~256 KB proof, and the read-merge-write pattern in RocksDB keeps accumulating entries.
- `GossipSignatures` and `AttestationDataByRoot` also grow unbounded when finalization stalls, since their pruning was only triggered by finalization events.

Both pruning mechanisms were gated on finalization advancing. When finalization stalls (which is exactly when memory pressure is highest), nothing gets cleaned up.
Solution
Two complementary mechanisms, inspired by Lantern (C client):
1. In-memory circular buffer for aggregated payloads (Lantern-style)
Replace the two RocksDB aggregated payload tables with `VecDeque`-based buffers that have a hard cap of 4096 entries and FIFO eviction.

Why in-memory buffers are safe:

- Only the `BlockChain` actor reads/writes them (P2P and RPC clones never access these tables).
- `Store` derives `Clone` (cloned 3× at startup for P2P, RPC, and BlockChain actors), but cloning `VecDeque` is fine — only the BlockChain clone populates its buffers.

Why 4096? With 9 validators, each producing 1 attestation per slot, that's ~455 unique attestation messages — roughly 30 minutes of history at 4-second slots. This matches Lantern's approach.
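The sizing claim can be checked with quick arithmetic, using only numbers stated in this PR (4096-entry cap, 9 validators at 1 attestation per slot, 4-second slots):

```rust
fn main() {
    let cap: u64 = 4096; // hard cap on buffer entries
    let attestations_per_slot: u64 = 9; // 9 validators, 1 attestation each
    let slot_seconds: u64 = 4;

    // How many slots of history fit in the buffer (integer division).
    let slots_of_history = cap / attestations_per_slot;
    // How long that history spans in minutes.
    let minutes = slots_of_history * slot_seconds / 60;

    assert_eq!(slots_of_history, 455); // ~455 slots, matching the text
    assert_eq!(minutes, 30);           // ~30 minutes of history
}
```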
What changes for each method:

- `insert_new_aggregated_payload` → `self.new_payloads.push(key, payload)`
- `insert_known_aggregated_payload` → `self.known_payloads.push(key, payload)`
- `iter_known_aggregated_payloads`: `.collect()` → `self.known_payloads.grouped().into_iter()`
- `iter_new_aggregated_payloads`: `.collect()` → `self.new_payloads.grouped().into_iter()`
- `promote_new_aggregated_payloads`: drain new → push_batch to known (2 lines)

2. Slot-age pruning for remaining DB tables
- `GossipSignatures` and `AttestationDataByRoot` remain in RocksDB (they serve other purposes), but are now pruned by slot age every slot at interval 4, independent of finalization.
- The cutoff is `current_slot.saturating_sub(64)` — entries at or below this slot are pruned.
- When `cutoff_slot == 0` (early slots < 64), pruning is skipped entirely, so spec tests (which operate within short slot ranges) are unaffected.

This ensures these tables stay bounded even during prolonged finalization stalls.
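The cutoff rule above can be sketched in isolation, with the constant name taken from the diff shown earlier (`ATTESTATION_RETENTION_SLOTS = 64`):

```rust
const ATTESTATION_RETENTION_SLOTS: u64 = 64;

/// Returns the prune cutoff for the given slot, or None when pruning
/// should be skipped (chain younger than the retention window).
fn cutoff(current_slot: u64) -> Option<u64> {
    // saturating_sub avoids underflow for slots below the window.
    let cutoff_slot = current_slot.saturating_sub(ATTESTATION_RETENTION_SLOTS);
    if cutoff_slot == 0 { None } else { Some(cutoff_slot) }
}

fn main() {
    assert_eq!(cutoff(10), None);      // early slots: pruning skipped
    assert_eq!(cutoff(64), None);      // 64 - 64 = 0, still skipped
    assert_eq!(cutoff(100), Some(36)); // prune entries at slot <= 36
}
```

This is why the spec tests (which stay under slot 64) never trigger age-based pruning.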
Changes
- `crates/storage/src/store.rs`
  - New `PayloadBuffer` struct — `VecDeque<(SignatureKey, StoredAggregatedPayload)>` with configurable capacity. Methods: `push` (FIFO eviction), `push_batch`, `drain`, `grouped` (HashMap by key), `unique_keys` (HashSet).
  - Added `new_payloads` / `known_payloads` fields to the `Store` struct, initialized in `init_store`.
  - Removed `iter_aggregated_payloads`, `iter_aggregated_payload_keys`, `insert_aggregated_payload`, `insert_aggregated_payloads_batch`, `prune_aggregated_payload_table`.
  - New `prune_attestation_data_by_age` — slot-age-based pruning for `GossipSignatures` and `AttestationDataByRoot`.
  - Removed aggregated-payload pruning from `update_checkpoints` (FIFO handles it now).
- `crates/storage/src/api/tables.rs`
  - Removed `LatestNewAggregatedPayloads` and `LatestKnownAggregatedPayloads` from the `Table` enum (10 → 8 tables).
- `crates/storage/src/backend/rocksdb.rs`
  - Removed `cf_name` match arms for the deleted table variants.
- `crates/blockchain/src/store.rs`
  - Added `prune_attestation_data_by_age` call at interval 4 (end of slot) with info logging when entries are pruned.

How to Test
All 26 fork choice spec tests pass unchanged — they operate within short slot ranges (< 64 slots), so `cutoff_slot` is 0 and age-based pruning short-circuits.

For production validation, deploy to a devnet and monitor memory usage during a finalization stall. Memory should plateau instead of climbing linearly.