feat: add configurable io rate limiting for snapshot writes #2802

Open
blindchaser wants to merge 8 commits into release/v6.3 from yiren/v631-sp-limit

@blindchaser blindchaser commented Feb 4, 2026

Describe your changes and provide context

Background

v6.3 reduced snapshot creation time from >3 hours to ~20 minutes by using aggressive parallel writes. This improves performance on high-end machines, but the resulting I/O bursts can hurt page cache efficiency on systems with limited RAM.

Solution

Add optional I/O rate limiting for snapshot writes:

  • New config: sc-snapshot-write-rate-mbps (default: 0 = unlimited)
  • Global limiter shared across all trees/files, using a token bucket algorithm
  • Allows operators to trade snapshot speed for more stable I/O patterns

Configuration Example

[state-commit]
sc-snapshot-write-rate-mbps = 300

github-actions bot commented Feb 4, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Feb 7, 2026, 4:54 PM |

codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 84.41558% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.38%. Comparing base (cd224f6) to head (21ef291).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sei-db/sc/memiavl/snapshot.go | 84.61% | 3 Missing and 3 partials ⚠️ |
| sei-db/sc/memiavl/db.go | 84.21% | 1 Missing and 2 partials ⚠️ |
| sei-db/config/config.go | 0.00% | 1 Missing ⚠️ |
| sei-db/sc/memiavl/multitree.go | 90.00% | 0 Missing and 1 partial ⚠️ |
| sei-db/sc/store.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@               Coverage Diff                @@
##           release/v6.3    #2802      +/-   ##
================================================
- Coverage         45.96%   43.38%   -2.58%     
================================================
  Files              1199     1866     +667     
  Lines            104456   155383   +50927     
================================================
+ Hits              48008    67419   +19411     
- Misses            52227    81955   +29728     
- Partials           4221     6009    +1788     

| Flag | Coverage Δ |
| --- | --- |
| sei-chain | 42.56% <100.00%> (+0.03%) ⬆️ |
| sei-cosmos | 38.03% <ø> (?) |
| sei-db | 45.28% <84.21%> (+0.22%) ⬆️ |
| sei-ibc-go | 55.96% <ø> (ø) |
| sei-tendermint | 47.52% <ø> (+0.01%) ⬆️ |
| sei-wasmd | 41.56% <ø> (ø) |
| sei-wasmvm | 39.88% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| app/seidb.go | 82.69% <100.00%> (ø) |
| sei-db/sc/memiavl/opts.go | 100.00% <100.00%> (ø) |
| sei-db/sc/memiavl/tree.go | 79.69% <100.00%> (+0.31%) ⬆️ |
| sei-db/config/config.go | 0.00% <0.00%> (ø) |
| sei-db/sc/memiavl/multitree.go | 79.74% <90.00%> (+0.33%) ⬆️ |
| sei-db/sc/store.go | 0.00% <0.00%> (ø) |
| sei-db/sc/memiavl/db.go | 63.25% <84.21%> (-0.24%) ⬇️ |
| sei-db/sc/memiavl/snapshot.go | 59.05% <84.61%> (+0.95%) ⬆️ |

... and 687 files with indirect coverage changes

cursor bot pushed a commit that referenced this pull request Feb 5, 2026
- Analyzed snapshot write rate limiting implementation
- Validated design decisions and code quality
- Identified pre-existing test failure in storev2/rootmulti (unrelated)
- Recommendation: APPROVE with minor notes
- All tests passing, well-designed global rate limiter approach

Co-authored-by: Steven Landers <steven.landers@gmail.com>

Add optional I/O rate limiting for snapshot writes using token bucket
algorithm to prevent page cache eviction on systems with limited RAM.

Key changes:
- New config: sc-snapshot-write-rate-mbps (default: 0 = unlimited)
- Global limiter shared across all trees and files using rate.Limiter
- Backpressure propagates through entire write pipeline
- Added TestGlobalRateLimiterSharedAcrossWriters for validation

This addresses high I/O bursts from v6.3 snapshot optimization which
can impact page cache efficiency. With rate limiting, operators can
trade snapshot speed for more stable I/O patterns.

Recommended: 300 MB/s for validators with 128GB RAM (~3 hours per snapshot)

- Hardcode sc-snapshot-writer-limit to 4 (remove from app.toml)
- Change sc-snapshot-write-rate-mbps default from 0 to 300 MB/s

Rationale:
- writer-limit: With rate limiting, this mainly affects CPU overhead
  rather than I/O throughput. Fixed at 4 provides optimal balance.
- rate-limit: 300 MB/s default prevents page cache issues on 128GB RAM
  validators while maintaining reasonable snapshot time (~3 hours).

Users can override rate limit in app.toml (100 for conservative,
0 for high-end machines).
blindchaser and others added 3 commits February 6, 2026 14:27
Resolve conflicts in sei-db/sc/memiavl:
- db.go: keep CAS (pruningInProgress), snapshotWriteRateMBps, add closed guard from release
- db_test.go: use directory state check for prune completion

Co-authored-by: Cursor <cursoragent@cursor.com>