
feat: add heap memory profiling with jemalloc #180

Merged
pablodeymo merged 3 commits into main from jemalloc-heap-profiling
Mar 3, 2026

Conversation

@MegaRedHand
Collaborator

This PR adds jemalloc as a global allocator, with heap memory profiling, similar to how we use it in ethrex.

@github-actions

github-actions bot commented Mar 3, 2026

🤖 Kimi Code Review

Review Summary

This PR adds jemalloc-based heap profiling capabilities to ethlambda. The implementation is well-structured with proper platform-specific handling. Here are my findings:

Issues Found

  1. Security concern in bin/ethlambda/src/main.rs:11: The malloc_conf is exposed as a public symbol via #[unsafe(export_name = "malloc_conf")]. This could potentially be overwritten by malicious code. Consider making it static mut with proper synchronization or using jemalloc's API to configure at runtime instead.

  2. Error handling inconsistency in crates/net/rpc/src/heap_profiling.rs:

    • Lines 15-17 and 42-44 return StatusCode::INTERNAL_SERVER_ERROR when profiling is not enabled, but lines 60-66 return StatusCode::NOT_IMPLEMENTED. These should be consistent - NOT_IMPLEMENTED is more appropriate for disabled features.
  3. Missing feature flag in Cargo.toml: The jemalloc dependencies are added unconditionally but only used on non-MSVC targets. Consider adding a Cargo feature to make this optional.

  4. Potential race condition: The PROF_CTL.as_ref() access in heap_profiling.rs could benefit from a comment explaining that jemalloc_pprof initializes this safely.

Suggestions

  1. Add rate limiting: The profiling endpoints could be expensive. Consider adding basic rate limiting or authentication for /debug/pprof/* endpoints.

  2. Document memory overhead: Add comments documenting the memory overhead of enabling profiling (approximately 1-2% extra memory usage).

  3. Consider using #[cfg(feature = "jemalloc")]: Instead of just checking for MSVC, make this a proper Cargo feature for better control.
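The rate-limiting suggestion above can be sketched with a plain token bucket checked before each dump. This is an illustrative, std-only sketch; the `RateLimiter` type and its limits are hypothetical and not part of this PR:

```rust
use std::time::{Duration, Instant};

/// Minimal token bucket: refills `capacity` tokens over `per`,
/// each request costs one token.
pub struct RateLimiter {
    capacity: u32,
    tokens: f64,
    per: Duration,
    last_refill: Instant,
}

impl RateLimiter {
    pub fn new(capacity: u32, per: Duration) -> Self {
        Self {
            capacity,
            tokens: capacity as f64,
            per,
            last_refill: Instant::now(),
        }
    }

    /// Returns true if the request is allowed, false if the caller
    /// should answer 429 Too Many Requests.
    pub fn check(&mut self) -> bool {
        let elapsed = self.last_refill.elapsed();
        self.last_refill = Instant::now();
        // Refill proportionally to elapsed time, capped at capacity.
        let refill = elapsed.as_secs_f64() / self.per.as_secs_f64() * self.capacity as f64;
        self.tokens = (self.tokens + refill).min(self.capacity as f64);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // Allow at most 2 profile dumps per minute.
    let mut limiter = RateLimiter::new(2, Duration::from_secs(60));
    assert!(limiter.check());
    assert!(limiter.check());
    assert!(!limiter.check()); // third immediate request is rejected
}
```

A per-IP map of such buckets, or a single global bucket, would both be reasonable for a debug endpoint; authentication is still the stronger control.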

Positive Aspects

  • Clean separation of platform-specific code using cfg attributes
  • Proper error handling with appropriate HTTP status codes
  • Good integration with existing RPC structure
  • Clear logging of allocator choice at startup

Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

github-actions bot commented Mar 3, 2026

🤖 Codex Code Review

  1. High: unauthenticated heap-profiler endpoints are exposed on the main RPC listener
  • Code: crates/net/rpc/src/lib.rs:14, crates/net/rpc/src/lib.rs:47, crates/net/rpc/src/heap_profiling.rs:11
  • Risk: /debug/pprof/allocs* is publicly reachable anywhere this RPC port is reachable. Attackers can repeatedly trigger expensive profile dumps (DoS vector) and extract process internals (allocation/symbol patterns).
  • Suggestion: gate behind an explicit --enable-debug-endpoints flag and/or compile-time feature; bind debug routes to localhost/admin port; add auth + rate limiting.
  2. Medium: profiling is forced on globally in non-MSVC builds
  • Code: bin/ethlambda/src/main.rs:11
  • Risk: prof_active:true adds continuous allocation sampling overhead in production. For a consensus client, this can impact slot-time headroom under load.
  • Suggestion: default to prof_active:false and enable profiling only on demand (admin-only switch).
  3. Medium: blocking profiler dump work runs directly inside async handlers
  4. Low: internal error details are returned to clients
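The default-off suggestion amounts to a one-word change in the exported config: ship with the profiler compiled in (prof:true) but sampling disabled, then flip jemalloc's prof.active at runtime through an admin-only switch. A hedged sketch of what that static might look like (the runtime toggle itself is not shown):

```rust
// Hypothetical variant: profiler built in, sampling off by default.
// An admin endpoint could later enable sampling via the `prof.active`
// mallctl instead of paying the overhead continuously.
#[cfg(not(target_env = "msvc"))]
#[allow(non_upper_case_globals)]
#[unsafe(export_name = "malloc_conf")]
static malloc_conf: &[u8] = b"prof:true,prof_active:false,lg_prof_sample:19\0";
```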

Consensus-specific note: this PR does not modify fork choice, attestation validation, justification/finalization, state transition, XMSS, or SSZ logic, so I found no direct consensus-correctness regressions in scope.


Automated review by OpenAI Codex · custom prompt

@github-actions

github-actions bot commented Mar 3, 2026

🤖 Claude Code Review



PR Review: feat: add heap memory profiling with jemalloc

Summary: Adds tikv-jemallocator as a global allocator with jemalloc heap profiling enabled, and exposes /debug/pprof/allocs and /debug/pprof/allocs/flamegraph HTTP endpoints through the existing RPC server.


Issues

cfg guard mismatch — profiling overhead on non-Linux platforms (main.rs:8-11)

The allocator and malloc_conf are gated on cfg(not(target_env = "msvc")), but the working profiling endpoints are gated on cfg(target_os = "linux"). On macOS (non-MSVC, non-Linux), jemalloc is installed as the global allocator and profiling is activated via malloc_conf (prof:true,prof_active:true), but the endpoints return 501 Not Implemented. The jemalloc_pprof crate's own docs note macOS has limitations with profiling. This is dead overhead.

// main.rs:8-11 — should also be cfg(target_os = "linux") to match the endpoints
#[cfg(not(target_env = "msvc"))]
#[allow(non_upper_case_globals)]
#[unsafe(export_name = "malloc_conf")]
static malloc_conf: &[u8] = b"prof:true,prof_active:true,lg_prof_sample:19\0";

Suggestion: gate malloc_conf (and arguably the ALLOC declaration, or at least the profiling features in Cargo.toml) on cfg(target_os = "linux") to align with where the endpoints are actually functional.


Unauthenticated debug endpoints on the shared metrics port (lib.rs:46-57, heap_profiling.rs:11-46)

The /debug/pprof/allocs and /debug/pprof/allocs/flamegraph endpoints are wired into the same server as /metrics and /lean/v0/* with no authentication or access control. The default bind is 127.0.0.1:5054, which is safe by default, but operators running in Docker or cloud environments often override --metrics-address 0.0.0.0 to scrape metrics externally. In that case, heap profiles (which may contain memory layout information) become externally accessible.

A comment or doc note warning that the debug endpoints should not be exposed publicly would be appropriate. Alternatively, bind the debug router to a separate listener address — though that adds complexity.
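One lightweight mitigation, sketched here with std types only, is to check the peer address before serving debug routes, so /debug/pprof/* stays loopback-only even when the listener is bound to 0.0.0.0. The `debug_allowed` helper is hypothetical, not part of the PR:

```rust
use std::net::{IpAddr, SocketAddr};

/// Only serve debug endpoints when the peer connected over loopback,
/// regardless of which address the listener itself is bound to.
pub fn debug_allowed(peer: SocketAddr) -> bool {
    match peer.ip() {
        IpAddr::V4(v4) => v4.is_loopback(),
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}

fn main() {
    let local: SocketAddr = "127.0.0.1:5054".parse().unwrap();
    let remote: SocketAddr = "10.0.0.5:5054".parse().unwrap();
    assert!(debug_allowed(local));
    assert!(!debug_allowed(remote));
}
```

This does not replace authentication, and it can be defeated by a local proxy, but it makes the "scrape metrics externally, accidentally expose heap profiles" failure mode much harder to hit.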


jemalloc_pprof as an unconditional dependency of the RPC library crate (crates/net/rpc/Cargo.toml)

Adding jemalloc_pprof directly to ethlambda-rpc (a library crate) pulls in ~300 new entries to Cargo.lock — prost, inferno, backtrace, num-complex, num-rational, flate2, gimli, etc. — for every downstream user of this crate, including in tests. A heap-profiling feature flag would isolate this to when it's actually needed:

# crates/net/rpc/Cargo.toml
[features]
heap-profiling = ["dep:jemalloc_pprof"]

[dependencies]
jemalloc_pprof = { workspace = true, optional = true }

Minor Notes

  • handle_get_heap / handle_get_heap_flamegraph both take the same prof_ctl mutex (heap_profiling.rs:16, 37). Dumping a profile involves file I/O and can be slow, blocking the mutex for the duration. Concurrent requests to both endpoints will queue behind each other. This is fine for a debug tool but worth being aware of.

  • The #[allow(non_upper_case_globals)] + #[unsafe(export_name = "malloc_conf")] pattern (main.rs:9-11) is correct and idiomatic for jemalloc configuration — well done.

  • lg_prof_sample:19 (sample every 512 KB) is a reasonable default with ~1–3% overhead. Good choice.

  • When PROF_CTL.as_ref() returns None, the response is 500 INTERNAL_SERVER_ERROR with the body "Heap profiling not enabled" (heap_profiling.rs:13-15). Semantically 503 SERVICE_UNAVAILABLE or 501 NOT_IMPLEMENTED would be closer, but this is minor and consistent with jemalloc_pprof conventions.
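If the mutex contention noted above ever matters, the handlers could fail fast rather than queue behind a slow dump. A std-only sketch of the pattern; `try_dump` and its `Vec<u8>` stand-in for the profile buffer are hypothetical, not the PR's actual types:

```rust
use std::sync::Mutex;

/// Fail fast with a 503-style error when another dump holds the lock,
/// instead of blocking for the duration of the file I/O.
pub fn try_dump(prof_ctl: &Mutex<Vec<u8>>) -> Result<Vec<u8>, &'static str> {
    match prof_ctl.try_lock() {
        // Stand-in for the real dump_pprof() call.
        Ok(guard) => Ok(guard.clone()),
        // Held by a concurrent dump (or poisoned): tell the client to retry.
        Err(_) => Err("503: dump already in progress"),
    }
}

fn main() {
    let ctl = Mutex::new(vec![1u8, 2]);
    assert_eq!(try_dump(&ctl).unwrap(), vec![1u8, 2]);
    let _held = ctl.lock().unwrap();
    assert!(try_dump(&ctl).is_err()); // concurrent request fails fast
}
```

For an async handler the same shape works with tokio's `Mutex::try_lock`; for a debug tool, simply documenting the serialization is also a fine answer.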


Summary

The main actionable items are:

  • Align the cfg guard on malloc_conf with cfg(target_os = "linux") to avoid silent profiling overhead on macOS.
  • Add a note or doc warning about the debug endpoints being exposed on the shared metrics port.
  • Consider a feature flag to avoid bloating the ethlambda-rpc build graph unnecessarily.

The core implementation — the jemalloc allocator setup, the malloc_conf symbol export, the cfg-based platform dispatch in heap_profiling.rs, and the two endpoint handlers — is correct and follows established patterns (matching what ethrex does).


Automated review by Claude (Anthropic) · custom prompt

@greptile-apps
Contributor

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR integrates jemalloc as the global allocator for the ethlambda binary and adds heap memory profiling endpoints (/debug/pprof/allocs and /debug/pprof/allocs/flamegraph) to the RPC server, following a pattern already used in the sibling ethrex project.

Allocator & Profiling: tikv-jemallocator is registered as the global allocator on all non-MSVC targets; a malloc_conf static enables jemalloc's built-in profiler at startup with sampling (lg_prof_sample:19).

Endpoints: The new heap_profiling module in the RPC crate exposes pprof and flamegraph endpoints. These are correctly gated to Linux with a 501 Not Implemented fallback for other platforms.

Platform Inconsistency: The malloc_conf static is guarded by #[cfg(not(target_env = "msvc"))], which includes macOS. However, the profiling dump endpoints only compile on Linux. This causes a logic mismatch: on macOS, jemalloc profiling overhead is incurred during runtime but there is no way to retrieve the profiling data, resulting in wasted CPU cycles.

Confidence Score: 2/5

  • Safe to merge from a security perspective, but contains a logic bug: profiling overhead is incurred on macOS with no way to retrieve the data.
  • The core functionality is correct and well-structured, but the platform mismatch (jemalloc profiling enabled on all non-MSVC platforms while the dump endpoints compile only on Linux) should be resolved. The fix of narrowing the malloc_conf cfg guard to Linux-only is straightforward.
  • bin/ethlambda/src/main.rs — the malloc_conf cfg guard should be narrowed to #[cfg(target_os = "linux")] to match the endpoint availability and prevent unnecessary profiling overhead on non-Linux platforms.

Sequence Diagram

sequenceDiagram
    participant Client
    participant RpcServer
    participant HeapProfilingHandler
    participant JemallocProfCtl

    Client->>RpcServer: GET /debug/pprof/allocs
    RpcServer->>HeapProfilingHandler: handle_get_heap()
    alt Linux
        HeapProfilingHandler->>JemallocProfCtl: PROF_CTL.as_ref()
        alt profiling enabled
            JemallocProfCtl-->>HeapProfilingHandler: Some(prof_ctl)
            HeapProfilingHandler->>JemallocProfCtl: lock().await
            HeapProfilingHandler->>JemallocProfCtl: dump_pprof()
            JemallocProfCtl-->>HeapProfilingHandler: Ok(pprof_bytes)
            HeapProfilingHandler-->>RpcServer: 200 application/octet-stream
        else profiling not enabled
            JemallocProfCtl-->>HeapProfilingHandler: None
            HeapProfilingHandler-->>RpcServer: 500 Internal Server Error
        end
    else non-Linux
        HeapProfilingHandler-->>RpcServer: 501 Not Implemented
    end
    RpcServer-->>Client: Response

    Client->>RpcServer: GET /debug/pprof/allocs/flamegraph
    RpcServer->>HeapProfilingHandler: handle_get_heap_flamegraph()
    alt Linux
        HeapProfilingHandler->>JemallocProfCtl: dump_flamegraph()
        JemallocProfCtl-->>HeapProfilingHandler: Ok(svg)
        HeapProfilingHandler-->>RpcServer: 200 image/svg+xml
    else non-Linux
        HeapProfilingHandler-->>RpcServer: 501 Not Implemented
    end
    RpcServer-->>Client: Response

Last reviewed commit: 2110013

@greptile-apps bot left a comment

7 files reviewed, 1 comment


@pablodeymo pablodeymo merged commit 77d48e2 into main Mar 3, 2026
3 checks passed
@pablodeymo pablodeymo deleted the jemalloc-heap-profiling branch March 3, 2026 19:27
