node: add metrics and observability #441

@ch4r10t33r

Summary

The leanspec-node Docker image (currently published as ghcr.io/leanethereum/leanspec-node) runs the lean consensus specification as a node process, but has no instrumentation. Adding metrics and observability will make it possible to monitor node health, track protocol behaviour, and debug issues in devnet/testnet deployments.

Motivation

  • Operators running leanspec-node have no visibility into internal state (slot progression, sync status, peer count, etc.)
  • Without structured metrics there is no way to alert on anomalies or compare behaviour across multiple node instances
  • Observability is a prerequisite for running a reliable multi-node devnet

Proposed Work

Metrics (Prometheus)

Expose a /metrics endpoint (default port 8008) with at minimum:

| Metric | Type | Description |
| --- | --- | --- |
| leanspec_slot_current | Gauge | Current slot number |
| leanspec_epoch_current | Gauge | Current epoch |
| leanspec_peers_connected | Gauge | Number of connected peers |
| leanspec_blocks_imported_total | Counter | Cumulative blocks imported |
| leanspec_attestations_processed_total | Counter | Cumulative attestations processed |
| leanspec_sync_distance | Gauge | Slots behind head (0 when synced) |
| leanspec_process_start_time_seconds | Gauge | Unix timestamp of node startup |
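
To make the expected output concrete, here is a minimal stdlib-only sketch of how the metrics above could be rendered in the Prometheus text format. The `MetricsRegistry` class and its methods are hypothetical and not part of leanspec-node; a real implementation would more likely use an existing Prometheus client library.

```python
# Stdlib-only sketch: render the proposed metrics in Prometheus text format.
# MetricsRegistry is a hypothetical helper, not an existing leanspec-node API.
import time

# (name, type, help text) copied from the proposed metric list.
METRICS = [
    ("leanspec_slot_current", "gauge", "Current slot number"),
    ("leanspec_epoch_current", "gauge", "Current epoch"),
    ("leanspec_peers_connected", "gauge", "Number of connected peers"),
    ("leanspec_blocks_imported_total", "counter", "Cumulative blocks imported"),
    ("leanspec_attestations_processed_total", "counter",
     "Cumulative attestations processed"),
    ("leanspec_sync_distance", "gauge", "Slots behind head (0 when synced)"),
    ("leanspec_process_start_time_seconds", "gauge",
     "Unix timestamp of node startup"),
]


class MetricsRegistry:
    """In-memory store for a fixed set of unlabelled gauges and counters."""

    def __init__(self, definitions):
        self._defs = {name: (kind, text) for name, kind, text in definitions}
        self._values = {name: 0 for name in self._defs}

    def set_gauge(self, name, value):
        if self._defs[name][0] != "gauge":
            raise ValueError(f"{name} is not a gauge")
        self._values[name] = value

    def inc_counter(self, name, amount=1):
        if self._defs[name][0] != "counter":
            raise ValueError(f"{name} is not a counter")
        if amount < 0:
            raise ValueError("counters may only increase")
        self._values[name] += amount

    def render(self):
        """Return the body a /metrics handler would send."""
        lines = []
        for name, (kind, text) in self._defs.items():
            lines.append(f"# HELP {name} {text}")
            lines.append(f"# TYPE {name} {kind}")
            lines.append(f"{name} {self._values[name]}")
        return "\n".join(lines) + "\n"


registry = MetricsRegistry(METRICS)
registry.set_gauge("leanspec_process_start_time_seconds", int(time.time()))
registry.set_gauge("leanspec_slot_current", 12345)
registry.inc_counter("leanspec_blocks_imported_total")
print(registry.render())
```

Keeping the registry separate from the HTTP layer would let the node update values from its slot and import loops without depending on how the endpoint is served.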

Structured Logging

  • Add structured (JSON) log output, selected with --log-format json, so log aggregators (Loki, Datadog, etc.) can parse fields without regex
  • Include slot, epoch, and peer_id as log fields where relevant
  • Log level controllable via --log-level CLI flag or LEANSPEC_LOG_LEVEL env var
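
The logging items above could be sketched with Python's standard `logging` module as follows. The `JsonFormatter` class, the `configure_logging` helper, the `"leanspec"` logger name, and the choice to let the CLI value take precedence over LEANSPEC_LOG_LEVEL are all assumptions made for illustration.

```python
# Sketch of JSON log output with optional slot / epoch / peer_id fields.
# JsonFormatter and configure_logging are hypothetical names.
import json
import logging
import os
import sys


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log record."""

    CONTEXT_FIELDS = ("slot", "epoch", "peer_id")

    def format(self, record):
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Copy context fields only when the caller supplied them via `extra`.
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def configure_logging(cli_level=None, stream=sys.stdout):
    """Assumed precedence: --log-level value, then LEANSPEC_LOG_LEVEL, then INFO."""
    level_name = (cli_level or os.environ.get("LEANSPEC_LOG_LEVEL", "INFO")).upper()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("leanspec")
    logger.handlers = [handler]
    logger.setLevel(level_name)
    logger.propagate = False
    return logger


log = configure_logging()
log.info("block imported", extra={"slot": 12345, "epoch": 385})
```

Passing context through `extra` keeps call sites short, and fields that are not relevant to a given message are simply absent from the emitted object.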

Health Endpoint

Expose GET /health (same port as metrics, or a separate --rpc-port) returning:

{ "status": "ok", "slot": 12345, "synced": true }

The endpoint returns 200 when the node is synced and 503 otherwise, which makes it suitable for Docker/Kubernetes readiness probes.
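
A minimal sketch of the health endpoint using only `http.server` is shown below. It assumes that "synced" means a sync distance of 0, and it uses `"syncing"` as the status string for the unsynced case, which this issue does not specify. `health_response`, `HealthHandler`, and the `get_state` hook are hypothetical names.

```python
# Sketch of GET /health: 200 when synced, 503 otherwise.
# All names here are illustrative; the unsynced "status" value is an assumption.
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_response(slot, sync_distance):
    """Map node state to an (HTTP status code, JSON body) pair."""
    synced = sync_distance == 0
    body = {
        "status": "ok" if synced else "syncing",
        "slot": slot,
        "synced": synced,
    }
    return (200 if synced else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical hook returning (slot, sync_distance); a real node would
    # replace this with a function that reads its current state.
    get_state = staticmethod(lambda: (0, 0))

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = health_response(*self.get_state())
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8008):
    """Block forever serving /health on the given port."""
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` from a background thread would let the endpoint run alongside the node loop. Keeping `health_response` as a pure function makes the 200/503 decision testable without opening a socket.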

Docker / Deployment

  • Expose port 8008 in the node stage of the Dockerfile
  • Document the metrics and health endpoints in DOCKER_QUICKSTART.md
  • Add a minimal docker-compose example with a Prometheus + Grafana sidecar

Acceptance Criteria

  • /metrics endpoint returns valid Prometheus text exposition format
  • /health endpoint returns correct synced field based on sync state
  • Structured JSON log output when --log-format json is set
  • Metrics port configurable via --metrics-port CLI flag
  • Dockerfile node stage exposes the metrics port
  • At least one example Grafana dashboard JSON included
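
The first criterion could be partially smoke-tested with a check like the following sketch, which reads the `# TYPE` lines of a /metrics response and compares them with the proposed metric names and types. It is not a full validator of the exposition format, and `check_exposition` is a hypothetical helper.

```python
# Sketch of a smoke check for the /metrics response body.
# Only "# TYPE" declarations are inspected; sample values are not validated.
EXPECTED_TYPES = {
    "leanspec_slot_current": "gauge",
    "leanspec_epoch_current": "gauge",
    "leanspec_peers_connected": "gauge",
    "leanspec_blocks_imported_total": "counter",
    "leanspec_attestations_processed_total": "counter",
    "leanspec_sync_distance": "gauge",
    "leanspec_process_start_time_seconds": "gauge",
}


def parse_types(text):
    """Collect metric name -> declared type from '# TYPE <name> <type>' lines."""
    declared = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            declared[parts[2]] = parts[3]
    return declared


def check_exposition(text):
    """Return a list of problems; an empty list means every metric was declared."""
    declared = parse_types(text)
    problems = []
    for name, kind in EXPECTED_TYPES.items():
        if name not in declared:
            problems.append(f"missing: {name}")
        elif declared[name] != kind:
            problems.append(f"{name}: expected {kind}, got {declared[name]}")
    return problems
```

In a CI job, the response body fetched from the running container could be passed to `check_exposition`, failing the build when the returned list is not empty.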
