
Conversation

@tnull
Contributor

@tnull tnull commented Jan 12, 2026

NetworkGraph: Update node and channel count estimates for Jan 2026
NetworkGraph: Determine pre-allocations using actual numbers when reading

When reading a persisted network graph, we previously pre-allocated our
default node/channel count estimates for the respective `IndexedMap`
capacities. However, this might allocate memory unnecessarily on reading,
for example if we have an (almost) empty network graph for one reason or
another. As we have the actual counts of persisted nodes and channels
available, we now simply allocate based on these counts (plus 15%). This
also ensures that our pre-allocations stay up to date over time as the
network grows or shrinks.
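
For illustration, a minimal stand-alone sketch of the computation described above, using a hypothetical helper name and a stub DecodeError (the actual change is the diff pasted further down in this thread):

// Stand-in for LDK's `DecodeError`, only to keep the sketch self-contained.
#[derive(Debug)]
enum DecodeError {
    InvalidValue,
}

// Derive a map pre-allocation from the persisted entry count, adding 15%
// headroom and clamping it to a sanity limit (mirrors the diff below).
fn capacity_from_count(count: u64, max_limit: usize) -> Result<usize, DecodeError> {
    (count as u128 * 115 / 100)
        .try_into()
        .map_err(|_| DecodeError::InvalidValue)
        .map(|cap: usize| cap.min(max_limit))
}

fn main() -> Result<(), DecodeError> {
    // E.g., ~54K persisted channels would pre-allocate roughly 62K slots.
    let cap = capacity_from_count(54_264, 1_000_000)?;
    println!("pre-allocating for {cap} channels");
    Ok(())
}
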
NetworkGraph: Only pre-allocate memory on mainnet

Previously, we'd always pre-allocate memory for the node and channel
maps based on mainnet numbers, even if we're on another network like
`Regtest`. Here, we only apply the estimates if we're actually on
`Network::Bitcoin`, which should reduce the `NetworkGraph`'s memory
footprint considerably in tests.
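
Roughly, the gating described here could look like the following sketch (illustrative names, not the literal LDK code):

use bitcoin::Network;

// Hypothetical constants matching the mainnet estimates discussed below.
const NODE_COUNT_ESTIMATE: usize = 15_000;
const CHAN_COUNT_ESTIMATE: usize = 50_000;

// Only pre-allocate mainnet-sized maps when we're actually on mainnet, so
// Regtest/Signet graphs (e.g. in tests) start out essentially empty.
fn initial_capacities(network: Network) -> (usize, usize) {
    if network == Network::Bitcoin {
        (NODE_COUNT_ESTIMATE, CHAN_COUNT_ESTIMATE)
    } else {
        (0, 0)
    }
}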

@ldk-reviews-bot

ldk-reviews-bot commented Jan 12, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov

codecov bot commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.62%. Comparing base (c722443) to head (39ca7cb).
⚠️ Report is 24 commits behind head on main.

Files with missing lines          Patch %   Lines
lightning/src/routing/gossip.rs   86.66%    0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4306      +/-   ##
==========================================
+ Coverage   86.59%   86.62%   +0.03%     
==========================================
  Files         158      158              
  Lines      102408   102730     +322     
  Branches   102408   102730     +322     
==========================================
+ Hits        88678    88989     +311     
- Misses      11309    11321      +12     
+ Partials     2421     2420       -1     
Flag      Coverage Δ
fuzzing   37.10% <20.00%> (+0.22%) ⬆️
tests     85.91% <86.66%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown.


///
/// To improve efficiency, this will pre-allocate memory for `node_count_estimate` nodes and
/// `channel_count_estimate` channels.
pub fn from_node_and_channel_count_estimates(
Collaborator

I'm pretty skeptical of exposing this. If you don't want a network graph, don't build one. If you want a network graph, we should allocate for a network graph. How is a downstream dev better positioned to provide estimates here than we are?

Contributor Author

Hmm, from my point of view it is a bit odd to pre-allocate a lot of memory based on static estimates that will become stale over time. Plus, we currently make no distinction based on Network here, so we will always allocate that much memory even for small Regtest or Signet environments with only a handful of nodes (e.g., in tests). If you're skeptical, maybe we can drop the extra constructor, but leave the dynamic allocation on read, and only apply the estimates for Network::Bitcoin?

Of course, I have to admit that I first had planned to use that constructor for the minimal-mode Node over at LDK Node, but you're right, for that application we probably need to completely change up our types there and simply rip out anything related to the Router/NetworkGraph entirely.

Collaborator

Hmm, from my point of view it is a bit odd to pre-allocate a lot of memory based on static estimates that will become stale over time.

Sure, but do we really expect downstream devs to update their estimates more often than us?

Plus, we currently make no distinction based on Network here, so we will always allocate that much memory even for small Regtest or Signet environments with only a handful of nodes (e.g., in tests).

Yea, we should definitely use Network to disable the pre-allocation.

If you're skeptical, maybe we can drop the extra constructor, but leave the dynamic allocation on read, and only apply the estimates for Network::Bitcoin?

Yea, makes sense. We could even still use the constants on mainnet loads, to ensure we pre-allocate if loading with an empty (or partially-synced) graph, but I guess it doesn't matter that much either way.

Of course, I have to admit that I first had planned to use that constructor for the minimal-mode Node over at LDK Node, but you're right, for that application we probably need to completely change up our types there and simply rip out anything related to the Router/NetworkGraph entirely.

Yea, I mean it would be simpler for LDK Node, sure, but it definitely feels like the wrong way :)
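
(Purely illustrative, and not what the PR ended up doing: the "constants as a floor on mainnet loads" idea above could look roughly like this, with hypothetical names.)

const CHAN_COUNT_ESTIMATE: usize = 50_000;

// Persisted count plus 15% headroom, but never less than the static mainnet
// estimate, so an empty or partially-synced graph still pre-allocates.
fn mainnet_read_capacity(persisted_channels: usize) -> usize {
    (persisted_channels * 115 / 100).max(CHAN_COUNT_ESTIMATE)
}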

Contributor Author

Alright, will drop the first commit then, and add one that disables the estimates on non-mainnet networks.

Contributor Author
@tnull tnull Jan 14, 2026

Done, let me know if I can squash.

Collaborator

Yes, please.

@ldk-reviews-bot

🔔 1st Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@tnull tnull requested a review from TheBlueMatt January 14, 2026 11:19
@TheBlueMatt TheBlueMatt removed the request for review from wpaulino January 14, 2026 12:27
@tnull tnull force-pushed the 2026-01-allow-empty-network-graph branch 2 times, most recently from 9138bb2 to 070002c on January 14, 2026 14:20
@tnull
Contributor Author

tnull commented Jan 14, 2026

Squashed without further changes.

Comment on lines 1795 to 1800
const CHAN_COUNT_ESTIMATE: usize = 50_000;
/// In Jan, 2026 there were about 13K nodes
///
/// We over-allocate by a bit because 15% more is better than the double we get if we're slightly
/// too low.
const NODE_COUNT_ESTIMATE: usize = 15_000;
Collaborator
@TheBlueMatt TheBlueMatt Jan 14, 2026

Oh, oops, these are active-only counts. My node (restarted less than a week ago, I believe, so hasn't pruned channels where one side has been disabled for a week) shows network_nodes: 17013, network_channels: 54264. Thus, if we use 50k/15k, what we'll actually end up with is 100k/30k as the allocation (I assume the vec doubles? Maybe not for large allocations?)
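
(For context on the doubling: Rust's Vec, and similarly the hashbrown-based maps, grow geometrically once their initial capacity is exceeded, so a slightly-too-low estimate can roughly double the allocation. A tiny stand-alone sketch, not LDK code:)

fn main() {
    // Pre-allocate slightly less than we end up inserting.
    let mut v: Vec<u64> = Vec::with_capacity(50_000);
    for i in 0..50_001u64 {
        v.push(i);
    }
    // Exceeding the initial capacity triggers geometric growth (typically ~2x;
    // the exact factor is an implementation detail), so capacity ends up far
    // above the 50_001 elements actually stored.
    println!("len = {}, capacity = {}", v.len(), v.capacity());
}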

Contributor Author
@tnull tnull Jan 15, 2026

Sure, can bump the numbers a bit if you prefer:

> git diff-tree -U2 508d806b0 39ca7cb1d
diff --git a/lightning/src/routing/gossip.rs b/lightning/src/routing/gossip.rs
index 9ebefceb7..534bebe76 100644
--- a/lightning/src/routing/gossip.rs
+++ b/lightning/src/routing/gossip.rs
@@ -1691,8 +1691,6 @@ where
                let channels_map_capacity = (channels_count as u128 * 115 / 100)
                        .try_into()
+                       .map(|v: usize| v.min(MAX_CHAN_COUNT_LIMIT))
                        .map_err(|_| DecodeError::InvalidValue)?;
-               if channels_map_capacity > MAX_CHAN_COUNT_LIMIT {
-                       return Err(DecodeError::InvalidValue);
-               }
                let mut channels = IndexedMap::with_capacity(channels_map_capacity);
                for _ in 0..channels_count {
@@ -1708,9 +1706,8 @@ where
                }
                // Pre-allocate 115% of the known channel count to avoid unnecessary reallocations.
-               let nodes_map_capacity: usize =
-                       (nodes_count as u128 * 115 / 100).try_into().map_err(|_| DecodeError::InvalidValue)?;
-               if nodes_map_capacity > MAX_NODE_COUNT_LIMIT {
-                       return Err(DecodeError::InvalidValue);
-               }
+               let nodes_map_capacity: usize = (nodes_count as u128 * 115 / 100)
+                       .try_into()
+                       .map(|v: usize| v.min(MAX_NODE_COUNT_LIMIT))
+                       .map_err(|_| DecodeError::InvalidValue)?;
                let mut nodes = IndexedMap::with_capacity(nodes_map_capacity);
                for i in 0..nodes_count {

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

let channels_map_capacity = (channels_count as u128 * 115 / 100)
	.try_into()
	.map_err(|_| DecodeError::InvalidValue)?;
if channels_map_capacity > MAX_CHAN_COUNT_LIMIT {
Collaborator

Maybe someone is doing a 1M channels test? Let's just limit the pre-allocation rather than failing.

Contributor Author

Not sure I'm following? If the concern is that we read a network graph that is somehow so large that we'd allocate 'infinite memory', we'd want to fail here?

If we don't think we'd ever run into that, we don't need a hard limit at all?

Collaborator

I don't think 10M nodes and 100M channels is "infinite memory". E.g., a while back Rusty did his "million channels" test to see if he could run a network with 1M channels and figure out what the bottlenecks were. I don't think it's crazy to think someone might do a 100M-and-1-channel test :).

Contributor Author

Right, but then I'm not sure I understand the main concern in the first place. Are we considering a case where we read a maliciously crafted NetworkGraph dump? Or when else would we ever hit the 'infinite memory' case you requested the upper bound for in #4306 (comment)?

Collaborator

The concern is that if we read a corrupt NetworkGraph and the value here comes out as 2^64, then we'll immediately crash trying to allocate, rather than failing the read with an error (when we run out of bytes to read because the file we're reading isn't 2^64 bytes long).

Contributor Author

Right, but then returning an error if it surpasses the limits is the right thing to do? Or are you just saying we should raise the limits further? If so, please suggest some numbers, happy to adjust.

Collaborator
@TheBlueMatt TheBlueMatt Jan 15, 2026

In most of the rest of the crate we avoid picking failure constants by just allocating min(read-count, reasonable-max). IMO we should do the same here. It's entirely possible that a value of 100M channels is a corrupt graph and we should avoid pre-allocating it. It's also possible that someone is doing some benchmarking.

Contributor Author

Okay, not sure if I entirely follow the logic, but no strong opinion here. Force-pushed the min approach:

> git diff-tree -U2 508d806b0 32c4362a4
diff --git a/lightning/src/routing/gossip.rs b/lightning/src/routing/gossip.rs
index 9ebefceb7..95dd5a7aa 100644
--- a/lightning/src/routing/gossip.rs
+++ b/lightning/src/routing/gossip.rs
@@ -1691,8 +1691,6 @@ where
                let channels_map_capacity = (channels_count as u128 * 115 / 100)
                        .try_into()
-                       .map_err(|_| DecodeError::InvalidValue)?;
-               if channels_map_capacity > MAX_CHAN_COUNT_LIMIT {
-                       return Err(DecodeError::InvalidValue);
-               }
+                       .map_err(|_| DecodeError::InvalidValue)?
+                       .min(MAX_CHAN_COUNT_LIMIT);
                let mut channels = IndexedMap::with_capacity(channels_map_capacity);
                for _ in 0..channels_count {
@@ -1708,9 +1706,8 @@ where
                }
                // Pre-allocate 115% of the known channel count to avoid unnecessary reallocations.
-               let nodes_map_capacity: usize =
-                       (nodes_count as u128 * 115 / 100).try_into().map_err(|_| DecodeError::InvalidValue)?;
-               if nodes_map_capacity > MAX_NODE_COUNT_LIMIT {
-                       return Err(DecodeError::InvalidValue);
-               }
+               let nodes_map_capacity: usize = (nodes_count as u128 * 115 / 100)
+                       .try_into()
+                       .map_err(|_| DecodeError::InvalidValue)?
+                       .min(MAX_NODE_COUNT_LIMIT);
                let mut nodes = IndexedMap::with_capacity(nodes_map_capacity);
                for i in 0..nodes_count {

@tnull tnull force-pushed the 2026-01-allow-empty-network-graph branch from 070002c to 508d806 on January 15, 2026 10:01
@tnull tnull requested a review from TheBlueMatt January 15, 2026 10:02
@tnull tnull moved this to Goal: Merge in Weekly Goals Jan 15, 2026
@tnull tnull self-assigned this Jan 15, 2026
@TheBlueMatt TheBlueMatt removed their request for review January 15, 2026 19:23
@tnull tnull force-pushed the 2026-01-allow-empty-network-graph branch 2 times, most recently from 05a4286 to 32c4362 on January 16, 2026 08:10
tnull added 2 commits January 16, 2026 09:15
NetworkGraph: Determine pre-allocations using actual numbers when reading
NetworkGraph: Only pre-allocate memory on mainnet
@tnull tnull force-pushed the 2026-01-allow-empty-network-graph branch from 32c4362 to 39ca7cb on January 16, 2026 08:15
@tnull tnull changed the title NetworkGraph: Allow to construct with custom count estimates, update numbers, determine pre-allocation dynamically on read NetworkGraph: Update chan/node estimation numbers, determine pre-allocation dynamically on read Jan 16, 2026
@tnull tnull requested a review from TheBlueMatt January 16, 2026 09:44
Collaborator
@TheBlueMatt TheBlueMatt left a comment

thanks!

@TheBlueMatt TheBlueMatt merged commit 4281e49 into lightningdevkit:main Jan 16, 2026
20 checks passed
@github-project-automation github-project-automation bot moved this from Goal: Merge to Done in Weekly Goals Jan 16, 2026