0.1.5 backports #3932


Merged
merged 12 commits into lightningdevkit:0.1 from the 2025-07-0.1.4-backports branch on Jul 16, 2025

Conversation

@TheBlueMatt TheBlueMatt (Collaborator) commented Jul 15, 2025

This backports #3687, #3858, #3890, #3907, #3923, #3933.

@TheBlueMatt TheBlueMatt added this to the 0.1.4 milestone Jul 15, 2025
@ldk-reviews-bot ldk-reviews-bot commented Jul 15, 2025

I've assigned @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

TheBlueMatt and others added 5 commits July 15, 2025 17:53
`test_node_counter_consistency` can make gossip operations *really*
slow, which makes it a pretty bad idea on a general node that just
happens to be running in debug mode. It also makes our
`lightning-rapid-gossip-sync` real-world test painfully slow.

Thus, here, we make `test_node_counter_consistency` run only in the
`lightning`-crate tests, rather than whenever `debug_assertions` are
enabled.
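
A minimal sketch of the kind of gating this describes (function names and bodies are hypothetical, not LDK's actual code): moving a check from `debug_assertions` to `cfg(test)` confines it to the crate's own test runs.

```rust
// Hypothetical illustration of the gating change described above.

// Before: compiled into *any* debug build, including downstream nodes and
// the `lightning-rapid-gossip-sync` real-world test.
#[cfg(debug_assertions)]
#[allow(dead_code)]
fn node_counter_consistency_check_old() {
    // ... expensive walk over the whole network graph ...
}

// After: only compiled when running the `lightning` crate's own tests.
#[cfg(test)]
#[allow(dead_code)]
fn node_counter_consistency_check_new() {
    // ... same expensive walk, now confined to `cargo test` ...
}

fn main() {}
```
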
When a channel is force-closed, there might be blocked monitor updates
not yet applied. But `latest_monitor_update_id` has been incremented and
assigned to these updates. This results in a panic when trying to apply
the `ChannelForceClosed` update. Use the unblocked update id instead.

Resolves: lightningdevkit#3857

Conflicts resolved in:
 * lightning/src/ln/channel.rs due to `rustfmt`-induced changes.
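
A minimal sketch of the idea, with hypothetical types and field names (LDK's actual structures differ): the force-close update must be numbered after the last update the monitor actually saw, skipping ids already handed to blocked updates.

```rust
// Hypothetical sketch: ids already assigned to blocked, not-yet-applied
// updates must be skipped when numbering the `ChannelForceClosed` update.
struct PendingUpdate {
    update_id: u64,
    blocked: bool,
}

struct Channel {
    latest_monitor_update_id: u64,
    pending_updates: Vec<PendingUpdate>,
}

impl Channel {
    /// The latest update id not held by a blocked, unapplied update.
    fn latest_unblocked_monitor_update_id(&self) -> u64 {
        let blocked = self.pending_updates.iter().filter(|u| u.blocked).count() as u64;
        self.latest_monitor_update_id - blocked
    }
}

fn main() {
    let chan = Channel {
        latest_monitor_update_id: 6,
        pending_updates: vec![PendingUpdate { update_id: 6, blocked: true }],
    };
    // Applying `ChannelForceClosed` with id 6 would panic, since the
    // monitor never saw update 6; id 5 is the last one it applied.
    assert_eq!(chan.latest_unblocked_monitor_update_id(), 5);
}
```
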
`RouteGraphNode` is the main heap entry in our Dijkstra's next-best
heap. Because it is rather constantly being sorted, we care a good
bit about its size, as fitting more entries onto a cache line can
provide some additional speed.

In 43d250d, we switched from tracking nodes during pathfinding by
their `NodeId` to a "counter", which allows us to avoid `HashMap`
lookups for much of the pathfinding process.

Because the `dist` lookup is now quite cheap (it's just a `Vec`),
there's no reason to track `NodeId`s in the heap entries. Instead,
we simply fetch the `NodeId` of the node via the `dist` map by
examining its `candidate`'s pointer to its source `NodeId`.

This allows us to remove a `NodeId` from `RouteGraphNode`, moving it
from 64 to 32 bytes, which lets us expand the `score` field size in
a coming commit without expanding `RouteGraphNode`'s size.

While we were doing the `dist` lookup in
`add_entries_to_cheapest_to_target_node` anyway, the `NodeId` lookup
via the `candidate` may not be free. Still, avoiding expanding
`RouteGraphNode` above 128 bytes in a few commits is a nice win.
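
A self-contained sketch of the shape of this change (field names and sizes are illustrative, not LDK's exact layout): the heap entry carries a small `u32` counter indexing into the `dist` table instead of an inline 33-byte `NodeId`, which is recovered from `dist` only when needed.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Illustrative heap entry: a `u32` counter replaces an inline NodeId,
// shrinking the entry so more of them fit on a cache line.
#[derive(PartialEq, Eq)]
struct RouteGraphNode {
    node_counter: u32, // index into the per-node `dist` table
    score: u64,
    total_cltv_delta: u16,
    path_length_to_node: u8,
    value_contribution_msat: u64,
}

impl Ord for RouteGraphNode {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed so `BinaryHeap` (a max-heap) pops the lowest score first.
        other
            .score
            .cmp(&self.score)
            .then_with(|| other.node_counter.cmp(&self.node_counter))
    }
}
impl PartialOrd for RouteGraphNode {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let mut heap = BinaryHeap::new();
    heap.push(RouteGraphNode { node_counter: 1, score: 10, total_cltv_delta: 40, path_length_to_node: 1, value_contribution_msat: 100_000 });
    heap.push(RouteGraphNode { node_counter: 2, score: 5, total_cltv_delta: 80, path_length_to_node: 2, value_contribution_msat: 100_000 });
    assert_eq!(heap.pop().unwrap().node_counter, 2); // lowest score first
    println!("entry size: {} bytes", std::mem::size_of::<RouteGraphNode>());
}
```
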
We track the total CLTV from the recipient to the current hop in
`RouteGraphNode` so that we can limit its total during pathfinding.
While it's great to use a `u32` for that to match existing CLTV
types, allowing a total CLTV limit of 64K blocks (455 days) is
somewhat absurd, so here we swap the `total_cltv_delta` to a `u16`.

This keeps `RouteGraphNode` to 32 bytes in a coming commit as we
expand `score`.
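
A small sketch of the narrowing (names and the limit value are illustrative): per-hop deltas arrive as `u32` in existing CLTV types, so accumulating into a `u16` total needs a fallible conversion, and any path that no longer fits can simply be rejected, as it would blow past any sane limit anyway.

```rust
// Illustrative: fold a per-hop CLTV delta (a u32 in existing types) into a
// u16 running total, rejecting paths that exceed the configured limit.
fn accumulate_cltv(total: u16, hop_delta: u32, limit: u16) -> Option<u16> {
    let hop: u16 = u16::try_from(hop_delta).ok()?;
    let new_total = total.checked_add(hop)?;
    if new_total > limit { None } else { Some(new_total) }
}

fn main() {
    // 1008 blocks is used here as a stand-in for the configured maximum.
    assert_eq!(accumulate_cltv(100, 40, 1008), Some(140));
    assert_eq!(accumulate_cltv(1000, 40, 1008), None); // over the limit
    assert_eq!(accumulate_cltv(100, 70_000, 1008), None); // can't fit in u16
}
```
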
While walking nodes in our Dijkstra's pathfinding, we may find a
channel which is amount-limited to less than the amount we're
currently trying to send. This is fine: when we encounter such a
channel, we simply limit the amount we'd send in this path if we
pick it.

When we encounter such a path, we keep summing the cost across hops
as we go, keeping whatever scores we assigned to channels between
the amount-limited one and the recipient, but using the new limited
amount for any channels we look at later as we walk towards the
sender.

This leads to somewhat inconsistent scores, especially as a large
portion of our scorer's penalties, and a portion of network fees,
are proportional to the amount. Thus, we end up with a somewhat
higher score than we "should" for this path, as later hops use a
high proportional cost. We accepted this as a simple way to bias
against small-value paths and many MPP parts.

Sadly, in practice it appears our bias is not strong enough, as
several users have reported that we often attempt far too many MPP
parts. In particular, if we encounter a channel with a small limit
early in the Dijkstra's pass (towards the end of the path), we may
prefer it over many other paths, as we start assigning very low
costs early on, before we've accumulated much cost from larger
channels.

Here, we swap the `cost` Dijkstra's score for `cost / path amount`.
This should bias much more strongly against many MPP parts by
preferring larger paths proportionally to their amount.

This somewhat better aligns with our goal - if we have to pick
multiple paths, we should be searching for paths that optimize
fee-per-sat-sent, not strictly the fee paid.

However, it might bias us against smaller paths somewhat more
strongly than we want - because we're still using the fees/scores
calculated with the sought amount for hops already processed, but
are now dividing by a smaller sent amount when walking further hops,
we will bias "incorrectly" (and fairly strongly) against smaller
parts.

Still, given the complaints about pathfinding performance due to too
many MPP paths, this seems like a worthwhile tradeoff, as ultimately
MPP splitting is always the domain of heuristics anyway.
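
A toy sketch of the new score (scaling and names are illustrative; the real cost also folds in scorer penalties): dividing accumulated cost by the amount the path can carry makes the heap compare paths by fee-per-sat-sent rather than absolute fee.

```rust
// Illustrative: score = cost / amount, scaled up so integer division
// doesn't collapse small costs to zero.
fn dijkstra_score(total_cost_msat: u64, path_amount_msat: u64) -> u64 {
    const SCALE: u64 = 1 << 16;
    total_cost_msat.saturating_mul(SCALE) / path_amount_msat.max(1)
}

fn main() {
    // Under the old absolute-cost score the small path looks 10x cheaper;
    // proportionally, both paths pay 1% of what they deliver.
    assert_eq!(dijkstra_score(10_000, 1_000_000), dijkstra_score(1_000, 100_000));
    // A path carrying 10x the amount for only 2x the cost now wins.
    assert!(dijkstra_score(2_000, 1_000_000) < dijkstra_score(1_000, 100_000));
}
```
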
TheBlueMatt and others added 7 commits July 15, 2025 21:49
In `handle_new_monitor_update!`, we correctly check that the
channel doesn't have any blocked monitor updates pending before
calling `handle_monitor_update_completion!` (which calls
`Channel::monitor_updating_restored`, which in turn assumes that
all generated `ChannelMonitorUpdate`s, including blocked ones, have
completed).

We, however, did not do the same check at several other places
where we called `handle_monitor_update_completion!`. Specifically,
after a monitor update completes during reload (processed via a
`BackgroundEvent`) or when a monitor update completes async, we
didn't check if there were any blocked monitor updates before
completing.

Here we add the missing check, as well as an assertion in
`Channel::monitor_updating_restored`.

Conflicts resolved in:
 * lightning/src/ln/chanmon_update_fail_tests.rs due to
   `rustfmt`-induced changes as well as other test cleanups.
 * lightning/src/ln/channelmanager.rs due to upstream Channel object
   refactoring
 * lightning/src/ln/quiescence_tests.rs which were dropped as they
   were fixing a test which only exists upstream
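
A minimal sketch of the missing guard (types and names hypothetical): completion handling must be skipped while any blocked updates remain, matching the check `handle_new_monitor_update!` already performed.

```rust
// Hypothetical sketch of the guard added in this commit.
struct Channel {
    blocked_monitor_updates: Vec<u64>,
}

impl Channel {
    fn blocked_monitor_updates_pending(&self) -> usize {
        self.blocked_monitor_updates.len()
    }

    fn monitor_updating_restored(&mut self) {
        // New assertion: every generated update, including blocked ones,
        // must have completed before we restore the channel.
        assert_eq!(self.blocked_monitor_updates_pending(), 0);
        // ... resume normal channel operation ...
    }
}

fn handle_monitor_update_completion(chan: &mut Channel) {
    // The previously-missing check: don't complete while blocked updates
    // are still pending (e.g. after reload or an async completion).
    if chan.blocked_monitor_updates_pending() == 0 {
        chan.monitor_updating_restored();
    }
}

fn main() {
    let mut chan = Channel { blocked_monitor_updates: vec![42] };
    handle_monitor_update_completion(&mut chan); // correctly does nothing
    chan.blocked_monitor_updates.clear();
    handle_monitor_update_completion(&mut chan); // now safe to restore
}
```
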
In a number of tests we require available UTXOs to do HTLC anchor
claims by bringing our own fees. We previously wrote that out in
each test, which is somewhat verbose, so here we simply add a test
utility that gives each node a full BTC in a single UTXO.

Trivial conflicts resolved in:
 * lightning/src/ln/monitor_tests.rs
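
A sketch of what such a helper might look like (entirely hypothetical; the real utility operates on LDK's test node harness and confirms a real coinbase output):

```rust
// Hypothetical stand-in for the test utility described above.
const ONE_BTC_SATS: u64 = 100_000_000;

struct TestNode {
    spendable_utxos: Vec<u64>, // values in sats
}

/// Give every node a single 1 BTC UTXO with which to bring its own fees
/// when claiming HTLCs via anchors.
fn provide_anchor_reserves(nodes: &mut [TestNode]) {
    for node in nodes.iter_mut() {
        node.spendable_utxos.push(ONE_BTC_SATS);
    }
}

fn main() {
    let mut nodes = vec![
        TestNode { spendable_utxos: vec![] },
        TestNode { spendable_utxos: vec![] },
    ];
    provide_anchor_reserves(&mut nodes);
    assert!(nodes.iter().all(|n| n.spendable_utxos == [ONE_BTC_SATS]));
}
```
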
We have to prune locktimed packages when their inputs are spent,
otherwise the notification of the spent watched outputs might be
missed. This can lead to locktimed packages with spent inputs being
added back to the pending claim requests in the future, where they
are never cleaned up until the node restarts.

Resolves: lightningdevkit#3859

Conflicts resolved in:
 * lightning/src/ln/functional_tests.rs due to upstream changes of
   removed code
 * lightning/src/ln/monitor_tests.rs due to trivial upstream
   changes
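
A self-contained sketch of the pruning (types simplified to tuples; LDK keys real claim packages by locktime): when a block spends an outpoint we were waiting on, any still-locktimed package spending that outpoint is dropped so it can never be re-added as a claim.

```rust
// Illustrative pruning of locktimed packages whose inputs were spent.
use std::collections::{BTreeMap, HashSet};

type Outpoint = (u64, u32); // (txid placeholder, vout)

struct OnchainTxHandler {
    // Packages queued until their locktime, keyed by that locktime.
    locktimed_packages: BTreeMap<u32, Vec<Vec<Outpoint>>>,
}

impl OnchainTxHandler {
    /// Drop any queued package that spends an outpoint a block just spent.
    fn prune_locktimed_packages(&mut self, spent: &HashSet<Outpoint>) {
        for packages in self.locktimed_packages.values_mut() {
            packages.retain(|pkg| pkg.iter().all(|op| !spent.contains(op)));
        }
    }
}

fn main() {
    let mut handler = OnchainTxHandler { locktimed_packages: BTreeMap::new() };
    handler.locktimed_packages.insert(800_000, vec![vec![(1, 0)]]);
    let spent: HashSet<Outpoint> = [(1, 0)].into_iter().collect();
    handler.prune_locktimed_packages(&spent);
    assert!(handler.locktimed_packages[&800_000].is_empty());
}
```
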
When we have an outpoint to claim which is lock-timed and the
locktime is reached, we add it to
`OnchainTxHandler::claimable_outpoints` to indicate the outpoint is
now being claimed. However, `claimable_outpoints` is supposed to
track when the outpoint first appeared on chain so that we can
remove the claim if the outpoint is reorged out.

Sadly, in the handling for lock-timed packages, we incorrectly
stored the current height in `claimable_outpoints`, causing such
claims to be removed in case of a reorg right after they were
generated, even if the output we intend to claim isn't removed at
all.

Here we start tracking when the outpoint we're spending was created
in `PackageSolvingData`'s constituent types. While we could have
tracked this information in `PackageTemplate`, doing so would
preclude later merging of packages that spend outpoints included in
different blocks, which we don't want to rule out.

Conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs,
 * lightning/src/chain/onchaintx.rs, and
 * lightning/src/chain/package.rs due to upstream changes to
   package struct fields.
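
A sketch of the data-model change (field and variant names are hypothetical, echoing but not reproducing LDK's types): the confirmation height lives on each input's solving data, not on the package, so packages spending outpoints from different blocks can still merge.

```rust
// Illustrative: track the spent outpoint's confirmation height per input.
struct HolderHTLCOutput {
    amount_msat: u64,
    // New in this commit (name hypothetical): the height at which the
    // transaction creating this outpoint confirmed. `None` for data
    // serialized before the upgrade.
    outpoint_confirmation_height: Option<u32>,
}

enum PackageSolvingData {
    HolderHTLCOutput(HolderHTLCOutput),
    // ... other variants elided ...
}

impl PackageSolvingData {
    /// The confirmation height of the outpoint being spent, if known.
    fn outpoint_confirmation_height(&self) -> Option<u32> {
        match self {
            PackageSolvingData::HolderHTLCOutput(o) => o.outpoint_confirmation_height,
        }
    }
}

fn main() {
    let data = PackageSolvingData::HolderHTLCOutput(HolderHTLCOutput {
        amount_msat: 5_000,
        outpoint_confirmation_height: Some(799_990),
    });
    assert_eq!(data.outpoint_confirmation_height(), Some(799_990));
}
```
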
When we have an outpoint to claim which is lock-timed and the
locktime is reached, we add it to
`OnchainTxHandler::claimable_outpoints` to indicate the outpoint is
now being claimed. However, `claimable_outpoints` is supposed to
track when the outpoint first appeared on chain so that we can
remove the claim if the outpoint is reorged out.

Sadly, in the handling for lock-timed packages, we incorrectly
stored the current height in `claimable_outpoints`, causing such
claims to be removed in case of a reorg right after they were
generated, even if the output we intend to claim isn't removed at
all.

Here we use the creation-height tracking added in the previous
commit to actually address the issue, using the tracked height when
adding a claim to `OnchainTxHandler::claimable_outpoints`.

In cases where we have no information, we continue to use the
current height, retaining the issue for locktimed packages on
upgrades, but this simplifies cases where we actually don't have
the information available anyway.

Trivial conflicts resolved in:
 * lightning/src/chain/package.rs
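
A tiny sketch of the fallback described here (names hypothetical): register the claim under the outpoint's tracked confirmation height when available, and only fall back to the current height for legacy data.

```rust
// Illustrative: pick the height under which to register a locktimed claim.
fn claim_registration_height(
    tracked_confirmation_height: Option<u32>,
    current_height: u32,
) -> u32 {
    tracked_confirmation_height.unwrap_or(current_height)
}

fn main() {
    // With tracking, a reorg of the current tip no longer removes the
    // claim, since the outpoint itself confirmed earlier.
    assert_eq!(claim_registration_height(Some(799_990), 800_005), 799_990);
    // Legacy data without the height keeps the old (imperfect) behavior.
    assert_eq!(claim_registration_height(None, 800_005), 800_005);
}
```
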
This adds a single test which exercises both the ability to prune
locktimed packages when inputs are spent as well as the
creation-height tracking for locktimed packages.

Trivial conflicts resolved in:
 * lightning/src/ln/reorg_tests.rs

`next_remote_commit_tx_fee_msat` previously mistakenly classified HTLCs
with values equal to the dust limit as dust.

This did not cause any force closes because the code that builds
commitment transactions for signing correctly trims dust HTLCs.

Nonetheless, this can cause `next_remote_commit_tx_fee_msat` to predict
a weight for the next remote commitment transaction that is
significantly lower than the eventual weight. This allows a malicious
channel funder to create an unbroadcastable commitment for the channel
fundee by adding HTLCs with values equal to the dust limit to the
commitment transaction; according to the fundee, the funder has not
exhausted their reserve because all the added HTLCs are dust, while in
reality all the HTLCs are non-dust, and the funder does not have the
funds to pay the minimum feerate to enter the mempool.

Conflicts resolved in:
 * lightning/src/ln/htlc_reserve_unit_tests.rs which is a new file
   upstream. The new test was instead moved to
   lightning/src/ln/functional_tests.rs and rewritten where the
   upstream API has changed (in some cases nontrivially).
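
A toy sketch of the boundary condition (thresholds simplified; the real check also folds the per-HTLC transaction fee into the threshold): an HTLC exactly at the dust threshold is non-dust, and the fee prediction must agree with commitment building on that.

```rust
// Illustrative: the dust predicate used when predicting the next remote
// commitment's weight must match the one used when actually building it.
fn is_dust(htlc_amount_sat: u64, dust_threshold_sat: u64) -> bool {
    htlc_amount_sat < dust_threshold_sat // strictly below: equal => non-dust
}

fn main() {
    let dust_threshold_sat = 354;
    // The buggy prediction treated an exactly-at-threshold HTLC as dust
    // (effectively `<=`), undercounting the next commitment's weight.
    assert!(!is_dust(354, dust_threshold_sat));
    assert!(is_dust(353, dust_threshold_sat));
}
```
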
@TheBlueMatt TheBlueMatt force-pushed the 2025-07-0.1.4-backports branch from 820e2c8 to 382e71b Compare July 15, 2025 21:50
@TheBlueMatt TheBlueMatt mentioned this pull request Jul 15, 2025
@TheBlueMatt TheBlueMatt merged commit 6e6e4dc into lightningdevkit:0.1 Jul 16, 2025
18 of 26 checks passed