Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug]: force close from settled HTLC being falsely considered as dangling #9486

Open
yyforyongyu opened this issue Feb 6, 2025 · 2 comments
Labels
advanced Issues suitable for very experienced developers commitments Commitment transactions containing the state of the channel htlcswitch itests Issues related to integration tests. needs triage

Comments

@yyforyongyu
Copy link
Member

First seen from a macOS build, where an unexpected pending commit was found,

// When running in macOS, we might see three anchor sweeps - one from
// the local, one from the remote, and one from the pending remote:
// - the local one will be swept.
// - the remote one will be marked as failed due to `testmempoolaccept`
// check.
// - the pending remote one will not be attempted due to it being
// uneconomical since it was not used for CPFP.
// The anchor from the pending remote may or may not appear, which is a
// bug found only in macOS - when updating the commitments, the channel
// state machine somehow thinks we still have a pending remote
// commitment, causing it to sweep the anchor from that version.

A similar issue was found from this build, where the channel was force close due to a supposely settled HTLC still showing up as dangling,

2025-02-05 10:34:09.256 [DBG] CHIO dispatcher.go:271: Height[   899]: Waiting for consumer[ChannelArbitrator([ChanPoint: Bob=>Carol])] to process it
...
2025-02-05 10:34:09.259 [DBG] CNCT channel_arbitrator.go:1863: ChannelArbitrator([ChanPoint: Bob=>Carol]): checking commit chain actions at height=899, in_htlc_count=0, out_htlc_count=0
2025-02-05 10:34:09.259 [INF] CNCT channel_arbitrator.go:2176: ChannelArbitrator([ChanPoint: Bob=>Carol]): fail dangling htlc=2613a176b31693215e2bbde91f91e64d1b77f9ba1c5a8f912a9566fd7b2ee958 from local/remote commitments diff
...
2025-02-05 10:34:09.274 [DBG] CNCT channel_arbitrator.go:1669: ChannelArbitrator([ChanPoint: Bob=>Carol]): attempting state step with trigger=chainTrigger from state=StateBroadcastCommit at height=899
2025-02-05 10:34:09.274 [INF] CNCT channel_arbitrator.go:1110: ChannelArbitrator([ChanPoint: Bob=>Carol]): force closing chan

The HTLC was sent using Alice->Bob-Carol, which was settled by Carol and the messages were exchanged succesfully between Bob and Carol,

Bob's log:

2025-02-05 10:34:04.749 [DBG] PEER brontide.go:2412: Peer([Carol]): Sending UpdateAddHTLC(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, id=0, amt=100000000 mSAT, expiry=899, hash=2613a176b31693215e2bbde91f91e64d1b77f9ba1c5a8f912a9566fd7b2ee958, blinding_point=, custom_records=map[106823:[0]]) to [Carol]@127.0.0.1:12579
2025-02-05 10:34:04.820 [DBG] PEER brontide.go:2412: Peer([Carol]): Sending CommitSig(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, num_htlcs=1) to [Carol]@127.0.0.1:12579
2025-02-05 10:34:04.837 [DBG] PEER brontide.go:2412: Peer([Carol]): Received RevokeAndAck(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, rev=325d0ac53bebcb33c9f11008603745225cb09864e3a93409617b327cf6bfd1c7, next_point=02c3cc0c8476a82fa9d88ae42c6f8ce9e9c958f7b882445afa5a3ec25c7311c15f) from [Carol]@127.0.0.1:12579
2025-02-05 10:34:04.860 [DBG] PEER brontide.go:2412: Peer([Carol]): Received CommitSig(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, num_htlcs=1) from [Carol]@127.0.0.1:12579
2025-02-05 10:34:04.899 [DBG] PEER brontide.go:2412: Peer([Carol]): Sending RevokeAndAck(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, rev=1b0854bb0653ea3e152ef5b36e46865086bcf0639be806bb3b830caf5f1d45c5, next_point=02d25bc0770f95372bbe3c24b72a9b48f47c1842119e71ffe763f12aa590ce2787) to [Carol]@127.0.0.1:12579
...
2025-02-05 10:34:06.631 [DBG] PEER brontide.go:2412: Peer([Carol]): Received UpdateFulfillHTLC(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, id=0, pre_image=0203d0bbb138523f4816347f0383ac6d365da08e8559eb7ec4a51234537b32b0, custom_records=map[]) from [Carol]@127.0.0.1:12579
2025-02-05 10:34:06.784 [DBG] PEER brontide.go:2412: Peer([Carol]): Received CommitSig(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, num_htlcs=0) from [Carol]@127.0.0.1:12579
2025-02-05 10:34:06.896 [DBG] PEER brontide.go:2412: Peer([Carol]): Sending RevokeAndAck(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, rev=bb361f254b37fac430c767c81af41466d4ab3b01f380d4c7f8651f02da2e2779, next_point=0309fd7a2d8fa885c4f47ff92759d0f30a19145334926c1fc0c9861d5ca8d185cb) to [Carol]@127.0.0.1:12579
2025-02-05 10:34:06.970 [DBG] PEER brontide.go:2412: Peer([Carol]): Sending CommitSig(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, num_htlcs=0) to [Carol]@127.0.0.1:12579
2025-02-05 10:34:07.223 [DBG] PEER brontide.go:2412: Peer([Carol]): Received RevokeAndAck(chan_id=500cd2fde8f7711023574fbd0298cfecbe37eaf87a030ef8f5c802598ff72de5, rev=1b03081de96c5273140d6a85281293bfc6b6ab5c1070d3ad37f589d0c77267b3, next_point=02a3845815d0933f104af8cf84e61ff334de2a0cfe2d3709a2800124d0b257c8e2) from [Carol]@127.0.0.1:12579

Note that Bob received Carol's RevokeAndAck at 2025-02-05 10:34:07.223, but the msg was processed 2 seconds later, which indicates the process might be blocked,

2025-02-05 10:34:09.526 [ERR] HSWC link.go:4494: ChannelLink([ChanPoint: Bob=>Carol]): failing link: unable to accept revocation: cannot mutate borked channel with error: invalid revocation
2025-02-05 10:34:09.526 [ERR] HSWC link.go:1359: ChannelLink([ChanPoint: Bob=>Carol]): link failed, exiting htlcManager
2025-02-05 10:34:09.526 [INF] HSWC link.go:1258: ChannelLink([ChanPoint: Bob=>Carol]): exited

This could be a flake, but need more investigation to be sure. Included the logs here for future analysis.
72-sweep_cpfp_anchor_incoming_timeout-Alice-03f29905.log
73-sweep_cpfp_anchor_incoming_timeout-Bob-026ee264.log
74-sweep_cpfp_anchor_incoming_timeout-Carol-02806e9d.log

@yyforyongyu yyforyongyu added advanced Issues suitable for very experienced developers commitments Commitment transactions containing the state of the channel htlcswitch itests Issues related to integration tests. needs triage labels Feb 6, 2025
@yyforyongyu
Copy link
Member Author

yyforyongyu commented Feb 19, 2025

Should be fixed by #9501

@yyforyongyu
Copy link
Member Author

Reopen as it's seen again here.

@yyforyongyu yyforyongyu reopened this Feb 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
advanced Issues suitable for very experienced developers commitments Commitment transactions containing the state of the channel htlcswitch itests Issues related to integration tests. needs triage
Projects
None yet
Development

No branches or pull requests

1 participant