multi: update close logic to handle re-orgs of depth n-1, where n is num confs - add min conf floor #10265
Conversation
Code Review
This pull request introduces significant improvements to the channel closing logic, making it more robust against chain reorganizations by implementing a state machine in the chainWatcher. It also adds a more nuanced, capacity-based confirmation scaling for channel closes. The changes are well-structured, with logic centralized in new files like lnwallet/confscale.go and supported by an impressive suite of new tests, including property-based tests for reorg scenarios. My review found a couple of minor areas for improvement: one related to potentially redundant code in peer/brontide.go and a naming suggestion in lnwallet/confscale_prod.go for better readability.
Looking into the itest failure, I think it's due to a change in how we notify the coop close.
Pushed up a few fixup commits. One thing I realized is that the old blockbeat sync processing assumptions no longer apply. This is due to the fact that we won't call …. IMO, the best way to handle this may just be to add a devrpc call that we can use to force a sweep.
// notifications are done in two different goroutines, so the
// expected order: [receive block -> receive spend] is not
// guaranteed.
case spend, ok := <-c.fundingSpendNtfn.Spend:
It looks like an easier and more direct way to resolve this is to listen to a conf channel instead of a spend channel? So replace c.fundingSpendNtfn with a c.fundingConfNtfn, and other logic can stay untouched?
So we don't know what to listen for confs of until we get the spend. Hence the need to request confs for each spend we get. With the way things work, if a spend confirms, then is re-org'd out, and we then get another spend, we'll get another notification on this same spend channel.
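A sketch of the per-spend registration flow described above: every time the funding output's Spend channel fires (including a fresh spend after a re-org), we register a new conf notification for that particular spending transaction. This is not the PR's actual code; the method name is hypothetical, the chainWatcher internals (c.cfg) follow the surrounding discussion, and the output index is simplified for illustration.

package contractcourt

import "github.com/lightningnetwork/lnd/chainntnfs"

// registerSpendConfs (hypothetical name) registers for numConfs
// confirmations of one specific spend of the funding output. If a
// re-org replaces this spend, the Spend channel fires again and we
// call this once more for the replacement transaction.
func (c *chainWatcher) registerSpendConfs(spend *chainntnfs.SpendDetail,
	numConfs, heightHint uint32) (*chainntnfs.ConfirmationEvent, error) {

	// Output index 0 is used purely for illustration; the real
	// code would watch the relevant output of the spending tx.
	return c.cfg.notifier.RegisterConfirmationsNtfn(
		spend.SpenderTxHash,
		spend.SpendingTx.TxOut[0].PkScript,
		numConfs, heightHint,
	)
}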
Sorry I wasn't being clear enough offline. What I meant is, we can just listen for the spentness of the fundingPkScript with a nil tx via c.cfg.notifier.RegisterConfirmationsNtfn(nil, fundingPkScript, numConfs, beat.CurrentHeight()). Then we can replace the usage of c.fundingSpendNtfn with c.fundingConfNtfn, and the change will be smaller.
case spend, ok := <-c.fundingSpendNtfn.Spend:
// If the channel was closed, then this means that the
// notifier exited, so we will as well.
spend := c.handleBlockbeat(beat)
When a reorg happens, a replaced block will be sent from the blockbeat, and we can ask the subsystem to reprocess this new block; hence the reorg is handled. That was part of the initial design, so we can have a single piece of logic to handle reorgs, assuming the blockbeat is extended to other subsystems like the funding manager or gossip. Atm the blockbeat only gives the block height data, which is very limited, and it could easily carry more info, like the utxos spent, so we could just use a callback query to check for spends of the outpoints we're interested in. This would have the advantage of a sync, single source of truth to check for spentness. This linear flow makes the code easier to reason about and maintain.
These are long-term projects tho, just want to mention it since it's related to the reorg handling.
We have two versions: for itests, we just use one conf, but in prod, we'll scale the number of confirmations.
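For reference, a minimal sketch of how such a build-tag split could look. The file names follow the review's mention of lnwallet/confscale_prod.go, but the function name, the integration tag, and the scaleCloseConfs helper (sketched further down this thread) are assumptions, not the PR's actual code.

// confscale_prod.go -- compiled for normal builds.
//go:build !integration

package lnwallet

import "github.com/btcsuite/btcd/btcutil"

// closeNumConfs (hypothetical name) scales the close confirmation
// target with channel capacity in production builds.
func closeNumConfs(capacity btcutil.Amount) uint32 {
	return scaleCloseConfs(capacity)
}

// confscale_itest.go -- compiled for itest builds: always a single
// conf so tests stay fast and deterministic.
//go:build integration

package lnwallet

import "github.com/btcsuite/btcd/btcutil"

func closeNumConfs(_ btcutil.Amount) uint32 {
	return 1
}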
This'll be useful for the upcoming itests.
In this commit, we add a new param that'll allow us to scale up the number of confirmations before we act on a new close. We'll use this later to improve the current on chain handling logic.
We want to add better handling, but not break any UIs or wallets. So we'll continue to send out a notification after a single confirmation, then send another after things are fully confirmed.
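A minimal sketch of that two-stage notification idea. All names here (closeUpdate, notifyClose, waitForConfs) are hypothetical stand-ins, not lnd's actual rpc-layer API.

// closeUpdate is an illustrative stand-in for lnd's
// ChannelCloseUpdate.
type closeUpdate struct {
	ClosingTxid []byte
	Success     bool
}

// notifyClose emits one update after the first confirmation, so
// existing wallets/UIs see no behavior change, then a final update
// once the scaled conf target is reached. waitForConfs is an assumed
// helper that blocks until the closing tx has that many confs.
func notifyClose(updates chan<- *closeUpdate, closingTxid []byte,
	fullConfs uint32, waitForConfs func(numConfs uint32)) {

	// Stage 1: single confirmation, preserving the old behavior.
	waitForConfs(1)
	updates <- &closeUpdate{ClosingTxid: closingTxid, Success: true}

	// Stage 2: fully confirmed under the scaled target.
	waitForConfs(fullConfs)
	updates <- &closeUpdate{ClosingTxid: closingTxid, Success: true}
}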
multi: update close logic to handle re-orgs of depth n-1, where n is num confs
In this commit, we update the close logic to handle re-orgs up to the final amount of confirmations. This is done generically, so we're able to handle events such as: coop close confirm, re-org, breach confirm, re-org, force close confirm, re-org, etc. The upcoming set of new tests will exercise all of these cases. We modify the block beat handling to unify the control flow, as it's possible we get the beat then see the spend, or the other way around.
We'll use this for all the upcoming tests.
All the tests now need to send a confirmation _after_ the spend is detected.
This set of new tests ensures that if we have created N RBF variants of the coop close transaction, any of them can confirm and be re-org'd, with us detecting the final spend once it confirms deeply enough.
In this commit, we add a set of generic close re-org tests. The most important is the property-based test, which will randomly confirm transactions, generate a re-org, then assert that eventually we detect the final version.
This ensures that during the RBF process, if one version confirms, a re-org occurs, then another confirms, we'll properly detect this case.
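To make the shape of such a property test concrete, here is a self-contained toy version. It drives a tiny model of the watcher (not the real chainWatcher with a mock notifier, as the PR's tests presumably do) through random confirm/re-org sequences and asserts convergence on the finally-confirmed variant; all names are mine.

package reorgtest

import (
	"math/rand"
	"testing"
)

// watcherModel is a tiny stand-in for the chain watcher's state: which
// close variant is currently confirming, how many confs it has, and
// which variant (if any) reached the conf target.
type watcherModel struct {
	pendingTx int // variant currently confirming, -1 if none
	confs     int
	finalTx   int // variant that reached the target, -1 if none
}

func (w *watcherModel) onSpend(tx int) { w.pendingTx, w.confs = tx, 1 }
func (w *watcherModel) onReorg()       { w.pendingTx, w.confs = -1, 0 }

func (w *watcherModel) onBlock(target int) {
	if w.pendingTx == -1 {
		return
	}
	w.confs++
	if w.confs >= target && w.finalTx == -1 {
		w.finalTx = w.pendingTx
	}
}

func TestCloseReorgConvergence(t *testing.T) {
	const (
		numVariants = 5
		numConfs    = 6
	)
	rng := rand.New(rand.NewSource(1))

	for trial := 0; trial < 1000; trial++ {
		w := &watcherModel{pendingTx: -1, finalTx: -1}

		// Random prefix: a variant confirms, gains up to
		// numConfs-1 blocks, then is re-org'd out.
		for i := rng.Intn(4); i > 0; i-- {
			w.onSpend(rng.Intn(numVariants))
			for j := rng.Intn(numConfs - 1); j > 0; j-- {
				w.onBlock(numConfs)
			}
			w.onReorg()
		}

		// The final variant confirms deeply enough and is never
		// re-org'd out; the watcher must converge on it.
		final := rng.Intn(numVariants)
		w.onSpend(final)
		for w.confs < numConfs {
			w.onBlock(numConfs)
		}

		if w.finalTx != final {
			t.Fatalf("trial %d: got %d, want %d",
				trial, w.finalTx, final)
		}
	}
}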
In this commit, we add a new TriggerSweep dev rpc command. This command can be used in the itests to trigger a sweep on demand, which will be useful to ensure that they still pass, now that some block beat handling is more async.
…trigger sweep if failed
This implements a generalized pattern where we'll try our assertion, then if that fails, we'll trigger a sweep, then try the assertion again. This uses the new TriggerSweep call that was added earlier.
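A sketch of that assert-then-sweep-then-reassert pattern as a test helper. The helper and callback names here are illustrative, not the itest's actual identifiers.

package itest

import "testing"

// assertWithSweepRetry runs assertFn; if it fails, forces a sweep via
// triggerSweep (wrapping the new TriggerSweep devrpc call), then runs
// the assertion one more time.
func assertWithSweepRetry(t *testing.T, assertFn func() error,
	triggerSweep func() error) {

	t.Helper()

	// First attempt: the sweep may already have happened.
	if err := assertFn(); err == nil {
		return
	}

	// Force a sweep, since blockbeat handling is now more async and
	// the sweeper may not have acted yet.
	if err := triggerSweep(); err != nil {
		t.Fatalf("unable to trigger sweep: %v", err)
	}

	// Retry the assertion now that the sweep was forced.
	if err := assertFn(); err != nil {
		t.Fatalf("assertion failed after sweep: %v", err)
	}
}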
This change is due to the fact that blockbeat handling is now more async in the cnct.
PTAL @yyforyongyu. This includes the extra commits for …
I think instead of RegisterSpendNtfn, we can create a single, long-lived confirmation notifier:

fundingPkScript, err := deriveFundingPkScript(c.cfg.chanState)
// ...
heightHint := c.cfg.chanState.DeriveHeightHint()
numConfs := c.requiredConfsForSpend()
fundingConfNtfn, err := c.cfg.notifier.RegisterConfirmationsNtfn(
	nil, // No specific txid, watch the script.
	fundingPkScript,
	numConfs,
	heightHint,
)
if err != nil {
	// Handle error.
}
defer fundingConfNtfn.Cancel()
We then add a checkFundingConfirmed to replace checkFundingSpend:
// checkFundingConfirmed performs a non-blocking read on the Confirmed
// channel to check whether the spend has been fully confirmed.
func (c *chainWatcher) checkFundingConfirmed() *chainntnfs.TxConfirmation {
	select {
	case conf, ok := <-c.fundingConfNtfn.Confirmed:
		if !ok {
			return nil
		}

		return conf

	default:
		return nil
	}
}
This is used in handleBlockbeat to replace the old checkFundingSpend check, and everything else stays unchanged.

The main loop then just needs fewer changes:
func (c *chainWatcher) closeObserver() {
	defer c.wg.Done()

	for {
		select {
		case beat := <-c.BlockbeatChan:
			c.handleBlockbeat(beat)

		// This case handles the initial discovery of the spend,
		// or a re-orged spend.
		case update := <-fundingConfNtfn.Updates:
			...

		// This case handles a re-org.
		case <-fundingConfNtfn.NegativeConf:
			...

		// This case handles a confirmation that might arrive
		// between blocks.
		case conf := <-fundingConfNtfn.Confirmed:
			// TODO: handleCommitSpend needs to be refactored
			// to use `conf` instead of `SpendDetail`.
			err := c.handleCommitSpend(conf)
			...

		case <-c.quit:
			return
		}
	}
}
I think this should bring the least disruption to the system and itests (fixing the remaining itest would be a nightmare🤦🏻).
// If not we return a value scaled linearly
// between 3 and 6, depending on channel size.
Just wanna note that this is no longer from 1 to 6 but 3 to 6, which makes sense.
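For context, a hedged sketch of what such a linear scale could look like. The constants, the normalization point, and the helper name are all assumptions; the PR's actual curve lives in lnwallet/confscale.go.

package lnwallet

import "github.com/btcsuite/btcd/btcutil"

const (
	// Assumed bounds matching the review comment: closes wait for
	// at least 3 and at most 6 confs.
	minCloseConfs = 3
	maxCloseConfs = 6

	// maxChanSize is an assumed normalization point (the pre-wumbo
	// channel limit of 2^24 - 1 satoshis).
	maxChanSize = btcutil.Amount(16_777_215)
)

// scaleCloseConfs linearly maps capacity in [0, maxChanSize] onto
// [minCloseConfs, maxCloseConfs]. A sketch of the idea only.
func scaleCloseConfs(capacity btcutil.Amount) uint32 {
	if capacity >= maxChanSize {
		return maxCloseConfs
	}

	scaled := minCloseConfs + int64(capacity)*
		(maxCloseConfs-minCloseConfs)/int64(maxChanSize)

	return uint32(scaled)
}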
peer/brontide.go (outdated)
if closeReq != nil {
	closeReq.Updates <- &ChannelCloseUpdate{
		ClosingTxid: closingTxid[:],
// Determine the number of confirmations to wait before
the formatting isn't quite right
		Success: true,
go peer.WaitForChanToClose(
	uint32(bestHeight), notifier, errChan, chanPoint,
	&closingTxid, closingTx.TxOut[0].PkScript, 1, func() {
hmm, should we also derive and pass the numConfs here?
closeReq.Updates <- &ChannelCloseUpdate{
	ClosingTxid: closingTxid[:],

// Determine the number of confirmations to wait before signaling a
ok i see, looks like this commit can be a fixup commit for the previous one
In this PR, we address the oldest issue in the lnd tracker: #53. Incremental improvements have landed over the years, but this is the most significant one to date.
First, we start to scale confs for closes just like we do for funding confirmation. A small refactor helps us re-use this, with additional tests added.
We revamp the chain watcher to implement a state machine to ensure that we recognize the most deeply confirmed transaction closure. Once we detect a spend, we'll register for N confirmations for that spend. If we detect another one (we can get another spend if a re-org happens and a new one confirms), then we'll re-register for confirmations for that spend. We also start to read from the NegativeConf channel, which will be sent on if the transaction is re-org'd out after confirmation.

It's important to note that the logic around the NegativeConf case is incomplete, as this will only trigger if we wait for 6 confs, then have a 7-block re-org. In that case, we've typically already claimed all of our outputs on chain. To handle this, we'd actually need to re-create the channel arb and the chain watcher, then re-handle any contested outputs we may have had. If the re-org goes all the way back to the funding transaction, then that's extremely dire and needs special handling. This isn't covered in this PR.

Most of the diff is actually just tests: unit tests, property-based tests (these help us ensure we can handle spend, conf, re-org, spend, conf with nearly arbitrary params -- eg: confirm coop, re-org, confirm force close, etc), integration tests, etc. The core change is in this commit: b1d055e.
This fixes most of #53. Re the second item: if the funding txn gets re-org'd out, then confirmed again, we'll be able to detect that and close things out normally. We still need the active logic to detect such a case though.
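To make the state machine described above concrete, here is a toy rendering of it. The state and event names are mine, not the PR's actual code, and the deep-re-org branch is deliberately a stub, since the PR leaves that recovery path unhandled.

package chainwatchersketch

// closeState models the watcher states described in the PR text.
type closeState int

const (
	// No spend of the funding output seen yet, or the last spend
	// was re-org'd out.
	stateWaitingForSpend closeState = iota

	// A spend was seen; we've registered for N confs of it.
	stateWaitingForConfs

	// The spend reached N confs; the close is considered final.
	stateCloseFinal
)

// transition maps (state, event) pairs to the next state. Events mirror
// the notifier channels: "spend", "conf" (N confs reached), "reorg"
// (the pending spend was re-org'd out), and "negative_conf" (a re-org
// deeper than N after finality).
func transition(s closeState, event string) closeState {
	switch {
	case s == stateWaitingForSpend && event == "spend":
		return stateWaitingForConfs

	// A different spend confirmed after a re-org: re-register
	// confs for the replacement transaction.
	case s == stateWaitingForConfs && event == "spend":
		return stateWaitingForConfs

	case s == stateWaitingForConfs && event == "conf":
		return stateCloseFinal

	case s == stateWaitingForConfs && event == "reorg":
		return stateWaitingForSpend

	// Depth-(n+1) re-org after finality: per the PR description,
	// full recovery here (re-creating the channel arb and chain
	// watcher) is out of scope.
	case s == stateCloseFinal && event == "negative_conf":
		return stateWaitingForSpend
	}

	return s
}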