fix(qbft): commit message handling in catch-up scenarios #548
236a243 to e702cf6
Add failing test that demonstrates QBFT incorrectly drops commit messages when no proposal has been accepted for the current round. This prevents proper catch-up scenarios where nodes should be able to achieve consensus based on commit quorum even without seeing the original proposal. The test shows that commit messages are dropped in the commit message handling logic, which should buffer these messages instead for later processing when a proposal arrives.
e702cf6 to fb89af0
So we should just store the commit in the container before checking if the proposal was accepted?
From what I understand, we can receive the messages out of order, so we need to buffer the commit messages in case we receive the proposal later. Do you think it makes sense?
The QBFT Specification Requirement

According to the QBFT specification, if:

Then we should accept the value as committed, regardless of the order these messages arrived.

Why Buffering is Necessary

In distributed systems, messages can arrive out of order:

Scenario 1 (Normal Order):

Scenario 2 (Out of Order - Catch-up):

Current Anchor Bug

The current code at lines 748-750:

    if !self.proposal_accepted_for_current_round {
        warn!(from=?operator_id, ?self.state, "Have not accepted Proposal for current round yet");
        return; // ❌ DROPS commit messages instead of buffering
    }

This violates the QBFT specification because it prevents Scenario 2 from working correctly.

Correct Implementation Should:

This ensures that message ordering doesn't affect consensus correctness, which is a fundamental requirement for robust distributed consensus protocols like QBFT.
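The buffering approach discussed above can be sketched as follows. This is a minimal illustration, not Anchor's actual code: `Instance`, `on_commit`, `on_proposal`, `try_decide`, and the `u64` aliases are hypothetical names, and message validation (signatures, committee membership, round changes) is left out. Only the field name `proposal_accepted_for_current_round` is taken from the snippet quoted in the comment. The sketch assumes a decision is made once a proposal has been accepted and a quorum of matching commits has been stored, in whichever order they arrive.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; these are not Anchor's real types.
type OperatorId = u64;
type Round = u64;
type Value = u64; // stands in for a data hash

#[derive(Default)]
struct Instance {
    quorum: usize,
    current_round: Round,
    proposal_accepted_for_current_round: bool,
    accepted_value: Option<Value>,
    // Commits are stored per round and per sender, whether or not a
    // proposal has been accepted yet.
    commits: HashMap<Round, HashMap<OperatorId, Value>>,
    decided: Option<Value>,
}

impl Instance {
    fn new(quorum: usize, round: Round) -> Self {
        Self { quorum, current_round: round, ..Default::default() }
    }

    // Store the commit first, then check whether a decision is possible.
    // Nothing is dropped when the proposal has not arrived yet.
    fn on_commit(&mut self, from: OperatorId, round: Round, value: Value) {
        self.commits.entry(round).or_default().insert(from, value);
        self.try_decide();
    }

    // Accept the proposal, then re-evaluate the commits buffered earlier.
    fn on_proposal(&mut self, round: Round, value: Value) {
        if round != self.current_round || self.proposal_accepted_for_current_round {
            return;
        }
        self.proposal_accepted_for_current_round = true;
        self.accepted_value = Some(value);
        self.try_decide();
    }

    // Decide only when a proposal is accepted and a quorum of stored
    // commits for the current round matches its value.
    fn try_decide(&mut self) {
        if self.decided.is_some() || !self.proposal_accepted_for_current_round {
            return;
        }
        let Some(value) = self.accepted_value else { return };
        let matching = self
            .commits
            .get(&self.current_round)
            .map_or(0, |by_sender| by_sender.values().filter(|v| **v == value).count());
        if matching >= self.quorum {
            self.decided = Some(value);
        }
    }
}
```

With a quorum of 3, calling `on_commit` three times for the same value leaves `decided` empty, and a later `on_proposal` with that value sets it. Keying the buffer by sender means a repeated commit from one operator overwrites its earlier entry instead of counting twice.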
5502512 to 5b159ee
Replace the incomplete buffering test with a comprehensive test that verifies complete QBFT consensus behavior in out-of-order scenarios.

The new test_consensus_with_commits_before_proposal() test:
- Sends commit messages before proposal (realistic catch-up scenario)
- Verifies actual consensus achievement with correct value
- Tests complete distributed systems behavior rather than just buffering

This provides better validation of QBFT message ordering independence and ensures the fix addresses the complete consensus protocol behavior.
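The ordering-independence property this commit message describes can be illustrated with a toy model. The sketch below is not the PR's `test_consensus_with_commits_before_proposal()` test and does not use Anchor's test harness: `Msg` and `run` are hypothetical names, the model covers a single round only, and it assumes a decision requires both an accepted proposal and a quorum of matching commits.

```rust
use std::collections::HashSet;

// Hypothetical message type for a single-round toy model.
#[derive(Clone, Copy)]
enum Msg {
    Proposal { value: u64 },
    Commit { from: u64, value: u64 },
}

// Feeds messages, in order, to a toy instance that buffers commits.
// Returns the decided value, if a decision is reached.
fn run(messages: &[Msg], quorum: usize) -> Option<u64> {
    let mut accepted: Option<u64> = None;
    // Buffered commits as (sender, value) pairs; the set removes duplicates.
    let mut commits: HashSet<(u64, u64)> = HashSet::new();

    for msg in messages {
        match *msg {
            Msg::Proposal { value } => {
                // Only the first proposal is accepted in this toy model.
                if accepted.is_none() {
                    accepted = Some(value);
                }
            }
            Msg::Commit { from, value } => {
                // Buffered even when no proposal has been accepted yet.
                commits.insert((from, value));
            }
        }
        // Re-evaluate after every message, so a late proposal can
        // complete a decision from commits that arrived earlier.
        if let Some(v) = accepted {
            let matching = commits.iter().filter(|(_, cv)| *cv == v).count();
            if matching >= quorum {
                return Some(v);
            }
        }
    }
    None
}
```

Running `run` with a quorum of 3 on a proposal followed by three matching commits, and again on the same three commits followed by the proposal, yields the same decided value in both cases. A run with the three commits and no proposal yields no decision under this model's assumption.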
5b159ee to 45cbdaa
Okay, good job. I did some more looking into this and it does appear to be a valid issue. This is a spec compliance matter, but Anchor and go-ssv both do not implement this. I think our first step would be to bring it up to them in the Discord and coordinate a change. I don't think this change would be huge, but it does require refactoring some pieces of our logic and thinking through every case where this is needed. We are just doing the same thing as go-ssv atm.
I agree, let's do it
Think we should close these for now and make an issue for each, linking these PRs?
But thinking more about it, what would be the issue if Anchor fixes it while Go doesn't?
I don't think there is an issue, but atm I think our focus should be making sure we are 100% confident in our implementation's correctness with respect to go-ssv and then coordinate these extra changes.
Shouldn't we be correct wrt the spec? If this is really an issue that happens in production, although probably not often, an Anchor-only cluster could have worse performance. But I agree it might not be high in the priority. @jking-aus @AgeManning what are your thoughts?
Are we good to close these and just put them into an issue?
Should we raise this on Discord?
Opened issue #614
Issue Addressed
This PR addresses a bug in QBFT commit message handling where commit messages are dropped when received before a proposal, preventing consensus achievement in out-of-order message scenarios.
Proposed Changes
Adds the test test_consensus_with_commits_before_proposal, which verifies that QBFT can achieve consensus when commit messages arrive before proposals.
Additional Info
The current implementation drops commit messages when proposal_accepted_for_current_round is false. The fix involves implementing proper message buffering and re-evaluation when proposals arrive.
Test Scenario:
This ensures network resilience by allowing nodes to properly participate in consensus even when experiencing temporary network issues or joining late.