HBASE-29501 IOException in SerialReplicationChecker.canPush causes entries to be pushed out of order #7194
Open: tomasbanet wants to merge 2 commits into apache:branch-2.6 from tomasbanet:HBASE-29501
+5
−0
Apache9 reviewed Aug 9, 2025
.../java/org/apache/hadoop/hbase/replication/regionserver/SerialReplicationSourceWALReader.java (outdated)
…tries to be pushed out of order
ae83e1e
to
e81eeac
💔 -1 overall
This message was automatically generated.
3 similar comments
@tomasbanet Please open a PR against master? Thanks.
Detailed Example
ZooKeeper state:
WALs:
hbase:meta barriers:
[2, 5, 6, 9, 17, 26, 33]
The entry with seqId=34 cannot be pushed until the previous range finishes (entries with seqId 27, 28, 30, 31, 32, between barriers 26 and 33):
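The barrier rule above can be illustrated with a minimal sketch. This is not HBase's `SerialReplicationChecker` code: the class `BarrierRangeSketch`, its methods, and the simplified rule "the previous range counts as finished once the last pushed seqId reaches the end barrier minus one" are assumptions made for illustration only, using the barrier list from this example.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and logic are not taken from HBase.
class BarrierRangeSketch {
    // Barriers from the example, sorted ascending.
    static final long[] BARRIERS = {2, 5, 6, 9, 17, 26, 33};

    /** Index of the last barrier <= seqId, or -1 if seqId precedes every barrier. */
    static int rangeIndex(long[] barriers, long seqId) {
        int pos = Arrays.binarySearch(barriers, seqId);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** The range [start, end) preceding the one that contains seqId, or null if there is none. */
    static long[] previousRange(long[] barriers, long seqId) {
        int idx = rangeIndex(barriers, seqId);
        if (idx <= 0) {
            return null;
        }
        return new long[] {barriers[idx - 1], barriers[idx]};
    }

    /** Assumed rule: pushable once everything below the range's start barrier has been pushed. */
    static boolean canPush(long[] barriers, long seqId, long lastPushedSeqId) {
        long[] prev = previousRange(barriers, seqId);
        return prev == null || lastPushedSeqId >= prev[1] - 1;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(previousRange(BARRIERS, 34))); // [26, 33]
        System.out.println(canPush(BARRIERS, 34, 28)); // false: 30, 31, 32 still pending
        System.out.println(canPush(BARRIERS, 34, 32)); // true: previous range finished
    }
}
```

Under these assumptions, seqId=34 falls in the range that starts at barrier 33, so it has to wait until the reclaimed queue has pushed everything up to seqId=32.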
However, if the main WAL reader thread runs before the RS_CLAIM_REPLICATION_QUEUE WAL reader thread, and an IOException is caught while running canPush for the entry with seqId=34, then in SerialReplicationSourceWALReader.readWALEntries we get:
The next iteration in SerialReplicationSourceWALReader.readWALEntries processes the entry with seqId=35:
The shipper thread logs:
ReplicationSourceShipper.shipEdits ships the edit with seqId=34:
updateLogPosition() will call ReplicationSourceManager.logPositionAndCleanOldLogs(), which calls ZKReplicationQueueStorage.setWALPosition().
ZKReplicationQueueStorage.setWALPosition() updates ZooKeeper with:
Afterwards, the main WAL reader thread can push entries whose seqId is higher than the seqIds in the reclaimed queue:
This means the table in the sink cluster can contain out-of-order entries (key6 has seqId 36 in the source cluster):
After the reclaimed queue finishes (key5 has seqId 28 in the source cluster):
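The ordering violation described above can be stated as a small check. The sketch below is purely illustrative: `SinkOrderSketch` and `firstOutOfOrder` are hypothetical names, and the arrival lists are assumed sequences built from the seqIds mentioned in this example (34, 36, 28), treated as belonging to one region.

```java
import java.util.List;

// Hypothetical illustration only; not part of HBase.
class SinkOrderSketch {
    /** Index of the first seqId that is smaller than one applied before it, or -1 if none. */
    static int firstOutOfOrder(List<Long> appliedSeqIds) {
        long maxSeen = Long.MIN_VALUE;
        for (int i = 0; i < appliedSeqIds.size(); i++) {
            long seqId = appliedSeqIds.get(i);
            if (seqId < maxSeen) {
                return i;
            }
            maxSeen = seqId;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed arrival order when the main reader ships before the reclaimed queue.
        List<Long> buggy = List.of(34L, 36L, 28L);
        // Arrival order that serial replication is meant to guarantee.
        List<Long> serial = List.of(28L, 34L, 36L);
        System.out.println(firstOutOfOrder(buggy));  // 2
        System.out.println(firstOutOfOrder(serial)); // -1
    }
}
```

A check of this shape could serve as an assertion in a test that replays shipped edits, although the unit and integration tests for this change are still listed as TODO.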
Unit test, integration test
TODO