
Conversation

@andriytk (Contributor) commented Jul 16, 2025

Currently, when a compressed block is read with -R, decompressed with the :d flag, and lsize is specified (which is normally larger than psize for compressed blocks), the checksum is calculated on the decompressed data. This makes no sense: ZFS always calculates checksums on the physical, i.e. compressed, data. As a result, reading the same block produces different checksum results depending on whether or not it is decompressed.

For example:

$ sudo ./zdb -R mypool 0:beb7f6e7e000:1000:c | grep fletcher4
   fletcher4	cksum=00000088174743b9:00019b84088c35af:02827051a621db13:b0ca19b0de96b688
$ sudo ./zdb -R mypool 0:beb7f6e7e000:4000/1000:dc | grep fletcher4
   fletcher4	cksum=0000021a4433878e:001188edf8682c10:6280e56d3358da45:d2900132b1add9dd

Fix: use psize instead of lsize when calculating the checksum, so that it is always computed over the physical block size regardless of whether the block was compressed.

Here's how the same example works with the fix:

$ sudo ./zdb -R mypool 0:beb7f6e7e000:1000:c | grep fletcher4
   fletcher4	cksum=00000088174743b9:00019b84088c35af:02827051a621db13:b0ca19b0de96b688
$ sudo ./zdb -R mypool 0:beb7f6e7e000:4000/1000:dc | grep fletcher4
   fletcher4	cksum=00000088174743b9:00019b84088c35af:02827051a621db13:b0ca19b0de96b688
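
For illustration, here is a minimal sketch of the idea behind the fix (this is not the actual cmd/zdb/zdb.c change; the function and variable names below are hypothetical): whichever checksum function is being displayed must be fed the physical buffer and psize, even when zdb was asked to decompress the block.

/*
 * Illustrative sketch only -- not the actual zdb code.
 * dump_block_checksum(), pbuf and dbuf are hypothetical names.
 */
#include <stdint.h>

typedef void (*cksum_func_t)(const void *buf, uint64_t size,
    uint64_t cksum[4]);

static void
dump_block_checksum(const void *pbuf, uint64_t psize,  /* physical data */
    const void *dbuf, uint64_t lsize,                  /* decompressed data */
    cksum_func_t cksum_func, uint64_t cksum[4])
{
	(void) dbuf;
	(void) lsize;
	/*
	 * Before the fix, dbuf/lsize were used here when ':d' was given,
	 * so the same block produced a different checksum.  ZFS checksums
	 * the on-disk (compressed) bytes, so always use pbuf/psize.
	 */
	cksum_func(pbuf, psize, cksum);
}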

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

@behlendorf added the Status: Code Review Needed (Ready for review and testing) label on Jul 18, 2025
@behlendorf (Contributor) left a comment

Yes, we should definitely be using the psize here. Thanks for noticing and opening up a fix.

@behlendorf added the Status: Accepted (Ready to integrate: reviewed, tested) label and removed the Status: Code Review Needed label on Jul 24, 2025
@amotin merged commit 4bd7a2e into openzfs:master on Jul 25, 2025
25 checks passed
amotin pushed a commit to amotin/zfs that referenced this pull request Jul 28, 2025
Currently, when reading compressed blocks with -R and decompressing
them with the :d option while specifying lsize (normally larger than
psize for compressed blocks), the checksum is calculated on the
decompressed data. This makes no sense, since ZFS always calculates
checksums on the physical, i.e. compressed, data. As a result, reading
the same block produces different checksum results depending on
whether or not it is decompressed.

Fix: use psize instead of lsize when calculating the checksum, so that
it is always computed over the physical block size regardless of
whether the block was compressed.

Signed-off-by: Andriy Tkachuk <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Closes openzfs#17547
amotin pushed a commit that referenced this pull request Aug 5, 2025
ixhamza pushed a commit to truenas/zfs that referenced this pull request Aug 28, 2025
spauka pushed a commit to spauka/zfs that referenced this pull request Aug 30, 2025
bugclerk pushed a commit to truenas/zfs that referenced this pull request Sep 8, 2025
(cherry picked from commit 1383cda)