@tonyhutter
Contributor

Motivation and Context

Print error byte ranges with zpool status -vv

Description

Print the byte error ranges with 'zpool status -vv'. This works with all the normal zpool status formatting flags: -p, -j, --json-int

In addition:

  • Move range_tree/btree to common userspace/kernel code.
  • Modify ZFS_IOC_OBJ_TO_STATS ioctl to optionally return "extended" object stats.
  • Let zinject corrupt zvol data.
  • Add test case.

This commit takes code from these PRs: #17502 #9781 #8902
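
For readers wondering why a range structure is involved at all, here is a minimal sketch of the underlying idea: damaged blocks that sit next to each other are reported as one byte range. This is not the PR's range_tree code; `byte_range_t` and `coalesce_blocks()` are hypothetical names, and the input is assumed to be sorted, distinct, block-aligned offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: an inclusive byte range (first and last byte). */
typedef struct byte_range {
	uint64_t start;
	uint64_t end;
} byte_range_t;

/*
 * Merge sorted, distinct, block-aligned offsets into inclusive byte
 * ranges.  Blocks that touch collapse into a single range.  Returns the
 * number of ranges written to out, which must have room for n entries.
 */
static size_t
coalesce_blocks(const uint64_t *offsets, size_t n, uint64_t block_size,
    byte_range_t *out)
{
	size_t count = 0;

	assert(block_size != 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = offsets[i];
		uint64_t end = start + block_size - 1;

		if (count > 0 && out[count - 1].end + 1 == start) {
			/* This block touches the previous range: extend it. */
			out[count - 1].end = end;
		} else {
			/* Gap before this block: start a new range. */
			out[count].start = start;
			out[count].end = end;
			count++;
		}
	}

	return (count);
}
```

A real range tree also handles unsorted insertion and overlapping segments, which this linear pass does not attempt.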

How Has This Been Tested?

Test case added

@tonyhutter
Contributor Author

Sample output:

$ zpool status -vv

  pool: testpool                                                     
 state: ONLINE                                                       
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.                       
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.                                         
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A       
  scan: scrub repaired 0B in 00:00:00 with 3 errors on Tue Oct 21 17:22:20 2025
config:                                                              
                                                                     
    NAME        STATE     READ WRITE CKSUM                           
    testpool    ONLINE       0     0     0                           
      loop0     ONLINE       3     0    18                           
                                                                     
errors: Permanent errors have been detected in the following files:  
                                                                     
        <metadata>:<0x1> (no ranges)                                 
        /testpool/4k/4k_file1 0-4.00K                                
        /testpool/4k/4k_file2 0-4.00K,16K-20.0K,24K-28.0K            
        /testpool/1m/1m_file 1M-2.00M                                
        testpool/testvol:<0x1> 3.91M-3.91M,4.30M-4.30M

$ zpool status -vvjp --json-int | jq
...
      "errors": {
        "<metadata>:<0x1>": {
          "name": "<metadata>:<0x1>",
          "object": 1,
          "dataset": 0
        },
        "/testpool/4k/4k_file1": {
          "object_type": "ZFS plain file",
          "ranges": [
            {
              "start_byte": 0,
              "end_byte": 4095
            }
          ],
          "name": "/testpool/4k/4k_file1",
          "object": 2,
          "dataset": 262,
          "block_size": 4096
        },
        "/testpool/4k/4k_file2": {
          "object_type": "ZFS plain file",
          "ranges": [
            {
              "start_byte": 0,
              "end_byte": 4095
            },
            {
              "start_byte": 16384,
              "end_byte": 20479
            },
            {
              "start_byte": 24576,
              "end_byte": 28671
            }
          ],
          "name": "/testpool/4k/4k_file2",
          "object": 128,
          "dataset": 262,
          "block_size": 4096
        },
        "/testpool/1m/1m_file": {
          "object_type": "ZFS plain file",
          "ranges": [
            {
              "start_byte": 1048576,
              "end_byte": 2097151
            }
          ],
          "name": "/testpool/1m/1m_file",
          "object": 2,
          "dataset": 270,
          "block_size": 1048576
        },
        "testpool/testvol:<0x1>": {
          "object_type": "zvol",
          "ranges": [
            {
              "start_byte": 4096000,
              "end_byte": 4100095
            },
            {
              "start_byte": 4505600,
              "end_byte": 4509695
            }
          ],
          "name": "testpool/testvol:<0x1>",
          "object": 1,
          "dataset": 278,
          "block_size": 4096
        }
      }
    }
  }
}
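
Judging from the sample above, `start_byte` and `end_byte` form an inclusive range (0 to 4095 for one 4096-byte block). A consumer of the JSON could turn a range back into a byte count and a block count roughly as follows. This is only a sketch under that assumption; `range_len()` and `range_blocks()` are hypothetical helper names, not part of the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes covered by an inclusive [start_byte, end_byte] range. */
static uint64_t
range_len(uint64_t start_byte, uint64_t end_byte)
{
	assert(end_byte >= start_byte);
	return (end_byte - start_byte + 1);
}

/*
 * Number of block_size-sized blocks touched by the range.  Computed from
 * the first and last block index, so the range need not be aligned.
 */
static uint64_t
range_blocks(uint64_t start_byte, uint64_t end_byte, uint64_t block_size)
{
	assert(block_size != 0);
	assert(end_byte >= start_byte);
	return (end_byte / block_size - start_byte / block_size + 1);
}
```

With the values from the sample, the `1m_file` range (1048576 to 2097151, block size 1048576) is one full block, and each zvol range is one 4096-byte block.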

@tonyhutter
Contributor Author

Just the filenames (JSON):

$ zpool status -vj
...
      "errors": {
        "<metadata>:<0x1>": {
          "name": "<metadata>:<0x1>"
        },
        "/testpool/4k/4k_file1": {
          "name": "/testpool/4k/4k_file1"
        },
        "/testpool/4k/4k_file2": {
          "name": "/testpool/4k/4k_file2"
        },
        "/testpool/1m/1m_file": {
          "name": "/testpool/1m/1m_file"
        },
        "testpool/testvol:<0x1>": {
          "name": "testpool/testvol:<0x1>"
        }
      }
    }
  }

@behlendorf added the "Status: Code Review Needed" (Ready for review and testing) label on Oct 22, 2025
Print the byte error ranges with 'zpool status -vv'.  This works
with the normal zpool status formatting flags (-p, -j, --json-int).

In addition:

- Move range_tree/btree to common userspace/kernel code.
- Modify ZFS_IOC_OBJ_TO_STATS ioctl to optionally return "extended"
  object stats.
- Let zinject corrupt zvol data.
- Add test case.

This commit takes code from these PRs: openzfs#17502 openzfs#9781 openzfs#8902

Signed-off-by: Tony Hutter <[email protected]>
Co-authored-by: Alan Somers <[email protected]>
Co-authored-by: TulsiJain <[email protected]>

/* Resolve symlinks to /dev/zd* device */
if (realpath(inpath, buf) != NULL)
	if (strncmp(buf, "/dev/zd", 7) == 0)
Contributor

Suggested change
- if (strncmp(buf, "/dev/zd", 7) == 0)
+ if (strncmp(buf, "/dev/" ZVOL_DEV_NAME, 7) == 0)
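
The suggested line still carries the literal length 7, which is only correct while `ZVOL_DEV_NAME` is two characters long. One possible way to avoid that, sketched here as a standalone helper, is to derive the length from the concatenated string literal. The `"zd"` fallback definition, `ZVOL_DEV_PREFIX`, and `resolved_path_is_zvol()` are assumptions made so the example is self-contained; they are not taken from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Assumption: stands in for the real ZVOL_DEV_NAME definition so that
 * this sketch compiles on its own.
 */
#ifndef ZVOL_DEV_NAME
#define	ZVOL_DEV_NAME	"zd"
#endif

/* Hypothetical macro: full device-node prefix, e.g. "/dev/zd". */
#define	ZVOL_DEV_PREFIX	"/dev/" ZVOL_DEV_NAME

/*
 * Hypothetical helper: true if an already-resolved path (for example the
 * output of realpath()) starts with the zvol device prefix.
 */
static bool
resolved_path_is_zvol(const char *resolved)
{
	assert(resolved != NULL);

	/* sizeof counts the trailing NUL, so subtract one for the length. */
	return (strncmp(resolved, ZVOL_DEV_PREFIX,
	    sizeof (ZVOL_DEV_PREFIX) - 1) == 0);
}
```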

char *slash;
int rc;
if ((fd = open(inpath, O_RDONLY|O_CLOEXEC)) == -1 ||
    fstat64(fd, statbuf) != 0) {
Contributor

If fstat64() fails the file won't be closed. Let's simply split this into two conditional blocks and add the close.

if (path_is_zvol(inpath)) {
	int fd;
	char *slash;
	int rc;
Contributor

nit: missing newline after the declarations.

 *
 * So we hardcode that in the statbuf inode field as workaround.
 */
statbuf->st_ino = 1;
Contributor

Suggested change
- statbuf->st_ino = 1;
+ statbuf->st_ino = ZVOL_OBJ;

if (rc != 0)
	return (-1);

(void) strcpy(dataset, fullpath);
Contributor

Suggested change
- (void) strcpy(dataset, fullpath);
+ (void) strlcpy(dataset, fullpath, MAXNAMELEN);
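
Note that `strlcpy()` truncates silently unless its return value is checked. If a truncated dataset name should be treated as an error, something along these lines could work. This is a sketch only: `copy_name()` is a hypothetical helper, and it uses `snprintf()` instead of `strlcpy()` so that it builds on C libraries that do not provide `strlcpy()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src into dst (dstsize bytes), always leaving
 * dst NUL-terminated.  Returns 0 on success, or -1 if src did not fit.
 */
static int
copy_name(char *dst, size_t dstsize, const char *src)
{
	int n;

	assert(dst != NULL && src != NULL && dstsize > 0);

	/* snprintf returns the length it would have written, sans NUL. */
	n = snprintf(dst, dstsize, "%s", src);
	if (n < 0 || (size_t)n >= dstsize)
		return (-1);	/* truncated: dst holds only a prefix */

	return (0);
}
```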

}

/*
 * Given an properties nvlist like:
Contributor

Suggested change
-  * Given an properties nvlist like:
+  * Given a properties nvlist like:

nodist_libzfs_la_SOURCES = \
	module/zcommon/btree.c \
	module/zcommon/cityhash.c \
	module/zcommon/range_tree.c \
Contributor

If the btree and range_tree code is needed by libzfs we should build it as noinst convenience libraries (like libavl). To do that we'll need to make sure they're fully generic and don't have any accidental dependencies on other zfs structures or functions. This doesn't look like it should be too bad.

/*
 * highbit64 is defined in sysmacros.h for the kernel side. However, we need
 * it on the libzfs side and zpool_main.c side, and there's no good place to
 * put it but here.
Contributor

lib/libspl/include/os/linux/sys/sysmacros.h would be a better spot for this. This might already be defined on the FreeBSD side, it wasn't immediately clear to me.
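
For reference, a userspace definition could look roughly like the sketch below, wherever it ends up living. It assumes the usual `highbit64()` contract (1-based index of the highest set bit, 0 when no bit is set) and relies on the GCC/Clang builtin `__builtin_clzll`; whether the header already has an equivalent on either platform would need checking.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* __builtin_clzll operates on unsigned long long; make sure that is 64 bits. */
static_assert(sizeof (unsigned long long) * CHAR_BIT == 64,
    "highbit64 sketch assumes a 64-bit unsigned long long");

/*
 * Sketch of highbit64(): returns the 1-based index of the highest set
 * bit in i, or 0 if i is zero.
 */
static inline int
highbit64(uint64_t i)
{
	/* __builtin_clzll(0) is undefined, so handle zero explicitly. */
	if (i == 0)
		return (0);

	return (64 - __builtin_clzll(i));
}
```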

VERIFY0(nvlist_add_uint32(nv, ZFS_OBJ_STAT_METADATA_BLOCK_SIZE,
    doi.doi_metadata_block_size));
VERIFY0(nvlist_add_uint64(nv, ZFS_OBJ_STAT_TYPE,
    doi.doi_type));
Contributor

We should also provide the object type converted from a uint64 to a string here. That way user space doesn't have to know how to do the translation and dmu_ot can stay where it is. Alternately, we'd need to move it to sys/dmu.h. We should give the same treatment to doi_bonus_type, so we'd have the following keys.

ZFS_OBJ_STAT_TYPE
ZFS_OBJ_STAT_TYPE_STR
ZFS_OBJ_STAT_BONUS_TYPE
ZFS_OBJ_STAT_BONUS_TYPE_STR

You'll also want to make sure to include the names from the extended types. Here's an example from the zdb code.

                        VERIFY0(dmu_object_info(mos, object, &doi));
                        if (doi.doi_type & DMU_OT_NEWTYPE) {
                                dmu_object_byteswap_t bswap =
                                    DMU_OT_BYTESWAP(doi.doi_type);
                                name = dmu_ot_byteswap[bswap].ob_name;
                        } else {
                                name = dmu_ot[doi.doi_type].ot_name;
                        }
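
Since the type value would arrive via an ioctl, it may also be worth bounds-checking before indexing either table. The sketch below shows the same two-way lookup with such checks. Everything prefixed `demo_` or `DEMO_` is a stand-in invented for this example (including the flag values and the shortened name tables); the real code would use `dmu_ot`, `dmu_ot_byteswap`, and the macros from sys/dmu.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag and mask; the values are assumptions for this sketch. */
#define	DEMO_OT_NEWTYPE		0x80
#define	DEMO_OT_BYTESWAP_MASK	0x1f

#define	DEMO_NELEM(a)	(sizeof (a) / sizeof ((a)[0]))

/* Stand-in, shortened name tables (not the real dmu_ot contents). */
static const char *const demo_ot_names[] = {
	"unallocated", "object directory", "object array"
};
static const char *const demo_bswap_names[] = {
	"uint8", "uint16", "uint32", "uint64", "zap"
};

static_assert(DEMO_NELEM(demo_bswap_names) <= DEMO_OT_BYTESWAP_MASK + 1,
    "byteswap name table must fit within the mask");

/* Map a numeric object type to a printable name, or "unknown". */
static const char *
demo_type_name(uint64_t type)
{
	if (type & DEMO_OT_NEWTYPE) {
		/* Extended type: the name comes from the byteswap table. */
		uint64_t bswap = type & DEMO_OT_BYTESWAP_MASK;

		if (bswap < DEMO_NELEM(demo_bswap_names))
			return (demo_bswap_names[bswap]);
	} else if (type < DEMO_NELEM(demo_ot_names)) {
		/* Legacy type: index the main table directly. */
		return (demo_ot_names[type]);
	}

	return ("unknown");
}
```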

#include <stdio.h>
#include <stdlib.h>
#define panic(...) do {printf(__VA_ARGS__); exit(EXIT_FAILURE); } while (0)
#define zfs_panic_recover(...) do {printf(__VA_ARGS__); } while (0)
Contributor

We'll want to get this code to build in user space using the existing compatibility wrappers. Here are a couple of thoughts which should work as long as sys/debug.h is included.

panic -> PANIC

zfs_panic_recover we should probably redefine to a PANIC.

You should be able to replace the zfs_dbgmsg and VERIFY3U with a single VERIFY3UF that will do the right thing in both user and kernel space.
