brenns10

This imports the pstack contrib script from drgn, and updates it. I'll end up pushing many of these updates back to drgn as well, but the main things are:

  1. It is now tested on aarch64 and works with minimal changes.
  2. Tasks that are running in user-mode on CPU can now be unwound and printed correctly.
  3. Task selection is now flexible: multiple PIDs, command patterns, or task states can be specified together.
  4. For the "dump" mode, all task data is dumped to a single file, rather than several files per PID.

The pstack script dumps userspace stack traces for processes.
It has a "dump" mode which saves only the minimal necessary information,
suitable for use within a kdump kernel, and a "print" mode
to print the dumped contents. Import this script and make a small
CorelensModule wrapper around it. We'll refine some of the functionality
so that it can be supported in LKCE.

The drgn file contrib/pstack.py was wholly contributed by me under
drgn's license. As the author, I can also add it here under the
UPL.

Signed-off-by: Stephen Brennan <[email protected]>
In most vmcores, userspace tasks are not actually running: the CPU is
executing a kernel-mode interrupt handler that brings it to a halt.
Thus, the userspace registers are stored to the kernel stack. But for
hypervisor vmcores, the CPU may be executing user code directly. In that
case, drgn's stack trace registers are in fact the userspace registers.
Handle this (rare) case.

Signed-off-by: Stephen Brennan <[email protected]>
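The commit above does not say how the script tells the two cases apart. As a minimal sketch of one common check, the function below mirrors the logic of the kernel's `user_mode()` helper: on x86_64 the low two bits of CS hold the privilege level (3 is user mode), and on aarch64 the low four bits of PSTATE hold the exception level (0 is EL0). The function name and the plain-dict register layout are hypothetical, chosen only for illustration; they are not drgn's API.

```python
def is_user_mode(arch, regs):
    """Return True if the saved registers describe user-mode execution.

    `regs` is a hypothetical plain dict of register values, not a drgn object.
    """
    if arch == "x86_64":
        # Low two bits of CS are the privilege level; ring 3 is userspace.
        return (regs["cs"] & 0x3) == 0x3
    if arch == "aarch64":
        # Low four bits of PSTATE are the mode field; 0 means EL0 (userspace).
        return (regs["pstate"] & 0xF) == 0x0
    raise ValueError(f"unsupported architecture: {arch}")
```

If a check like this reports user mode for the registers at the top of the stack trace, those registers can be used directly instead of looking for a saved copy on the kernel stack.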
The original pstack implementation dumps data to multiple files: for
each process, it creates one dump file for the JSON metadata, and another
for the stack data. For large numbers of tasks, this results in a lot of
file creation, which is unwieldy, and it's probably not a great idea
to create thousands of tiny files in the kdump kernel.

Move to a new format where everything is dumped to a single file. This
requires a bit more code and a clear format definition, but the result
is more efficient and easier to use.

Signed-off-by: Stephen Brennan <[email protected]>
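To illustrate the single-file idea, here is a minimal sketch of a length-prefixed record stream that holds JSON metadata and raw stack bytes for many tasks in one file. The layout, the `HEADER` struct, and the function names are assumptions made for this example; the actual format defined by the commit is not shown here and may differ.

```python
import io
import json
import struct

# Hypothetical record layout (NOT the real pstack dump format):
#   4-byte little-endian metadata length, 8-byte little-endian stack length,
#   then the JSON metadata bytes, then the raw stack bytes.
HEADER = struct.Struct("<IQ")


def write_record(out, metadata, stack):
    """Append one task record to an open binary file object."""
    meta = json.dumps(metadata).encode("utf-8")
    out.write(HEADER.pack(len(meta), len(stack)))
    out.write(meta)
    out.write(stack)


def read_records(inp):
    """Yield (metadata, stack) pairs until the end of the file."""
    while True:
        header = inp.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) != HEADER.size:
            raise ValueError("truncated record header")
        meta_len, stack_len = HEADER.unpack(header)
        meta = inp.read(meta_len)
        stack = inp.read(stack_len)
        if len(meta) != meta_len or len(stack) != stack_len:
            raise ValueError("truncated record body")
        yield json.loads(meta.decode("utf-8")), stack
```

Because every record carries its own lengths, a reader can stream through the file without an index, and a dump cut short (for example by a failing kdump kernel) is detected as a truncated record rather than silently misparsed.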
Previously, task selection could only be done with one CLI option. So a
single PID could be specified, or a single command pattern, or a single
task state. This is obviously not flexible enough. We would like to be
able to select multiple patterns where possible. Implement this while
ensuring we don't dump or print any tasks twice.

Signed-off-by: Stephen Brennan <[email protected]>
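A minimal sketch of combining several selectors without printing a task twice is shown below. The `Task` class, the `select_tasks` function, and the use of shell-style glob matching for command patterns are all assumptions for illustration; the real script operates on drgn task objects and its pattern syntax is not described here.

```python
import fnmatch
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical stand-in for a kernel task; not a drgn object.
    pid: int
    comm: str
    state: str


def select_tasks(tasks, pids=(), patterns=(), states=()):
    """Return tasks matching ANY selector, each at most once, in input order."""
    pid_set = set(pids)
    state_set = set(states)
    seen = set()
    selected = []
    for task in tasks:
        matches = (
            task.pid in pid_set
            or task.state in state_set
            or any(fnmatch.fnmatchcase(task.comm, p) for p in patterns)
        )
        # A task that satisfies several selectors is still added only once.
        if matches and task.pid not in seen:
            seen.add(task.pid)
            selected.append(task)
    return selected
```

Evaluating all selectors per task in a single pass, rather than running one pass per selector and concatenating the results, is what keeps a task that matches both a PID and a command pattern from appearing twice.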
In particular, take special care to highlight tasks that are actually
running on CPU now, rather than simply in the runnable state. This is
important because we allow selecting processes by their state or online
status, so we want it to be obvious why a task is printed.

Signed-off-by: Stephen Brennan <[email protected]>
@oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on Aug 14, 2025
Signed-off-by: Stephen Brennan <[email protected]>
For a while we've been beating around the bush by having a way to filter
out which vmcores we run tests against. But that doesn't help with
filtering out kernel versions for live tests.

Most new functionality doesn't necessarily need to work on UEK4, so we
want it to be easy to skip tests below the minimum version.

Signed-off-by: Stephen Brennan <[email protected]>
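As a minimal sketch of a minimum-version check, the helper below parses the leading numeric part of a kernel release string and compares it with a required minimum. The function names are hypothetical, and the release strings in the usage note are illustrative values modeled on UEK4 (4.1.12-based) and UEK5 (4.14.35-based) kernels; how the repository's test suite actually wires this in is not shown here.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a release string like '4.14.35-...'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level is treated as 0.
    return tuple(int(g or 0) for g in m.groups())


def below_minimum(release, minimum):
    """Return True if `release` is older than the `minimum` version tuple."""
    return parse_kernel_release(release) < tuple(minimum)
```

For example, `below_minimum("4.1.12-124.48.6.el7uek.x86_64", (4, 14))` is true while `below_minimum("4.14.35-2047.el7uek.x86_64", (4, 14))` is false. In a pytest suite, a test or fixture could call `pytest.skip(...)` when the check is true for the running or loaded kernel.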
Signed-off-by: Stephen Brennan <[email protected]>
@brenns10 brenns10 requested a review from biger410 August 15, 2025 21:43
@brenns10

This is ready for review. Since most of the functionality can be exercised on a live system via /proc/kcore, the tests here actually do exercise it pretty well. The tests verify the register fetching and the dump format, and ensure that the "pstack" operation produces the same output as the "dump + print" operation.


@biger410 biger410 left a comment

Looks good.

@brenns10 brenns10 merged commit 2f4f1ce into oracle-samples:main Aug 25, 2025
6 of 7 checks passed
@brenns10 brenns10 deleted the pstack-import branch August 25, 2025 17:36