Skip to content

Conversation

c6c7
Copy link
Contributor

@c6c7 c6c7 commented Jul 11, 2024

This PR goes one step further than #127594 by adding an unstable feature to make determining whether a Fuchsia process aborted more robust. There is certainly more work that needs to go in to landing the unstable feature, but I'm drafting this PR as a means to convey intent for the feature to survey others' thoughts.

r​? @tmandry

@rustbot
Copy link
Collaborator

rustbot commented Jul 11, 2024

r? @cuviper

rustbot has assigned @cuviper.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-fuchsia Operating system: Fuchsia S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 11, 2024
@c6c7
Copy link
Contributor Author

c6c7 commented Jul 11, 2024

r? @tmandry

@rustbot rustbot assigned tmandry and unassigned cuviper Jul 11, 2024
@c6c7
Copy link
Contributor Author

c6c7 commented Jul 11, 2024

I should state explicitly that a goal of this PR is to enable Fuchsia testing for tests/ui/process/signal-exit-status.rs without needing to copy-paste the magic Fuchsia status code that also appears in library/test/src/test_result.rs (and any futures authors needing to determine whether a Fuchsia process was aborted).

@rust-log-analyzer

This comment has been minimized.

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from 130612b to ba47433 Compare July 11, 2024 05:32
@rust-log-analyzer

This comment has been minimized.

@c6c7
Copy link
Contributor Author

c6c7 commented Jul 11, 2024

#58590 references #102032 which ignored some tests because of the inability of those tests easily recognize Fuchsia's process status code in various error cases. This change should consider re-enabling testing for Fuchsia in those cases.

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch 3 times, most recently from 4508223 to 9353463 Compare July 11, 2024 06:05
@rust-log-analyzer

This comment has been minimized.

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from 9353463 to a6aed9b Compare July 11, 2024 06:10
@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from a6aed9b to 11f6c04 Compare July 13, 2024 06:00
@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from 11f6c04 to 1e8eba9 Compare July 13, 2024 06:09
@c6c7
Copy link
Contributor Author

c6c7 commented Jul 15, 2024

After a lot of thought, I've changed the approach of this PR significantly. In short, Zircon does not implement signals as the definition of the C Standard Library expects for the implementation of libc::abort(). As a result, I think it's impossible to expose an API on top of a return code from Zircon to determine if a task was aborted or not. The best we can do is smooth the path for retrieving a ZX_TASK_RETCODE.

I will be pushing a new version of this PR shortly.

@c6c7
Copy link
Contributor Author

c6c7 commented Jul 15, 2024

The longer story with citations....

There are cases in this repository that attempt to detect whether a process aborted or not. The first is in fn get_result_from_exit_code(..., status: ExitStatus, ...) -> TestResult (defined in test::test_result) which returns whether a test failed based on an ExitStatus. If a platform implements libc::abort abort as defined in ISO C, a panic in a test will result in calling libc::abort which will call libc::raise(SIGABRT). Thus, get_result_from_exit_code will return a test failed when ExitStatus::signal returns Some(SIGABRT). The story does not end here though because Fuchsia and Windows don't support signals, and thus ExitStatus::signal always returns None on those platforms.

(For Windows, it's apparently sufficient to check for STATUS_FAIL_FAST_EXCEPTION, but I don't know the details beyond the existing comments in test::test_result)

For Fuchsia, libc::abort does not call libc::raise(SIGABRT) because Fuchsia in this case deviates from the ISO C definition of libc::abort. Instead, Fuchsia executes the LLVM intrinsic __builtin_trap()1 which generally raises an Zircon kernel exception. Unless the process installed an exception handler, the exception will go uncaught, and the kernel will kill the process with the code ZX_TASK_RETCODE_EXCEPTION_KILL. Thus, for get_result_from_exit_code, the best we can do to check if a process aborted is to observe the return code ZX_TASK_RETCODE_EXCEPTION_KILL.

The second case is tests/ui/process/signal-exit-status.rs which tests whether a process aborted as expected upon calling core::intrinsics::abort(). The goal is very similar to get_result_from_exit_code, but the result is not a TestResult. For Fuchsia, the hindrance for writing this test is not knowing what the return code from the process should be. My first thought was, "Oh, we should just implement ExitStatus::aborted_code(&self) -> Option<i32>." But this is fraught because of there is no return code in Fuchsia that is set if and only if a process aborted. The closest approximation to detecting if a process aborted is to check if the process returned ZX_TASK_RETCODE_EXCEPTION_KILL.

Taking a step back though, it's not actually the primary goal of signal-exit-status to determine that the process returned ZX_TASK_RETCODE_EXCEPTION_KILL and no other possible uncaught exception was raised. There is some potential for a test flake where the Zircon kernel during the signal-exit-status test raises an exception for a reason other than the abort() call; this seems very unlikely though. Thus, it's sufficient for signal-exit-status to merely check for ZX_TASK_RETCODE_EXCEPTION_KILL.

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from 1e8eba9 to f98bc4b Compare July 15, 2024 19:57
@rustbot rustbot added the O-unix Operating system: Unix-like label Jul 15, 2024
@c6c7
Copy link
Contributor Author

c6c7 commented Jul 15, 2024

The latest change modifies the approach to implement task_retcode() instead of aborted_code(). I believe this API more directly corresponds to the semantics around Fuchsia processes and correctly requires callers to reason about whether the ZX_TASK_RETCODE indicates the condition they expect to check.

@rust-log-analyzer

This comment has been minimized.

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from f98bc4b to 6d717c9 Compare July 15, 2024 20:04
Comment on lines +8 to +26
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub type zx_status_t = i32;
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_SYSCALL_KILL: zx_status_t = -1024;
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_OOM_KILL: zx_status_t = -1025;
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_POLICY_KILL: zx_status_t = -1026;
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_VDSO_KILL: zx_status_t = -1027;
/// On Zircon (the Fuchsia kernel), an abort from userspace calls the
/// LLVM implementation of __builtin_trap(), e.g., ud2 on x86, which
/// raises a kernel exception. If a userspace process does not
/// otherwise arrange exception handling, the kernel kills the process
/// with this return code.
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_EXCEPTION_KILL: zx_status_t = -1028;
#[unstable(feature = "fuchsia_exit_status", issue = "none")]
pub const ZX_TASK_RETCODE_CRITICAL_PROCESS_KILL: zx_status_t = -1029;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not thrilled about adding these constants here since std::sys::pal::unix::process::zircon exists. However, since std::sys::pal::unix::process::zircon is private, I didn't want to cross the bridge of exposing parts of that module in this change.

Copy link
Contributor

@tgross35 tgross35 Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could/should these be added to libc instead and then used in both places?

@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch 2 times, most recently from c7980b6 to 8a6859d Compare July 15, 2024 22:17
The method task_retcode() method improves the ergonomics of
determining if a Fuchsia process was killed by the kernel and
for what reason.
@c6c7 c6c7 force-pushed the fix-panic=abort-tests-on-fuchsia branch from 8a6859d to 900e3c6 Compare July 15, 2024 22:22
@bors
Copy link
Collaborator

bors commented Jul 29, 2024

☔ The latest upstream changes (presumably #125443) made this pull request unmergeable. Please resolve the merge conflicts.

@tmandry
Copy link
Member

tmandry commented Jan 25, 2025

LGTM but I'm not on Fuchsia anymore, so unassigning myself. Someone on the library team should review this – @c6c7, use r? library once this is no longer a draft.

@tmandry tmandry removed their assignment Jan 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-fuchsia Operating system: Fuchsia O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants