-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix race condition in checkMailbox #25073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3c8e06c
to
6dc6e87
Compare
// MINIMAL_RUNTIME doesn't support the runtimeKeepalive stuff, but under | ||
// some circumstances it supportes `runtimeExited` | ||
#if EXIT_RUNTIME | ||
if (runtimeExited) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if we are building in -sEXIT_RUNTIME=0
mode, and some code calls abort()
, and then this async timeout for checkMailbox() triggers? Wouldn't that result in this safeguard passing right through into calling _pthread_self()
again, since in EXIT_RUNTIME=0
mode this check was compiled out altogether?
_pthread_self()
is a C compiled function, so if the Worker is not hosting a pthread and hence doesn't have an active program stack, then even entering that function will not be safe (even though it might return 0;
), since it could e.g. do a stack bump into corrupted space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could be a problem for MINIMAL_RUNTIME
, since it doesn't track ABORT
like the normal runtime does.
Do we have any other signal to know of the runtime is alive other than ABORT
, runtimeExited
and/or pthread_self() == null
? i.e. could we do any better?
I've love to consolidate those 3 myself into single "is_runtime_ok_or_valid_right_now" things, so adding yet another piece of state seems like the wrong direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could be a problem for
MINIMAL_RUNTIME
, since it doesn't trackABORT
like the normal runtime does.
I mean just in regular runtime, if one builds with -sEXIT_RUNTIME=0
, then the above if (runtimeExited)
build won't be compiled in, and if a pthread calls abort, then that would set ABORT = true;
in the Worker and shut down its runtime? But this code wouldn't catch it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the other implementation of callUserCallback
for the regular runtime. It does include that check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, right - missed that this was the MINIMAL_RUNTIME path. That makes sense. LGTM.
Thanks for looking into this. I'm happy with this as long as #25066 passes. Although if the checks are narrower, it might need to be expanded to cover both |
6dc6e87
to
df8b5c2
Compare
The `checkMailbox` callback can occur after the thread has terminated. In this case calling into native code can trigger the `makeAbortWrapper` wrapper that is put around each native function which then results in a "program has already aborted!" error being thrown. Once solution to this is to make sure that the function which are called do not have `makeAbortWrapper` applied to them. This was the technique I used in emscripten-core#18754, but the list of functions became stale when emscripten_proxy_execute_task_queue was removed in emscripten-core#18852. A better solution is to wrap to whole function in callUserCallback, which takes case of checking if the runtime is alive before calling into native code. Fixes: emscripten-core#20067
df8b5c2
to
24f9659
Compare
The
checkMailbox
callback can occur after the thread has terminated. In this case calling into native code can trigger themakeAbortWrapper
wrapper that is put around each native function which then results in a "program has already aborted!" error being thrown.Once solution to this is to make sure that the function which are called do not have
makeAbortWrapper
applied to them.This was the technique I used in #18754, but the list of functions became stale when emscripten_proxy_execute_task_queue was removed in #18852.
A better solution is to wrap to whole function in callUserCallback, which takes care of checking if the runtime is alive before calling into native code.
Fixes: #20067