Fix race between parent and monitor in use_pty
#1130
By the way, I have left b156eb9 in for illustration purposes only. That fix alone takes care of the problem "in practice". We can drop that one before it is merged.
I checked the changes and they totally make sense to me. However, I was unable to reproduce this bug on my system, so I don't have a proper way to see it working.
6a33c5e to 05ef39e
As for the final commit: the I/O subsystem doesn't know that the end of a pipe is at hand, since the child has stopped responding (but is being kept in suspended animation to avoid a "broken pipe"). So the solution that I used is to put the right end of the pipe in nonblocking mode, and then read until the OS indicates that a read on it will block (which signals the true end of the stream). As far as I can tell, all the rest of our code will work fine with the Pty leader in nonblocking mode, but to avoid any changes in that part, I've only set it to 'nonblocking' before the final loop. Since we're not going to be doing anything with the pipe after that, that is a conservative choice.
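To illustrate the "read until the OS says it would block" idea, here is a minimal sketch. It is not the code from this PR: a `UnixStream` pair from the standard library stands in for the Pty leader, and the function name `drain_nonblocking` is hypothetical.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::os::unix::net::UnixStream;

/// Switch `stream` to nonblocking mode and read whatever is still buffered.
/// Stops at EOF or as soon as a read would block.
/// (Hypothetical helper; a socket pair stands in for the Pty leader.)
fn drain_nonblocking(stream: &mut UnixStream) -> io::Result<Vec<u8>> {
    // Only switch modes right before the final loop, as described above.
    stream.set_nonblocking(true)?;
    let mut collected = Vec::new();
    let mut buf = [0u8; 1024];
    loop {
        match stream.read(&mut buf) {
            // The peer closed its end: nothing more will arrive.
            Ok(0) => break,
            Ok(n) => collected.extend_from_slice(&buf[..n]),
            // Nothing left in the buffer: treat this as the end of the stream.
            Err(e) if e.kind() == ErrorKind::WouldBlock => break,
            // Interrupted by a signal: simply retry.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(collected)
}

fn main() -> io::Result<()> {
    let (mut reader, mut writer) = UnixStream::pair()?;
    // The writer stays open (like the suspended child), so the reader
    // never sees EOF and must rely on WouldBlock to know it is done.
    writer.write_all(b"last output of the child")?;
    let leftover = drain_nonblocking(&mut reader)?;
    println!("{}", String::from_utf8_lossy(&leftover));
    Ok(())
}
```

Because the writing end is never closed here, a blocking read loop would hang after the last byte; the nonblocking loop terminates instead.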
I have investigated whether the "red light" synchronization mechanism is still necessary, or whether merely reading from the Pty until exhaustion is enough. On FreeBSD, I found that the last two commits alone don't fix the problem, so we can't drop the synchronization commits.
The problem with #1129 was two-fold:
This could be demonstrated by adding a sleep in the monitor, but a better fix is simply to use the existing backchannel to synchronise parent and monitor. I re-used the old ExecCommand message for this, as that was already used to handle a similar race at the start.

So basically the backchannel protocol is: the parent sends an Edge to signal the monitor it can go (this was already the case), does the usual thing while the monitor and child are doing their dance, and then, once the monitor has communicated that it has ended, the monitor waits for another Edge from the parent.

The fix: in the flush_left function, also read the right side to completion.
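The backchannel handshake described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: threads stand in for the parent and monitor processes, a `UnixStream` pair stands in for the backchannel, and the one-byte encodings `EDGE` and `DONE` as well as all function names are hypothetical.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

// Hypothetical one-byte wire encodings for the two messages.
const EDGE: u8 = b'E'; // "you may proceed"
const DONE: u8 = b'D'; // "the monitor's work has ended"

fn send(chan: &mut UnixStream, msg: u8) -> io::Result<()> {
    chan.write_all(&[msg])
}

/// Block until one message arrives and check that it is the expected one.
fn expect_msg(chan: &mut UnixStream, want: u8) -> io::Result<()> {
    let mut byte = [0u8; 1];
    chan.read_exact(&mut byte)?;
    if byte[0] == want {
        Ok(())
    } else {
        Err(io::Error::new(ErrorKind::InvalidData, "unexpected message"))
    }
}

fn monitor(mut chan: UnixStream) -> io::Result<Vec<&'static str>> {
    let mut log = Vec::new();
    expect_msg(&mut chan, EDGE)?; // first Edge: allowed to start
    log.push("monitor: started");
    // ... the monitor and child would do their work here ...
    send(&mut chan, DONE)?;
    log.push("monitor: reported end");
    expect_msg(&mut chan, EDGE)?; // second Edge: parent has caught up
    log.push("monitor: released");
    Ok(log)
}

fn parent(mut chan: UnixStream) -> io::Result<()> {
    send(&mut chan, EDGE)?; // let the monitor go
    expect_msg(&mut chan, DONE)?; // wait until the monitor says it has ended
    // ... the parent would flush remaining Pty output here ...
    send(&mut chan, EDGE) // only now let the monitor exit
}

fn run_handshake() -> io::Result<Vec<&'static str>> {
    let (parent_end, monitor_end) = UnixStream::pair()?;
    let handle = thread::spawn(move || monitor(monitor_end));
    parent(parent_end)?;
    handle.join().expect("monitor thread panicked")
}

fn main() -> io::Result<()> {
    for line in run_handshake()? {
        println!("{line}");
    }
    Ok(())
}
```

The point of the second `EDGE` is that the monitor cannot tear anything down until the parent has explicitly released it, which removes the dependence on timing between the two.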