          Better errors from runc init
          #4928
        
@kolyshkin The extra path (the one not present in the other mentioned PRs) LGTM. But would that print the libcrypto issue? I mean, is the Go panic forwarded? The panic you posted in this issue, for example: #4916 (comment)

It seems packages.microsoft.com is down now, so I can't easily test it myself (yeah, I'm sending some messages, but they are probably aware already :)). If you still have that install handy, it would be great if you could test it :)
In case an early stage of runc init (nsenter) fails for some reason, it
logs error(s) with FATAL log level, via bail().

The runc init log is read by a parent (runc create/run/exec) and is
logged via the normal logrus mechanism, which is all fine and dandy, except
when `runc init` fails, we return the error from the parent (which is
usually not too helpful, for example):

    runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

Now, the actual underlying error is from runc init and it was logged
earlier; here's what the full runc output looks like:

    FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
    FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state: Success
    ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

The problem is, upper level runtimes tend to ignore everything except
the last line from runc, and thus the error reported by e.g. docker is not
very helpful.

This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:

    ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state: Success

Yes, it is long and ugly, but at least the upper level runtime will
report it.

Signed-off-by: Kir Kolyshkin <[email protected]>
    | 
 Alas, no. This PR is about the C code of  You can emulate the libcrypto error by adding "panic" call into  | 
This currently includes #4930 (and serves as a test for it). Draft until that one is merged.

This currently includes #4951 and is therefore a draft until #4951 is merged.

Inspired by the discussion in #4905.

In case early stage of runc init (nsenter) fails for some reason, it
logs error(s) with FATAL log level, via bail().
The runc init log is read by a parent (runc create/run/exec) and is
logged via normal logrus mechanism, which is all fine and dandy, except
when `runc init` fails, we return the error from the parent (which is
usually not too helpful, for example):

    runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

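Now, the actual underlying error is from runc init and it was logged
earlier; here's what the full runc output looks like:

    FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
    FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state: Success
    ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF
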
The problem is, upper level runtimes tend to ignore everything except
the last line from runc, and thus error reported by e.g. docker is not
very helpful.
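This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:

    ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state: Success
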
Yes, it is long and ugly, but at least the upper level runtime will
report it.
Fixes: #4905