Bug: exception in noexcept what() when Python exception contains a surrogate character
              
#4287
TheShiftedBit started this conversation in General
            Replies: 2 comments
-
@TheShiftedBit Could you provide a minimal example? I think I might have a proposed solution, but I am having trouble replicating this.
            -
Here's a reproducer. Tested on Linux.

Extension code:

    #include <pybind11/pybind11.h>
    #include <pybind11/functional.h>
    #include <iostream>
    void invoke(std::function<void()> cb) {
      try {
        cb();
      } catch (pybind11::error_already_set& ex) {
        std::cout << ex.what() << std::endl;
      }
    }
    PYBIND11_MODULE(pybind11_repro, m) {
      m.def("invoke", &invoke);
    }

Python code:

    import pybind11_repro
    def foo():
      raise RuntimeError("\ud927")
    pybind11_repro.invoke(foo)

This produces the following output:

Defining that macro, we see:
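Until the library handles this case, one possible stopgap is to sanitize exception messages on the Python side before they cross into C++. The sketch below is only an illustration: `escape_surrogates` and `surrogate_safe` are hypothetical helper names, not part of pybind11, and it has not been tested against the reproducer above. It assumes the exception type can be rebuilt from its `args`, which does not hold for every exception class.

```python
import functools


def escape_surrogates(text: str) -> str:
    """Return text with lone surrogates rewritten as \\uXXXX escapes."""
    # backslashreplace never raises on surrogates; the result is plain UTF-8.
    return text.encode("utf-8", "backslashreplace").decode("utf-8")


def surrogate_safe(callback):
    """Hypothetical wrapper: re-raise exceptions with an encodable message."""

    @functools.wraps(callback)
    def wrapper(*args, **kwargs):
        try:
            return callback(*args, **kwargs)
        except Exception as exc:
            safe_args = tuple(
                escape_surrogates(a) if isinstance(a, str) else a
                for a in exc.args
            )
            if safe_args == exc.args:
                raise  # nothing to escape, keep the original exception
            # Assumption: type(exc) accepts its own args as constructor input.
            raise type(exc)(*safe_args).with_traceback(exc.__traceback__) from None

    return wrapper
```

With this in place, the callback from the reproducer would be passed as `surrogate_safe(foo)`. Whether that is enough to avoid the crash depends on which parts of the exception the installed pybind11 version formats inside `what()`.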
-
By default, Python raises an error when encoding a `str` as UTF-8 if the `str` contains surrogate characters. This can be disabled by passing `surrogatepass` as the second argument to `.encode()`. Pybind11 has the same behavior in its `str` -> `std::string` conversion. However, the bug is this: if an exception message contains a surrogate character, calling `.what()` on an `error_already_set` holding that exception causes another exception to be thrown. Since `.what()` is `noexcept`, that exception cannot be caught and the program `std::terminate`s.

I'm not sure what the correct behavior regarding surrogate characters is. Perhaps pybind11 should always use `surrogatepass`, perhaps not. However, even if that's not the right choice, it should probably use it during exception handling; otherwise Python exceptions like this are extremely difficult to diagnose.
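The encoding behavior described above can be checked in plain Python, without pybind11. This is a small sketch using the same lone surrogate as the reproducer. The `backslashreplace` handler is shown only as one alternative that also never fails; the post itself suggests `surrogatepass`.

```python
message = "\ud927"  # lone surrogate, same character as in the reproducer

# Default (strict) error handling refuses to encode a lone surrogate.
try:
    message.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# surrogatepass emits the surrogate's three-byte form instead of raising.
passed = message.encode("utf-8", "surrogatepass")

# backslashreplace emits an ASCII escape, which is always valid UTF-8.
escaped = message.encode("utf-8", "backslashreplace")

print(strict_ok)  # False
print(passed)     # b'\xed\xa4\xa7'
print(escaped)    # b'\\ud927'
```

Note that the `surrogatepass` output is not well-formed UTF-8, so a strict decoder on the receiving side will reject it. The escaped form stays readable in logs.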