[API Proposal]: Overriding the default behavior in case of unhandled exceptions and fatal errors. #101560

@VSadov

Re: the previous proposal and the discussion that led to this proposal (#42275)

Background and motivation

The current default behavior in the case of an unhandled exception is termination of the process.
The current default behavior in the case of a fatal error is to print the exception to the console and invoke Watson/CrashDump.

While satisfactory for the majority of uses, the scheme is not flexible enough for some classes of scenarios.

Scenarios like Designers, REPLs or game scripting that host user-provided code are not able to handle unhandled exceptions thrown by that code. Unhandled exceptions on the finalizer thread, threadpool threads or user-created threads will take down the whole process. This is not a desirable experience for these types of scenarios.

In addition, some customers have existing infrastructure for postmortem analysis of failures, and including .NET components requires interfacing with, or overriding, the way fatal errors are handled.

API Proposal

API for process-wide handling of unhandled exceptions

namespace System.Runtime.ExceptionServices
{
    public delegate bool UnhandledExceptionHandler(System.Exception exception);

    public static class ExceptionHandling
    {
        /// <summary>
        /// Sets a handler for unhandled exceptions.
        /// </summary>
        /// <exception cref="ArgumentNullException">If handler is null</exception>
        /// <exception cref="InvalidOperationException">If a handler is already set</exception>
        public static void SetUnhandledExceptionHandler(UnhandledExceptionHandler handler);
    }
}
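
The documented exceptions imply a set-once contract: a null handler is rejected, and a second registration fails. The sketch below models that contract in C++ purely as an illustration. It is not the runtime's implementation; the global slot `g_handler`, the function-pointer alias, and the mapping of `ArgumentNullException` and `InvalidOperationException` to `std::invalid_argument` and `std::logic_error` are all assumptions of this sketch.

```cpp
#include <atomic>
#include <stdexcept>

// Hypothetical C++ stand-in for the proposed managed delegate type.
using UnhandledExceptionHandler = bool (*)(const std::exception&);

// Hypothetical process-wide slot; empty until a handler is registered.
static std::atomic<UnhandledExceptionHandler> g_handler{nullptr};

void SetUnhandledExceptionHandler(UnhandledExceptionHandler handler)
{
    if (handler == nullptr)
        throw std::invalid_argument("handler");  // models ArgumentNullException

    // Only the first registration wins, even if two threads race.
    UnhandledExceptionHandler expected = nullptr;
    if (!g_handler.compare_exchange_strong(expected, handler))
        throw std::logic_error("a handler is already set");  // models InvalidOperationException
}
```

A compare-and-swap is used here so that concurrent registrations cannot both succeed; whether the actual API gives that guarantee is not stated in the proposal.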

The semantics of the unhandled exception handler follow the model of an imaginary handler like the following, inserted in places where the exception will not lead to process termination regardless of what handler() returns.

try { UserCode(); } catch (Exception ex) when (handler(ex)) { }

In particular:

  • only exceptions that can be caught and ignored will cause the handler to be invoked (e.g. a stack overflow will not).
  • an unhandled exception thrown in a handler will not invoke the handler again, but will be treated as the handler returning false.
  • when an exception is handled via a handler in a user-started thread, the thread will still exit (but this will not escalate to process termination).
  • when an exception is handled in a task-like scenario on an infrastructure thread (thread pool, finalizer queue...), the execution of tasks will continue.
    (Whether the infrastructure thread continues or is restarted is unspecified, but the process should be able to proceed.)
  • a reverse P/Invoke will not install a try/catch like the one above.
  • main() will not install a try/catch like the one above.
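
The rules above can be modeled outside the runtime as a small decision function. The following C++ sketch only illustrates the outcomes: C++ has no exception filters, so it does not reproduce the timing of a C# `when` clause, and the names `Handler`, `InvokeHandler` and `RunGuarded` are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the registered unhandled exception handler.
using Handler = std::function<bool(const std::exception&)>;

// Calls the handler; an exception escaping the handler does not re-enter it
// and is treated as if the handler had returned false.
bool InvokeHandler(const Handler& handler, const std::exception& ex)
{
    assert(handler && "a handler must be registered");
    try { return handler(ex); }
    catch (...) { return false; }
}

// Models one guarded unit of user code (for example a thread pool work item).
// Returns true if the process may proceed, false where the default
// unhandled-exception behavior would apply.
bool RunGuarded(const std::function<void()>& userCode, const Handler& handler)
{
    try { userCode(); return true; }
    catch (const std::exception& ex) { return InvokeHandler(handler, ex); }
}
```

In this model a work item that throws while the handler returns true yields `true` (work continues), and a handler that itself throws yields `false`.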

API Proposal for custom handling of fatal errors

Managed API to set up the handler.

namespace System.Runtime.ExceptionServices
{
    public static class ExceptionHandling
    {
        /// <summary>
        /// .NET runtime is going to call `fatalErrorHandler` set by this method before its own
        /// fatal error handling (creating .NET runtime-specific crash dump, etc.).
        /// </summary>
        /// <exception cref="ArgumentNullException">If fatalErrorHandler is null</exception>
        /// <exception cref="InvalidOperationException">If a handler is already set</exception>
        public static void SetFatalErrorHandler(delegate* unmanaged<int, void*, int> fatalErrorHandler);
    }
}

The shape of the FatalErrorHandler, if implemented in C++
(the default calling convention for the given platform is used):

// expected signature of the handler
FatalErrorHandlerResult FatalErrorHandler(int32_t hresult, struct FatalErrorInfo* data);

With FatalErrorHandlerResult and FatalErrorInfo defined in "FatalErrorHandling.h" under src/native/public:

enum FatalErrorHandlerResult : int32_t
{
    RunDefaultHandler = 0,
    SkipDefaultHandler = 1,
};

#if defined(_MSC_VER) && defined(_M_IX86)
#define DOTNET_CALLCONV __stdcall
#else
#define DOTNET_CALLCONV
#endif

struct FatalErrorInfo
{
    size_t size;    // size of the FatalErrorInfo instance
    void*  address; // code location correlated with the failure (e.g. location where FailFast was called)

    // exception/signal information, if available
    void* info;     // Cast to PEXCEPTION_RECORD on Windows or siginfo_t* on non-Windows.
    void* context;  // Cast to PCONTEXT on Windows or ucontext_t* on non-Windows.

    // An entry point for retrieving additional information about the crash.
    // As the runtime finds information suitable for logging, it invokes pfnLogAction and passes the information in logString.
    // The callback may be called multiple times.
    // Combined, the logString values contain the same parts as the console output of the default crash handler.
    // logString is UTF-8 encoded.
    void (DOTNET_CALLCONV *pfnGetFatalErrorLog)(
           FatalErrorInfo* errorData, 
           void (DOTNET_CALLCONV *pfnLogAction)(char8_t* logString, void *userContext), 
           void* userContext);

    // More information can be exposed for querying in the future by adding
    // entry points with similar pattern as in pfnGetFatalErrorLog
};
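
Because pfnLogAction may be called multiple times, a handler that wants the whole log has to accumulate fragments, and userContext is the natural place to carry the accumulator. The sketch below shows one way to do that. It mirrors the proposed struct locally so that it compiles on its own. The helpers `AppendFragment`, `HasLogEntryPoint` and `CollectFatalErrorLog`, the `utf8_char` alias, and the use of `size` as a version check are assumptions of this sketch, not part of the proposal.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

#if defined(_MSC_VER) && defined(_M_IX86)
#define DOTNET_CALLCONV __stdcall
#else
#define DOTNET_CALLCONV
#endif

#if defined(__cpp_char8_t)
using utf8_char = char8_t;
#else
using utf8_char = char;  // assumption: fallback for compilers without char8_t
#endif

// Local mirror of the proposed struct; real code would include FatalErrorHandling.h.
struct FatalErrorInfo
{
    size_t size;
    void*  address;
    void*  info;
    void*  context;
    void (DOTNET_CALLCONV *pfnGetFatalErrorLog)(
        FatalErrorInfo* errorData,
        void (DOTNET_CALLCONV *pfnLogAction)(utf8_char* logString, void* userContext),
        void* userContext);
};

// Log action: appends one fragment to the std::string passed through userContext.
static void DOTNET_CALLCONV AppendFragment(utf8_char* logString, void* userContext)
{
    auto* log = static_cast<std::string*>(userContext);
    log->append(reinterpret_cast<const char*>(logString));
}

// Assumption: size can be used to check that the entry point is present.
static bool HasLogEntryPoint(const FatalErrorInfo* data)
{
    return data->size >= offsetof(FatalErrorInfo, pfnGetFatalErrorLog)
                             + sizeof(data->pfnGetFatalErrorLog)
        && data->pfnGetFatalErrorLog != nullptr;
}

// Gathers all fragments reported by the runtime into a single string.
static std::string CollectFatalErrorLog(FatalErrorInfo* data)
{
    std::string log;
    if (HasLogEntryPoint(data))
        data->pfnGetFatalErrorLog(data, &AppendFragment, &log);
    return log;
}
```

A std::string keeps the sketch short, but it allocates. A real crash handler, especially one that runs after an out-of-memory failure, would probably write into a preallocated buffer instead.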

API Usage

Setting up a handler for unhandled exceptions:

using System.Runtime.ExceptionServices;

ExceptionHandling.SetUnhandledExceptionHandler(
    (ex) =>
    {
        if (DesignMode)
        {
            DisplayException(ex);
            // the exception is now "handled"
            return true;
        }
        return false;
    }
);

Setting up a handler for fatal errors:

Setting up the handler for the process (C# code in the actual app):

using System.Runtime.ExceptionServices;
using System.Runtime.InteropServices;

internal unsafe class Program
{
    [DllImport("myCustomCrashHandler.dll")]
    public static extern delegate* unmanaged<int, void*, int> GetFatalErrorHandler();

    static void Main(string[] args)
    {
        ExceptionHandling.SetFatalErrorHandler(GetFatalErrorHandler());

        RunMyProgram();
    }
}

The handler (C++ in myCustomCrashHandler.dll):

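#include "FatalErrorHandling.h"

// Forward declaration: the log callback is defined below but used in FatalErrorHandler.
static void DOTNET_CALLCONV LogErrorMessage(char8_t* logString, void* userContext);

static FatalErrorHandlerResult FatalErrorHandler(int32_t hresult, struct FatalErrorInfo* data)
{
    // this is a special handler that analyzes OOM crashes
    if (hresult != COR_E_OUTOFMEMORY)
    {
        return FatalErrorHandlerResult::RunDefaultHandler;
    }

    DoSomeCustomProcessingOfOOM(data);

    // retain the additional error data
    data->pfnGetFatalErrorLog(data, &LogErrorMessage, nullptr);

    // no need for a huge crash dump after an OOM.
    return FatalErrorHandlerResult::SkipDefaultHandler;
}

// Matches the pfnLogAction signature, including the calling convention.
static void DOTNET_CALLCONV LogErrorMessage(char8_t* logString, void* userContext)
{
    AppendToBlob(logString);
}

extern "C" DLL_EXPORT void* GetFatalErrorHandler()
{
    // a function pointer does not convert to void* implicitly in C++
    return reinterpret_cast<void*>(&FatalErrorHandler);
}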

Alternative Designs

An unmanaged hosting API that enables this behavior. (CoreCLR has an undocumented and poorly tested configuration option for this today, #39587. This option is going to be replaced by this API.)

Extending the AppDomain.CurrentDomain.UnhandledException API and making the IsTerminating property writable to allow "handling".
A scan of existing uses of this API found that IsTerminating is often treated as a fact: whether an exception is terminal or not. Changing it to mean "configurable" would be a breaking change for those uses.

Risks

These APIs can be abused to ignore unhandled exceptions or fatal errors in scenarios where it is not warranted.
