Exception handler

Hex-Rays' support for exceptions in Microsoft Visual C++/x64 incorporates the C++ exception metadata for functions into their decompilation, and presents the results to the user via built-in constructs in the decompilation (`try`, `catch`, `__wind`, `__unwind`). When the results cannot be presented entirely with these constructs, they will be presented via helper calls in the decompilation.

The documentation describes:

  1. Background behind C++ exception metadata. It is recommended that users read this first.
  2. Interactive operation via the GUI, configuration file, and keyboard shortcuts.
  3. A list of helper calls that may appear in the output.
  4. A note about the boundaries of `try` and `__unwind` regions.
  5. Miscellaneous notes about the plugin.

# TRY, CATCH, AND THROW

The C++ language provides the `try` scoped construct in which the developer expects that an exception might occur. `try` blocks must be followed by one or more scoped `catch` constructs for catching exceptions that may occur within. `catch` blocks may use `...` to catch any exception. Alternatively, `catch` blocks may name the type of an exception, such as `std::bad_alloc`. `catch` blocks with named types may or may not also catch the exception object itself. For example, `catch(std::bad_alloc *v10)` and `catch(std::bad_alloc *)` are both valid. The former can access the exception object through variable `v10`, whereas the latter cannot access the exception object.

C++ provides the `throw` keyword for throwing an exception, as in `std::bad_alloc ba; throw ba;`. This is represented in the output as (for example) `throw v10;`. C++ also allows code to rethrow the current exception via `throw;`. This is represented in the output as `throw;`.

# WIND AND UNWIND

Exception metadata in C++ binaries is split into two categories: `try` and `catch` blocks, as discussed above, and so-called `wind` and `unwind` blocks. C++ does not have `wind` and `unwind` keywords, but the compiler creates these blocks implicitly. In most binaries, they outnumber `try` and `catch` blocks by about 20 to 1.

Consider the following code, which may or may not throw an `int` as an exception at three places:

  void may_throw() {
    // Point -1
    if(rand() % 2)
      throw -1;

    string s0 = "0";

    // Point 0
    if(rand() % 2)
      throw 0;

    string s1 = "1";

    // Point 1
    if(rand() % 2)
      throw 1;

    // Point 2
    printf("%s %s\n",
      s0.c_str(),
      s1.c_str());

    // Implicit
    // destruction
    s1.~string();
    s0.~string();
  }

If an exception is thrown at point -1, the function exits early without executing any of its remaining code. As no objects have been created on the stack, nothing needs to be cleaned up before the function returns.

If an exception is thrown at point 0, the function exits early as before. However, since `string s0` has been created on the stack, it needs to be destroyed before exiting the function. Similarly, if an exception is thrown at point 1, both `string s1` and `string s0` must be destroyed.

These destructor calls would normally happen at the end of their enclosing scope, i.e. the bottom of the function, where the compiler inserts implicitly-generated destructor calls. However, since the function does not have any `try` blocks, none of the function's remaining code will execute after the exception is thrown. Therefore, the destructor calls at the bottom will not execute. If there were no other mechanism for destructing `s0` and/or `s1`, the result would be memory leaks or other state management issues involving those objects. Therefore, the C++ exception management runtime provides another mechanism to invoke their destructors: `wind` blocks and their corresponding `unwind` handlers.

`wind` blocks are effectively `try` blocks that are inserted invisibly by the compiler. They begin immediately after constructing some object, and end immediately before destructing that object. Their `unwind` blocks play the role of `catch` handlers, calling the destructor upon the object when exceptional control flow would otherwise cause the destructor call to be skipped.

Microsoft Visual C++ effectively transforms the previous example as follows:

  void may_throw_transformed() {
    if(rand() % 2)
      throw -1;

    string s0 = "0";

    // Implicit try
    __wind {
      if(rand() % 2)
        throw 0;

      string s1 = "1";

      // Implicit try
      __wind {
        if(rand() % 2)
          throw 1;

        printf("%s %s\n",
          s0.c_str(),
          s1.c_str());
      }
      // Implicit catch
      __unwind {
        s1.~string();
      }
      s1.~string();
    }
    // Implicit catch
    __unwind {
      s0.~string();
    }
    s0.~string();
  }

`unwind` blocks always re-throw the current exception, unlike `catch` handlers, which may or may not re-throw it. Re-throwing the exception ensures that prior `wind` blocks will have a chance to execute. So, for example, if an exception is thrown at point 1, after the `unwind` handler destroys `string s1`, re-throwing the exception causes the unwind handler for point 0 to execute, thereby allowing it to destroy `string s0` before re-throwing the exception out of the function.

# STATE NUMBERS AND INSTRUCTION STATES

As we have discussed, the primary components of Microsoft Visual C++ x64 exception metadata are `try` blocks, `catch` handlers, `wind` blocks, and `unwind` handlers. Generally speaking, these elements can be nested within one another. For example, in C++ code, it is legal for one `try` block to contain another, and a `catch` handler may contain `try` blocks of its own. The same is true for `wind` and `unwind` constructs: `wind` blocks may contain other `wind` blocks (as in the previous example) or `try` blocks, and `try` and `catch` blocks may contain `wind` blocks.

Exceptions must be processed in a particular sequence: namely, the most nested handlers must be consulted first. For example, if a `try` block contains another `try` block, any exceptions occurring within the latter region must be processed by the innermost `catch` handlers first. Only if none of the inner `catch` handlers can handle the exception should the outer `try` block's catch handlers be consulted. Similarly, as in the previous example, `unwind` handlers must destruct their corresponding objects before passing control to any previous exception handlers (such as `string s1`'s `unwind` handler passing control to `string s0`'s `unwind` handler).

Microsoft's solution to ensure that exceptions are processed in the proper sequence is simple. It assigns a "state number" to each exception-handling construct. Each exception state has a "parent" state number whose handler will be consulted if the current state's handler is unable to handle the exception. In the previous example, what we called "point 0" is assigned the state number 0, while "point 1" is assigned the state number 1. State 1 has a parent of 0. (State 0's parent is a dummy value, -1, that signifies that it has no parent.) Since `unwind` handlers always re-throw exceptions, if state 1's `unwind` handler is ever invoked, the exception handling machinery will always invoke state 0's `unwind` handler afterwards. Because state 0 has no parent, the exception machinery will re-throw the exception out of the current function. This same machinery ensures that the catch handlers for inner `try` blocks are consulted before outer `try` blocks.

There is only one more piece to the puzzle: given that an exception could occur anywhere, how does the exception machinery know which exception handler should be consulted first? I.e., for every address within a function with C++ exception metadata, what is the current exception state? Microsoft C++/x64 binaries provide this information in the `IPtoStateMap` metadata tables, which is an array of address ranges and their corresponding state numbers.

# GUI OPERATION

This support is fully automated and requires no user interaction. However, the user can customize the display of C++ exception metadata elements for the global database, as well as for individual functions.

# GLOBAL SETTINGS

Under the `Edit->Other->C++ exception display settings` menu item, the user can edit the default settings to control which exception constructs are shown in the listing. These are saved persistently in the database (i.e., the user's choices are remembered after saving, closing, and re-opening), and can also be adjusted on a per-function basis (described later).

The settings on the dialog are as follows:

* Default output mode. When the plugin is able to represent C++ exception constructs via nice constructs like `try`, `catch`, `__wind`, and `__unwind` in the listings, these are called "structured" exception states. The plugin is not always able to represent exception metadata nicely, and may instead be forced to represent the metadata via helper calls in the listing (which are called "unstructured" states). As these can be messy and distracting, users may prefer not to see them by default. Alternatively, the user may prefer to see no exception metadata whatsoever, not even the structured ones. This setting allows the user to specify which types of metadata will be shown in the listing. * Show wind states. We discussed wind states and unwind handlers in the background material. Although these states can be very useful when reverse engineering C++ binaries (particularly when analyzing constructors), displaying them increases the amount of code in the listing, and sometimes the information they provide is more redundant than useful. Therefore, this checkbox allows the user to control whether they are shown by default. * Inform user of hidden states. The two settings just discussed can cause unstructured and/or wind states to be omitted from the default output. If this checkbox is enabled, then the plugin will inform the user of these omissions via messages at the top of the listing, such as this message indicating that one unstructured wind state was omitted: ``` // Hidden C++ exception states: #wind_helpers=1 ```

There are three more elements on the settings dialog; most users should never have to use them. However, for completeness, we will describe them now.

* Warning behavior. When internal warnings occur, they will either be printed to the output window at the bottom, or shown as a pop-up warning message box depending on this setting. * Reset per-function settings. The next section will discuss how the display settings described above can be customized on a per-function basis. This button allows the user to erase all such saved settings, such that all functions will use the global display settings the next time they are decompiled. * Rebuild C++ metadata caches. Before the plugin can show C++ exception metadata in the output, it must pre-process the metadata across the whole binary. Doing so crucially relies upon the ability to recognize the `__CxxFrameHandler3` and `__CxxFrameHandler4` unwind handler functions when they are referenced by the binary's unwind metadata. If the plugin fails to recognize one of these functions, then it will be unable to display C++ exception metadata for any function that uses the unrecognized unwind handler(s).

If the user suspects that a failure like this has taken place -- say, because they expect to see a `try`/`catch` in the output and it is missing, and they have confirmed that the output was not simply hidden due to the display settings above -- then this button may help them to diagnose and repair the issue. Pressing this button flushes the existing caches from the database and rebuilds them. It also prints output to tell the user which unwind handlers were recognized and which ones were not. The user can use these messages to confirm whether the function's corresponding unwind handler was unrecognized. If it was not, the user can rename the unwind handler function to something that contains one of the two aforementioned names, and then rebuild the caches again.

Note that users should generally not need to use this button, as the plugin tries several methods to recognize the unwind handlers (such as FLIRT signatures, recognizing import names, and looking at the destination of "thunk" functions with a single `jmp` to a destination function). If the user sees any C++ exception metadata in the output, this almost always means that the recognition worked correctly. This button should only be used by experienced users as a last resort. Users are advised to save their database before pressing this button, and only proceed with the changes if renaming unwind handlers and rebuilding the cache addresses missing metadata in the output.

# CONFIGURATION

The default options for the settings just described are controlled via the `%IDADIR%/cfg/eh34.cfg` configuration file. Editing this file will change the defaults for newly-created databases (but not affect existing databases).

# PER-FUNCTION SETTINGS

As just discussed, the user can control which C++ exception metadata is displayed in the output via the global menu item. Users can also customize these settings on a per-function basis (say, by enabling display of wind states for selected functions only), and they will be saved persistently in the database.

When a function has C++ exception metadata, one or more items will appear on Hex-Rays' right click menu. The most general one is "C++ exception settings...". Selecting this menu item will bring up a dialog that is similar to the global settings menu item with the following settings:

* Use global settings. If the user previously changed the settings for the function, but wishes that the function be shown via the global settings in the future, they can select this item and press "OK". This will delete the saved settings for the function, causing future decompilations to use the global settings. * This function's output mode. This functions identically to "Default output mode" from the global settings dialog, but only affects the current function. * Show wind states. Again, identical to the global settings dialog item.

There is a button at the bottom, "Edit global settings", which is simply a shortcut to the same global settings dialog from the `Edit->Other->C++ exception display settings` menu item.

The listing will automatically refresh if the user changes any settings.

Additionally, there are four other menu items that may or may not appear, depending upon the metadata present and whether the settings caused any metadata to be hidden. These menu items are shortcuts to editing the corresponding fields in the per-function settings dialog just discussed. They are:

* Show unstructured C++ states. If the global or per-function default output setting was set to "Structured only", and the function had unstructured states, this menu item will appear. Clicking it will enable display of unstructured states for the function and refresh the decompilation. * Hide unstructured C++ states. Similar to the above. * Show wind states. If the global or per-function "Show wind states" setting was disabled, and the function had wind states, this menu item will appear. Clicking it will enable display of wind states for the function and refresh the decompilation. * Hide wind states. Similar to the above.

# KEYBOARD SHORTCUTS

The user can change (add, remove, or edit) the keyboard shortcuts for the per-function settings right-click menu items from the `Edit->Options->Shortcuts` dialog. The names of the corresponding actions are:

* "C++ exception settings": `eh34:func_settings` * "Show unstructured C++ states": `eh34:enable_unstructured` * "Hide unstructured C++ states": `eh34:disable_unstructured` * "Show wind states": `eh34:enable_wind` * "Hide wind states": `eh34:disable_wind` * The global settings dialog: `eh34:config_menu`

# HELPER CALLS

Hex-Rays' Microsoft C++ x64 exception support tries to details about exception state numbers as much as possible. However, compiler optimizations can cause binaries to diverge from the original source code. For example, inlined functions can produce `goto` statements in the decompilation despite there being none in the source. Optimizations can also cause C++ exception metadata to differ from the original code. As a result, it is not always possible to represent `try`, `catch`, `wind`, and `unwind` constructs as scoped regions that hide the low-level details.

In these cases, Hex-Rays' Microsoft C++ x64 exception support will produce helper calls with informative names to indicate when exception states are entered and exited, and to ensure that the user can see the bodies of `catch` and `unwind` handlers in the output. The user can hover their mouse over those calls to see their descriptions. They are also catalogued below.

The following helper calls are used when exception states have multiple entrypoints, or multiple exits:

  `__eh34_enter_wind_state(s1, s2) : switch state from parent state s1 to child wind state s2`
  `__eh34_enter_try_state(s1, s2) : switch state from parent state s1 to child try state s2`
  `__eh34_exit_wind_state(s1, s2) : switch state from child wind state s1 to parent state s2`
  `__eh34_exit_try_state(s1, s2) : switch state from child try state s1 to parent state s2`

The following helper calls are used when exception states had single entry and exit points, but could not be represented via `try` or `__wind` keywords:

  `__eh34_wind(s1, s2) : switch state from parent state s1 to child state s2; a new c++ object that requires a dtr has been created`
  `__eh34_try(s1, s2) : switch state from parent state s1 to child state s2; mark beginning of a try block`

The following helper calls are used to display `catch` handlers for exception states that could not be represented via the `catch` keyword:

  `__eh34_catch(s) : beginning of catch blocks at state s; s corresponds to the second argument of the matching try call(if present)`
  `__eh34_catch_type(s, \"handler_address\") : a catch statement for the type described at \"address\"`
  `__eh34_catch_ellipsis(s) : \"catch all\" statement`

The following helper calls should be removed, but if you see them, they signify the boundary of a `catch` handler:

  `__eh34_try_continuation(s, i, ea) : end of catch handler `i` for state `s`, returning to address `ea``
  `__eh34_caught_type(s, \"handler_address\") : a pairing call for __eh34_catch_type when catch handler has no continuation`
  `__eh34_caught_ellipsis(s) : \"caught all\", paired with __eh34_catch_ellipsis when catch handler has no continuation`

The following helper calls are used to display `unwind` handlers for exception states that could not be represented via the `__unwind` keyword:

  `__eh34_unwind(s) : destruct the c++ object created immediately before entering state s; s corresponds to the second argument of the matching wind call(if present)`

The following helper calls are used to signify that an `unwind` handler has finished executing, and will transfer control to a parent exception state (or outside of the function):

  `__eh34_continue_unwinding(s1, s2) : after unwinding at child state s1, switch to parent state s2 and perform its unwind or catch action`
  `__eh34_propagate_exception_into_caller(s1, s2) : after unwinding at child state s1, switch to root state s2; this corresponds to the exception being propagated into the calling function`

The following helper call is used when the exception metadata did not specify a function pointer for an `unwind` handler, which causes program termination:

  `__eh34_no_unwind_handler(s) : the state s did not have an unwind handler, which causes program termination in the event that an exception reaches it`

The following helper calls are used to signify that Hex-Rays was unable to display an exception handler in the decompilation:

  `__eh34_unwind_handler_absent(s, ea) : could not inline unwind handler at ea for wind state s`
  `__eh34_catch_handler_absent(s, i, ea) : could not inline i'th catch handler at ea for try state s`
From Microsoft Visual Studio versions 2005 (toolchain version 8.0) to 2017 Service Pack 2 (version 14.12), the compiler emitted detailed metadata that precisely defined the boundaries of all exception regions within C++ functions. This made binary files large, and not all of the metadata was strictly necessary for the runtime library to handle C++ exceptions correctly.

Starting from MSVC 2017 Service Pack 3 (version 14.13), the compiler began applying optimizations to reduce the size of the C++ exception metadata. An official Microsoft blog entry entitled ["Making C++ Exception Handling Smaller on x64"](https://devblogs.microsoft.com/cppblog/making-cpp-exception-handling-smaller-x64/)

As a result of these changes, the C++ exception metadata in MSVC 14.13+ binaries is no longer fully precise. Exception states are frequently reported as beginning physically after where the source code would indicate. In order to produce usable output, Hex-Rays employs mathematical optimization algorithms to reconstruct more detailed C++ exception metadata configurations that can be displayed in a nicer format in the decompilation. These algorithms improve the listings by producing more structured regions and fewer helper calls in the output, but they introduce further imprecision as to the true starting and ending locations of exception regions when compared to the source code. They are an integral part of Hex-Rays C++/x64 Windows exception metadata support and cannot be disabled.

The takeaway is that, when processing MSVC 14.13+ binaries, Hex-Rays C++/x64 Windows exception support frequently produces `try` and `__unwind` blocks that begin and/or end earlier and/or later than what the source code would indicate, were it available. This has important consequences for vulnerability analysis.

For example, given accurate exception boundary information, the destructor for a local object would ordinarily be situated after the end of that object's `__wind` and `__unwind` blocks, as in:

  Object::Constructor(&v14);
  __wind
  {
    // ...
  }
  __unwind
  {
    Object::Destructor(&v14);
  }
  // HERE: destructor after __wind region
  Object::Destructor(&v14);

Yet, due to the imprecise boundary information, Hex-Rays might display the destructor as being inside of the `__wind` block:

  Object::Constructor(&v14);
  __wind
  {
    // ...
    // HERE: destructor inside of __wind region
    Object::Destructor(&v14);
  }
  __unwind
  {
    Object::Destructor(&v14);
  }

The latter output might indicate that `v14`'s destructor would be called twice if its destructor were to throw an exception. However, this indication is simply the result of imprecise exception region boundary information. In short, users should be wary of diagnosing software bugs or security issues based upon the positioning of statements nearby the boundaries of `try` and `__wind` blocks. The example above indicates something that might appear to be a bug in the code -- a destructor being called twice -- but is in fact not one.

These considerations primarily apply when analyzing C++ binaries compiled with MSVC 14.13 or greater. They do not apply as much to binaries produced by MSVC 14.12 or earlier, when the compiler emitted fully precise information about exception regions.

Although Hex-Rays may improve its detection of exception region boundaries in the future, because modern binaries lack the ground truth of older binaries, the results will never be fully accurate. If the imprecision is unacceptable to you, we recommend permanently disabling C++ metadata display via the `eh34.cfg` file discussed previously.

# MISCELLANEOUS

Hex-Rays' support for exceptions in Microsoft Visual C++/x64 only works after auto-analysis has been completed. Users can explore the database and decompile functions as usual, but no C++ exception metadata will be shown. Users are advised to refresh any decompilation windows after auto-analysis has completed.

If users have enabled display of wind states, they may see empty `__wind` or `__unwind` constructs in the output. Usually, this does not indicate an error occurred; this usually means that the region of the code corresponding the `wind` state was very small or contained dead code, and Hex-Rays normal analysis and transformation made it empty.

Starting in IDA 9.0, IDA's auto-analysis preprocesses C++ exception metadata differently than in previous versions. In particular, on MSVC/x64 binaries, `__unwind` and `catch` handlers are created as standalone functions, not as chunks of their parent function as in earlier versions. This is required to display the exception metadata correctly in the decompilation. For databases created with older versions, the plugin will still show the outline of the exception metadata, but the bodies of the `__unwind` and `catch` handlers will be displayed via the helper calls `__eh34_unwind_handler_absent` and `__eh34_catch_handler_absent`, respectively. The plugin will also print a warning at the top of the decompilation such as `Absent C++ exception handlers: #catch=1 (pre-9.0 IDB)` in these situations. Re-creating the IDB with a newer version will solve those issues, although users might still encounter absent handlers in new databases (rarely, and under different circumstances).

Last updated