The PE operation mode can be used to load PE files and debug them in an MS Windows-like environment.
The current limitations include:
PE executables feature list:
- SEH support: we try to mimic MS Windows as closely as possible. For example, ICEBP is a privileged instruction, but Windows reports a single step exception for it. Similarly, MS Windows does not distinguish between 0xCC and 0xCD 0x03: when the breakpoint exception occurs, the reported exception address is always one byte before the address that follows the trap instruction. So if the trap was a two-byte INT 0x3 (CD 03), the exception address will point to the 0x03 byte (in the middle of the instruction).
- TLS callbacks: TLS callbacks are normally parsed by IDA and presented as entry points. They will be called by the debugger before jumping to the main entry point of the application. Turning on the "Debugger Setup/Suspend on debugging start" may be a good idea since it will make all this logic clearly visible.
- Emulation of NT structures: Some malware does not call GetProcAddress() or GetModuleHandle(). Instead, it parses the system structures and deduces these values itself. To support this, the plugin builds basic versions of the TIB, PEB, PEB_LDR_DATA, LDR_MODULE(s) and RTL_USER_PROCESS_PARAMETERS structures. Other structures can be built in the bochs_startup() function of the startup.idc file.
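The exception address rule from the SEH item above can be illustrated with a small Python sketch. It is not taken from the plugin; the function name and the sample address are hypothetical, and it only models the "one byte before the address following the trap" behavior described in the text.

```python
def reported_exception_address(insn_addr: int, insn_bytes: bytes) -> int:
    """Model the breakpoint exception address as described in the SEH item.

    Hypothetical helper for illustration only; not part of IDA or Bochs.
    """
    if insn_bytes not in (b"\xCC", b"\xCD\x03"):
        raise ValueError("not a breakpoint trap encoding")
    # Instruction pointer once the trap instruction has executed.
    next_ip = insn_addr + len(insn_bytes)
    # The same one-byte adjustment is applied to both encodings.
    return next_ip - 1


# One-byte INT3 (CC): the reported address is the instruction itself.
print(hex(reported_exception_address(0x401000, b"\xCC")))      # 0x401000
# Two-byte INT 0x3 (CD 03): the reported address is the 0x03 byte.
print(hex(reported_exception_address(0x401000, b"\xCD\x03")))  # 0x401001
```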
PE/PE+ executables feature list:
- Extensible API emulation: The user may provide an implementation of any API function using scripts. The plugin supports the IDC language by default, but if another external language is registered and active, then that external language will be used. Currently, the plugin ships with preconfigured IDC and Python scripts (please refer to startup.idc/startup.py).
It is also possible to take a copy of all the API and startup scripts and place them next to the database in question. This tells the Bochs debugger plugin to use these scripts for databases in that directory. This mechanism makes it possible to customize the API/startup scripts for different databases.
In the following example, kernel32!GlobalAlloc is implemented via IDC like this:
A simple MessageBoxA replacement can be:
To access the stack arguments passed to a given IDC function, please use the BochsGetParam() IDC function.
For a full reference on using IDC to implement API calls, please refer to the ida\plugins\bochs\api_kernel32.idc file.
- Remap a DLL path (from the startup.idc script):
- Specify additional DLL search path (and optionally change the mapping):
- Redefine the environment variables: the environment variables can be redefined in startup.idc
- Use native code: it is possible to write a custom Win32/64 DLL and map it into the process space.
Existing APIs can then be redirected to custom written functions. For example:
Here we redirect some functions that modify the memory space of the application to bochsys.dll. Please note that bochsys.dll is a special module that IDA is aware of. Custom functions are declared like this:
Then, in the startup.idc file, the following line must be added:
Custom DLLs are normal DLLs that can import functions from any other DLL. However, it is advisable that the custom DLL is kept small in size and simple, by not linking it to the runtime libraries.
- Helper IDC functions: a set of helper IDC functions is available when the debugger is active. For more information, please refer to "startup.idc".
- Less demanding PE loader: Most PE files can be loaded and run, including system drivers, DLLs and some PE files that cannot be run by the operating system.
- Dependency resolution: In the PE operation mode, the plugin will recursively load all DLLs referenced by the program. All DLLs that are not explicitly marked with "stub" in startup.idc will be loaded as is. It is important to "stub" all system DLLs for faster loading. The PE loader creates empty stubs for undefined functions in stubbed DLLs. For example, the following line defines a stub that will always return 0 for CreateFileA:
Since CreateFileA is mentioned in the IDS files and IDA knows how many bytes it purges from the stack, there is no need to specify the "purge" value. For other functions that are not known to IDA, a full definition line would look like:
- Startup and Exit scripts: It is possible to assign IDC functions that run when the debugging session starts or is about to terminate (before the application continues from the PROCESS_EXITED event). In addition to running code at startup, the startup script serves a role in telling the PE loader which DLLs are to be mapped for the current debugging session. For example:
These lines list the DLLs that will be used during the debugging session. IDA creates empty stubs for all functions from these DLLs. Non-trivial implementations of selected functions can be specified in api_xxxx.idc files, where xxxx is the module name.
API and startup scripts are searched first in the current directory and then in the ida/plugins/bochs directory.
- Memory allocation limit: The PE loader has a simple memory manager. It is used for the initial memory allocation at the loading stage and for dynamic memory allocation during the debugging session. No memory limits are applied at the loading stage: the loader will load all modules regardless of their size. However, when the debugging session starts, a limit applies on how much memory can be dynamically allocated. This limit is specified in the debugger specific options as "Max allocatable memory". Memory allocation will fail once this limit is reached.
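The two-stage behavior described in the memory allocation item above (no limit while loading, a cap on dynamic allocations once debugging starts) can be sketched as follows. This is only an illustrative model: the class and method names are hypothetical, the page rounding is an assumption, and the plugin's real memory manager may account for memory differently.

```python
PAGE_SIZE = 0x1000  # assumption: allocations are rounded up to 4 KiB pages


class SimpleMemoryManager:
    """Toy model of a loader memory manager with a dynamic allocation cap."""

    def __init__(self, base: int, max_allocatable: int):
        self.next_addr = base
        self.max_allocatable = max_allocatable  # models "Max allocatable memory"
        self.dynamic_used = 0

    def _reserve(self, size: int) -> int:
        addr = self.next_addr
        # Round the reservation up to a whole number of pages.
        self.next_addr += (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)
        return addr

    def load_module(self, size: int) -> int:
        # Loading stage: modules are mapped regardless of their size.
        return self._reserve(size)

    def alloc(self, size: int):
        # Debugging stage: fail once the configured limit would be exceeded.
        if self.dynamic_used + size > self.max_allocatable:
            return None
        self.dynamic_used += size
        return self._reserve(size)


mm = SimpleMemoryManager(base=0x10000000, max_allocatable=0x2000)
mm.load_module(0x500000)          # large module, still loaded
print(mm.alloc(0x1000) is None)   # False: within the limit
print(mm.alloc(0x1000) is None)   # False: limit reached exactly
print(mm.alloc(1) is None)        # True: allocation fails
```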
Some notes on bochsys.dll:
- BxIDACall: This exported function is used as a trap to suspend the code execution in Bochs and perform some actions in IDA. For example, when the target calls kernel32.VirtualAlloc, it is redirected to bochsys.BxVirtualAlloc, which calls BxIDACall, which triggers IDA:
A breakpoint can be set on this function to monitor all API calls that are handled by IDA.
- BxUndefinedApiCall: This exported function is executed when the application calls an unimplemented function. Setting a breakpoint on it will allow discovering unimplemented functions and eventually implementing them as IDC or DLL functions. It can also be used to determine when unpacking/decryption finishes (provided that all functions used by the unpacker have been defined).
The IDB operation mode, as its name implies, takes the current database as the input and runs it under the Bochs debugger. This mode can be used to debug any x86 32 or 64-bit code. Please note that the code executes with privilege ring 3.
The following parameters can be specified in the IDB operation mode:
It may also prove useful to enable the "Debugger Setup/Suspend on debugging start" option so that IDA automatically suspends the process before executing the first instruction.
While debugging, exceptions may occur and are caught by IDA. Please note that these exceptions are raw machine exceptions. For example, instead of an access violation exception, a page fault exception is generated.
The disk image operation mode is used to run Bochs with any Bochs disk image.
A simple way to get started is to launch IDA and disassemble the bochsrc file associated with your disk image. IDA will recognize bochsrc files, parse the contents, determine the associated disk image and create a new database containing the first sector of the disk image (usually the boot sector).
The database does not have to correspond to the disk image: it could in fact start as an empty database, and the user could then convert the needed segments to loader segments for later analysis. The following script can be used for that purpose:
If the disk image switches to protected mode with memory paging enabled, IDA will use the page table mapping to display segments. For 16-bit applications, IDA automatically creates a default DOS memory map (Interrupt Vector Table, BIOS Data Area, User Memory and BIOS ROM). Also, the Bochs debugger plugin will try to guess the debugger segment bitness; the user can still edit the bitness manually.
Moreover, the Bochs internal debugger provides hardware-like breakpoints, known as watchpoints, but their addresses must be physical addresses. To make the disk image operation mode more convenient to use, the plugin converts virtual addresses to physical addresses (if page table information is present) before adding a hardware breakpoint. This mechanism will not always work; please check the FAQ for more information. For a hardware breakpoint on execute, the plugin uses the selected address as-is and creates a physical breakpoint.
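To make the virtual-to-physical conversion mentioned above more concrete, here is a minimal Python sketch of a page table walk. It assumes classic 32-bit x86 paging with 4 KiB pages only (no PAE, no 4 MiB pages, no long mode) and is not the plugin's actual code. The function name, the read_u32 callback and the sample memory contents are hypothetical.

```python
PAGE_PRESENT = 0x1
FRAME_MASK = 0xFFFFF000


def virt_to_phys(cr3: int, va: int, read_u32):
    """Translate a virtual address with a two-level 32-bit page walk.

    read_u32(phys_addr) must return the 32-bit value stored at a physical
    address. Returns None when the page is not present.
    """
    # Level 1: page directory entry, indexed by bits 31..22 of the address.
    pde = read_u32((cr3 & FRAME_MASK) + ((va >> 22) & 0x3FF) * 4)
    if not pde & PAGE_PRESENT:
        return None
    # Level 2: page table entry, indexed by bits 21..12 of the address.
    pte = read_u32((pde & FRAME_MASK) + ((va >> 12) & 0x3FF) * 4)
    if not pte & PAGE_PRESENT:
        return None
    # Physical frame from the PTE plus the 12-bit offset within the page.
    return (pte & FRAME_MASK) | (va & 0xFFF)


# Hypothetical physical memory: a page directory at 0x1000, a page table at 0x2000.
memory = {
    0x1004: 0x00002003,  # PDE 1 -> page table at 0x2000, present
    0x2004: 0x00ABC003,  # PTE 1 -> physical frame 0xABC000, present
}
read = lambda addr: memory.get(addr, 0)

print(hex(virt_to_phys(0x1000, 0x00401234, read)))  # 0xabc234
print(virt_to_phys(0x1000, 0x00801234, read))       # None (not mapped)
```

When no translation is available (the None case), a breakpoint address can only be interpreted as a physical address, which is the situation the FAQ entry on untriggered breakpoints describes.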
The following parameters can be specified for the disk image operation mode:
This is a small example on how to debug a given disk image:
1. Prepare the needed Bochs virtual machine files (bochsrc, disk image, floppy image if needed, etc.)
2. Load the bochsrc file into IDA. IDA will automatically create a database.
(Step 2 is optional. It is possible to use a database of your choice, but remember to point its "Debugger->Process Options->Input file" to the bochsrc file)
3. Make sure the "Disk image" operation mode is selected (If Step 2 was used, then Disk image operation mode will be selected automatically)
4. Enable "Debugger Options->Suspend on debugging start", and start debugging!
In the disk image operation mode, the Bochs debugger plugin does not handle or report exceptions. If they must be caught and handled, please put breakpoints on the IDT or IVT entries.
The Bochs debugger plugin uses the Bochs internal command line debugger. For more information about the internal debugger, see: http://bochs.sourceforge.net/doc/docbook/user/internal-debugger.html
To use the Bochs debugger plugin, the following steps must be carried out:
Download and install Bochs v2.6.x from: http://bochs.sourceforge.net/getcurrent.html For Mac OS or Linux, please refer to the following guide: https://www.hex-rays.com//products/ida/support/tutorials/debugging_bochs_linux.pdf
Because the debugger plugin uses the Bochs command line debugger, it has the following limitations:
There are ways to overcome some of the limitations mentioned above by downloading Bochs source code and modifying it. For example, the number of allowed breakpoints can be increased.
The Bochs debugger configuration dialog box has the following entries:
Operation mode: The user can choose between the Disk Image, IDB and PE operation modes.
Default configuration parameters are taken from ida\cfg\dbg_bochs.cfg.
The Bochs debugger module adds a new menu item: Debugger, Bochs Command. It can be used to send arbitrary commands to Bochs. The command output is displayed in the message window (there is also an IDC counterpart of this function; please refer to the "startup.idc" file). This command is very useful but may interfere with IDA, especially if the user modifies breakpoints or resumes execution outside IDA.
General:
- How are breakpoints treated by the IDA Bochs plugin: The Bochs debugger does not implement breakpoints by inserting 0xCC at the given address. Instead, since it is an emulator, it constantly compares the current instruction pointer against the breakpoint list. Data breakpoints are supported by Bochs and are known as "watchpoints". If the user creates hardware breakpoints, IDA will automatically create Bochs watchpoints.
- How to select the Bochs operation mode programmatically: from IDC, use the process_config_line() function to pass a key=value setting. For example, process_config_line("DEFAULT_MODE=1") will select the disk image operation mode. (Please refer to cfg\dbg_bochs.cfg for the list of configurable options.)
- When debugging, IDA undefines instructions: if the "Debugger options / Reconstruct stack" option is checked and the stack pointer is in the same segment as the currently executing code, then IDA might undefine instructions. To solve this problem, uncheck the reconstruct stack option.
- How to convert from physical to linear addresses and vice versa: Bochs internal debugger provides two useful commands for this: "info tab" and "page".
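The breakpoint handling described in the first item of this list (comparing the instruction pointer against a breakpoint list instead of patching 0xCC into memory) can be modeled with a toy Python loop. This sketch is not Bochs code: the function name is hypothetical, and the "program" is reduced to straight-line code given as a table of instruction lengths.

```python
def run_until_breakpoint(start_ip, insn_lengths, breakpoints, max_steps=1000):
    """Step through straight-line code until a breakpoint address is reached.

    insn_lengths maps an instruction address to its length in bytes.
    Returns the address where execution stopped, or None if no breakpoint
    was hit. The emulated code bytes are never modified.
    """
    ip = start_ip
    for _ in range(max_steps):
        # The emulator checks the breakpoint list before each instruction.
        if ip in breakpoints:
            return ip
        length = insn_lengths.get(ip)
        if length is None:
            return None  # ran past the code we know about
        ip += length
    return None


# Hypothetical layout: four instructions starting at 0x1000.
code = {0x1000: 1, 0x1001: 2, 0x1003: 5, 0x1008: 1}
print(hex(run_until_breakpoint(0x1000, code, {0x1003})))  # 0x1003
print(run_until_breakpoint(0x1000, code, {0x1002}))       # None
```

The second call shows one consequence of this design: an address in the middle of an instruction is never equal to the instruction pointer, so a breakpoint placed there does not trigger in this model.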
Disk image operation mode:
- Data/Software breakpoints are not always triggered: During a debugging session, when a breakpoint is created while protected mode/paging is enabled, the page table information is used to translate addresses and correctly create the breakpoint. When the debugging session is started again, IDA cannot access the translation tables to re-create the same breakpoint, so the breakpoint is created without any translation information (its address is treated as a physical address). This is why those breakpoints are not triggered. As a workaround, we suggest disabling those breakpoints and re-enabling them once paging is enabled. This problem can also arise when the "use virtual breakpoints" option is enabled.
IDB/PE operation mode:
- Cannot map VA: Sometimes, IDA may display a message stating that a given VA could not be mapped. This mainly happens because both the IDB and PE operation modes use virtual addresses from 0x0 to 0x9000 and from 0xE0000000 to 0xFFFFFFFF internally. To solve the problem, please rebase the program to another area (for example with IDA's Edit, Segments, Rebase program command).
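Before rebasing, it can help to check whether a program range collides with the internally used areas mentioned above. The following Python sketch does that check. The function name is hypothetical, and treating the documented ranges as half-open intervals (so that 0x9000 itself is usable) is an assumption.

```python
# Ranges used internally by the IDB/PE operation modes, as (start, end) with
# an exclusive end. Whether 0x9000 itself is reserved is an assumption here.
RESERVED_RANGES = [
    (0x00000000, 0x00009000),
    (0xE0000000, 0x100000000),
]


def conflicts_with_reserved(start: int, end: int) -> bool:
    """Return True if [start, end) overlaps one of the reserved ranges."""
    return any(start < r_end and r_start < end
               for r_start, r_end in RESERVED_RANGES)


print(conflicts_with_reserved(0x00400000, 0x00500000))  # False: typical image base
print(conflicts_with_reserved(0x00008000, 0x0000A000))  # True: touches the low area
print(conflicts_with_reserved(0xDFFFF000, 0xE0001000))  # True: touches the high area
```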
PE operation mode:
- Dynamic DLL loading: sometimes, when running a program, the plugin may attempt to load a DLL that is not declared in the stub or load section of the startup script. In this case, please write down the name of the DLL, then add it to the startup script, and restart the debug session. It is possible to create a local copy, next to your database, of startup scripts so that these scripts will be used with this database only.
- Disk image loading is slow: The disk image produced in the PE operation mode can be as big as 20MB. The most probable reason for the slow loading is that the plugin tries to load all referenced DLLs instead of stubbing them. To fix this, when the process starts, please take note of the loaded DLLs list (using IDA / Modules List), then add the desired module names to the "stub" section of the startup.* script.