The PE operation mode can be used to load PE files and debug them in an MS Windows-like environment.
The current limitations include:
PE executables feature list:
- SEH support: we try to mimic MS Windows as closely as possible. For example, ICEBP is a privileged instruction, but Windows reports it as a single step exception. Similarly, MS Windows does not distinguish between 0xCC and 0xCD 0x03: when the exception occurs, it always reports an exception address that is one byte before the address following the trap. So for an INT 0x3 (CD03), the exception address points to the 0x03 byte (in the middle of the instruction).
- TLS callbacks: TLS callbacks are normally parsed by IDA and presented as entry points. The debugger calls them before jumping to the main entry point of the application. Turning on "Debugger Setup/Suspend on debugging start" may be a good idea, since it makes all this logic clearly visible.
- Emulation of NT structures: Some malware does not use GetProcAddress() or GetModuleHandle(). Instead, it parses the system structures to deduce these values. To support this, we build the basic structure of the TIB, PEB, PEB_LDR_DATA, LDR_MODULE(s) and RTL_USER_PROCESS_PARAMETERS. Other structures can be built in the bochs_startup() function of the startup.idc file.
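The exception address rule from the SEH item above can be illustrated with a small standalone Python sketch. It is not part of the plugin and uses no IDA API; the function name and the sample address are hypothetical. It assumes that the reported address is simply the address following the trap instruction minus one byte, which matches the 0xCC and CD03 cases described above.

```python
# Toy model of the exception address reported for breakpoint traps.
# Assumption: reported address = (address after the trap instruction) - 1.

INT3_SHORT = bytes([0xCC])        # one-byte breakpoint
INT3_LONG = bytes([0xCD, 0x03])   # two-byte INT 0x3


def reported_exception_address(trap_address, encoding):
    """Return the address a Windows-like environment would report."""
    address_after_trap = trap_address + len(encoding)
    return address_after_trap - 1


if __name__ == "__main__":
    base = 0x401000  # hypothetical address of the trap instruction
    # For 0xCC the result is the instruction itself; for CD03 it is the
    # second byte (0x03) of the instruction.
    print(hex(reported_exception_address(base, INT3_SHORT)))
    print(hex(reported_exception_address(base, INT3_LONG)))
```

An exception handler that inspects the byte at the reported address will therefore see 0xCC in the first case and 0x03 in the second.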
PE/PE+ executables feature list:
- Extensible API emulation: The user may provide an implementation of any API function using scripts. The plugin supports the IDC language by default, but if another external language is registered and active, that language will be used instead. Currently, the plugin ships with preconfigured IDC and Python scripts (please refer to startup.idc/startup.py).
It is also possible to copy all the API and startup scripts and place them next to the database in question. This tells the Bochs debugger plugin to use these scripts for databases in that directory. This mechanism makes it possible to customize API/startup scripts for different databases.
In the following example, kernel32!GlobalAlloc is implemented via IDC like this:
A simple MessageBoxA replacement can be:
To access the stack arguments passed to a given IDC function, please use the BochsGetParam() IDC function.
For a full reference on using IDC to implement API calls, please refer to the ida\plugins\bochs\api_kernel32.idc file.
- Remap a DLL path (from the startup.idc script):
- Specify additional DLL search path (and optionally change the mapping):
- Redefine the environment variables: the environment variables can be redefined in startup.idc
- Use native code: it is possible to write a custom Win32/64 DLL and map it into the process space.
Existing APIs can then be redirected to custom written functions. For example:
Here we redirect some functions that modify the memory space of the application to bochsys.dll. Please note that bochsys.dll is a special module that IDA is aware of. Custom functions are declared like this:
Then, in the startup.idc file, the following line must be added:
Custom DLLs are normal DLLs that can import functions from any other DLL. However, it is advisable that the custom DLL is kept small in size and simple, by not linking it to the runtime libraries.
- Helper IDC functions: a set of helper IDC functions is available when the debugger is active. For more information, please refer to "startup.idc".
- Less demanding PE loader: Most PE files can be loaded and run, including system drivers, DLLs and some PE files that cannot be run by the operating system.
- Dependency resolution: In the PE operation mode, the plugin will recursively load all DLLs referenced by the program. All DLLs that are not explicitly marked with "stub" in startup.idc will be loaded as is. It is important to "stub" all system DLLs for faster loading. The PE loader creates empty stubs for undefined functions in stubbed DLLs. For example, the following line defines a stub that will always return 0 for CreateFileA:
Since CreateFileA is mentioned in the IDS files and IDA knows how many bytes it purges from the stack, there is no need to specify the "purge" value. For other functions that are not known to IDA, a full definition line would look like:
- Startup and Exit scripts: It is possible to assign IDC functions that run when the debugging session starts or is about to terminate (before the application continues from the PROCESS_EXITED event). In addition to running code at startup, the startup script serves a role in telling the PE loader which DLLs are to be mapped for the current debugging session. For example:
These lines list the DLLs that will be used during the debugging session. IDA creates empty stubs for all functions from these DLLs. Non-trivial implementations of selected functions can be specified in api_xxxx.idc files, where xxxx is the module name.
API and startup scripts are searched first in the current directory and then in the ida/plugins/bochs directory.
- Memory allocation limit: The PE loader has a simple memory manager. It is used for the initial memory allocation at the loading stage and for dynamic memory allocation during the debugging session. No memory limits are applied at the loading stage: the loader will load all modules regardless of their size. However, once the debugging session starts, a limit applies to how much memory can be dynamically allocated. This limit is specified in the debugger-specific options as "Max allocatable memory". Memory allocation will fail once this limit is reached.
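The memory allocation limit described in the last item can be modeled with a short standalone Python sketch. It is only an illustration and not the plugin's actual memory manager: the class and method names are hypothetical, and it assumes that memory mapped at the loading stage does not count towards the limit and that a request fails when it would push the dynamic total past the limit.

```python
# Toy model of a loader memory manager with a cap on dynamic allocations.
# Assumptions: module loading is never limited and is accounted separately;
# a dynamic request fails if it would exceed the configured maximum.


class SimpleMemoryManager:
    def __init__(self, max_allocatable):
        # Models the "Max allocatable memory" debugger option (in bytes).
        self.max_allocatable = max_allocatable
        self.loaded = 0    # bytes mapped at the loading stage
        self.dynamic = 0   # bytes allocated during the debugging session

    def load_module(self, size):
        """Loading stage: every module is mapped regardless of its size."""
        self.loaded += size
        return True

    def allocate(self, size):
        """Debugging session: fail when the limit would be exceeded."""
        if self.dynamic + size > self.max_allocatable:
            return False
        self.dynamic += size
        return True


if __name__ == "__main__":
    mm = SimpleMemoryManager(max_allocatable=0x1000)
    mm.load_module(0x200000)      # larger than the limit, still loaded
    print(mm.allocate(0x800))     # fits within the limit
    print(mm.allocate(0x800))     # exactly reaches the limit
    print(mm.allocate(1))         # fails: the limit has been reached
```

In this model, a target that allocates memory in a loop starts receiving failures only after the debugging session has consumed the configured amount, no matter how large the loaded modules are.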
Some notes on bochsys.dll:
- BxIDACall: This exported function is used as a trap to suspend the code execution in Bochs and perform some actions in IDA. For example, when the target calls kernel32.VirtualAlloc, it is redirected to bochsys.BxVirtualAlloc, which calls BxIDACall, which triggers IDA:
A breakpoint can be set on this function to monitor all API calls that are handled by IDA.
- BxUndefinedApiCall: This exported function is executed when the application calls an unimplemented function. Setting a breakpoint on it will allow discovering unimplemented functions and eventually implementing them as IDC or DLL functions. It can also be used to determine when unpacking/decryption finishes (provided that all functions used by the unpacker have been defined).