An Adventure In Hostile Code Analysis: Disassembly
We used IDA to disassemble both trojan components.
The first part is small and trivial to analyze. It displays a message box saying “Error on line 25: invalid object” and, when the user clicks OK, connects to http://view-greetings-yahoo.com to download sysman32.exe into the system directory. It then creates a registry key in the well-known Software\Microsoft\Windows\CurrentVersion\Run registry branch so that the downloaded file runs at the next reboot. You can find the full analysis of Part 1 here as an IDA 4.5 database or a text listing.
The main trojan executable was more challenging and interesting because:
it was compiled using Microsoft Visual C++ and packed with the PEPACK executable compressor.
it made heavy use of the COM interface to interact with the browser windows. COM interface interactions are notoriously hard to analyze, especially when you deal with hostile code, which must be analyzed statically.
its code was filled with indirect calls that made the raw disassembly difficult to comprehend.
We used proprietary in-house tools to handle the unpacking and analyze the COM interface calls. The Part 2 IDA 4.5 database and a text listing are available.
Here is a sample of the raw code:
and here is how it looks after analysis:
The trojan uses the COM interface to get information from all open browser windows. To get a pointer to the IShellWindows object, which represents all browser windows, it calls CoCreateInstance with the class ID for IShellWindows: {9BA05972-F6A8-11CF-A442-00A0C90A8F39}
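When hunting for such a class ID in an unpacked image, it helps to remember that a GUID is not stored in the order it is printed. The sketch below is a minimal illustration, not part of the original analysis: it converts the class ID quoted above into the 16-byte layout of a Windows GUID structure and searches a byte buffer for it. The helper names guid_search_pattern and find_guid are hypothetical.

```python
import uuid
from typing import List

# Class ID quoted in the text, in canonical 8-4-4-4-12 form.
CLSID_SHELL_WINDOWS = uuid.UUID("9BA05972-F6A8-11CF-A442-00A0C90A8F39")


def guid_search_pattern(guid: uuid.UUID) -> bytes:
    """Return the 16 bytes as a Windows GUID structure stores them."""
    # bytes_le byte-swaps the first three fields (DWORD, WORD, WORD are
    # little-endian) and leaves the final eight bytes in printed order.
    return guid.bytes_le


def find_guid(image: bytes, guid: uuid.UUID) -> List[int]:
    """Return every offset in image where the GUID's stored form occurs."""
    pattern = guid_search_pattern(guid)
    hits = []
    pos = image.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = image.find(pattern, pos + 1)
    return hits


if __name__ == "__main__":
    # Byte sequence to look for in a hex view of the unpacked file.
    print(guid_search_pattern(CLSID_SHELL_WINDOWS).hex(" "))
```

The first three groups come out byte-reversed, so a search for the class ID as printed would miss it.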
The algorithm that steals the web browser information is as follows:
For each browser window:
    get the window document using IWebBrowser2::get_Document()
    get the window title using IHTMLDocument2::get_title()
    convert the title from Unicode to ASCII and look for the substrings “e-gold Account Access”, “PayPal – Log In”, “- Sign”
    If the window title contains any of them, then:
        get the collection of all document frames using IHTMLDocument2::get_frames()
        get the number of document frames using IHTMLFramesCollection2::get_length()
        for each document frame:
            get the HTML element collection using IHTMLDocument2::get_all()
            get the number of elements using IHTMLElementCollection::get_length()
            for each HTML element:
                get its value using IHTMLElement::getAttribute() with attribute name = “Value”
                if getAttribute() succeeds, then store the value of the HTML element in the output buffer.
That is how the trojan gets the values of all input fields, checkboxes and other input controls on the web page.
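To check a reading of the disassembly, the control flow can be modelled on plain in-memory data. The sketch below is only such a model: the Window, Frame and Element classes are hypothetical stand-ins for the COM objects, no browser or COM call is involved, and the PayPal marker is written with an ASCII hyphen on the assumption that the comparison runs on ASCII text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Substrings quoted in the text. Assumption: the dash in the PayPal title
# is a plain ASCII hyphen, since the title is converted to ASCII first.
TITLE_MARKERS = ("e-gold Account Access", "PayPal - Log In", "- Sign")


@dataclass
class Element:
    # Hypothetical stand-in for an HTML element and its attributes.
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Frame:
    # Hypothetical stand-in for one document frame.
    elements: List[Element] = field(default_factory=list)


@dataclass
class Window:
    # Hypothetical stand-in for one browser window.
    title: str
    frames: List[Frame] = field(default_factory=list)


def title_matches(title: str) -> bool:
    """True if the title contains any of the marker substrings."""
    return any(marker in title for marker in TITLE_MARKERS)


def collect_values(windows: List[Window]) -> List[str]:
    """Walk windows -> frames -> elements as the listing does."""
    collected = []
    for window in windows:
        if not title_matches(window.title):
            continue  # windows with other titles are skipped entirely
        for frame in window.frames:
            for element in frame.elements:
                value = element.attributes.get("Value")
                if value is not None:  # models getAttribute() succeeding
                    collected.append(value)
    return collected
```

Note that the marker “- Sign” is very broad: any page whose title contains it, not only the two named sites, passes the filter.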
The most time-consuming but essential phase of the analysis is recovering the object and function names from their magic class IDs. While the object names have to be defined manually, the function names can be recovered by the IDA type system. The type library vc6winr.til contains information about the virtual tables (vtbls) of common Windows objects. To replace call [ebx+20h] with something readable like call [ebx+IShellWindows.Item], we add the corresponding virtual table definition from the type library to the database and then use the “structure offset” command to convert the number into a function name. The virtual tables usually have the class name suffixed with “Vtbl”. For example, the virtual table for the IShellWindows class is IShellWindowsVtbl. In practice, we have automated this procedure with a plugin. What cannot yet be automated is the “structure offset” command, since IDA doesn’t trace the data flow in programs. The user must still locate the call [ebx+N] instructions and convert them to a meaningful representation.
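The offset-to-name conversion itself is simple arithmetic once the interface is known. The sketch below is an illustration under stated assumptions, not the plugin mentioned above: it assumes 32-bit code with 4-byte vtbl slots, IDA-style hexadecimal displacements, and the IShellWindows method order as declared in the Windows SDK headers. The helpers method_at and annotate are hypothetical names, and the analyst still has to decide which interface the register points to.

```python
import re

POINTER_SIZE = 4  # assumption: 32-bit x86, one 4-byte pointer per vtbl slot

# Assumed slot order: IUnknown, then IDispatch, then IShellWindows methods.
ISHELLWINDOWS_VTBL = [
    "QueryInterface", "AddRef", "Release",
    "GetTypeInfoCount", "GetTypeInfo", "GetIDsOfNames", "Invoke",
    "get_Count", "Item", "_NewEnum", "Register", "RegisterPending",
    "Revoke", "OnNavigate", "OnActivated", "FindWindowSW",
    "OnCreated", "ProcessAttachDetach",
]

# Matches "call [reg]", "call [reg+8]", "call [reg+20h]", "call dword ptr [...]".
CALL_RE = re.compile(r"call\s+(?:dword ptr\s+)?\[(\w+)(?:\+([0-9A-Fa-f]+)h?)?\]")


def method_at(offset, vtbl=ISHELLWINDOWS_VTBL):
    """Return the method name stored at a byte offset, or None."""
    index, remainder = divmod(offset, POINTER_SIZE)
    if remainder or index >= len(vtbl):
        return None  # misaligned or past the end of the table
    return vtbl[index]


def annotate(line, struct_name="IShellWindowsVtbl", vtbl=ISHELLWINDOWS_VTBL):
    """Rewrite an indirect call in one listing line to use a member name."""
    match = CALL_RE.search(line)
    if not match:
        return line
    reg, disp = match.group(1), match.group(2)
    offset = int(disp, 16) if disp else 0  # displacements are read as hex
    name = method_at(offset, vtbl)
    if name is None:
        return line
    named = f"call    [{reg}+{struct_name}.{name}]"
    return line[:match.start()] + named + line[match.end():]


if __name__ == "__main__":
    print(annotate("call    [ebx+20h]"))
```

Under these assumptions Item sits in slot 8 of the table, which corresponds to byte offset 20h. Lines that are not indirect calls pass through unchanged.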
NOTE: in general, all COM objects are retrieved by supplying their class IDs. We do not list the class IDs for all of the objects here.