Functions
This submenu allows you to manipulate functions in the disassembly:
Create Function
Action name: MakeFunction
This command defines a new function in the disassembly text.
You can specify function boundaries using the anchor. If you don't specify any, IDA will try to find the boundaries automatically:
- function start point is equal to the current cursor position;
- function end point is calculated by IDA.
A function cannot contain references to undefined instructions. If a function has already been defined at the specified addresses, IDA will jump to its start address, showing you a warning message.
A function must start with an instruction.
Edit Function
Action name: EditFunction
Here you can change the function bounds, name, and flags. To change only the function end address, you can use the FunctionEnd command.
If the current address does not belong to any function, IDA beeps.
This command allows you to change the function frame parameters too. You can change sizes of some parts of frame structure.
IDA considers the stack as the following structure:
+------------------------------+
| function arguments           |
+------------------------------+
| return address               |
+------------------------------+
| saved registers (SI,DI,etc)  |
+------------------------------+ <- BP
| local variables              |
+------------------------------+ <- SP
For some processors or functions, BP may be equal to SP. In other words, it can point to the bottom of the stack frame.
You may specify the number of bytes in each part of the stack frame. The size of the return address is calculated by IDA (possibly depending on the far function flag).
"Purged bytes" specifies the number of bytes added to SP upon function return. This value will be used to calculate the SP changes at call sites (used in some calling conventions, such as __stdcall in Windows 32-bit programs.)
"BP based frame" allows IDA to automatically convert [BP+xxx] operands to stack variables.
"BP equal to SP" means that the frame pointer points to the bottom of the stack. It is usually used for the processors which set up the stack frame with EBP and ESP both pointing to the bottom of the frame (for example MC6816, M32R).
If you press <Enter> even without changing any parameter, IDA will reanalyze the function.
Sometimes, EBP points to the middle of the stack frame. FPD (frame pointer delta) is used to handle such situations. FPD is the value subtracted from EBP before accessing variables. An example:
push ebp
lea ebp, [esp-78h]
sub esp, 588h
push ebx
push esi
lea eax, [ebp+74h]
+------------------------------+
| function arguments           |
+------------------------------+
| return address               |
+------------------------------+
| saved registers (SI,DI,etc)  |
+------------------------------+ <- typical BP
|                              |
|                              |
|                              | <- actual BP
| local variables              |
|                              |
|                              |
|                              |
+------------------------------+ <- SP
In our example, the saved registers area is empty (since EBP has been initialized before saving EBX and ESI). The difference between the 'typical BP' and 'actual BP' is 0x78 and this is the value of FPD.
After specifying FPD=0x78 the last instruction of the example becomes
lea eax, [ebp+78h+var_4]
where var_4 = -4
Most of the time, IDA calculates the FPD value automatically. If it fails, the user can specify the value manually.
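The FPD arithmetic from the example can be checked with a short sketch. This is only a model of the calculation described above, not IDA's implementation; the function name `apply_fpd` is hypothetical, and only local variables (negative offsets) are handled.

```python
def apply_fpd(disp: int, fpd: int, reg: str = "ebp"):
    """Rewrite [reg+disp] as [reg+FPD+var_N] for a local variable.

    Returns the operand text and the variable's offset from the
    'typical BP' position (negative for locals).
    """
    # The frame pointer sits FPD bytes below the typical BP, so the
    # offset relative to the typical BP is disp - fpd.
    var_off = disp - fpd
    if var_off >= 0:
        raise ValueError("only local variables (negative offsets) are modelled")
    return f"[{reg}+{fpd:X}h+var_{-var_off:X}]", var_off


# lea eax, [ebp+74h] with FPD = 0x78
operand, var_off = apply_fpd(0x74, 0x78)
print(operand, var_off)   # [ebp+78h+var_4] -4
```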
If the value of the stack pointer is modified in an unpredictable way (e.g. "and esp, -16"), then IDA marks the function as "fuzzy-sp".
If this command is invoked for an imported function, then a simplified dialog box will appear on the screen.
Function flags
The following flags can be set in function properties:
Does not return
The function does not return to caller (for example, calls a process exit function or has an infinite loop). If no-return analysis is enabled in Kernel Options, IDA will not analyze bytes following the calls to this function.
Far function
On processors which distinguish near and far functions (e.g. PC x86), mark the function as 'far'. This may affect the size of the special stack frame field reserved for the return address, as well as analysis of calls to this function.
Library func
Mark the function as part of compiler runtime library code. This flag is usually set when applying FLIRT signatures.
Static func
Mark the function as static. Currently this flag is not used by IDA and is simply informational.
BP based frame
Inform IDA that the function uses a frame pointer (BP/EBP/RBP on PC) to access local variables. The operands of the form [BP+xxx] will be automatically converted to stack variables.
BP equal to SP
Frame pointer points to the bottom of the stack instead of at the beginning of the local variables area as is typical.
Fuzzy SP
Function changes SP by an unknown value, for example: and esp, 0FFFFFFF0h
Outlined code
The function is not a real function but a fragment of multiple functions' common instruction sequence extracted by the compiler as a code size optimization (sometimes called "code factoring"). During decompilation, the body of the function will be expanded at the call site.
Append Function Tail
Action name: AppendFunctionTail
This command appends an arbitrary range of the program to a function definition. A range must be selected before applying this command. This range must not intersect with other function chunks (however, an existing tail can be added to multiple functions).
IDA will ask to select the parent function for the selection and will append the range to the function definition.
Remove Function Tail
Action name: RemoveFunctionTail
This command removes the function tail at the cursor from a function definition.
If there are several parent functions for the current function tail range, IDA will ask to select the parent function(s) to remove the tail from.
After the confirmation, the current function tail range will be removed from the selected function definition.
If the parent was the only owner of the current tail, then the tail will be destroyed. Otherwise it will still be present in the database. If the removed parent was the owner of the tail, then another function will be selected as the owner.
Delete Function
Action name: DelFunction
Deleting a function deletes only information about a function, such as information about stack variables, comments, function type, etc.
The instructions composing the function will remain intact.
Set Function End
Action name: FunctionEnd
This command changes the current or previous function bounds so that its end will be set at the cursor. If it is not possible, IDA beeps.
Edit the argument location
Allows you to edit the location of an argument or of the return value.
Stack Variables Window
Action name: OpenStackVariables
This command opens the stack variables window for the current function.
The stack variables are internally represented as a structure. This structure consists of two parts: local variables and function arguments.
You can modify stack variable definitions here: add/delete/define stack variables, enter comments for them.
There may be two special fields in this window: " r" and " s". They represent the size of the function return address and of the saved registers in bytes. You cannot modify them directly. To change them, use the Edit Function command.
Offsets at the line prefixes represent offsets from the frame pointer register (BP). The window indicator at the lower left corner of the window displays offsets from the stack pointer.
To create or delete a stack variable, use the data definition commands (data, strlit, array, undefine, Rename). You may also define regular or repeatable comments.
The defined stack variables may be used in the program by converting operands to stack variables.
Esc closes this window.
See also Convert to stack variable.
Change Stack Pointer
Action name: ChangeStackPointer
This command allows you to specify how the stack pointer (SP) is modified by the current instruction.
You cannot use this command if the current instruction does not belong to any function.
You will need to use this command only if IDA was not able to trace the value of the SP register. Usually IDA can handle it but in some special cases it fails. An example of such a situation is an indirect call of a function that purges its parameters from the stack. In this case, IDA has no information about the function and cannot properly trace the value of SP.
Please note that you need to specify the difference between the old and new values of SP.
The value of SP is used if the current function accesses local variables by [ESP+xxx] notation.
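The bookkeeping behind this command can be illustrated with a minimal sketch. It assumes a 32-bit program and uses the convention delta = new SP - old SP, which is a choice made for this illustration and not necessarily the sign IDA expects in its dialog; the helper name `sp_trace` is hypothetical.

```python
from itertools import accumulate


def sp_trace(deltas):
    """Return SP (relative to its value at function entry) after each instruction."""
    return list(accumulate(deltas))


# push arg2 ; push arg1 ; call [eax]
# The indirect callee purges 8 bytes of arguments (__stdcall style).
untracked = sp_trace([-4, -4, 0])    # callee unknown: the call is assumed not to change SP
corrected = sp_trace([-4, -4, +8])   # SP change of the call supplied manually

print(untracked)   # [-4, -8, -8]
print(corrected)   # [-4, -8, 0]
```

In the untracked case every later [ESP+xxx] operand would be resolved against an SP value that is off by 8 bytes, which is the situation the command is meant to repair.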
See also Convert to stack variable.
Rename register
Action name: RenameRegister
This command allows you to rename a processor general register to some meaningful name. While this is not used very often on IBM PCs, it is especially useful on RISC processors with lots of registers.
For example, a general register R9 is not very meaningful and a name like 'CurrentTime' is much better.
This command can be used to define a new register name as well as to remove it. Just move the cursor on the register name and press enter. If you enter the new register name as an empty string, then the definition will be deleted.
If you have selected a range before using this command, then the definition will be restricted to the selected range. But in any case, the definition cannot cross the function boundaries.
You cannot use this command if the current instruction does not belong to any function.
Set function/item type
Action name: SetType
This command allows you to specify the type of the current item.
If the cursor is located on a name, the type of the named item will be edited. Otherwise, the current function type (if there is a function) or the current item type (if it has a name) will be edited.
The function type must be entered as a C declaration. Hidden arguments (like 'this' pointer in C++) should be specified explicitly. IDA will use the type information to comment the disassembly with the information about function arguments. It can also be used by the Hex-Rays decompiler plugin for better decompilation.
Here is an example of a function declaration:
int main(int argc, const char *argv[]);
To delete a type declaration, please enter an empty string.
IDA supports the user-defined calling convention. In this calling convention, the user can explicitly specify the locations of arguments and the return value. For example:
int __usercall func@<ebx>(int x, int y@<esi>);
denotes a function with 2 arguments: the first argument is passed on the stack (IDA automatically calculates its offset) and the second argument is passed in the ESI register and the return value is stored in the EBX register. Stack locations can be specified explicitly:
int __usercall runtime_memhash@<^12.4>(void *p@<^0.4>, int q@<^4.4>, int r@<^8.4>)
There is a restriction for a __usercall function type: either all stack locations are specified explicitly, or all are calculated automatically by IDA. General rules for the user defined prototypes are:
- the return value must be in a register.
Exception: stack locations are accepted for the __golang and __usercall calling conventions.
- if the return type is 'void', the return location must not be specified
- if the argument location is not specified, it is assumed to be
on the stack; consecutive stack locations are allocated for such arguments
- it is allowed to declare nested declarations, for example:
int **__usercall func16@<eax>(int *(__usercall *x)@<ebx>
(int, long@<ecx>, int)@<esi>);
Here the pointer "x" is passed in the ESI register;
The pointed function is a usercall function and expects its second
argument in the ECX register, its return value is in the EBX register.
The rule of thumb to apply in such complex cases is to specify the
registers just before the opening parenthesis of the parameter list.
- registers used for the location names must be valid for the current
processor; some registers are unsupported (if the register name is
generated on the fly, it is unsupported; inform us about such cases;
we might improve the processor module if it is easy)
- register pairs can be specified with a colon like <edx:eax>
- for really complicated cases, the scattered argument location syntax can be used (see Scattered argument locations).
IDA also understands the "__userpurge" calling convention. It is the same as __usercall, except that the callee cleans the stack.
The name used in the declaration is ignored by IDA.
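To make the location syntax concrete, here is a minimal sketch that pulls the return and argument locations out of a simple prototype. It is not IDA's parser: the function `parse_usercall` is hypothetical, and it deliberately handles only flat prototypes (no nested function pointers, no locations containing commas).

```python
import re

_PROTO = re.compile(
    r"\s*(.+?)\s*__user(?:call|purge)\s+(\w+)(?:@<([^>]*)>)?\s*\((.*)\)\s*;?\s*$"
)
_PARAM = re.compile(r"(.+?)(?:@<([^>]*)>)?$")


def parse_usercall(decl: str):
    """Split a simple __usercall/__userpurge prototype into its locations."""
    m = _PROTO.match(decl)
    if m is None:
        raise ValueError("not a simple __usercall/__userpurge prototype")
    _ret_type, name, ret_loc, params = m.groups()
    args = []
    for param in filter(None, (p.strip() for p in params.split(","))):
        pm = _PARAM.match(param)
        # An argument without @<...> is assumed to be on the stack.
        args.append((pm.group(1).strip(), pm.group(2) or "stack"))
    return {"name": name, "ret": ret_loc, "args": args}


info = parse_usercall("int __usercall func@<ebx>(int x, int y@<esi>);")
print(info)
```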
If the default calling convention is __golang then explicit specification of stack offsets is permitted.
Declarations may also carry attributes. For example:
__attribute__((format(printf,2,3)))
int myprnt(int id, const char *format, ...);
This declaration means that myprnt is a print-like function; the format string is the second argument and the variadic argument list starts at the third argument.
Below is the full list of attributes that can be handled by IDA. Please look up the details in the corresponding compiler help pages.
packed      pack structure/union fields tightly, without gaps
aligned     specify the alignment
noreturn    declare as not returning function
ms_struct   use Microsoft layout for the structure/union
format      possible formats: printf, scanf, strftime, strfmon
For data declarations, the following custom __attribute((annotate(X))) keywords have been added. They control the representation of numbers in the output:
__bin
unsigned binary number
__oct
unsigned octal number
__hex
unsigned hexadecimal number
__dec
signed decimal number
__sbin
signed binary number
__soct
signed octal number
__shex
signed hexadecimal number
__udec
unsigned decimal number
__float
floating point
__char
character
__segm
segment name
__enum()
enumeration member (symbolic constant)
__off
offset expression (a simpler version of __offset)
__offset()
offset expression
__strlit()
string
__stroff()
structure offset
__custom()
custom data type and format
__invsign
inverted sign
__invbits
inverted bitwise
__lzero
add leading zeroes
__tabform()
tabular form
The following additional keywords can be used in type declarations:
_BOOL1
a boolean type with explicit size specification (1 byte)
_BOOL2
a boolean type with explicit size specification (2 bytes)
_BOOL4
a boolean type with explicit size specification (4 bytes)
__int8
an integer with explicit size specification (1 byte)
__int16
an integer with explicit size specification (2 bytes)
__int32
an integer with explicit size specification (4 bytes)
__int64
an integer with explicit size specification (8 bytes)
__int128
an integer with explicit size specification (16 bytes)
_BYTE
an unknown type; the only known info is its size: 1 byte
_WORD
an unknown type; the only known info is its size: 2 bytes
_DWORD
an unknown type; the only known info is its size: 4 bytes
_QWORD
an unknown type; the only known info is its size: 8 bytes
_OWORD
an unknown type; the only known info is its size: 16 bytes
_TBYTE
10-byte floating point value
_UNKNOWN
no info is available
__pure
pure function: always returns the same value and does not modify memory in a visible way
__noreturn
function does not return
__usercall
user-defined calling convention; see above
__userpurge
user-defined calling convention; see above
__golang
golang calling convention
__swiftcall
swift calling convention
__spoils
explicit spoiled-reg specification; see above
__hidden
hidden function argument; this argument was hidden in the source code (e.g. 'this' argument in c++ methods is hidden)
__return_ptr
pointer to return value; implies hidden
__struct_ptr
was initially a structure value
__array_ptr
was initially an array
__unused
unused function argument
__cppobj
a c++ style struct; the struct layout depends on this keyword
__ptr32
explicit pointer size specification (32 bits)
__ptr64
explicit pointer size specification (64 bits)
__shifted
shifted pointer declaration
__high
high level prototype (does not explicitly specify hidden arguments like 'this', for example). This keyword may not be specified by the user, but IDA may use it to describe high level prototypes.
__bitmask
a bitmask enum, a collection of bit groups
Shifted pointers
Sometimes in binary code we can encounter a pointer to the middle of a structure. Such pointers usually do not exist in the source code but an optimizing compiler may introduce them to make the code shorter or faster.
Such pointers can be described using shifted pointers. A shifted pointer is a regular pointer with additional information about the name of the parent structure and the offset from its beginning. For example:
struct mystruct
{
char buf[16];
int dummy;
int value; // <- myptr points here
double fval;
};
int *__shifted(mystruct,20) myptr;
The above declaration means that myptr is a pointer to 'int' and if we decrement it by 20 bytes, we will end up at the beginning of 'mystruct'.
Please note that IDA does not limit parents of shifted pointers to structures. A shifted pointer after the adjustment may point to any type except 'void'.
Negative offsets are supported too. They mean that the pointer points to the memory before the structure.
When a shifted pointer is used with an adjustment, it will be displayed with the 'ADJ' helper function. For example, if we refer to the memory 4 bytes further, it can be represented like this:
ADJ(myptr)->fval
Shifted pointers are an improvement compared to the CONTAINING_RECORD macro because expressions with them are shorter and easier to read.
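The pointer arithmetic behind a shifted pointer can be reproduced outside IDA. The sketch below mirrors `mystruct` with Python's ctypes and assumes a typical ABI where `int` is 4 bytes; the variable names are illustrative only.

```python
import ctypes


class mystruct(ctypes.Structure):
    _fields_ = [
        ("buf", ctypes.c_char * 16),
        ("dummy", ctypes.c_int),
        ("value", ctypes.c_int),    # <- myptr points here
        ("fval", ctypes.c_double),
    ]


# The delta used in __shifted(mystruct,20) is the offset of 'value'.
SHIFT = mystruct.value.offset

obj = mystruct(value=7, fval=2.5)
myptr = ctypes.addressof(obj) + SHIFT     # address of obj.value

# Undoing the shift recovers the parent structure, which is what
# ADJ(myptr) expresses in the decompiler output.
parent = mystruct.from_address(myptr - SHIFT)

print(SHIFT)                          # 20
print(mystruct.fval.offset - SHIFT)   # 4: fval lies 4 bytes past myptr
print(parent.value, parent.fval)      # 7 2.5
```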
See also Set type command.
Scattered argument locations
00000000 struc_1 struc ; (sizeof=0xC)
00000000 c1 db ?
00000001 db ? ; undefined
00000002 s2 dw ?
00000004 c3 db ?
00000005 db ? ; undefined
00000006 db ? ; undefined
00000007 db ? ; undefined
00000008 i4 dd ?
0000000C struc_1 ends
If we have this function prototype:
void myfunc(struc_1 s);
the 64-bit GNU compiler will pass the structure like this:
RDI: c1, s2, and c3
RSI: i4
Since compilers can use such complex calling conventions, IDA needs some mechanism to describe them. Scattered argument locations are used for that. The above calling convention can be described like this:
void __usercall myfunc(struc_1 s@<0:rdi.1, 2:rdi^2.2, 4:rdi^4.1, 8:rsi.4>);
It reads:
1 byte at offset 0 of the argument is passed in byte 0 of RDI
2 bytes at offset 2 of the argument are passed in bytes 2 and 3 of RDI
1 byte at offset 4 of the argument is passed in byte 4 of RDI
4 bytes at offset 8 of the argument are passed starting from byte 0 of RSI
In other words, the following syntax is used:
argoff:register^regoff.size
where
argoff - offset within the argument
register - register name used to pass part of the argument
regoff - offset within the register
size - number of bytes
The regoff and size fields can be omitted if there is no ambiguity.
If the register is not specified, the expression describes a stack location:
argoff:^stkoff.size
where
argoff - offset within the argument
stkoff - offset in the stack frame (the first stack argument is at offset 0)
size - number of bytes
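Both forms of the syntax can be exercised with a small parser. This is a sketch written for illustration, not IDA's own parser: `Piece` and `parse_scattered` are hypothetical names, and numbers are assumed to be decimal as in the examples above.

```python
import re
from typing import NamedTuple, Optional


class Piece(NamedTuple):
    argoff: int               # offset within the argument
    register: Optional[str]   # None means a stack location
    locoff: Optional[int]     # regoff or stkoff; None if omitted
    size: Optional[int]       # number of bytes; None if omitted


# argoff:register^regoff.size   or   argoff:^stkoff.size
_PIECE = re.compile(r"^(\d+):(\w*)(?:\^(\d+))?(?:\.(\d+))?$")


def parse_scattered(spec: str):
    """Parse the comma-separated list found between @< and >."""
    pieces = []
    for part in spec.split(","):
        m = _PIECE.match(part.strip())
        if m is None:
            raise ValueError(f"bad location: {part!r}")
        argoff, reg, off, size = m.groups()
        pieces.append(Piece(int(argoff), reg or None,
                            int(off) if off else None,
                            int(size) if size else None))
    return pieces


pieces = parse_scattered("0:rdi.1, 2:rdi^2.2, 4:rdi^4.1, 8:rsi.4")
for p in pieces:
    print(p)
```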
Please note that while IDA checks the argument location specifiers for soundness, it cannot perform all checks and some wrong locations may be accepted. In particular, IDA in general does not know the register sizes and accepts any offsets within them and any sizes.
See also Set type command.
Data representation: enum member
Syntax:
__enum(enum_name)
Instead of a plain number, a symbolic constant from the specified enum will be used. The enum can be a regular enum or a bitmask enum. For bitmask enums, a bitwise combination of symbolic constants will be printed. If the value to print cannot be represented using the specified enum, it will be displayed in red.
Example:
enum myenum { A=0, B=1, C=3 };
short var __enum(myenum);
If `var` is equal to 1, it will be represented as "B"
Another example:
enum mybits __bitmask { INITED=1, STARTED=2, DONE=4 };
short var __enum(mybits);
If `var` is equal to 3, it will be represented as "INITED|STARTED"
This annotation is useful if the enum size is not equal to the variable size. Otherwise using the enum type for the declaration is better:
myenum var; // is 4 bytes, not 2 as above
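The two rendering rules can be mimicked in a few lines. This is a simplified model, not IDA's logic: it treats every member of a bitmask enum as an independent flag (IDA's bit groups are richer), and it returns None where IDA would display the value in red. The function names are hypothetical.

```python
def render_enum(value, members):
    """Return the name of the member equal to value, or None."""
    for name, member_value in members.items():
        if member_value == value:
            return name
    return None


def render_bitmask(value, members):
    """Return 'A|B|...' if value is fully covered by the flags, else None."""
    names, rest = [], value
    for name, bit in members.items():
        if bit and value & bit == bit:
            names.append(name)
            rest &= ~bit
    return "|".join(names) if names and rest == 0 else None


myenum = {"A": 0, "B": 1, "C": 3}
mybits = {"INITED": 1, "STARTED": 2, "DONE": 4}

print(render_enum(1, myenum))      # B
print(render_bitmask(3, mybits))   # INITED|STARTED
```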
Data representation: offset expression
Syntax:
__offset(type, base, tdelta, target)
__offset(type, base, tdelta)
__offset(type, base)
__offset(type|AUTO, tdelta)
__offset(type)
__off
where
type is one of:
OFF8 8-bit full offset
OFF16 16-bit full offset
OFF32 32-bit full offset
OFF64 64-bit full offset
LOW8 low 8 bits of 16-bit offset
LOW16 low 16 bits of 32-bit offset
HIGH8 high 8 bits of 16-bit offset
HIGH16 high 16 bits of 32-bit offset
The type can also be the name of a custom refinfo.
It can be combined with the following keywords:
RVAOFF based reference (rva)
PASTEND reference past an item
it may point to a nonexistent address
NOBASE forbid the base xref creation
implies that the base can be any value
nb: base xrefs are created only if the offset base
points to the middle of a segment
SUBTRACT the reference value is subtracted from the base value instead of
(as usual) being added to it
SIGNEDOP the operand value is sign-extended (only supported for
REF_OFF8/16/32/64)
NO_ZEROS an opval of 0 will be considered invalid
NO_ONES an opval of ~0 will be considered invalid
SELFREF the self-based reference
The base, target delta, and the target can be omitted. If the base is BADADDR, it can be omitted by combining the type with AUTO:
__offset(type|AUTO, tdelta)
For zero-based offsets without any additional attributes whose size corresponds to the current application target (e.g. REF_OFF32 for a 32-bit application), the short __off form can be used.
Examples:
A 64-bit offset based on the image base:
int var __offset(OFF64|RVAOFF);
A 32-bit offset based on 0 that may point to a non-existing address:
int var __offset(OFF32|PASTEND|AUTO);
A 32-bit offset based on 0x400000:
int var __offset(OFF32, 0x400000);
A simple zero based offset that matches the current application bitness:
int var __off;
This annotation is useful when the type of the pointed object is unknown or the variable size is different from the usual pointer size. Otherwise it is better to use a pointer:
type *var;
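As a rough illustration of how the base and the SUBTRACT keyword affect the address an offset refers to, consider the sketch below. It models only what the text states (the value is added to the base, or subtracted from it with SUBTRACT) and ignores tdelta and the other keywords. The plain bit split shown for HIGH16/LOW16 is an assumption about the simplest case, and both function names are hypothetical.

```python
def offset_target(opval: int, base: int = 0, subtract: bool = False) -> int:
    """Address referenced by an operand value for a given offset base."""
    return base - opval if subtract else base + opval


def split16(target: int):
    """Split a 32-bit offset into its (HIGH16, LOW16) halves."""
    return (target >> 16) & 0xFFFF, target & 0xFFFF


print(hex(offset_target(0x1234, base=0x400000)))               # 0x401234
print(hex(offset_target(0x10, base=0x400000, subtract=True)))  # 0x3ffff0
print([hex(part) for part in split16(0x00401234)])             # ['0x40', '0x1234']
```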
Data representation: string
Syntax:
__strlit(strtype, "encoding")
__strlit(strtype, char1, char2, "encoding")
__strlit(strtype)
where strtype is one of:
C Zero terminated string, 8 bits per symbol
C_16 Zero terminated string, 16 bits per symbol
C_32 Zero terminated string, 32 bits per symbol
PASCAL Pascal string: 1 byte length prefix, 8 bits per symbol
PASCAL_16 Pascal string: 1 byte length prefix, 16 bits per symbol
LEN2 Wide Pascal string: 2 byte length prefix, 8 bits per symbol
LEN2_16 Wide Pascal string: 2 byte length prefix, 16 bits per symbol
LEN4 Delphi string: 4 byte length prefix, 8 bits per symbol
LEN4_16 Delphi string: 4 byte length prefix, 16 bits per symbol
It may be followed by two optional string termination characters (only for C). Finally, the string encoding may be specified, as the encoding name or "no_conversion" if the string encoding was not explicitly specified.
Example:
A zero-terminated string in windows-1252 encoding:
char array[10] __strlit(C,"windows-1252");
A zero-terminated string in utf-8 encoding:
char array[10] __strlit(C,"UTF-8");
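The string layouts in the table can be decoded by hand with a short sketch. Several details here are assumptions and are not stated in the text: the length prefix is taken to be little-endian and to count symbols rather than bytes, custom termination characters are not modelled, and `decode_strlit` is a hypothetical helper.

```python
def decode_strlit(data: bytes, strtype: str = "C", encoding: str = "utf-8") -> str:
    """Decode raw bytes according to one of the strtype layouts."""
    layouts = {  # strtype -> (length prefix size, bytes per symbol)
        "C": (0, 1), "C_16": (0, 2), "C_32": (0, 4),
        "PASCAL": (1, 1), "PASCAL_16": (1, 2),
        "LEN2": (2, 1), "LEN2_16": (2, 2),
        "LEN4": (4, 1), "LEN4_16": (4, 2),
    }
    prefix, width = layouts[strtype]
    if prefix:
        # Length-prefixed: assumed little-endian count of symbols.
        count = int.from_bytes(data[:prefix], "little")
        body = data[prefix:prefix + count * width]
    else:
        # Zero-terminated: stop at the first all-zero symbol.
        end = 0
        while end + width <= len(data) and any(data[end:end + width]):
            end += width
        body = data[:end]
    return body.decode(encoding)


print(decode_strlit(b"caf\xe9\x00junk", "C", "windows-1252"))   # café
print(decode_strlit(b"\x03abcdef", "PASCAL"))                    # abc
```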
Data representation: structure offset
Syntax:
__stroff(structname)
__stroff(structname, delta)
Instead of a plain number, the name of a struct or union member will be used. If delta is present, it will be subtracted from the value before converting it into a struct/union member name.
Example:
An integer variable named `var` that holds an offset from the beginning of
the `mystruct` structure:
int var __stroff(mystruct);
If mystruct is defined like this:
struct mystruct
{
char a;
char b;
char c;
char d;
};
The value 2 will be represented as `mystruct.c`
Another example:
A structure offset with a delta:
int var __stroff(mystruct, 1);
The value 2 will be represented as `mystruct.d-1`
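For the plain form without a delta, the lookup amounts to mapping a number to the member located at that offset. The sketch below models only that first example; the delta form is intentionally left out, and the names `MYSTRUCT` and `render_stroff` are hypothetical.

```python
# Member offsets of mystruct: four consecutive 1-byte fields.
MYSTRUCT = {"a": 0, "b": 1, "c": 2, "d": 3}


def render_stroff(value, struct_name, member_offsets):
    """Return 'struct.member' for the member at the given offset, or None."""
    for member, offset in member_offsets.items():
        if offset == value:
            return f"{struct_name}.{member}"
    return None


print(render_stroff(2, "mystruct", MYSTRUCT))   # mystruct.c
```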
Data representation: custom data type and format
Syntax:
__custom(dtid, fid)
where dtid is the name of a custom data type and fid is the name of a custom data format. The custom type and format must be registered by a plugin beforehand, at the database opening time. Otherwise, custom data type and format ids will be displayed instead of names.
Data representation: tabular form
Syntax:
__tabform(flags)
__tabform(flags,lineitems)
__tabform(flags,lineitems,alignment)
__tabform(,lineitems,alignment)
__tabform(,,alignment)
This keyword is used to format arrays. The following flags are accepted:
NODUPS do not use the `dup` keyword
HEX use hexadecimal numbers to show array indexes
OCT use octal numbers to show array indexes
BIN use binary numbers to show array indexes
DEC use decimal numbers to show array indexes
It is possible to combine NODUPS with the index radix: NODUPS|HEX
The `lineitems` and `alignment` attributes have the meaning described for the create array command.
Example:
Display the array in tabular form, 4 decimal numbers on a line, each number
taking 8 positions. Display indexes as comments in hexadecimal:
char array[16] __tabform(HEX,4,8) __dec;
A possible array may look like:
dd 50462976, 117835012, 185207048, 252579084; 0
dd 319951120, 387323156, 454695192, 522067228; 4
dd 589439264, 656811300, 724183336, 791555372; 8
dd 858927408, 926299444, 993671480,1061043516; 0Ch
Without this annotation, the `dup` keyword is permitted, and the number of items on a line and the alignment are not defined.
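The listing above can be approximated with a short formatter. Two things here are assumptions rather than facts from the text: the sample values are taken to be the bytes 0..63 read as little-endian 32-bit words (which happens to reproduce the numbers shown), and a field width of 10 is used so that the widest value fits. The helpers `ida_hex` and `tabular` are hypothetical and only imitate the layout.

```python
import struct


def ida_hex(n: int) -> str:
    """Format an index like the comments in the listing: 0, 4, 8, 0Ch."""
    if n < 10:
        return str(n)
    text = f"{n:X}h"
    return text if text[0].isdigit() else "0" + text


def tabular(values, lineitems, width, directive="dd"):
    """Lay out values with lineitems per line, right-aligned to width."""
    lines = []
    for start in range(0, len(values), lineitems):
        row = ",".join(f"{v:>{width}}" for v in values[start:start + lineitems])
        lines.append(f"{directive} {row}; {ida_hex(start)}")
    return lines


values = struct.unpack("<16I", bytes(range(64)))
lines = tabular(values, lineitems=4, width=10)
print("\n".join(lines))
```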
See also Edit submenu.