Parameter passing in C function calls
The x86 instruction set began with the 16-bit 8086 and has been evolving for more than 40 years. The 64-bit x86-64 instruction set is now in wide use: registers have grown from 16 bits to 64 bits, and the number of registers has also increased. gcc has had to keep pace with these changes, and in the 64-bit era the way C passes parameters in function calls has changed considerably. This article verifies that change by compiling C programs to assembly language, so it assumes some knowledge of both C and assembly. All examples were verified on Ubuntu 20.04 with gcc 9.4.0.
1. Parameter passing for calling functions in C language under 32-bit.
- In the 32-bit era, C passed function parameters on the stack.
- Before the call instruction executes, the parameters are pushed onto the stack in reverse order. For example, for func(1, 2, 3), 3 is pushed first, then 2, then 1.
- When the call instruction executes, it first pushes the return address (the address of the instruction that follows the call) onto the stack, then sets the eip register to the starting address of the function. This completes the call.
- When the function executes the ret instruction, the return address that call pushed is popped from the stack into the eip register, and execution resumes in the caller.
- Apart from the register width, this process is basically the same as on the 8086.
- We use a simple C program to verify the statements above. The program file name is param1.c.
- We compile this program to 32-bit assembly language with a few extra options, whose purpose is to remove debugging information and make the assembly code cleaner.
- If your 64-bit computer cannot compile 32-bit programs, you may need to install 32-bit support first (on Ubuntu this is typically the gcc-multilib package).
- Let's see what the compiled 32-bit assembly code looks like.
- The assembly code for the main function begins on line 26. Lines 31-35 call func1(3, 5, str): line 32 pushes the third parameter, str, onto the stack, line 33 pushes the second parameter, 5, and line 34 pushes the first parameter, 3.
- There are two points to note in this assembly code. First, the parameters are pushed in the reverse of the order in which they appear in the call: the third parameter first, then the second, and finally the first. Second, for integer parameters the integer value itself is pushed onto the stack, not the address where the value is stored.
- Lines 5-18 are the assembly code for the func1 function. Line 9 reserves 16 bytes on the stack by moving the stack pointer. This area holds the variables defined in func1, that is, the three variables i1, j1 and str in the C code: (ebp - 12) holds i1, (ebp - 8) holds j1, and (ebp - 4) holds str.
- Line 10 reads the first parameter from the stack into the eax register. Line 12 reads the second parameter from the stack, and line 14 reads the third. The parameters are read in place relative to ebp; they are not popped.
- The following diagram describes the changes in the stack before and after calling the function func1.
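As a rough companion to the walkthrough above, here is a minimal sketch of a program with the same shape as param1.c. It is a hypothetical stand-in and not the original listing: the function bodies, the string literal, the caller name call_func1 (playing the role of main) and the build command in the comment are all assumptions made for illustration. The comment also lists the stack layout that the 32-bit convention described above is expected to produce inside func1.

```c
/*
 * Hypothetical stand-in for param1.c (not the article's original listing).
 * Assumed build command for 32-bit assembly output:
 *   gcc -m32 -S -fno-asynchronous-unwind-tables param1.c -o param1.s
 *
 * Expected stack layout inside func1, after "pushl %ebp; movl %esp, %ebp":
 *   16(%ebp)   third parameter   (pushed first by the caller)
 *   12(%ebp)   second parameter
 *    8(%ebp)   first parameter   (pushed last by the caller)
 *    4(%ebp)   return address    (pushed by the call instruction)
 *    0(%ebp)   saved ebp
 *   negative offsets: local variables of func1
 */
#include <string.h>

/* Copies each parameter into a local, as the walkthrough describes,
 * and returns a value so that the copies are actually used. */
int func1(int i, int j, const char *s)
{
    int i1 = i;             /* copy of the first parameter  */
    int j1 = j;             /* copy of the second parameter */
    const char *str = s;    /* copy of the third parameter  */

    return i1 + j1 + (int)strlen(str);
}

/* Plays the role of main in the walkthrough: calls func1(3, 5, str). */
int call_func1(void)
{
    const char *str = "Hello";

    return func1(3, 5, str);
}
```

Because the file is only compiled to assembly with -S, it does not need a main of its own. In the generated code for call_func1, look for the three arguments being placed on the stack just before the call to func1.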
2. Parameter passing when calling a function in C language under 64-bit.
- In the 64-bit era, the number of general-purpose registers grew from 6 in 32-bit mode (not counting eip, esp, ebp) to 14 in 64-bit mode (not counting rip, rsp, rbp).
- To make function calls faster, gcc on Linux follows the System V AMD64 calling convention, which passes parameters in registers where possible instead of on the stack as the 32-bit code does.
- When 6 or fewer integer or pointer parameters are passed, they go in the six registers rdi, rsi, rdx, rcx, r8, r9, in that order.
- When more than 6 parameters are passed, the first 6 go in registers and the remaining ones are still passed on the stack, as in 32-bit code.
- This rule suggests that, when writing C programs, it is worth keeping the number of function parameters to 6 or fewer where practical.
- Again we use a simple program to verify the statements above. The program file name is param2.c.
- We compile this program to 64-bit assembly language. Compared with the 32-bit build, the compile options add -fno-stack-protector, which disables stack protection, again to make the assembly code cleaner.
- Let's see what the compiled 64-bit assembly code looks like.
- As the comments in the assembly code show, when main prepares the parameters for the call to func2, it passes the 7th and 8th parameters on the stack (lines 48 and 49) and the first 6 parameters in rdi, rsi, rdx, rcx, r8, r9 (lines 50-55).
- As in the 32-bit assembly code, an integer parameter is passed by value: the integer itself is pushed onto the stack or placed in a register, not a pointer to it.
- The assembly code of func2 (lines 9-28) reads the parameters according to the same rules that main uses to pass them.
- Again, when writing C programs for x86-64, keeping the number of function parameters to 6 or fewer lets them all travel in registers, which can improve performance.
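The register rules above can be tried with a sketch like the following. It is a hypothetical stand-in for param2.c and not the original listing: the eight-parameter signature, the function bodies, the caller name call_func2 (playing the role of main) and the build command in the comment are assumptions. The register names in the comments are what the System V AMD64 convention prescribes for int parameters; the generated assembly should be checked against them rather than taken on trust.

```c
/*
 * Hypothetical stand-in for param2.c (not the article's original listing).
 * Assumed build command for 64-bit assembly output:
 *   gcc -S -fno-asynchronous-unwind-tables -fno-stack-protector \
 *       param2.c -o param2.s
 */

/*
 * Eight int parameters. Expected locations on entry:
 *   a1 edi, a2 esi, a3 edx, a4 ecx, a5 r8d, a6 r9d
 *   (the 32-bit halves of rdi, rsi, rdx, rcx, r8, r9),
 *   a7 and a8 on the stack.
 * The result packs the parameters into decimal digits so that a swapped
 * or missing parameter would change the returned value.
 */
long func2(int a1, int a2, int a3, int a4,
           int a5, int a6, int a7, int a8)
{
    long v = a1;

    v = v * 10 + a2;
    v = v * 10 + a3;
    v = v * 10 + a4;
    v = v * 10 + a5;
    v = v * 10 + a6;
    v = v * 10 + a7;    /* seventh parameter, read from the stack */
    v = v * 10 + a8;    /* eighth parameter, read from the stack  */
    return v;
}

/* Plays the role of main in the walkthrough: six register parameters
 * followed by two stack parameters. */
long call_func2(void)
{
    return func2(1, 2, 3, 4, 5, 6, 7, 8);
}
```

In the generated code for call_func2, the constants 8 and 7 should be placed on the stack and the constants 1 to 6 loaded into the six registers before the call to func2. Removing the last two parameters and recompiling is a quick way to see the stack traffic disappear.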
