Basic knowledge
An article to understand the C language function call stack - a must for advanced programmers
Everyone knows that function calls are implemented through the stack, and that the local variables of the function are stored in the stack. But the implementation details of the stack may not necessarily be clear. This article will introduce how the function stack is implemented on the Linux platform. Some students may feel that there is no need to understand so in-depth, but it is not. According to the experience of this number for many years, understanding the deep-level principles of the system is very helpful for analyzing difficult problems.
Figure 0 Function stack
Just as being familiar with packet capture is an advanced weapon for solving network communication problems, being familiar with the function call stack is an advanced weapon for analyzing program memory problems. This article takes the development of C language under the Linux 64-bit operating system as an example to introduce the implementation principle of the application call stack, and analyzes the content of the call stack of a program through an example and GDB tool. Before introducing the specific call stack, let's introduce some basic knowledge, which is the basis for understanding the subsequent function call stack.
Registers of X86 CPU
The registers of the CPU are the basic knowledge that needs to be understood, because in the X64 system, the parameters of the functions are passed through registers. Figure 1 is a list of X86 CPU registers and a brief description of their functions.
Figure 1 Intel X86 CPU register usage
We know that Intel's CPUs are designed to be forward compatible, that is, a new generation of CPUs can run programs compiled on an older generation of CPUs. To ensure compatibility, newer generations of CPUs retain aliases for older generation registers. Take the 16-bit register AX as an example, AL represents the lower 8 bits, and AH represents the upper 8 bits. After the advent of the 32-bit CPU, the 32-bit register is represented by a register named EAX, and AX is still reserved. And so on, RAX represents a 64-bit register.
Figure 2 Different register names
application address space
The operating system provides a uniform memory map address for all applications by means of virtual memory. As shown in Figure 3, from top to bottom are the user stack, shared library memory, runtime heap and code segment. Of course, this is an approximate segmentation, and the actual segmentation may be slightly more complicated than this, but the overall pattern has not changed much.
Figure 3 The address space of the application
It can be seen from the figure that the user stack grows from top to bottom. That is, the user stack will occupy the high address space first, and then occupy the low address space. At present, we can have a general understanding, and we will analyze the details of the user stack in detail later.
Function calls and assembly instructions
In order to understand the details of the function call stack, it is necessary to understand the implementation of function calls in assembler. The function call is mainly divided into two parts, one is the call and the other is the return. In assembly language, the function call is done through the call instruction, and the return is through the ret instruction.
The call instruction in assembly language is equivalent to performing two steps, namely, 1) push the current IP or CS and IP into the stack; 2) jump, similar to the jmp instruction. Similarly, the ret instruction is also divided into 2 steps, namely, 1) pop the address in the stack to the IP register; 2) jump to execute subsequent instructions. This is basically the principle of function calls.
In addition to jumping between code, function calls often need to pass a parameter, and there may be a return value after processing is complete. The transfer of these data is carried out through registers. The parameters are stored through the registers described above before the function call, and the return result is stored through the RAX register (EAX for 32-bit systems) before the function returns.
Another important point of knowledge is the stack-related registers RSP and RBP in the function call process. The two registers mainly record the stack location. The specific functions are as follows:
RSP: stack pointer register (reextended stack pointer), which stores a pointer, which always points to the top of the stack frame at the top of the system stack.
RBP: base address pointer register (reextended base pointer), which stores a pointer, which always points to the bottom of the top stack frame of the system stack.
The names of the registers are related to the architecture. This article is a 64-bit system, so the registers are RSP and RBP. If it is a 32-bit system, the names of the registers are ESP and EBP.
application call stack
Let's take a look at the main content of the function call stack as a whole, as shown in Figure 4. The function stack mainly includes the function parameter table, the local variable table, the base address of the stack and the function return address. The base address of the stack here is the base address of the previous stack frame, because this base address needs to be used to access the contents of the stack in this function, so the base address in the previous stack frame needs to be pushed onto the stack first.
Figure 4 Overview of the function call stack
For ease of understanding, we take a specific program as an example. This program is very simple, mainly simulating the function calling relationship and parameter passing of multiple functions. In addition, 2 formal parameters are defined in the function func_2 to simulate the process of multi-parameter passing.
Figure 5 Function stack assembly analysis
In this example, the main function calls the func_1 function. We start the analysis from the main function, you can first look at the C language code on the right. The first is the preparation process of function parameters. When the main function calls func_1, the parameters passed in in turn are 1, 2, 3, and 4+g, and the last parameter needs to be calculated. According to the dotted line of the red box, we can see the corresponding assembler. In the assembler, the last parameter is processed first, then the second to last, and so on (the processing order of function parameters needs to be paid attention to in daily development. content focus). At the same time, we see that the name of the register that stores the parameter is consistent with the previous one.
When the parameters are prepared, the function func_1 is called, which is the line of call func_1 in assembly language. Although it is only a line of assembly instructions, it actually does some things internally. We introduced this when we introduced the call instruction in the previous article. You can refer to the previous article.
Then enter the processing logic of the func_1 function. The first is the pushq %rbp assembler. The function of this instruction is to push RBP into the function stack. The value of the push stack and the following update RBP (moveq %rsp, %rbp) is the stack frame header for constructing this function, and subsequent access to the contents of this stack frame is performed through the frame header (RBP). Next is the process of parameter stacking and the process of initializing local variables. For the specific distribution, refer to the green box and the red box in Figure 5.
After completing the operation in the function, finally put the operation result into the register EAX, and then call the instructions leave and ret. What needs to be explained here is the leave instruction, which is equivalent to the following two assembly instructions. You can compare the assembly instructions of the function entry. In fact, the two are symmetrical. The leave instruction assigns the stack base address of this frame to the stack pointer (step 2 in Figure 6), and then pops the content into the RBP (step 3 in Figure 6). In fact, RBP points to the stack frame of the previous frame (caller), which is a restoration process.
movl %ebp %esppopl %ebp
Figure 6 Schematic diagram of function return
In this way, after the function returns, the registers RBP and RSP are switched from the callee's stack frame to the caller's stack frame.
Analyze function call stacks with GDB
The above is to analyze the call stack and stack frame of the function by disassembling. We can also dynamically analyze the usage of function stacks and stack frames through gdb. We still use the main function to call the func_1 function as an example to analyze. Here we set a single point at the entry of the function func_1, then run the program, and the program stops at the breakpoint. As shown in Figure 7, we gradually execute the change process of the function stack. We will not repeat the details here. You can actually operate it.
Figure 7 Function stack change process
The purpose of this article is to let everyone have an overall understanding of the function call stack, so that there are more solutions to the intractable problems of the program in the future. Because there are many stack-related problems in the actual production environment, such as stack overflow caused by too many local variables, or stack destruction caused by stepping on memory problems, etc. Therefore, after understanding the principle of the function stack, you will have new ideas when encountering so-called inexplicable problems. Often many problems are not because the problem itself is inexplicable, but because our knowledge reserve is not enough, and we feel inexplicable.