X86架构32位和64位的STACK FRAME结构差异比较

简介:

理解过程来自以下URL:

http://eli.thegreenplace.net/2011/02/04/where-the-top-of-the-stack-is-on-x86/

和另一个地址,忘了。。。

C代码:

int foobar(int a, int b, int c)
{
    int xx = a + 2;
    int yy = b + 3;
    int zz = c + 4;
    int sum = xx + yy + zz;

    return xx * yy * zz + sum;
}

int main()
{
    return foobar(77, 88, 99);
}

STACK FRAME,栈帧结构图示,以EBP作定位:

 

MASM汇码代码对比:

_foobar:
    ; ebp must be preserved across calls. Since
    ; this function modifies it, it must be
    ; saved.
    ;
    push    ebp

    ; From now on, ebp points to the current stack
    ; frame of the function
    ;
    mov     ebp, esp

    ; Make space on the stack for local variables
    ;
    sub     esp, 16

    ; eax <-- a. eax += 2. then store eax in xx
    ;
    mov     eax, DWORD PTR [ebp+8]
    add     eax, 2
    mov     DWORD PTR [ebp-4], eax

    ; eax <-- b. eax += 3. then store eax in yy
    ;
    mov     eax, DWORD PTR [ebp+12]
    add     eax, 3
    mov     DWORD PTR [ebp-8], eax

    ; eax <-- c. eax += 4. then store eax in zz
    ;
    mov     eax, DWORD PTR [ebp+16]
    add     eax, 4
    mov     DWORD PTR [ebp-12], eax

    ; add xx + yy + zz and store it in sum
    ;
    mov     eax, DWORD PTR [ebp-8]
    mov     edx, DWORD PTR [ebp-4]
    lea     eax, [edx+eax]
    add     eax, DWORD PTR [ebp-12]
    mov     DWORD PTR [ebp-16], eax

    ; Compute final result into eax, which
    ; stays there until return
    ;
    mov     eax, DWORD PTR [ebp-4]
    imul    eax, DWORD PTR [ebp-8]
    imul    eax, DWORD PTR [ebp-12]
    add     eax, DWORD PTR [ebp-16]

    ; The leave instruction here is equivalent to:
    ;
    ;   mov esp, ebp
    ;   pop ebp
    ;
    ; Which cleans the allocated locals and restores
    ; ebp.
    ;
    leave
    ret

 

而64位的架构为了提高性能速度,作FUNCTION CALL时将6个变量直接放在寄存器,而非通过内存。

而且,在STACK FRAME下方有128BYTE的REDZONE。呵呵,作啥用的没有深研,反正RETURN时会被清空回收。。。

理解来自同一作者:

 

Argument passing

 

I’m going to simplify the discussion here on purpose and focus on integer/pointer arguments [3]. According to the ABI, the first 6 integer or pointer arguments to a function are passed in registers. The first is placed in rdi, the second in rsi, the third in rdx, and then rcx, r8 and r9. Only the 7th argument and onwards are passed on the stack.

long myfunc(long a, long b, long c, long d,
            long e, long f, long g, long h)
{
    long xx = a * b * c * d * e * f * g * h;
    long yy = a + b + c + d + e + f + g + h;
    long zz = utilfunc(xx, yy, xx % yy);
    return zz + 20;
}

This is the stack frame:

long utilfunc(long a, long b, long c)
{
    long xx = a + 2;
    long yy = b + 3;
    long zz = c + 4;
    long sum = xx + yy + zz;
 
    return xx * yy * zz + sum;
}

This is indeed a leaf function. Let’s see how its stack frame looks when compiled with gcc:

关于红色128个BYTE的区域,没有细看,下次有时间再看吧。

列出解释:

The red zone

First I’ll quote the formal definition from the AMD64 ABI:

The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone.

Put simply, the red zone is an optimization. Code can assume that the 128 bytes below rsp will not be asynchronously clobbered by signals or interrupt handlers, and thus can use it for scratch data, without explicitly moving the stack pointer. The last sentence is where the optimization lays – decrementing rsp and restoring it are two instructions that can be saved when using the red zone for data.

However, keep in mind that the red zone will be clobbered by function calls, so it’s usually most useful in leaf functions (functions that call no other functions).

目录
相关文章
【各种问题处理】X86架构和ARM架构的区别
【1月更文挑战第13天】【各种问题处理】X86架构和ARM架构的区别
|
4月前
|
边缘计算 编译器 数据中心
X86架构与Arm架构的主要区别分析
X86架构与Arm架构的主要区别分析
463 0
|
7月前
|
Ubuntu 关系型数据库 MySQL
M1 macos docker获取x86 x64 amd 等指定架构版本linux ubuntu mysql 容器并启动容器
M1 macos docker获取x86 x64 amd 等指定架构版本linux ubuntu mysql 容器并启动容器
|
8天前
|
安全 Shell 网络安全
抄个冷板凳---x86架构MS-17010漏洞的多重利用方法
x86架构永恒之蓝漏洞的多重利用复现演示
|
1月前
|
存储 机器学习/深度学习 并行计算
阿里云服务器X86计算、Arm计算、GPU/FPGA/ASIC、高性能计算架构区别
在我们选购阿里云服务器的时候,云服务器架构有X86计算、ARM计算、GPU/FPGA/ASIC、弹性裸金属服务器、高性能计算可选,有的用户并不清楚他们之间有何区别,本文主要简单介绍下不同类型的云服务器有何不同,主要特点及适用场景有哪些。
阿里云服务器X86计算、Arm计算、GPU/FPGA/ASIC、高性能计算架构区别
|
2月前
|
存储 缓存 物联网
DP读书:鲲鹏处理器 架构与编程(二)服务器与处理器——高性能处理器的并行组织结构、ARM处理器
DP读书:鲲鹏处理器 架构与编程(二)服务器与处理器——高性能处理器的并行组织结构、ARM处理器
249 0
|
7月前
|
弹性计算 运维 Cloud Native
阿里云罗晶分享 | X86+ARM,容器服务 ACK 多架构应用部署最佳实践
2023年8月31日,系列课程第五节《X86+ARM,容器服务ACK多架构应用部署最佳实践》正式上线,由阿里云云原生应用平台产品专家罗晶主讲,内容涵盖:容器服务ACK简介;ACK支持倚天ARM实例;ACK多架构应用部署最佳实践。
|
5月前
|
监控 关系型数据库 MySQL
银河麒麟V10 SP3 X86 二进制文件部署 mysql-5.7.29 GTID 半同步复制的双主架构
银河麒麟V10 SP3 X86 二进制文件部署 mysql-5.7.29 GTID 半同步复制的双主架构
125 1
|
6月前
|
弹性计算 Java 数据库连接
架构设计第七讲:数据巡检系统之daily&线上表结构自动化比对
架构设计第七讲:数据巡检系统之daily&线上表结构自动化比对
|
7月前
|
安全 Java 编译器
JDK21更新内容:舍弃对x86架构32位系统支持
JDK21更新内容:舍弃对x86架构32位系统支持