Last Updated: 2023-07-20 03:55:12 Thursday
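The best way to start learning assembly is to get a clear picture of the CPU's registers.
x64 is an extension of x86, so the figure above also includes the x86 registers, although it does not show the YMM registers.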
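The x86 general-purpose registers are eax, ebx, ecx, edx, edi, and esi.
Most instructions can use any of these registers freely, but some instructions require particular registers for particular roles. The signed division instruction idivl is one example: the 64-bit dividend is taken from the register pair edx:eax, so edx must hold the sign extension of eax (normally set up with cltd/cdq; it is zero only when the dividend is non-negative or when the unsigned divl is used), the divisor may be in any register, the quotient is stored in eax (overwriting the dividend), and the remainder is stored in edx.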
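The implicit register roles of idivl can be sketched in portable C. This is only a model under stated assumptions, not CPU code: the function names model_idiv32 and model_cdq are hypothetical, the two pointers stand in for eax and edx, and a false return stands in for the divide-error exception the processor would raise.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of cdq/cltd: fill edx with copies of eax's sign bit. */
uint32_t model_cdq(uint32_t eax)
{
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Hypothetical model of the 32-bit idiv: divides the 64-bit value edx:eax
 * by divisor, leaving the quotient in *eax and the remainder in *edx.
 * Returns false where the real instruction would fault (#DE). */
bool model_idiv32(uint32_t *eax, uint32_t *edx, int32_t divisor)
{
    /* The dividend is formed from both registers, edx being the high half. */
    int64_t dividend = (int64_t)(((uint64_t)*edx << 32) | *eax);

    if (divisor == 0)
        return false;                       /* division by zero */
    if (dividend == INT64_MIN && divisor == -1)
        return false;                       /* avoid overflow in C as well */

    /* C division truncates toward zero, which matches idiv. */
    int64_t quotient  = dividend / divisor;
    int64_t remainder = dividend % divisor;

    if (quotient > INT32_MAX || quotient < INT32_MIN)
        return false;                       /* quotient does not fit in eax */

    *eax = (uint32_t)quotient;
    *edx = (uint32_t)remainder;
    return true;
}
```

With eax holding -7 and edx set by model_cdq, dividing by 2 leaves -3 in eax and -1 in edx. If edx were zeroed instead, the same bit pattern would be treated as the large positive number 4294967289, which is why sign extension matters for signed division.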
The mandatory or implicit use of specific registers by some instructions is a legacy design pattern that dates back to the 8086, ostensibly to improve code density. What this means from a modern programming perspective is that certain register usage conventions tend to be observed when writing x86-32 assembly code.
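The x86 special-purpose registers are ebp, esp, eip, and eflags. eip is the program counter. eflags holds the flag bits produced during computation, including carry, overflow, zero, and sign, which the x86 documentation calls CF, OF, ZF, and SF. ebp and esp maintain the stack frame during function calls.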
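Normally esp points to the top of the stack (the low address), while ebp is the frame pointer and points to the base of the current stack frame (the high address). Every invocation of every function has its own stack frame, which holds the information the function needs while it runs. Subtracting from esp grows the stack.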
The x64 architecture is a backward-compatible extension of x86. It provides a legacy 32-bit mode, which is identical to x86, and a new 64-bit mode.
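Because of the remarkable backward compatibility of x86_64 (which may not be strictly necessary from a purely technical standpoint), the same register can be accessed as 8, 16, 32, or 64 bits. Taking the accumulator as an example, these views are ah/al, ax, eax, and rax, as shown in the figure below: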
The 64-bit registers in x64 all start with r: rax, rbx, rcx, rdx, rsi, rdi, rbp, and rsp. x64 also adds r8 through r15, which are ordinary registers that can be used freely.
Operations that output to a 32-bit subregister are automatically zero-extended to the entire 64-bit register. Operations that output to 8-bit or 16-bit subregisters are not zero-extended and leave the remaining bits unchanged (this is compatible with x86 behavior).
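The difference between 32-bit writes and 8-bit or 16-bit writes can be modeled with plain integer masking. The sketch below is an illustration only: the helper names (write32, write16, write8_low, write8_high) are hypothetical, and each takes the old 64-bit register value and returns the value the register would hold after the write.

```c
#include <stdint.h>

/* Model of a write to a 32-bit subregister (e.g. mov eax, v):
 * the upper 32 bits of the full register are cleared. */
uint64_t write32(uint64_t reg, uint32_t v)
{
    (void)reg;                 /* the old contents do not survive */
    return (uint64_t)v;
}

/* Model of a write to a 16-bit subregister (e.g. mov ax, v):
 * bits 63..16 keep their old value. */
uint64_t write16(uint64_t reg, uint16_t v)
{
    return (reg & ~UINT64_C(0xFFFF)) | v;
}

/* Model of a write to a low 8-bit subregister (e.g. mov al, v):
 * bits 63..8 keep their old value. */
uint64_t write8_low(uint64_t reg, uint8_t v)
{
    return (reg & ~UINT64_C(0xFF)) | v;
}

/* Model of a write to a legacy high 8-bit subregister (e.g. mov ah, v):
 * only bits 15..8 change. */
uint64_t write8_high(uint64_t reg, uint8_t v)
{
    return (reg & ~UINT64_C(0xFF00)) | ((uint64_t)v << 8);
}
```

Starting from 0x1122334455667788, write32 with 0xAABBCCDD yields 0x00000000AABBCCDD, while write16 with 0xBEEF yields 0x112233445566BEEF.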
The high 8 bits of ax, bx, cx, and dx are still addressable as ah, bh, ch, and dh, but in 64-bit mode they cannot be used with all types of operands.
Floating-point and SIMD registers in the x64 architecture:
By definition, an executing task must use the computational resources provided by the core execution unit. Using the x87 FPU or any of the SIMD execution units is optional.
The traditional uses of these general-purpose registers are listed below (they are conventions, not requirements):
RAX: traditionally the accumulator register. Returning function results in RAX is part of the calling convention.
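RAX is a 64-bit register that can be accessed in parts. Operating on EAX operates on the low 32 bits of RAX. In the same way, AX is the low 16 bits of RAX, AH is the high 8 bits of that low 16, and AL is the low 8 bits. Apart from RIP, which cannot be split, the other registers can be split in a similar way (a high-byte alias such as AH exists only for RAX, RBX, RCX, and RDX). The new r8-r15 can also be split, but they use a different naming scheme.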
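As a rough illustration of how these names overlap, the following C helpers extract each view from a 64-bit value standing in for RAX. The function names are hypothetical and the code only models the bit layout; it does not touch any real register.

```c
#include <stdint.h>

/* EAX: the low 32 bits of RAX. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)rax; }

/* AX: the low 16 bits of RAX. */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)rax; }

/* AH: bits 15..8, the high byte of AX. */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }

/* AL: bits 7..0, the low byte of AX. */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)rax; }
```

For rax = 0x1122334455667788 the views are eax = 0x55667788, ax = 0x7788, ah = 0x77, and al = 0x88.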
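In assembly compiled from C, a function never preserves rax for its caller, because rax carries the return value (floating-point return values are the exception and go in xmm0). Which of the other general-purpose registers a function must preserve depends on the calling convention; under the System V AMD64 ABI the callee-saved registers are rbx, rbp, rsp, and r12-r15.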
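The return 0 at the end of main is often compiled to xor eax, eax. Because a 32-bit write zero-extends, this also clears the upper half of rax.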
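RBX: traditionally the base register, used as the base address for memory accesses. One of the general-purpose registers.
RCX: traditionally the counter register, used for loop counts. The loop instruction uses this register implicitly.
RDX: the data register. One of the general-purpose registers.
RSI: the source index register, commonly used as the source address pointer in string operations.
RDI: the destination index register, commonly used as the destination pointer in string operations.
RSP: the stack pointer register, which points to the top of the stack.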
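The call and ret instructions also modify this register, as well as RIP.
call addr: push the return address on the stack, then jump to the function at addr
ret: pop the return address from the stack and jump to that address
A related term is shadow space: in the Microsoft x64 calling convention, the caller reserves 32 bytes on the stack, just above the return address, where the callee may store its four register arguments.
Application programs can also use the stack to pass function arguments and store temporary data. Register ESP always points to the stack's top-most item. While it is possible to use the ESP register as a general-purpose register, such use is impractical and strongly discouraged. The same holds for RSP: it is nominally general-purpose, but it should not be used for anything else.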
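The effect of call and ret on the stack pointer and the instruction pointer can be sketched with a toy machine. Everything here is an assumption made for illustration: the Machine struct, the 512-byte stack, and the function names are hypothetical, rsp is a byte offset into a small array rather than a real address, and bounds checks are omitted for brevity.

```c
#include <stdint.h>

#define STACK_BYTES 512

/* Toy machine state: a small stack plus the two registers involved. */
typedef struct {
    uint64_t mem[STACK_BYTES / 8]; /* stack memory in 8-byte slots */
    uint64_t rsp;                  /* byte offset of the top-most item */
    uint64_t rip;                  /* address of the next instruction */
} Machine;

void machine_init(Machine *m, uint64_t entry)
{
    m->rsp = STACK_BYTES;          /* empty stack: rsp at the high end */
    m->rip = entry;
}

/* push: the stack grows toward lower addresses, so rsp decreases first. */
void push64(Machine *m, uint64_t value)
{
    m->rsp -= 8;
    m->mem[m->rsp / 8] = value;
}

/* pop: read the top-most item, then move rsp back up. */
uint64_t pop64(Machine *m)
{
    uint64_t value = m->mem[m->rsp / 8];
    m->rsp += 8;
    return value;
}

/* call: push the address of the instruction after the call, then jump. */
void do_call(Machine *m, uint64_t return_addr, uint64_t target)
{
    push64(m, return_addr);
    m->rip = target;
}

/* ret: pop the saved return address into rip. */
void do_ret(Machine *m)
{
    m->rip = pop64(m);
}
```

After do_call the stack pointer is 8 bytes lower and rip holds the target; after the matching do_ret both are back where the caller expects them.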
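RBP: the base pointer register, generally used to hold the address of the bottom of the stack frame, that is, where the current stack frame begins.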
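Compilers often generate code like the following for a function:
push rbp ; at function entry, save the caller's rbp on the stack
mov rbp, rsp ; before growing the stack, record rsp as the base of the new stack frame
; ...
mov rsp, rbp ; at the end, restore rsp
pop rbp ; restore the caller's rbp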
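In hand-written assembly, rbp is convenient as a fixed anchor inside a function, but compilers have long since stopped relying on this pattern (gcc omits the frame pointer in optimized x64 builds, which corresponds to -fomit-frame-pointer). Some articles describe this as a more aggressive optimization: the prologue and epilogue no longer need the push and pop, and rbp is freed up as one more register, which can speed the code up further.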
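R8, R9, R10, ... , R15 are general-purpose registers. They can generally be used freely and have no designated role.
These general-purpose registers added by x64 can also be split, but the sub-register names follow a different rule from the legacy registers. The 32-bit sub-register takes a D suffix (DWORD), the 16-bit one a W suffix (WORD), and the low 8-bit one a B suffix (BYTE). There is no high-8-bit sub-register.
Note the resulting names: %r14d, %r14w, %r14b. Some Intel-style sources write the low byte as r14l instead, but there is no r8h.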
There are odd limitations accessing the byte registers due to encoding constraints in the REX opcode prefix used for the new registers: an instruction cannot reference a legacy high byte (AH, BH, CH, DH) and one of the new byte registers at the same time (such as R11B), but it can use legacy low bytes (AL, BL, CL, DL). This is enforced by reinterpreting the encodings of (AH, CH, DH, BH) as (SPL, BPL, SIL, DIL) for instructions using a REX prefix.
The names r0 -- r7 belong to the x87/MMX registers...
The description column on the far left is not important... do not be misled by it...
All 16 registers above are general-purpose registers!!
RIP: the instruction pointer. It cannot be written directly and cannot be split.
The instruction pointer always points to the next instruction address and is automatically set by the CPU; you can't manually write it. On x86-64 you can read the value of the instruction pointer directly, but on 32-bit x86 you can't even do that.
The EIP register is implicitly manipulated by control-transfer instructions. For example, the call (Call Procedure) instruction pushes the contents of the EIP register onto the stack and transfers program control to the address designated by the specified operand. The ret (Return from Procedure) instruction transfers program control by popping the top-most item off the stack into the EIP register.
x64 provides a new rip-relative addressing mode. Instructions that refer to a single constant address are encoded as offsets from rip. For example, the mov rax, [addr] instruction is encoded with a signed 32-bit displacement and moves the 8 bytes beginning at rip + displacement to rax, where rip is the address of the following instruction.
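The address arithmetic behind rip-relative addressing is small enough to write out. The sketch below assumes the usual encoding with a signed 32-bit displacement measured from the address of the next instruction; the function names and the example addresses are hypothetical, and the reverse helper assumes the target lies within the 32-bit displacement range.

```c
#include <stdint.h>

/* Effective address of a rip-relative operand: the displacement is added
 * to the address of the NEXT instruction, i.e. insn_addr + insn_len. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len,
                             int32_t disp32)
{
    uint64_t next_rip = insn_addr + insn_len;
    /* Sign-extend the displacement to 64 bits before adding. */
    return next_rip + (uint64_t)(int64_t)disp32;
}

/* Displacement an assembler or linker would store for a given target.
 * Assumes the distance fits in a signed 32-bit value. */
int32_t rip_relative_disp(uint64_t insn_addr, uint64_t insn_len,
                          uint64_t target)
{
    return (int32_t)(target - (insn_addr + insn_len));
}
```

For a 7-byte instruction at 0x401000 with displacement 0x2ff9, the operand address works out to 0x401007 + 0x2ff9 = 0x404000.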
The EFLAGS register contains a series of status bits that the processor uses to indicate the results of logical and arithmetic operations. It also contains a collection of system control bits that are primarily used by operating systems.
The RFLAGS register stores flags used for results of operations and for controlling the processor. It is formed from the x86 32-bit register EFLAGS by adding a higher 32 bits, which are reserved and currently unused.
Most of these flags are set automatically by the CPU.
CF is also called CY.
For application programs, the most important bits in the EFLAGS register are the following status flags: auxiliary carry flag (AF), carry flag (CF), overflow flag (OF), parity flag (PF), sign flag (SF), and zero flag (ZF).
The auxiliary carry flag denotes a carry or borrow condition during binary-coded decimal addition or subtraction. The carry flag is set by the processor to signify an overflow condition when performing unsigned integer arithmetic. It is also used by some register rotate and shift instructions. The overflow flag signals that the result of a signed integer operation is too small or too large. The parity flag indicates whether the least-significant byte of a result contains an even number of 1 bits. The sign and zero flags are set by logical and arithmetic instructions to signify a negative, zero, or positive result.
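To make the flag definitions concrete, the following C sketch computes the six status flags the way a 32-bit add would set them. It is a model for illustration, not processor code: the StatusFlags struct and the function name flags_after_add32 are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* The six status flags described above. */
typedef struct {
    bool cf, pf, af, zf, sf, of;
} StatusFlags;

/* Model of the status flags after a 32-bit add of a and b. */
StatusFlags flags_after_add32(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;            /* wraps modulo 2^32, like the CPU */
    StatusFlags f;

    f.cf = r < a;                              /* unsigned carry out of bit 31 */
    f.af = ((a ^ b ^ r) & 0x10u) != 0;         /* carry out of bit 3 */
    f.zf = (r == 0);                           /* result is zero */
    f.sf = (r >> 31) != 0;                     /* copy of the result's top bit */
    /* Signed overflow: both operands differ in sign from the result. */
    f.of = (((a ^ r) & (b ^ r)) >> 31) != 0;

    /* PF looks only at the low byte: set when it has an even number of 1s. */
    uint8_t low = (uint8_t)r;
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    f.pf = (low & 1u) == 0;

    return f;
}
```

Adding 1 to 0xFFFFFFFF sets CF and ZF but not OF, because the wrap is only an overflow in the unsigned interpretation. Adding 1 to 0x7FFFFFFF sets OF and SF but not CF, because only the signed interpretation overflows.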
The EFLAGS register also contains a control bit called the direction flag (DF). An application program can set or reset the direction flag, which defines the auto increment direction (0 = low-to-high addresses, 1 = high-to-low addresses) of the EDI and ESI registers during execution of the string instructions. The remaining bits in the EFLAGS register are used exclusively by the operating system to manage interrupts, restrict I/O operations, and support program debugging. They should never be modified by an application program. Reserved bits should also never be modified and no assumptions should ever be made regarding the state of any reserved bit.
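The role of the direction flag is easiest to see with a string instruction. The sketch below models movsb and rep movsb on a plain byte array; the StringRegs struct and the function names are hypothetical, rsi and rdi are array indices rather than real addresses, and no bounds checking is done.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* The registers a string move uses, plus the direction flag. */
typedef struct {
    size_t rsi;   /* source index */
    size_t rdi;   /* destination index */
    bool   df;    /* false: low-to-high addresses, true: high-to-low */
} StringRegs;

/* Model of one movsb: copy one byte, then step both index registers
 * in the direction selected by DF. */
void model_movsb(uint8_t *mem, StringRegs *r)
{
    mem[r->rdi] = mem[r->rsi];
    if (r->df) {
        r->rsi--;
        r->rdi--;
    } else {
        r->rsi++;
        r->rdi++;
    }
}

/* Model of rep movsb: repeat the move rcx times. */
void model_rep_movsb(uint8_t *mem, StringRegs *r, size_t rcx)
{
    while (rcx-- > 0)
        model_movsb(mem, r);
}
```

With DF clear the copy walks upward from the first byte; with DF set the caller starts both indices at the last byte and the copy walks downward.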
The floating point unit (FPU) contains eight registers FPR0-FPR7, status and control registers, and a few other specialized registers.
Floating point operations conform to IEEE 754. Note that most C/C++ compilers support the 32 and 64 bit types as float and double, but not the 80-bit one available from assembly. These registers share space with the eight 64-bit MMX registers.
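YMM0-15 are dedicated to packed data. Each is 256 bits wide and can hold several integers or floating-point values; for example, one YMM register can hold four 64-bit values or eight 32-bit values. Each can also be accessed as XMM0-15 (the low 128 bits of the corresponding YMM register).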
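The packed layout can be pictured as one 32-byte block viewed through different element types. The union below is a hypothetical model only (the Ymm type and ymm_low_xmm are made-up names) and says nothing about how a compiler or the hardware actually stores vector registers; it just shows the lane counts and where the XMM half sits.

```c
#include <stdint.h>
#include <string.h>

/* Model of one 256-bit YMM register as overlapping lane views. */
typedef union {
    uint8_t  bytes[32];
    uint32_t u32[8];   /* eight 32-bit integer lanes */
    uint64_t u64[4];   /* four 64-bit integer lanes */
    float    f32[8];   /* eight single-precision lanes */
    double   f64[4];   /* four double-precision lanes */
} Ymm;

/* The matching XMM register is the low 128 bits, i.e. the first 16 bytes
 * of the model (lanes u64[0] and u64[1]). */
void ymm_low_xmm(const Ymm *y, uint8_t out[16])
{
    memcpy(out, y->bytes, 16);
}
```

Filling u64[0..3] with four values and calling ymm_low_xmm returns only the bytes of the first two lanes, which mirrors how XMM aliases the lower half of YMM.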
Control registers record key information about the CPU's own state while it runs.
There are also control registers (cr0, cr2, cr3, cr4, and, in 64-bit mode, cr8) that the kernel uses to control the CPU's behavior, for instance, to switch between protected mode and real mode.
Link to this article: https://cs.pynote.net/hd/asm/202212111/