PIC and RIP-relative Addressing

Last Updated: 2023-06-16 09:09:40 Friday


x64

The following passage comes from an excellent answer on StackOverflow:

RIP-relative addressing is a new form of effective addressing introduced with 64-bit long mode. The point is that it makes it easier to write position-independent code because you can make any memory reference RIP-relative. In fact, RIP-relative addressing is the default addressing mode in 64-bit applications. Virtually all instructions that address memory in 64-bit mode are RIP-relative. I'll quote from Ken Johnson (aka Skywing)'s blog because I couldn't say it any better myself:

One of the larger (but often overlooked) changes to x64 with respect to x86 is that most instructions that previously only referenced data via absolute addressing can now reference data via RIP-relative addressing.

RIP-relative addressing is a mode where an address reference is provided as a (signed) 32-bit displacement from the current instruction pointer. While this was typically only used on x86 for control transfer instructions (call, jmp, and soforth), x64 expands the use of instruction pointer relative addressing to cover a much larger set of instructions.

What’s the advantage of using RIP-relative addressing? Well, the main benefit is that it becomes much easier to generate position independent code, or code that does not depend on where it is loaded in memory. This is especially useful in today’s world of (relatively) self-contained modules (such as DLLs or EXEs) that contain both data (global variables) and the code that goes along with it. If one used flat addressing on x86, references to global variables typically required hardcoding the absolute address of the global in question, assuming the module loads at its preferred base address. If the module then could not be loaded at the preferred base address at runtime, the loader had to perform a set of base relocations that essentially rewrite all instructions that had an absolute address operand component to refer to take into account the new address of the module.

[...]

An instruction that uses RIP relative addressing, however, typically does not require any base relocations (otherwise known as “fixups”) at load time if the module containing it is relocated, however. This is because as long as portions of the module are not internally re-arranged in memory (something not supported by the PE format), any addresses reference that is both relative to the current instruction pointer and refers to a location within the confines of the current image will continue to refer to the correct location, no matter where the image is placed at load time.

As a result, many x64 images have a greatly reduced number of fixups, due to the fact that most operations can be performed in an RIP-relative fashion.

He's speaking in the context of Windows, but something conceptually similar applies on other operating systems as well.
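To make the fixup argument above concrete, here is a minimal Python sketch that models a module loaded away from its preferred base. The layout (`PREFERRED_BASE`, `NEXT_INSN_OFFSET`, `DATA_OFFSET`) and the helper names are hypothetical values chosen for illustration; they do not describe any real PE image or loader API.

```python
# Hypothetical image layout: offsets are relative to the start of the image.
PREFERRED_BASE = 0x0040_0000
NEXT_INSN_OFFSET = 0x1007   # offset of the instruction after the memory reference
DATA_OFFSET = 0x3010        # offset of the global variable being referenced


def absolute_operand(link_base):
    """Absolute address baked into the instruction at link time."""
    return link_base + DATA_OFFSET


def apply_fixup(operand, link_base, load_base):
    """Models a base relocation: the loader adds the load delta to the operand."""
    return operand + (load_base - link_base)


def rip_relative_disp():
    """Displacement stored in a RIP-relative instruction; the base cancels out."""
    return DATA_OFFSET - NEXT_INSN_OFFSET


def rip_relative_target(load_base):
    """Effective address = address of the next instruction + displacement."""
    return load_base + NEXT_INSN_OFFSET + rip_relative_disp()


actual_base = 0x7FF6_0000_0000          # the module did not get its preferred base
true_address = actual_base + DATA_OFFSET

stale = absolute_operand(PREFERRED_BASE)
fixed = apply_fixup(stale, PREFERRED_BASE, actual_base)

print("absolute, no fixup :", stale == true_address)
print("absolute, fixed up :", fixed == true_address)
print("RIP-relative       :", rip_relative_target(actual_base) == true_address)
```

In this model the absolute operand is only correct after the loader patches it, while the RIP-relative displacement never changes, which is why such references need no fixup.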

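A key term in the passage above is flat addressing: memory is treated as one contiguous block. As I understand it, earlier systems also used other addressing schemes, such as segmentation, where memory is not one contiguous block.

While learning assembly I often came across OFFSET FLAT:, which is exactly how an absolute address is taken!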

x86

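According to the book 《程序员的自我修养》, the x86 era had no RIP-relative addressing. Accesses to addresses inside the same module relied on a clever trick, but in essence they were still relative addressing based on the value of the PC (the instruction pointer).

A typical program consists of N pages of code followed by M pages of data, and their relative positions are fixed.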

Calling internal functions

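The destination encoded in a call instruction (an immediate) is an offset relative to the address of the next instruction. Because call has to push the address of the next instruction onto the stack, the CPU necessarily has that address at hand; adding the two values gives the address of the function being called (and the return address has been pushed along the way).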
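The arithmetic described above can be sketched in Python for the 5-byte near call (opcode E8 followed by a signed 32-bit little-endian offset). The function name `call_target` and the sample addresses are made up for illustration, and 32-bit wraparound is ignored to keep the sketch short.

```python
import struct


def call_target(insn_addr, insn_bytes):
    """Decode a near `call rel32` (opcode E8) and return its destination."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a 5-byte call rel32")
    (rel32,) = struct.unpack("<i", insn_bytes[1:5])  # signed, little-endian
    next_insn = insn_addr + 5  # this is also the return address that gets pushed
    return next_insn + rel32


forward = bytes.fromhex("e810000000")   # rel32 = +0x10
backward = bytes.fromhex("e8fbffffff")  # rel32 = -5, i.e. the call targets itself

print(hex(call_target(0x1000, forward)))   # 0x1015
print(hex(call_target(0x1000, backward)))  # 0x1000
```

Since only the offset is encoded, the same bytes reach the same function wherever the module is loaded.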

Accessing internal data

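The code first calls an internal function (__i686.get_pc_thunk.cx). This function reads the return address that the call instruction pushed onto the stack, saves it in a register, and returns. That neatly yields the value of the PC. Adding an offset to it then gives the address of the internal variable.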
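The following Python sketch models the thunk trick with a list standing in for the stack. It is a simplification under stated assumptions: the offsets `CALL_OFFSET` and `DATA_OFFSET` are hypothetical, and real compiler output typically adds an offset to reach the GOT first and then addresses data relative to it, a step collapsed here into a single addition.

```python
def get_pc_thunk_cx(stack):
    """Models `mov ecx, [esp]; ret`: copy the return address, then pop it."""
    ecx = stack[-1]          # mov ecx, [esp]
    return_to = stack.pop()  # ret
    return ecx, return_to


def address_of_internal_data(call_addr, offset_to_data):
    """Models: call thunk; add ecx, offset_to_data."""
    stack = []
    return_addr = call_addr + 5   # a call rel32 instruction is 5 bytes long
    stack.append(return_addr)     # call pushes the return address
    ecx, eip = get_pc_thunk_cx(stack)
    assert eip == return_addr and not stack  # execution resumes after the call
    return ecx + offset_to_data


# Hypothetical module layout (offsets from the image base).
CALL_OFFSET, DATA_OFFSET = 0x454, 0x2010
OFFSET_TO_DATA = DATA_OFFSET - (CALL_OFFSET + 5)  # constant fixed at link time

for base in (0x0804_8000, 0xB7F0_0000):
    addr = address_of_internal_data(base + CALL_OFFSET, OFFSET_TO_DATA)
    print(hex(base), hex(addr), addr == base + DATA_OFFSET)
```

The constant `OFFSET_TO_DATA` is the same for both load addresses, which is the whole point: the code needs no patching to find its own data.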

On x64, RIP-relative addressing is available directly (for almost any memory operand), which makes all of this simple.
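As a rough illustration, the sketch below decodes one common RIP-relative encoding, `mov rax, [rip+disp32]` (bytes 48 8B 05 followed by a signed 32-bit displacement), and computes the address it reads. The helper name and the sample addresses are assumptions for the example; only this single encoding is handled.

```python
import struct

MOV_RAX_RIP = bytes.fromhex("488b05")  # REX.W, opcode 8B, ModRM 05 (RIP-relative)


def rip_relative_address(insn_addr, insn_bytes):
    """Effective address of `mov rax, [rip+disp32]` located at insn_addr."""
    if len(insn_bytes) != 7 or insn_bytes[:3] != MOV_RAX_RIP:
        raise ValueError("expected the 7-byte form 48 8B 05 disp32")
    (disp32,) = struct.unpack("<i", insn_bytes[3:7])  # signed, little-endian
    # RIP already points at the next instruction when the operand is resolved.
    return insn_addr + len(insn_bytes) + disp32


insn = bytes.fromhex("488b05f92f0000")  # disp32 = 0x2ff9

print(hex(rip_relative_address(0x1000, insn)))            # 0x4000
print(hex(rip_relative_address(0x5555_0000_1000, insn)))  # 0x555500004000
```

No thunk call and no spare register are needed: the displacement is applied to RIP by the instruction itself.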

FYI

Learned from ChatGPT:

Link to this article: https://cs.pynote.net/hd/asm/202302171/

-- EOF --
