A Detailed Look at gcc's __attribute__ Mechanism

Last Updated: 2023-10-27 00:14:44 Friday

This topic reaches into the internals of compiling and linking: attributes are how gcc lets you control the fine details of both.

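gcc groups its attributes by what they apply to: function, variable, type, label, enumerator, and statement attributes.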

Many attributes are specific to a CPU architecture.

section

gcc provides a mechanism that lets the programmer place particular data or code into a section with a custom name.

#include <stdio.h>

__attribute__((section(".mydata"))) int data = 1;
__attribute__((section(".myfunc"))) void f(){}

int main() {
    printf("%d\n", data);
    f();
    return 0;
}

After compilation, data lives in the .mydata section and the function f lives in the .myfunc section.

$ readelf -S test
...
  [15] .myfunc           PROGBITS         0000000000401152  00001152
       0000000000000007  0000000000000000  AX       0     0     1
...
 [26] .mydata           PROGBITS         0000000000404024  00003024
       0000000000000004  0000000000000000  WA       0     0     4
...

Disassembly:

$ objdump -M intel -d test

0000000000401126 <main>:
  401126:       55                      push   rbp
  401127:       48 89 e5                mov    rbp,rsp
  40112a:       8b 05 f4 2e 00 00       mov    eax,DWORD PTR [rip+0x2ef4]        # 404024 <data>
  401130:       89 c6                   mov    esi,eax
  401132:       bf 10 20 40 00          mov    edi,0x402010
  401137:       b8 00 00 00 00          mov    eax,0x0
  40113c:       e8 ef fe ff ff          call   401030 <printf@plt>
  401141:       b8 00 00 00 00          mov    eax,0x0
  401146:       e8 07 00 00 00          call   401152 <f>
  40114b:       b8 00 00 00 00          mov    eax,0x0
  401150:       5d                      pop    rbp
  401151:       c3                      ret

Disassembly of section .myfunc:

0000000000401152 <f>:
  401152:       55                      push   rbp
  401153:       48 89 e5                mov    rbp,rsp
  401156:       90                      nop
  401157:       5d                      pop    rbp
  401158:       c3                      ret

This technique is very common in kernel code:

#define __section(section)              __attribute__((__section__(section)))
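
One common reason for custom sections is to let the linker collect scattered entries into a single table. The sketch below assumes an ELF target linked with GNU ld (or a compatible linker), which defines __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The names DEMO_ENTRY, demo_tab, demo_tab_count, and demo_tab_sum are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper macro: place one int in the section "demo_tab".
 * `used` stops the compiler from dropping the otherwise unreferenced entry. */
#define DEMO_ENTRY(name, value) \
    static const int name __attribute__((section("demo_tab"), used)) = (value)

DEMO_ENTRY(entry_a, 10);
DEMO_ENTRY(entry_b, 20);
DEMO_ENTRY(entry_c, 30);

/* Assumption: GNU ld provides these bounds because "demo_tab"
 * is a valid C identifier. They are not defined anywhere in C code. */
extern const int __start_demo_tab[];
extern const int __stop_demo_tab[];

/* Number of entries the linker gathered into the section. */
size_t demo_tab_count(void)
{
    return (size_t)(__stop_demo_tab - __start_demo_tab);
}

/* Walk the section like an array. The order of entries is not guaranteed,
 * so this sketch only computes an order-independent sum. */
int demo_tab_sum(void)
{
    int sum = 0;
    for (const int *p = __start_demo_tab; p < __stop_demo_tab; p++)
        sum += *p;
    return sum;
}
```

Entries added in other source files with the same macro would land in the same table without any central registration list.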

weak

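Weak symbol

In earlier versions, gcc by default treated uninitialized global variables much like weak symbols (as common symbols); see the article Understanding Common Block for that part.

Both global variables and functions can be defined as weak symbols.

Whether a symbol is strong or weak is decided where it is defined; how that is handled is up to the linker: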

#include <stdio.h>

__attribute__((weak)) int data = 1;
__attribute__((weak)) void f(){}
__attribute__((weak)) void g();

int main() {
    printf("%d\n", data);
    f();
    if(g == NULL)
        printf("g is NULL\n");
    return 0;
}

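Although data has an initializer, it is still a weak symbol. A weak function that has a definition can be called directly. A weak function without a definition may be NULL (when the linker finds no definition either); in that case the symbol table of the executable ELF file holds no definition for the symbol.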

$ gcc test.c && ./a.out
1
g is NULL
$ readelf -s a.out | grep WEAK
...
    55: 0000000000401126     7 FUNC    WEAK   DEFAULT   14 f
...
    61: 0000000000404024     4 OBJECT  WEAK   DEFAULT   24 data
...

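What weak symbols are for: a library can ship a default definition that a strong definition elsewhere replaces at link time, and a program can check at run time whether an optional function was linked in.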

The Linux kernel source contains this definition:

#define __weak                          __attribute__((__weak__))
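
A typical use of weak symbols is an optional hook with a fallback. The sketch below is a single-file illustration with hypothetical names (demo_optional_hook, demo_log_level, demo_process); it assumes no other object file in the link defines them.

```c
#include <stddef.h>

/* Hypothetical optional hook: declared weak and never defined here.
 * If no object file defines it, its address is NULL after linking. */
__attribute__((weak)) int demo_optional_hook(int value);

/* Weak default: a strong definition of demo_log_level in another
 * object file would silently replace this one at link time. */
__attribute__((weak)) int demo_log_level(void)
{
    return 1;
}

/* Call the hook when it was linked in, otherwise use the fallback. */
int demo_process(int value)
{
    if (demo_optional_hook != NULL)
        return demo_optional_hook(value);
    return value * 2; /* fallback path */
}
```

Adding a second source file with a non-weak demo_log_level or demo_optional_hook changes the behavior without touching this file.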

weakref, alias

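Weak reference

When the linker cannot find an external symbol, it normally reports an error; that is a strong reference. If a weakly referenced symbol is not found, no error is reported and its value is NULL.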

The weakref attribute marks a declaration as a weak reference. Without arguments, it should be accompanied by an alias attribute naming the target symbol. Optionally, the target may be given as an argument to weakref itself. In either case, weakref implicitly marks the declaration as weak. Without a target, given as an argument to weakref or to alias, weakref is equivalent to weak. At present, a declaration to which weakref is attached can only be static.

The following defines a weakref x that points to y:

/* Given the declaration: */
extern int y (void);

/* the following... */
static int x (void) __attribute__ ((weakref ("y")));

/* is equivalent to... */
static int x (void) __attribute__ ((weakref, alias ("y")));

/* or, alternatively, to... */
static int x (void) __attribute__ ((weakref));
static int x (void) __attribute__ ((alias ("y")));

A weakref is a symbol with its own name, so its declaration must name the target symbol, either as an argument to weakref or through alias. A symbol declared with weakref is a weak symbol, and it can only be declared static.

#include <stdio.h>

void target(){
    printf("in target\n");
}

// must have static linkage
static __attribute__((weakref("target"))) void g();

int main() {
    if(g == NULL)
        printf("g is NULL\n");
    else
        g();
    return 0;
}

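Whether a definition of target exists is determined at link time, so building and running the program gives one of two results: with target defined as above, g() calls it; with the definition removed, g is NULL.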

cold, hot

The Linux kernel has this definition:

#define __cold                          __attribute__((__cold__))

The gcc manual says:

The cold attribute on functions is used to inform the compiler that the function is unlikely to be executed. The function is optimized for size rather than speed and on many targets it is placed into a special subsection of the text section so all cold functions appear close together, improving code locality of non-cold parts of program. The paths leading to calls of cold functions within code are marked as unlikely by the branch prediction mechanism. It is thus useful to mark functions used to handle unlikely conditions, such as perror, as cold to improve optimization of hot functions that do call marked functions in rare occasions.

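I could not find a matching definition for hot in the Linux kernel; presumably anything not marked cold is treated as hot. The gcc manual says: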

The hot attribute on a function is used to inform the compiler that the function is a hot spot of the compiled program. The function is optimized more aggressively and on many targets it is placed into a special subsection of the text section so all hot functions appear close together, improving locality.
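
As a minimal sketch of how the two attributes are usually paired, the example below marks a rarely taken error path as cold and the frequently called function as hot. The function names are hypothetical, and whether the generated code layout actually changes depends on the target and optimization level.

```c
#include <stdio.h>

/* Rare error path: optimized for size and, on many targets,
 * placed away from the hot code. noinline keeps it out of the caller. */
__attribute__((cold, noinline))
static void demo_report_failure(const char *what)
{
    fprintf(stderr, "demo failure: %s\n", what);
}

/* Frequently called path: the branch leading to the cold call
 * is treated as unlikely by the compiler. */
__attribute__((hot))
int demo_div_or(int num, int den, int fallback)
{
    if (den == 0) {
        demo_report_failure("division by zero");
        return fallback;
    }
    return num / den;
}
```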

const

A definition from the Linux kernel:

#define __attribute_const__             __attribute__((__const__))

The gcc manual:

Calls to functions whose return value is not affected by changes to the observable state of the program and that have no observable effects on such state other than to return a value may lend themselves to optimizations such as common subexpression elimination. Declaring such functions with the const attribute allows GCC to avoid emitting some calls in repeated invocations of the function with the same argument values.

For example,

int square(int) __attribute__ ((const));

tells GCC that subsequent calls to function square with the same argument value can be replaced by the result of the first call regardless of the statements in between.

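In other words, the result of the first call directly replaces later calls with the same arguments. Be careful when using the const attribute for performance: when an argument is a pointer, the pointer value may stay the same while the data it points to changes, and in that case the attribute must not be applied.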
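
The pointer caveat can be illustrated with gcc's related pure attribute, which permits reading memory but still promises no side effects. This is a sketch with hypothetical function names: demo_square depends only on its argument value and can be const, while demo_sum reads through a pointer and is therefore only pure.

```c
#include <stddef.h>

/* Depends only on the argument value: safe to mark const. */
__attribute__((const)) int demo_square(int x)
{
    return x * x;
}

/* Reads memory through a pointer: const would be wrong here,
 * because the pointed-to data can change between calls.
 * pure allows memory reads, so gcc re-evaluates after a store. */
__attribute__((pure)) int demo_sum(const int *values, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Same pointer, different contents: the second call must not
 * reuse the first result. */
int demo_sum_changes_after_write(void)
{
    int values[3] = {1, 2, 3};
    int before = demo_sum(values, 3);
    values[0] = 10;
    int after = demo_sum(values, 3);
    return after - before;
}
```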

unused

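In a C program, a static function that is defined but never used triggers a -Wunused-function warning at compile time (with -Wall, for example); an unused variable triggers the corresponding -Wunused-variable warning. The unused attribute suppresses these warnings.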

Linux has these definitions:

#define __always_unused                 __attribute__((__unused__))
#define __maybe_unused                  __attribute__((__unused__))

Usage:

__attribute__((unused)) static void a(void){...}

static grub_err_t
grub_cmd_hello (grub_extcmd_context_t ctxt __attribute__ ((unused)),
        int argc __attribute__ ((unused)),
        char **args __attribute__ ((unused)))
{
  grub_printf ("%s\n", _("Hello World"));
  return 0;
}

used

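Tells gcc to keep a static function in the object file even if nothing references it. A function marked __attribute__((used)) is emitted into the object file even though it looks unreferenced, so the compiler does not discard it as unused. A static variable can be marked used as well.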

The Linux kernel has this definition:

#define __used                          __attribute__((__used__))
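
A small sketch of the effect, with hypothetical names: neither the static string nor the static helper is referenced by other code in the file, yet both should still be emitted when optimization is enabled because of used.

```c
/* Never referenced in the program logic; `used` keeps the bytes
 * in the object file so tools can still find the tag. */
static const char demo_build_tag[] __attribute__((used)) =
    "demo-build-tag: v1";

/* A static function with no callers would normally be dropped
 * under optimization; `used` forces gcc to emit it. */
__attribute__((used)) static int demo_hidden_helper(int x)
{
    return x + 1;
}

/* An ordinary externally visible function, for comparison. */
int demo_visible(int x)
{
    return x * 2;
}
```

One way to check is to compile with -O2 and look for demo_hidden_helper in the output of nm, then remove the attribute and compare.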

copy

The copy attribute applies the set of attributes with which function has been declared to the declaration of the function to which the attribute is applied.

That is, it copies the attributes declared on one function to another function. It is typically used when defining a function alias.

Linux has this definition:

/*
 * Optional: only supported since gcc >= 9
 * Optional: not supported by clang
 * Optional: not supported by icc
 *
 *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-copy-function-attribute
 */
#if __has_attribute(__copy__)
# define __copy(symbol)                 __attribute__((__copy__(symbol)))
#else
# define __copy(symbol)
#endif

The copy attribute can also be applied to variables:

The copy attribute applies the set of attributes with which variable has been declared to the declaration of the variable to which the attribute is applied.
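
The alias use case can be sketched as follows. The example guards copy with __has_attribute in the same spirit as the kernel macro, so it should still compile (without the copy) on compilers that lack the attribute. DEMO_COPY, demo_fail, and demo_fail_alias are hypothetical names, and the sketch assumes an ELF target where the alias attribute is available.

```c
/* Expand to the copy attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(__copy__)
#  define DEMO_COPY(symbol) __attribute__((__copy__(symbol)))
# endif
#endif
#ifndef DEMO_COPY
# define DEMO_COPY(symbol)
#endif

/* The target function carries a few attributes of its own. */
__attribute__((cold, noinline)) int demo_fail(int code)
{
    return -code;
}

/* The alias has the same type as the target (via __typeof__) and,
 * through copy, inherits the attributes declared on demo_fail. */
extern __typeof__(demo_fail) demo_fail_alias
    __attribute__((alias("demo_fail"))) DEMO_COPY(demo_fail);
```

Without copy, every attribute would have to be repeated by hand on the alias and kept in sync with the target.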

nocommon

nocommon is now the default behavior (GCC 10 made -fno-common the default).

int global_var __attribute__((nocommon));

See also: Understanding Common Block

noreturn

Tells gcc that a function never returns, which lets gcc optimize a little.

void fatal () __attribute__ ((noreturn));

void
fatal (/* … */){
  /* … */ /* Print error message. */ /* … */
  exit (1);
}

The noreturn keyword does not affect the exceptional path when that applies: a noreturn-marked function may still return to the caller by throwing an exception or calling longjmp.

In other words, such a function can still get back to its caller through an exception or longjmp.

bitwise (sparse)

The Linux kernel has this definition:

/* sparse defines __CHECKER__; see Documentation/dev-tools/sparse.rst */
#ifdef __CHECKER__
#define __bitwise       __attribute__((bitwise))
#else
#define __bitwise
#endif

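The __attribute__((bitwise)) attribute does not belong to gcc; it belongs to the Sparse tool. Sparse was written in the early 2000s by Linus Torvalds, the creator of Linux, to provide static type checking for code and so reduce latent bugs in the Linux kernel. The typical use of __bitwise is to typedef a base type that carries the bitwise attribute; every variable declared with that type is then subject to strict type checking.

For example, the kernel's __le32 and __be32 are both __u32 underneath. gcc does not complain when a value of one type is assigned to the other, but sparse does.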
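
The pattern can be sketched in user space as below. The macro and type names (demo_bitwise, demo_force, demo_le32, demo_be32) are hypothetical stand-ins for the kernel's __bitwise, __force, __le32, and __be32; under gcc the macros expand to nothing, so the types are plain uint32_t and only a run of sparse would report mixing them.

```c
#include <stdint.h>

/* Only sparse defines __CHECKER__; gcc sees empty macros. */
#ifdef __CHECKER__
# define demo_bitwise __attribute__((bitwise))
# define demo_force   __attribute__((force))
#else
# define demo_bitwise
# define demo_force
#endif

/* Two distinct "bitwise" types over the same base type. */
typedef uint32_t demo_bitwise demo_le32;
typedef uint32_t demo_bitwise demo_be32;

/* Swap bytes only when the CPU is big-endian. */
static inline uint32_t demo_swap_if_big_endian(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap32(v);
#else
    return v;
#endif
}

/* The force cast marks the one place where crossing types is intended. */
demo_le32 demo_cpu_to_le32(uint32_t v)
{
    return (demo_force demo_le32)demo_swap_if_big_endian(v);
}

uint32_t demo_le32_to_cpu(demo_le32 v)
{
    return demo_swap_if_big_endian((demo_force uint32_t)v);
}
```

Assigning a demo_le32 value directly to a demo_be32 variable compiles silently with gcc; the point of the annotation is that sparse is expected to flag it.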

no_sanitize

Turns off sanitizer instrumentation for a particular function. For details, see: Understanding -fsanitize=address
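
A minimal sketch of the syntax, assuming a compiler recent enough to accept the no_sanitize("address") form (older gcc releases offer no_sanitize_address instead). The function name demo_raw_sum is hypothetical.

```c
#include <stddef.h>

/* AddressSanitizer does not instrument the loads in this function,
 * even when the file is built with -fsanitize=address. */
__attribute__((no_sanitize("address")))
int demo_raw_sum(const unsigned char *bytes, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += bytes[i];
    return sum;
}
```

The attribute only disables checking inside the marked function; callers and callees are still instrumented as usual.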

Link to this article: https://cs.pynote.net/sf/c/cdm/202303012/
