Understanding Common Block

Last Updated: 2023-06-19 03:06:51 Monday


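An uninitialized global variable may show up in an object file's symbol table as a weak symbol whose section index (Ndx) is COMMON, in other words, a COMMON block! What does that mean? ("Weak" here is meant in the sense of the linker's strong/weak resolution rules; the ELF binding of such a symbol is still GLOBAL.)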

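Reportedly, the COMMON mechanism comes from early Fortran. Early Fortran had no way to allocate storage dynamically, so programmers had to declare in advance how much space they needed. Fortran called such a region a COMMON block, and when different object files asked for different sizes of the same COMMON block, the largest one won.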


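Modern linkers handle weak symbols with the same mechanism as COMMON blocks: when several files contain weak symbols with the same name, the linker goes with the one that occupies the most space. The linker can see only a symbol's size, not its type, so this is all it can do.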
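The "largest size wins" rule can be sketched in a few lines of C. This is only a toy model of the rule described above, not how any real linker is implemented; `struct common_sym` and `merged_common_size` are hypothetical names invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a symbol-table entry. Real ELF entries
   carry more fields, but the size is all this rule needs. */
struct common_sym {
    const char *name;   /* symbol name */
    size_t size;        /* symbol size in bytes, as recorded in the object file */
};

/* Return the size the merged COMMON symbol `name` would end up with:
   the maximum size over all entries with that name, or 0 if none match. */
size_t merged_common_size(const struct common_sym *syms, size_t count,
                          const char *name)
{
    size_t best = 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(syms[i].name, name) == 0 && syms[i].size > best)
            best = syms[i].size;
    }
    return best;
}
```

Fed the entries `{"a", 4}`, `{"b", 4}` and `{"a", 8}`, the function reports 8 bytes for `a` and 4 bytes for `b`. Note that no type information enters the decision at any point.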


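GCC's -fno-common option stops uninitialized global variables from being handled as COMMON blocks. The same effect can be had for a single variable with a definition like this: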

int global_var __attribute__((nocommon));

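As long as an uninitialized global variable does not live in a COMMON block, it is a strong symbol, and the linker reports an error outright when it is defined more than once.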
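Once every definition is a strong symbol, the usual way to share a global between files is one `extern` declaration plus exactly one definition. The sketch below squeezes both into a single file for brevity; the header name `globals.h` and the helper `bump_global_var` are hypothetical, and the `nocommon` attribute is a GCC/Clang extension.

```c
/* Would normally live in a shared header, e.g. a hypothetical globals.h:
   a declaration only, so no storage is allocated and nothing can clash. */
extern int global_var;

/* Would normally live in exactly one .c file: the single definition.
   The attribute keeps it out of the COMMON block even if the file is
   compiled with -fcommon. */
int global_var __attribute__((nocommon));

/* Hypothetical helper: increments the global and returns the new value. */
int bump_global_var(void)
{
    return ++global_var;
}
```

With this layout, a second file that accidentally defines `global_var` again produces a multiple-definition error at link time instead of being merged silently.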

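I checked the gcc manual page and found that -fno-common is already the default (this has been the case since GCC 10), so testing common blocks actually requires -fcommon to switch this old behavior back on. Here is a test: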


/* test.c */
#include <stdio.h>

int a;
int b;

int main(){
    printf("%d %d\n", a, b);
    return 0;
}

/* t2.c */
long long a;


$ gcc -c -fcommon test.c
$ gcc -c -fcommon t2.c
$ readelf -s test.o

Symbol table '.symtab' contains 8 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000004     4 OBJECT  GLOBAL DEFAULT  COM a
     5: 0000000000000004     4 OBJECT  GLOBAL DEFAULT  COM b
     6: 0000000000000000    40 FUNC    GLOBAL DEFAULT    1 main
     7: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

$ readelf -s t2.o

Symbol table '.symtab' contains 3 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS t2.c
     2: 0000000000000008     8 OBJECT  GLOBAL DEFAULT  COM a



$ gcc test.o t2.o -o tt2
$ readelf -s tt2 | grep ' a'
    64: 0000000000404030     8 OBJECT  GLOBAL DEFAULT   25 a

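This test code only builds because -fcommon forces it through, and it carries a hidden hazard. After linking, a is an 8-byte object sized for long long, yet test.c still declares it as int and printf prints it with %d. The linker reports nothing about the mismatch.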
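What test.c effectively does with `a` can be imitated in one file: it looks at only the first `sizeof(int)` bytes of storage that another file treats as a `long long`. The function name `read_as_int` is hypothetical, and `memcpy` is used here only to keep the sketch free of undefined behaviour.

```c
#include <string.h>

/* Copy the first sizeof(int) bytes of a long long object into an int,
   mimicking a file that declares the shared symbol as int while the
   linker reserved storage for a long long. */
int read_as_int(const long long *wide)
{
    int narrow;
    memcpy(&narrow, wide, sizeof narrow);
    return narrow;
}
```

Which half of the value is seen depends on byte order. On a little-endian machine such as x86-64 it is the low-order half, so a stored value of `1LL << 32` would read back as 0 through the `int` view.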



The GCC manual describes -fcommon and -fno-common as follows:

In C code, this option controls the placement of global variables defined without an initializer, known as tentative definitions in the C standard. Tentative definitions are distinct from declarations of a variable with the "extern" keyword, which do not allocate storage.

The default is -fno-common, which specifies that the compiler places uninitialized global variables in the BSS section of the object file. This inhibits the merging of tentative definitions by the linker so you get a multiple-definition error if the same variable is accidentally defined in more than one compilation unit.

The -fcommon places uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. This behavior is inconsistent with C++, and on many targets implies a speed and code size penalty on global variable references. It is mainly useful to enable legacy code to link without errors.
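The difference between a tentative definition, an `extern` declaration and a real definition can be seen inside a single translation unit, where C lets all three coexist. This is a minimal sketch; `counter` and `get_counter` are hypothetical names, and the file must be compiled as C, since C++ rejects the repeated definitions.

```c
int counter;          /* tentative definition: no initializer */
int counter;          /* a second tentative definition is legal C in one file */
extern int counter;   /* pure declaration: never allocates storage */
int counter = 42;     /* the one real (non-tentative) definition */

/* Hypothetical accessor so a caller can observe the value. */
int get_counter(void)
{
    return counter;
}
```

Within one file the compiler folds all of these into a single object whose value is 42. -fcommon extends the same folding across files by leaving it to the linker, which is exactly what -fno-common turns off.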


-- EOF --
