+++
title = "Dynamic linking on Linux (x86_64)"

[taxonomies]
tags = ["elf", "linux", "x86"]
+++

I was curious about how dynamic linking works under the hood, so this article
explores the topic. Since dynamic linking strongly depends on the OS, the
architecture and the binary format, I focus on only one combination here.
As I spend most of my time with Linux on `x86` or `ARM`, I chose the following
for this article:
- OS: Linux
- arch: x86_64
- binfmt: [`Executable and Linking Format (ELF)`][elf-1.2]

## Introduction to dynamic linking

Dynamic linking is used when an application is not statically linked. This
means the application uses code which is not included in the application
itself, but in a shared library. A shared library in turn can be used by
multiple applications. The applications contain `relocation` entries which need
to be resolved at runtime, because shared libraries are compiled as `position
independent code (PIC)` so that they can be loaded at any address in the
application's virtual address space.

This process of resolving the relocation entries at runtime is what I refer to
as dynamic linking in this article.

The following figure shows a simple example, where we have an application
**foo** using a function **bar** from the shared library **libbar.so**. The
boxes show the virtual memory mapping for **foo** over time, where time
increases to the right.

```
      foo                                     foo
    +-----------+                           +-----------+
    |           |                           |           |
    +-----------+                           +-----------+
    | .text.foo |                           | .text.foo |
    |           |                           |           |
    | ...       |   trigger resolve reloc   | ...       |
pc->| call bar  | X----+                    | call bar  |--+
    | ...       |      |                    | ...       |  |
    +-----------+      |                    +-----------+  |
    |           |      |                    |           |  |
    |           |      |                    |           |  |
    +-----------+      |                    +-----------+  |
    | .text.bar |      |                    | .text.bar |  |
    | ...       |      |                    | ...       |  |
    | bar:      |      +---->[ld.so]---->   | bar:      |<-+pc
    | ...       |                           | ...       |
    +-----------+                           +-----------+
    |           |                           |           |
    +-----------+                           +-----------+
```

## Conceptual overview && important parts of "the" ELF

> In the following I assume a basic understanding of the ELF binary format.

Before jumping into the details of dynamic linking it is important to get a
conceptual overview, as well as to understand which sections of the ELF file
actually matter.

On x86, calling a function in a shared library works via one indirect jump.
When the application wants to call a function in a shared library, it jumps to
a well-known location contained in the code of the application, called a
`trampoline`. From there the application jumps to a function pointer stored in
a global table (`GOT = global offset table`). The application contains **one**
trampoline per function used from a shared library.

When the application jumps to a trampoline for the first time, the trampoline
dispatches to the dynamic linker with the request to resolve the symbol. Once
the dynamic linker has found the address of the symbol, it patches the function
pointer in the `GOT` so that subsequent calls dispatch directly to the library
function.

```
     foo:                         GOT
       ...                        +------------+
 +---- call bar_trampoline     +- | 0xcafeface | [0] resolve (dynamic linker)
 |     call bar_trampoline     |  +------------+
 |     ...                     |  | 0xcafeface | [1] resolve (dynamic linker)
 |                             |  +------------+
 +-> bar_trampoline:           |
       jump GOT[0] <-----------+
     bar2_trampoline:
       jump GOT[1]
```

Once this is done, further calls to this symbol are forwarded directly to the
correct address from the corresponding trampoline.

```
     foo:                         GOT
       ...                        +------------+
       call bar_trampoline     +- | 0x01234567 | [0] bar (libbar.so)
 +---- call bar_trampoline     |  +------------+
 |     ...                     |  | 0xcafeface | [1] resolve (dynamic linker)
 |                             |  +------------+
 +-> bar_trampoline:           |
       jump GOT[0] <-----------+
     bar2_trampoline:
       jump GOT[1]
```

---

With that in mind, we can check which sections of the ELF file are important
for the dynamic linking process.

- `.plt`
  > This section contains all the trampolines for the external functions used by
  > the ELF file.
- `.got.plt`
  > This section contains the global offset table `GOT` for this ELF file's
  > trampolines.
- `.rel.plt` / `.rela.plt`
  > This section holds the `relocation` entries, which are used by the dynamic
  > linker to find which symbol needs to be resolved and which location in the
  > `GOT` needs to be patched. (Whether it is `rel` or `rela` depends on the
  > **DT_PLTREL** entry in the `.dynamic` section.)

## The bits behind dynamic linking

Now that we have the basic concept and know which sections of the ELF file
matter, we can take a look at an actual example. For the analysis I am going to
use the following C program and explicitly build it as a non-`position
independent executable (PIE)`.

> Using `-no-pie` has no functional impact; it is only used to get absolute
> virtual addresses in the ELF file, which makes the analysis easier to follow.

```c
// main.c
#include <stdio.h>

int main(int argc, const char* argv[]) {
    printf("%s argc=%d\n", argv[0], argc);
    puts("done");
    return 0;
}
```

```bash
> gcc -o main main.c -no-pie
```

We use [radare2][r2] to open the compiled file and print the disassembly of the
`.got.plt` and `.plt` sections.

```nasm
> r2 -A ./main
--snip--
[0x00401050]> pd5 @ section..got.plt
            ;-- section..got.plt:
            ;-- _GLOBAL_OFFSET_TABLE_:
       [0]  0x00404000      .qword 0x0000000000403e10 ; section..dynamic ; sym..dynamic
       [1]  0x00404008      .qword 0x0000000000000000
       [2]  0x00404010      .qword 0x0000000000000000
            ;-- reloc.puts:
       [3]  0x00404018      .qword 0x0000000000401036
            ;-- reloc.printf:
       [4]  0x00404020      .qword 0x0000000000401046

[0x00401050]> pd9 @ section..plt
            ;-- section..plt:
       ┌┌─> 0x00401020      ff35e22f0000   push qword [0x00404008]
       ╎╎   0x00401026      ff25e42f0000   jmp qword [0x00404010]
       ╎╎   0x0040102c      0f1f4000       nop dword [rax]
            int sym.imp.puts (const char *s);
       ╎╎   0x00401030      ff25e22f0000   jmp qword [reloc.puts]    ; 0x00404018
       ╎╎   0x00401036      6800000000     push 0
       └──< 0x0040103b      e9e0ffffff     jmp sym..plt
            int sym.imp.printf (const char *format);
        ╎   0x00401040      ff25da2f0000   jmp qword [reloc.printf]  ; 0x00404020
        ╎   0x00401046      6801000000     push 1
        └─< 0x0040104b      e9d0ffffff     jmp sym..plt
[0x00401050]>
```

Taking a quick look at the `.got.plt` section we see the *global offset table
GOT*. The entries *GOT[0..2]* have special meanings: *GOT[0]* holds the address
of the `.dynamic` section for this ELF file, while *GOT[1..2]* are filled in by
the dynamic linker at program startup. Entries *GOT[3]* and *GOT[4]* contain
the function pointers for **puts** and **printf** respectively.
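
The GOT layout seen in the dump above can be written down as a small address
calculation. This is only a sketch for this particular binary: the helper
`got_slot_addr` and the constants are hypothetical names introduced here, and
it assumes three reserved slots followed by one 8-byte slot per imported
function, in the order shown in the dump.

```c
#include <assert.h>
#include <stdint.h>

enum {
    GOT_RESERVED_SLOTS = 3, /* GOT[0..2] are reserved (see dump above) */
    GOT_SLOT_SIZE      = 8, /* one 64-bit pointer per slot on x86_64   */
};

/* Address of the GOT slot of the n-th imported function (n starts at 0).
 * Hypothetical helper, assumes the layout described in the lead-in. */
uint64_t got_slot_addr(uint64_t got_base, unsigned n) {
    assert(got_base % GOT_SLOT_SIZE == 0); /* GOT slots are pointer aligned */
    return got_base + (uint64_t)(GOT_RESERVED_SLOTS + n) * GOT_SLOT_SIZE;
}
```

With the base address `0x404000` from the dump, `got_slot_addr(0x404000, 0)`
gives `0x404018` (reloc.puts) and `got_slot_addr(0x404000, 1)` gives
`0x404020` (reloc.printf).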

In the `.plt` section we can find three trampolines:
1. `0x00401020` dispatch to runtime linker (special role)
1. `0x00401030` **puts**
1. `0x00401040` **printf**

Looking at the **puts** trampoline, we can see that the first instruction jumps
to the location stored at `0x00404018` (reloc.puts), which is GOT[3]. Initially
this entry contains the address of the `push 0` instruction that comes right
after the `jmp`. This push instruction sets up some metadata for the dynamic
linker. The next instruction then jumps into the first trampoline, which pushes
more metadata (GOT[1]) onto the stack and then jumps to the address stored in
GOT[2].

> GOT[1] & GOT[2] are zero here because they get filled by the dynamic linker
> at program startup.
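
The lazy binding scheme described above can be modelled in plain C with a
table of function pointers. This is a toy model and not how `ld.so` is
implemented: all names (`got`, `resolve`, `bar_trampoline`, ...) are
hypothetical, the table has a single slot and no reserved entries, and the
symbol lookup is reduced to an array index.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(int);

static int resolve_calls = 0;          /* how often the "dynamic linker" ran */

static int bar(int x) { return x + 1; } /* pretend this lives in libbar.so */

static int bar_resolve_stub(int x);

/* The "GOT": initially the slot points back into the trampoline, at the part
 * that asks the resolver for help (like the `push 0` instruction). */
static fn_t got[1] = { bar_resolve_stub };

/* The "dynamic linker": find the symbol for a relocation index and patch the
 * corresponding GOT slot. */
static fn_t resolve(size_t reloc_index) {
    static const fn_t symbols[] = { bar };
    assert(reloc_index < sizeof symbols / sizeof symbols[0]);
    resolve_calls++;
    got[reloc_index] = symbols[reloc_index];
    return got[reloc_index];
}

/* Second half of the trampoline: pass the relocation index to the resolver,
 * then continue in the resolved function. */
static int bar_resolve_stub(int x) { return resolve(0)(x); }

/* First half of the trampoline: indirect jump through the GOT slot. */
int bar_trampoline(int x) { return got[0](x); }

/* Helper to observe how often the resolver was invoked. */
int resolver_invocations(void) { return resolve_calls; }
```

The first call to `bar_trampoline` goes through `resolve` and patches `got[0]`;
every later call jumps straight to `bar`, so `resolver_invocations()` stays at
1 no matter how often the trampoline is used.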

To understand the `push 0` instruction in the **puts** trampoline we have to
take a look at the third section of interest in the ELF file, the `.rela.plt`
section.

```
# -r  print relocations
# -D  use .dynamic info when displaying info
> readelf -W -r ./main
--snip--
Relocation section '.rela.plt' at offset 0x4004d8 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000404018  0000000200000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
0000000000404020  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 printf@GLIBC_2.2.5 + 0
```

The `0` passed as metadata to the dynamic linker means: use the relocation at
index [0] in the `.rela.plt` section. The ELF specification defines a
relocation of type `rela` as follows:

```c
// man 5 elf
typedef struct {
    Elf64_Addr r_offset;
    uint64_t   r_info;
    int64_t    r_addend;
} Elf64_Rela;

#define ELF64_R_SYM(i)  ((i) >> 32)
#define ELF64_R_TYPE(i) ((i) & 0xffffffff)
```

`r_offset` holds the address of the GOT entry which the dynamic linker should
patch once it has found the address of the requested symbol. The offset here is
`0x00404018`, which is exactly the address of GOT[3], the function pointer used
in the **puts** trampoline. From `r_info` the dynamic linker can find out which
symbol it should look for.

```c
ELF64_R_SYM(0x0000000200000007) -> 0x2
```

The resulting value [2] is the index into the dynamic symbol table (`.dynsym`).
Dumping the dynamic symbol table with readelf we can see that the symbol at
index [2] is **puts**.

```
# -s  print symbols
> readelf -W -s ./main
Symbol table '.dynsym' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTable
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (2)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
--snip--
```

## Appendix: .dynamic section

The `.dynamic` section of an ELF file contains important information for the
dynamic linking process and is created when linking the ELF file.

The information can be accessed at runtime using the following symbol
```c
extern Elf64_Dyn _DYNAMIC[];
```
which is an array of `Elf64_Dyn` entries
```c
typedef struct {
    Elf64_Sxword    d_tag;
    union {
        Elf64_Xword d_val;
        Elf64_Addr  d_ptr;
    } d_un;
} Elf64_Dyn;
```

> Since this meta-information is specific to an ELF file, every ELF file has
> its own `.dynamic` section and `_DYNAMIC` symbol.

The following entries are the most interesting ones for dynamic linking:

d_tag        | d_un  | description
-------------|-------|-------------------------------------------------
DT_PLTGOT    | d_ptr | address of .got.plt
DT_JMPREL    | d_ptr | address of .rela.plt
DT_PLTREL    | d_val | DT_REL or DT_RELA
DT_PLTRELSZ  | d_val | size of .rela.plt table
DT_RELENT    | d_val | size of a single REL entry (PLTREL == DT_REL)
DT_RELAENT   | d_val | size of a single RELA entry (PLTREL == DT_RELA)
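
The entries from the table above can be looked up by walking an `Elf64_Dyn`
array until the terminating `DT_NULL` tag. The following is a minimal sketch
using the definitions from `<elf.h>` on Linux; the helper names `find_dyn` and
`plt_reloc_count` are made up for this example, and the count assumes that
`DT_PLTREL` is `DT_RELA`.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Return the first entry with the given tag, or NULL if there is none.
 * The array is terminated by an entry with the tag DT_NULL. */
const Elf64_Dyn *find_dyn(const Elf64_Dyn *dyn, Elf64_Sxword tag) {
    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; ++dyn) {
        if (dyn->d_tag == tag) {
            return dyn;
        }
    }
    return NULL;
}

/* Number of PLT relocations: size of the table divided by the size of one
 * entry. Returns 0 if one of the tags is missing. */
uint64_t plt_reloc_count(const Elf64_Dyn *dyn) {
    const Elf64_Dyn *size = find_dyn(dyn, DT_PLTRELSZ);
    const Elf64_Dyn *ent  = find_dyn(dyn, DT_RELAENT);
    if (size == NULL || ent == NULL || ent->d_un.d_val == 0) {
        return 0;
    }
    return size->d_un.d_val / ent->d_un.d_val;
}
```

In a dynamically linked program the `_DYNAMIC` symbol could be passed as the
`dyn` argument, for example `plt_reloc_count(_DYNAMIC)`.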

We can use readelf to dump the `.dynamic` section. In the following snippet I
only kept the relevant entries:
```
# -d  dump .dynamic section
> readelf -d ./main

Dynamic section at offset 0x2e10 contains 24 entries:
  Tag        Type                         Name/Value
 0x0000000000000003 (PLTGOT)             0x404000
 0x0000000000000002 (PLTRELSZ)           48 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x4004d8
 0x0000000000000009 (RELAENT)            24 (bytes)
```

We can see that **PLTGOT** points to address **0x404000**, which is the address
of the GOT as we saw in the radare2 dump. We can also see that **JMPREL**
points to the relocation table at **0x4004d8**. Dividing **PLTRELSZ** by
**RELAENT** (48 / 24) tells us that we have 2 relocation entries, which are
exactly the ones for **puts** and **printf**.

## References
- [`man 5 elf`][man-elf]
- [Executable and Linking Format (ELF)][elf-1.2]
- [SystemV ABI 4.1][systemv-abi-4.1]
- [SystemV ABI 1.0 (x86_64)][systemv-abi-1.0-x86_64]
- [`man 1 readelf`][man-readelf]

[r2]: https://rada.re/n/radare2.html
[man-elf]: http://man7.org/linux/man-pages/man5/elf.5.html
[man-readelf]: http://man7.org/linux/man-pages/man1/readelf.1.html
[elf-1.2]: http://refspecs.linuxbase.org/elf/elf.pdf
[systemv-abi-4.1]: https://refspecs.linuxfoundation.org/elf/gabi41.pdf
[systemv-abi-1.0-x86_64]: https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf