OLD | NEW |
(Empty) | |
| 1 The design of crazy_linker: |
| 2 =========================== |
| 3 |
| 4 Introduction: |
| 5 ------------- |
| 6 |
| 7 A system linker (e.g. ld.so on Linux, or /system/bin/linker on Android) is a |
| 8 particularly sophisticated piece of code because it is used to load and start |
| 9 _executables_ on the system. This requires dealing with really low-level |
| 10 details like: |
| 11 |
| 12 - The way the kernel loads and initializes binaries into a new process. |
| 13 |
| 14 - The way it passes initialization data (e.g. command-line arguments) to |
| 15 the process being launched. |
| 16 |
| 17 - Setting up the C runtime library, thread-local storage, and others properly |
| 18 before calling main(). |
| 19 |
| 20 - Operating very carefully, because the linker will be used to load |
| 21 set-uid programs. |
| 22 |
| 23 - Supporting a flurry of exotic flags and environment variables that |
| 24 affect runtime behaviour in "interesting" but mostly unpredictable ways |
| 25 (see the manpages for dlopen, dlsym and ld.so for details). |
| 26 |
| 27 On top of this, most of the work must be done before the C library is loaded |
| 28 or initialized. No wonder this code is really complex. |
| 29 |
| 30 By contrast, crazy_linker is a static library whose only purpose is to load |
| 31 ELF shared libraries, inside an _existing_ executable process. This makes it |
| 32 considerably simpler: |
| 33 |
| 34 - The runtime environment (C library, libstdc++) is available and properly |
| 35 initialized. |
| 36 |
| 37 - No need to care about kernel interfaces. Everything uses mmap() and simple |
| 38 file accesses. |
| 39 |
| 40 - The API is simple and straightforward (no hidden behaviour changes due to |
| 41 environment variables). |
| 42 |
| 43 This document explains how the crazy_linker works. A good understanding of the |
| 44 ELF file format is recommended, though not necessary. |
| 45 |
| 46 |
| 47 I. ELF Loading Basics: |
| 48 ---------------------- |
| 49 |
| 50 When it comes to loading shared libraries, an ELF file consists mainly of the |
| 51 following parts: |
| 52 |
| 53 - A fixed-size header that identifies the file as an ELF file and gives |
| 54 offsets/sizes to other tables. |
| 55 |
| 56 - A table (called the "program header table"), containing entries describing |
| 57 'segments' of interest in the ELF file. |
| 58 |
| 59 - A table (called the "dynamic table"), containing entries describing |
| 60 properties of the ELF library. The most interesting entries list the |
| 61 libraries the current one depends on. |
| 62 |
| 63 - A table describing the symbols (function or global names) that the library |
| 64 references or exports. |
| 65 |
| 66 - One or more tables containing 'relocations'. Because libraries can be loaded |
| 67 at any page-aligned address in memory, numerical pointers they contain must |
| 68 be adjusted after load. That's what the relocation entries do. They can |
| 69 also reference symbols to be found in other libraries. |
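
As an illustration of the first item above, here is a minimal sketch of how the fixed-size header can be recognized. It relies only on the standard ELF identification bytes (the 0x7f 'E' 'L' 'F' magic, followed by a class byte where 1 means 32-bit and 2 means 64-bit). The names ElfClass and IdentifyElf are hypothetical and are not taken from the crazy_linker sources.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type, for illustration only.
enum class ElfClass { kInvalid, k32, k64 };

// Inspects the first bytes of a file image and reports whether it looks like
// a 32-bit or 64-bit ELF file.
ElfClass IdentifyElf(const uint8_t* data, size_t size) {
  // The identification block needs at least the 4 magic bytes plus the
  // class byte at index 4 (EI_CLASS).
  if (size < 5)
    return ElfClass::kInvalid;
  if (data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' || data[3] != 'F')
    return ElfClass::kInvalid;
  switch (data[4]) {
    case 1:
      return ElfClass::k32;  // ELFCLASS32
    case 2:
      return ElfClass::k64;  // ELFCLASS64
    default:
      return ElfClass::kInvalid;
  }
}
```

A real loader would go on to validate the machine type and the offsets of the other tables before trusting the file.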
| 70 |
| 71 The process of loading a given ELF shared library can be decomposed into 4 steps: |
| 72 |
| 73 1) Map loadable segments into memory. |
| 74 |
| 75 This step parses the program header table to identify 'loadable' segments, |
| 76 reserves the corresponding address space, then maps them directly into |
| 77 memory with mmap(). |
| 78 |
| 79 Related: src/crazy_linker_elf_loader.cpp |
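
The "reserve the corresponding address space" part of step 1 can be sketched as follows. This is not the code from src/crazy_linker_elf_loader.cpp: ProgramHeader is a simplified stand-in for the real Elf32_Phdr/Elf64_Phdr structures, the 4096-byte page size is an assumption, and ComputeLoadSize is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for an ELF program header entry.
struct ProgramHeader {
  uint32_t type;   // Segment type; 1 is PT_LOAD in the ELF specification.
  uint64_t vaddr;  // Segment start address, relative to the load address.
  uint64_t memsz;  // Segment size in memory, in bytes.
};

constexpr uint32_t kPtLoad = 1;
constexpr uint64_t kPageSize = 4096;  // Assumption: 4 KiB pages.

constexpr uint64_t PageStart(uint64_t addr) {
  return addr & ~(kPageSize - 1);
}

constexpr uint64_t PageEnd(uint64_t addr) {
  return PageStart(addr + kPageSize - 1);
}

// Returns the size in bytes of the page-aligned address range that covers
// every loadable segment, or 0 if there is no loadable segment.
uint64_t ComputeLoadSize(const std::vector<ProgramHeader>& phdrs) {
  uint64_t min_vaddr = UINT64_MAX;
  uint64_t max_vaddr = 0;
  bool found = false;
  for (const ProgramHeader& ph : phdrs) {
    if (ph.type != kPtLoad)
      continue;  // Only loadable segments occupy address space.
    found = true;
    if (ph.vaddr < min_vaddr)
      min_vaddr = ph.vaddr;
    if (ph.vaddr + ph.memsz > max_vaddr)
      max_vaddr = ph.vaddr + ph.memsz;
  }
  if (!found)
    return 0;
  return PageEnd(max_vaddr) - PageStart(min_vaddr);
}
```

A loader would typically reserve a range of this size with an anonymous mmap() and then map each segment from the file at its fixed position inside that range. The exact flags used by crazy_linker are not described in this document.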
| 80 |
| 81 |
| 82 2) Load library dependencies. |
| 83 |
| 84 This step parses the dynamic table to identify all the other shared |
| 85 libraries the current one depends on, then _recursively_ loads them. |
| 86 |
| 87 Related: src/crazy_linker_library_list.cpp |
| 88 (crazy::LibraryList::LoadLibrary()) |
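
A minimal sketch of the dependency scan in step 2, assuming the dynamic table and its string table are already mapped in memory. DynamicEntry is a simplified stand-in for Elf32_Dyn/Elf64_Dyn, and CollectNeeded is a hypothetical name. The tag values (0 for DT_NULL, 1 for DT_NEEDED) come from the ELF specification. The sketch uses std::vector and std::string for brevity, which the real implementation avoids.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for an ELF dynamic table entry.
struct DynamicEntry {
  int64_t tag;     // Entry kind (DT_NEEDED, DT_NULL, ...).
  uint64_t value;  // For DT_NEEDED: offset of the name in the string table.
};

constexpr int64_t kDtNull = 0;    // Marks the end of the dynamic table.
constexpr int64_t kDtNeeded = 1;  // Names one library dependency.

// Returns the names of all libraries listed as DT_NEEDED, in table order.
std::vector<std::string> CollectNeeded(const DynamicEntry* dyn,
                                       const char* strtab) {
  std::vector<std::string> needed;
  for (; dyn->tag != kDtNull; ++dyn) {
    if (dyn->tag == kDtNeeded)
      needed.push_back(strtab + dyn->value);
  }
  return needed;
}
```

The loader would then load each returned name in turn, which is where the recursion comes from.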
| 89 |
| 90 3) Apply all relocations. |
| 91 |
| 92 This step adjusts all pointers within the library for the actual load |
| 93 address. Relocations can also reference symbols that appear in other |
| 94 libraries loaded in step 2). |
| 95 |
| 96 Related: src/crazy_linker_elf_relocator.cpp |
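
The simplest kind of relocation in step 3 is the "relative" one, where the stored pointer is just the load address plus a constant. The sketch below shows that arithmetic on a byte buffer. Relocation is a simplified stand-in for the real ELF relocation records, the buffer stands in for the mapped library, and ApplyRelativeRelocations is a hypothetical name. Symbol-based relocations are not covered here.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-in for a 'relative' relocation record.
struct Relocation {
  uint64_t offset;  // Where to patch, relative to the start of the image.
  uint64_t addend;  // Link-time value of the pointer (load address of 0).
};

// Patches every relocated slot in |image| so that it holds
// |load_bias| + addend, i.e. the pointer value valid at the real address.
void ApplyRelativeRelocations(uint8_t* image,
                              uint64_t load_bias,
                              const std::vector<Relocation>& relocs) {
  for (const Relocation& reloc : relocs) {
    uint64_t value = load_bias + reloc.addend;
    // memcpy avoids unaligned or type-punned stores into the byte buffer.
    std::memcpy(image + reloc.offset, &value, sizeof(value));
  }
}
```

For example, with a load bias of 0x7000 and an addend of 0x20, the patched slot ends up holding 0x7020.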
| 97 |
| 98 4) Run constructors. |
| 99 |
| 100 Libraries include a list of functions to be run at load time, typically |
| 101 to perform static C++ initialization. |
| 102 |
| 103 Related: src/crazy_linker_shared_library.cpp |
| 104 (SharedLibrary::RunConstructors()) |
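
Step 4 amounts to walking an array of function pointers. The sketch below is an illustration, not the body of SharedLibrary::RunConstructors(): the function names are hypothetical, skipping null entries is an assumption, and running destructors in reverse order follows the usual ELF convention rather than anything stated in this document.

```cpp
#include <cstddef>

// Type of the functions stored in a library's constructor/destructor lists.
using InitFunction = void (*)();

// Calls every non-null entry of |functions|, first to last.
void RunConstructorList(const InitFunction* functions, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (functions[i])
      functions[i]();
  }
}

// Calls every non-null entry of |functions|, last to first, mirroring the
// order used for constructors.
void RunDestructorList(const InitFunction* functions, size_t count) {
  for (size_t i = count; i > 0; --i) {
    if (functions[i - 1])
      functions[i - 1]();
  }
}
```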
| 105 |
| 106 Unloading a library is similar, but in reverse order: |
| 107 |
| 108 1) Run destructors. |
| 109 2) Unload dependencies recursively. |
| 110 3) Unmap loadable segments. |
| 111 |
| 112 |
| 113 II. Managing the list of libraries: |
| 114 ----------------------------------- |
| 115 |
| 116 It is crucial to avoid loading the same library twice in the same process, |
| 117 because doing so results in undefined behaviour. |
| 118 |
| 119 This implies that, inside an Android application process, all system libraries |
| 120 should be loaded by the system linker (because otherwise, the Dalvik-based |
| 121 framework might load the same library on demand, at an unpredictable time). |
| 122 |
| 123 To handle this, the crazy_linker uses a custom class (crazy::LibraryList) where |
| 124 each entry (crazy::LibraryView) is reference-counted, and references either: |
| 125 |
| 126 - An application shared library, loaded by the crazy_linker itself. |
| 127 - A system shared library, loaded through the system dlopen(). |
| 128 |
| 129 Libraries loaded by the crazy_linker are modelled by a crazy::SharedLibrary |
| 130 object. The source code comments often refer to these objects as |
| 131 "crazy libraries", as opposed to "system libraries". |
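
The reference-counting idea can be sketched as follows. This is a toy model, not crazy::LibraryList: the class name and methods are hypothetical, entries are reduced to a name and a count, and std::map is used for brevity even though the real implementation avoids the STL.

```cpp
#include <map>
#include <string>

// Toy model of a list of reference-counted library entries, keyed by name.
class LibraryListSketch {
 public:
  // Records one more user of |name| and returns the new reference count.
  // A real loader would only map the library when the count goes 0 -> 1.
  int Load(const std::string& name) { return ++ref_counts_[name]; }

  // Drops one reference to |name|. Returns the remaining count, or -1 if
  // the library is not in the list. The entry is removed when the count
  // reaches 0, which is when a real loader would unmap the library.
  int Unload(const std::string& name) {
    auto it = ref_counts_.find(name);
    if (it == ref_counts_.end())
      return -1;
    int remaining = --it->second;
    if (remaining == 0)
      ref_counts_.erase(it);
    return remaining;
  }

  bool IsLoaded(const std::string& name) const {
    return ref_counts_.count(name) != 0;
  }

 private:
  std::map<std::string, int> ref_counts_;
};
```

Loading 'libfoo.so' twice and unloading it once leaves a single entry with a count of 1, so the library is never mapped a second time.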
| 132 |
| 133 As an example, here's a diagram that shows the list after loading a library |
| 134 'libfoo.so' that depends on the system libraries 'libc.so', 'libm.so' and |
| 135 'libOpenSLES.so'. |
| 136 |
| 137     +-------------+ |
| 138     | LibraryList | |
| 139     +-------------+ |
| 140            | |
| 141            |    +-------------+ |
| 142            +----| LibraryView | ----> libc.so |
| 143            |    +-------------+ |
| 144            | |
| 145            |    +-------------+ |
| 146            +----| LibraryView | ----> libm.so |
| 147            |    +-------------+ |
| 148            | |
| 149            |    +-------------+ |
| 150            +----| LibraryView | ----> libOpenSLES.so |
| 151            |    +-------------+ |
| 152            | |
| 153            |    +-------------+      +-------------+ |
| 154            +----| LibraryView |----->|SharedLibrary| ---> libfoo.so |
| 155            |    +-------------+      +-------------+ |
| 156            | |
| 157           ___ |
| 158            _ |
| 159 |
| 160 System libraries are identified by name. Only the official NDK system |
| 161 libraries are listed. It is likely that using crazy_linker to load non-NDK |
| 162 system libraries will not work correctly, so don't do it. |
| 163 |
| 164 |
| 165 III. Wrapping of linker symbols within crazy ones: |
| 166 -------------------------------------------------- |
| 167 |
| 168 Libraries loaded by the crazy linker are not visible to the system linker. |
| 169 |
| 170 This means that directly calling the system dlopen() or dlsym() from code in a |
| 171 library loaded by the crazy_linker will not work properly. |
| 172 |
| 173 To work around this, crazy_linker redirects all linker symbols to its own |
| 174 wrapper implementation. This redirection happens transparently. |
| 175 |
| 176 Related: src/crazy_linker_wrappers.cpp |
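
One way to picture the redirection: when symbols are resolved for a crazy library, the name is first checked against a small table of wrappers, and only unknown names fall through to the normal lookup. The sketch below illustrates that idea only. It is not the code from src/crazy_linker_wrappers.cpp, and the Wrap* stubs, WrapperEntry and FindWrapper are hypothetical names.

```cpp
#include <cstring>

// Hypothetical wrapper stubs. Real wrappers would search the crazy
// libraries first and fall back to the system linker.
void* WrapDlopen(const char* /*path*/, int /*mode*/) { return nullptr; }
void* WrapDlsym(void* /*handle*/, const char* /*name*/) { return nullptr; }
int WrapDlclose(void* /*handle*/) { return 0; }

struct WrapperEntry {
  const char* name;  // Linker symbol name as seen by the loaded library.
  void* address;     // Address of the replacement function.
};

const WrapperEntry kWrappers[] = {
    {"dlopen", reinterpret_cast<void*>(&WrapDlopen)},
    {"dlsym", reinterpret_cast<void*>(&WrapDlsym)},
    {"dlclose", reinterpret_cast<void*>(&WrapDlclose)},
};

// Returns the wrapper address for |symbol_name|, or nullptr if the symbol
// is not a wrapped linker symbol and must be resolved the normal way.
void* FindWrapper(const char* symbol_name) {
  for (const WrapperEntry& entry : kWrappers) {
    if (std::strcmp(entry.name, symbol_name) == 0)
      return entry.address;
  }
  return nullptr;
}
```

Because the check happens during symbol resolution, the loaded library needs no source changes, which is what makes the redirection transparent.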
| 177 |
| 178 This also includes a few "hidden" dynamic linker symbols which are used for |
| 179 stack-unwinding. This guarantees that C++ exception propagation works. |
| 180 |
| 181 |
| 182 IV. GDB support: |
| 183 ---------------- |
| 184 |
| 185 The crazy_linker contains support code to ensure that libraries loaded with it |
| 186 are visible through GDB at runtime. For more details, see the extensive comments |
| 187 in src/crazy_linker_rdebug.h. |
| 188 |
| 189 |
| 190 V. Other Implementation details: |
| 191 -------------------------------- |
| 192 |
| 193 The crazy_linker is written in C++, but its API is completely C-based. |
| 194 |
| 195 The implementation doesn't require any C++ STL feature (except for new |
| 196 and delete). |
| 197 |
| 198 Very little of the code is actually Android-specific. The target system's |
| 199 bitness is abstracted through a C++ traits class (see src/elf_traits.h). |
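
A traits class for bitness could look like the sketch below. The actual contents of src/elf_traits.h are not shown in this document, so every name here (ElfTraitsSketch, Addr, Off, kWordSize, NativeTraits) is an assumption made for illustration.

```cpp
#include <cstdint>

// Primary template is left undefined, so only 32 and 64 are accepted.
template <int kBits>
struct ElfTraitsSketch;

// Types used when targeting a 32-bit system.
template <>
struct ElfTraitsSketch<32> {
  using Addr = uint32_t;  // Address-sized integer.
  using Off = uint32_t;   // File offset.
  static constexpr int kWordSize = 4;
};

// Types used when targeting a 64-bit system.
template <>
struct ElfTraitsSketch<64> {
  using Addr = uint64_t;
  using Off = uint64_t;
  static constexpr int kWordSize = 8;
};

// Selects the specialization matching the pointer size of the current build.
using NativeTraits = ElfTraitsSketch<sizeof(void*) * 8>;
```

Code written against NativeTraits::Addr then compiles unchanged for both 32-bit and 64-bit targets.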
| 200 |
| 201 The code was originally written for Chrome, so it follows the Chromium coding |
| 202 style, which can be enforced by using the 'clang-format' tool with: |
| 203 |
| 204 cd /path/to/crazy_linker/ |
| 205 find . -name "*.h" -o -name "*.cpp" | xargs clang-format -style=Chromium -i |