| OLD | NEW |
| (Empty) |
| 1 Subzero - Fast code generator for PNaCl bitcode | |
| 2 =============================================== | |
| 3 | |
| 4 Design | |
| 5 ------ | |
| 6 | |
| 7 See the accompanying DESIGN.rst file for a more detailed technical overview of | |
| 8 Subzero. | |
| 9 | |
| 10 Building | |
| 11 -------- | |
| 12 | |
| 13 Subzero is set up to be built within the Native Client tree. Follow the | |
| 14 `Developing PNaCl | |
| 15 <https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_ | |
| 16 instructions, in particular the section on building PNaCl sources. This will | |
| 17 prepare the necessary external headers and libraries that Subzero needs. | |
| 18 Checking out the Native Client project also gets the pre-built clang and LLVM | |
| 19 tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which | |
| 20 are used for building Subzero. | |
| 21 | |
| 22 The Subzero source is in ``native_client/toolchain_build/src/subzero``. From | |
| 23 within that directory, ``git checkout master && git pull`` to get the latest | |
| 24 version of Subzero source code. | |
| 25 | |
| 26 The Makefile is designed to be used as part of the higher level LLVM build | |
| 27 system. To build manually, use ``Makefile.standalone``. Several build | |
| 28 configurations are available from the command line:: | |
| 29 | |
| 30 make -f Makefile.standalone | |
| 31 make -f Makefile.standalone DEBUG=1 | |
| 32 make -f Makefile.standalone NOASSERT=1 | |
| 33 make -f Makefile.standalone DEBUG=1 NOASSERT=1 | |
| 34 make -f Makefile.standalone MINIMAL=1 | |
| 35 make -f Makefile.standalone ASAN=1 | |
| 36 make -f Makefile.standalone TSAN=1 | |
| 37 | |
| 38 ``DEBUG=1`` builds without optimizations and is good when running the translator | |
| 39 inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred | |
| 40 configuration for performance testing the translator. ``MINIMAL=1`` attempts to | |
| 41 minimize the size of the translator by compiling out everything unnecessary. | |
| 42 ``ASAN=1`` enables AddressSanitizer, and ``TSAN=1`` enables ThreadSanitizer. | |
| 43 | |
| 44 The result of the ``make`` command is the target ``pnacl-sz`` in the current | |
| 45 directory. | |
| 46 | |
| 47 ``pnacl-sz`` | |
| 48 ------------ | |
| 49 | |
| 50 The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it | |
| 51 into ICE (Subzero's intermediate representation). It then invokes the ICE | |
| 52 translate method to lower it to target-specific machine code, optionally dumping | |
| 53 the intermediate representation at various stages of the translation. | |
| 54 | |
| 55 The program can be run as follows:: | |
| 56 | |
| 57 ../pnacl-sz ./path/to/<file>.pexe | |
| 58 ../pnacl-sz ./tests_lit/pnacl-sz_tests/<file>.ll | |
| 59 | |
| 60 At this time, ``pnacl-sz`` accepts a number of arguments, including the | |
| 61 following: | |
| 62 | |
| 63 ``-help`` -- Show available arguments and possible values. (Note: this | |
| 64 unfortunately also pulls in some LLVM-specific options that are reported but | |
| 65 that Subzero doesn't use.) | |
| 66 | |
| 67 ``-notranslate`` -- Suppress the ICE translation phase, which is useful if | |
| 68 ICE is missing some support. | |
| 69 | |
| 70 ``-target=<TARGET>`` -- Set the target architecture. The default is x8632. | |
| 71 Future targets include x8664, arm32, and arm64. | |
| 72 | |
| 73 ``-filetype=obj|asm|iasm`` -- Select the output file type. ``obj`` is a | |
| 74 native ELF file, ``asm`` is a textual assembly file, and ``iasm`` is a | |
| 75 low-level textual assembly file demonstrating the integrated assembler. | |
| 76 | |
| 77 ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``, | |
| 78 ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and | |
| 79 represent the minimum optimization and worst code quality, but fastest code | |
| 80 generation. | |
| 81 | |
| 82 ``-verbose=<list>`` -- Set verbosity flags. This argument allows a | |
| 83 comma-separated list of values. The default is ``none``, and the value | |
| 84 ``inst,pred`` will roughly match the .ll bitcode file. Of particular use | |
| 85 are ``all``, ``most``, and ``none``. | |
| 86 | |
| 87 ``-o <FILE>`` -- Set the assembly output file name. Default is stdout. | |
| 88 | |
| 89 ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is | |
| 90 controlled by ``-verbose``). Default is stdout. | |
| 91 | |
| 92 ``-timing`` -- Dump some pass timing information after translating the input | |
| 93 file. | |
| 94 | |
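As a rough illustration of how the options above combine, the following sketch assembles a ``pnacl-sz`` argument list. The helper name ``build_pnacl_sz_command``, its defaults, the file names, and the choice to place options before the input file are assumptions made for this example; only the flags themselves come from the list above.

```python
import shlex


def build_pnacl_sz_command(input_path, target="x8632", filetype="asm",
                           opt_level="2", verbose="none", output=None,
                           binary="../pnacl-sz"):
    """Assemble an argument list from the options documented above.

    Hypothetical helper: it only builds the list and never runs pnacl-sz.
    """
    args = [
        binary,
        f"-target={target}",      # default target is x8632
        f"-filetype={filetype}",  # obj, asm, or iasm
        f"-O{opt_level}",         # 2, 1, 0, -1, or m1
        f"-verbose={verbose}",    # comma-separated list, e.g. inst,pred
    ]
    if output is not None:
        args += ["-o", output]    # otherwise output goes to stdout
    args.append(input_path)
    return args


# Example file names are placeholders, not files shipped with Subzero.
cmd = build_pnacl_sz_command("./path/to/example.pexe",
                             opt_level="m1", output="example.s")
print(shlex.join(cmd))
```

The resulting list can be handed to ``subprocess.run`` once ``pnacl-sz`` has been built.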
| 95 Running the test suite | |
| 96 ---------------------- | |
| 97 | |
| 98 Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which | |
| 99 lives in ``tests_lit``. To execute the test suite, first build Subzero, and then | |
| 100 run:: | |
| 101 | |
| 102 make -f Makefile.standalone check-lit | |
| 103 | |
| 104 There is also a suite of cross tests in the ``crosstest`` directory. A cross | |
| 105 test takes a test bitcode file implementing some unit tests, and translates it | |
| 106 twice, once with Subzero and once with LLVM's known-good ``llc`` translator. | |
| 107 The Subzero-translated symbols are specially mangled to avoid multiple | |
| 108 definition errors from the linker. Both translated versions are linked together | |
| 109 with a driver program that calls each version of each unit test with a variety | |
| 110 of interesting inputs and compares the results for equality. The cross tests | |
| 111 are currently invoked by running:: | |
| 112 | |
| 113 make -f Makefile.standalone check-xtest | |
| 114 | |
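The idea behind a cross test can be sketched in a few lines. This is a toy model under stated assumptions, not the actual ``crosstest`` driver: the two "translations" are ordinary Python functions, and ``cross_check``, ``add32_reference``, ``add32_buggy``, and ``INTERESTING`` are names invented for the example.

```python
import itertools


def cross_check(reference, candidate, operand_values, arity=2):
    """Call both versions on every operand combination and collect mismatches."""
    mismatches = []
    for args in itertools.product(operand_values, repeat=arity):
        expected = reference(*args)
        actual = candidate(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


def add32_reference(a, b):
    # Stands in for the known-good translation: 32-bit wrapping add.
    return (a + b) & 0xFFFFFFFF


def add32_buggy(a, b):
    # Stands in for a faulty translation: forgets to wrap to 32 bits.
    return a + b


# Boundary values tend to expose bugs that typical values miss.
INTERESTING = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

for args, expected, actual in cross_check(add32_reference, add32_buggy,
                                          INTERESTING):
    print(f"mismatch for {args}: expected {expected:#x}, got {actual:#x}")
```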
| 115 Similarly, there is a suite of unit tests:: | |
| 116 | |
| 117 make -f Makefile.standalone check-unit | |
| 118 | |
| 119 A convenient way to run the lit, cross, and unit tests is:: | |
| 120 | |
| 121 make -f Makefile.standalone check | |
| 122 | |
| 123 Assembling ``pnacl-sz`` output as needed | |
| 124 ---------------------------------------- | |
| 125 | |
| 126 ``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``. | |
| 127 | |
| 128 ``pnacl-sz`` can also produce textual assembly code in a structure suitable for | |
| 129 input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object | |
| 130 file can then be produced using the command:: | |
| 131 | |
| 132 llvm-mc -triple=i686 -filetype=obj -o=MyObj.o | |
| 133 | |
| 134 Building a translated binary | |
| 135 ---------------------------- | |
| 136 | |
| 137 There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe | |
| 138 into a fully linked executable. Run it with ``-help`` for extensive | |
| 139 documentation. | |
| 140 | |
| 141 By default, ``szbuild.py`` builds an executable using only Subzero translation, | |
| 142 but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is | |
| 143 the name of the LLVM translator) for bisection-based debugging. In bisection | |
| 144 debugging mode, the pexe is translated using both Subzero and ``llc``, and the | |
| 145 resulting object files are combined into a single executable using symbol | |
| 146 weakening and other linker tricks to control which Subzero symbols and which | |
| 147 ``llc`` symbols take precedence. This is controlled by the ``-include`` and | |
| 148 ``-exclude`` arguments. These can be used to quickly isolate a single function | |
| 149 that Subzero translates incorrectly, leading to incorrect output. | |
| 150 | |
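The bisection strategy described above can be modeled as a binary search over function names. This is a minimal sketch, assuming exactly one function is mistranslated and that a fully Subzero-translated binary fails. The oracle ``passes_with_subzero`` is hypothetical: in practice it would build a hybrid binary with ``szbuild.py`` and run it, which this example does not attempt.

```python
def find_bad_function(functions, passes_with_subzero):
    """Binary-search for the single function whose Subzero translation is bad.

    passes_with_subzero(subset) is a hypothetical oracle. It should return
    True when a hybrid binary that takes only `subset` from Subzero (and
    everything else from llc) produces correct output.
    `functions` must be non-empty.
    """
    candidates = list(functions)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if passes_with_subzero(first_half):
            # The first half is clean, so the culprit is in the rest.
            candidates = candidates[mid:]
        else:
            candidates = first_half
    return candidates[0]


# Simulated oracle for demonstration: pretend "f5" is mistranslated.
all_functions = [f"f{i}" for i in range(10)]
culprit = find_bad_function(all_functions, lambda subset: "f5" not in subset)
print(culprit)
```

Each step halves the candidate set, so the number of hybrid builds grows with the logarithm of the number of functions.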
| 151 There is another helper script, ``pydir/szbuild_spec2k.py``, that runs | |
| 152 ``szbuild.py`` on one or more components of the Spec2K suite. This assumes that | |
| 153 Spec2K is set up in the usual place in the Native Client tree, and the finalized | |
| 154 pexe files have been built. (Note: for working with Spec2K and other pexes, | |
| 155 it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the | |
| 156 original function and global variable names.) | |
| 157 | |
| 158 Status | |
| 159 ------ | |
| 160 | |
| 161 Subzero currently fully supports the x86-32 architecture, for both native and | |
| 162 Native Client sandboxing modes. The x86-64 architecture is also supported in | |
| 163 native mode only, and only for the x32 flavor, because pointers and | |
| 164 32-bit integers are indistinguishable in PNaCl bitcode. Sandboxing support for | |
| 165 x86-64 is in progress. ARM and MIPS support is in progress. Two optimization | |
| 166 levels, ``-Om1`` and ``-O2``, are implemented. | |
| 167 | |
| 168 The ``-Om1`` configuration is designed to be the simplest and fastest possible, | |
| 169 with a minimal set of passes and transformations. | |
| 170 | |
| 171 * Simple Phi lowering before target lowering, by generating temporaries and | |
| 172 adding assignments to the end of predecessor blocks. | |
| 173 | |
| 174 * Simple register allocation limited to pre-colored or infinite-weight | |
| 175 Variables. | |
| 176 | |
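The simple Phi lowering mentioned above can be illustrated on a toy IR. This sketch is not Subzero's implementation: the block layout (a dict with ``phis`` and ``insts``, with terminators left out), the string instructions, and the ``_phi`` naming scheme are all assumptions made for the example.

```python
def lower_phis_simple(blocks):
    """Replace each phi with a fresh temporary.

    blocks maps a block name to {"phis": [(dest, {pred_name: src})],
    "insts": [str, ...]}. Terminators are assumed to be stored elsewhere,
    so appending to "insts" places code at the end of a block.
    """
    counter = 0
    for block in blocks.values():
        head = []
        for dest, incoming in block["phis"]:
            tmp = f"{dest}_phi{counter}"
            counter += 1
            # Each predecessor writes its incoming value to the temporary.
            for pred, src in incoming.items():
                blocks[pred]["insts"].append(f"{tmp} = {src}")
            # The phi itself becomes a plain copy at the top of the block.
            head.append(f"{dest} = {tmp}")
        block["insts"] = head + block["insts"]
        block["phis"] = []
    return blocks


cfg = {
    "left": {"phis": [], "insts": ["y = 7"]},
    "right": {"phis": [], "insts": []},
    "merge": {"phis": [("x", {"left": "1", "right": "y"})],
              "insts": ["ret x"]},
}
lower_phis_simple(cfg)
for name, block in cfg.items():
    print(name, block["insts"])
```

Going through a temporary keeps the lowering correct even when several phis in one block read each other's destinations, at the cost of extra copies.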
| 177 The ``-O2`` configuration is designed to use all optimizations available and | |
| 178 produce the best code. | |
| 179 | |
| 180 * Address mode inference to leverage the complex x86 addressing modes. | |
| 181 | |
| 182 * Compare/branch fusing based on liveness/last-use analysis. | |
| 183 | |
| 184 * Global, linear-scan register allocation. | |
| 185 | |
| 186 * Advanced phi lowering after target lowering and global register allocation, | |
| 187 via edge splitting, topological sorting of the parallel moves, and final local | |
| 188 register allocation. | |
| 189 | |
| 190 * Stack slot coalescing to reduce frame size. | |
| 191 | |
| 192 * Branch optimization to reduce the number of branches to the following block. | |
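The topological sorting of parallel moves used in advanced phi lowering can be sketched as follows. This is a generic textbook-style ordering, not Subzero's code: the function name ``sequentialize``, the dict representation of moves, and the scratch name ``tmp`` (assumed not to collide with any real variable) are assumptions for illustration.

```python
def sequentialize(moves, temp="tmp"):
    """Order parallel moves {dest: src} so sequential execution is equivalent.

    A move is emitted only once no pending move still reads its destination.
    When only cycles remain, one destination is saved in `temp` first.
    """
    pending = {d: s for d, s in moves.items() if d != s}
    ordered = []
    while pending:
        still_read = set(pending.values())
        ready = [d for d in pending if d not in still_read]
        if ready:
            dest = ready[0]
            ordered.append((dest, pending.pop(dest)))
        else:
            # Only cycles are left: save one value and redirect its reader.
            dest = next(iter(pending))
            ordered.append((temp, dest))
            for other in list(pending):
                if pending[other] == dest:
                    pending[other] = temp
    return ordered


# Parallel semantics: a and b swap, and c receives the old value of a.
env = {"a": 1, "b": 2, "c": 3}
for dest, src in sequentialize({"a": "b", "b": "a", "c": "a"}):
    env[dest] = env[src]
print(env["a"], env["b"], env["c"])
```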