| OLD | NEW |
|---|---|
| 1 Subzero - Fast code generator for PNaCl bitcode | 1 Subzero - Fast code generator for PNaCl bitcode |
| 2 =============================================== | 2 =============================================== |
| 3 | 3 |
| 4 Building | 4 Building |
| 5 -------- | 5 -------- |
| 6 | 6 |
| 7 You must have LLVM trunk source code available and built. See | 7 Subzero is set up to be built within the Native Client tree. Follow the |
| 8 http://llvm.org/docs/GettingStarted.html#getting-started-quickly-a-summary for | 8 instructions at |
| 9 guidance. | 9 https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl |
| JF (2014/11/18 15:57:08): follow `the instructions < https://sites.google.co | |
| Jim Stichnoth (2014/11/18 16:18:56): Done. | |
| 10 and in particular the section on building PNaCl sources. This will prepare the | |
| 11 necessary external headers and libraries that Subzero needs. Checking out the | |
| 12 Native Client project also gets the pre-built clang and LLVM tools in | |
| 13 ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which are used | |
| 14 for building Subzero. | |
| 10 | 15 |
| 11 Set variables ``LLVM_SRC_PATH`` and ``LLVM_BIN_PATH`` to point to the | 16 The Subzero source is in ``native_client/toolchain_build/src/subzero``. From |
| 12 appropriate directories in the LLVM source and build directories. These can be | 17 within that directory, ``git checkout master && git pull`` to get the latest |
| 13 set as environment variables, or you can modify the top-level Makefile. | 18 version of Subzero source code. |
| 14 | 19 |
| 15 Run ``make`` at the top level to build the main target ``llvm2ice``. | 20 The Makefile is designed to be used as part of the higher level LLVM build |
| 21 system. To build manually, use the ``Makefile.standalone``. There are several | |
| 22 build configurations from the command line:: | |
| 23 | |
| 24 make -f Makefile.standalone | |
| 25 make -f Makefile.standalone DEBUG=1 | |
| 26 make -f Makefile.standalone NOASSERT=1 | |
| 27 make -f Makefile.standalone DEBUG=1 NOASSERT=1 | |
| 28 make -f Makefile.standalone MINIMAL=1 | |
| 29 | |
| 30 ``DEBUG=1`` builds without optimizations and is good when running the translator | |
| 31 inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred | |
| 32 configuration for performance testing the translator. ``MINIMAL=1`` attempts to | |
| 33 minimize the size of the translator by compiling out everything unnecessary. | |
| 34 | |
| 35 The result of the ``make`` command is the target ``llvm2ice`` in the current | |
| 36 directory. | |
| 16 | 37 |
| 17 ``llvm2ice`` | 38 ``llvm2ice`` |
| 18 ------------ | 39 ------------ |
| 19 | 40 |
| 20 The ``llvm2ice`` program uses the LLVM infrastructure to parse an LLVM bitcode | 41 The ``llvm2ice`` program parses a pexe or an LLVM bitcode file and translates it |
| 21 file and translate it into ICE. It then invokes ICE's translate method to lower | 42 into ICE. It then invokes ICE's translate method to lower it to target-specific |
| JF (2014/11/18 15:57:09): Define ICE | |
| Jim Stichnoth (2014/11/18 16:18:57): Done. | |
| 22 it to target-specific machine code, dumping the IR at various stages of the | 43 machine code, optionally dumping the IR at various stages of the translation. |
| 23 translation. | |
| 24 | 44 |
| 25 The program can be run as follows:: | 45 The program can be run as follows:: |
| 26 | 46 |
| 27 ../llvm2ice ./ir_samples/<file>.ll | 47 ../llvm2ice ./path/to/<file>.pexe |
| 28 ../llvm2ice ./tests_lit/llvm2ice_tests/<file>.ll | 48 ../llvm2ice ./tests_lit/llvm2ice_tests/<file>.ll |
| 29 | 49 |
| 30 At this time, ``llvm2ice`` accepts a few arguments: | 50 At this time, ``llvm2ice`` accepts a number of arguments, including the |
| 51 following: | |
| 31 | 52 |
| 32 ``-help`` -- Show available arguments and possible values. | 53 ``-help`` -- Show available arguments and possible values. (Note: this |
| 54 unfortunately also pulls in some LLVM-specific options that are reported but | |
| 55 that Subzero doesn't use.) | |
| 33 | 56 |
| 34 ``-notranslate`` -- Suppress the ICE translation phase, which is useful if | 57 ``-notranslate`` -- Suppress the ICE translation phase, which is useful if |
| 35 ICE is missing some support. | 58 ICE is missing some support. |
| 36 | 59 |
| 37 ``-target=<TARGET>`` -- Set the target architecture. The default is x8632. | 60 ``-target=<TARGET>`` -- Set the target architecture. The default is x8632. |
| 38 Future targets include x8664, arm32, and arm64. | 61 Future targets include x8664, arm32, and arm64. |
| 39 | 62 |
| 63 ``-integrated-as=0|1`` -- Disable/enable the integrated assembler. | |
| 64 | |
| 40 ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``, | 65 ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``, |
| 41 ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and | 66 ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and |
| 42 represent the minimum optimization and worst code quality, but fastest code | 67 represent the minimum optimization and worst code quality, but fastest code |
| 43 generation. | 68 generation. |
| 44 | 69 |
| 45 ``-verbose=<list>`` -- Set verbosity flags. This argument allows a | 70 ``-verbose=<list>`` -- Set verbosity flags. This argument allows a |
| 46 comma-separated list of values. The default is ``none``, and the value | 71 comma-separated list of values. The default is ``none``, and the value |
| 47 ``inst,pred`` will roughly match the .ll bitcode file. Of particular use | 72 ``inst,pred`` will roughly match the .ll bitcode file. Of particular use |
| 48 are ``all`` and ``none``. | 73 are ``all`` and ``none``. |
| 49 | 74 |
| 50 ``-o <FILE>`` -- Set the assembly output file name. Default is stdout. | 75 ``-o <FILE>`` -- Set the assembly output file name. Default is stdout. |
| 51 | 76 |
| 52 ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is | 77 ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is |
| 53 controlled by ``-verbose``). Default is stdout. | 78 controlled by ``-verbose``). Default is stdout. |
| 54 | 79 |
| 55 See ir_samples/README.rst for more details. | 80 ``-timing`` -- Dump some pass timing information after translating the input |
| 81 file. | |
| 56 | 82 |
| 57 Running the test suite | 83 Running the test suite |
| 58 ---------------------- | 84 ---------------------- |
| 59 | 85 |
| 60 Subzero uses the LLVM ``lit`` testing tool for its test suite, which lives in | 86 Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which |
| 61 ``tests_lit``. To execute the test suite, first build Subzero, and then run:: | 87 lives in ``tests_lit``. To execute the test suite, first build Subzero, and then |
| 88 run:: | |
| 62 | 89 |
| 63 python <path_to_lit.py> -sv tests_lit | 90 make -f Makefile.standalone check-lit |
| 64 | 91 |
| 65 ``path_to_lit`` is the direct path to the lit script in the LLVM source | 92 There is also a suite of cross tests in the ``crosstest`` directory. A cross |
| 66 (``$LLVM_SRC_PATH/utils/lit/lit.py``). | 93 test takes a test bitcode file implementing some unit tests, and translates it |
| 94 twice, once with Subzero and once with LLVM's known-good ``llc`` translator. | |
| 95 The Subzero-translated symbols are specially mangled to avoid multiple | |
| 96 definition errors from the linker. Both translated versions are linked together | |
| 97 with a driver program that calls each version of each unit test with a variety | |
| 98 of interesting inputs and compares the results for equality. The cross tests | |
| 99 are currently invoked by running the ``runtests.sh`` script. | |
| 67 | 100 |
| 68 The above ``lit`` execution also needs the LLVM binary path in the | 101 A convenient way to run both the lit tests and the cross tests is:: |
| 69 ``LLVM_BIN_PATH`` env var. | |
| 70 | 102 |
| 71 Assuming the LLVM paths are set up, ``make check`` is a convenient way to run | 103 make -f Makefile.standalone check |
| 72 the test suite. | |
| 73 | 104 |
| 74 Assembling ``llvm2ice`` output | 105 Assembling ``llvm2ice`` output |
| 75 ------------------------------ | 106 ------------------------------ |
| 76 | 107 |
| 77 Currently ``llvm2ice`` produces textual assembly code in a structure suitable | 108 Currently ``llvm2ice`` produces textual assembly code in a structure suitable |
| 78 for input to ``llvm-mc`` and currently using "intel" assembly syntax. The first | 109 for input to ``llvm-mc``. An object file can be produced using the command:: |
| 79 line of output is a convenient comment indicating how to pipe the output to | 110 |
| 80 ``llvm-mc`` to produce object code. | 111 llvm-mc -arch=x86 -filetype=obj -o=MyObj.o |
| 112 | |
| 113 In the future, the integrated assembler will directly produce ELF object files. | |
| 114 | |
| 115 Building a translated binary | |
| 116 ---------------------------- | |
| 117 | |
| 118 There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe | |
| 119 into a fully linked executable. Run it with ``-help`` for extensive | |
| 120 documentation. | |
| 121 | |
| 122 By default, ``szbuild.py`` builds an executable using only Subzero translation, | |
| 123 but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is | |
| 124 the name of the LLVM translator) for bisection-based debugging. In bisection | |
| 125 debugging mode, the pexe is translated using both Subzero and ``llc``, and the | |
| 126 resulting object files are combined into a single executable using symbol | |
| 127 weakening and other linker tricks to control which Subzero symbols and which | |
| 128 ``llc`` symbols take precedence. This is controlled by the ``-include`` and | |
| 129 ``-exclude`` arguments. These can be used to rapidly find a single function | |
| 130 that Subzero translates incorrectly leading to incorrect output. | |
| 131 | |
| 132 There is another helper script, ``pydir/szbuild_spec2k.py``, that runs | |
| 133 ``szbuild.py`` on one or more components of the Spec2K suite. This assumes that | |
| 134 Spec2K is set up in the usual place in the Native Client tree, and the finalized | |
| 135 pexe files have been built. | |
| 136 | |
| 137 Status | |
| 138 ------ | |
| 139 | |
| 140 Subzero currently translates only for the x86-32 architecture. Native Client | |
| 141 sandboxing is not yet implemented. Two optimization levels, ``-Om1`` and | |
| 142 ``-O2``, are implemented. | |
| 143 | |
| 144 The ``-Om1`` configuration is designed to be the simplest and fastest possible, | |
| 145 with a minimal set of passes and transformations. | |
| 146 | |
| 147 * Simple Phi lowering before target lowering, by generating temporaries and | |
| 148 adding assignments to the end of predecessor blocks. | |
| 149 | |
| 150 * Simple register allocation limited to pre-colored and infinite-weight | |
| 151 Variables. | |
| 152 | |
| 153 The ``-O2`` configuration is designed to use all optimizations available and | |
| 154 produce the best code. | |
| 155 | |
| 156 * Address mode inference to leverage the complex x86 addressing modes. | |
| 157 | |
| 158 * Compare/branch fusing based on liveness/last-use analysis. | |
| 159 | |
| 160 * Global, linear-scan register allocation. | |
| 161 | |
| 162 * Advanced phi lowering after target lowering and global register allocation, | |
| 163 via edge splitting, topological sorting of the parallel moves, and final local | |
| 164 register allocation. | |
| 165 | |
| 166 * Stack slot coalescing to reduce frame size. | |
| 167 | |
| 168 * Branch optimization to reduce the number of branches to the following block. | |
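
The ``-O2`` list in the NEW column mentions advanced phi lowering "via edge splitting, topological sorting of the parallel moves". As a rough illustration of that general idea only (this page does not show Subzero's code, so nothing here should be read as its actual implementation), the Python sketch below turns a set of parallel moves into a safe sequential order. The names ``sequentialize`` and ``run``, the scratch location ``tmp``, and the dict-based representation are all hypothetical choices made for this example.

```python
def sequentialize(moves, temp="tmp"):
    """Order parallel moves (dest <- src) into sequential assignments.

    `moves` maps each destination name to its source name. All moves are
    meant to happen "at once", so a destination must not be overwritten
    while another pending move still needs to read its old value.
    `temp` is an assumed scratch location used to break cycles.
    """
    # Moves of the form x <- x do nothing and can be dropped.
    pending = {d: s for d, s in moves.items() if d != s}
    ordered = []
    while pending:
        sources = set(pending.values())
        # A move is safe to emit once no pending move reads its destination.
        ready = [d for d in pending if d not in sources]
        if ready:
            for d in ready:
                ordered.append((d, pending.pop(d)))
        else:
            # Only cycles remain. Save one destination in the scratch
            # location and redirect its reader there, which breaks the cycle.
            d = next(iter(pending))
            ordered.append((temp, d))
            for k, s in pending.items():
                if s == d:
                    pending[k] = temp
    return ordered


def run(ordered, env):
    """Apply the sequential assignments to a copy of `env` and return it."""
    env = dict(env)
    for dest, src in ordered:
        env[dest] = env[src]
    return env


if __name__ == "__main__":
    # a and b swap, while c also wants the old value of a.
    moves = {"a": "b", "b": "a", "c": "a"}
    print(run(sequentialize(moves), {"a": 1, "b": 2, "c": 3}))
```

The sketch emits moves whose destination is no longer needed first (the topological part) and falls back to a scratch location only when the remaining moves form cycles. A real code generator would have to obtain that scratch location from somewhere, which may be why the README pairs this step with a "final local register allocation".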