Side by Side Diff: README.rst

Issue 1309073003: Subzero: Add a detailed design document. (Closed) Base URL: https://chromium.googlesource.com/native_client/pnacl-subzero.git@master
Patch Set: Created 5 years, 3 months ago
OLD NEW
1 Subzero - Fast code generator for PNaCl bitcode 1 Subzero - Fast code generator for PNaCl bitcode
2 =============================================== 2 ===============================================
3 3
4 Design
5 ------
6
7 See the accompanying DESIGN.rst file for a more detailed technical overview of
8 Subzero.
9
4 Building 10 Building
5 -------- 11 --------
6 12
7 Subzero is set up to be built within the Native Client tree. Follow the 13 Subzero is set up to be built within the Native Client tree. Follow the
8 `Developing PNaCl 14 `Developing PNaCl
9 <https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_ 15 <https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
10 instructions, in particular the section on building PNaCl sources. This will 16 instructions, in particular the section on building PNaCl sources. This will
11 prepare the necessary external headers and libraries that Subzero needs. 17 prepare the necessary external headers and libraries that Subzero needs.
12 Checking out the Native Client project also gets the pre-built clang and LLVM 18 Checking out the Native Client project also gets the pre-built clang and LLVM
13 tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which 19 tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which
14 are used for building Subzero. 20 are used for building Subzero.
15 21
16 The Subzero source is in ``native_client/toolchain_build/src/subzero``. From 22 The Subzero source is in ``native_client/toolchain_build/src/subzero``. From
17 within that directory, ``git checkout master && git pull`` to get the latest 23 within that directory, ``git checkout master && git pull`` to get the latest
18 version of Subzero source code. 24 version of Subzero source code.
19 25
20 The Makefile is designed to be used as part of the higher level LLVM build 26 The Makefile is designed to be used as part of the higher level LLVM build
21 system. To build manually, use the ``Makefile.standalone``. There are several 27 system. To build manually, use the ``Makefile.standalone``. There are several
22 build configurations from the command line:: 28 build configurations from the command line::
23 29
24 make -f Makefile.standalone 30 make -f Makefile.standalone
25 make -f Makefile.standalone DEBUG=1 31 make -f Makefile.standalone DEBUG=1
26 make -f Makefile.standalone NOASSERT=1 32 make -f Makefile.standalone NOASSERT=1
27 make -f Makefile.standalone DEBUG=1 NOASSERT=1 33 make -f Makefile.standalone DEBUG=1 NOASSERT=1
28 make -f Makefile.standalone MINIMAL=1 34 make -f Makefile.standalone MINIMAL=1
35 make -f Makefile.standalone ASAN=1
36 make -f Makefile.standalone TSAN=1
29 37
30 ``DEBUG=1`` builds without optimizations and is good when running the translator 38 ``DEBUG=1`` builds without optimizations and is good when running the translator
31 inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred 39 inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred
32 configuration for performance testing the translator. ``MINIMAL=1`` attempts to 40 configuration for performance testing the translator. ``MINIMAL=1`` attempts to
33 minimize the size of the translator by compiling out everything unnecessary. 41 minimize the size of the translator by compiling out everything unnecessary.
42 ``ASAN=1`` enables AddressSanitizer, and ``TSAN=1`` enables ThreadSanitizer.
34 43
35 The result of the ``make`` command is the target ``pnacl-sz`` in the current 44 The result of the ``make`` command is the target ``pnacl-sz`` in the current
36 directory. 45 directory.
37 46
38 ``pnacl-sz`` 47 ``pnacl-sz``
39 ------------ 48 ------------
40 49
41 The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it 50 The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it
42 into ICE (Subzero's intermediate representation). It then invokes the ICE 51 into ICE (Subzero's intermediate representation). It then invokes the ICE
43 translate method to lower it to target-specific machine code, optionally dumping 52 translate method to lower it to target-specific machine code, optionally dumping
(...skipping 22 matching lines...)
66 low-level textual assembly file demonstrating the integrated assembler. 75 low-level textual assembly file demonstrating the integrated assembler.
67 76
68 ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``, 77 ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``,
69 ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and 78 ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and
70 represent the minimum optimization and worst code quality, but fastest code 79 represent the minimum optimization and worst code quality, but fastest code
71 generation. 80 generation.
72 81
73 ``-verbose=<list>`` -- Set verbosity flags. This argument allows a 82 ``-verbose=<list>`` -- Set verbosity flags. This argument allows a
74 comma-separated list of values. The default is ``none``, and the value 83 comma-separated list of values. The default is ``none``, and the value
75 ``inst,pred`` will roughly match the .ll bitcode file. Of particular use 84 ``inst,pred`` will roughly match the .ll bitcode file. Of particular use
76 are ``all`` and ``none``. 85 are ``all``, ``most``, and ``none``.
77 86
78 ``-o <FILE>`` -- Set the assembly output file name. Default is stdout. 87 ``-o <FILE>`` -- Set the assembly output file name. Default is stdout.
79 88
80 ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is 89 ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
81 controlled by ``-verbose``). Default is stdout. 90 controlled by ``-verbose``). Default is stdout.
82 91
83 ``-timing`` -- Dump some pass timing information after translating the input 92 ``-timing`` -- Dump some pass timing information after translating the input
84 file. 93 file.
85 94
86 Running the test suite 95 Running the test suite
87 ---------------------- 96 ----------------------
88 97
89 Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which 98 Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
90 lives in ``tests_lit``. To execute the test suite, first build Subzero, and then 99 lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
91 run:: 100 run::
92 101
93 make -f Makefile.standalone check-lit 102 make -f Makefile.standalone check-lit
94 103
95 There is also a suite of cross tests in the ``crosstest`` directory. A cross 104 There is also a suite of cross tests in the ``crosstest`` directory. A cross
96 test takes a test bitcode file implementing some unit tests, and translates it 105 test takes a test bitcode file implementing some unit tests, and translates it
97 twice, once with Subzero and once with LLVM's known-good ``llc`` translator. 106 twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
98 The Subzero-translated symbols are specially mangled to avoid multiple 107 The Subzero-translated symbols are specially mangled to avoid multiple
99 definition errors from the linker. Both translated versions are linked together 108 definition errors from the linker. Both translated versions are linked together
100 with a driver program that calls each version of each unit test with a variety 109 with a driver program that calls each version of each unit test with a variety
101 of interesting inputs and compares the results for equality. The cross tests 110 of interesting inputs and compares the results for equality. The cross tests
102 are currently invoked by running the ``runtests.sh`` script. 111 are currently invoked by running::
103 112
104 A convenient way to run both the lit tests and the cross tests is:: 113 make -f Makefile.standalone check-xtest
114
115 Similarly, there is a suite of unit tests::
116
117 make -f Makefile.standalone check-unit
118
119 A convenient way to run the lit, cross, and unit tests is::
105 120
106 make -f Makefile.standalone check 121 make -f Makefile.standalone check
107 122
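The cross-test idea described above (run a known-good version and a version under test on the same interesting inputs, then compare results) can be sketched as follows. This is only an illustration in Python, not the actual ``crosstest`` driver: ``reference_add``, ``candidate_add``, and ``INTERESTING`` are hypothetical stand-ins for the ``llc``-translated function, the Subzero-translated function, and the driver's input set.

```python
import itertools


def reference_add(a, b):
    """Hypothetical stand-in for the known-good (llc-translated) unit test."""
    return (a + b) & 0xFFFFFFFF


def candidate_add(a, b):
    """Hypothetical stand-in for the Subzero-translated version of the same test."""
    return (a + b) % (1 << 32)


# Boundary values of a 32-bit integer, chosen as "interesting" inputs.
INTERESTING = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def cross_test(reference, candidate, inputs):
    """Call both versions on every input pair and collect any mismatches."""
    failures = []
    for a, b in itertools.product(inputs, repeat=2):
        expected = reference(a, b)
        actual = candidate(a, b)
        if expected != actual:
            failures.append((a, b, expected, actual))
    return failures


if __name__ == "__main__":
    print(cross_test(reference_add, candidate_add, INTERESTING))
```

An empty failure list means the two versions agree on every tested input pair; it says nothing about inputs outside the chosen set.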
108 Assembling ``pnacl-sz`` output as needed 123 Assembling ``pnacl-sz`` output as needed
109 ---------------------------------------- 124 ----------------------------------------
110 125
111 ``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``. 126 ``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``.
112 127
113 ``pnacl-sz`` can also produce textual assembly code in a structure suitable for 128 ``pnacl-sz`` can also produce textual assembly code in a structure suitable for
114 input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object 129 input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object
115 file can then be produced using the command:: 130 file can then be produced using the command::
116 131
117 llvm-mc -arch=x86 -filetype=obj -o=MyObj.o 132 llvm-mc -triple=i686 -filetype=obj -o=MyObj.o
118 133
119 Building a translated binary 134 Building a translated binary
120 ---------------------------- 135 ----------------------------
121 136
122 There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe 137 There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
123 into a fully linked executable. Run it with ``-help`` for extensive 138 into a fully linked executable. Run it with ``-help`` for extensive
124 documentation. 139 documentation.
125 140
126 By default, ``szbuild.py`` builds an executable using only Subzero translation, 141 By default, ``szbuild.py`` builds an executable using only Subzero translation,
127 but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is 142 but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
128 the name of the LLVM translator) for bisection-based debugging. In bisection 143 the name of the LLVM translator) for bisection-based debugging. In bisection
129 debugging mode, the pexe is translated using both Subzero and ``llc``, and the 144 debugging mode, the pexe is translated using both Subzero and ``llc``, and the
130 resulting object files are combined into a single executable using symbol 145 resulting object files are combined into a single executable using symbol
131 weakening and other linker tricks to control which Subzero symbols and which 146 weakening and other linker tricks to control which Subzero symbols and which
132 ``llc`` symbols take precedence. This is controlled by the ``-include`` and 147 ``llc`` symbols take precedence. This is controlled by the ``-include`` and
133 ``-exclude`` arguments. These can be used to rapidly find a single function 148 ``-exclude`` arguments. These can be used to rapidly find a single function
134 that Subzero translates incorrectly leading to incorrect output. 149 that Subzero translates incorrectly leading to incorrect output.
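The bisection search described above can be sketched roughly as a binary search over the function list. This is a minimal illustration, not the logic of ``szbuild.py``: the ``passes`` callback is hypothetical and stands for "build a hybrid binary with only these functions translated by Subzero, run it, and report whether the output is correct". The sketch also assumes that exactly one function is mistranslated.

```python
def find_bad_function(functions, passes):
    """Narrow down the single function whose Subzero translation is wrong.

    `passes(subset)` is a hypothetical callback: it returns True when a
    binary built with only `subset` translated by Subzero (everything else
    by llc) behaves correctly.
    """
    candidates = list(functions)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        half = candidates[:mid]
        if passes(half):
            # The first half is fine, so the culprit is in the second half.
            candidates = candidates[mid:]
        else:
            candidates = half
    return candidates[0]


if __name__ == "__main__":
    names = [f"f{i}" for i in range(10)]
    bad = "f6"
    print(find_bad_function(names, lambda subset: bad not in subset))
```

With N candidate functions this needs about log2(N) build-and-run cycles, which is why the hybrid approach can locate a faulty function quickly.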
135 150
136 There is another helper script, ``pydir/szbuild_spec2k.py``, that runs 151 There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
137 ``szbuild.py`` on one or more components of the Spec2K suite. This assumes that 152 ``szbuild.py`` on one or more components of the Spec2K suite. This assumes that
138 Spec2K is set up in the usual place in the Native Client tree, and the finalized 153 Spec2K is set up in the usual place in the Native Client tree, and the finalized
139 pexe files have been built. (Note: for working with Spec2K and other pexes, 154 pexe files have been built. (Note: for working with Spec2K and other pexes,
140 it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the 155 it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
141 original function and global variable names.) 156 original function and global variable names.)
142 157
143 Status 158 Status
144 ------ 159 ------
145 160
146 Subzero currently translates only for the x86-32 architecture. Native Client 161 Subzero currently fully supports the x86-32 architecture, for both native and
147 sandboxing is not yet implemented. Two optimization levels, ``-Om1`` and 162 Native Client sandboxing modes. The x86-64 architecture is also supported in
148 ``-O2``, are implemented. 163 native mode only, for the x32 flavor due to PNaCl bitcode restrictions. ARM and
164 MIPS support is in progress. Two optimization levels, ``-Om1`` and ``-O2``, are
165 implemented.
149 166
150 The ``-Om1`` configuration is designed to be the simplest and fastest possible, 167 The ``-Om1`` configuration is designed to be the simplest and fastest possible,
151 with a minimal set of passes and transformations. 168 with a minimal set of passes and transformations.
152 169
153 * Simple Phi lowering before target lowering, by generating temporaries and 170 * Simple Phi lowering before target lowering, by generating temporaries and
154 adding assignments to the end of predecessor blocks. 171 adding assignments to the end of predecessor blocks.
155 172
156 * Simple register allocation limited to pre-colored and infinite-weight 173 * Simple register allocation limited to pre-colored or infinite-weight
157 Variables. 174 Variables.
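The simple Phi lowering listed above can be illustrated with a toy IR. This is a sketch under stated assumptions, not Subzero's implementation: blocks are plain dictionaries, the ``body`` list excludes the block terminator, and the ``_phi`` suffix for temporaries is made up for the example.

```python
def lower_phis(blocks):
    """Replace each phi with a temporary assigned at the end of every predecessor.

    `blocks` maps a block name to {"phis": [(dest, {pred: value})], "body": [...]}.
    The "body" list is assumed to exclude the terminator, so appending to it
    places the assignment just before the branch out of the predecessor.
    """
    for block in blocks.values():
        head = []
        for dest, incoming in block["phis"]:
            temp = f"{dest}_phi"  # hypothetical naming scheme for the temporary
            for pred, value in incoming.items():
                blocks[pred]["body"].append(("assign", temp, value))
            # At the top of the block, copy the temporary into the real dest.
            head.append(("assign", dest, temp))
        block["body"] = head + block["body"]
        block["phis"] = []
    return blocks


if __name__ == "__main__":
    cfg = {
        "left": {"phis": [], "body": []},
        "right": {"phis": [], "body": []},
        "join": {"phis": [("x", {"left": 1, "right": 2})], "body": [("ret", "x")]},
    }
    for name, block in lower_phis(cfg).items():
        print(name, block["body"])
```

Going through a temporary keeps the lowering simple: every incoming value is read in the predecessor, so the order of the copies at the top of the block does not matter.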
158 175
159 The ``-O2`` configuration is designed to use all optimizations available and 176 The ``-O2`` configuration is designed to use all optimizations available and
160 produce the best code. 177 produce the best code.
161 178
162 * Address mode inference to leverage the complex x86 addressing modes. 179 * Address mode inference to leverage the complex x86 addressing modes.
163 180
164 * Compare/branch fusing based on liveness/last-use analysis. 181 * Compare/branch fusing based on liveness/last-use analysis.
165 182
166 * Global, linear-scan register allocation. 183 * Global, linear-scan register allocation.
167 184
168 * Advanced phi lowering after target lowering and global register allocation, 185 * Advanced phi lowering after target lowering and global register allocation,
169 via edge splitting, topological sorting of the parallel moves, and final local 186 via edge splitting, topological sorting of the parallel moves, and final local
170 register allocation. 187 register allocation.
171 188
172 * Stack slot coalescing to reduce frame size. 189 * Stack slot coalescing to reduce frame size.
173 190
174 * Branch optimization to reduce the number of branches to the following block. 191 * Branch optimization to reduce the number of branches to the following block.
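The "topological sorting of the parallel moves" step in the advanced phi lowering can be sketched as below. This is a generic illustration of sequencing a parallel copy, not Subzero's code: the ``scratch`` name is an assumption (a location not used by any move), whereas the list above indicates that Subzero obtains such temporaries through a final local register allocation.

```python
def sequence_parallel_moves(moves, scratch="tmp"):
    """Order the parallel moves {dest: src} so that no source is clobbered early.

    A move is safe to emit once its destination is no longer read by any
    pending move. If only cycles remain, one value is saved in `scratch`
    (assumed not to appear in `moves`) to break the cycle.
    """
    pending = {d: s for d, s in moves.items() if d != s}
    ordered = []
    while pending:
        sources = set(pending.values())
        ready = [d for d in pending if d not in sources]
        if ready:
            dest = ready[0]
            ordered.append((dest, pending.pop(dest)))
        else:
            # Every pending destination is still needed as a source: a cycle.
            victim = next(iter(pending))
            ordered.append((scratch, victim))
            for dest, src in pending.items():
                if src == victim:
                    pending[dest] = scratch
    return ordered


def apply_moves(sequence, env):
    """Execute the moves one at a time on a copy of `env` and return the result."""
    env = dict(env)
    for dest, src in sequence:
        env[dest] = env[src]
    return env


if __name__ == "__main__":
    seq = sequence_parallel_moves({"a": "b", "b": "a", "c": "a"})
    print(seq)
    print(apply_moves(seq, {"a": 1, "b": 2, "c": 3}))
```

The acyclic moves are emitted first in dependency order; only a genuine cycle, such as a swap, costs an extra copy through the scratch location.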