OLD | NEW |
---|---|
(Empty) | |
1 .. _x86-64-sandbox: | |
2 | |
3 ================================ | |
4 NaCl SFI model on x86-64 systems | |
5 ================================ | |
JF
2014/06/12 20:00:47
We're missing the HTML file corresponding to the r
hamaji
2014/06/13 04:08:19
Done.
| |
6 | |
7 .. contents:: | |
8 :local: | |
9 :backlinks: none | |
10 :depth: 2 | |
11 | |
12 Summary | |
13 ======= | |
14 | |
15 This document addresses the details of the Software Fault Isolation | |
16 (SFI) model for executable code that can be run in Native Client on an | |
17 x86-64 system. An overview of this model can be found in the paper: | |
18 `Adapting Software Fault Isolation to Contemporary CPU Architectures | |
19 <http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/35649.pdf>`_. | |
JF
2014/06/12 20:00:47
Make it https, the server supports it.
Junichi Uekawa
2014/06/13 03:58:30
I think you want the URL
http://research.google.co
hamaji
2014/06/13 04:08:19
Done, thanks!
hamaji
2014/06/13 04:08:19
Done.
| |
20 The primary focus of the SFI model is a Windows x86-64 system but the | |
21 same techniques can be applied to run identical x86-64 binaries on | |
22 other x86-64 systems such as Linux, Mac, FreeBSD, etc, so the | |
23 description of the SFI model tries to abstract away system | |
24 dependencies when possible. | |
25 | |
26 Please note: throughout this document we use the AT&T notation for | |
27 assembler syntax, in which the target operand appears last, e.g. mov | |
28 src, dst. | |
JF
2014/06/12 20:00:46
Code-quote code: ``mov src, dst``
hamaji
2014/06/13 04:08:19
Done.
| |
29 | |
30 Binary Format | |
31 ============= | |
32 | |
33 The format of Native Client executable binaries is identical to the | |
34 x86-64 ELF binary format (`[0] | |
35 <http://en.wikipedia.org/wiki/Executable_and_Linkable_Format>`_, `[1] | |
36 <http://www.sco.com/developers/devspecs/gabi41.pdf>`_, `[2] | |
37 <http://www.sco.com/developers/gabi/latest/contents.html>`_, `[3] | |
38 <http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf>`_) for | |
39 Linux or BSD with a few extra requirements. The additional rules that | |
40 a Native Client ELF binary must follow are: | |
41 | |
42 * The ELF magic OS ABI field must be 123. | |
43 * The ELF magic OS ABI VERSION field must be 5. | |
44 * The ELF e_flags field must be 0x200000 (32-byte alignment). | |
45 * There must be exactly one PT_LOAD text segment. It must begin at | |
46 0x20000 (128 kB) and be marked RX (no W). The contents of the text | |
47 segment must follow :ref:`Text Segment Rules <x86-64-text-segment-rules>`. | |
48 * There can be at most one PT_LOAD data segment marked R. | |
49 * There can be at most one PT_LOAD data segment marked RW. | |
50 * There can be at most one PT_GNU_STACK segment. It must be marked RW. | |
51 * All segments must end before the limit address (4 GiB). | |
52 | |
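
The header-level rules above can be checked mechanically. The following is a minimal sketch in Python, not the actual loader code: the function names ``check_nacl_elf_header`` and ``make_header`` are hypothetical, and only the three header fields named in the list are examined (segment rules are not covered).

```python
import struct

# Values taken from the Binary Format rules above.
NACL_OSABI = 123
NACL_ABIVERSION = 5
NACL_E_FLAGS = 0x200000  # 32-byte alignment


def check_nacl_elf_header(header):
    """Return a list of violations for a 64-byte ELF64 file header."""
    if len(header) < 64 or header[:4] != b"\x7fELF":
        return ["not an ELF64 header"]
    problems = []
    if header[7] != NACL_OSABI:  # e_ident[EI_OSABI]
        problems.append("OS ABI field must be 123")
    if header[8] != NACL_ABIVERSION:  # e_ident[EI_ABIVERSION]
        problems.append("OS ABI VERSION field must be 5")
    # Fields after the 16-byte e_ident: e_type, e_machine, e_version,
    # e_entry, e_phoff, e_shoff, e_flags (little-endian, no padding).
    fields = struct.unpack_from("<HHIQQQI", header, 16)
    e_flags = fields[6]
    if e_flags != NACL_E_FLAGS:
        problems.append("e_flags must be 0x200000")
    return problems


def make_header(osabi, abiversion, e_flags):
    """Build a synthetic ELF64 header (hypothetical helper, demo only)."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1, osabi, abiversion]) + bytes(7)
    rest = struct.pack("<HHIQQQIHHHHHH",
                       2, 62, 1, 0x20000, 64, 0, e_flags,
                       64, 56, 0, 0, 0, 0)
    return e_ident + rest
```

Calling ``check_nacl_elf_header(make_header(123, 5, 0x200000))`` returns an empty list, while a header with other values reports one entry per violated rule.
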
53 Runtime Invariants | |
54 ================== | |
55 | |
56 To ensure fault isolation at runtime, the system must maintain a | |
57 number of runtime *invariants* across the lifetime of the running | |
58 program. Both the *Validator* and the *Service Runtime* are | |
59 responsible for maintaining the invariants. See the paper for the | |
60 rationale for the invariants: | |
61 | |
62 * ``RIP`` always points to a valid instruction boundary (the validator must | |
63 ensure this with direct jumps and direct calls). | |
64 * ``R15`` (aka ``RBASE`` and ``RZP``) is never modified by code (the | |
65 validator must ensure this). The low 32 bits of ``RZP`` are all zero | |
66 (the loader must ensure this). | |
67 * ``RIP``, ``RBP`` and ``RSP`` are always in the **safe zone**: between | |
68 ``R15`` and ``R15+4GiB``. | |
69 | |
70 * Exception: ``RSP`` and ``RBP`` are allowed to be in the range of | |
71 ``0..4GiB`` inside *pseudo-instructions*: ``naclrestbp``, | |
72 ``naclrestsp``, ``naclspadj``, ``naclasp``, ``naclssp``. | |
73 | |
74 * 84GiB are allocated for the NaCl module (i.e. **untrusted region**): | |
75 | |
76 * ``R15-40GiB..R15`` and ``R15+4GiB..R15+44GiB`` are buffer zones with | |
77 PROT_NONE flags. | |
78 * The 4GiB *safe zone* has pages with either PROT_WRITE or PROT_EXEC | |
79 but must not have PROT_WRITE+PROT_EXEC pages. | |
80 * All executable code in PROT_EXEC pages is validatable and | |
81 guaranteed to obey the invariant. | |
82 | |
83 * Trampoline/springboard code is mapped to a non-writable region in | |
84 the *untrusted 84GiB region*; each trampoline/springboard is 32-byte | |
85 aligned and fits within a single *bundle*. | |
86 * The OS must not put any internal structures/code into the untrusted | |
87 region at any time (not using OS dynamic linker, etc) | |
88 | |
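
The address-space invariants above can be illustrated with a short Python sketch. It is an illustration under assumptions, not service runtime code: the names ``untrusted_layout``, ``in_safe_zone`` and ``page_protection_ok`` are hypothetical, and the ranges are treated as half-open intervals.

```python
GIB = 1 << 30
SAFE_ZONE_SIZE = 4 * GIB   # R15 .. R15+4GiB
GUARD_SIZE = 40 * GIB      # PROT_NONE buffer zone on each side


def untrusted_layout(r15):
    """Return the three parts of the 84GiB untrusted region for a given R15."""
    if r15 & 0xFFFFFFFF:
        raise ValueError("low 32 bits of R15 must be zero")
    safe_end = r15 + SAFE_ZONE_SIZE
    return {
        "lower_guard": (r15 - GUARD_SIZE, r15),
        "safe_zone": (r15, safe_end),
        "upper_guard": (safe_end, safe_end + GUARD_SIZE),
    }


def in_safe_zone(addr, r15):
    """True if addr lies in the safe zone (assumed half-open: [R15, R15+4GiB))."""
    return r15 <= addr < r15 + SAFE_ZONE_SIZE


def page_protection_ok(writable, executable):
    """Safe-zone pages may be writable or executable, but never both."""
    return not (writable and executable)
```

For a hypothetical base such as ``0x7F0000000000``, the three ranges add up to 84GiB, matching the total stated in the list.
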
89 .. _x86-64-text-segment-rules: | |
90 | |
91 Text Segment Rules | |
92 ================== | |
93 | |
94 * The validation process must ensure that the text segment complies | |
95 with the following rules. The validation process must complete | |
96 successfully strictly before executing any instruction of the | |
97 untrusted code. | |
98 * The following instructions are illegal and must be rejected by the | |
99 validator (the list is not exhaustive as the validator uses a | |
100 whitelist, not a blacklist; this means there is a large but finite | |
101 list of instructions the validator allows, not a small list of | |
102 instructions the validator rejects): | |
103 | |
104 * any privileged instructions | |
105 * ``mov`` to/from segment registers | |
106 * ``int`` | |
107 * ``pusha``/``popa`` (not dangerous but not needed for gcc) | |
JF
2014/06/12 20:00:47
s/gcc/GCC/
hamaji
2014/06/13 04:08:19
Done.
| |
108 | |
109 * There must be space for at least 32 bytes after the text segment and | |
110 before the next segment in ELF (towards higher addresses) that ends | |
111 strictly at a 64K boundary (a minimum page size for untrusted | |
112 code). This space will be padded with HLT instructions as part of | |
113 the validation process, along with the optional 64K page. | |
114 * Neither instructions nor *pseudo-instructions* are permitted to span | |
115 a 32-byte boundary. | |
116 * The ELF entry address must be 32-byte aligned. | |
117 * Direct ``CALL``/``JMP`` targets: | |
118 | |
119 * must point to a valid instruction boundary | |
120 * must not point into a *pseudo-instruction* | |
121 * must not point between a *restricted register* (see below for | |
122 definition) producer instruction and it's corresponding restricted | |
JF
2014/06/12 20:00:47
s/it's/its/
hamaji
2014/06/13 04:08:19
Done.
| |
123 register consumer instruction. | |
124 | |
125 * ``CALL`` instructions must be 5 bytes before a 32-byte boundary, so | |
126 that the return address will be 32-byte aligned. | |
127 * Indirect call targets must be 32-byte aligned. Instead of indirect | |
128 ``CALL``/``JMP`` x, use nacljmp and naclcall (see below for | |
JF
2014/06/12 20:00:47
Code-quote ``nacljmp`` and ``naclcall``.
hamaji
2014/06/13 04:08:19
Done.
| |
129 definitions of these *pseudo-instructions*) | |
130 * All instructions that **read** from or **write** to memory must use | |
131 one of the four registers ``RZP``, ``RIP``, ``RBP`` or ``RSP`` as a | |
132 base, optionally with a restricted register (see below) as an index | |
133 (multiplied by 0, 1, 2, 4 or 8) and an optional constant displacement. | |
134 | |
135 * Exception to this rule: string instructions are allowed if used in | |
136 the following sequences (the sequences should not cross *bundle* | |
137 boundaries; segment overrides are disallowed): | |
138 | |
139 .. naclcode:: | |
140 :prettyprint: 0 | |
141 | |
142 mov %edi, %edi | |
143 lea (%rZP,%rdi),%rdi | |
144 [rep] stos (other string instructions can be used here) | |
JF
2014/06/12 20:00:47
Make the parenthesized section a comment? Same in
hamaji
2014/06/13 04:08:19
Done.
| |
145 | |
146 Note: this is identical to the *pseudo-instruction*: ``[rep] stos | |
147 %?ax, %nacl:(%rdi),%rZP`` | |
148 | |
149 * An operand of a command is said to be a **restricted register** iff | |
150 it is a register that is the target of a 32-bit move in the | |
151 immediately-preceding command in the same *bundle* (consider the | |
152 previous command as additional sandboxing prefix): | |
153 | |
154 .. naclcode:: | |
155 :prettyprint: 0 | |
156 | |
157 mov ..., %eXX (any 32-bit register can be used here; the first operand is unrestricted but often is the same register) | |
JF
2014/06/12 20:00:47
You should manually line-wrap the comment here, be
hamaji
2014/06/13 04:08:19
Done.
| |
158 | |
159 * Instructions capable of changing %RBP and %RSP are forbidden, except | |
JF
2014/06/12 20:00:46
Code-quote the register names.
hamaji
2014/06/13 04:08:19
Done.
| |
160 the instruction sequences in the whitelist below, which must not | |
161 cross *bundle* boundaries: | |
162 | |
163 .. naclcode:: | |
164 :prettyprint: 0 | |
165 | |
166 mov %rbp, %rsp | |
167 mov %rsp, %rbp | |
168 mov ..., %ebp | |
169 add %rZP, %rbp (restoration of %RBP from memory, register or stack - keeps the invariant intact) | |
170 mov ..., %esp | |
171 add %rZP, %rsp (restoration of %RSP from memory, register or stack - keeps the invariant intact) | |
172 lea xxx(%rbp), %esp | |
173 add %rZP, %rsp (restoration of %RSP from %RBP with adjust) | |
174 sub ..., %esp | |
175 add %rZP, %rsp (stack space allocation) | |
176 add ..., %esp | |
177 add %rZP, %rsp (stack space deallocation) | |
178 and $XX, %rsp (alignment; XX must be between -128 and -1) | |
179 pushq ... | |
180 popq ...(except pop %RSP, pop %RBP) | |
JF
2014/06/12 20:00:47
Ditto on parenthesis being comments, and line-wrap
hamaji
2014/06/13 04:08:19
Done.
| |
181 | |
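
The bundle-related rules in this section reduce to simple arithmetic on code offsets. The sketch below is a simplified illustration in Python, not the validator's implementation; the function names are hypothetical, and offsets are assumed to be relative to a 32-byte-aligned start of the text segment.

```python
BUNDLE_SIZE = 32  # bytes; instructions must not span a 32-byte boundary


def spans_bundle_boundary(offset, length):
    """True if an instruction of `length` bytes at `offset` crosses a bundle."""
    first_bundle = offset // BUNDLE_SIZE
    last_bundle = (offset + length - 1) // BUNDLE_SIZE
    return first_bundle != last_bundle


def call_placement_ok(offset, length=5):
    """A CALL must end exactly at a bundle boundary so that the return
    address (offset + length) is 32-byte aligned."""
    return (offset + length) % BUNDLE_SIZE == 0


def indirect_target_ok(address):
    """Indirect call/jump targets must be 32-byte aligned."""
    return address % BUNDLE_SIZE == 0
```

For example, a 5-byte ``CALL`` placed at offset 27 occupies bytes 27 to 31, stays inside one bundle, and leaves the return address at offset 32.
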
182 List of Pseudo-instructions | |
183 =========================== | |
184 | |
185 Pseudo-instructions were introduced to let the compiler maintain the | |
186 invariants without needing to know the code alignment rules. The | |
187 assembler guarantees 32-byte alignment for all *pseudo-instructions* in | |
188 the table below. In addition to the pseudo-instructions, one | |
189 pseudo-operand prefix is introduced: ``%nacl``. Presence of the | |
190 ``%nacl`` operand prefix ensures that: | |
191 | |
192 * The instruction ``mov %eXX, %eXX`` is added immediately before the | |
193 actual command using the prefix ``%nacl`` (where ``%eXX`` is the 32-bit | |
194 part of the index register of the actual command, for example: in | |
195 operand ``%nacl:(,%r11)``, the notation ``%eXX`` is referring to | |
196 ``%r11d``) | |
197 * The resulting sequence of two instructions does not cross the | |
198 *bundle* boundary. | |
199 | |
200 For example, the instruction: | |
201 | |
202 .. naclcode:: | |
203 :prettyprint: 0 | |
204 | |
205 mov %eax,%nacl:(%r15,%rdi,2) | |
206 | |
207 is translated by the assembler to: | |
208 | |
209 .. naclcode:: | |
210 :prettyprint: 0 | |
211 | |
212 mov %edi,%edi | |
213 mov %eax,(%r15,%rdi,2) | |
214 | |
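
To see why the extra ``mov %eXX,%eXX`` matters, the effective address of the translated instruction can be modelled in Python. This is a sketch under assumptions: ``sandboxed_address`` is a hypothetical name, the base value is made up, and the link to the 40GiB buffer zones is this example's reading of the invariants rather than a statement from the text above.

```python
MASK32 = 0xFFFFFFFF
GIB = 1 << 30


def sandboxed_address(r15, index, scale=1, disp=0):
    """Model the address of `(%r15,%rXX,scale)` after `mov %eXX,%eXX`.

    A 32-bit move clears the upper 32 bits of the 64-bit register, so the
    index contributes at most (2**32 - 1) * scale to the address.
    """
    index32 = index & MASK32
    return r15 + index32 * scale + disp
```

With a hypothetical base of ``0x7F0000000000`` and ``%rdi`` holding ``0xDEADBEEF00000010``, the access from the example above lands at ``R15 + 0x20``. Even in the extreme case (scale 8, largest 32-bit index, largest 32-bit displacement) the address stays below ``R15+44GiB``, so it falls either in the safe zone or in a PROT_NONE buffer zone.
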
215 The complete list of introduced *pseudo-instructions* is as follows: | |
216 | |
217 .. TODO(hamaji): Use rst's table instead of the raw HTML below. | |
JF
2014/06/12 20:00:46
Please do :)
hamaji
2014/06/13 04:08:19
Sorry, but I don't have spare time for this now. I
| |
218 | |
219 .. raw:: html | |
220 | |
221 <table border=1> | |
222 <tbody> | |
223 <tr> | |
224 <td>Pseudo-instruction</td> | |
225 <td>Is translated to<br/> | |
226 </td> | |
227 </tr> | |
228 <tr> | |
229 <td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> | |
230 <i>(sandboxed cmps)</i><br/> | |
231 </td> | |
232 <td>mov %esi,%esi<br/> | |
233 lea (%rZP,%rsi,1),%rsi<br/> | |
234 mov %edi,%edi<br/> | |
235 lea (%rZP,%rdi,1),%rdi<br/> | |
236 [rep] cmps (%rsi),(%rdi)<i><br/> | |
237 </i> | |
238 </td> | |
239 </tr> | |
240 <tr> | |
241 <td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> | |
242 <i>(sandboxed movs)</i><br/> | |
243 </td> | |
244 <td>mov %esi,%esi<br/> | |
245 lea (%rZP,%rsi,1),%rsi<br/> | |
246 mov %edi,%edi<br/> | |
247 lea (%rZP,%rdi,1),%rdi<br/> | |
248 [rep] movs (%rsi),(%rdi)<i><br/> | |
249 </i> | |
250 </td> | |
251 </tr> | |
252 <tr> | |
253 <td>naclasp ...,%rZP<br/> | |
254 <i>(sandboxed stack increment)</i></td> | |
255 <td>add ...,%esp<br/> | |
256 add %rZP,%rsp</td> | |
257 </tr> | |
258 <tr> | |
259 <td>naclcall %eXX,%rZP<br/> | |
260 <i>(sandboxed indirect call)</i></td> | |
261 <td>and $-32, %eXX<br/> | |
262 add %rZP, %rXX<br/> | |
263 call *%rXX<br/> | |
264 <i>Note: the assembler ensures all calls (including | |
265 naclcall) will end at the bundle boundary.</i></td> | |
266 </tr> | |
267 <tr> | |
268 <td>nacljmp %eXX,%rZP<br/> | |
269 <i>(sandboxed indirect jump)</i></td> | |
270 <td>and $-32,%eXX<br/> | |
271 add %rZP,%rXX<br/> | |
272 jmp *%rXX<br/> | |
273 </td> | |
274 </tr> | |
275 <tr> | |
276 <td>naclrestbp ...,%rZP<br/> | |
277 <i>(sandboxed %ebp/rbp restore)</i></td> | |
278 <td>mov ...,%ebp<br/> | |
279 add %rZP,%rbp</td> | |
280 </tr> | |
281 <tr> | |
282 <td>naclrestsp ...,%rZP<br/> | |
283 <i>(sandboxed %esp/rsp restore)</i></td> | |
284 <td>mov ...,%esp<br/> | |
285 add %rZP,%rsp</td> | |
286 </tr> | |
287 <tr> | |
288 <td>naclrestsp_noflags ...,%rZP<br/> | |
289 <i>(sandboxed %esp/rsp restore; uses lea, so flags are not modified)</i></td> | |
290 <td>mov ...,%esp<br/> | |
291 lea (%rsp,%rZP,1),%rsp</td> | |
292 </tr> | |
293 <tr> | |
294 <td>naclspadj $N,%rZP<br/> | |
295 <i>(sandboxed %esp/rsp restore from %rbp; includes $N offset)</i></td> | |
296 <td>lea N(%rbp),%esp<br/> | |
297 add %rZP,%rsp</td> | |
298 </tr> | |
299 <tr> | |
300 <td>naclssp ...,%rZP<br/> | |
301 <i>(sandboxed stack decrement)</i></td> | |
302 <td>sub ...,%esp<br/> | |
303 add %rZP,%rsp</td> | |
304 </tr> | |
305 <tr> | |
306 <td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/> | |
307 <i>(sandboxed scas)</i></td> | |
308 <td>mov %edi,%edi<br/> | |
309 lea (%rZP,%rdi,1),%rdi<br/> | |
310 [rep] scas (%rdi),%?ax<br/> | |
311 </td> | |
312 </tr> | |
313 <tr> | |
314 <td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/> | |
315 <i>(sandboxed stos)</i></td> | |
316 <td>mov %edi,%edi<br/> | |
317 lea (%rZP,%rdi,1),%rdi<br/> | |
318 [rep] stos %?ax,(%rdi)<br/> | |
319 </td> | |
320 </tr> | |
321 </tbody> | |
322 </table> | |
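
As an illustration of the table, the three-instruction expansion of ``nacljmp``/``naclcall`` can be modelled step by step. The Python sketch below is only a model of the register arithmetic, with the hypothetical name ``nacljmp_target`` and a made-up base value; it does not describe how the assembler or validator is implemented.

```python
MASK32 = 0xFFFFFFFF
GIB = 1 << 30


def nacljmp_target(r15, reg):
    """Model `and $-32,%eXX; add %rZP,%rXX; jmp *%rXX`."""
    # `and $-32, %eXX` is a 32-bit operation: it clears the low 5 bits and
    # zero-extends the result into the full 64-bit register.
    masked = (reg & MASK32) & 0xFFFFFFE0
    # `add %rZP, %rXX` rebases the 32-bit offset onto the safe zone.
    return r15 + masked
```

Whatever value the register held before, the resulting target is 32-byte aligned and lies between ``R15`` and ``R15+4GiB``, provided the low 32 bits of ``R15`` are zero as the runtime invariants require.
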