OLD | NEW |
---|---|
(Empty) | |
1 .. _x86-64-sandbox: | |
2 | |
3 ================================ | |
4 NaCl SFI model on x86-64 systems | |
5 ================================ | |
6 | |
7 .. contents:: | |
8 :local: | |
9 :backlinks: none | |
10 :depth: 2 | |
11 | |
12 Summary | |
13 ======= | |
14 | |
15 This document addresses the details of the Software Fault Isolation | |
16 (SFI) model for executable code that can be run in Native Client on an | |
17 x86-64 system. An overview of this model can be found in the paper: | |
18 `Adapting Software Fault Isolation to Contemporary CPU Architectures | |
19 <http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/35649.pdf>`_. | |
20 The primary focus of the SFI model is a Windows x86-64 system but the | |
21 same techniques can be applied to run identical x86-64 binaries on | |
22 other x86-64 systems such as Linux, Mac, FreeBSD, etc, so the | |
23 description of the SFI model tries to abstract away system | |
24 dependencies when possible. | |
25 | |
26 Please note: throughout this document we use the AT&T notation for | |
27 assembler syntax, in which the target operand appears last, e.g. mov | |
28 src, dst. | |
29 | |
30 Binary Format | |
31 ============= | |
32 | |
33 The format of Native Client executable binaries is identical to the | |
34 x86-64 ELF binary format (`[0] | |
35 <http://en.wikipedia.org/wiki/Executable_and_Linkable_Format>`_, `[1] | |
36 <http://www.sco.com/developers/devspecs/gabi41.pdf>`_, `[2] | |
37 <http://www.sco.com/developers/gabi/latest/contents.html>`_, `[3] | |
38 <http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf>`_) for | |
39 Linux or BSD with a few extra requirements. The additional rules that | |
40 a Native Client ELF binary must follow are: | |
41 | |
42 * The ELF magic OS ABI field must be 123. | |
43 * The ELF magic OS ABI VERSION field must be 5. | |
44 * The ELF e_flags field must be 0x200000 (32-byte alignment). | |
45 * There must be exactly one PT_LOAD text segment. It must begin at | |
46 0x20000 (128 kB) and be marked RX (no W). The contents of the text | |
47 segment must follow :ref:`Text Segment Rules <x86-64-text-segment-rules>`. | |
48 * There can be at most one PT_LOAD data segment marked R. | |
49 * There can be at most one PT_LOAD data segment marked RW. | |
50 * There can be at most one PT_GNU_STACK segment. It must be marked RW. | |
51 * All segments must end before the limit address (4 GiB). | |
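
The header-level rules in the list above can be sketched as a small check. This is only an illustration, not the actual Native Client loader: the function names ``check_nacl_elf_header`` and ``make_header`` are hypothetical, the field offsets follow the standard ELF64 header layout, and the segment rules (PT_LOAD counts, addresses, permissions) are not covered here.

```python
import struct

# Standard little-endian ELF64 header: e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, then six 16-bit fields.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

NACL_OSABI = 123         # required OS ABI value (from the list above)
NACL_ABIVERSION = 5      # required OS ABI VERSION value (from the list above)
NACL_E_FLAGS = 0x200000  # required e_flags value (32-byte alignment)


def check_nacl_elf_header(data):
    """Return a list of problems found in the ELF header at the start of data."""
    if len(data) < ELF64_HEADER.size:
        return ["file too short for an ELF64 header"]
    fields = ELF64_HEADER.unpack_from(data)
    e_ident, e_flags = fields[0], fields[7]
    problems = []
    if e_ident[:4] != b"\x7fELF":
        problems.append("bad ELF magic")
    if e_ident[7] != NACL_OSABI:  # e_ident[7] is EI_OSABI
        problems.append("OS ABI is %d, expected %d" % (e_ident[7], NACL_OSABI))
    if e_ident[8] != NACL_ABIVERSION:  # e_ident[8] is EI_ABIVERSION
        problems.append("ABI version is %d, expected %d"
                        % (e_ident[8], NACL_ABIVERSION))
    if e_flags != NACL_E_FLAGS:
        problems.append("e_flags is %#x, expected %#x" % (e_flags, NACL_E_FLAGS))
    return problems


def make_header(osabi, abiversion, e_flags):
    """Build a synthetic 64-byte header, for demonstration only."""
    # magic, then EI_CLASS=2 (64-bit), EI_DATA=1 (little-endian), EI_VERSION=1
    e_ident = b"\x7fELF" + bytes([2, 1, 1, osabi, abiversion]) + bytes(7)
    return ELF64_HEADER.pack(e_ident, 2, 62, 1, 0x20000, 64, 0,
                             e_flags, 64, 56, 0, 0, 0, 0)


print(check_nacl_elf_header(make_header(123, 5, 0x200000)))
print(check_nacl_elf_header(make_header(0, 0, 0)))
```

Returning a list of problems rather than a single boolean makes it easier to report every violated rule at once.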
52 | |
53 Runtime Invariants | |
54 ================== | |
55 | |
56 To ensure fault isolation at runtime, the system must maintain a | |
57 number of runtime *invariants* across the lifetime of the running | |
58 program. Both the *Validator* and the *Service Runtime* are | |
59 responsible for maintaining the invariants. See the paper for the | |
60 rationale for the invariants: | |
61 | |
62 * RIP always points to a valid instruction boundary (the validator must | |
63 ensure this with direct jumps and direct calls). | |
64 * R15 (aka RBASE and RZP) is never modified by code (the validator | |
65 must ensure this). The low 32 bits of RZP are all zero (the loader must | |
66 ensure this). | |
67 * RIP, RBP and RSP are always in the **safe zone**: between R15 and | |
68 R15+4GiB. | |
69 | |
70 * Exception: RSP and RBP are allowed to be in the range of 0..4GiB | |
71 inside pseudo-instructions: naclrestbp, naclrestsp, naclspadj, | |
72 naclasp, naclssp. | |
73 | |
74 * 84GiB are allocated for the NaCl module (i.e. the **untrusted region**): | |
75 | |
76 * R15-40GiB..R15 and R15+4GiB..R15+44GiB are buffer zones with | |
77 PROT_NONE flags. | |
78 * The 4GiB *safe zone* has pages with either PROT_WRITE or PROT_EXEC | |
79 but must not have PROT_WRITE+PROT_EXEC pages. | |
80 * All executable code in PROT_EXEC pages is validatable and | |
81 guaranteed to obey the invariants. | |
82 | |
83 * Trampoline/springboard code is mapped to a non-writable region in | |
84 the *untrusted 84GiB region*; each trampoline/springboard is 32-byte | |
85 aligned and fits within a single *bundle*. | |
86 * The OS must not put any internal structures/code into the untrusted | |
87 region at any time (not using OS dynamic linker, etc) | |
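
The address-space layout described by these invariants can be written down as plain arithmetic. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and treating the safe zone as the half-open range ``[R15, R15+4GiB)`` is an assumption, since the text only says "between R15 and R15+4GiB".

```python
GiB = 1 << 30
SAFE_ZONE_SIZE = 4 * GiB   # size of the safe zone
GUARD_SIZE = 40 * GiB      # size of each PROT_NONE buffer zone


def untrusted_layout(r15):
    """Return the three ranges of the untrusted region as (start, end) pairs."""
    if r15 & 0xFFFFFFFF:
        raise ValueError("low 32 bits of R15 must be zero")
    return {
        "lower_guard": (r15 - GUARD_SIZE, r15),
        "safe_zone": (r15, r15 + SAFE_ZONE_SIZE),
        "upper_guard": (r15 + SAFE_ZONE_SIZE,
                        r15 + SAFE_ZONE_SIZE + GUARD_SIZE),
    }


def in_safe_zone(addr, r15):
    """True if addr lies in [R15, R15+4GiB) (half-open by assumption)."""
    return r15 <= addr < r15 + SAFE_ZONE_SIZE


def page_protection_ok(writable, executable):
    """Safe-zone pages may be writable or executable, but never both."""
    return not (writable and executable)


r15 = 64 * GiB  # arbitrary example base whose low 32 bits are zero
layout = untrusted_layout(r15)
total = layout["upper_guard"][1] - layout["lower_guard"][0]
print(total // GiB)  # 40 + 4 + 40
```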
88 | |
89 .. _x86-64-text-segment-rules: | |
90 | |
91 Text Segment Rules | |
92 ================== | |
93 | |
94 * The validation process must ensure that the text segment complies | |
95 with the following rules. The validation process must complete | |
96 successfully strictly before executing any instruction of the | |
97 untrusted code. | |
98 * The following instructions are illegal and must be rejected by the | |
99 validator (the list is not exhaustive as the validator uses a | |
100 whitelist, not a blacklist; this means there is a large but finite | |
101 list of instructions the validator allows, not a small list of | |
102 instructions the validator rejects): | |
103 | |
104 * any privileged instructions | |
105 * mov to/from segment registers | |
106 * int | |
107 * pusha/popa (not dangerous but not needed for gcc) | |
108 | |
109 * There must be space for at least 32 bytes after the text segment and | |
110 before the next segment in ELF (towards higher addresses) that ends | |
111 strictly at a 64K boundary (a minimum page size for untrusted | |
112 code). This space will be padded with HLT instructions as part of | |
113 the validation process, along with the optional 64K page. | |
114 * Neither instructions nor *pseudo-instructions* are permitted to span | |
115 a 32-byte boundary. | |
116 * The ELF entry address must be 32-byte aligned. | |
117 * Direct CALL/JUMP targets: | |
118 | |
119 * must point to a valid instruction boundary | |
120 * must not point into a *pseudo-instruction* | |
121 * must not point between a *restricted register* (see below for | |
122 definition) producer instruction and its corresponding restricted | |
123 register consumer instruction. | |
124 | |
125 * CALL instructions must be 5 bytes before a 32-byte boundary, so that | |
126 the return address will be 32-byte aligned. | |
127 * Indirect call targets must be 32-byte aligned. Instead of indirect | |
128 CALL/JMP x, use nacljmp and naclcall (see below for definitions of | |
129 these *pseudo-instructions*) | |
130 * All instructions that **read** or **write** from/to memory must use | |
131 one of the four registers RZP, RIP, RBP or RSP as a base, restricted | |
132 (see below) register index (multiplied by 0, 1, 2, 4 or 8) and | |
133 constant displacement (optional). | |
134 | |
135 * Exception to this rule: string instructions are allowed if used in | |
136 the following sequences (the sequences should not cross *bundle* | |
137 boundaries; segment overrides are disallowed): | |
138 | |
139 .. naclcode:: | |
140 :prettyprint: 0 | |
141 | |
142 mov %edi, %edi | |
143 lea (%rZP,%rdi),%rdi | |
144 [rep] stos (other string instructions can be used here) | |
145 | |
146 Note: this is identical to the *pseudo-instruction*: [rep] stos | |
147 %?ax, %nacl:(%rdi),%rZP | |
148 | |
149 * An operand of a command is said to be a **restricted register** iff | |
150 it is a register that is the target of a 32-bit move in the | |
151 immediately-preceding command in the same *bundle* (consider the | |
152 previous command as additional sandboxing prefix): | |
153 | |
154 .. naclcode:: | |
155 :prettyprint: 0 | |
156 | |
157 mov ..., %eXX (any 32-bit register can be used here; the first operand is unrestricted but often is the same register) | |
158 | |
159 * Instructions capable of changing %RBP and %RSP are forbidden, except | |
160 the instruction sequences in the whitelist below, which must not | |
161 cross *bundle* boundaries: | |
162 | |
163 .. naclcode:: | |
164 :prettyprint: 0 | |
165 | |
166 mov %rbp, %rsp | |
167 mov %rsp, %rbp | |
168 mov ..., %ebp | |
169 add %rZP, %rbp (restoration of %RBP from memory, register or stack - keeps the invariant intact) | |
170 mov ..., %esp | |
171 add %rZP, %rsp (restoration of %RSP from memory, register or stack - keeps the invariant intact) | |
172 lea xxx(%rbp), %esp | |
173 add %rZP, %rsp (restoration of %RSP from %RBP with adjust) | |
174 sub ..., %esp | |
175 add %rZP, %rsp (stack space allocation) | |
176 add ..., %esp | |
177 add %rZP, %rsp (stack space deallocation) | |
178 and $XX, %rsp (alignment; XX must be between -128 and -1) | |
179 pushq ... | |
180 popq ... (except pop %RSP, pop %RBP) | |
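
The 32-byte bundle rules in this section reduce to simple offset arithmetic. The following sketch is illustrative only: the helper names are hypothetical and this is not the validator's code. It assumes a direct CALL is 5 bytes long, as the rule above implies.

```python
BUNDLE_SIZE = 32  # bundle size in bytes


def crosses_bundle(offset, length):
    """True if the bytes [offset, offset+length) span a 32-byte boundary."""
    first_bundle = offset // BUNDLE_SIZE
    last_bundle = (offset + length - 1) // BUNDLE_SIZE
    return first_bundle != last_bundle


def call_placement_ok(offset, length=5):
    """True if a CALL at offset ends exactly on a bundle boundary.

    When this holds, the return address (offset + length) is 32-byte aligned.
    """
    return (offset + length) % BUNDLE_SIZE == 0


def is_bundle_aligned(addr):
    """True if addr may be used as an entry address or indirect call target."""
    return addr % BUNDLE_SIZE == 0


print(crosses_bundle(30, 5))       # bytes 30..34 span the boundary at 32
print(call_placement_ok(27))       # 27 + 5 == 32
print(is_bundle_aligned(0x20000))  # start of the text segment
```

The same ``crosses_bundle`` check applies to multi-instruction sequences such as the stack-pointer updates above, by passing the combined length of the sequence.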
181 | |
182 List of Pseudo-instructions | |
183 =========================== | |
184 | |
185 Pseudo-instructions were introduced to let the compiler maintain the | |
186 invariants without needing to know the code alignment rules. The | |
187 assembler guarantees 32-byte alignment for all *pseudo-instructions* in | |
188 the table below. In addition to the pseudo-instructions, one | |
189 pseudo-operand prefix is introduced: %nacl. Presence of the %nacl | |
190 operand prefix ensures that: | |
Sam Clegg
2014/06/12 17:36:16
nit: It looks like your paragraphs are wrapped at
hamaji
2014/06/12 18:03:53
I'm using the default value of emacs, which seems
| |
191 | |
192 * The instruction "mov %eXX, %eXX" is added immediately before the | |
193 actual command using prefix %nacl (where %eXX is a 32-bit part of | |
194 the index register of the actual command, for example: in operand | |
195 %nacl:(,%r11), the notation %eXX is referring to %r11d) | |
Sam Clegg
2014/06/12 17:36:16
Maybe use fixed width for inline asm and register
hamaji
2014/06/12 18:03:53
Done.
| |
196 * The resulting sequence of two instructions does not cross the | |
197 *bundle* boundary. | |
198 | |
199 For example, the instruction: | |
200 | |
201 .. naclcode:: | |
202 :prettyprint: 0 | |
203 | |
204 mov %eax,%nacl:(%r15,%rdi,2) | |
205 | |
206 is translated by the assembler to: | |
207 | |
208 .. naclcode:: | |
209 :prettyprint: 0 | |
210 | |
211 mov %edi,%edi | |
212 mov %eax,(%r15,%rdi,2) | |
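
To see what the inserted 32-bit move achieves, the translated pair can be modeled with integer arithmetic. This is a sketch, not an emulator: the function names are hypothetical, registers are modeled as Python integers, and the reachable range is compared against the buffer zones from the Runtime Invariants section.

```python
GiB = 1 << 30
MASK32 = 0xFFFFFFFF


def mov32(value):
    """Model `mov %eXX,%eXX`: a 32-bit write zero-extends into the full register."""
    return value & MASK32


def effective_address(base, index, scale=1, disp=0):
    """Model the operand disp(base,index,scale)."""
    return base + index * scale + disp


def sandboxed_store_address(r15, rdi, scale=2, disp=0):
    """Address written by `mov %edi,%edi` followed by `mov %eax,disp(%r15,%rdi,scale)`."""
    rdi = mov32(rdi)  # upper 32 bits of the index are discarded
    return effective_address(r15, rdi, scale, disp)


r15 = 64 * GiB  # arbitrary example base whose low 32 bits are zero

# Whatever the upper half of %rdi held, only the low 32 bits survive.
print(hex(sandboxed_store_address(r15, 0x100000010) - r15))

# Worst case with the largest scale and a maximal signed 32-bit displacement:
# the address still falls inside the 84GiB untrusted region.
worst = sandboxed_store_address(r15, 0xDEADBEEFFFFFFFFF, scale=8, disp=2**31 - 1)
print(r15 - 40 * GiB <= worst < r15 + 44 * GiB)
```

An address outside the 4GiB safe zone but inside a buffer zone lands on a PROT_NONE page, so the access cannot reach memory outside the untrusted region.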
213 | |
214 The complete list of introduced *pseudo-instructions* is as follows: | |
215 | |
216 .. raw:: html | |
hamaji
2014/06/12 16:48:59
Let me copy the html in the original document...
Sam Clegg
2014/06/12 17:36:15
Would be nice not use html if possible. Or at lea
hamaji
2014/06/12 18:03:53
Let me leave this with a TODO.
| |
217 | |
218 <table border=1> | |
219 <tbody> | |
220 <tr> | |
221 <td>Pseudo-instruction</td> | |
222 <td>Is translated to<br/> | |
223 </td> | |
224 </tr> | |
225 <tr> | |
226 <td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> | |
227 <i>(sandboxed cmps)</i><br/> | |
228 </td> | |
229 <td>mov %esi,%esi<br/> | |
230 lea (%rZP,%rsi,1),%rsi<br/> | |
231 mov %edi,%edi<br/> | |
232 lea (%rZP,%rdi,1),%rdi<br/> | |
233 [rep] cmps (%rsi),(%rdi)<i><br/> | |
234 </i> | |
235 </td> | |
236 </tr> | |
237 <tr> | |
238 <td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> | |
239 <i>(sandboxed movs)</i><br/> | |
240 </td> | |
241 <td>mov %esi,%esi<br/> | |
242 lea (%rZP,%rsi,1),%rsi<br/> | |
243 mov %edi,%edi<br/> | |
244 lea (%rZP,%rdi,1),%rdi<br/> | |
245 [rep] movs (%rsi),(%rdi)<i><br/> | |
246 </i> | |
247 </td> | |
248 </tr> | |
249 <tr> | |
250 <td>naclasp ...,%rZP<br/> | |
251 <i>(sandboxed stack increment)</i></td> | |
252 <td>add ...,%esp<br/> | |
253 add %rZP,%rsp</td> | |
254 </tr> | |
255 <tr> | |
256 <td>naclcall %eXX,%rZP<br/> | |
257 <i>(sandboxed indirect call)</i></td> | |
258 <td>and $-32, %eXX<br/> | |
259 add %rZP, %rXX<br/> | |
260 call *%rXX<br/> | |
261 <i>Note: the assembler ensures all calls (including | |
262 naclcall) will end at the bundle boundary.</i></td> | |
263 </tr> | |
264 <tr> | |
265 <td>nacljmp %eXX,%rZP<br/> | |
266 <i>(sandboxed indirect jump)</i></td> | |
267 <td>and $-32,%eXX<br/> | |
268 add %rZP,%rXX<br/> | |
269 jmp *%rXX<br/> | |
270 </td> | |
271 </tr> | |
272 <tr> | |
273 <td>naclrestbp ...,%rZP<br/> | |
274 <i>(sandboxed %ebp/rbp restore)</i></td> | |
275 <td>mov ...,%ebp<br/> | |
276 add %rZP,%rbp</td> | |
277 </tr> | |
278 <tr> | |
279 <td>naclrestsp ...,%rZP<br/> | |
280 <i>(sandboxed %esp/rsp restore)</i></td> | |
281 <td>mov ...,%esp<br/> | |
282 add %rZP,%rsp</td> | |
283 </tr> | |
284 <tr> | |
285 <td>naclrestsp_noflags ...,%rZP<br/> | |
286 <i>(sandboxed %esp/rsp restore)</i></td> | |
287 <td>mov ...,%esp<br/> | |
288 lea (%rsp,%rZP,1),%rsp</td> | |
289 </tr> | |
290 <tr> | |
291 <td>naclspadj $N,%rZP<br/> | |
292 <i>(sandboxed %esp/rsp restore from %rbp; includes $N offset)</i></td> | |
293 <td>lea N(%rbp),%esp<br/> | |
294 add %rZP,%rsp</td> | |
295 </tr> | |
296 <tr> | |
297 <td>naclssp ...,%rZP<br/> | |
298 <i>(sandboxed stack decrement)</i></td> | |
299 <td>sub ...,%esp<br/> | |
300 add %rZP,%rsp</td> | |
301 </tr> | |
302 <tr> | |
303 <td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/> | |
304 <i>(sandboxed scas)</i></td> | |
305 <td>mov %edi,%edi<br/> | |
306 lea (%rZP,%rdi,1),%rdi<br/> | |
307 [rep] scas (%rdi),%?ax<br/> | |
308 </td> | |
309 </tr> | |
310 <tr> | |
311 <td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/> | |
312 <i>(sandboxed stos)</i></td> | |
313 <td>mov %edi,%edi<br/> | |
314 lea (%rZP,%rdi,1),%rdi<br/> | |
315 [rep] stos %?ax,(%rdi)<br/> | |
316 </td> | |
317 </tr> | |
318 </tbody> | |
319 </table> | |
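
The effect of the control-flow and stack pseudo-instructions in the table can be modeled in a few lines. This is a sketch under assumptions: the function names are hypothetical, registers are modeled as Python integers, and only ``nacljmp`` and ``naclrestsp`` are shown. The other two-step sequences in the table follow the same pattern.

```python
GiB = 1 << 30
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def nacljmp_target(rxx, r15):
    """Model `and $-32,%eXX; add %rZP,%rXX` and return the jump target."""
    exx = (rxx & MASK32) & (-32 & MASK32)  # 32-bit AND clears the low 5 bits
    return (exx + r15) & MASK64            # result zero-extended, then rebased


def naclrestsp(value, r15):
    """Model `mov ...,%esp; add %rZP,%rsp` and return the new %rsp."""
    esp = value & MASK32                   # 32-bit move zero-extends into %rsp
    return (esp + r15) & MASK64


r15 = 64 * GiB  # arbitrary example base whose low 32 bits are zero

target = nacljmp_target(0x123456789ABCDEF7, r15)
print(hex(target - r15))    # low 32 bits of the input with the low 5 bits cleared
print(target % 32 == 0)     # target is bundle aligned
print(r15 <= target < r15 + 4 * GiB)

rsp = naclrestsp(0xFFFFFFFF00001000, r15)
print(hex(rsp - r15))
```

Because the low 32 bits of R15 are zero, adding it preserves the 32-byte alignment produced by the ``and`` and keeps the result within 4GiB of R15.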