Chromium Code Reviews| Index: native_client_sdk/src/doc/reference/sandbox_internals/x86-64-sandbox.rst |
| diff --git a/native_client_sdk/src/doc/reference/sandbox_internals/x86-64-sandbox.rst b/native_client_sdk/src/doc/reference/sandbox_internals/x86-64-sandbox.rst |
| new file mode 100644 |
| index 0000000000000000000000000000000000000000..6c172dcdf2392b2fe3ad9ac2fc740c4f8ad9a8d6 |
| --- /dev/null |
| +++ b/native_client_sdk/src/doc/reference/sandbox_internals/x86-64-sandbox.rst |
| @@ -0,0 +1,319 @@ |
| +.. _x86-64-sandbox: |
| + |
| +================================ |
| +NaCl SFI model on x86-64 systems |
| +================================ |
| + |
| +.. contents:: |
| + :local: |
| + :backlinks: none |
| + :depth: 2 |
| + |
| +Summary |
| +======= |
| + |
| +This document addresses the details of the Software Fault Isolation |
| +(SFI) model for executable code that can be run in Native Client on an |
| +x86-64 system. An overview of this model can be found in the paper: |
| +`Adapting Software Fault Isolation to Contemporary CPU Architectures |
| +<http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/35649.pdf>`_. |
| +The primary focus of the SFI model is a Windows x86-64 system but the |
| +same techniques can be applied to run identical x86-64 binaries on |
| +other x86-64 systems such as Linux, Mac, FreeBSD, etc, so the |
| +description of the SFI model tries to abstract away system |
| +dependencies when possible. |
| + |
| +Please note: throughout this document we use the AT&T notation for |
| +assembler syntax, in which the target operand appears last, e.g. mov |
| +src, dst. |
| + |
| +Binary Format |
| +============= |
| + |
| +The format of Native Client executable binaries is identical to the |
| +x86-64 ELF binary format (`[0] |
| +<http://en.wikipedia.org/wiki/Executable_and_Linkable_Format>`_, `[1] |
| +<http://www.sco.com/developers/devspecs/gabi41.pdf>`_, `[2] |
| +<http://www.sco.com/developers/gabi/latest/contents.html>`_, `[3] |
| +<http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf>`_) for |
| +Linux or BSD with a few extra requirements. The additional rules that |
| +a Native Client ELF binary must follow are: |
| + |
| +* The ELF magic OS ABI field must be 123. |
| +* The ELF magic OS ABI VERSION field must be 5. |
| +* The ELF e_flags field must be 0x200000 (32-byte alignment). |
| +* There must be exactly one PT_LOAD text segment. It must begin at |
| + 0x20000 (128 kB) and be marked RX (no W). The contents of the text |
| + segment must follow :ref:`Text Segment Rules <x86-64-text-segment-rules>`. |
| +* There can be at most one PT_LOAD data segment marked R. |
| +* There can be at most one PT_LOAD data segment marked RW. |
| +* There can be at most one PT_GNU_STACK segment. It must be marked RW. |
| +* All segments must end before limit address (4 GiB). |
| + |
| +Runtime Invariants |
| +================== |
| + |
| +To ensure fault isolation at runtime, the system must maintain a |
| +number of runtime *invariants* across the lifetime of the running |
| +program. Both the *Validator* and the *Service Runtime* are |
| +responsible for maintaining the invariants. See the paper for the |
| +rationale for the invariants: |
| + |
| +* RIP always points to valid instruction boundary (the validator must |
| + ensure this with direct jumps and direct calls). |
| +* R15 (aka RBASE and RZP) is never modified by code (the validator |
| + must ensure this). Low 32 bits of RZP are all zero (loader must |
| + ensure this). |
| +* RIP, RBP and RSP are always in the **safe zone**: between R15 and |
| + R15+4GiB |
| + |
| + * Exception: RSP and RBP are allowed to be in the range of 0..4GiB |
| + inside pseudo-instructions: naclrestbp, naclrestsp, naclspadj, |
| + naclasp, naclssp. |
| + |
| +* 84GiB are allocated for NaCl module (i.e. **untrusted region**): |
| + |
| + * R15-40GiB..R15 and R15+4GIB..R15+44GiB are buffer zones with |
| + PROT_NONE flags. |
| + * The 4GB *safe zone* has pages with either PROT_WRITE or PROT_EXEC |
| + but must not have PROT_WRITE+PROT_EXEC pages. |
| + * All executable code in PROT_EXEC pages is validatable and |
| + guaranteed to obey the invariant. |
| + |
| +* Trampoline/springboard code is mapped to a non-writable region in |
| + the *untrusted 84GB region*; each trampoline/springboard is 32-byte |
| + aligned and fits within a single *bundle*. |
| +* The OS must not put any internal structures/code into the untrusted |
| + region at any time (not using OS dynamic linker, etc) |
| + |
| +.. _x86-64-text-segment-rules: |
| + |
| +Text Segment Rules |
| +================== |
| + |
| +* The validation process must ensure that the text segment complies |
| + with the following rules. The validation process must complete |
| + successfully strictly before executing any instruction of the |
| + untrusted code. |
| +* The following instructions are illegal and must be rejected by the |
| + validator (the list is not exhaustive as the validator uses a |
| + whiteist, not a blacklist; this means there is a large but finite |
| + list of instructions the validator allows, not a small list of |
| + instructions the validator rejects): |
| + |
| + * any privileged instructions |
| + * mov to/from segment registers |
| + * int |
| + * pusha/popa (not dangerous but not needed for gcc) |
| + |
| +* There must be space for at least 32 bytes after the text segment and |
| + before the next segment in ELF (towards higher addresses) that ends |
| + strictly at a 64K boundary (a minimum page size for untrusted |
| + code). This space will be padded with HLT instructions as part of |
| + the validation process, along with the optional 64K page. |
| +* Neither instructions nor *pseudo-instructions* are permitted to span |
| + a 32-byte boundary. |
| +* The ELF entry address must be 32-byte aligned. |
| +* Direct CALL/JUMP targets: |
| + |
| + * must point to a valid instruction boundary |
| + * must not point into a *pseudo-instruction* |
| + * must not point between a *restricted register* (see below for |
| + definition) producer instruction and it's corresponding restricted |
| + register consumer instruction. |
| + |
| +* CALL instructions must be 5 bytes before a 32-byte boundary, so that |
| + the return address will be 32-byte aligned. |
| +* Indirect call targets must be 32-byte aligned. Instead of indirect |
| + CALL/JMP x, use nacljmp and naclcall (see below for definitions of |
| + these *pseudo-instructions*) |
| +* All instructions that **read** or **write** from/to memory must use |
| + one of the four registers RZP, RIP, RBP or RSP as a base, restricted |
| + (see below) register index (multiplied by 0, 1, 2, 4 or 8) and |
| + constant displacement (optional). |
| + |
| + * Exception to this rule: string instructions are allowed if used in |
| + following sequences (the sequences should not cross *bundle* |
| + boundaries; segment overrides are disallowed): |
| + |
| + .. naclcode:: |
| + :prettyprint: 0 |
| + |
| + mov %edi, %edi |
| + lea (%rZP,%rdi),%rdi |
| + [rep] stos (other string instructions can be used here) |
| + |
| + Note: this is identical to the *pseudo-instruction*: [rep] stos |
| + %?ax, %nacl:(%rdi),%rZP |
| + |
| +* An operand of a command is said to be a **restricted register** iff |
| + it is a register that is the target of a 32-bit move in the |
| + immediately-preceding command in the same *bundle* (consider the |
| + previous command as additional sandboxing prefix): |
| + |
| + .. naclcode:: |
| + :prettyprint: 0 |
| + |
| + mov ..., %eXX (any 32-bit register can be used here; the first operand is unrestricted but often is the same register) |
| + |
| +* Instructions capable of changing %RBP and %RSP are forbidden, except |
| + the instruction sequences in the whitelist below, which must not |
| + cross *bundle* boundaries: |
| + |
| + .. naclcode:: |
| + :prettyprint: 0 |
| + |
| + mov %rbp, %rsp |
| + mov %rsp, %rbp |
| + mov ..., %ebp |
| + add %rZP, %rbp (restoration of %RBP from memory, register or stack - keeps the invariant intact) |
| + mov ..., %esp |
| + add %rZP, %rsp (restoration of %RSP from memory, register or stack - keeps the invariant intact) |
| + lea xxx(%rbp), %esp |
| + add %rZP, %rsp (restoration of %RSP from %RBP with adjust) |
| + sub ..., %esp |
| + add %rZP, %rsp (stack space allocation) |
| + add ..., %esp |
| + add %rZP, %rsp (stack space deallocation) |
| + and $XX, %rsp (alignment; XX must be between -128 and -1) |
| + pushq ... |
| + popq ...(except pop %RSP, pop %RBP) |
| + |
| +List of Pseudo-instructions |
| +=========================== |
| + |
| +Pseudo-instructions were introduced to let the compiler maintain the |
| +invariants without needing to know the code alignment rules. The |
| +assembler guarantees 32-bit alignment for all *pseudo-instructions* in |
| +the table below. In addition, to the pseudo-instructions, one |
| +pseudo-operand prefix is introduced: %nacl. Presence of the %nacl |
| +operand prefix ensures that: |
|
Sam Clegg
2014/06/12 17:36:16
nit: It looks like your paragraphs are wrapped at
hamaji
2014/06/12 18:03:53
I'm using the default value of emacs, which seems
|
| + |
| +* The instruction "%mov %eXX, %eXX" is added immediately before the |
| + actual command using prefix %nacl (where %eXX is a 32-bit part of |
| + the index register of the actual command, for example: in operand |
| + %nacl:(,%r11), the notation %eXX is referring to %r11d) |
|
Sam Clegg
2014/06/12 17:36:16
Maybe use fixed width for inline asm and register
hamaji
2014/06/12 18:03:53
Done.
|
| +* The resulting sequence of two instructions does not cross the |
| + *bundle* boundary. |
| + |
| +For example, the instruction: |
| + |
| +.. naclcode:: |
| + :prettyprint: 0 |
| + |
| + mov %eax,%nacl:(%r15,%rdi,2) |
| + |
| +is translated by the assembler to: |
| + |
| +.. naclcode:: |
| + :prettyprint: 0 |
| + |
| + mov %edi,%edi |
| + mov %eax,(%r15,%rdi,2) |
| + |
| +The complete list of introduced *pseudo-instructions* is as follows: |
| + |
| +.. raw:: html |
|
hamaji
2014/06/12 16:48:59
Let me copy the html in the original document...
Sam Clegg
2014/06/12 17:36:15
Would be nice not use html if possible. Or at lea
hamaji
2014/06/12 18:03:53
Let me leave this with a TODO.
|
| + |
| + <table border=1> |
| + <tbody> |
| + <tr> |
| + <td>Pseudo-instruction</td> |
| + <td>Is translated to<br/> |
| + </td> |
| + </tr> |
| + <tr> |
| + <td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> |
| + <i>(sandboxed cmps)</i><br/> |
| + </td> |
| + <td>mov %esi,%esi<br/> |
| + lea (%rZP,%rsi,1),%rsi<br/> |
| + mov %edi,%edi<br/> |
| + lea (%rZP,%rdi,1),%rdi<br/> |
| + [rep] cmps (%rsi),(%rdi)<i><br/> |
| + </i> |
| + </td> |
| + </tr> |
| + <tr> |
| + <td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/> |
| + <i>(sandboxed movs)</i><br/> |
| + </td> |
| + <td>mov %esi,%esi<br/> |
| + lea (%rZP,%rsi,1),%rsi<br/> |
| + mov %edi,%edi<br/> |
| + lea (%rZP,%rdi,1),%rdi<br/> |
| + [rep] movs (%rsi),(%rdi)<i><br/> |
| + </i> |
| + </td> |
| + </tr> |
| + <tr> |
| + <td>naclasp ...,%rZP<br/> |
| + <i>(sandboxed stack increment)</i></td> |
| + <td>add ...,%esp<br/> |
| + add %rZP,%rsp</td> |
| + </tr> |
| + <tr> |
| + <td>naclcall %eXX,%rZP<br/> |
| + <i>(sandboxed indirect call)</i></td> |
| + <td>and $-32, %eXX<br/> |
| + add %rZP, %rXX<br/> |
| + call *%rXX<br/> |
| + <i>Note: the assembler ensures all calls (including |
| + naclcall) will end at the bundle boundary.</i></td> |
| + </tr> |
| + <tr> |
| + <td>nacljmp %eXX,%rZP<br/> |
| + <i>(sandboxed indirect jump)</i></td> |
| + <td>and $-32,%eXX<br/> |
| + add %rZP,%rXX<br/> |
| + jmp *%rXX<br/> |
| + </td> |
| + </tr> |
| + <tr> |
| + <td>naclrestbp ...,%rZP<br/> |
| + <i>(sandboxed %ebp/rbp restore)</i></td> |
| + <td>mov ...,%ebp<br/> |
| + add %rZP,%rbp</td> |
| + </tr> |
| + <tr> |
| + <td>naclrestsp ...,%rZP |
| + <i>(sandboxed %esp/rsp restore)</i></td> |
| + <td>mov ...,%esp<br/> |
| + add %rZP,%rsp</td> |
| + </tr> |
| + <tr> |
| + <td>naclrestsp_noflags ...,%rZP |
| + <i>(sandboxed %esp/rsp restore)</i></td> |
| + <td>mov ...,%esp<br/> |
| + lea (%rsp,%rZP,1),%rsp</td> |
| + </tr> |
| + <tr> |
| + <td>naclspadj $N,%rZP<br/> |
| + <i>(sandboxed %esp/rsp restore from %rbp; incudes $N offset)</i></td> |
| + <td>lea N(%rbp),%esp<br/> |
| + add %rZP,%rsp</td> |
| + </tr> |
| + <tr> |
| + <td>naclssp ...,%rZP<br/> |
| + <i>(sandboxed stack decrement)</i></td> |
| + <td>sub ...,%esp<br/> |
| + add %rZP,%rsp</td> |
| + </tr> |
| + <tr> |
| + <td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/> |
| + <i>(sandboxed stos)</i></td> |
| + <td>mov %edi,%edi<br/> |
| + lea (%rZP,%rdi,1),%rdi<br/> |
| + [rep] scas (%rdi),%?ax<br/> |
| + </td> |
| + </tr> |
| + <tr> |
| + <td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/> |
| + <i>(sandboxed stos)</i></td> |
| + <td>mov %edi,%edi<br/> |
| + lea (%rZP,%rdi,1),%rdi<br/> |
| + [rep] stos %?ax,(%rdi)<br/> |
| + </td> |
| + </tr> |
| + </tbody> |
| + </table></div></td></tr></tbody></table> |