Index: native_client_sdk/doc_generated/reference/pnacl-undefined-behavior.html |
diff --git a/native_client_sdk/doc_generated/reference/pnacl-undefined-behavior.html b/native_client_sdk/doc_generated/reference/pnacl-undefined-behavior.html |
new file mode 100644 |
index 0000000000000000000000000000000000000000..ade2b6faac706cd5f63bc01071e00850b1a52a3c |
--- /dev/null |
+++ b/native_client_sdk/doc_generated/reference/pnacl-undefined-behavior.html |
@@ -0,0 +1,243 @@ |
+{{+bindTo:partials.standard_nacl_article}} |
+ |
+<section id="pnacl-undefined-behavior"> |
+<h1 id="pnacl-undefined-behavior">PNaCl Undefined Behavior</h1> |
+<div class="contents local" id="contents" style="display: none"> |
+<ul class="small-gap"> |
+<li><a class="reference internal" href="#overview" id="id1">Overview</a></li> |
+<li><a class="reference internal" href="#specification" id="id2">Specification</a></li> |
+<li><p class="first"><a class="reference internal" href="#behavior-in-pnacl-bitcode" id="id3">Behavior in PNaCl Bitcode</a></p> |
+<ul class="small-gap"> |
+<li><a class="reference internal" href="#well-defined" id="id4">Well Defined</a></li> |
+<li><p class="first"><a class="reference internal" href="#not-well-defined" id="id5">Not Well Defined</a></p> |
+<ul class="small-gap"> |
+<li><a class="reference internal" href="#potentially-fixable" id="id6">Potentially Fixable</a></li> |
+<li><a class="reference internal" href="#floating-point" id="id7">Floating Point</a></li> |
+<li><a class="reference internal" href="#hard-to-fix" id="id8">Hard to Fix</a></li> |
+</ul> |
+</li> |
+</ul> |
+</li> |
+</ul> |
+ |
+</div><section id="overview"> |
+<span id="undefined-behavior"></span><h2 id="overview"><span id="undefined-behavior"></span>Overview</h2> |
+<p>C and C++ undefined behavior allows efficient mapping of the source |
+language onto hardware, but leads to different behavior on different |
+platforms.</p> |
+<p>PNaCl exposes undefined behavior in the following ways:</p> |
+<ul class="small-gap"> |
+<li><p class="first">The Clang front-end and optimizations that occur on the developer’s |
+machine determine what behavior will occur, and it will be specified |
+deterministically in the <em>pexe</em>. All targets will observe the same |
+behavior. In some cases, recompiling with a newer PNaCl SDK version |
+will either:</p> |
+<ul class="small-gap"> |
+<li>Reliably emit the same behavior in the resulting <em>pexe</em>.</li> |
+<li>Change the behavior that gets specified in the <em>pexe</em>.</li> |
+</ul> |
+</li> |
+<li><p class="first">The behavior specified in the <em>pexe</em> relies on PNaCl’s bitcode, |
+runtime or CPU architecture vagaries.</p> |
+<ul class="small-gap"> |
+<li>In some cases, the behavior using the same PNaCl translator version |
+on different architectures will produce different behavior.</li> |
+<li>Sometimes runtime parameters determine the behavior, e.g. memory |
+allocation determines which out-of-bounds accesses crash versus |
+returning garbage.</li> |
+<li>In some cases, different versions of the PNaCl translator |
+(i.e. after a Chrome update) will compile the code differently and |
+cause different behavior.</li> |
+<li>In some cases, the same versions of the PNaCl translator, on the |
+same architecture, will generate a different <em>nexe</em> for |
+defense-in-depth purposes, but may cause code that reads invalid |
+stack values or code sections on the heap to observe these |
+randomizations.</li> |
+</ul> |
+</li> |
+</ul> |
+</section><section id="specification"> |
+<h2 id="specification">Specification</h2> |
+<p>PNaCl’s goal is that a single <em>pexe</em> should work reliably in the same |
+manner on all architectures, irrespective of runtime parameters and |
+through Chrome updates. This goal is unfortunately not attainable, PNaCl |
+therefore specifies as much as it can and outlines areas for |
+improvement.</p> |
+<p>One interesting solution is to offer good support for LLVM’s sanitizer |
+tools (including <a class="reference external" href="http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation">UBSan</a>) |
+at development time, so that developers can test their code against |
+undefined behavior. Shipping code would then still get good performance, |
+and diverging behavior would be rare.</p> |
+<p>Note that none of these issues are vulnerabilities in PNaCl and Chrome: |
+the NaCl sandboxing still constrains the code through Software Fault |
+Isolation.</p> |
+</section><section id="behavior-in-pnacl-bitcode"> |
+<h2 id="behavior-in-pnacl-bitcode">Behavior in PNaCl Bitcode</h2> |
+<section id="well-defined"> |
+<h3 id="well-defined">Well Defined</h3> |
+<p>The following are traditionally undefined behavior in C/C++ but are well |
+defined at the <em>pexe</em> level:</p> |
+<ul class="small-gap"> |
+<li>Dynamic initialization order dependencies: the order is deterministic |
+in the <em>pexe</em>.</li> |
+<li>Bool which isn’t <code>0</code>/<code>1</code>: the bitcode instruction sequence is |
+deterministic in the <em>pexe</em>.</li> |
+<li>Out-of-range <code>enum</code> value: the backing integer type and bitcode |
+instruction sequence is deterministic in the <em>pexe</em>.</li> |
+<li>Reaching end-of-value-returning-function without returning a value: |
+reduces to <code>ret i32 undef</code> in bitcode.</li> |
+<li>Aggressive optimizations based on type-based alias analysis: TBAA |
+optimizations are done before stable bitcode is generated and their |
+metadata is stripped from the <em>pexe</em>, behavior is therefore |
+deterministic in the <em>pexe</em>.</li> |
+<li>Operator and subexpression evaluation order in the same expression |
+(e.g. function parameter passing, or pre-increment): the order is |
+defined in the <em>pexe</em>.</li> |
+<li>Signed integer overflow: two’s complement integer arithmetic is |
+assumed.</li> |
+<li>Atomic access to a non-atomic memory location (not declared as |
+<code>std::atomic</code>): atomics and <code>volatile</code> variables all lower to the |
+same compatible intrinsics or external functions, the behavior is |
+therefore deterministic in the <em>pexe</em> (see <a class="reference internal" href="/native-client/reference/pnacl-c-cpp-language-support.html#memory-model-and-atomics"><em>Memory Model and |
+Atomics</em></a>).</li> |
+<li>Integer divide by zero: always raises a fault (through hardware on |
+x86, and through integer divide emulation routine or explicit checks |
+on ARM).</li> |
+</ul> |
+</section><section id="not-well-defined"> |
+<h3 id="not-well-defined">Not Well Defined</h3> |
+<p>The following are traditionally undefined behavior in C/C++ which also |
+exhibit undefined behavior at the <em>pexe</em> level. Some are easier to fix |
+than others.</p> |
+<section id="potentially-fixable"> |
+<h4 id="potentially-fixable">Potentially Fixable</h4> |
+<ul class="small-gap"> |
+<li><p class="first">Shift by greater-than-or-equal to left-hand-side’s bit-width or |
+negative (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3604">bug 3604</a>).</p> |
+<ul class="small-gap"> |
+<li>Some of the behavior will be specified in the <em>pexe</em> depending on |
+constant propagation and integer type of variables.</li> |
+<li>There is still some architecture specific behavior.</li> |
+<li>PNaCl could force-mask the right-hand-side to <cite>bitwidth-1</cite>, which |
+could become a no-op on some architectures while ensuring all |
+architectures behave similarly. Regular optimizations could also be |
+applied, removing redundant masks.</li> |
+</ul> |
+</li> |
+<li><p class="first">Using a virtual pointer of the wrong type, or of an unallocated |
+object.</p> |
+<ul class="small-gap"> |
+<li>Will produce wrong results which will depend on what data is treated |
+as a <cite>vftable</cite>.</li> |
+<li>PNaCl could add runtime checks for this, and elide them when types |
+are provably correct (see this CFI <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3786">bug 3786</a>).</li> |
+</ul> |
+</li> |
+<li><p class="first">Some unaligned load/store (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3445">bug 3445</a>).</p> |
+<ul class="small-gap"> |
+<li>Could force everything to <cite>align 1</cite>, performance cost should be |
+measured.</li> |
+<li>The frontend could also be more pessimistic when it sees dubious |
+casts.</li> |
+<li>The</li> |
+</ul> |
+</li> |
+<li><p class="first">Reaching “unreachable” code.</p> |
+<ul class="small-gap"> |
+<li>LLVM provides an IR instruction called “unreachable” whose effect |
+will be undefined. PNaCl could change this to always trap, as the |
+<code>llvm.trap</code> intrinsic does.</li> |
+</ul> |
+</li> |
+<li>Zero or negative-sized variable-length array (and <code>alloca</code>): can |
+insert checks with <code>-fsanitize=vla-bound</code>.</li> |
+</ul> |
+</section><section id="floating-point"> |
+<h4 id="floating-point">Floating Point</h4> |
+<p>PNaCl offers a IEEE-754 implementation which is as correct as the |
+underlying hardware allows, with a few limitations. These are a few |
+sources of undefined behavior which are believed to be fixable:</p> |
+<ul class="small-gap"> |
+<li>Float cast overflow is currently undefined.</li> |
+<li>Float divide by zero is currently undefined.</li> |
+<li>Different rounding modes are currently not usable, which isn’t |
+IEEE-754 compliant. PNaCl could support switching modes (the 4 modes |
+exposed by C99 <code>FLT_ROUNDS</code> macros).</li> |
+<li>The default denormal behavior is currently unspecified, which isn’t |
+IEEE-754 compliant (denormals must be supported in IEEE-754). PNaCl |
+could mandate flush-to-zero, and may give an API to enable denormals |
+in a future release. The latter is problematic for SIMD and |
+vectorization support, where some platforms do not support denormal |
+SIMD operations.</li> |
+<li><code>NaN</code> values are currently not guaranteed to be canonical, see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3536">bug |
+3536</a>.</li> |
+<li>It is currently unspecified whether signaling <code>NaN</code> faults.</li> |
+<li>Passing <code>NaN</code> to STL functions (the math is defined, but the |
+function implementation isn’t, e.g. <code>std::min</code> and <code>std::max</code>), is |
+well-defined in the <em>pexe</em>.</li> |
+<li><p class="first">Fast-math optimizations are currently supported before <em>pexe</em> creation |
+time. A <em>pexe</em> loses all fast-math information when it is |
+created. Fast-math translation could be enabled at a later date, |
+potentially at a perf-function granularity. This wouldn’t affect |
+already-existing <em>pexe</em>, it would be an opt-in feature.</p> |
+<ul class="small-gap"> |
+<li>Fused-multiply-add have higher precision and often execute faster, |
+PNaCl currently disallows them in the <em>pexe</em>. PNaCl could (but |
+currently doesn’t) only generate them in the backend if fast-math |
+was specified and the hardware supports the operation.</li> |
+<li>Transcendentals aren’t exposed by PNaCl’s ABI, they are part of the |
+math library that is included in the <em>pexe</em>. PNaCl could, but |
+currently doesn’t, use hardware support if fast-math were provided |
+in the <em>pexe</em>.</li> |
+</ul> |
+</li> |
+</ul> |
+</section><section id="hard-to-fix"> |
+<h4 id="hard-to-fix">Hard to Fix</h4> |
+<ul class="small-gap"> |
+<li><p class="first">Null pointer/reference has behavior determined by the NaCl sandbox:</p> |
+<ul class="small-gap"> |
+<li>Raises a segmentation fault in the bottom <code>64KiB</code> bytes on all |
+platforms, and on some sandboxes there are further non-writable |
+pages after the initial <code>64KiB</code>.</li> |
+<li>Negative offsets aren’t handled consistently on all platforms: |
+x86-64 and ARM will wrap around to the stack (because they mask the |
+address), whereas x86-32 will fault (because of segmentation).</li> |
+</ul> |
+</li> |
+<li><p class="first">Accessing uninitialized/free’d memory (including out-of-bounds array |
+access):</p> |
+<ul class="small-gap"> |
+<li>Might cause a segmentation fault or not, depending on where memory |
+is allocated and how it gets reclaimed.</li> |
+<li>Added complexity because of the NaCl sandboxing: some of the |
+load/stores might be forced back into sandbox range, or eliminated |
+entirely if they fall out of the sandbox.</li> |
+</ul> |
+</li> |
+<li><p class="first">Executing non-program data (jumping to an address obtained from a |
+non-function pointer is undefined, can only do <code>void(*)()</code> to |
+<code>intptr_t</code> to <code>void(*)()</code>).</p> |
+<ul class="small-gap"> |
+<li>Just-In-Time code generation is supported by NaCl, but will not be |
+supported by PNaCl’s first release. It will not be possible to mark |
+code as executable in the first release.</li> |
+<li>Offering full JIT capabilities would reduce PNaCl’s ability to |
+change the sandboxing model. It would also require a “jump to JIT |
+code” syscall (to guarantee a calling convention), and means that |
+JITs aren’t portable.</li> |
+<li>PNaCl could offer “portable” JIT capabilities where the code hands |
+PNaCl some form of LLVM IR, which PNaCl then JIT-compiles.</li> |
+</ul> |
+</li> |
+<li>Out-of-scope variable usage: will produce unknown data, mostly |
+dependent on stack and memory allocation.</li> |
+<li>Data races: any two operations that conflict (target overlapping |
+memory), at least one of which is a store or atomic read-modify-write, |
+and at least one of which is not atomic: this will be very dependent |
+on processor and execution sequence, see <a class="reference internal" href="/native-client/reference/pnacl-c-cpp-language-support.html#memory-model-and-atomics"><em>Memory Model and |
+Atomics</em></a>.</li> |
+</ul> |
+</section></section></section></section> |
+ |
+{{/partials.standard_nacl_article}} |