Index: docs/PNaClLangRef.rst |
diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst |
index b1d39a7187806fc36a4dc8a54d38feb6441eb48d..488c61ba16a25ed4dfeb61c3220ce06170475cf8 100644 |
--- a/docs/PNaClLangRef.rst |
+++ b/docs/PNaClLangRef.rst |
@@ -106,21 +106,99 @@ Volatile Memory Accesses |
`LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_ |
-TODO: are we going to promote volatile to atomic? |
+The C and C++ standards mandate that ``volatile`` accesses execute in |
eliben
2013/06/27 19:32:24
The C11 and C++11 standards...
JF
2013/06/27 21:03:17
These restrictions are actually all in the previou
|
+program order (but are not fences, so other memory operations can |
+reorder around them), are not necessarily atomic, and can’t be elided or |
+fused. |
+ |
+The PNaCl toolchain applies regular LLVM optimizations along these |
+guidelines, but prevents any load/store (even non-``volatile`` and |
+non-atomic ones) from moving past a volatile operations: they act as |
eliben
2013/06/27 19:32:24
"past volatile operations"?
JF
2013/06/27 21:03:17
I mean: a regular load/store can't move above or b
|
+compiler barriers before optimizations occur. The PNaCl toolchain |
+freezes ``volatile`` accesses after optimizations into atomic accesses |
+with sequential consistency memory ordering. This eases the support of |
eliben
2013/06/27 19:32:24
sequentially consistent memory ordering?
JF
2013/06/27 21:03:17
Done.
|
+legacy (i.e. non-C11/C++11) code, and combined with builtin fences these |
+programs can do meaningful cross-thread communication without changing |
+code. It also reflects the original code's intent and guarantees better |
+portability. |
+ |
+Relaxed ordering could be used instead, but for the first release it is |
+more conservative to apply sequential consistency. Future releases may |
+change what happens at compile-time, but already-released pexes will |
+continue using sequential consistency. |
+ |
+The PNaCl toolchain also requires that ``volatile`` accesses be at least |
+naturally aligned, and tries to guarantee this alignment. |
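As a rough illustration of the freezing described in this hunk (a sketch under stated assumptions, not the toolchain's actual output): after pexe creation, a legacy `volatile` flag behaves roughly as if it had been written as a sequentially consistent C++11 atomic. The names `payload`, `ready`, `producer`, `consumer` and `run_handoff` are hypothetical and do not appear in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical hand-off between two threads. In legacy code `ready` would be
// a plain `volatile int`; here it is spelled as the sequentially consistent
// atomic that such an access is described as being frozen into.
static int payload = 0;
static std::atomic<int> ready{0};

static void producer() {
  payload = 42;                               // ordinary, non-atomic store
  ready.store(1, std::memory_order_seq_cst);  // publishes the payload
}

static int consumer() {
  // Spin until the producer's store to `ready` becomes visible.
  while (ready.load(std::memory_order_seq_cst) == 0) {
    std::this_thread::yield();
  }
  return payload;  // ordered after the producer's store to `payload`
}

int run_handoff() {
  payload = 0;
  ready.store(0);
  int seen = 0;
  std::thread t1(producer);
  std::thread t2([&seen] { seen = consumer(); });
  t1.join();
  t2.join();
  assert(ready.load() == 1);  // both threads finished, so the flag is set
  return seen;
}
```

With a plain `volatile int` and no fences, standard C++ gives no such cross-thread guarantee; the sketch only shows the behavior the patch describes for frozen accesses.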
Memory Model for Concurrent Operations |
-------------------------------------- |
`LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_ |
-TODO. |
+The memory model offered by PNaCl relies the same coding guidelines as |
eliben
2013/06/27 19:32:24
"relies on the same" ?
JF
2013/06/27 21:03:17
Done.
|
+the C11/C++11 one: concurrent accesses must always occur through atomic |
+primitives, and these accesses must always occur with the same size for |
+the same memory locations. Visibility of stores is provided on a |
+happens-before basis that relates memory locations to each other as the |
+C11/C++11 standards do. |
+ |
+As in C11/C++11, some atomic accesses may be implemented with locks on |
+certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be |
+``1``, signifying that all types are sometimes lock-free. The |
+``is_lock_free`` methods will return the current platform's |
+implementation at runtime. |
+ |
+The PNaCl toolchain supports concurrent memory accesses through legacy |
+GCC-style ``__sync_*`` builtins, as well as through C11/C++11 atomic |
+primitives. ``volatile`` memory accesses can also be used, though these |
+are discouraged. |
+ |
+Note that PNaCl explicitly supports concurrency through threading and |
+inter-process communication (shared memory), but doesn't support |
+interacting with device memory. Setting these up requires assistance from |
+the embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using |
+them once set up can be done through regular C/C++ code. |
+ |
+PNaCl also doesn't currently support signal handling, and therefore |
+promotes all primitives to cross-thread (instead of single-thread). This |
+may change at a later date. |
+ |
+The PNaCl toolchain currently optimizes for memory ordering as LLVM |
+normally does, but at pexe creation time it promotes all ``volatile`` |
+accesses as well as all atomic accesses to be sequentially |
+consistent. Other memory orderings will be supported in a future |
+release, but pexes generated with the current toolchain will continue |
+functioning with sequential consistency. Using sequential consistency |
+provides a total ordering for all sequentially-consistent operations on |
+all addresses. |
+ |
+This means that ``volatile`` and atomic memory accesses can only be |
+re-ordered in some limited way before the pexe is created, and will act |
+as fences for all memory accesses (even non-atomic and non-``volatile``) |
+after pexe creation. Non-atomic and non-``volatile`` memory accesses may |
+be reordered (unless a fence intervenes), separated, elided or fused |
+according to C and C++'s memory model before the pexe is created as well |
+as after its creation. |
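The two supported styles mentioned in this section can be compared side by side. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler for the `__sync_*` builtin; the function names (`legacy_increment`, `modern_increment`, `counter_is_lock_free`, `int_lock_free_macro`, `count_both_ways`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Legacy GCC-style builtin: atomically adds 1 and returns the previous value.
int32_t legacy_increment(int32_t* counter) {
  return __sync_fetch_and_add(counter, 1);
}

// C++11 spelling of the same operation, with the ordering made explicit.
int32_t modern_increment(std::atomic<int32_t>& counter) {
  return counter.fetch_add(1, std::memory_order_seq_cst);
}

// Runtime query: reports whether this platform implements the type
// without locks.
bool counter_is_lock_free() {
  std::atomic<int32_t> probe{0};
  return probe.is_lock_free();
}

// Compile-time macro. The patch says PNaCl always reports 1 ("sometimes
// lock-free"); a native toolchain may report a different value.
int int_lock_free_macro() {
  return ATOMIC_INT_LOCK_FREE;
}

int count_both_ways() {
  int32_t legacy = 0;
  std::atomic<int32_t> modern{0};
  for (int i = 0; i < 3; ++i) {
    legacy_increment(&legacy);
    modern_increment(modern);
  }
  assert(legacy == modern.load());  // both styles counted the same steps
  return legacy + modern.load();
}
```

Both increments return the value held before the addition, so portable code can switch between the two styles without changing its logic.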
Atomic Memory Ordering Constraints |
---------------------------------- |
`LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_ |
-TODO. |
+Atomics follow the same ordering constraints as in regular LLVM, but all |
+accesses are promoted to sequential consistency (the strongest memory |
+ordering) at pexe creation time. We may relax these rules and honor the |
+program's memory ordering constraints as more C11/C++11 code allows us |
+to understand performance and portability needs. |
+ |
+As in C11/C++11: |
+ |
+ - Atomic and volatile accesses must at least be naturally aligned. |
+ - Some accesses may not actually be atomic on certain platforms, |
+ requiring an implementation that uses a global lock. |
+ - An atomic memory location must always be accesses with atomic |
eliben
2013/06/27 19:32:24
accessed
JF
2013/06/27 21:03:17
Done.
|
+ primitives, and these primitives must always be of the same type for |
+ that location. |
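The natural-alignment requirement in the list above can be checked with a small helper. This is only an illustrative sketch: `is_naturally_aligned` and `demo_alignment` are hypothetical names, and the patch does not say how the toolchain itself verifies alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `p` is aligned to `size` bytes, i.e. naturally aligned for an
// access of that size.
bool is_naturally_aligned(const void* p, std::size_t size) {
  assert(size != 0);
  return reinterpret_cast<std::uintptr_t>(p) % size == 0;
}

bool demo_alignment() {
  alignas(8) static unsigned char buffer[16];
  // The start of the buffer is suitable for a 64-bit access.
  bool aligned64 = is_naturally_aligned(buffer, 8);
  // One byte in, the address is odd, so a 32-bit access is not naturally
  // aligned there.
  bool misaligned32 = !is_naturally_aligned(buffer + 1, 4);
  return aligned64 && misaligned32;
}
```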
Fast-Math Flags |
--------------- |
@@ -270,14 +348,6 @@ Only the LLVM instructions listed here are supported by PNaCl bitcode. |
The pointer argument of these instructions must be a *normalized* pointer |
(see :ref:`pointer types <pointertypes>`). |
-* ``fence`` |
-* ``cmpxchg``, ``atomicrmw`` |
- |
- The pointer argument of these instructions must be a *normalized* pointer |
- (see :ref:`pointer types <pointertypes>`). |
- |
- TODO(jfb): this may change |
- |
* ``trunc`` |
* ``zext`` |
* ``sext`` |
@@ -316,8 +386,6 @@ Intrinsic Functions |
The only intrinsics supported by PNaCl bitcode are the following. |
-TODO(jfb): atomics |
- |
* ``llvm.memcpy`` |
* ``llvm.memmove`` |
* ``llvm.memset`` |
@@ -346,3 +414,67 @@ TODO(jfb): atomics |
TODO: describe |
+* ``llvm.nacl.atomic.store`` |
+* ``llvm.nacl.atomic.load`` |
+* ``llvm.nacl.atomic.rmw`` |
+* ``llvm.nacl.atomic.cmpxchg`` |
+* ``llvm.nacl.atomic.fence`` |
+ |
+ These intrinsics allow PNaCl to support C11/C++11 style atomic |
+ operations as well as some legacy GCC-style ``__sync_*`` builtins |
+ while remaining stable as the LLVM codebase changes. The user isn't |
+ expected to use these intrinsics directly. |
+ |
+ :: |
+ |
+ declare void @llvm.nacl.atomic.store( |
+ iN* <dest>, iN <val>, i32 <memory_order>) |
+ declare iN @llvm.nacl.atomic.load( |
+ iN* <src>, i32 <memory_order>) |
+ declare iN @llvm.nacl.atomic.rmw( |
+ i32 <op>, iN* <loc>, iN <val>, i32 <memory_order>) |
+ declare iN @llvm.nacl.atomic.compare_exchange( |
eliben
2013/06/27 19:32:24
name mismatch with above
JF
2013/06/27 21:03:17
Done. I went with the C11/C++11 naming and forgot
|
+ iN* <loc>, iN <expected>, iN <desired>, |
+ i32 <memory_order_success>, i32 <memory_order_failure>) |
+ declare void @llvm.nacl.atomic.fence(i32 <memory_order>) |
+ |
+ Each of these intrinsic is overloaded on the ``iN`` argument. Integral |
eliben
2013/06/27 19:32:24
intrinsics
JF
2013/06/27 21:03:17
Done.
|
+ types of 8, 16, 32 and 64-bit width are currently supported for these |
+ ``iN`` arguments. |
+ |
+ The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following |
+ read-modify-write operations, from the general and arithmetic sections |
+ of the C11/C++11 standards: |
+ |
+ - ``add`` |
+ - ``sub`` |
+ - ``or`` |
+ - ``and`` |
+ - ``xor`` |
+ - ``exchange`` |
+ |
+ For all of these read-modify-write operations, the returned value is |
+ that at ``loc`` before the operation. The ``op`` argument must be a |
+ compile-time constant. |
+ |
+ All atomic intrinsics also support C11/C++11 memory orderings, which |
+ must be compile-time constants: |
+ |
+ - Relaxed: no operation orders memory. |
+ - Consume: a load operation performs a consume operation on the |
+ affected memory location (currently unsupported by LLVM). |
+ - Acquire: a load operation performs an acquire operation on the |
+ affected memory location. |
+ - Release: a store operation performs a release operation on the |
+ affected memory location. |
+ - Acquire-release: load and store operations perform acquire and |
+ release operations on the affected memory. |
+ - Sequentially consistent: same as acquire-release, but providing a |
+ global total ordering for all affected locations. |
+ |
+ Note that PNaCl currently strengthens all memory ordering |
+ specifications to sequential consistency, the strongest form of memory |
+ ordering. |
+ |
+ Values for these operations and memory orderings are defined in |
+ llvm/IR/NaClIntrinsics.h. |
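The read-modify-write and compare-exchange semantics described above can be seen from the C++11 side. The sketch below uses only standard `std::atomic` calls; which intrinsic each call is lowered to is an assumption, since the patch states that users are not expected to use the intrinsics directly. `rmw_tour` and `cas_demo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Applies each listed read-modify-write operation once and checks that the
// returned value is the one stored at the location before the operation.
uint32_t rmw_tour() {
  std::atomic<uint32_t> loc{10};
  uint32_t old;
  old = loc.fetch_add(5);    assert(old == 10);  // add:      loc is now 15
  old = loc.fetch_sub(3);    assert(old == 15);  // sub:      loc is now 12
  old = loc.fetch_or(1);     assert(old == 12);  // or:       loc is now 13
  old = loc.fetch_and(0xC);  assert(old == 13);  // and:      loc is now 12
  old = loc.fetch_xor(0xF);  assert(old == 12);  // xor:      loc is now 3
  old = loc.exchange(7);     assert(old == 3);   // exchange: loc is now 7
  return loc.load();
}

// Compare-exchange with separate success and failure orderings, mirroring
// the two memory-order arguments in the declaration above.
bool cas_demo() {
  std::atomic<uint32_t> loc{7};
  uint32_t expected = 7;
  bool first = loc.compare_exchange_strong(
      expected, 9, std::memory_order_seq_cst, std::memory_order_seq_cst);
  expected = 7;  // stale guess: loc now holds 9
  bool second = loc.compare_exchange_strong(
      expected, 11, std::memory_order_seq_cst, std::memory_order_seq_cst);
  // On failure the C++ API writes the observed value back into `expected`.
  return first && !second && expected == 9 && loc.load() == 9;
}
```

Requesting a weaker ordering in source code is still valid; per the note above, PNaCl currently strengthens it to sequential consistency.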