docs/PNaClLangRef.rst - Issue 17777004: Concurrency support for PNaCl ABI

Unified Diff: docs/PNaClLangRef.rst

Issue 17777004: Concurrency support for PNaCl ABI (Closed) Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master

Patch Set: Update PNaClLangRef to reflect the implementation work I will now go forward with. Created 7 years, 6 months ago


Index: docs/PNaClLangRef.rst

diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst

index b1d39a7187806fc36a4dc8a54d38feb6441eb48d..488c61ba16a25ed4dfeb61c3220ce06170475cf8 100644

--- a/docs/PNaClLangRef.rst

+++ b/docs/PNaClLangRef.rst

@@ -106,21 +106,99 @@ Volatile Memory Accesses

`LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_

-TODO: are we going to promote volatile to atomic?

+The C and C++ standards mandate that ``volatile`` accesses execute in

eliben 2013/06/27 19:32:24 The C11 and C++11 standards...

JF 2013/06/27 21:03:17 These restrictions are actually all in the previou

+program order (but are not fences, so other memory operations can

+reorder around them), are not necessarily atomic, and can’t be elided or

+fused.

+The PNaCl toolchain applies regular LLVM optimizations along these

+guidelines, but prevents any load/store (even non-``volatile`` and

+non-atomic ones) from moving past a volatile operations: they act as

eliben 2013/06/27 19:32:24 "past volatile operations"?

JF 2013/06/27 21:03:17 I mean: a regular load/store can't move above or b

+compiler barriers before optimizations occur. The PNaCl toolchain

+freezes ``volatile`` accesses after optimizations into atomic accesses

+with sequential consistency memory ordering. This eases the support of

eliben 2013/06/27 19:32:24 sequentially consistent memory ordering?

JF 2013/06/27 21:03:17 Done.

+legacy (i.e. non-C11/C++11) code, and combined with builtin fences these

+programs can do meaningful cross-thread communication without changing

+code. It also reflects the original code's intent and guarantees better

+portability.

+Relaxed ordering could be used instead, but for the first release it is

+more conservative to apply sequential consistency. Future releases may

+change what happens at compile-time, but already-released pexes will

+continue using sequential consistency.

+The PNaCl toolchain also requires that ``volatile`` accesses be at least

+naturally aligned, and tries to guarantee this alignment.

Memory Model for Concurrent Operations

--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_

-TODO.

+The memory model offered by PNaCl relies the same coding guidelines as

eliben 2013/06/27 19:32:24 "relies on the same" ?

JF 2013/06/27 21:03:17 Done.

+the C11/C++11 one: concurrent accesses must always occur through atomic

+primitives, and these accesses must always occur with the same size for

+the same memory locations. Visibility of stores is provided on a

+happens-before basis that relates memory locations to each other as the

+C11/C++11 standards do.

+As in C11/C++11 some atomic accesses may be implemented with locks on

+certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be

+``1``, signifying that all types are sometimes lock-free. The

+``is_lock_free`` methods will return the current platform's

+implementation at runtime.
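The two lock-freedom queries described above can be sketched in C++11 as follows. This is a minimal illustration, not PNaCl-specific code: `kIntLockFreeMacro` and `counter_is_lock_free` are hypothetical names, and a native toolchain may report `2` (always lock-free) where the text says PNaCl reports `1`.

```cpp
#include <atomic>

// Compile-time answer: 0 = never, 1 = sometimes, 2 = always lock-free.
// The text above says PNaCl always reports 1; other toolchains may differ.
constexpr int kIntLockFreeMacro = ATOMIC_INT_LOCK_FREE;

// Run-time answer for one particular atomic object on the current platform.
bool counter_is_lock_free() {
  std::atomic<int> counter{0};
  return counter.is_lock_free();
}
```

Portable code would branch on the run-time query rather than on the macro when the macro reports "sometimes".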

+The PNaCl toolchain supports concurrent memory accesses through legacy

+GCC-style ``__sync_*`` builtins, as well as through C11/C++11 atomic

+primitives. ``volatile`` memory accesses can also be used, though these

+are discouraged.
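For comparison, here is a small sketch of the same atomic increment written with a legacy GCC-style builtin and with a C++11 atomic. The function names are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler for `__sync_fetch_and_add`.

```cpp
#include <atomic>

// Legacy GCC-style builtin: atomically adds 1 and returns the previous value.
int legacy_increment(int *counter) {
  return __sync_fetch_and_add(counter, 1);
}

// C++11 equivalent; the default memory order is sequentially consistent.
int modern_increment(std::atomic<int> &counter) {
  return counter.fetch_add(1);
}
```

Both forms return the value held before the increment.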

+Note that PNaCl explicitly supports concurrency through threading and

+inter-process communication (shared memory), but doesn't support

+interacting with device memory. Setting these up requires assistance from

+the embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using

+them once set up can be done through regular C/C++ code.

+PNaCl also doesn't currently support signal handling, and therefore

+promotes all primitives to cross-thread (instead of single-thread). This

+may change at a later date.

+The PNaCl toolchain currently optimizes for memory ordering as LLVM

+normally does, but at pexe creation time it promotes all ``volatile``

+accesses as well as all atomic accesses to be sequentially

+consistent. Other memory orderings will be supported in a future

+release, but pexes generated with the current toolchain will continue

+functioning with sequential consistency. Using sequential consistency

+provides a total ordering for all sequentially-consistent operations on

+all addresses.

+This means that ``volatile`` and atomic memory accesses can only be

+re-ordered in some limited way before the pexe is created, and will act

+as fences for all memory accesses (even non-atomic and non-``volatile``)

+after pexe creation. Non-atomic and non-``volatile`` memory accesses may

+be reordered (unless a fence intervenes), separated, elided or fused

+according to C and C++'s memory model before the pexe is created as well

+as after its creation.

Atomic Memory Ordering Constraints

----------------------------------

`LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_

-TODO.

+Atomics follow the same ordering constraints as in regular LLVM, but all

+accesses are promoted to sequential consistency (the strongest memory

+ordering) at pexe creation time. We may relax these rules and honor the

+program's memory ordering constraints as more C11/C++11 code allows us

+to understand performance and portability needs.

+As in C11/C++11:

+ - Atomic and volatile accesses must at least be naturally aligned.

+ - Some accesses may not actually be atomic on certain platforms,

+ requiring an implementation that uses a global lock.

+ - An atomic memory location must always be accesses with atomic

eliben 2013/06/27 19:32:24 accessed

JF 2013/06/27 21:03:17 Done.

+ primitives, and these primitives must always be of the same type for

+ that location.

Fast-Math Flags

---------------

@@ -270,14 +348,6 @@ Only the LLVM instructions listed here are supported by PNaCl bitcode.

The pointer argument of these instructions must be a *normalized* pointer

(see :ref:`pointer types <pointertypes>`).

-* ``fence``

-* ``cmpxchg``, ``atomicrmw``

- The pointer argument of these instructions must be a *normalized* pointer

- (see :ref:`pointer types <pointertypes>`).

- TODO(jfb): this may change

* ``trunc``

* ``zext``

* ``sext``

@@ -316,8 +386,6 @@ Intrinsic Functions

The only intrinsics supported by PNaCl bitcode are the following.

-TODO(jfb): atomics

* ``llvm.memcpy``

* ``llvm.memmove``

* ``llvm.memset``

@@ -346,3 +414,67 @@ TODO(jfb): atomics

TODO: describe

+* ``llvm.nacl.atomic.store``

+* ``llvm.nacl.atomic.load``

+* ``llvm.nacl.atomic.rmw``

+* ``llvm.nacl.atomic.cmpxchg``

+* ``llvm.nacl.atomic.fence``

+ These intrinsics allow PNaCl to support C11/C++11 style atomic

+ operations as well as some legacy GCC-style ``__sync_*`` builtins

+ while remaining stable as the LLVM codebase changes. The user isn't

+ expected to use these intrinsics directly.

+ ::

+ declare void @llvm.nacl.atomic.store(

+ iN* <dest>, iN <val>, i32 <memory_order>)

+ declare iN @llvm.nacl.atomic.load(

+ iN* <src>, i32 <memory_order>)

+ declare iN @llvm.nacl.atomic.rmw(

+ i32 <op>, iN* <loc>, iN <val>, i32 <memory_order>)

+ declare iN @llvm.nacl.atomic.compare_exchange(

eliben 2013/06/27 19:32:24 name mismatch with above

JF 2013/06/27 21:03:17 Done. I went with the C11/C++11 naming and forgot

+ iN* <loc>, iN <expected>, iN <desired>,

+ i32 <memory_order_success>, i32 <memory_order_failure>)

+ declare void @llvm.nacl.atomic.fence(i32 <memory_order>)

+ Each of these intrinsic is overloaded on the ``iN`` argument. Integral

eliben 2013/06/27 19:32:24 intrinsics

JF 2013/06/27 21:03:17 Done.

+ types of 8, 16, 32 and 64-bit width are currently supported for these

+ ``iN`` arguments.

+ The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following

+ read-modify-write operations, from the general and arithmetic sections

+ of the C11/C++11 standards:

+ - ``add``

+ - ``sub``

+ - ``or``

+ - ``and``

+ - ``xor``

+ - ``exchange``

+ For all of these read-modify-write operations, the returned value is

+ that at ``loc`` before the operation. The ``op`` argument must be a

+ compile-time constant.
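The "returns the previous value" rule can be illustrated at the C++11 source level, using the `std::atomic` member functions that correspond to the six listed operations. This is only a sketch: `run_rmw_sequence` is a hypothetical helper, and how a front end lowers these calls to the intrinsic is not shown here.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Applies the six read-modify-write operations in sequence to one location
// and records what each call returns (the value held before the operation).
std::vector<std::uint32_t> run_rmw_sequence(std::atomic<std::uint32_t> &loc) {
  std::vector<std::uint32_t> previous;
  previous.push_back(loc.fetch_add(5));     // add
  previous.push_back(loc.fetch_sub(2));     // sub
  previous.push_back(loc.fetch_or(0x10));   // or
  previous.push_back(loc.fetch_and(0x1c));  // and
  previous.push_back(loc.fetch_xor(0xff));  // xor
  previous.push_back(loc.exchange(42));     // exchange
  return previous;
}
```

Starting from `0`, the recorded values are `0, 5, 3, 19, 16, 239`, and the location holds `42` afterwards.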

+ All atomic intrinsics also support C11/C++11 memory orderings, which

+ must be compile-time constants:

+ - Relaxed: no operation orders memory.

+ - Consume: a load operation performs a consume operation on the

+ affected memory location (currently unsupported by LLVM).

+ - Acquire: a load operation performs an acquire operation on the

+ affected memory location.

+ - Release: a store operation performs a release operation on the

+ affected memory location.

+ - Acquire-release: load and store operations perform acquire and

+ release operations on the affected memory.

+ - Sequentially consistent: same as acquire-release, but providing a

+ global total ordering for all affected locations.

+ Note that PNaCl currently strengthens all memory ordering

+ specifications to sequential consistency, the strongest form of memory

+ ordering.

+ Values for these operations and memory orderings are defined in

+ llvm/IR/NaClIntrinsics.h.
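As an illustration of the acquire and release orderings listed above, here is a minimal C++11 message-passing sketch. `pass_message` is a hypothetical function, and the sketch assumes a platform with `std::thread` support. Per the note above, PNaCl would currently treat both orderings as sequentially consistent, which is stronger and leaves the result unchanged.

```cpp
#include <atomic>
#include <thread>

// Hands `value` to a second thread through a plain variable that is
// published with a release store and observed with an acquire load.
int pass_message(int value) {
  int payload = 0;                 // plain, non-atomic data
  std::atomic<bool> ready{false};  // publication flag
  int received = 0;

  std::thread consumer([&] {
    // The acquire load pairs with the release store below, so once the
    // flag reads true the write to `payload` is visible to this thread.
    while (!ready.load(std::memory_order_acquire)) {
      std::this_thread::yield();
    }
    received = payload;
  });

  payload = value;
  ready.store(true, std::memory_order_release);
  consumer.join();  // join() makes `received` visible to the caller
  return received;
}
```

With relaxed ordering on the flag, the read of `payload` would be a data race; the release/acquire pair is what establishes the happens-before edge.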
