docs/PNaClLangRef.rst - Issue 17777004: Concurrency support for PNaCl ABI

Unified Diff: docs/PNaClLangRef.rst

Issue 17777004: Concurrency support for PNaCl ABI (Closed) Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master

Patch Set: Simplify overloading and function verification. Created 7 years, 6 months ago

Index: docs/PNaClLangRef.rst

diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst

index b1d39a7187806fc36a4dc8a54d38feb6441eb48d..2c7f7f9743a2a353ec5e8675e82e745da26ee38c 100644

--- a/docs/PNaClLangRef.rst

+++ b/docs/PNaClLangRef.rst

@@ -106,21 +106,128 @@ Volatile Memory Accesses

`LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_

-TODO: are we going to promote volatile to atomic?

+PNaCl bitcode does not support volatile memory accesses.

+.. note::

+ The C11/C++11 standards mandate that ``volatile`` accesses execute

+ in program order (but are not fences, so other memory operations can

+ reorder around them), are not necessarily atomic, and can’t be

+ elided or fused.

+ The PNaCl toolchain applies regular LLVM optimizations along these

+ guidelines, and it further prevents any load/store (even

+ non-``volatile`` and non-atomic ones) from moving above or below

+ volatile operations: they act as compiler barriers before

+ optimizations occur. The PNaCl toolchain freezes ``volatile``

+ accesses after optimizations into atomic accesses with sequentially

+ consistent memory ordering. This eases the support of legacy

+ (i.e. non-C11/C++11) code, and combined with builtin fences these

+ programs can do meaningful cross-thread communication without

+ changing code. It also reflects the original code's intent and

+ guarantees better portability.

+ Relaxed ordering could be used instead, but for the first release it

+ is more conservative to apply sequential consistency. Future

+ releases may change what happens at compile-time, but

+ already-released pexes will continue using sequential consistency.

+ The PNaCl toolchain also requires that ``volatile`` accesses be at

+ least naturally aligned, and tries to guarantee this alignment.
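
Illustration of the note above, not part of the patch: a minimal C++11 sketch of what promoting a ``volatile`` access to a sequentially consistent atomic access roughly amounts to at the source level. The names ``legacy_ready`` and ``promoted_ready`` are hypothetical, and the actual rewrite happens on bitcode inside the toolchain, so this only approximates the resulting semantics.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Legacy style: a volatile flag, as pre-C11 code often wrote it.
// (Hypothetical name, for illustration only.)
volatile int32_t legacy_ready = 0;

// Approximate source-level equivalent after promotion: the same
// naturally aligned 32-bit location, accessed atomically with
// sequentially consistent ordering.
std::atomic<int32_t> promoted_ready{0};

void legacy_publish() { legacy_ready = 1; }
int32_t legacy_poll() { return legacy_ready; }

void promoted_publish() {
  promoted_ready.store(1, std::memory_order_seq_cst);
}
int32_t promoted_poll() {
  return promoted_ready.load(std::memory_order_seq_cst);
}
```

Both pairs of functions behave the same in a single thread; the difference is that only the atomic version gives cross-thread ordering guarantees under the C11/C++11 memory model.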

Memory Model for Concurrent Operations

--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_

-TODO.

+The memory model offered by PNaCl relies on the same coding guidelines

+as the C11/C++11 one: concurrent accesses must always occur through

+atomic primitives, and these accesses must always occur with the same

+size for the same memory location. Visibility of stores is provided on a

+happens-before basis that relates memory locations to each other as the

+C11/C++11 standards do.

+PNaCl bitcode requires all concurrency to occur through `atomic

+intrinsics`_.

+.. note::

+ As in C11/C++11 some atomic accesses may be implemented with locks

+ on certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always

+ be ``1``, signifying that all types are sometimes lock-free. The

+ ``is_lock_free`` methods will return the current platform's

+ implementation at runtime.

+ The PNaCl toolchain supports concurrent memory accesses through

+ legacy GCC-style ``__sync_*`` builtins, as well as through C11/C++11

+ atomic primitives. ``volatile`` memory accesses can also be used,

+ though these are discouraged, and aren't present in bitcode.

+ Note that PNaCl explicitly supports concurrency through threading

+ and inter-process communication (shared memory), but doesn't support

Derek Schuff 2013/07/02 22:13:17 should probably remove the reference to shared mem

JF 2013/07/02 23:44:32 I clarified this entire section, please review aga

+ interacting with device memory. Setting these up requires assistance

+ from the embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but

+ using them once set up can be done through regular C/C++ code.

+ PNaCl also doesn't currently support signal handling, and therefore

+ promotes all primitives to cross-thread (instead of

+ single-thread). This may change at a later date.

+ The PNaCl toolchain currently optimizes for memory ordering as LLVM

+ normally does, but at pexe creation time it promotes all

+ ``volatile`` accesses as well as all atomic accesses to be

+ sequentially consistent. Other memory orderings will be supported in

+ a future release, but pexes generate with the current toolchain will

Derek Schuff 2013/07/02 22:13:17 s/generate/generated

JF 2013/07/02 23:44:32 Done.

+ continue functioning with sequential consistency. Using sequential

+ consistency provides a total ordering for all

+ sequentially-consistent operations on all addresses.

+ This means that ``volatile`` and atomic memory accesses can only be

+ re-ordered in some limited way before the pexe is created, and will

+ act as fences for all memory accesses (even non-atomic and

+ non-``volatile``) after pexe creation. Non-atomic and

+ non-``volatile`` memory accesses may be reordered (unless a fence

+ intervenes), separated, elided or fused according to C and C++'s

+ memory model before the pexe is created as well as after its

+ creation.
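
Illustration of the note above, not part of the patch: a small C++11 sketch that queries lock-freedom both ways the note mentions (the ``ATOMIC_*_LOCK_FREE`` macros and ``is_lock_free``) and performs the same increment through a legacy ``__sync_*`` builtin and through a C++11 atomic. The counter names are hypothetical. ``__sync_fetch_and_add`` is a GCC/Clang builtin, so this assumes one of those compilers. On a host compiler the macro is typically ``2``; the ``1`` described above is specific to the PNaCl toolchain.

```cpp
#include <atomic>
#include <cassert>

// Compile-time answer: 0 = never, 1 = sometimes, 2 = always lock-free.
constexpr int kIntLockFreeMacro = ATOMIC_INT_LOCK_FREE;

// Run-time answer for the platform the code actually executes on.
bool int_is_lock_free_here() {
  std::atomic<int> probe{0};
  return probe.is_lock_free();
}

// Hypothetical counters used only for this illustration.
int legacy_counter = 0;
std::atomic<int> modern_counter{0};

// Legacy GCC-style builtin: returns the value held before the add.
int legacy_increment() {
  return __sync_fetch_and_add(&legacy_counter, 1);
}

// C++11 equivalent: fetch_add also returns the previous value and
// defaults to sequentially consistent ordering.
int modern_increment() {
  return modern_counter.fetch_add(1);
}
```

Both increment functions return the previous value.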

Atomic Memory Ordering Constraints

----------------------------------

`LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_

-TODO.

+PNaCl bitcode currently supports sequential consistency only, through

+its `atomic intrinsics`_.

+.. note::

+ Atomics follow the same ordering constraints as in regular LLVM, but

+ all accesses are promoted to sequential consistency (the strongest

+ memory ordering) at pexe creation time. As more C11/C++11 code

+ allows us to understand performance and portability needs we intend

+ to support the full gamut of C11/C++11 memory orderings:

+ - Relaxed: no operation orders memory.

+ - Consume: a load operation performs a consume operation on the

+ affected memory location (currently unsupported by LLVM).

+ - Acquire: a load operation performs an acquire operation on the

+ affected memory location.

+ - Release: a store operation performs a release operation on the

+ affected memory location.

+ - Acquire-release: load and store operations perform acquire and

+ release operations on the affected memory.

+ - Sequentially consistent: same as acquire-release, but providing

+ a global total ordering for all affected locations.

+ As in C11/C++11:

+ - Atomic and volatile accesses must at least be naturally aligned.

+ - Some accesses may not actually be atomic on certain platforms,

+ requiring an implementation that uses a global lock.

+ - An atomic memory location must always be accessed with atomic

+ primitives, and these primitives must always be of the same bit

+ size for that location.

+ - Not all memory orderings are valid for all atomic operations.
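
Illustration of the list above, not part of the patch: a C++11 sketch of two of the points it makes. The first two helpers encode which orderings the C++11 standard permits for plain atomic loads and stores ("not all memory orderings are valid for all atomic operations"). The third models promotion to sequential consistency at pexe creation time. All three function names are hypothetical and do not correspond to anything in the toolchain.

```cpp
#include <atomic>
#include <cassert>

// C++11 forbids release and acq_rel on a plain atomic load.
constexpr bool valid_for_load(std::memory_order o) {
  return o != std::memory_order_release &&
         o != std::memory_order_acq_rel;
}

// C++11 forbids consume, acquire and acq_rel on a plain atomic store.
constexpr bool valid_for_store(std::memory_order o) {
  return o != std::memory_order_consume &&
         o != std::memory_order_acquire &&
         o != std::memory_order_acq_rel;
}

// Model of the promotion described above: whatever ordering the
// source requested, the pexe ends up with sequential consistency.
constexpr std::memory_order promote_for_pexe(std::memory_order) {
  return std::memory_order_seq_cst;
}
```

Sequential consistency is valid for every atomic operation, which is one reason it is a safe target for a blanket promotion.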

Fast-Math Flags

---------------

@@ -270,14 +377,6 @@ Only the LLVM instructions listed here are supported by PNaCl bitcode.

The pointer argument of these instructions must be a *normalized* pointer

(see :ref:`pointer types <pointertypes>`).

-* ``fence``

-* ``cmpxchg``, ``atomicrmw``

- The pointer argument of these instructions must be a *normalized* pointer

- (see :ref:`pointer types <pointertypes>`).

- TODO(jfb): this may change

* ``trunc``

* ``zext``

* ``sext``

@@ -316,8 +415,6 @@ Intrinsic Functions

The only intrinsics supported by PNaCl bitcode are the following.

-TODO(jfb): atomics

* ``llvm.memcpy``

* ``llvm.memmove``

* ``llvm.memset``

@@ -346,3 +443,56 @@ TODO(jfb): atomics

TODO: describe

+.. _atomic intrinsics:

+* ``llvm.nacl.atomic.store``

+* ``llvm.nacl.atomic.load``

+* ``llvm.nacl.atomic.rmw``

+* ``llvm.nacl.atomic.cmpxchg``

+* ``llvm.nacl.atomic.fence``

+ .. code-block:: llvm

+ declare iN @llvm.nacl.atomic.load(

+ iN* <source>, i32 <memory_order>)

+ declare void @llvm.nacl.atomic.store(

+ iN <operand>, iN* <destination>, i32 <memory_order>)

+ declare iN @llvm.nacl.atomic.rmw(

+ i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)

+ declare iN @llvm.nacl.atomic.cmpxchg(

+ iN* <object>, iN <expected>, iN <desired>,

+ i32 <memory_order_success>, i32 <memory_order_failure>)

+ declare void @llvm.nacl.atomic.fence(i32 <memory_order>)

+ Each of these intrinsics is overloaded on the ``iN``

+ argument. Integral types of 8, 16, 32 and 64-bit width are supported

+ for these ``iN`` arguments.

+ The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following

+ read-modify-write operations, from the general and arithmetic sections

+ of the C11/C++11 standards:

+ - ``add``

+ - ``sub``

+ - ``or``

+ - ``and``

+ - ``xor``

+ - ``exchange``

+ For all of these read-modify-write operations, the returned value is

+ that at ``object`` before the computation. The ``computation``

+ argument must be a compile-time constant.

+ All atomic intrinsics also support C11/C++11 memory orderings, which

+ must be compile-time constants. Those are detailed in `Atomic Memory

+ Ordering Constraints`_.

+ Integer values for these computations and memory orderings are defined

+ in ``"llvm/IR/NaClIntrinsics.h"``.

+ .. note::

+ These intrinsics allow PNaCl to support C11/C++11 style atomic

+ operations as well as some legacy GCC-style ``__sync_*`` builtins

+ while remaining stable as the LLVM codebase changes. The user

+ isn't expected to use these intrinsics directly.
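
Illustration of the ``@llvm.nacl.atomic.rmw`` description above, not part of the patch: a C++11 model of its semantics, in which each listed operation maps to a standard atomic member function and the value returned is the one held at ``object`` before the computation. The ``RmwOp`` enumeration and the ``atomic_rmw`` function are hypothetical. The real integer encodings live in ``"llvm/IR/NaClIntrinsics.h"`` and are not reproduced here. The sketch fixes the width at 32 bits, whereas the intrinsic is overloaded on 8, 16, 32 and 64 bits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the <computation> argument; the actual
// integer values are defined in llvm/IR/NaClIntrinsics.h.
enum class RmwOp { Add, Sub, Or, And, Xor, Exchange };

// Models iN @llvm.nacl.atomic.rmw for N = 32: applies the operation
// atomically and returns the value held before the computation.
uint32_t atomic_rmw(RmwOp op, std::atomic<uint32_t>& object,
                    uint32_t operand) {
  const std::memory_order order = std::memory_order_seq_cst;
  switch (op) {
    case RmwOp::Add:      return object.fetch_add(operand, order);
    case RmwOp::Sub:      return object.fetch_sub(operand, order);
    case RmwOp::Or:       return object.fetch_or(operand, order);
    case RmwOp::And:      return object.fetch_and(operand, order);
    case RmwOp::Xor:      return object.fetch_xor(operand, order);
    case RmwOp::Exchange: return object.exchange(operand, order);
  }
  return object.load(order);  // not reached for valid operations
}
```

For instance, applying ``RmwOp::Add`` with operand 3 to an object holding 5 returns 5 and leaves 8 in the object.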
