Chromium Code Reviews

Side by Side Diff: docs/PNaClDeveloperGuide.rst

Issue 22240002: Rework PNaCl memory ordering (Closed) Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master
Patch Set: Created 7 years, 4 months ago
OLD | NEW
1 ======================= 1 =======================
2 PNaCl Developer's Guide 2 PNaCl Developer's Guide
3 ======================= 3 =======================
4 4
5 .. contents:: 5 .. contents::
6 :local: 6 :local:
7 :depth: 3 7 :depth: 3
8 8
9 Introduction 9 Introduction
10 ============ 10 ============
11 11
12 TODO 12 TODO
13 13
14 Memory Model and Atomics 14 Memory Model and Atomics
15 ======================== 15 ========================
16 16
17 Memory Model for Concurrent Operations
18 --------------------------------------
19
20 The memory model offered by PNaCl relies on the same coding guidelines
21 as the C11/C++11 one: concurrent accesses must always occur through
22 atomic primitives (offered by `atomic intrinsics
23 <PNaClLangRef.html#atomicintrinsics>`_), and these accesses must always
24 occur with the same size for the same memory location. Visibility of
25 stores is provided on a happens-before basis that relates memory
26 locations to each other as the C11/C++11 standards do.
27
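A minimal sketch of the happens-before rule described above, written against the standard C++11 ``<atomic>`` and ``<thread>`` headers. The ``Mailbox`` type and the ``produce``, ``consume`` and ``transfer`` names are hypothetical and chosen only for illustration. Only the ``ready`` flag is accessed concurrently, so it is the only atomic location; the plain ``payload`` becomes visible through the release/acquire pair on that flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical mailbox: `payload` is plain data, `ready` is the only
// location touched concurrently, so it is the only one that is atomic.
struct Mailbox {
  int payload = 0;
  std::atomic<bool> ready{false};
};

void produce(Mailbox &box, int value) {
  box.payload = value;                               // ordinary store
  box.ready.store(true, std::memory_order_release);  // publishes payload
}

int consume(Mailbox &box) {
  while (!box.ready.load(std::memory_order_acquire)) {
    std::this_thread::yield();                       // spin until published
  }
  return box.payload;  // ordered after the store in produce()
}

// Runs one writer and one reader and returns what the reader observed.
int transfer(int value) {
  Mailbox box;
  int seen = -1;
  std::thread reader([&] { seen = consume(box); });
  std::thread writer([&] { produce(box, value); });
  writer.join();
  reader.join();
  assert(seen == value);
  return seen;
}
```

Calling ``transfer(42)`` should return ``42``. The source asks for release and acquire here; per the text under review, the toolchain would treat both as sequentially consistent, which is stronger and does not change the result.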
28 As in C11/C++11 some atomic accesses may be implemented with locks on
29 certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be
30 ``1``, signifying that all types are sometimes lock-free. The
31 ``is_lock_free`` methods and ``atomic_is_lock_free`` will return the
32 current platform's implementation at translation time. These macros,
33 methods and functions are in the C11 header ``<stdatomic.h>`` and the
34 C++11 header ``<atomic>``.
35
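The lock-free queries mentioned above can be sketched with standard C++11 only. The function and constant names are hypothetical. The macro gives the compile-time answer (0 = never, 1 = sometimes, 2 = always lock-free), while ``is_lock_free`` and ``atomic_is_lock_free`` give the answer for a concrete object on the platform the code ends up running on.

```cpp
#include <atomic>
#include <cassert>

// Compile-time view. The text above says PNaCl reports 1 ("sometimes");
// a native toolchain will often report 2 instead.
constexpr int kIntLockFreeMacro = ATOMIC_INT_LOCK_FREE;

// Run-time view through the member function.
bool counter_is_lock_free() {
  std::atomic<int> counter{0};
  return counter.is_lock_free();
}

// Run-time view through the free function.
bool flag_is_lock_free() {
  std::atomic<int> flag{0};
  return std::atomic_is_lock_free(&flag);
}

// The two views must not contradict each other.
void check_lock_free_consistency() {
  assert(kIntLockFreeMacro >= 0 && kIntLockFreeMacro <= 2);
  if (kIntLockFreeMacro == 2) assert(counter_is_lock_free());
  if (kIntLockFreeMacro == 0) assert(!counter_is_lock_free());
}
```

Code that needs a lock-free fast path would branch on the run-time query rather than on the macro, since a value of 1 does not settle the question at compile time.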
36 The PNaCl toolchain supports concurrent memory accesses through legacy
37 GCC-style ``__sync_*`` builtins, as well as through C11/C++11 atomic
38 primitives. ``volatile`` memory accesses can also be used, though these
39 are discouraged, and aren't present in bitcode. See `Volatile Memory
eliben 2013/08/05 18:35:54 Remove the "aren't present in bitcode" - i don't t
jvoung (off chromium) 2013/08/05 18:59:41 Yeah, this is the summary for developers. What is
JF 2013/08/05 20:37:48 Done.
40 Accesses`_.
41
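For comparison, here is a small sketch of the two styles named above: the legacy GCC-style ``__sync_*`` builtins and the C++11 atomic primitives. The wrapper function names are hypothetical; the builtins themselves are the ones provided by GCC and Clang.

```cpp
#include <atomic>
#include <cassert>

// Legacy builtin: atomically adds `delta` and returns the previous value.
int legacy_fetch_add(int *counter, int delta) {
  return __sync_fetch_and_add(counter, delta);
}

// C++11 equivalent; the default ordering is memory_order_seq_cst.
int modern_fetch_add(std::atomic<int> &counter, int delta) {
  return counter.fetch_add(delta);
}

// Legacy compare-and-swap: true if *slot held `expected` and was replaced.
bool legacy_cas(int *slot, int expected, int desired) {
  return __sync_bool_compare_and_swap(slot, expected, desired);
}

// C++11 equivalent; `expected` is a local copy, so the caller's value is
// not overwritten on failure.
bool modern_cas(std::atomic<int> &slot, int expected, int desired) {
  return slot.compare_exchange_strong(expected, desired);
}

// Both styles should agree on a simple sequence of operations.
void compare_styles() {
  int legacy = 5;
  std::atomic<int> modern{5};
  assert(legacy_fetch_add(&legacy, 3) == modern_fetch_add(modern, 3));
  assert(legacy_cas(&legacy, 8, 1) == modern_cas(modern, 8, 1));
  assert(legacy == modern.load());
}
```

The ``std::atomic`` form keeps the atomic nature of the location in its type, which is one reason to prefer it for new code.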
42 PNaCl supports concurrency and parallelism with some restrictions:
43
44 * Threading is explicitly supported through C11/C++11's threading
45 libraries as well as POSIX threads.
46
47 * Inter-process communication through shared memory is limited to
eliben 2013/08/05 18:35:54 What does inter-process communication even mean in
JF 2013/08/05 20:37:48 Done.
48 operations which are lock-free on the current platform
49 (``is_lock_free`` methods). This may change at a later date.
50
51 * Direct interaction with device memory isn't supported.
52
53 * Signal handling isn't supported; PNaCl therefore promotes all
54 primitives to cross-thread (instead of single-thread). This may change
55 at a later date. Note that using atomic operations which aren't
56 lock-free may lead to deadlocks when handling asynchronous signals.
57
58 * ``volatile`` and atomic operations are address-free (operations on the
59 same memory location via two different addresses work atomically), as
60 intended by the C11/C++11 standards. This is critical for
61 inter-process communication as well as synchronous "external
62 modifications" such as mapping underlying memory at multiple
63 locations.
64
65 Setting up the above mechanisms requires assistance from the embedding
66 sandbox's runtime (e.g. NaCl's Pepper APIs), but using them once set up
67 can be done through regular C/C++ code.
68
69 Atomic Memory Ordering Constraints
70 ----------------------------------
71
72 Atomics follow the same ordering constraints as in regular LLVM, but all
73 accesses are promoted to sequential consistency (the strongest memory
74 ordering) at pexe creation time. As more C11/C++11 code allows us to
jvoung (off chromium) 2013/08/05 18:59:41 Should the memory orderings change also be done by
JF 2013/08/05 20:37:48 I think the current implementation should offer a
75 understand performance and portability needs we intend to support the
76 full gamut of C11/C++11 memory orderings:
77
78 - Relaxed: no operation orders memory.
eliben 2013/08/05 18:35:54 Maybe this list does not belong here? This is user
JF 2013/08/05 20:37:48 This is not an addition, I just moved it around. I
79 - Consume: a load operation performs a consume operation on the affected
80 memory location (currently unsupported by LLVM).
81 - Acquire: a load operation performs an acquire operation on the
82 affected memory location.
83 - Release: a store operation performs a release operation on the
84 affected memory location.
85 - Acquire-release: load and store operations perform acquire and release
86 operations on the affected memory.
87 - Sequentially consistent: same as acquire-release, but providing a
88 global total ordering for all affected locations.
89
90 As in C11/C++11:
91
92 - Atomic accesses must at least be naturally aligned.
93 - Some accesses may not actually be atomic on certain platforms,
94 requiring an implementation that uses global lock(s).
95 - An atomic memory location must always be accessed with atomic
96 primitives, and these primitives must always be of the same bit size
97 for that location.
98 - Not all memory orderings are valid for all atomic operations.
99
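The last point, that not every ordering is valid for every operation, can be illustrated with standard C++11. The sketch below uses only valid combinations; the function name is hypothetical. In C++11 a store may not use consume, acquire or acquire-release, a load may not use release or acquire-release, and the failure ordering of a compare-exchange may not be release or acquire-release.

```cpp
#include <atomic>
#include <cassert>

// Walks through one valid ordering per kind of operation and returns the
// final value of the atomic.
int ordering_tour() {
  std::atomic<int> value{0};

  // Stores: relaxed, release or seq_cst.
  value.store(1, std::memory_order_release);

  // Loads: relaxed, consume, acquire or seq_cst.
  int seen = value.load(std::memory_order_acquire);
  assert(seen == 1);

  // Read-modify-write operations accept any ordering.
  value.fetch_add(1, std::memory_order_acq_rel);

  // Compare-exchange takes a success ordering and a failure ordering; the
  // failure case is only a load, so it uses acquire here.
  int expected = 2;
  bool swapped = value.compare_exchange_strong(
      expected, 3, std::memory_order_acq_rel, std::memory_order_acquire);
  assert(swapped);

  return value.load();  // default ordering is seq_cst
}
```

``ordering_tour()`` returns ``3``. According to the text under review, all of these orderings would be promoted to sequential consistency at pexe creation time, so the distinctions matter for portability of the source rather than for the generated pexe.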
17 Volatile Memory Accesses 100 Volatile Memory Accesses
18 ------------------------ 101 ------------------------
19 102
20 The C11/C++11 standards mandate that ``volatile`` accesses execute in program 103 The C11/C++11 standards mandate that ``volatile`` accesses execute in
21 order (but are not fences, so other memory operations can reorder around them), 104 program order (but are not fences, so other memory operations can
22 are not necessarily atomic, and can’t be elided. They can be separated into 105 reorder around them), are not necessarily atomic, and can’t be
23 smaller width accesses. 106 elided. They can be separated into smaller width accesses.
24 107
25 The PNaCl toolchain applies regular LLVM optimizations along these guidelines, 108 Before any optimizations occur the PNaCl toolchain transforms
26 and it further prevents any load/store (even non-``volatile`` and non-atomic 109 ``volatile`` loads and stores into sequentially consistent ``volatile``
27 ones) from moving above or below a volatile operations: they act as compiler 110 atomic loads and stores, and applies regular LLVM optimizations along
28 barriers before optimizations occur. The PNaCl toolchain freezes ``volatile`` 111 the above guidelines. This orders ``volatiles`` according to the atomic
29 accesses after optimizations into atomic accesses with sequentially consistent 112 rules, and means that fences (including ``__sync_synchronize``) act in a
30 memory ordering. This eases the support of legacy (i.e. non-C11/C++11) code, and 113 better-defined manner. Regular memory accesses still do not have
31 combined with builtin fences these programs can do meaningful cross-thread 114 ordering guarantees with ``volatile`` and atomic accesses, though the
32 communication without changing code. It also reflects the original code's intent 115 internal representation of ``__sync_synchronize`` attempts to prevent
33 and guarantees better portability. 116 reordering of memory accesses to objects which may escape.
34 117
35 Relaxed ordering could be used instead, but for the first release it is more 118 Relaxed ordering could be used instead, but for the first release it is
36 conservative to apply sequential consistency. Future releases may change what 119 more conservative to apply sequential consistency. Future releases may
37 happens at compile-time, but already-released pexes will continue using 120 change what happens at compile-time, but already-released pexes will
38 sequential consistency. 121 continue using sequential consistency.
39 122
40 The PNaCl toolchain also requires that ``volatile`` accesses be at least 123 The PNaCl toolchain also requires that ``volatile`` accesses be at least
41 naturally aligned, and tries to guarantee this alignment. 124 naturally aligned, and tries to guarantee this alignment.
42 125
43 Memory Model for Concurrent Operations 126 The above guarantees ease the support of legacy (i.e. non-C11/C++11)
44 -------------------------------------- 127 code, and combined with builtin fences these programs can do meaningful
128 cross-thread communication without changing code. They also better
129 reflect the original code's intent and guarantee better portability.
45 130
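As an illustration of the legacy pattern this section has in mind, here is a sketch of a ``volatile`` flag combined with the ``__sync_synchronize`` builtin. The ``LegacyMailbox`` type and function names are hypothetical. Standard C++11 gives ``volatile`` no cross-thread guarantees, so this is shown single-threaded; the cross-thread behavior relies on the toolchain treatment described in the text above, and new code is better served by ``std::atomic``.

```cpp
#include <cassert>

// Hypothetical pre-C++11 style mailbox built from volatile fields.
struct LegacyMailbox {
  volatile int data;
  volatile int ready;
};

void legacy_publish(LegacyMailbox *box, int value) {
  box->data = value;
  __sync_synchronize();  // full fence between the data and the flag
  box->ready = 1;
}

// Returns true and fills *out once a value has been published.
bool legacy_try_read(LegacyMailbox *box, int *out) {
  if (!box->ready) return false;
  __sync_synchronize();  // full fence between the flag and the data
  *out = box->data;
  return true;
}

// Single-threaded walk through the protocol.
int legacy_round_trip(int value) {
  LegacyMailbox box = {0, 0};
  int out = -1;
  assert(!legacy_try_read(&box, &out));  // nothing published yet
  legacy_publish(&box, value);
  bool ok = legacy_try_read(&box, &out);
  assert(ok);
  return out;
}
```

``legacy_round_trip(7)`` returns ``7``.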
46 The memory model offered by PNaCl relies on the same coding guidelines as the 131 Stable Transfer Format
47 C11/C++11 one: concurrent accesses must always occur through atomic primitives 132 ----------------------
48 (offered by `atomic intrinsics <PNaClLangRef.html#atomicintrinsics>`_), and
49 these accesses must always occur with the same size for the same memory
50 location. Visibility of stores is provided on a happens-before basis that
51 relates memory locations to each other as the C11/C++11 standards do.
52 133
53 As in C11/C++11 some atomic accesses may be implemented with locks on certain 134 The PNaCl toolchain freezes atomic and ``volatile`` memory accesses
54 platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be ``1``, signifying 135 after optimizations into atomic accesses with sequentially consistent
jvoung (off chromium) 2013/08/05 18:59:41 re "after optimizations": volatiles get converted
JF 2013/08/05 20:37:48 OK, I can remove this section.
55 that all types are sometimes lock-free. The ``is_lock_free`` methods will return 136 memory ordering. Other memory orderings will be exposed in future
56 the current platform's implementation at translation time. 137 releases, when we have a better grasp of existing code's needs,
138 portability implications, and are confident that implementation limits
139 are overcome. Future releases may change what happens at compile-time,
140 but already-released pexes will continue using sequential consistency.
57 141
58 The PNaCl toolchain supports concurrent memory accesses through legacy GCC-style 142 Non-atomic and non-``volatile`` memory accesses may be reordered,
59 ``__sync_*`` builtins, as well as through C11/C++11 atomic primitives. 143 separated, elided or fused according to C and C++'s memory model before
60 ``volatile`` memory accesses can also be used, though these are discouraged, and 144 the pexe is created as well as after its creation.
61 aren't present in bitcode.
62
63 PNaCl supports concurrency and parallelism with some restrictions:
64
65 * Threading is explicitly supported.
66
67 * Inter-process communication through shared memory is limited to operations
68 which are lock-free on the current platform (``is_lock_free`` methods). This
69 may change at a later date.
70
71 * Direct interaction with device memory isn't supported.
72
73 * Signal handling isn't supported, PNaCl therefore promotes all primitives to
74 cross-thread (instead of single-thread). This may change at a later date. Note
75 that using atomic operations which aren't lock-free may lead to deadlocks when
76 handling asynchronous signals.
77
78 * ``volatile`` and atomic operations are address-free (operations on the same
79 memory location via two different addresses work atomically), as intended by
80 the C11/C++11 standards. This is critical for inter-process communication as
81 well as synchronous "external modifications" such as mapping underlying memory
82 at multiple locations.
83
84 Setting up the above mechanisms requires assistance from the embedding sandbox's
85 runtime (e.g. NaCl's Pepper APIs), but using them once setup can be done through
86 regular C/C++ code.
87
88 The PNaCl toolchain currently optimizes for memory ordering as LLVM normally
89 does, but at pexe creation time it promotes all ``volatile`` accesses as well as
90 all atomic accesses to be sequentially consistent. Other memory orderings will
91 be supported in a future release, but pexes generated with the current toolchain
92 will continue functioning with sequential consistency. Using sequential
93 consistency provides a total ordering for all sequentially-consistent operations
94 on all addresses.
95
96 This means that ``volatile`` and atomic memory accesses can only be re-ordered
97 in some limited way before the pexe is created, and will act as fences for all
98 memory accesses (even non-atomic and non-``volatile``) after pexe creation.
99 Non-atomic and non-``volatile`` memory accesses may be reordered (unless a fence
100 intervenes), separated, elided or fused according to C and C++'s memory model
101 before the pexe is created as well as after its creation.
102
103 Atomic Memory Ordering Constraints
104 ----------------------------------
105
106 Atomics follow the same ordering constraints as in regular LLVM, but
107 all accesses are promoted to sequential consistency (the strongest
108 memory ordering) at pexe creation time. As more C11/C++11 code
109 allows us to understand performance and portability needs we intend
110 to support the full gamut of C11/C++11 memory orderings:
111
112 - Relaxed: no operation orders memory.
113 - Consume: a load operation performs a consume operation on the affected memory
114 location (currently unsupported by LLVM).
115 - Acquire: a load operation performs an acquire operation on the affected memory
116 location.
117 - Release: a store operation performs a release operation on the affected memory
118 location.
119 - Acquire-release: load and store operations perform acquire and release
120 operations on the affected memory.
121 - Sequentially consistent: same as acquire-release, but providing a global total
122 ordering for all affected locations.
123
124 As in C11/C++11:
125
126 - Atomic accesses must at least be naturally aligned.
127 - Some accesses may not actually be atomic on certain platforms, requiring an
128 implementation that uses a global lock.
129 - An atomic memory location must always be accessed with atomic primitives, and
130 these primitives must always be of the same bit size for that location.
131 - Not all memory orderings are valid for all atomic operations.
132 145
133 Inline Assembly 146 Inline Assembly
134 =============== 147 ===============
135 148
136 Inline assembly isn't supported by PNaCl because it isn't portable. The 149 Inline assembly isn't supported by PNaCl because it isn't portable. The
137 one current exception is the common compiler barrier idiom 150 one current exception is the common compiler barrier idiom
138 ``asm("":::"memory")``, which gets transformed to a sequentially 151 ``asm("":::"memory")``, which gets transformed to a sequentially
139 consistent memory barrier (equivalent to ``__sync_synchronize()``). 152 consistent memory barrier (equivalent to ``__sync_synchronize()``). In
153 PNaCl this barrier is only guaranteed to order ``volatile`` and atomic
154 memory accesses, though in practice the implementation attempts to also
155 prevent reordering of memory accesses to objects which may escape.
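A short sketch of the one supported idiom next to its builtin equivalent, using GCC/Clang syntax. The wrapper and demo names are hypothetical. On a native toolchain the empty ``asm`` only constrains the compiler, whereas the text above says PNaCl turns it into a sequentially consistent barrier equivalent to ``__sync_synchronize()``.

```cpp
#include <cassert>

// The compiler barrier idiom quoted above.
inline void compiler_barrier() { asm("" ::: "memory"); }

// The builtin the idiom is said to be equivalent to under PNaCl.
inline void full_barrier() { __sync_synchronize(); }

// Orders two volatile stores around the barriers and returns their sum.
int barrier_demo(volatile int *data, volatile int *flag) {
  *data = 1;
  compiler_barrier();  // data is written before flag
  *flag = 1;
  full_barrier();      // both stores precede the loads below
  return *data + *flag;
}
```

``barrier_demo`` returns ``2`` when given two valid pointers. The example uses ``volatile`` locations because, per the text, those and atomic accesses are the only ones the barrier is guaranteed to order.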