OLD | NEW |
---|---|
1 ============================== | 1 ============================== |
2 PNaCl Bitcode Reference Manual | 2 PNaCl Bitcode Reference Manual |
3 ============================== | 3 ============================== |
4 | 4 |
5 .. contents:: | 5 .. contents:: |
6 :local: | 6 :local: |
7 :depth: 3 | 7 :depth: 3 |
8 | 8 |
9 Introduction | 9 Introduction |
10 ============ | 10 ============ |
(...skipping 88 matching lines...) | |
99 | 99 |
100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_ | 100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_ |
101 | 101 |
102 PNaCl bitcode does not support inline assembly. | 102 PNaCl bitcode does not support inline assembly. |
103 | 103 |
104 Volatile Memory Accesses | 104 Volatile Memory Accesses |
105 ------------------------ | 105 ------------------------ |
106 | 106 |
107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_ | 107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_ |
108 | 108 |
109 TODO: are we going to promote volatile to atomic? | 109 The C and C++ standards mandate that ``volatile`` accesses execute in |
eliben
2013/06/27 19:32:24
The C11 and C++11 standards...
JF
2013/06/27 21:03:17
These restrictions are actually all in the previou
| |
110 program order (but are not fences, so other memory operations can | |
111 reorder around them), are not necessarily atomic, and can’t be elided or | |
112 fused. | |
113 | |
114 The PNaCl toolchain applies regular LLVM optimizations along these | |
115 guidelines, but prevents any load/store (even non-``volatile`` and | |
116 non-atomic ones) from moving past a volatile operations: they act as | |
eliben
2013/06/27 19:32:24
"past volatile operations"?
JF
2013/06/27 21:03:17
I mean: a regular load/store can't move above or b
| |
117 compiler barriers before optimizations occur. The PNaCl toolchain | |
118 freezes ``volatile`` accesses after optimizations into atomic accesses | |
119 with sequential consistency memory ordering. This eases the support of | |
eliben
2013/06/27 19:32:24
sequentially consistent memory ordering?
JF
2013/06/27 21:03:17
Done.
| |
120 legacy (i.e. non-C11/C++11) code, and combined with builtin fences these | |
121 programs can do meaningful cross-thread communication without changing | |
122 code. It also reflects the original code's intent and guarantees better | |
123 portability. | |
124 | |
125 Relaxed ordering could be used instead, but for the first release it is | |
126 more conservative to apply sequential consistency. Future releases may | |
127 change what happens at compile-time, but already-released pexes will | |
128 continue using sequential consistency. | |
129 | |
130 The PNaCl toolchain also requires that ``volatile`` accesses be at least | |
131 naturally aligned, and tries to guarantee this alignment. | |
110 | 132 |
111 Memory Model for Concurrent Operations | 133 Memory Model for Concurrent Operations |
112 -------------------------------------- | 134 -------------------------------------- |
113 | 135 |
114 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_ | 136 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_ |
115 | 137 |
116 TODO. | 138 The memory model offered by PNaCl relies the same coding guidelines as |
eliben
2013/06/27 19:32:24
"relies on the same" ?
JF
2013/06/27 21:03:17
Done.
| |
139 the C11/C++11 one: concurrent accesses must always occur through atomic | |
140 primitives, and these accesses must always occur with the same size for | |
141 the same memory locations. Visibility of stores is provided on a | |
142 happens-before basis that relates memory locations to each other as the | |
143 C11/C++11 standards do. | |
144 | |
145 As in C11/C++11 some atomic accesses may be implemented with locks on | |
146 certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be | |
147 ``1``, signifying that all types are sometimes lock-free. The | |
148 ``is_lock_free`` methods will return the current platform's | |
149 implementation at runtime. | |
150 | |
151 The PNaCl toolchain supports concurrent memory accesses through legacy | |
152 GCC-style ``__sync_*`` builtins, as well as through C11/C++11 atomic | |
153 primitives. ``volatile`` memory accesses can also be used, though these | |
154 are discouraged. | |
155 | |
156 Note that PNaCl explicitly supports concurrency through threading and | |
157 inter-process communication (shared memory), but doesn't support | |
158 interacting with device memory. Setting these up requires assistance from | |
159 the embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using | |
160 them once set up can be done through regular C/C++ code. | |
161 | |
162 PNaCl also doesn't currently support signal handling, and therefore | |
163 promotes all primitives to cross-thread (instead of single-thread). This | |
164 may change at a later date. | |
165 | |
166 The PNaCl toolchain currently optimizes for memory ordering as LLVM | |
167 normally does, but at pexe creation time it promotes all ``volatile`` | |
168 accesses as well as all atomic accesses to be sequentially | |
169 consistent. Other memory orderings will be supported in a future | |
170 release, but pexes generated with the current toolchain will continue | |
171 functioning with sequential consistency. Using sequential consistency | |
172 provides a total ordering for all sequentially-consistent operations on | |
173 all addresses. | |
174 | |
175 This means that ``volatile`` and atomic memory accesses can only be | |
176 re-ordered in some limited way before the pexe is created, and will act | |
177 as fences for all memory accesses (even non-atomic and non-``volatile``) | |
178 after pexe creation. Non-atomic and non-``volatile`` memory accesses may | |
179 be reordered (unless a fence intervenes), separated, elided or fused | |
180 according to C and C++'s memory model before the pexe is created as well | |
181 as after its creation. | |
117 | 182 |
118 Atomic Memory Ordering Constraints | 183 Atomic Memory Ordering Constraints |
119 ---------------------------------- | 184 ---------------------------------- |
120 | 185 |
121 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_ | 186 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_ |
122 | 187 |
123 TODO. | 188 Atomics follow the same ordering constraints as in regular LLVM, but all |
189 accesses are promoted to sequential consistency (the strongest memory | |
190 ordering) at pexe creation time. We may relax these rules and honor the | |
191 program's memory ordering constraints as more C11/C++11 code allows us | |
192 to understand performance and portability needs. | |
193 | |
194 As in C11/C++11: | |
195 | |
196 - Atomic and volatile accesses must at least be naturally aligned. | |
197 - Some accesses may not actually be atomic on certain platforms, | |
198 requiring an implementation that uses a global lock. | |
199 - An atomic memory location must always be accesses with atomic | |
eliben
2013/06/27 19:32:24
accessed
JF
2013/06/27 21:03:17
Done.
| |
200 primitives, and these primitives must always be of the same type for | |
201 that location. | |
124 | 202 |
125 Fast-Math Flags | 203 Fast-Math Flags |
126 --------------- | 204 --------------- |
127 | 205 |
128 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_ | 206 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_ |
129 | 207 |
130 Fast-math mode is not currently supported by the PNaCl bitcode. | 208 Fast-math mode is not currently supported by the PNaCl bitcode. |
131 | 209 |
132 Type System | 210 Type System |
133 =========== | 211 =========== |
(...skipping 129 matching lines...) | |
263 | 341 |
264 .. code-block:: llvm | 342 .. code-block:: llvm |
265 | 343 |
266 %buf = alloca i8, i32 8, align 4 | 344 %buf = alloca i8, i32 8, align 4 |
267 | 345 |
268 * ``load``, ``store`` | 346 * ``load``, ``store`` |
269 | 347 |
270 The pointer argument of these instructions must be a *normalized* pointer | 348 The pointer argument of these instructions must be a *normalized* pointer |
271 (see :ref:`pointer types <pointertypes>`). | 349 (see :ref:`pointer types <pointertypes>`). |
272 | 350 |
273 * ``fence`` | |
274 * ``cmpxchg``, ``atomicrmw`` | |
275 | |
276 The pointer argument of these instructions must be a *normalized* pointer | |
277 (see :ref:`pointer types <pointertypes>`). | |
278 | |
279 TODO(jfb): this may change | |
280 | |
281 * ``trunc`` | 351 * ``trunc`` |
282 * ``zext`` | 352 * ``zext`` |
283 * ``sext`` | 353 * ``sext`` |
284 * ``fptrunc`` | 354 * ``fptrunc`` |
285 * ``fpext`` | 355 * ``fpext`` |
286 * ``fptoui`` | 356 * ``fptoui`` |
287 * ``fptosi`` | 357 * ``fptosi`` |
288 * ``uitofp`` | 358 * ``uitofp`` |
289 * ``sitofp`` | 359 * ``sitofp`` |
290 | 360 |
(...skipping 18 matching lines...) | |
309 * ``select`` | 379 * ``select`` |
310 * ``call`` | 380 * ``call`` |
311 | 381 |
312 Intrinsic Functions | 382 Intrinsic Functions |
313 =================== | 383 =================== |
314 | 384 |
315 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_ | 385 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_ |
316 | 386 |
317 The only intrinsics supported by PNaCl bitcode are the following. | 387 The only intrinsics supported by PNaCl bitcode are the following. |
318 | 388 |
319 TODO(jfb): atomics | |
320 | |
321 * ``llvm.memcpy`` | 389 * ``llvm.memcpy`` |
322 * ``llvm.memmove`` | 390 * ``llvm.memmove`` |
323 * ``llvm.memset`` | 391 * ``llvm.memset`` |
324 * ``llvm.bswap`` | 392 * ``llvm.bswap`` |
325 | 393 |
326 The llvm.bswap intrinsic is only supported with the following argument types: | 394 The llvm.bswap intrinsic is only supported with the following argument types: |
327 i16, i32, i64. | 395 i16, i32, i64. |
328 | 396 |
329 * ``llvm.ctlz`` | 397 * ``llvm.ctlz`` |
330 * ``llvm.cttz`` | 398 * ``llvm.cttz`` |
331 * ``llvm.ctpop`` | 399 * ``llvm.ctpop`` |
332 | 400 |
333 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support | 401 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support |
334 i32 and i64 argument types (the types supported by C-style GCC builtins). | 402 i32 and i64 argument types (the types supported by C-style GCC builtins). |
335 | 403 |
336 * ``llvm.trap`` | 404 * ``llvm.trap`` |
337 * ``llvm.nacl.read.tp`` | 405 * ``llvm.nacl.read.tp`` |
338 | 406 |
339 TODO: describe | 407 TODO: describe |
340 | 408 |
341 * ``llvm.nacl.longjmp`` | 409 * ``llvm.nacl.longjmp`` |
342 | 410 |
343 TODO: describe | 411 TODO: describe |
344 | 412 |
345 * ``llvm.nacl.setjmp`` | 413 * ``llvm.nacl.setjmp`` |
346 | 414 |
347 TODO: describe | 415 TODO: describe |
348 | 416 |
417 * ``llvm.nacl.atomic.store`` | |
418 * ``llvm.nacl.atomic.load`` | |
419 * ``llvm.nacl.atomic.rmw`` | |
420 * ``llvm.nacl.atomic.cmpxchg`` | |
421 * ``llvm.nacl.atomic.fence`` | |
422 | |
423 These intrinsics allow PNaCl to support C11/C++11 style atomic | |
424 operations as well as some legacy GCC-style ``__sync_*`` builtins | |
425 while remaining stable as the LLVM codebase changes. The user isn't | |
426 expected to use these intrinsics directly. | |
427 | |
428 :: | |
429 | |
430 declare void @llvm.nacl.atomic.store( | |
431 iN* <dest>, iN <val>, i32 <memory_order>) | |
432 declare iN @llvm.nacl.atomic.load( | |
433 iN* <src>, i32 <memory_order>) | |
434 declare iN @llvm.nacl.atomic.rmw( | |
435 i32 <op>, iN* <loc>, iN <val>, i32 <memory_order>) | |
436 declare iN @llvm.nacl.atomic.compare_exchange( | |
eliben
2013/06/27 19:32:24
name mismatch with above
JF
2013/06/27 21:03:17
Done. I went with the C11/C++11 naming and forgot
| |
437 iN* <loc>, iN <expected>, iN <desired>, | |
438 i32 <memory_order_success>, i32 <memory_order_failure>) | |
439 declare void @llvm.nacl.atomic.fence(i32 <memory_order>) | |
440 | |
441 Each of these intrinsic is overloaded on the ``iN`` argument. Integral | |
eliben
2013/06/27 19:32:24
intrinsics
JF
2013/06/27 21:03:17
Done.
| |
442 types of 8, 16, 32 and 64-bit width are currently supported for these | |
443 ``iN`` arguments. | |
444 | |
445 The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following | |
446 read-modify-write operations, from the general and arithmetic sections | |
447 of the C11/C++11 standards: | |
448 | |
449 - ``add`` | |
450 - ``sub`` | |
451 - ``or`` | |
452 - ``and`` | |
453 - ``xor`` | |
454 - ``exchange`` | |
455 | |
456 For all of these read-modify-write operations, the returned value is | |
457 that at ``loc`` before the operation. The ``op`` argument must be a | |
458 compile-time constant. | |
459 | |
460 All atomic intrinsics also support C11/C++11 memory orderings, which | |
461 must be compile-time constants: | |
462 | |
463 - Relaxed: no operation orders memory. | |
464 - Consume: a load operation performs a consume operation on the | |
465 affected memory location (currently unsupported by LLVM). | |
466 - Acquire: a load operation performs an acquire operation on the | |
467 affected memory location. | |
468 - Release: a store operation performs a release operation on the | |
469 affected memory location. | |
470 - Acquire-release: load and store operations perform acquire and | |
471 release operations on the affected memory. | |
472 - Sequentially consistent: same as acquire-release, but providing a | |
473 global total ordering for all affected locations. | |
474 | |
475 Note that PNaCl currently strengthens all memory ordering | |
476 specifications to sequential consistency, the strongest form of memory | |
477 ordering. | |
478 | |
479 Values for these operations and memory orderings are defined in | |
480 llvm/IR/NaClIntrinsics.h. | |
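Illustrative note on the read-modify-write wording in the patch ("the returned value is that at ``loc`` before the operation"): the same contract is visible at the C++11 source level, which is the level users are expected to write at. The sketch below is only an illustration of that contract using standard ``std::atomic`` calls. The helper names ``add_and_get_previous`` and ``replace_if_equal`` are hypothetical and do not appear in the patch, and nothing here shows how the PNaCl toolchain actually lowers these calls to the ``llvm.nacl.atomic.*`` intrinsics.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): atomically add `delta` to
// `counter` and return the value that was stored *before* the addition,
// mirroring the "returned value is that at loc before the operation" rule.
inline std::uint32_t add_and_get_previous(std::atomic<std::uint32_t>& counter,
                                          std::uint32_t delta) {
  // Sequential consistency is requested explicitly; per the patch text,
  // PNaCl currently strengthens every ordering to this one anyway.
  return counter.fetch_add(delta, std::memory_order_seq_cst);
}

// Hypothetical helper (not part of the patch): store `desired` into `loc`
// only if `loc` currently holds `expected`. Returns true on success.
// Takes separate success and failure orderings, like the two
// <memory_order_*> arguments in the compare-exchange declaration.
inline bool replace_if_equal(std::atomic<std::uint32_t>& loc,
                             std::uint32_t expected,
                             std::uint32_t desired) {
  // `expected` is a by-value copy, so the caller's value is not modified
  // when the exchange fails.
  return loc.compare_exchange_strong(expected, desired,
                                     std::memory_order_seq_cst,
                                     std::memory_order_seq_cst);
}
```

A caller would hold a ``std::atomic<std::uint32_t>`` and pass it to either helper. Both helpers use the same atomic type for the location throughout, in line with the patch's rule that an atomic location must always be accessed with primitives of the same type.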