OLD | NEW |
---|---|
1 ============================== | 1 ============================== |
2 PNaCl Bitcode Reference Manual | 2 PNaCl Bitcode Reference Manual |
3 ============================== | 3 ============================== |
4 | 4 |
5 .. contents:: | 5 .. contents:: |
6 :local: | 6 :local: |
7 :depth: 3 | 7 :depth: 3 |
8 | 8 |
9 Introduction | 9 Introduction |
10 ============ | 10 ============ |
(...skipping 88 matching lines...) | |
99 | 99 |
100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_ | 100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_ |
101 | 101 |
102 PNaCl bitcode does not support inline assembly. | 102 PNaCl bitcode does not support inline assembly. |
103 | 103 |
104 Volatile Memory Accesses | 104 Volatile Memory Accesses |
105 ------------------------ | 105 ------------------------ |
106 | 106 |
107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_ | 107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_ |
108 | 108 |
109 TODO: are we going to promote volatile to atomic? | 109 We recommend that C11/C++11 atomics be used instead of ``volatile``. |
110 | |
111 The C and C++ standards mandate that ``volatile`` accesses execute in | |
112 program order (but are not fences, so other memory operations can | |
113 reorder around them), are not necessarily atomic, and can’t be elided or | |
114 fused. | |
115 | |
116 The PNaCl toolchain applies regular LLVM optimizations along these | |
117 guidelines, and the PNaCl toolchain then freezes ``volatile`` accesses | |
118 into atomic accesses with sequential consistency memory ordering. This | |
119 eases the support of legacy (i.e. non-C11/C++11) code, and combined with | |
120 builtin fences these programs can do meaningful cross-thread | |
121 communication without changing code. | |
122 | |
123 Relaxed ordering could be used instead, but for the first release it is | |
124 more conservative to apply sequential consistency. Future releases may | |
125 change what happens at compile-time, but already-released pexes will | |
Derek Schuff
2013/06/26 17:03:29
release->released
JF
2013/06/26 23:41:12
Done.
| |
126 continue using sequential consistency. | |
127 | |
128 The PNaCl toolchain also tries to guarantee natural alignment of | |
129 ``volatile`` accesses, a requirement for atomicity on some platforms. | |
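
As a rough illustration of the recommendation above, legacy code that publishes a value through a ``volatile`` flag can usually be rewritten with C++11 atomics. The sketch below is hypothetical: the names ``payload``, ``ready``, ``publish``, ``try_consume`` and ``demo`` are made up for this example and are not part of PNaCl. It runs single-threaded only to stay self-contained.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state: `ready` guards `payload`.
static int payload = 0;
static std::atomic<bool> ready(false);  // legacy code: `volatile bool ready;`

// Writer: fill in the payload, then publish it.
void publish(int value) {
  payload = value;
  ready.store(true);  // std::memory_order_seq_cst by default
}

// Reader: returns true and copies the payload once it has been published.
bool try_consume(int *out) {
  if (!ready.load())  // std::memory_order_seq_cst by default
    return false;
  *out = payload;
  return true;
}

// Single-threaded walk-through of the handshake.
int demo() {
  int out = -1;
  bool early = try_consume(&out);  // nothing has been published yet
  assert(!early);
  (void)early;
  publish(42);
  bool late = try_consume(&out);   // the flag is now set
  assert(late);
  (void)late;
  return out;
}
```

The default orderings are left in place on purpose: they already request sequential consistency, which is what the toolchain applies to ``volatile`` accesses anyway.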
110 | 130 |
111 Memory Model for Concurrent Operations | 131 Memory Model for Concurrent Operations |
112 -------------------------------------- | 132 -------------------------------------- |
113 | 133 |
114 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_ | 134 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_ |
115 | 135 |
116 TODO. | 136 The PNaCl toolchain currently supports concurrent memory accesses |
137 through legacy GCC-style ``__sync_*`` builtins, as well as through | |
138 C11/C++11 atomic primitives. ``volatile`` memory accesses can also be | |
139 used, though these are discouraged. | |
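
For readers comparing the two supported styles, the sketch below puts a legacy GCC-style ``__sync_*`` builtin next to its C++11 counterpart. The wrapper names (``legacy_fetch_add``, ``modern_fetch_add``, ``legacy_cas``, ``modern_cas``, ``counters_agree``) are invented for illustration; only the builtins and the ``std::atomic`` members are real.

```cpp
#include <atomic>
#include <cassert>

// Legacy GCC-style builtin: atomically adds `delta`, returns the old value.
int legacy_fetch_add(int *counter, int delta) {
  return __sync_fetch_and_add(counter, delta);
}

// C++11 counterpart: same contract, expressed on a std::atomic object.
int modern_fetch_add(std::atomic<int> &counter, int delta) {
  return counter.fetch_add(delta);
}

// Legacy compare-and-swap: true if *p held `expected` and now holds `desired`.
bool legacy_cas(int *p, int expected, int desired) {
  return __sync_bool_compare_and_swap(p, expected, desired);
}

// C++11 counterpart (`expected` is a local copy, so the caller's value is
// not overwritten when the exchange fails).
bool modern_cas(std::atomic<int> &p, int expected, int desired) {
  return p.compare_exchange_strong(expected, desired);
}

// Applies the same updates through both interfaces and compares the results.
bool counters_agree() {
  int legacy = 0;
  std::atomic<int> modern(0);
  for (int i = 1; i <= 4; ++i) {
    int a = legacy_fetch_add(&legacy, i);
    int b = modern_fetch_add(modern, i);
    assert(a == b);
    (void)a;
    (void)b;
  }
  return legacy == modern.load();
}
```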
140 | |
141 Note that PNaCl explicitly supports concurrency through threading, but | |
142 doesn't support interacting with device memory, nor does it attempt to | |
143 support cross-program communication, including through shared | |
144 memory. These concerns are left up to the embedding sandbox's runtime | |
145 (e.g. NaCl's Pepper APIs). | |
146 | |
147 PNaCl also doesn't currently support signal handling, and therefore | |
148 promotes all primitives to cross-thread (instead of single-thread). This | |
149 may change at a later date. | |
150 | |
151 The PNaCl toolchain currently optimizes for memory ordering as LLVM | |
152 normally does, but at pexe creation time it promotes all ``volatile`` | |
153 accesses as well as all atomic accesses to be sequentially consistent. | |
154 | |
155 This means that ``volatile`` and atomic memory accesses can only be | |
156 re-ordered before the pexe is created, and will act as fences for all | |
157 memory accesses (even non-atomic and non-``volatile``) after pexe | |
158 creation. Non-atomic and non-``volatile`` memory accesses may be | |
159 reordered (unless a fence intervenes), separated, elided or fused | |
Derek Schuff
2013/06/26 17:03:29
separate->separated?
JF
2013/06/26 23:41:12
Done.
| |
160 according to C and C++'s memory model before the pexe is created as well | |
161 as after its creation. | |
117 | 162 |
118 Atomic Memory Ordering Constraints | 163 Atomic Memory Ordering Constraints |
119 ---------------------------------- | 164 ---------------------------------- |
120 | 165 |
121 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_ | 166 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_ |
122 | 167 |
123 TODO. | 168 Atomics follow the same ordering constraints as in regular LLVM, but all |
169 accesses are promoted to sequential consistency (the strongest memory | |
170 ordering) at pexe creation time. We may relax these rules and honor the | |
171 program's memory ordering constraints as more C11/C++11 code allows us | |
172 to understand performance and portability needs. | |
173 | |
174 As in C11/C++11: | |
175 | |
176 - Atomic accesses must at least be naturally aligned. | |
177 - Some accesses may not actually be atomic on certain platforms, | |
178 requiring an implementation that uses a global lock. | |
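
The two constraints above can be checked from C++11 code. This is a minimal sketch, not a PNaCl API: ``is_naturally_aligned`` and ``needs_lock`` are hypothetical helpers, and "natural alignment" is taken here to mean that the address is a multiple of the access size.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if address `p` is a multiple of `size`, i.e. naturally aligned for an
// access of `size` bytes (assumption: natural alignment equals access size).
bool is_naturally_aligned(const void *p, std::size_t size) {
  assert(size != 0);
  return reinterpret_cast<std::uintptr_t>(p) % size == 0;
}

// True if the implementation reports that operations on this object are not
// lock-free, in which case it has to fall back to some form of locking.
template <typename T>
bool needs_lock(const std::atomic<T> &object) {
  return !object.is_lock_free();
}
```

A caller might use ``needs_lock`` to pick a different data layout on platforms where a given width is not natively atomic.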
124 | 179 |
125 Fast-Math Flags | 180 Fast-Math Flags |
126 --------------- | 181 --------------- |
127 | 182 |
128 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_ | 183 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_ |
129 | 184 |
130 Fast-math mode is not currently supported by the PNaCl bitcode. | 185 Fast-math mode is not currently supported by the PNaCl bitcode. |
131 | 186 |
132 Type System | 187 Type System |
133 =========== | 188 =========== |
(...skipping 129 matching lines...) | |
263 | 318 |
264 .. code-block:: llvm | 319 .. code-block:: llvm |
265 | 320 |
266 %buf = alloca i8, i32 8, align 4 | 321 %buf = alloca i8, i32 8, align 4 |
267 | 322 |
268 * ``load``, ``store`` | 323 * ``load``, ``store`` |
269 | 324 |
270 The pointer argument of these instructions must be a *normalized* pointer | 325 The pointer argument of these instructions must be a *normalized* pointer |
271 (see :ref:`pointer types <pointertypes>`). | 326 (see :ref:`pointer types <pointertypes>`). |
272 | 327 |
273 * ``fence`` | |
274 * ``cmpxchg``, ``atomicrmw`` | |
275 | |
276 The pointer argument of these instructions must be a *normalized* pointer | |
277 (see :ref:`pointer types <pointertypes>`). | |
278 | |
279 TODO(jfb): this may change | |
280 | |
281 * ``trunc`` | 328 * ``trunc`` |
282 * ``zext`` | 329 * ``zext`` |
283 * ``sext`` | 330 * ``sext`` |
284 * ``fptrunc`` | 331 * ``fptrunc`` |
285 * ``fpext`` | 332 * ``fpext`` |
286 * ``fptoui`` | 333 * ``fptoui`` |
287 * ``fptosi`` | 334 * ``fptosi`` |
288 * ``uitofp`` | 335 * ``uitofp`` |
289 * ``sitofp`` | 336 * ``sitofp`` |
290 | 337 |
(...skipping 18 matching lines...) | |
309 * ``select`` | 356 * ``select`` |
310 * ``call`` | 357 * ``call`` |
311 | 358 |
312 Intrinsic Functions | 359 Intrinsic Functions |
313 =================== | 360 =================== |
314 | 361 |
315 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_ | 362 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_ |
316 | 363 |
317 The only intrinsics supported by PNaCl bitcode are the following. | 364 The only intrinsics supported by PNaCl bitcode are the following. |
318 | 365 |
319 TODO(jfb): atomics | |
320 | |
321 * ``llvm.memcpy`` | 366 * ``llvm.memcpy`` |
322 * ``llvm.memmove`` | 367 * ``llvm.memmove`` |
323 * ``llvm.memset`` | 368 * ``llvm.memset`` |
324 * ``llvm.bswap`` | 369 * ``llvm.bswap`` |
325 | 370 |
326 The llvm.bswap intrinsic is only supported with the following argument types: | 371 The llvm.bswap intrinsic is only supported with the following argument types: |
327 i16, i32, i64. | 372 i16, i32, i64. |
328 | 373 |
329 * ``llvm.ctlz`` | 374 * ``llvm.ctlz`` |
330 * ``llvm.cttz`` | 375 * ``llvm.cttz`` |
331 * ``llvm.ctpop`` | 376 * ``llvm.ctpop`` |
332 | 377 |
333 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support | 378 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support |
334 i32 and i64 argument types (the types supported by C-style GCC builtins). | 379 i32 and i64 argument types (the types supported by C-style GCC builtins). |
335 | 380 |
336 * ``llvm.trap`` | 381 * ``llvm.trap`` |
337 * ``llvm.nacl.read.tp`` | 382 * ``llvm.nacl.read.tp`` |
338 | 383 |
339 TODO: describe | 384 TODO: describe |
340 | 385 |
341 * ``llvm.nacl.longjmp`` | 386 * ``llvm.nacl.longjmp`` |
342 | 387 |
343 TODO: describe | 388 TODO: describe |
344 | 389 |
345 * ``llvm.nacl.setjmp`` | 390 * ``llvm.nacl.setjmp`` |
346 | 391 |
347 TODO: describe | 392 TODO: describe |
348 | 393 |
394 * ``llvm.nacl.atomic.8`` | |
395 * ``llvm.nacl.atomic.16`` | |
396 * ``llvm.nacl.atomic.32`` | |
397 * ``llvm.nacl.atomic.64`` | |
398 | |
399 These intrinsics provide support for atomic accesses at 8, 16, 32 and | |
400 64-bit sizes for primitives required to implement C11/C++11 atomic | |
401 accesses: | |
402 | |
403 - load | |
404 - store | |
405 - add | |
406 - sub | |
407 - or | |
408 - and | |
409 - xor | |
410 - xchg | |
411 - cmpxchg | |
412 - fence | |
413 | |
414 They also support C11/C++11 memory orderings: | |
415 | |
416 - Relaxed: no operation orders memory. | |
417 - Consume: a load operation performs a consume operation on the | |
418 affected memory location (currently unsupported by LLVM). | |
419 - Acquire: a load operation performs an acquire operation on the | |
420 affected memory location. | |
421 - Release: a store operation performs a release operation on the | |
422 affected memory location. | |
423 - Acquire-release: load and store operations perform acquire and | |
424 release operations on the affected memory. | |
425 - Sequentially consistent: same as acquire-release, but providing a | |
426 total ordering for all affected locations. | |
427 | |
428 Note that PNaCl currently strengthens all memory ordering | |
429 specifications to sequential consistency, the strongest form of memory | |
430 ordering. | |
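
At the source level, the orderings listed above correspond to the ``std::memory_order`` values of C++11. The sketch below shows how a program might request them. All names (``message``, ``flag``, ``hits``, ``send``, ``receive``, ``count_hit``, ``take_flag``, ``full_fence``) are hypothetical. Given the strengthening noted above, a pexe would currently treat every one of these requests as sequentially consistent.

```cpp
#include <atomic>
#include <cassert>

static int message = 0;           // plain, non-atomic data
static std::atomic<int> flag(0);  // guards `message`
static std::atomic<int> hits(0);  // statistics counter, ordering irrelevant

// Release: writes made before the store are visible to a thread whose
// acquire load observes the stored value.
void send(int value) {
  assert(value >= 0);  // -1 is reserved as the "nothing yet" result
  message = value;
  flag.store(1, std::memory_order_release);
}

// Acquire: pairs with the release store in send(). Returns -1 if no message
// has been published.
int receive() {
  if (flag.load(std::memory_order_acquire) != 1)
    return -1;
  return message;
}

// Relaxed: atomic, but imposes no ordering on other memory operations.
int count_hit() { return hits.fetch_add(1, std::memory_order_relaxed); }

// Acquire-release: a read-modify-write that acts as both.
int take_flag() { return flag.exchange(0, std::memory_order_acq_rel); }

// Sequentially consistent fence.
void full_fence() { std::atomic_thread_fence(std::memory_order_seq_cst); }
```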