docs/PNaClLangRef.rst - Issue 17777004: Concurrency support for PNaCl ABI

Side by Side Diff: docs/PNaClLangRef.rst

Issue 17777004: Concurrency support for PNaCl ABI (Closed) Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master

Patch Set: Update PNaClLangRef to reflect the implementation work I will now go forward with. Created 7 years, 5 months ago

OLD	NEW
1 ==============================	1 ==============================

2 PNaCl Bitcode Reference Manual	2 PNaCl Bitcode Reference Manual

3 ==============================	3 ==============================

4	4

5 .. contents::	5 .. contents::

6 :local:	6 :local:

7 :depth: 3	7 :depth: 3

8	8

9 Introduction	9 Introduction

10 ============	10 ============

(...skipping 88 matching lines...)
99	99

100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_	100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_

101	101

102 PNaCl bitcode does not support inline assembly.	102 PNaCl bitcode does not support inline assembly.

103	103

104 Volatile Memory Accesses	104 Volatile Memory Accesses

105 ------------------------	105 ------------------------

106	106

107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_	107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_

108	108

109 TODO: are we going to promote volatile to atomic?	109 The C and C++ standards mandate that ``volatile`` accesses execute in
	eliben 2013/06/27 19:32:24: The C11 and C++11 standards...
	JF 2013/06/27 21:03:17: These restrictions are actually all in the previous standard IIRC: they don't talk about concurrency at all. I'll still update to C11/C++11 since it does look inconsistent.
	110 program order (but are not fences, so other memory operations can

	111 reorder around them), are not necessarily atomic, and can’t be elided or

	112 fused.

	113

	114 The PNaCl toolchain applies regular LLVM optimizations along these

	115 guidelines, but prevents any load/store (even non-``volatile`` and

	116 non-atomic ones) from moving past a volatile operations: they act as
	eliben 2013/06/27 19:32:24: "past volatile operations"?
	JF 2013/06/27 21:03:17: I mean: a regular load/store can't move above or below a volatile load/store. Clarified.
	117 compiler barriers before optimizations occur. The PNaCl toolchain

	118 freezes ``volatile`` accesses after optimizations into atomic accesses

	119 with sequential consistency memory ordering. This eases the support of
	eliben 2013/06/27 19:32:24: sequentially consistent memory ordering?
	JF 2013/06/27 21:03:17: Done.
	120 legacy (i.e. non-C11/C++11) code, and combined with builtin fences these

	121 programs can do meaningful cross-thread communication without changing

	122 code. It also reflects the original code's intent and guarantees better

	123 portability.

	124

	125 Relaxed ordering could be used instead, but for the first release it is

	126 more conservative to apply sequential consistency. Future releases may

	127 change what happens at compile-time, but already-released pexes will

	128 continue using sequential consistency.

	129

	130 The PNaCl toolchain also requires that ``volatile`` accesses be at least

	131 naturally aligned, and tries to guarantee this alignment.
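
As an illustration of the ``volatile`` paragraphs above (not part of the patch), the following C++ sketch places a legacy ``volatile`` flag next to the sequentially consistent atomic access that the text says it is treated as after pexe creation. The names ``legacy_ready`` and ``modern_ready`` are hypothetical, and the pairing is only an analogy for the described promotion, not the toolchain's actual output.

```cpp
#include <atomic>
#include <cassert>

// Legacy-style flag: a plain volatile int, as pre-C11/C++11 code often used.
// (Hypothetical name, for illustration only.)
volatile int legacy_ready = 0;

// Roughly what the text says such an access becomes after pexe creation:
// an atomic access with sequentially consistent ordering.
std::atomic<int> modern_ready{0};

// Volatile store and load: executed in program order, never elided or fused.
void publish_legacy() { legacy_ready = 1; }
int read_legacy() { return legacy_ready; }

// Explicit sequentially consistent store and load.
void publish_modern() { modern_ready.store(1, std::memory_order_seq_cst); }
int read_modern() { return modern_ready.load(std::memory_order_seq_cst); }
```

In standard C++ only the atomic variant is defined for cross-thread use; the promotion described above is what would let the ``volatile`` variant behave similarly under PNaCl.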

110	132

111 Memory Model for Concurrent Operations	133 Memory Model for Concurrent Operations

112 --------------------------------------	134 --------------------------------------

113	135

114 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_	136 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_

115	137

116 TODO.	138 The memory model offered by PNaCl relies the same coding guidelines as
	eliben 2013/06/27 19:32:24: "relies on the same" ?
	JF 2013/06/27 21:03:17: Done.
	139 the C11/C++11 one: concurrent accesses must always occur through atomic

	140 primitives, and these accesses must always occur with the same size for

	141 the same memory locations. Visibility of stores is provided on a

	142 happens-before basis that relates memory locations to each other as the

	143 C11/C++11 standards do.

	144

	145 As in C11/C++11 some atomic accesses may be implemented with locks on

	146 certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be

	147 ``1``, signifying that all types are sometimes lock-free. The

	148 ``is_lock_free`` methods will return the current platform's

	149 implementation at runtime.

	150

	151 The PNaCl toolchain supports concurrent memory accesses through legacy

	152 GCC-style ``__sync_*`` builtins, as well as through C11/C++11 atomic

	153 primitives. ``volatile`` memory accesses can also be used, though these

	154 are discouraged.

	155

	156 Note that PNaCl explicitly supports concurrency through threading and

	157 inter-process communication (shared memory), but doesn't support

	158 interacting with device memory. Setting these up requires assistance from

	159 the embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using

	160 them once set up can be done through regular C/C++ code.

	161

	162 PNaCl also doesn't currently support signal handling, and therefore

	163 promotes all primitives to cross-thread (instead of single-thread). This

	164 may change at a later date.

	165

	166 The PNaCl toolchain currently optimizes for memory ordering as LLVM

	167 normally does, but at pexe creation time it promotes all ``volatile``

	168 accesses as well as all atomic accesses to be sequentially

	169 consistent. Other memory orderings will be supported in a future

	170 release, but pexes generated with the current toolchain will continue

	171 functioning with sequential consistency. Using sequential consistency

	172 provides a total ordering for all sequentially-consistent operations on

	173 all addresses.

	174

	175 This means that ``volatile`` and atomic memory accesses can only be

	176 re-ordered in some limited way before the pexe is created, and will act

	177 as fences for all memory accesses (even non-atomic and non-``volatile``)

	178 after pexe creation. Non-atomic and non-``volatile`` memory accesses may

	179 be reordered (unless a fence intervenes), separated, elided or fused

	180 according to C and C++'s memory model before the pexe is created as well

	181 as after its creation.
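
To make the two supported styles in the section above concrete (illustration only, not part of the patch), here is a minimal C++ sketch that bumps one counter with a C++11 atomic and another with the GCC-style ``__sync_fetch_and_add`` builtin, and queries lock-freedom at compile time and at runtime. The counter and function names are hypothetical; the sketch assumes a GCC- or Clang-compatible compiler for the ``__sync_*`` builtin.

```cpp
#include <atomic>
#include <cassert>

// C11/C++11 style: an atomic object accessed only through atomic primitives.
std::atomic<int> cxx11_counter{0};

// Legacy style: a plain int accessed only through __sync_* builtins.
int legacy_counter = 0;

// Both functions return the value held before the increment.
// fetch_add defaults to sequentially consistent ordering.
int bump_cxx11() { return cxx11_counter.fetch_add(1); }

// __sync_fetch_and_add is documented by GCC as a full barrier.
int bump_legacy() { return __sync_fetch_and_add(&legacy_counter, 1); }

// Compile-time view: 0 = never, 1 = sometimes, 2 = always lock-free.
int int_lock_free_macro() { return ATOMIC_INT_LOCK_FREE; }

// Runtime view: what this particular platform actually provides.
bool int_is_lock_free() { return cxx11_counter.is_lock_free(); }
```

Both increment functions are safe to call from several threads at once; they are exercised single-threaded here only to keep the example self-contained.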

117	182

118 Atomic Memory Ordering Constraints	183 Atomic Memory Ordering Constraints

119 ----------------------------------	184 ----------------------------------

120	185

121 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_	186 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_

122	187

123 TODO.	188 Atomics follow the same ordering constraints as in regular LLVM, but all

	189 accesses are promoted to sequential consistency (the strongest memory

	190 ordering) at pexe creation time. We may relax these rules and honor the

	191 program's memory ordering constraints as more C11/C++11 code allows us

	192 to understand performance and portability needs.

	193

	194 As in C11/C++11:

	195

	196 - Atomic and volatile accesses must at least be naturally aligned.

	197 - Some accesses may not actually be atomic on certain platforms,

	198 requiring an implementation that uses a global lock.

	199 - An atomic memory location must always be accesses with atomic
	eliben 2013/06/27 19:32:24: accessed
	JF 2013/06/27 21:03:17: Done.
	200 primitives, and these primitives must always be of the same type for

	201 that location.
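
The natural-alignment requirement in the list above can be checked with a small helper (illustration only, not part of the patch). This sketch assumes "naturally aligned" means the address is a multiple of the access size in bytes; ``is_naturally_aligned`` and ``buffer`` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when `p` is a multiple of `size` bytes,
// i.e. an access of `size` bytes at `p` would be naturally aligned.
bool is_naturally_aligned(const void* p, std::size_t size) {
  return reinterpret_cast<std::uintptr_t>(p) % size == 0;
}

// An 8-byte-aligned buffer, so offsets within it have known alignment.
alignas(8) unsigned char buffer[16];
```

With this buffer, offset 0 is suitable for an 8-byte access, offset 4 only for accesses of 4 bytes or fewer, and offset 1 only for single-byte accesses.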

124	202

125 Fast-Math Flags	203 Fast-Math Flags

126 ---------------	204 ---------------

127	205

128 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_	206 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_

129	207

130 Fast-math mode is not currently supported by the PNaCl bitcode.	208 Fast-math mode is not currently supported by the PNaCl bitcode.

131	209

132 Type System	210 Type System

133 ===========	211 ===========

(...skipping 129 matching lines...)
263	341

264 .. code-block:: llvm	342 .. code-block:: llvm

265	343

266 %buf = alloca i8, i32 8, align 4	344 %buf = alloca i8, i32 8, align 4

267	345

268 * ``load``, ``store``	346 * ``load``, ``store``

269	347

270 The pointer argument of these instructions must be a normalized pointer	348 The pointer argument of these instructions must be a normalized pointer

271 (see :ref:`pointer types <pointertypes>`).	349 (see :ref:`pointer types <pointertypes>`).

272	350

273 * ``fence``

274 * ``cmpxchg``, ``atomicrmw``

275

276 The pointer argument of these instructions must be a normalized pointer

277 (see :ref:`pointer types <pointertypes>`).

278

279 TODO(jfb): this may change

280

281 * ``trunc``	351 * ``trunc``

282 * ``zext``	352 * ``zext``

283 * ``sext``	353 * ``sext``

284 * ``fptrunc``	354 * ``fptrunc``

285 * ``fpext``	355 * ``fpext``

286 * ``fptoui``	356 * ``fptoui``

287 * ``fptosi``	357 * ``fptosi``

288 * ``uitofp``	358 * ``uitofp``

289 * ``sitofp``	359 * ``sitofp``

290	360

(...skipping 18 matching lines...)
309 * ``select``	379 * ``select``

310 * ``call``	380 * ``call``

311	381

312 Intrinsic Functions	382 Intrinsic Functions

313 ===================	383 ===================

314	384

315 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_	385 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_

316	386

317 The only intrinsics supported by PNaCl bitcode are the following.	387 The only intrinsics supported by PNaCl bitcode are the following.

318	388

319 TODO(jfb): atomics

320

321 * ``llvm.memcpy``	389 * ``llvm.memcpy``

322 * ``llvm.memmove``	390 * ``llvm.memmove``

323 * ``llvm.memset``	391 * ``llvm.memset``

324 * ``llvm.bswap``	392 * ``llvm.bswap``

325	393

326 The llvm.bswap intrinsic is only supported with the following argument types:	394 The llvm.bswap intrinsic is only supported with the following argument types:

327 i16, i32, i64.	395 i16, i32, i64.

328	396

329 * ``llvm.ctlz``	397 * ``llvm.ctlz``

330 * ``llvm.cttz``	398 * ``llvm.cttz``

331 * ``llvm.ctpop``	399 * ``llvm.ctpop``

332	400

333 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support	401 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support

334 i32 and i64 argument types (the types supported by C-style GCC builtins).	402 i32 and i64 argument types (the types supported by C-style GCC builtins).

335	403

336 * ``llvm.trap``	404 * ``llvm.trap``

337 * ``llvm.nacl.read.tp``	405 * ``llvm.nacl.read.tp``

338	406

339 TODO: describe	407 TODO: describe

340	408

341 * ``llvm.nacl.longjmp``	409 * ``llvm.nacl.longjmp``

342	410

343 TODO: describe	411 TODO: describe

344	412

345 * ``llvm.nacl.setjmp``	413 * ``llvm.nacl.setjmp``

346	414

347 TODO: describe	415 TODO: describe

348	416

	417 * ``llvm.nacl.atomic.store``

	418 * ``llvm.nacl.atomic.load``

	419 * ``llvm.nacl.atomic.rmw``

	420 * ``llvm.nacl.atomic.cmpxchg``

	421 * ``llvm.nacl.atomic.fence``

	422

	423 These intrinsics allow PNaCl to support C11/C++11 style atomic

	424 operations as well as some legacy GCC-style ``__sync_*`` builtins

	425 while remaining stable as the LLVM codebase changes. The user isn't

	426 expected to use these intrinsics directly.

	427

	428 ::

	429

	430 declare void @llvm.nacl.atomic.store(

	431 iN* <dest>, iN <val>, i32 <memory_order>)

	432 declare iN @llvm.nacl.atomic.load(

	433 iN* <src>, i32 <memory_order>)

	434 declare iN @llvm.nacl.atomic.rmw(

	435 i32 <op>, iN* <loc>, iN <val>, i32 <memory_order>)

	436 declare iN @llvm.nacl.atomic.compare_exchange(
	eliben 2013/06/27 19:32:24: name mismatch with above
	JF 2013/06/27 21:03:17: Done. I went with the C11/C++11 naming and forgot one.
	437 iN* <loc>, iN <expected>, iN <desired>,

	438 i32 <memory_order_success>, i32 <memory_order_failure>)

	439 declare void @llvm.nacl.atomic.fence(i32 <memory_order>)

	440

	441 Each of these intrinsic is overloaded on the ``iN`` argument. Integral
	eliben 2013/06/27 19:32:24: intrinsics
	JF 2013/06/27 21:03:17: Done.
	442 types of 8, 16, 32 and 64-bit width are currently supported for these

	443 ``iN`` arguments.

	444

	445 The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following

	446 read-modify-write operations, from the general and arithmetic sections

	447 of the C11/C++11 standards:

	448

	449 - ``add``

	450 - ``sub``

	451 - ``or``

	452 - ``and``

	453 - ``xor``

	454 - ``exchange``

	455

	456 For all of these read-modify-write operations, the returned value is

	457 that at ``loc`` before the operation. The ``op`` argument must be a

	458 compile-time constant.
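
The "returned value is that at ``loc`` before the operation" rule above matches the C++11 ``fetch_*`` and ``exchange`` members, so it can be illustrated at the source level (illustration only, not part of the patch). The struct and function names are hypothetical; the sketch uses standard ``std::atomic`` calls rather than the intrinsics, which the text says users are not expected to call directly.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical record of the value each operation returned.
struct RmwResults {
  std::uint32_t old_add, old_sub, old_or, old_and, old_xor, old_xchg;
  std::uint32_t final_value;
};

// Runs each of the six read-modify-write operations once on one location.
RmwResults run_rmw_sequence() {
  std::atomic<std::uint32_t> loc{10};
  RmwResults r;
  r.old_add  = loc.fetch_add(5);   // returns 10, loc becomes 15
  r.old_sub  = loc.fetch_sub(3);   // returns 15, loc becomes 12
  r.old_or   = loc.fetch_or(1);    // returns 12, loc becomes 13
  r.old_and  = loc.fetch_and(12);  // returns 13, loc becomes 12
  r.old_xor  = loc.fetch_xor(15);  // returns 12, loc becomes 3
  r.old_xchg = loc.exchange(42);   // returns 3,  loc becomes 42
  r.final_value = loc.load();
  return r;
}
```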

	459

	460 All atomic intrinsics also support C11/C++11 memory orderings, which

	461 must be compile-time constants:

	462

	463 - Relaxed: no operation orders memory.

	464 - Consume: a load operation performs a consume operation on the

	465 affected memory location (currently unsupported by LLVM).

	466 - Acquire: a load operation performs an acquire operation on the

	467 affected memory location.

	468 - Release: a store operation performs a release operation on the

	469 affected memory location.

	470 - Acquire-release: load and store operations perform acquire and

	471 release operations on the affected memory.

	472 - Sequentially consistent: same as acquire-release, but providing a

	473 global total ordering for all affected locations.

	474

	475 Note that PNaCl currently strengthens all memory ordering

	476 specifications to sequential consistency, the strongest form of memory

	477 ordering.
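
The separate success and failure orderings taken by the compare-exchange intrinsic mirror C++11's ``compare_exchange_strong``, which can be sketched as follows (illustration only, not part of the patch). The wrapper ``cas_observed`` is hypothetical, and it assumes that the ``iN`` result of the intrinsic is the value observed at ``loc``; the text above does not state this explicitly. Under the strengthening described above, both orderings requested here would be treated as sequentially consistent.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper: attempts to replace `expected` with `desired` at `loc`
// and returns the value that was observed at `loc`.
int cas_observed(std::atomic<int>& loc, int expected, int desired) {
  loc.compare_exchange_strong(expected, desired,
                              std::memory_order_acq_rel,   // ordering on success
                              std::memory_order_acquire);  // ordering on failure
  // On success `expected` is unchanged and equals the old value;
  // on failure it has been overwritten with the value actually found.
  return expected;
}
```

A caller can tell success from failure by comparing the returned value with the ``expected`` value it passed in.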

	478

	479 Values for these operations and memory orderings are defined in

	480 llvm/IR/NaClIntrinsics.h.
