docs/PNaClLangRef.rst - Issue 17777004: Concurrency support for PNaCl ABI

Side by Side Diff: docs/PNaClLangRef.rst

Issue 17777004: Concurrency support for PNaCl ABI (Closed) Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master

Patch Set: Fix whitespace. Created 7 years, 5 months ago

OLD	NEW
1 ==============================	1 ==============================

2 PNaCl Bitcode Reference Manual	2 PNaCl Bitcode Reference Manual

3 ==============================	3 ==============================

4	4

5 .. contents::	5 .. contents::

6 :local:	6 :local:

7 :depth: 3	7 :depth: 3

8	8

9 Introduction	9 Introduction

10 ============	10 ============

(...skipping 88 matching lines...)
99	99

100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_	100 `LLVM LangRef: Module-Level Inline Assembly <LangRef.html#moduleasm>`_

101	101

102 PNaCl bitcode does not support inline assembly.	102 PNaCl bitcode does not support inline assembly.

103	103

104 Volatile Memory Accesses	104 Volatile Memory Accesses

105 ------------------------	105 ------------------------

106	106

107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_	107 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_

108	108

109 TODO: are we going to promote volatile to atomic?	109 We recommend that C11/C++11 atomics be used instead of ``volatile``.

	110

	111 The C and C++ standards mandate that ``volatile`` accesses execute in

	112 program order (but are not fences, so other memory operations can

	113 reorder around them), are not necessarily atomic, and can’t be elided or

	114 fused.

	115

	116 The PNaCl toolchain applies regular LLVM optimizations along these

	117 guidelines, and the PNaCl toolchain then freezes ``volatile`` accesses

	118 into atomic accesses with sequential consistency memory ordering. This

	119 eases the support of legacy (i.e. non-C11/C++11) code, and combined with

	120 builtin fences these programs can do meaningful cross-thread

	121 communication without changing code.

	122

	123 Relaxed ordering could be used instead, but for the first release it is

	124 more conservative to apply sequential consistency. Future releases may

	125 change what happens at compile-time, but already-release pexes will
	Derek Schuff 2013/06/26 17:03:29: release->released
	JF 2013/06/26 23:41:12: Done.
	126 continue using sequential consistency.

	127

	128 The PNaCl toolchain also tries to guarantee natural alignment of

	129 ``volatile`` accesses, a requirement for atomicity on some platforms.
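As a rough illustration of the recommendation in this section, the sketch below contrasts a legacy ``volatile`` flag with a C++11 atomic flag used to hand data from a writer to a reader. All names (``legacy_ready``, ``ready``, ``payload``, ``publish``, ``try_consume``) are hypothetical and do not come from the PNaCl sources, and the release/acquire orderings are one reasonable source-level choice rather than a description of what the toolchain emits.

```cpp
#include <atomic>
#include <cassert>

// Legacy style: accesses stay in program order, but the standard does not
// make them atomic and they do not act as fences.
volatile int legacy_ready = 0;

void legacy_publish() { legacy_ready = 1; }

// Recommended style: the atomic flag carries an explicit memory ordering.
std::atomic<int> ready{0};
int payload = 0;  // plain data published through the flag

// Writer side: write the data, then release-store the flag.
void publish(int value) {
  payload = value;
  ready.store(1, std::memory_order_release);
}

// Reader side: acquire-load the flag before reading the data.
bool try_consume(int *out) {
  assert(out != nullptr);
  if (ready.load(std::memory_order_acquire) == 0)
    return false;
  *out = payload;
  return true;
}
```

A reader would call ``try_consume`` until it returns true; the atomic version states its ordering requirement in the source instead of relying on how ``volatile`` happens to be lowered.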

110	130

111 Memory Model for Concurrent Operations	131 Memory Model for Concurrent Operations

112 --------------------------------------	132 --------------------------------------

113	133

114 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_	134 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_

115	135

116 TODO.	136 The PNaCl toolchain currently supports concurrent memory accesses

	137 through legacy GCC-style ``__sync_*`` builtins, as well as through

	138 C11/C++11 atomic primitives. ``volatile`` memory accesses can also be

	139 used, though these are discouraged.

	140

	141 Note that PNaCl explicitly supports concurrency through threading, but

	142 doesn't support interacting with device memory, nor does it attempt to

	143 support cross-program communication, including through shared

	144 memory. These concerns are left up to the embedding sandbox's runtime

	145 (e.g. NaCl's Pepper APIs).

	146

	147 PNaCl also doesn't currently support signal handling, and therefore

	148 promotes all primitives to cross-thread (instead of single-thread). This

	149 may change at a later date.

	150

	151 The PNaCl toolchain currently optimizes for memory ordering as LLVM

	152 normally does, but at pexe creation time it promotes all ``volatile``

	153 accesses as well as all atomic accesses to be sequentially consistent.

	154

	155 This means that ``volatile`` and atomic memory accesses can only be

	156 re-ordered before the pexe is created, and will act as fences for all

	157 memory accesses (even non-atomic and non-``volatile``) after pexe

	158 creation. Non-atomic and non-``volatile`` memory accesses may be

	159 reordered (unless a fence intervenes), separate, elided or fused
	Derek Schuff 2013/06/26 17:03:29: separate->separated?
	JF 2013/06/26 23:41:12: Done.
	160 according to C and C++'s memory model before the pexe is created as well

	161 as after its creation.
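To make the two supported source-level styles concrete, here is a minimal sketch that performs the same atomic increment with a GCC-style ``__sync_*`` builtin and with a C++11 atomic. The counter and function names are hypothetical; the sketch assumes a GCC- or Clang-compatible compiler, since ``__sync_fetch_and_add`` is a compiler builtin rather than standard C++.

```cpp
#include <atomic>
#include <cassert>

int legacy_counter = 0;        // plain int, accessed only through __sync_* builtins
std::atomic<int> counter{0};   // C++11 atomic object

// Legacy style: the builtin implies a full barrier and returns the old value.
int legacy_increment() {
  return __sync_fetch_and_add(&legacy_counter, 1);
}

// C++11 style: the ordering is spelled out; fetch_add also returns the old value.
int increment() {
  return counter.fetch_add(1, std::memory_order_seq_cst);
}

// Usage: both styles return the value held before the increment.
void check_increments() {
  int legacy_before = legacy_increment();
  int before = increment();
  assert(legacy_counter == legacy_before + 1);
  assert(counter.load() == before + 1);
}
```

Since the text states that every atomic access becomes sequentially consistent at pexe creation time, the explicit ``memory_order_seq_cst`` here simply matches what the program would get anyway.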

117	162

118 Atomic Memory Ordering Constraints	163 Atomic Memory Ordering Constraints

119 ----------------------------------	164 ----------------------------------

120	165

121 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_	166 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_

122	167

123 TODO.	168 Atomics follow the same ordering constraints as in regular LLVM, but all

	169 accesses are promoted to sequential consistency (the strongest memory

	170 ordering) at pexe creation time. We may relax these rules and honor the

	171 program's memory ordering constraints as more C11/C++11 code allows us

	172 to understand performance and portability needs.

	173

	174 As in C11/C++11:

	175

	176 - Atomic accesses must at least be naturally aligned.

	177 - Some accesses may not actually be atomic on certain platforms,

	178 requiring an implementation that uses a global lock.
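The two constraints above can be checked from C++ source. The sketch below is only an illustration: ``is_naturally_aligned``, ``word`` and ``word_is_lock_free`` are hypothetical names, and the result of ``is_lock_free()`` depends on the target platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Natural alignment: the address is a multiple of the access size.
bool is_naturally_aligned(const void *p, std::size_t size) {
  return reinterpret_cast<std::uintptr_t>(p) % size == 0;
}

std::atomic<std::uint32_t> word{0};

// Reports whether this platform implements the access without a lock.
// A false result means the implementation falls back to locking.
bool word_is_lock_free() { return word.is_lock_free(); }

// Usage: on mainstream compilers a 32-bit atomic object is naturally aligned.
void check_word() {
  assert(is_naturally_aligned(&word, sizeof(std::uint32_t)));
}
```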

124	179

125 Fast-Math Flags	180 Fast-Math Flags

126 ---------------	181 ---------------

127	182

128 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_	183 `LLVM LangRef: Fast-Math Flags <LangRef.html#fastmath>`_

129	184

130 Fast-math mode is not currently supported by the PNaCl bitcode.	185 Fast-math mode is not currently supported by the PNaCl bitcode.

131	186

132 Type System	187 Type System

133 ===========	188 ===========

(...skipping 129 matching lines...)
263	318

264 .. code-block:: llvm	319 .. code-block:: llvm

265	320

266 %buf = alloca i8, i32 8, align 4	321 %buf = alloca i8, i32 8, align 4

267	322

268 * ``load``, ``store``	323 * ``load``, ``store``

269	324

270 The pointer argument of these instructions must be a normalized pointer	325 The pointer argument of these instructions must be a normalized pointer

271 (see :ref:`pointer types <pointertypes>`).	326 (see :ref:`pointer types <pointertypes>`).

272	327

273 * ``fence``

274 * ``cmpxchg``, ``atomicrmw``

275

276 The pointer argument of these instructions must be a normalized pointer

277 (see :ref:`pointer types <pointertypes>`).

278

279 TODO(jfb): this may change

280

281 * ``trunc``	328 * ``trunc``

282 * ``zext``	329 * ``zext``

283 * ``sext``	330 * ``sext``

284 * ``fptrunc``	331 * ``fptrunc``

285 * ``fpext``	332 * ``fpext``

286 * ``fptoui``	333 * ``fptoui``

287 * ``fptosi``	334 * ``fptosi``

288 * ``uitofp``	335 * ``uitofp``

289 * ``sitofp``	336 * ``sitofp``

290	337

(...skipping 18 matching lines...)
309 * ``select``	356 * ``select``

310 * ``call``	357 * ``call``

311	358

312 Intrinsic Functions	359 Intrinsic Functions

313 ===================	360 ===================

314	361

315 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_	362 `LLVM LangRef: Intrinsic Functions <LangRef.html#intrinsics>`_

316	363

317 The only intrinsics supported by PNaCl bitcode are the following.	364 The only intrinsics supported by PNaCl bitcode are the following.

318	365

319 TODO(jfb): atomics

320

321 * ``llvm.memcpy``	366 * ``llvm.memcpy``

322 * ``llvm.memmove``	367 * ``llvm.memmove``

323 * ``llvm.memset``	368 * ``llvm.memset``

324 * ``llvm.bswap``	369 * ``llvm.bswap``

325	370

326 The llvm.bswap intrinsic is only supported with the following argument types:	371 The llvm.bswap intrinsic is only supported with the following argument types:

327 i16, i32, i64.	372 i16, i32, i64.

328	373

329 * ``llvm.ctlz``	374 * ``llvm.ctlz``

330 * ``llvm.cttz``	375 * ``llvm.cttz``

331 * ``llvm.ctpop``	376 * ``llvm.ctpop``

332	377

333 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support	378 The llvm.ctlz, llvm.cttz, and llvm.ctpop intrinsics only support

334 i32 and i64 argument types (the types supported by C-style GCC builtins).	379 i32 and i64 argument types (the types supported by C-style GCC builtins).

335	380

336 * ``llvm.trap``	381 * ``llvm.trap``

337 * ``llvm.nacl.read.tp``	382 * ``llvm.nacl.read.tp``

338	383

339 TODO: describe	384 TODO: describe

340	385

341 * ``llvm.nacl.longjmp``	386 * ``llvm.nacl.longjmp``

342	387

343 TODO: describe	388 TODO: describe

344	389

345 * ``llvm.nacl.setjmp``	390 * ``llvm.nacl.setjmp``

346	391

347 TODO: describe	392 TODO: describe

348	393

	394 * ``llvm.nacl.atomic.8``

	395 * ``llvm.nacl.atomic.16``

	396 * ``llvm.nacl.atomic.32``

	397 * ``llvm.nacl.atomic.64``

	398

	399 These intrinsics provide support for atomic accesses at 8, 16, 32 and

	400 64-bit sizes for primitives required to implement C11/C++11 atomic

	401 accesses:

	402

	403 - load

	404 - store

	405 - add

	406 - sub

	407 - or

	408 - and

	409 - xor

	410 - xchg

	411 - cmpxchg

	412 - fence
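For orientation, each primitive in the list above has a direct C++11 counterpart. The sketch below exercises them in order on a 32-bit value. It shows source-level C++ only; the signatures of the ``llvm.nacl.atomic.*`` intrinsics are not given in this patch and are not reproduced here, and ``run_all_operations`` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

std::uint32_t run_all_operations() {
  std::atomic<std::uint32_t> a{0};
  a.store(8);                           // store
  a.fetch_add(4);                       // add:  8 + 4    = 12
  a.fetch_sub(2);                       // sub:  12 - 2   = 10
  a.fetch_or(0x5);                      // or:   10 | 5   = 15
  a.fetch_and(0xC);                     // and:  15 & 12  = 12
  a.fetch_xor(0x3);                     // xor:  12 ^ 3   = 15
  std::uint32_t old = a.exchange(1);    // xchg: returns 15, leaves 1
  assert(old == 15);
  std::uint32_t expected = 1;
  bool swapped = a.compare_exchange_strong(expected, 7);  // cmpxchg: 1 -> 7
  assert(swapped);
  std::atomic_thread_fence(std::memory_order_seq_cst);    // fence
  return a.load();                      // load: 7
}
```

All operations use the default ordering, ``std::memory_order_seq_cst``.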

	413

	414 They also support C11/C++11 memory orderings:

	415

	416 - Relaxed: no operation orders memory.

	417 - Consume: a load operation performs a consume operation on the

	418 affected memory location (currently unsupported by LLVM).

	419 - Acquire: a load operation performs an acquire operation on the

	420 affected memory location.

	421 - Release: a store operation performs a release operation on the

	422 affected memory location.

	423 - Acquire-release: load and store operations perform acquire and

	424 release operations on the affected memory.

	425 - Sequentially consistent: same as acquire-release, but providing a

	426 total ordering for all affected locations.

	427

	428 Note that PNaCl currently strengthens all memory ordering

	429 specifications to sequential consistency, the strongest form of memory

	430 ordering.
