Side by Side Diff: native_client_sdk/src/doc/reference/pnacl-c-cpp-language-support.rst

Issue 265163004: NaCl documentation: update vector documentation (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/src
Patch Set: Created 6 years, 7 months ago
OLDNEW
1 ============================ 1 ============================
2 PNaCl C/C++ Language Support 2 PNaCl C/C++ Language Support
3 ============================ 3 ============================
4 4
5 .. contents:: 5 .. contents::
6 :local: 6 :local:
7 :backlinks: none 7 :backlinks: none
8 :depth: 3 8 :depth: 3
9 9
10 Source language support 10 Source language support
(...skipping 209 matching lines...)
220 <http://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors>`_ 220 <http://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors>`_
221 and `GCC vectors 221 and `GCC vectors
222 <http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html>`_ since these 222 <http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html>`_ since these
223 are well supported by different hardware platforms and don't require any 223 are well supported by different hardware platforms and don't require any
224 new compiler intrinsics. 224 new compiler intrinsics.
225 225
226 Vector types can be used through the ``vector_size`` attribute: 226 Vector types can be used through the ``vector_size`` attribute:
227 227
228 .. naclcode:: 228 .. naclcode::
229 229
230 typedef int v4s __attribute__((vector_size(16))); 230 #define VECTOR_BYTES 16
231 typedef int v4s __attribute__((vector_size(VECTOR_BYTES)));
231 v4s a = {1,2,3,4}; 232 v4s a = {1,2,3,4};
232 v4s b = {5,6,7,8}; 233 v4s b = {5,6,7,8};
233 v4s c, d, e; 234 v4s c, d, e;
234 c = b + 1; /* c = b + {1,1,1,1}; */ 235 c = b + 1; /* c = b + {1,1,1,1}; */
235 d = 2 * b; /* d = {2,2,2,2} * b; */ 236 d = 2 * b; /* d = {2,2,2,2} * b; */
236 e = c + d; 237 e = c + d;
237 238
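As a rough illustration of the arithmetic above, the following sketch wraps the same operations in two small functions. It assumes a compiler that implements the GCC/Clang ``vector_size`` extension; the helper names ``combine`` and ``lane_sum`` are hypothetical and do not appear in the PNaCl documentation.

```c
#include <stddef.h>

#define VECTOR_BYTES 16
typedef int v4s __attribute__((vector_size(VECTOR_BYTES)));

/* Hypothetical helper: computes (b + 1) + (2 * b) element-wise,
   mirroring the c, d and e assignments in the example above. */
static v4s combine(v4s b) {
  v4s c = b + 1; /* the scalar 1 is broadcast to {1,1,1,1} */
  v4s d = 2 * b; /* the scalar 2 is broadcast to {2,2,2,2} */
  return c + d;
}

/* Hypothetical helper: adds up the lanes using array-style subscripts. */
static int lane_sum(v4s v) {
  int total = 0;
  for (size_t i = 0; i != sizeof(v) / sizeof(v[0]); ++i)
    total += v[i];
  return total;
}
```

With ``b = {5,6,7,8}``, ``combine(b)`` should yield ``{16,19,22,25}``, since each lane is computed independently.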
238 Vector comparisons are represented as a bitmask as wide as the compared 239 Vector comparisons are represented as a bitmask as wide as the compared
239 elements of all ``0`` or all ``1``: 240 elements of all ``0`` or all ``1``:
240 241
(...skipping 65 matching lines...)
306 .. naclcode:: 307 .. naclcode::
307 308
308 typedef unsigned v4u __attribute__((vector_size(16))); 309 typedef unsigned v4u __attribute__((vector_size(16)));
309 template<typename T> 310 template<typename T>
310 void print(const T v) { 311 void print(const T v) {
311 for (size_t i = 0; i != sizeof(v) / sizeof(v[0]); ++i) 312 for (size_t i = 0; i != sizeof(v) / sizeof(v[0]); ++i)
312 std::cout << v[i] << ' '; 313 std::cout << v[i] << ' ';
313 std::cout << std::endl; 314 std::cout << std::endl;
314 } 315 }
315 316
316 Vector shuffles are currently unsupported but will be added soon. 317 Vector shuffle operations (often called permutations or swizzles) are
318 supported through ``__builtin_shufflevector``. The builtin takes two
319 vector arguments of the same element type, followed by a list of
320 constant integers that specify the element indices of the first two
321 vectors that should be extracted and returned in a new vector. These
322 element indices are numbered sequentially starting with the first
323 vector, continuing into the second vector. Thus, if ``vec1`` is a
324 4-element vector, index ``5`` would refer to the second element of
325 ``vec2``. An index of ``-1`` can be used to indicate that the
326 corresponding element in the returned vector is a don’t-care and can be
327 optimized by the backend.
328
329 The result of ``__builtin_shufflevector`` is a vector with the same
330 element type as ``vec1`` / ``vec2`` but that has an element count equal
331 to the number of indices specified.
332
333 .. naclcode::
334
335 // Identity operation: return 4-element vector v1.
336 __builtin_shufflevector(v1, v1, 0, 1, 2, 3)
337
338 // "Splat" element 0 of v1 into a 4-element result.
339 __builtin_shufflevector(v1, v1, 0, 0, 0, 0)
340
341 // Reverse 4-element vector v1.
342 __builtin_shufflevector(v1, v1, 3, 2, 1, 0)
343
344 // Concatenate every other element of 4-element vectors v1 and v2.
345 __builtin_shufflevector(v1, v2, 0, 2, 4, 6)
346
347 // Concatenate every other element of 8-element vectors v1 and v2.
348 __builtin_shufflevector(v1, v2, 0, 2, 4, 6, 8, 10, 12, 14)
349
350 // Shuffle v1 with some elements being undefined.
351 __builtin_shufflevector(v1, v1, 3, -1, 1, -1)
317 352
318 Auto-Vectorization 353 Auto-Vectorization
319 ------------------ 354 ------------------
320 355
321 Auto-vectorization is currently not enabled for Portable Native Client, 356 Auto-vectorization is currently not enabled for Portable Native Client,
322 but will be in a future release. 357 but will be in a future release.
323 358
324 Undefined Behavior 359 Undefined Behavior
325 ================== 360 ==================
326 361
(...skipping 76 matching lines...)
403 A similar feature is **thread suspension**: The ability to 438 A similar feature is **thread suspension**: The ability to
404 asynchronously suspend and resume a thread and inspect or modify its 439 asynchronously suspend and resume a thread and inspect or modify its
405 execution state (such as register state). 440 execution state (such as register state).
406 441
407 Neither PNaCl nor NaCl currently supports asynchronous interruption 442 Neither PNaCl nor NaCl currently supports asynchronous interruption
408 or suspension of threads. 443 or suspension of threads.
409 444
410 If PNaCl were to support either of these, the interaction of 445 If PNaCl were to support either of these, the interaction of
411 ``volatile`` and atomics with same-thread signal handling would need 446 ``volatile`` and atomics with same-thread signal handling would need
412 to be carefully detailed. 447 to be carefully detailed.