Index: gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml |
diff --git a/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml b/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml |
deleted file mode 100644 |
index 7cb2a05986f86ea2aa9c9a95c0f5df42bbca3f2c..0000000000000000000000000000000000000000 |
--- a/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml |
+++ /dev/null |
@@ -1,896 +0,0 @@ |
-<?xml version='1.0'?> |
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" |
- "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" |
-[ ]> |
- |
-<chapter id="manual.ext.parallel_mode" xreflabel="Parallel Mode"> |
-<?dbhtml filename="parallel_mode.html"?> |
- |
-<chapterinfo> |
- <keywordset> |
- <keyword> |
- C++ |
- </keyword> |
- <keyword> |
- library |
- </keyword> |
- <keyword> |
- parallel |
- </keyword> |
- </keywordset> |
-</chapterinfo> |
- |
-<title>Parallel Mode</title> |
- |
-<para> The libstdc++ parallel mode is an experimental parallel |
-implementation of many algorithms of the C++ Standard Library. |
-</para> |
- |
-<para> |
-Several of the standard algorithms, for instance |
-<function>std::sort</function>, are made parallel using OpenMP |
-annotations. These parallel mode constructs can be invoked by |
-explicit source declaration or by compiling existing sources with a |
-specific compiler flag. |
-</para> |
- |
- |
-<sect1 id="manual.ext.parallel_mode.intro" xreflabel="Intro"> |
- <title>Intro</title> |
- |
-<para>The following library components in the header |
-<filename class="headerfile">numeric</filename> are included in the parallel mode:</para> |
-<itemizedlist> |
- <listitem><para><function>std::accumulate</function></para></listitem> |
- <listitem><para><function>std::adjacent_difference</function></para></listitem> |
- <listitem><para><function>std::inner_product</function></para></listitem> |
- <listitem><para><function>std::partial_sum</function></para></listitem> |
-</itemizedlist> |
- |
-<para>The following library components in the header |
-<filename class="headerfile">algorithm</filename> are included in the parallel mode:</para> |
-<itemizedlist> |
- <listitem><para><function>std::adjacent_find</function></para></listitem> |
- <listitem><para><function>std::count</function></para></listitem> |
- <listitem><para><function>std::count_if</function></para></listitem> |
- <listitem><para><function>std::equal</function></para></listitem> |
- <listitem><para><function>std::find</function></para></listitem> |
- <listitem><para><function>std::find_if</function></para></listitem> |
- <listitem><para><function>std::find_first_of</function></para></listitem> |
- <listitem><para><function>std::for_each</function></para></listitem> |
- <listitem><para><function>std::generate</function></para></listitem> |
- <listitem><para><function>std::generate_n</function></para></listitem> |
- <listitem><para><function>std::lexicographical_compare</function></para></listitem> |
- <listitem><para><function>std::mismatch</function></para></listitem> |
- <listitem><para><function>std::search</function></para></listitem> |
- <listitem><para><function>std::search_n</function></para></listitem> |
- <listitem><para><function>std::transform</function></para></listitem> |
- <listitem><para><function>std::replace</function></para></listitem> |
- <listitem><para><function>std::replace_if</function></para></listitem> |
- <listitem><para><function>std::max_element</function></para></listitem> |
- <listitem><para><function>std::merge</function></para></listitem> |
- <listitem><para><function>std::min_element</function></para></listitem> |
- <listitem><para><function>std::nth_element</function></para></listitem> |
- <listitem><para><function>std::partial_sort</function></para></listitem> |
- <listitem><para><function>std::partition</function></para></listitem> |
- <listitem><para><function>std::random_shuffle</function></para></listitem> |
- <listitem><para><function>std::set_union</function></para></listitem> |
- <listitem><para><function>std::set_intersection</function></para></listitem> |
- <listitem><para><function>std::set_symmetric_difference</function></para></listitem> |
- <listitem><para><function>std::set_difference</function></para></listitem> |
- <listitem><para><function>std::sort</function></para></listitem> |
- <listitem><para><function>std::stable_sort</function></para></listitem> |
- <listitem><para><function>std::unique_copy</function></para></listitem> |
-</itemizedlist> |
- |
-</sect1> |
- |
-<sect1 id="manual.ext.parallel_mode.semantics" xreflabel="Semantics"> |
- <title>Semantics</title> |
- |
-<para> The parallel mode STL algorithms are currently not exception-safe, |
-i.e. user-defined functors must not throw exceptions. |
-Also, the order of execution is not guaranteed for some functions. |
-Therefore, user-defined functors should not have any concurrent side effects. |
-</para> |
- |
-<para> Since the current GCC OpenMP implementation does not support |
-OpenMP parallel regions in concurrent threads, |
-it is not possible to call parallel STL algorithms in |
-concurrent threads, either. |
-It might work with other compilers, though.</para> |
- |
-</sect1> |
- |
-<sect1 id="manual.ext.parallel_mode.using" xreflabel="Using"> |
- <title>Using</title> |
- |
-<sect2 id="parallel_mode.using.prereq_flags" xreflabel="using.prereq_flags"> |
- <title>Prerequisite Compiler Flags</title> |
- |
-<para> |
- Any use of parallel functionality requires additional compiler |
- and runtime support, in particular support for OpenMP. Adding this support is |
- not difficult: just compile your application with the compiler |
- flag <literal>-fopenmp</literal>. This will link |
- in <code>libgomp</code>, the GNU |
- OpenMP <ulink url="http://gcc.gnu.org/onlinedocs/libgomp/">implementation</ulink>, |
- whose presence is mandatory. |
-</para> |
- |
-<para> |
-In addition, hardware that supports atomic operations and a compiler |
-  capable of producing atomic operations are mandatory: GCC defaults to no |
- support for atomic operations on some common hardware |
- architectures. Activating atomic operations may require explicit |
-  compiler flags on some targets (like SPARC and x86), such |
- as <literal>-march=i686</literal>, |
- <literal>-march=native</literal> or <literal>-mcpu=v9</literal>. See |
- the GCC manual for more information. |
-</para> |
- |
-</sect2> |
- |
-<sect2 id="parallel_mode.using.parallel_mode" xreflabel="using.parallel_mode"> |
- <title>Using Parallel Mode</title> |
- |
-<para> |
- To use the libstdc++ parallel mode, compile your application with |
- the prerequisite flags as detailed above, and in addition |
- add <constant>-D_GLIBCXX_PARALLEL</constant>. This will convert all |
- use of the standard (sequential) algorithms to the appropriate parallel |
- equivalents. Please note that this doesn't necessarily mean that |
- everything will end up being executed in a parallel manner, but |
- rather that the heuristics and settings coded into the parallel |
- versions will be used to determine if all, some, or no algorithms |
- will be executed using parallel variants. |
-</para> |
- |
-<para>Note that the <constant>_GLIBCXX_PARALLEL</constant> define may change the |
-  sizes and behavior of standard templates such as |
- <function>std::search</function>, and therefore one can only link code |
- compiled with parallel mode and code compiled without parallel mode |
- if no instantiation of a container is passed between the two |
- translation units. Parallel mode functionality has distinct linkage, |
- and cannot be confused with normal mode symbols. |
-</para> |
-</sect2> |
- |
-<sect2 id="parallel_mode.using.specific" xreflabel="using.specific"> |
- <title>Using Specific Parallel Components</title> |
- |
-<para>When it is not feasible to recompile your entire application, or |
- only specific algorithms need to be parallel-aware, individual |
- parallel algorithms can be made available explicitly. These |
- parallel algorithms are functionally equivalent to the standard |
- drop-in algorithms used in parallel mode, but they are available in |
- a separate namespace as GNU extensions and may be used in programs |
- compiled with either release mode or with parallel mode. |
-</para> |
- |
- |
-<para>An example of using a parallel version |
-of <function>std::sort</function>, but no other parallel algorithms, is: |
-</para> |
- |
-<programlisting> |
-#include <vector> |
-#include <parallel/algorithm> |
- |
-int main() |
-{ |
- std::vector<int> v(100); |
- |
- // ... |
- |
- // Explicitly force a call to parallel sort. |
- __gnu_parallel::sort(v.begin(), v.end()); |
- return 0; |
-} |
-</programlisting> |
- |
-<para> |
-Then compile this code with the prerequisite compiler flags |
-(<literal>-fopenmp</literal> and any necessary architecture-specific |
-flags for atomic operations). |
-</para> |
- |
-<para> The following table provides the names and headers of all the |
- parallel algorithms that can be used in a similar manner: |
-</para> |
- |
-<table frame='all'> |
-<title>Parallel Algorithms</title> |
-<tgroup cols='4' align='left' colsep='1' rowsep='1'> |
-<colspec colname='c1'></colspec> |
-<colspec colname='c2'></colspec> |
-<colspec colname='c3'></colspec> |
-<colspec colname='c4'></colspec> |
- |
-<thead> |
- <row> |
- <entry>Algorithm</entry> |
- <entry>Header</entry> |
- <entry>Parallel algorithm</entry> |
- <entry>Parallel header</entry> |
- </row> |
-</thead> |
- |
-<tbody> |
- <row> |
- <entry><function>std::accumulate</function></entry> |
- <entry><filename class="headerfile">numeric</filename></entry> |
- <entry><function>__gnu_parallel::accumulate</function></entry> |
- <entry><filename class="headerfile">parallel/numeric</filename></entry> |
- </row> |
- <row> |
- <entry><function>std::adjacent_difference</function></entry> |
- <entry><filename class="headerfile">numeric</filename></entry> |
- <entry><function>__gnu_parallel::adjacent_difference</function></entry> |
- <entry><filename class="headerfile">parallel/numeric</filename></entry> |
- </row> |
- <row> |
- <entry><function>std::inner_product</function></entry> |
- <entry><filename class="headerfile">numeric</filename></entry> |
- <entry><function>__gnu_parallel::inner_product</function></entry> |
- <entry><filename class="headerfile">parallel/numeric</filename></entry> |
- </row> |
- <row> |
- <entry><function>std::partial_sum</function></entry> |
- <entry><filename class="headerfile">numeric</filename></entry> |
- <entry><function>__gnu_parallel::partial_sum</function></entry> |
- <entry><filename class="headerfile">parallel/numeric</filename></entry> |
- </row> |
- <row> |
- <entry><function>std::adjacent_find</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::adjacent_find</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::count</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::count</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::count_if</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::count_if</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::equal</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::equal</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::find</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::find</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::find_if</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::find_if</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::find_first_of</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::find_first_of</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::for_each</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::for_each</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::generate</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::generate</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::generate_n</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::generate_n</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::lexicographical_compare</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::lexicographical_compare</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::mismatch</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::mismatch</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::search</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::search</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::search_n</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::search_n</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::transform</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::transform</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::replace</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::replace</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::replace_if</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::replace_if</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::max_element</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::max_element</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::merge</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::merge</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::min_element</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::min_element</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::nth_element</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::nth_element</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::partial_sort</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::partial_sort</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::partition</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::partition</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::random_shuffle</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::random_shuffle</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::set_union</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::set_union</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::set_intersection</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::set_intersection</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::set_symmetric_difference</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::set_symmetric_difference</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::set_difference</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::set_difference</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::sort</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::sort</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::stable_sort</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::stable_sort</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
- |
- <row> |
- <entry><function>std::unique_copy</function></entry> |
- <entry><filename class="headerfile">algorithm</filename></entry> |
- <entry><function>__gnu_parallel::unique_copy</function></entry> |
- <entry><filename class="headerfile">parallel/algorithm</filename></entry> |
- </row> |
-</tbody> |
-</tgroup> |
-</table> |
- |
-</sect2> |
- |
-</sect1> |
- |
-<sect1 id="manual.ext.parallel_mode.design" xreflabel="Design"> |
- <title>Design</title> |
- <para> |
- </para> |
-<sect2 id="manual.ext.parallel_mode.design.intro" xreflabel="Intro"> |
- <title>Interface Basics</title> |
- |
-<para> |
-All parallel algorithms are intended to have signatures that are |
-equivalent to the ISO C++ algorithms replaced. For instance, the |
-<function>std::adjacent_find</function> function is declared as: |
-</para> |
-<programlisting> |
-namespace std |
-{ |
- template<typename _FIter> |
- _FIter |
- adjacent_find(_FIter, _FIter); |
-} |
-</programlisting> |
- |
-<para> |
-This means that there should be something equivalent for the parallel |
-version. Indeed, this is the case: |
-</para> |
- |
-<programlisting> |
-namespace std |
-{ |
- namespace __parallel |
- { |
- template<typename _FIter> |
- _FIter |
- adjacent_find(_FIter, _FIter); |
- |
- ... |
- } |
-} |
-</programlisting> |
- |
-<para>But... why the ellipsis? |
-</para> |
- |
-<para> The ellipsis in the example above represents additional overloads |
-required for the parallel version of the function. These additional |
-overloads are used to dispatch calls from the ISO C++ function |
-signature to the appropriate parallel function (or sequential |
-function, if no parallel functions are deemed worthy), based on either |
-compile-time or run-time conditions. |
-</para> |
- |
-<para> The available signature options are specific to the different |
-algorithms/algorithm classes.</para> |
- |
-<para> The general view of overloads for the parallel algorithms looks like this: |
-</para> |
-<itemizedlist> |
- <listitem><para>ISO C++ signature</para></listitem> |
- <listitem><para>ISO C++ signature + sequential_tag argument</para></listitem> |
- <listitem><para>ISO C++ signature + algorithm-specific tag type |
- (several signatures)</para></listitem> |
-</itemizedlist> |
- |
-<para> Please note that the implementation may use additional functions |
-(designated with the <code>_switch</code> suffix) to dispatch from the |
-ISO C++ signature to the correct parallel version. Also, some of the |
-algorithms do not have support for run-time conditions, so the last |
-overload is missing for those algorithms. |
-</para> |
- |
- |
-</sect2> |
- |
-<sect2 id="manual.ext.parallel_mode.design.tuning" xreflabel="Tuning"> |
- <title>Configuration and Tuning</title> |
- |
- |
-<sect3 id="parallel_mode.design.tuning.omp" xreflabel="OpenMP Environment"> |
- <title>Setting up the OpenMP Environment</title> |
- |
-<para> |
-Several aspects of the overall runtime environment can be manipulated |
-by standard OpenMP function calls. |
-</para> |
- |
-<para> |
-To specify the number of threads to be used for the algorithms globally, |
-use the function <function>omp_set_num_threads</function>. An example: |
-</para> |
- |
-<programlisting> |
-#include <stdlib.h> |
-#include <omp.h> |
- |
-int main() |
-{ |
- // Explicitly set number of threads. |
- const int threads_wanted = 20; |
- omp_set_dynamic(false); |
- omp_set_num_threads(threads_wanted); |
- |
- // Call parallel mode algorithms. |
- |
- return 0; |
-} |
-</programlisting> |
- |
-<para> |
-  Some algorithms allow the number of threads to be set for a particular call, |
- by augmenting the algorithm variant. |
- See the next section for further information. |
-</para> |
- |
-<para> |
-Other parts of the runtime environment that can be manipulated include |
-nested parallelism (<function>omp_set_nested</function>), schedule kind |
-(<function>omp_set_schedule</function>), and others. See the OpenMP |
-documentation for more information. |
-</para> |
- |
-</sect3> |
- |
-<sect3 id="parallel_mode.design.tuning.compile" xreflabel="Compile Switches"> |
- <title>Compile Time Switches</title> |
- |
-<para> |
-To force an algorithm to execute sequentially, even though parallelism |
-is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>, |
-add <classname>__gnu_parallel::sequential_tag()</classname> to the end |
-of the algorithm's argument list. |
-</para> |
- |
-<para> |
-Like so: |
-</para> |
- |
-<programlisting> |
-std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag()); |
-</programlisting> |
- |
-<para> |
-Some parallel algorithm variants can be excluded from compilation by |
-preprocessor defines. See the doxygen documentation on |
-<code>compiletime_settings.h</code> and <code>features.h</code> for details. |
-</para> |
- |
-<para> |
-For some algorithms, the desired variant can be chosen at compile-time by |
-appending a tag object. The available options are specific to the particular |
-algorithm (class). |
-</para> |
- |
-<para> |
-For the "embarrassingly parallel" algorithms, there is only one "tag object |
-type", the enum _Parallelism. |
-It takes one of the following values: |
-<code>__gnu_parallel::parallel_tag</code>, |
-<code>__gnu_parallel::balanced_tag</code>, |
-<code>__gnu_parallel::unbalanced_tag</code>, |
-<code>__gnu_parallel::omp_loop_tag</code>, |
-<code>__gnu_parallel::omp_loop_static_tag</code>. |
-This means that the actual parallelization strategy is chosen at run-time. |
-(Choosing the variants at compile-time will come soon.) |
-</para> |
- |
-<para> |
-For the following algorithms in general, we have |
-<code>__gnu_parallel::parallel_tag</code> and |
-<code>__gnu_parallel::default_parallel_tag</code>, in addition to |
-<code>__gnu_parallel::sequential_tag</code>. |
-<code>__gnu_parallel::default_parallel_tag</code> chooses the default |
-algorithm at compile-time, as does omitting the tag. |
-<code>__gnu_parallel::parallel_tag</code> postpones the decision to run-time |
-(see next section). |
-For all tags, the number of threads desired for this call can optionally be |
-passed to the respective tag's constructor. |
-</para> |
- |
-<para> |
-The <code>multiway_merge</code> algorithm comes with two additional choices, |
-<code>__gnu_parallel::exact_tag</code> and |
-<code>__gnu_parallel::sampling_tag</code>. |
-Exact and sampling are the two available splitting strategies. |
-</para> |
- |
-<para> |
-For the <code>sort</code> and <code>stable_sort</code> algorithms, there are |
-several additional choices, namely |
-<code>__gnu_parallel::multiway_mergesort_tag</code>, |
-<code>__gnu_parallel::multiway_mergesort_exact_tag</code>, |
-<code>__gnu_parallel::multiway_mergesort_sampling_tag</code>, |
-<code>__gnu_parallel::quicksort_tag</code>, and |
-<code>__gnu_parallel::balanced_quicksort_tag</code>. |
-Multiway mergesort comes with the two splitting strategies for multi-way |
-merging. The quicksort options cannot be used for <code>stable_sort</code>. |
-</para> |
- |
-</sect3> |
- |
-<sect3 id="parallel_mode.design.tuning.settings" xreflabel="_Settings"> |
- <title>Run Time Settings and Defaults</title> |
- |
-<para> |
-The default parallelization strategy, the choice of specific algorithm |
-strategy, the minimum threshold limits for individual parallel |
-algorithms, and aspects of the underlying hardware can be specified as |
-desired via manipulation |
-of <classname>__gnu_parallel::_Settings</classname> member data. |
-</para> |
- |
-<para> |
-First off, the choice of parallelization strategy: serial, parallel, |
-or heuristically deduced. This corresponds |
-to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a |
-value of enum <type>__gnu_parallel::_AlgorithmStrategy</type> |
-type. Choices |
-include: <type>heuristic</type>, <type>force_sequential</type>, |
-and <type>force_parallel</type>. The default is <type>heuristic</type>. |
-</para> |
- |
- |
-<para> |
-Next, the sub-choices for algorithm variant, if not fixed at compile-time. |
-Specific algorithms like <function>find</function> or <function>sort</function> |
-can be implemented in multiple ways: when this is the case, |
-a <classname>__gnu_parallel::_Settings</classname> member exists to |
-pick the default strategy. For |
-example, <code>__gnu_parallel::_Settings::sort_algorithm</code> can |
-have any value of |
-enum <type>__gnu_parallel::_SortAlgorithm</type>: <type>MWMS</type>, <type>QS</type>, |
-or <type>QS_BALANCED</type>. |
-</para> |
- |
-<para> |
-Likewise for setting the minimal threshold for algorithm |
-parallelization. Parallelism always incurs some overhead. Thus, it is |
-not helpful to parallelize operations on very small sets of |
-data. Because of this, measures are taken to avoid parallelizing below |
-a certain, pre-determined threshold. For each algorithm, a minimum |
-problem size is encoded as a variable in the |
-active <classname>__gnu_parallel::_Settings</classname> object. This |
-threshold variable uses the following naming scheme: |
-<code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So, |
-for <function>fill</function>, the threshold variable |
-is <code>__gnu_parallel::_Settings::fill_minimal_n</code>. |
-</para> |
- |
-<para> |
-Finally, hardware details like L1/L2 cache size can be hardwired |
-via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends. |
-</para> |
- |
-<para> |
-</para> |
- |
-<para> |
-All these configuration variables can be changed by the user, if |
-desired. |
-There exists one global instance of the class <classname>_Settings</classname>, |
-i.e. it is a singleton. It can be read and written by calling |
-<code>__gnu_parallel::_Settings::get</code> and |
-<code>__gnu_parallel::_Settings::set</code>, respectively. |
-Please note that the first call returns a const object, so direct manipulation |
-is forbidden. |
-See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html"> |
- <filename class="headerfile">settings.h</filename></ulink> |
-for complete details. |
-</para> |
- |
-<para> |
-A small example of tuning the default: |
-</para> |
- |
-<programlisting> |
-#include <parallel/algorithm> |
-#include <parallel/settings.h> |
- |
-int main() |
-{ |
- __gnu_parallel::_Settings s; |
- s.algorithm_strategy = __gnu_parallel::force_parallel; |
- __gnu_parallel::_Settings::set(s); |
- |
- // Do work... all algorithms will be parallelized, always. |
- |
- return 0; |
-} |
-</programlisting> |
- |
-</sect3> |
- |
-</sect2> |
- |
-<sect2 id="manual.ext.parallel_mode.design.impl" xreflabel="Impl"> |
- <title>Implementation Namespaces</title> |
- |
-<para> One namespace contains versions of code that are always |
-explicitly sequential: |
-<code>__gnu_serial</code>. |
-</para> |
- |
-<para> Two namespaces contain the parallel mode: |
-<code>std::__parallel</code> and <code>__gnu_parallel</code>. |
-</para> |
- |
-<para> Parallel implementations of standard components, including |
-template helpers to select parallelism, are defined in <code>namespace |
-std::__parallel</code>. For instance, <function>std::transform</function> from <filename class="headerfile">algorithm</filename> has a parallel counterpart in |
-<function>std::__parallel::transform</function> from <filename class="headerfile">parallel/algorithm</filename>. In addition, these parallel |
-implementations are injected into <code>namespace |
-__gnu_parallel</code> with using declarations. |
-</para> |
- |
-<para> Support and general infrastructure are in <code>namespace |
-__gnu_parallel</code>. |
-</para> |
- |
-<para> More information, and an organized index of types and functions |
-related to the parallel mode on a per-namespace basis, can be found in |
-the generated source documentation. |
-</para> |
- |
-</sect2> |
- |
-</sect1> |
- |
-<sect1 id="manual.ext.parallel_mode.test" xreflabel="Testing"> |
- <title>Testing</title> |
- |
- <para> |
- Both the normal conformance and regression tests and the |
- supplemental performance tests work. |
- </para> |
- |
- <para> |
- To run the conformance and regression tests with the parallel mode |
- active, |
- </para> |
- |
- <screen> |
- <userinput>make check-parallel</userinput> |
- </screen> |
- |
- <para> |
- The log and summary files for conformance testing are in the |
- <filename class="directory">testsuite/parallel</filename> directory. |
- </para> |
- |
- <para> |
- To run the performance tests with the parallel mode active, |
- </para> |
- |
- <screen> |
- <userinput>make check-performance-parallel</userinput> |
- </screen> |
- |
- <para> |
-    The results for performance testing are in the |
- <filename class="directory">testsuite</filename> directory, in the file |
- <filename>libstdc++_performance.sum</filename>. In addition, the |
- policy-based containers have their own visualizations, which have |
-    more software dependencies than the usual bare-bones text |
- file, and can be generated by using the <code>make |
- doc-performance</code> rule in the testsuite's Makefile. |
-</para> |
-</sect1> |
- |
-<bibliography id="parallel_mode.biblio" xreflabel="parallel_mode.biblio"> |
-<title>Bibliography</title> |
- |
- <biblioentry> |
- <title> |
- Parallelization of Bulk Operations for STL Dictionaries |
- </title> |
- |
- <author> |
- <firstname>Johannes</firstname> |
- <surname>Singler</surname> |
- </author> |
- <author> |
- <firstname>Leonor</firstname> |
- <surname>Frias</surname> |
- </author> |
- |
- <copyright> |
- <year>2007</year> |
- <holder></holder> |
- </copyright> |
- |
- <publisher> |
- <publishername> |
- Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS) |
- </publishername> |
- </publisher> |
- </biblioentry> |
- |
- <biblioentry> |
- <title> |
- The Multi-Core Standard Template Library |
- </title> |
- |
- <author> |
- <firstname>Johannes</firstname> |
- <surname>Singler</surname> |
- </author> |
- <author> |
- <firstname>Peter</firstname> |
- <surname>Sanders</surname> |
- </author> |
- <author> |
- <firstname>Felix</firstname> |
- <surname>Putze</surname> |
- </author> |
- |
- <copyright> |
- <year>2007</year> |
- <holder></holder> |
- </copyright> |
- |
- <publisher> |
- <publishername> |
- Euro-Par 2007: Parallel Processing. (LNCS 4641) |
- </publishername> |
- </publisher> |
- </biblioentry> |
- |
-</bibliography> |
- |
-</chapter> |