| Index: gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml
|
| diff --git a/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml b/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml
|
| deleted file mode 100644
|
| index 7cb2a05986f86ea2aa9c9a95c0f5df42bbca3f2c..0000000000000000000000000000000000000000
|
| --- a/gcc/libstdc++-v3/doc/xml/manual/parallel_mode.xml
|
| +++ /dev/null
|
| @@ -1,896 +0,0 @@
|
| -<?xml version='1.0'?>
|
| -<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
|
| - "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
|
| -[ ]>
|
| -
|
| -<chapter id="manual.ext.parallel_mode" xreflabel="Parallel Mode">
|
| -<?dbhtml filename="parallel_mode.html"?>
|
| -
|
| -<chapterinfo>
|
| - <keywordset>
|
| - <keyword>
|
| - C++
|
| - </keyword>
|
| - <keyword>
|
| - library
|
| - </keyword>
|
| - <keyword>
|
| - parallel
|
| - </keyword>
|
| - </keywordset>
|
| -</chapterinfo>
|
| -
|
| -<title>Parallel Mode</title>
|
| -
|
| -<para> The libstdc++ parallel mode is an experimental parallel
|
| -implementation of many algorithms the C++ Standard Library.
|
| -</para>
|
| -
|
| -<para>
|
| -Several of the standard algorithms, for instance
|
| -<function>std::sort</function>, are made parallel using OpenMP
|
| -annotations. These parallel mode constructs and can be invoked by
|
| -explicit source declaration or by compiling existing sources with a
|
| -specific compiler flag.
|
| -</para>
|
| -
|
| -
|
| -<sect1 id="manual.ext.parallel_mode.intro" xreflabel="Intro">
|
| - <title>Intro</title>
|
| -
|
| -<para>The following library components in the include
|
| -<filename class="headerfile">numeric</filename> are included in the parallel mode:</para>
|
| -<itemizedlist>
|
| - <listitem><para><function>std::accumulate</function></para></listitem>
|
| - <listitem><para><function>std::adjacent_difference</function></para></listitem>
|
| - <listitem><para><function>std::inner_product</function></para></listitem>
|
| - <listitem><para><function>std::partial_sum</function></para></listitem>
|
| -</itemizedlist>
|
| -
|
| -<para>The following library components in the include
|
| -<filename class="headerfile">algorithm</filename> are included in the parallel mode:</para>
|
| -<itemizedlist>
|
| - <listitem><para><function>std::adjacent_find</function></para></listitem>
|
| - <listitem><para><function>std::count</function></para></listitem>
|
| - <listitem><para><function>std::count_if</function></para></listitem>
|
| - <listitem><para><function>std::equal</function></para></listitem>
|
| - <listitem><para><function>std::find</function></para></listitem>
|
| - <listitem><para><function>std::find_if</function></para></listitem>
|
| - <listitem><para><function>std::find_first_of</function></para></listitem>
|
| - <listitem><para><function>std::for_each</function></para></listitem>
|
| - <listitem><para><function>std::generate</function></para></listitem>
|
| - <listitem><para><function>std::generate_n</function></para></listitem>
|
| - <listitem><para><function>std::lexicographical_compare</function></para></listitem>
|
| - <listitem><para><function>std::mismatch</function></para></listitem>
|
| - <listitem><para><function>std::search</function></para></listitem>
|
| - <listitem><para><function>std::search_n</function></para></listitem>
|
| - <listitem><para><function>std::transform</function></para></listitem>
|
| - <listitem><para><function>std::replace</function></para></listitem>
|
| - <listitem><para><function>std::replace_if</function></para></listitem>
|
| - <listitem><para><function>std::max_element</function></para></listitem>
|
| - <listitem><para><function>std::merge</function></para></listitem>
|
| - <listitem><para><function>std::min_element</function></para></listitem>
|
| - <listitem><para><function>std::nth_element</function></para></listitem>
|
| - <listitem><para><function>std::partial_sort</function></para></listitem>
|
| - <listitem><para><function>std::partition</function></para></listitem>
|
| - <listitem><para><function>std::random_shuffle</function></para></listitem>
|
| - <listitem><para><function>std::set_union</function></para></listitem>
|
| - <listitem><para><function>std::set_intersection</function></para></listitem>
|
| - <listitem><para><function>std::set_symmetric_difference</function></para></listitem>
|
| - <listitem><para><function>std::set_difference</function></para></listitem>
|
| - <listitem><para><function>std::sort</function></para></listitem>
|
| - <listitem><para><function>std::stable_sort</function></para></listitem>
|
| - <listitem><para><function>std::unique_copy</function></para></listitem>
|
| -</itemizedlist>
|
| -
|
| -</sect1>
|
| -
|
| -<sect1 id="manual.ext.parallel_mode.semantics" xreflabel="Semantics">
|
| - <title>Semantics</title>
|
| -
|
| -<para> The parallel mode STL algorithms are currently not exception-safe,
|
| -i.e. user-defined functors must not throw exceptions.
|
| -Also, the order of execution is not guaranteed for some functions, of course.
|
| -Therefore, user-defined functors should not have any concurrent side effects.
|
| -</para>
|
| -
|
| -<para> Since the current GCC OpenMP implementation does not support
|
| -OpenMP parallel regions in concurrent threads,
|
| -it is not possible to call parallel STL algorithm in
|
| -concurrent threads, either.
|
| -It might work with other compilers, though.</para>
|
| -
|
| -</sect1>
|
| -
|
| -<sect1 id="manual.ext.parallel_mode.using" xreflabel="Using">
|
| - <title>Using</title>
|
| -
|
| -<sect2 id="parallel_mode.using.prereq_flags" xreflabel="using.prereq_flags">
|
| - <title>Prerequisite Compiler Flags</title>
|
| -
|
| -<para>
|
| - Any use of parallel functionality requires additional compiler
|
| - and runtime support, in particular support for OpenMP. Adding this support is
|
| - not difficult: just compile your application with the compiler
|
| - flag <literal>-fopenmp</literal>. This will link
|
| - in <code>libgomp</code>, the GNU
|
| - OpenMP <ulink url="http://gcc.gnu.org/onlinedocs/libgomp/">implementation</ulink>,
|
| - whose presence is mandatory.
|
| -</para>
|
| -
|
| -<para>
|
| -In addition, hardware that supports atomic operations and a compiler
|
| - capable of producing atomic operations is mandatory: GCC defaults to no
|
| - support for atomic operations on some common hardware
|
| - architectures. Activating atomic operations may require explicit
|
| - compiler flags on some targets (like sparc and x86), such
|
| - as <literal>-march=i686</literal>,
|
| - <literal>-march=native</literal> or <literal>-mcpu=v9</literal>. See
|
| - the GCC manual for more information.
|
| -</para>
|
| -
|
| -</sect2>
|
| -
|
| -<sect2 id="parallel_mode.using.parallel_mode" xreflabel="using.parallel_mode">
|
| - <title>Using Parallel Mode</title>
|
| -
|
| -<para>
|
| - To use the libstdc++ parallel mode, compile your application with
|
| - the prerequisite flags as detailed above, and in addition
|
| - add <constant>-D_GLIBCXX_PARALLEL</constant>. This will convert all
|
| - use of the standard (sequential) algorithms to the appropriate parallel
|
| - equivalents. Please note that this doesn't necessarily mean that
|
| - everything will end up being executed in a parallel manner, but
|
| - rather that the heuristics and settings coded into the parallel
|
| - versions will be used to determine if all, some, or no algorithms
|
| - will be executed using parallel variants.
|
| -</para>
|
| -
|
| -<para>Note that the <constant>_GLIBCXX_PARALLEL</constant> define may change the
|
| - sizes and behavior of standard class templates such as
|
| - <function>std::search</function>, and therefore one can only link code
|
| - compiled with parallel mode and code compiled without parallel mode
|
| - if no instantiation of a container is passed between the two
|
| - translation units. Parallel mode functionality has distinct linkage,
|
| - and cannot be confused with normal mode symbols.
|
| -</para>
|
| -</sect2>
|
| -
|
| -<sect2 id="parallel_mode.using.specific" xreflabel="using.specific">
|
| - <title>Using Specific Parallel Components</title>
|
| -
|
| -<para>When it is not feasible to recompile your entire application, or
|
| - only specific algorithms need to be parallel-aware, individual
|
| - parallel algorithms can be made available explicitly. These
|
| - parallel algorithms are functionally equivalent to the standard
|
| - drop-in algorithms used in parallel mode, but they are available in
|
| - a separate namespace as GNU extensions and may be used in programs
|
| - compiled with either release mode or with parallel mode.
|
| -</para>
|
| -
|
| -
|
| -<para>An example of using a parallel version
|
| -of <function>std::sort</function>, but no other parallel algorithms, is:
|
| -</para>
|
| -
|
| -<programlisting>
|
| -#include <vector>
|
| -#include <parallel/algorithm>
|
| -
|
| -int main()
|
| -{
|
| - std::vector<int> v(100);
|
| -
|
| - // ...
|
| -
|
| - // Explicitly force a call to parallel sort.
|
| - __gnu_parallel::sort(v.begin(), v.end());
|
| - return 0;
|
| -}
|
| -</programlisting>
|
| -
|
| -<para>
|
| -Then compile this code with the prerequisite compiler flags
|
| -(<literal>-fopenmp</literal> and any necessary architecture-specific
|
| -flags for atomic operations.)
|
| -</para>
|
| -
|
| -<para> The following table provides the names and headers of all the
|
| - parallel algorithms that can be used in a similar manner:
|
| -</para>
|
| -
|
| -<table frame='all'>
|
| -<title>Parallel Algorithms</title>
|
| -<tgroup cols='4' align='left' colsep='1' rowsep='1'>
|
| -<colspec colname='c1'></colspec>
|
| -<colspec colname='c2'></colspec>
|
| -<colspec colname='c3'></colspec>
|
| -<colspec colname='c4'></colspec>
|
| -
|
| -<thead>
|
| - <row>
|
| - <entry>Algorithm</entry>
|
| - <entry>Header</entry>
|
| - <entry>Parallel algorithm</entry>
|
| - <entry>Parallel header</entry>
|
| - </row>
|
| -</thead>
|
| -
|
| -<tbody>
|
| - <row>
|
| - <entry><function>std::accumulate</function></entry>
|
| - <entry><filename class="headerfile">numeric</filename></entry>
|
| - <entry><function>__gnu_parallel::accumulate</function></entry>
|
| - <entry><filename class="headerfile">parallel/numeric</filename></entry>
|
| - </row>
|
| - <row>
|
| - <entry><function>std::adjacent_difference</function></entry>
|
| - <entry><filename class="headerfile">numeric</filename></entry>
|
| - <entry><function>__gnu_parallel::adjacent_difference</function></entry>
|
| - <entry><filename class="headerfile">parallel/numeric</filename></entry>
|
| - </row>
|
| - <row>
|
| - <entry><function>std::inner_product</function></entry>
|
| - <entry><filename class="headerfile">numeric</filename></entry>
|
| - <entry><function>__gnu_parallel::inner_product</function></entry>
|
| - <entry><filename class="headerfile">parallel/numeric</filename></entry>
|
| - </row>
|
| - <row>
|
| - <entry><function>std::partial_sum</function></entry>
|
| - <entry><filename class="headerfile">numeric</filename></entry>
|
| - <entry><function>__gnu_parallel::partial_sum</function></entry>
|
| - <entry><filename class="headerfile">parallel/numeric</filename></entry>
|
| - </row>
|
| - <row>
|
| - <entry><function>std::adjacent_find</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::adjacent_find</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::count</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::count</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::count_if</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::count_if</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::equal</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::equal</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::find</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::find</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::find_if</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::find_if</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::find_first_of</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::find_first_of</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::for_each</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::for_each</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::generate</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::generate</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::generate_n</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::generate_n</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::lexicographical_compare</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::lexicographical_compare</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::mismatch</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::mismatch</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::search</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::search</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::search_n</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::search_n</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::transform</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::transform</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::replace</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::replace</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::replace_if</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::replace_if</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::max_element</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::max_element</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::merge</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::merge</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::min_element</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::min_element</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::nth_element</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::nth_element</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::partial_sort</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::partial_sort</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::partition</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::partition</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::random_shuffle</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::random_shuffle</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::set_union</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::set_union</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::set_intersection</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::set_intersection</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::set_symmetric_difference</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::set_symmetric_difference</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::set_difference</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::set_difference</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::sort</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::sort</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::stable_sort</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::stable_sort</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -
|
| - <row>
|
| - <entry><function>std::unique_copy</function></entry>
|
| - <entry><filename class="headerfile">algorithm</filename></entry>
|
| - <entry><function>__gnu_parallel::unique_copy</function></entry>
|
| - <entry><filename class="headerfile">parallel/algorithm</filename></entry>
|
| - </row>
|
| -</tbody>
|
| -</tgroup>
|
| -</table>
|
| -
|
| -</sect2>
|
| -
|
| -</sect1>
|
| -
|
| -<sect1 id="manual.ext.parallel_mode.design" xreflabel="Design">
|
| - <title>Design</title>
|
| - <para>
|
| - </para>
|
| -<sect2 id="manual.ext.parallel_mode.design.intro" xreflabel="Intro">
|
| - <title>Interface Basics</title>
|
| -
|
| -<para>
|
| -All parallel algorithms are intended to have signatures that are
|
| -equivalent to the ISO C++ algorithms replaced. For instance, the
|
| -<function>std::adjacent_find</function> function is declared as:
|
| -</para>
|
| -<programlisting>
|
| -namespace std
|
| -{
|
| - template<typename _FIter>
|
| - _FIter
|
| - adjacent_find(_FIter, _FIter);
|
| -}
|
| -</programlisting>
|
| -
|
| -<para>
|
| -Which means that there should be something equivalent for the parallel
|
| -version. Indeed, this is the case:
|
| -</para>
|
| -
|
| -<programlisting>
|
| -namespace std
|
| -{
|
| - namespace __parallel
|
| - {
|
| - template<typename _FIter>
|
| - _FIter
|
| - adjacent_find(_FIter, _FIter);
|
| -
|
| - ...
|
| - }
|
| -}
|
| -</programlisting>
|
| -
|
| -<para>But.... why the ellipses?
|
| -</para>
|
| -
|
| -<para> The ellipses in the example above represent additional overloads
|
| -required for the parallel version of the function. These additional
|
| -overloads are used to dispatch calls from the ISO C++ function
|
| -signature to the appropriate parallel function (or sequential
|
| -function, if no parallel functions are deemed worthy), based on either
|
| -compile-time or run-time conditions.
|
| -</para>
|
| -
|
| -<para> The available signature options are specific for the different
|
| -algorithms/algorithm classes.</para>
|
| -
|
| -<para> The general view of overloads for the parallel algorithms look like this:
|
| -</para>
|
| -<itemizedlist>
|
| - <listitem><para>ISO C++ signature</para></listitem>
|
| - <listitem><para>ISO C++ signature + sequential_tag argument</para></listitem>
|
| - <listitem><para>ISO C++ signature + algorithm-specific tag type
|
| - (several signatures)</para></listitem>
|
| -</itemizedlist>
|
| -
|
| -<para> Please note that the implementation may use additional functions
|
| -(designated with the <code>_switch</code> suffix) to dispatch from the
|
| -ISO C++ signature to the correct parallel version. Also, some of the
|
| -algorithms do not have support for run-time conditions, so the last
|
| -overload is therefore missing.
|
| -</para>
|
| -
|
| -
|
| -</sect2>
|
| -
|
| -<sect2 id="manual.ext.parallel_mode.design.tuning" xreflabel="Tuning">
|
| - <title>Configuration and Tuning</title>
|
| -
|
| -
|
| -<sect3 id="parallel_mode.design.tuning.omp" xreflabel="OpenMP Environment">
|
| - <title>Setting up the OpenMP Environment</title>
|
| -
|
| -<para>
|
| -Several aspects of the overall runtime environment can be manipulated
|
| -by standard OpenMP function calls.
|
| -</para>
|
| -
|
| -<para>
|
| -To specify the number of threads to be used for the algorithms globally,
|
| -use the function <function>omp_set_num_threads</function>. An example:
|
| -</para>
|
| -
|
| -<programlisting>
|
| -#include <stdlib.h>
|
| -#include <omp.h>
|
| -
|
| -int main()
|
| -{
|
| - // Explicitly set number of threads.
|
| - const int threads_wanted = 20;
|
| - omp_set_dynamic(false);
|
| - omp_set_num_threads(threads_wanted);
|
| -
|
| - // Call parallel mode algorithms.
|
| -
|
| - return 0;
|
| -}
|
| -</programlisting>
|
| -
|
| -<para>
|
| - Some algorithms allow the number of threads being set for a particular call,
|
| - by augmenting the algorithm variant.
|
| - See the next section for further information.
|
| -</para>
|
| -
|
| -<para>
|
| -Other parts of the runtime environment able to be manipulated include
|
| -nested parallelism (<function>omp_set_nested</function>), schedule kind
|
| -(<function>omp_set_schedule</function>), and others. See the OpenMP
|
| -documentation for more information.
|
| -</para>
|
| -
|
| -</sect3>
|
| -
|
| -<sect3 id="parallel_mode.design.tuning.compile" xreflabel="Compile Switches">
|
| - <title>Compile Time Switches</title>
|
| -
|
| -<para>
|
| -To force an algorithm to execute sequentially, even though parallelism
|
| -is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>,
|
| -add <classname>__gnu_parallel::sequential_tag()</classname> to the end
|
| -of the algorithm's argument list.
|
| -</para>
|
| -
|
| -<para>
|
| -Like so:
|
| -</para>
|
| -
|
| -<programlisting>
|
| -std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());
|
| -</programlisting>
|
| -
|
| -<para>
|
| -Some parallel algorithm variants can be excluded from compilation by
|
| -preprocessor defines. See the doxygen documentation on
|
| -<code>compiletime_settings.h</code> and <code>features.h</code> for details.
|
| -</para>
|
| -
|
| -<para>
|
| -For some algorithms, the desired variant can be chosen at compile-time by
|
| -appending a tag object. The available options are specific to the particular
|
| -algorithm (class).
|
| -</para>
|
| -
|
| -<para>
|
| -For the "embarrassingly parallel" algorithms, there is only one "tag object
|
| -type", the enum _Parallelism.
|
| -It takes one of the following values,
|
| -<code>__gnu_parallel::parallel_tag</code>,
|
| -<code>__gnu_parallel::balanced_tag</code>,
|
| -<code>__gnu_parallel::unbalanced_tag</code>,
|
| -<code>__gnu_parallel::omp_loop_tag</code>,
|
| -<code>__gnu_parallel::omp_loop_static_tag</code>.
|
| -This means that the actual parallelization strategy is chosen at run-time.
|
| -(Choosing the variants at compile-time will come soon.)
|
| -</para>
|
| -
|
| -<para>
|
| -For the following algorithms in general, we have
|
| -<code>__gnu_parallel::parallel_tag</code> and
|
| -<code>__gnu_parallel::default_parallel_tag</code>, in addition to
|
| -<code>__gnu_parallel::sequential_tag</code>.
|
| -<code>__gnu_parallel::default_parallel_tag</code> chooses the default
|
| -algorithm at compiletime, as does omitting the tag.
|
| -<code>__gnu_parallel::parallel_tag</code> postpones the decision to runtime
|
| -(see next section).
|
| -For all tags, the number of threads desired for this call can optionally be
|
| -passed to the respective tag's constructor.
|
| -</para>
|
| -
|
| -<para>
|
| -The <code>multiway_merge</code> algorithm comes with the additional choices,
|
| -<code>__gnu_parallel::exact_tag</code> and
|
| -<code>__gnu_parallel::sampling_tag</code>.
|
| -Exact and sampling are the two available splitting strategies.
|
| -</para>
|
| -
|
| -<para>
|
| -For the <code>sort</code> and <code>stable_sort</code> algorithms, there are
|
| -several additional choices, namely
|
| -<code>__gnu_parallel::multiway_mergesort_tag</code>,
|
| -<code>__gnu_parallel::multiway_mergesort_exact_tag</code>,
|
| -<code>__gnu_parallel::multiway_mergesort_sampling_tag</code>,
|
| -<code>__gnu_parallel::quicksort_tag</code>, and
|
| -<code>__gnu_parallel::balanced_quicksort_tag</code>.
|
| -Multiway mergesort comes with the two splitting strategies for multi-way
|
| -merging. The quicksort options cannot be used for <code>stable_sort</code>.
|
| -</para>
|
| -
|
| -</sect3>
|
| -
|
| -<sect3 id="parallel_mode.design.tuning.settings" xreflabel="_Settings">
|
| - <title>Run Time Settings and Defaults</title>
|
| -
|
| -<para>
|
| -The default parallelization strategy, the choice of specific algorithm
|
| -strategy, the minimum threshold limits for individual parallel
|
| -algorithms, and aspects of the underlying hardware can be specified as
|
| -desired via manipulation
|
| -of <classname>__gnu_parallel::_Settings</classname> member data.
|
| -</para>
|
| -
|
| -<para>
|
| -First off, the choice of parallelization strategy: serial, parallel,
|
| -or heuristically deduced. This corresponds
|
| -to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a
|
| -value of enum <type>__gnu_parallel::_AlgorithmStrategy</type>
|
| -type. Choices
|
| -include: <type>heuristic</type>, <type>force_sequential</type>,
|
| -and <type>force_parallel</type>. The default is <type>heuristic</type>.
|
| -</para>
|
| -
|
| -
|
| -<para>
|
| -Next, the sub-choices for algorithm variant, if not fixed at compile-time.
|
| -Specific algorithms like <function>find</function> or <function>sort</function>
|
| -can be implemented in multiple ways: when this is the case,
|
| -a <classname>__gnu_parallel::_Settings</classname> member exists to
|
| -pick the default strategy. For
|
| -example, <code>__gnu_parallel::_Settings::sort_algorithm</code> can
|
| -have any values of
|
| -enum <type>__gnu_parallel::_SortAlgorithm</type>: <type>MWMS</type>, <type>QS</type>,
|
| -or <type>QS_BALANCED</type>.
|
| -</para>
|
| -
|
| -<para>
|
| -Likewise for setting the minimal threshold for algorithm
|
| -parallelization. Parallelism always incurs some overhead. Thus, it is
|
| -not helpful to parallelize operations on very small sets of
|
| -data. Because of this, measures are taken to avoid parallelizing below
|
| -a certain, pre-determined threshold. For each algorithm, a minimum
|
| -problem size is encoded as a variable in the
|
| -active <classname>__gnu_parallel::_Settings</classname> object. This
|
| -threshold variable follows the following naming scheme:
|
| -<code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So,
|
| -for <function>fill</function>, the threshold variable
|
| -is <code>__gnu_parallel::_Settings::fill_minimal_n</code>,
|
| -</para>
|
| -
|
| -<para>
|
| -Finally, hardware details like L1/L2 cache size can be hardwired
|
| -via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends.
|
| -</para>
|
| -
|
| -<para>
|
| -</para>
|
| -
|
| -<para>
|
| -All these configuration variables can be changed by the user, if
|
| -desired.
|
| -There exists one global instance of the class <classname>_Settings</classname>,
|
| -i. e. it is a singleton. It can be read and written by calling
|
| -<code>__gnu_parallel::_Settings::get</code> and
|
| -<code>__gnu_parallel::_Settings::set</code>, respectively.
|
| -Please note that the first call return a const object, so direct manipulation
|
| -is forbidden.
|
| -See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html">
|
| - <filename class="headerfile">settings.h</filename></ulink>
|
| -for complete details.
|
| -</para>
|
| -
|
| -<para>
|
| -A small example of tuning the default:
|
| -</para>
|
| -
|
| -<programlisting>
|
| -#include <parallel/algorithm>
|
| -#include <parallel/settings.h>
|
| -
|
| -int main()
|
| -{
|
| - __gnu_parallel::_Settings s;
|
| - s.algorithm_strategy = __gnu_parallel::force_parallel;
|
| - __gnu_parallel::_Settings::set(s);
|
| -
|
| - // Do work... all algorithms will be parallelized, always.
|
| -
|
| - return 0;
|
| -}
|
| -</programlisting>
|
| -
|
| -</sect3>
|
| -
|
| -</sect2>
|
| -
|
| -<sect2 id="manual.ext.parallel_mode.design.impl" xreflabel="Impl">
|
| - <title>Implementation Namespaces</title>
|
| -
|
| -<para> One namespace contain versions of code that are always
|
| -explicitly sequential:
|
| -<code>__gnu_serial</code>.
|
| -</para>
|
| -
|
| -<para> Two namespaces contain the parallel mode:
|
| -<code>std::__parallel</code> and <code>__gnu_parallel</code>.
|
| -</para>
|
| -
|
| -<para> Parallel implementations of standard components, including
|
| -template helpers to select parallelism, are defined in <code>namespace
|
| -std::__parallel</code>. For instance, <function>std::transform</function> from <filename class="headerfile">algorithm</filename> has a parallel counterpart in
|
| -<function>std::__parallel::transform</function> from <filename class="headerfile">parallel/algorithm</filename>. In addition, these parallel
|
| -implementations are injected into <code>namespace
|
| -__gnu_parallel</code> with using declarations.
|
| -</para>
|
| -
|
| -<para> Support and general infrastructure is in <code>namespace
|
| -__gnu_parallel</code>.
|
| -</para>
|
| -
|
| -<para> More information, and an organized index of types and functions
|
| -related to the parallel mode on a per-namespace basis, can be found in
|
| -the generated source documentation.
|
| -</para>
|
| -
|
| -</sect2>
|
| -
|
| -</sect1>
|
| -
|
| -<sect1 id="manual.ext.parallel_mode.test" xreflabel="Testing">
|
| - <title>Testing</title>
|
| -
|
| - <para>
|
| - Both the normal conformance and regression tests and the
|
| - supplemental performance tests work.
|
| - </para>
|
| -
|
| - <para>
|
| - To run the conformance and regression tests with the parallel mode
|
| - active,
|
| - </para>
|
| -
|
| - <screen>
|
| - <userinput>make check-parallel</userinput>
|
| - </screen>
|
| -
|
| - <para>
|
| - The log and summary files for conformance testing are in the
|
| - <filename class="directory">testsuite/parallel</filename> directory.
|
| - </para>
|
| -
|
| - <para>
|
| - To run the performance tests with the parallel mode active,
|
| - </para>
|
| -
|
| - <screen>
|
| - <userinput>make check-performance-parallel</userinput>
|
| - </screen>
|
| -
|
| - <para>
|
| - The result file for performance testing are in the
|
| - <filename class="directory">testsuite</filename> directory, in the file
|
| - <filename>libstdc++_performance.sum</filename>. In addition, the
|
| - policy-based containers have their own visualizations, which have
|
| - additional software dependencies than the usual bare-boned text
|
| - file, and can be generated by using the <code>make
|
| - doc-performance</code> rule in the testsuite's Makefile.
|
| -</para>
|
| -</sect1>
|
| -
|
| -<bibliography id="parallel_mode.biblio" xreflabel="parallel_mode.biblio">
|
| -<title>Bibliography</title>
|
| -
|
| - <biblioentry>
|
| - <title>
|
| - Parallelization of Bulk Operations for STL Dictionaries
|
| - </title>
|
| -
|
| - <author>
|
| - <firstname>Johannes</firstname>
|
| - <surname>Singler</surname>
|
| - </author>
|
| - <author>
|
| - <firstname>Leonor</firstname>
|
| - <surname>Frias</surname>
|
| - </author>
|
| -
|
| - <copyright>
|
| - <year>2007</year>
|
| - <holder></holder>
|
| - </copyright>
|
| -
|
| - <publisher>
|
| - <publishername>
|
| - Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS)
|
| - </publishername>
|
| - </publisher>
|
| - </biblioentry>
|
| -
|
| - <biblioentry>
|
| - <title>
|
| - The Multi-Core Standard Template Library
|
| - </title>
|
| -
|
| - <author>
|
| - <firstname>Johannes</firstname>
|
| - <surname>Singler</surname>
|
| - </author>
|
| - <author>
|
| - <firstname>Peter</firstname>
|
| - <surname>Sanders</surname>
|
| - </author>
|
| - <author>
|
| - <firstname>Felix</firstname>
|
| - <surname>Putze</surname>
|
| - </author>
|
| -
|
| - <copyright>
|
| - <year>2007</year>
|
| - <holder></holder>
|
| - </copyright>
|
| -
|
| - <publisher>
|
| - <publishername>
|
| - Euro-Par 2007: Parallel Processing. (LNCS 4641)
|
| - </publishername>
|
| - </publisher>
|
| - </biblioentry>
|
| -
|
| -</bibliography>
|
| -
|
| -</chapter>
|
|
|