Commit e491ed09, authored and committed by Johannes Singler

parallel_mode.xml: General revision and documentation of new compile-time options for sorting.

2008-05-15  Johannes Singler  <singler@ira.uka.de>

           * doc/xml/manual/parallel_mode.xml:
           General revision and documentation of new compile-time 
           options for sorting.

From-SVN: r135327
parent 22ac021b
2008-05-15 Johannes Singler <singler@ira.uka.de>
* xml/manual/parallel_mode.xml:
General revision and documentation of new compile-time
options for sorting.
2008-05-14 Benjamin Kosnik <bkoz@redhat.com> 2008-05-14 Benjamin Kosnik <bkoz@redhat.com>
* include/std/mutex (mutex::try_lock): Eat errors. * include/std/mutex (mutex::try_lock): Eat errors.
......
...@@ -90,6 +90,8 @@ specific compiler flag. ...@@ -90,6 +90,8 @@ specific compiler flag.
<para> The parallel mode STL algorithms are currently not exception-safe, <para> The parallel mode STL algorithms are currently not exception-safe,
i.e. user-defined functors must not throw exceptions. i.e. user-defined functors must not throw exceptions.
Also, the order of execution is not guaranteed for some functions, of course.
Therefore, user-defined functors should not have any concurrent side effects.
</para> </para>
<para> Since the current GCC OpenMP implementation does not support <para> Since the current GCC OpenMP implementation does not support
...@@ -459,34 +461,16 @@ function, if no parallel functions are deemed worthy), based on either ...@@ -459,34 +461,16 @@ function, if no parallel functions are deemed worthy), based on either
compile-time or run-time conditions. compile-time or run-time conditions.
</para> </para>
<para> Compile-time conditions are referred to as "embarrassingly <para> The available signature options are specific for the different
parallel," and are denoted with the appropriate dispatch object, i.e., algorithms/algorithm classes.</para>
one of <code>__gnu_parallel::sequential_tag</code>,
<code>__gnu_parallel::parallel_tag</code>,
<code>__gnu_parallel::balanced_tag</code>,
<code>__gnu_parallel::unbalanced_tag</code>,
<code>__gnu_parallel::omp_loop_tag</code>, or
<code>__gnu_parallel::omp_loop_static_tag</code>.
</para>
<para> Run-time conditions depend on the hardware being used, the number
of threads available, etc., and are denoted by the use of the enum
<code>__gnu_parallel::parallelism</code>. Values of this enum include
<code>__gnu_parallel::sequential</code>,
<code>__gnu_parallel::parallel_unbalanced</code>,
<code>__gnu_parallel::parallel_balanced</code>,
<code>__gnu_parallel::parallel_omp_loop</code>,
<code>__gnu_parallel::parallel_omp_loop_static</code>, or
<code>__gnu_parallel::parallel_taskqueue</code>.
</para>
<para> Putting all this together, the general view of overloads for the <para> The general view of overloads for the parallel algorithms look like this:
parallel algorithms look like this:
</para> </para>
<itemizedlist> <itemizedlist>
<listitem><para>ISO C++ signature</para></listitem> <listitem><para>ISO C++ signature</para></listitem>
<listitem><para>ISO C++ signature + sequential_tag argument</para></listitem> <listitem><para>ISO C++ signature + sequential_tag argument</para></listitem>
<listitem><para>ISO C++ signature + parallelism argument</para></listitem> <listitem><para>ISO C++ signature + algorithm-specific tag type
(several signatures)</para></listitem>
</itemizedlist> </itemizedlist>
<para> Please note that the implementation may use additional functions <para> Please note that the implementation may use additional functions
...@@ -512,8 +496,8 @@ by standard OpenMP function calls. ...@@ -512,8 +496,8 @@ by standard OpenMP function calls.
</para> </para>
<para> <para>
To specify the number of threads to be used for an algorithm, use the To specify the number of threads to be used for the algorithms globally,
function <function>omp_set_num_threads</function>. An example: use the function <function>omp_set_num_threads</function>. An example:
</para> </para>
<programlisting> <programlisting>
...@@ -527,13 +511,19 @@ int main() ...@@ -527,13 +511,19 @@ int main()
omp_set_dynamic(false); omp_set_dynamic(false);
omp_set_num_threads(threads_wanted); omp_set_num_threads(threads_wanted);
// Do work. // Call parallel mode algorithms.
return 0; return 0;
} }
</programlisting> </programlisting>
<para> <para>
Some algorithms allow the number of threads being set for a particular call,
by augmenting the algorithm variant.
See the next section for further information.
</para>
<para>
Other parts of the runtime environment able to be manipulated include Other parts of the runtime environment able to be manipulated include
nested parallelism (<function>omp_set_nested</function>), schedule kind nested parallelism (<function>omp_set_nested</function>), schedule kind
(<function>omp_set_schedule</function>), and others. See the OpenMP (<function>omp_set_schedule</function>), and others. See the OpenMP
...@@ -549,8 +539,7 @@ documentation for more information. ...@@ -549,8 +539,7 @@ documentation for more information.
To force an algorithm to execute sequentially, even though parallelism To force an algorithm to execute sequentially, even though parallelism
is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>, is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>,
add <classname>__gnu_parallel::sequential_tag()</classname> to the end add <classname>__gnu_parallel::sequential_tag()</classname> to the end
of the algorithm's argument list, or explicitly qualify the algorithm of the algorithm's argument list.
with the <code>__gnu_parallel::</code> namespace.
</para> </para>
<para> <para>
...@@ -562,22 +551,50 @@ std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag()); ...@@ -562,22 +551,50 @@ std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());
</programlisting> </programlisting>
<para> <para>
or Some parallel algorithm variants can be excluded from compilation by
preprocessor defines. See the doxygen documentation on
<code>compiletime_settings.h</code> and <code>features.h</code> for details.
</para> </para>
<programlisting> <para>
__gnu_serial::sort(v.begin(), v.end()); For some algorithms, the desired variant can be chosen at compile-time by
</programlisting> appending a tag object. The available options are specific to the particular
algorithm (class).
</para>
<para> <para>
In addition, some parallel algorithm variants can be enabled/disabled/selected For the "embarrassingly parallel" algorithms, there is only one "tag object
at compile-time. type", the enum _Parallelism.
It takes one of the following values,
<code>__gnu_parallel::parallel_tag</code>,
<code>__gnu_parallel::balanced_tag</code>,
<code>__gnu_parallel::unbalanced_tag</code>,
<code>__gnu_parallel::omp_loop_tag</code>,
<code>__gnu_parallel::omp_loop_static_tag</code>.
This means that the actual parallelization strategy is chosen at run-time.
(Choosing the variants at compile-time will come soon.)
</para> </para>
<para> <para>
See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00446.html"><filename class="headerfile">compiletime_settings.h</filename></ulink> and For the <code>sort</code> and <code>stable_sort</code> algorithms, there are
See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00505.html"><filename class="headerfile">features.h</filename></ulink> for details. several possible choices,
<code>__gnu_parallel::parallel_tag</code>,
<code>__gnu_parallel::default_parallel_tag</code>,
<code>__gnu_parallel::multiway_mergesort_tag</code>,
<code>__gnu_parallel::multiway_mergesort_exact_tag</code>,
<code>__gnu_parallel::multiway_mergesort_sampling_tag</code>,
<code>__gnu_parallel::quicksort_tag</code>,
<code>__gnu_parallel::balanced_quicksort_tag</code>.
Multiway mergesort comes with two splitting strategies for merging, therefore
the extra choice. If none is chosen, the default splitting strategy is selected.
<code>__gnu_parallel::default_parallel_tag</code> chooses the default parallel
sorting algorithm at runtime. <code>__gnu_parallel::parallel_tag</code>
postpones the decision to runtime (see next section).
The quicksort options cannot be used for <code>stable_sort</code>.
For all tags, the number of threads desired for this call can optionally be
passed to the tag's constructor.
</para> </para>
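As an illustration of the tag mechanism described in the added paragraph, the following sketch passes a tag object, constructed with a thread count, to an explicit `__gnu_parallel::sort` call. The helper name `sort_multiway` and the `HAVE_PARALLEL_MODE` macro are made up for this example. The parallel branch assumes GCC's libstdc++ compiled with `-fopenmp`; in any other configuration the sketch falls back to a plain `std::sort`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: the parallel mode headers are only usable with libstdc++
// and OpenMP enabled (-fopenmp). Otherwise fall back to sequential code.
#if defined(__GLIBCXX__) && defined(_OPENMP)
#include <parallel/algorithm>
#define HAVE_PARALLEL_MODE 1
#else
#define HAVE_PARALLEL_MODE 0
#endif

// Hypothetical helper: sorts v, requesting the multiway mergesort variant
// and num_threads threads for this one call.
inline void sort_multiway(std::vector<int>& v, int num_threads)
{
#if HAVE_PARALLEL_MODE
  // The tag type picks the variant at compile time; the constructor
  // argument is the desired thread count for this call.
  __gnu_parallel::sort(v.begin(), v.end(),
                       __gnu_parallel::multiway_mergesort_tag(num_threads));
#else
  (void)num_threads;  // unused in the sequential fallback
  std::sort(v.begin(), v.end());
#endif
  assert(std::is_sorted(v.begin(), v.end()));
}
```

Swapping in `__gnu_parallel::quicksort_tag` would request a different variant; as the paragraph above notes, the quicksort tags are not valid for `stable_sort`.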
</sect3> </sect3>
<sect3 id="parallel_mode.design.tuning.settings" xreflabel="_Settings"> <sect3 id="parallel_mode.design.tuning.settings" xreflabel="_Settings">
...@@ -593,19 +610,18 @@ of <classname>__gnu_parallel::_Settings</classname> member data. ...@@ -593,19 +610,18 @@ of <classname>__gnu_parallel::_Settings</classname> member data.
<para> <para>
First off, the choice of parallelization strategy: serial, parallel, First off, the choice of parallelization strategy: serial, parallel,
or implementation-deduced. This corresponds or heuristically deduced. This corresponds
to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a
value of enum <type>__gnu_parallel::_AlgorithmStrategy</type> value of enum <type>__gnu_parallel::_AlgorithmStrategy</type>
type. Choices type. Choices
include: <type>heuristic</type>, <type>force_sequential</type>, include: <type>heuristic</type>, <type>force_sequential</type>,
and <type>force_parallel</type>. The default is and <type>force_parallel</type>. The default is <type>heuristic</type>.
implementation-deduced, i.e. <type>heuristic</type>.
</para> </para>
<para> <para>
Next, the sub-choices for algorithm implementation. Specific Next, the sub-choices for algorithm variant, if not fixed at compile-time.
algorithms like <function>find</function> or <function>sort</function> Specific algorithms like <function>find</function> or <function>sort</function>
can be implemented in multiple ways: when this is the case, can be implemented in multiple ways: when this is the case,
a <classname>__gnu_parallel::_Settings</classname> member exists to a <classname>__gnu_parallel::_Settings</classname> member exists to
pick the default strategy. For pick the default strategy. For
...@@ -626,7 +642,7 @@ active <classname>__gnu_parallel::_Settings</classname> object. This ...@@ -626,7 +642,7 @@ active <classname>__gnu_parallel::_Settings</classname> object. This
threshold variable follows the following naming scheme: threshold variable follows the following naming scheme:
<code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So, <code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So,
for <function>fill</function>, the threshold variable for <function>fill</function>, the threshold variable
is <code>__gnu_parallel::_Settings::fill_minimal_n</code> is <code>__gnu_parallel::_Settings::fill_minimal_n</code>,
</para> </para>
<para> <para>
...@@ -635,9 +651,19 @@ via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends. ...@@ -635,9 +651,19 @@ via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends.
</para> </para>
<para> <para>
</para>
<para>
All these configuration variables can be changed by the user, if All these configuration variables can be changed by the user, if
desired. Please desired.
see <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html"><filename class="headerfile">settings.h</filename></ulink> There exists one global instance of the class <classname>_Settings</classname>,
i.e. it is a singleton. It can be read and written by calling
<code>__gnu_parallel::_Settings::get</code> and
<code>__gnu_parallel::_Settings::set</code>, respectively.
Please note that the first call returns a const object, so direct manipulation
is forbidden.
See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html">
<filename class="headerfile">settings.h</filename></ulink>
for complete details. for complete details.
</para> </para>
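The get/set pattern described in the added paragraph could look roughly like the sketch below: copy the const object returned by `_Settings::get`, modify the copy, and write it back with `_Settings::set`. The helper name `force_parallel_sort` and the `HAVE_PARALLEL_MODE` macro are invented for this example. The sketch assumes GCC's libstdc++ built with parallel mode support and compiled with `-fopenmp`; otherwise the helper does nothing and returns `false`.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: settings.h is only available with libstdc++; the example
// additionally requires OpenMP to be enabled (-fopenmp).
#if defined(__GLIBCXX__) && defined(_OPENMP)
#include <parallel/settings.h>
#define HAVE_PARALLEL_MODE 1
#else
#define HAVE_PARALLEL_MODE 0
#endif

// Hypothetical helper: forces the parallel strategy and sets the sort
// threshold. Returns true if the global settings were updated.
inline bool force_parallel_sort(unsigned long long min_n)
{
#if HAVE_PARALLEL_MODE
  // get() returns a const reference, so work on a copy.
  __gnu_parallel::_Settings s = __gnu_parallel::_Settings::get();
  s.algorithm_strategy = __gnu_parallel::force_parallel;
  s.sort_minimal_n = min_n;
  // Write the modified copy back to the global instance.
  __gnu_parallel::_Settings::set(s);
  assert(__gnu_parallel::_Settings::get().sort_minimal_n == min_n);
  return true;
#else
  (void)min_n;  // parallel mode not available in this build
  return false;
#endif
}
```

Because the settings object is global, a change made this way affects every subsequent parallel mode call in the program, not only the next one.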
......