Commit 1904bef1, authored and committed by Johannes Singler

parallel_mode.html: Added reference to MCSTL.

        * docs/html/parallel_mode.html: Added reference to MCSTL.
        More documentation on compile-time settings and tuning.
        * include/parallel/multiway_merge.h: Added reference to paper.
        * include/parallel/multiseq_selection.h: Added reference to paper.
        * include/parallel/workstealing.h: Added reference to paper.
        * include/parallel/balanced_quicksort.h: Added reference to paper.
        * include/parallel/tree.h: Added reference to paper.

From-SVN: r129129
parent 8174836f
2007-10-08 Johannes Singler <singler@ira.uka.de>
* include/parallel/multiway_merge.h: Added reference to paper.
* include/parallel/multiseq_selection.h: Added reference to paper.
* include/parallel/workstealing.h: Added reference to paper.
* include/parallel/balanced_quicksort.h: Added reference to paper.
* include/parallel/tree.h: Added reference to paper.
* docs/html/parallel_mode.html: Added reference to MCSTL.
More documentation on compile-time settings and tuning.
2007-10-08 Paolo Carlini <pcarlini@suse.de> 2007-10-08 Paolo Carlini <pcarlini@suse.de>
* include/std/utility (identity, move, forward): Move to... * include/std/utility (identity, move, forward): Move to...
......
...@@ -35,7 +35,7 @@ implementation of many algorithms the C++ Standard Library. ...@@ -35,7 +35,7 @@ implementation of many algorithms the C++ Standard Library.
<p> <p>
Several of the standard algorithms, for instance Several of the standard algorithms, for instance
<code>std::search</code>, are made parallel using OpenMP <code>std::sort</code>, are made parallel using OpenMP
annotations. These parallel mode constructs can be invoked by annotations. These parallel mode constructs can be invoked by
explicit source declaration or by compiling existing sources with a explicit source declaration or by compiling existing sources with a
specific compiler flag. specific compiler flag.
...@@ -43,7 +43,7 @@ specific compiler flag. ...@@ -43,7 +43,7 @@ specific compiler flag.
<h3 class="left"><a name="parallel">The libstdc++ parallel mode</a></h3> <h3 class="left"><a name="parallel">The libstdc++ parallel mode</a></h3>
<p>The libstdc++ parallel mode performs parallization of algorithms, <p>The libstdc++ parallel mode performs parallelization of algorithms,
function objects, classes, and functions in the C++ Standard.</p> function objects, classes, and functions in the C++ Standard.</p>
<h4 class="left">Using the libstdc++ parallel mode</h4> <h4 class="left">Using the libstdc++ parallel mode</h4>
...@@ -53,7 +53,7 @@ function objects, classes, and functions in the C++ Standard.</p> ...@@ -53,7 +53,7 @@ function objects, classes, and functions in the C++ Standard.</p>
will link in <code>libgomp</code>, the GNU OpenMP <a will link in <code>libgomp</code>, the GNU OpenMP <a
href="http://gcc.gnu.org/onlinedocs/libgomp">implementation</a>, href="http://gcc.gnu.org/onlinedocs/libgomp">implementation</a>,
whose presence is mandatory. In addition, hardware capable of atomic whose presence is mandatory. In addition, hardware capable of atomic
operations is de rigueur. Actually activating these atomic operations is mandatory. Actually activating these atomic
operations may require explicit compiler flags on some targets operations may require explicit compiler flags on some targets
(like sparc and x86), such as <code>-march=i686</code>, (like sparc and x86), such as <code>-march=i686</code>,
<code>-march=native</code> or <code>-mcpu=v9</code>. <code>-march=native</code> or <code>-mcpu=v9</code>.
...@@ -113,6 +113,13 @@ function objects, classes, and functions in the C++ Standard.</p> ...@@ -113,6 +113,13 @@ function objects, classes, and functions in the C++ Standard.</p>
<li><code>std::unique_copy</code></li> <li><code>std::unique_copy</code></li>
</ul> </ul>
<p>The following library components in the headers
<code>&lt;set&gt;</code> and <code>&lt;map&gt;</code> are included in the parallel mode:</p>
<ul>
<li><code>std::(multi)map/set&lt;T&gt;::(multi)map/set(Iterator begin, Iterator end)</code> (bulk construction)</li>
<li><code>std::(multi)map/set&lt;T&gt;::insert(Iterator begin, Iterator end)</code> (bulk insertion)</li>
</ul>
<h4 class="left">Using the parallel algorithms without parallel mode</h4> <h4 class="left">Using the parallel algorithms without parallel mode</h4>
...@@ -380,13 +387,47 @@ function objects, classes, and functions in the C++ Standard.</p> ...@@ -380,13 +387,47 @@ function objects, classes, and functions in the C++ Standard.</p>
<h4 class="left">Parallel mode semantics</h4> <h4 class="left">Parallel mode semantics</h4>
<p> Something about exception safety, interaction with threads,
etc. Goal is to have the usual constraints of the STL with respect to
exception safety and threads, but add in support for parallel
computing.</p>
<p> Something about compile-time settings and configuration, ie using <p> The parallel mode STL algorithms are currently not exception-safe,
<code>__gnu_parallel::Settings</code>. XXX Up in the air.</p> i.e. user-defined functors must not throw exceptions.
</p>
<p> Since the current GCC OpenMP implementation does not support
OpenMP parallel regions in concurrent threads,
it is not possible to call parallel STL algorithms from
concurrent threads, either.
It might work with other compilers, though.</p>
<h4 class="left">Configuration and Tuning</h4>
<p> Some algorithm variants can be enabled/disabled/selected at compile-time.
See <a href="latest-doxygen/compiletime__settings_8h.html">
<code>&lt;compiletime_settings.h&gt;</code></a> and
<a href="latest-doxygen/compiletime__settings_8h.html">
<code>&lt;features.h&gt;</code></a> for details.
</p>
<p>
To specify the number of threads to be used for an algorithm,
use <code>omp_set_num_threads</code>.
To force a function to execute sequentially,
even though parallelism is switched on in general,
add <code>__gnu_parallel::sequential_tag()</code>
to the end of the argument list.
</p>
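The two tuning knobs described above can be sketched as follows. This is a minimal illustration rather than code from the commit: the helper names `sort_with_threads`, `sort_sequentially`, and `tuning_demo` are hypothetical, and the sketch assumes GCC's `<parallel/algorithm>` header and compilation with `-fopenmp`. When either is missing, it falls back to plain `std::sort` so that it still builds.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Use the parallel mode only when OpenMP is enabled and the header exists.
#if defined(_OPENMP) && defined(__has_include)
#  if __has_include(<parallel/algorithm>)
#    include <omp.h>
#    include <parallel/algorithm>
#    define HAVE_GNU_PARALLEL 1
#  endif
#endif

// Hypothetical helper: request a thread count, then call the parallel sort.
void sort_with_threads(std::vector<int>& v, int threads) {
#ifdef HAVE_GNU_PARALLEL
    omp_set_num_threads(threads);  // applies to subsequent parallel regions
    __gnu_parallel::sort(v.begin(), v.end());
#else
    (void)threads;                 // fallback: no parallel mode available
    std::sort(v.begin(), v.end());
#endif
}

// Hypothetical helper: force sequential execution with the tag argument.
void sort_sequentially(std::vector<int>& v) {
#ifdef HAVE_GNU_PARALLEL
    __gnu_parallel::sort(v.begin(), v.end(),
                         __gnu_parallel::sequential_tag());
#else
    std::sort(v.begin(), v.end());
#endif
}

// Hypothetical demo: both variants must produce the same sorted sequence.
bool tuning_demo() {
    std::vector<int> a;
    for (int i = 5000; i > 0; --i) a.push_back(i % 977);
    std::vector<int> b = a;
    sort_with_threads(a, 2);
    sort_sequentially(b);
    assert(std::is_sorted(a.begin(), a.end()));
    return a == b;
}
```

Both calls sort the same data; only the execution strategy differs, so the results should compare equal.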
<p>
Parallelism always incurs some overhead. Thus, it is not
helpful to parallelize operations on very small sets of data.
There are measures to avoid parallelizing work that is too small to benefit.
For each algorithm, a minimum problem size can be stated,
usually using the variable
<code>__gnu_parallel::Settings::[algorithm]_minimal_n</code>.
Please see <a href="latest-doxygen/settings_8h.html">
<code>&lt;settings.h&gt;</code></a> for details.</p>
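The idea of a minimum problem size can be illustrated independently of the library. The sketch below is not the libstdc++ implementation and does not use `__gnu_parallel::Settings`; the names `kMinimalN` and `threshold_sort` are hypothetical. It sorts small inputs sequentially and splits larger ones into two halves that OpenMP may sort concurrently. Without `-fopenmp` the pragmas are ignored and the code runs sequentially with the same result.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical threshold, standing in for a per-algorithm minimal_n setting.
const std::size_t kMinimalN = 1000;

// Sort v; go parallel only if the input has at least minimal_n elements.
void threshold_sort(std::vector<int>& v, std::size_t minimal_n = kMinimalN) {
    const std::size_t n = v.size();
    if (n < minimal_n) {
        // Too small: thread start-up overhead would outweigh any gain.
        std::sort(v.begin(), v.end());
        return;
    }
    std::vector<int>::iterator mid = v.begin() + n / 2;
    // Sort the two halves, concurrently when OpenMP is enabled.
    #pragma omp parallel sections
    {
        #pragma omp section
        std::sort(v.begin(), mid);
        #pragma omp section
        std::sort(mid, v.end());
    }
    // Combine the two sorted halves into one sorted range.
    std::inplace_merge(v.begin(), mid, v.end());
}
```

Passing a smaller or larger `minimal_n` changes only which path is taken, not the sorted output.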
<h4 class="left">Interface basics and general design</h4> <h4 class="left">Interface basics and general design</h4>
...@@ -485,7 +526,7 @@ std::__parallel</code>. For instance, <code>std::transform</code> from ...@@ -485,7 +526,7 @@ std::__parallel</code>. For instance, <code>std::transform</code> from
&lt;algorithm&gt; has a parallel counterpart in &lt;algorithm&gt; has a parallel counterpart in
<code>std::__parallel::transform</code> from <code>std::__parallel::transform</code> from
&lt;parallel/algorithm&gt;. In addition, these parallel &lt;parallel/algorithm&gt;. In addition, these parallel
implementatations are injected into <code>namespace implementations are injected into <code>namespace
__gnu_parallel</code> with using declarations. __gnu_parallel</code> with using declarations.
</p> </p>
...@@ -526,6 +567,16 @@ testsuite's Makefile.</p> ...@@ -526,6 +567,16 @@ testsuite's Makefile.</p>
</p> </p>
<h4 class="left">References / Further Reading</h4>
<p>
Johannes Singler, Peter Sanders, Felix Putze. The Multi-Core Standard Template Library. Euro-Par 2007: Parallel Processing. (LNCS 4641)
</p>
<p>
Leonor Frias, Johannes Singler. Parallelization of Bulk Operations for STL Dictionaries. Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS)
</p>
<!-- ####################################################### --> <!-- ####################################################### -->
<hr /> <hr />
......
...@@ -32,6 +32,14 @@ ...@@ -32,6 +32,14 @@
* @brief Implementation of a dynamically load-balanced parallel quicksort. * @brief Implementation of a dynamically load-balanced parallel quicksort.
* *
* It works in-place and needs only logarithmic extra memory. * It works in-place and needs only logarithmic extra memory.
* The algorithm is similar to the one proposed in
*
* P. Tsigas and Y. Zhang.
* A simple, fast parallel implementation of quicksort and
* its performance evaluation on SUN enterprise 10000.
* In 11th Euromicro Conference on Parallel, Distributed and
* Network-Based Processing, page 372, 2003.
*
* This file is a GNU parallel extension to the Standard C++ Library. * This file is a GNU parallel extension to the Standard C++ Library.
*/ */
......
...@@ -32,6 +32,13 @@ ...@@ -32,6 +32,13 @@
* @brief Functions to find elements of a certain global rank in * @brief Functions to find elements of a certain global rank in
* multiple sorted sequences. Also serves for splitting such * multiple sorted sequences. Also serves for splitting such
* sequence sets. * sequence sets.
*
* The algorithm description can be found in
*
* P. J. Varman, S. D. Scheufler, B. R. Iyer, and G. R. Ricard.
* Merging Multiple Lists on Hierarchical-Memory Multiprocessors.
* Journal of Parallel and Distributed Computing, 12(2):171–177, 1991.
*
* This file is a GNU parallel extension to the Standard C++ Library. * This file is a GNU parallel extension to the Standard C++ Library.
*/ */
......
...@@ -30,6 +30,13 @@ ...@@ -30,6 +30,13 @@
/** @file parallel/multiway_merge.h /** @file parallel/multiway_merge.h
* @brief Implementation of sequential and parallel multiway merge. * @brief Implementation of sequential and parallel multiway merge.
*
* Explanations on the high-speed merging routines in the appendix of
*
* P. Sanders.
* Fast priority queues for cached memory.
* ACM Journal of Experimental Algorithmics, 5, 2000.
*
* This file is a GNU parallel extension to the Standard C++ Library. * This file is a GNU parallel extension to the Standard C++ Library.
*/ */
......
...@@ -30,6 +30,13 @@ ...@@ -30,6 +30,13 @@
/** @file parallel/tree.h /** @file parallel/tree.h
* @brief Parallel red-black tree operations. * @brief Parallel red-black tree operations.
*
* This implementation is described in
*
* Leonor Frias, Johannes Singler.
* Parallelization of Bulk Operations for STL Dictionaries.
* Workshop on Highly Parallel Processing on a Chip (HPPC) 2007.
*
* This file is a GNU parallel extension to the Standard C++ Library. * This file is a GNU parallel extension to the Standard C++ Library.
*/ */
......
...@@ -31,6 +31,13 @@ ...@@ -31,6 +31,13 @@
/** @file parallel/workstealing.h /** @file parallel/workstealing.h
* @brief Parallelization of embarrassingly parallel execution by * @brief Parallelization of embarrassingly parallel execution by
* means of work-stealing. * means of work-stealing.
*
* Work stealing is described in
*
* R. D. Blumofe and C. E. Leiserson.
* Scheduling multithreaded computations by work stealing.
* Journal of the ACM, 46(5):720–748, 1999.
*
* This file is a GNU parallel extension to the Standard C++ Library. * This file is a GNU parallel extension to the Standard C++ Library.
*/ */
......