Commit 46cdd0c8, authored and committed by Mircea Namolaru

Support for unroll and jam optimization.

From-SVN: r217682
parent d6f1bcb2
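
For readers unfamiliar with the transformation named in the commit message, the following is a hand-written sketch of unroll and jam on a two-deep loop nest. It is not code from this commit, and it does not show what Graphite actually emits. The function names `matvec_plain` and `matvec_unroll_jam` are hypothetical, and the unroll factor of 2 is chosen only for brevity (the documented default for `loop-unroll-jam-size` is 4).

```c
#include <stddef.h>

/* Reference loop nest: y[i] = sum over j of a[i][j] * x[j].
   The matrix a is stored row-major as a flat array of n * n ints.  */
void
matvec_plain (size_t n, const int *a, const int *x, int *y)
{
  for (size_t i = 0; i < n; i++)
    {
      y[i] = 0;
      for (size_t j = 0; j < n; j++)
        y[i] += a[i * n + j] * x[j];
    }
}

/* Same computation after unroll and jam by hand: the outer i loop is
   unrolled by 2 and the two resulting copies of the inner j loop are
   fused ("jammed") into one, so each x[j] is loaded once and used for
   two rows.  */
void
matvec_unroll_jam (size_t n, const int *a, const int *x, int *y)
{
  size_t i = 0;

  for (; i + 1 < n; i += 2)
    {
      int s0 = 0, s1 = 0;
      for (size_t j = 0; j < n; j++)
        {
          s0 += a[i * n + j] * x[j];
          s1 += a[(i + 1) * n + j] * x[j];
        }
      y[i] = s0;
      y[i + 1] = s1;
    }

  /* Remainder row when n is odd.  */
  for (; i < n; i++)
    {
      int s = 0;
      for (size_t j = 0; j < n; j++)
        s += a[i * n + j] * x[j];
      y[i] = s;
    }
}
```

In this nest the unrolled loop is the second one counting from the innermost, which corresponds to the documented default of 2 for `loop-unroll-jam-depth`. Parameters of this kind are normally passed as `--param loop-unroll-jam-size=2`; whether other Graphite flags must also be enabled is not stated in the commit.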
2014-11-17 Mircea Namolaru <mircea.namolaru@inria.fr>
* doc/invoke.texi (floop-unroll-and-jam): Document
(loop-unroll-jam-size): Likewise.
(loop-unroll-jam-depth): Likewise.
* graphite-optimize-isl.c (getPrevectorMap_full): Modify comment.
(getScheduleForBandList): Replaced unsafe union_map reuse.
2014-11-17 Andrew Pinski <apinski@cavium.com>
* config/aarch64/thunderx.md: Remove copyright which should not
@@ -391,7 +391,8 @@ Objective-C and Objective-C++ Dialects}.
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts -flive-range-shrinkage @gol
--floop-block -floop-interchange -floop-strip-mine -floop-nest-optimize @gol
+-floop-block -floop-interchange -floop-strip-mine @gol
+-floop-unroll-and-jam -floop-nest-optimize @gol
 -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
@@ -8352,6 +8353,13 @@ optimizer based on the Pluto optimization algorithms. It calculates a loop
 structure optimized for data-locality and parallelism. This option
 is experimental.
+@item -floop-unroll-and-jam
+@opindex floop-unroll-and-jam
+Enable unroll and jam for the ISL based loop nest optimizer. The unroll
+factor can be changed using the @option{loop-unroll-jam-size} parameter.
+The unrolled dimension (counting from the most inner one) can be changed
+using the @option{loop-unroll-jam-depth} parameter. .
 @item -floop-parallelize-all
 @opindex floop-parallelize-all
 Use the Graphite data dependence analysis to identify loops that can
@@ -10469,6 +10477,14 @@ loop in the loop nest by a given number of iterations. The strip
 length can be changed using the @option{loop-block-tile-size}
 parameter. The default value is 51 iterations.
+@item loop-unroll-jam-size
+Specify the unroll factor for the @option{-floop-unroll-and-jam}. The
+default value is 4.
+@item loop-unroll-jam-depth
+Specify the dimension to be unrolled (counting from the most inner loop)
+for the @option{-floop-unroll-and-jam}. The default value is 2.
 @item ipa-cp-value-list-size
 IPA-CP attempts to track all possible values and types passed to a function's
 parameter in order to propagate them and perform devirtualization.
......
@@ -320,7 +320,7 @@ getPrevectorMap (isl_ctx *ctx, int DimToVectorize,
 ip >= 0
 The image of this map is the separation class. The range of this map includes
-all the i that are multiple of 4 in the domain beside the greater one.
+all the i multiple of 4 in the domain such as i + 3 is in the domain too.
 */
 static isl_map *
@@ -486,20 +486,25 @@ getScheduleForBandList (isl_band_list *BandList, isl_union_map **map_sepcl)
 }
 }
 }
-Schedule = isl_union_map_union (Schedule, PartialSchedule);
+Schedule = isl_union_map_union (Schedule,
+isl_union_map_copy(PartialSchedule));
 isl_band_free (Band);
 isl_space_free (Space);
 if (!flag_loop_unroll_jam)
-continue;
+{
+isl_union_map_free (PartialSchedule);
+continue;
+}
 if (PartialSchedule_f)
-*map_sepcl = isl_union_map_union (*map_sepcl,
-PartialSchedule_f);
+{
+*map_sepcl = isl_union_map_union (*map_sepcl, PartialSchedule_f);
+isl_union_map_free (PartialSchedule);
+}
 else
-*map_sepcl = isl_union_map_union (*map_sepcl,
-isl_union_map_copy (PartialSchedule));
+*map_sepcl = isl_union_map_union (*map_sepcl, PartialSchedule);
 }
 return Schedule;
......
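
The ChangeLog describes the `getScheduleForBandList` change as replacing an "unsafe union_map reuse". In isl, `isl_union_map_union` takes ownership of both arguments, so the old code handed `PartialSchedule` to the first union and then touched it again later. The sketch below models that ownership rule with a toy reference-counted type so the fixed control flow can be checked without isl. Every `toy_*` name, `add_band`, and the `live_objects` counter are hypothetical stand-ins, not isl or GCC APIs.

```c
#include <stdlib.h>

/* Toy stand-in for isl_union_map: a reference-counted bit set.  */
typedef struct { int refs; int value; } toy_map;

static int live_objects = 0;   /* objects allocated and not yet released */

toy_map *
toy_alloc (int value)
{
  toy_map *m = malloc (sizeof *m);
  if (!m)
    abort ();
  m->refs = 1;
  m->value = value;
  live_objects++;
  return m;
}

/* Like isl_union_map_copy: returns an extra reference.  */
toy_map *
toy_copy (toy_map *m)
{
  m->refs++;
  return m;
}

/* Like isl_union_map_free: drops one reference.  */
void
toy_free (toy_map *m)
{
  if (m && --m->refs == 0)
    {
      free (m);
      live_objects--;
    }
}

/* Like isl_union_map_union: consumes BOTH arguments.  */
toy_map *
toy_union (toy_map *a, toy_map *b)
{
  toy_map *r = toy_alloc (a->value | b->value);
  toy_free (a);
  toy_free (b);
  return r;
}

/* Mirrors the fixed loop body: copy on the first use, then release or
   hand over the caller's reference exactly once on every path.
   Assumes partial_f is NULL whenever unroll_jam is 0.  */
toy_map *
add_band (toy_map *schedule, toy_map *partial, toy_map *partial_f,
          toy_map **sepcl, int unroll_jam)
{
  schedule = toy_union (schedule, toy_copy (partial));
  if (!unroll_jam)
    {
      toy_free (partial);
      return schedule;
    }
  if (partial_f)
    {
      *sepcl = toy_union (*sepcl, partial_f);
      toy_free (partial);
    }
  else
    *sepcl = toy_union (*sepcl, partial);
  return schedule;
}

int
toy_live (void)
{
  return live_objects;
}
```

Under this model, each of the three paths releases the caller's reference to `partial` exactly once, which is the property the patch appears to restore.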