Document AMD GCN.

2019-01-22  Andrew Stubbs  <ams@codesourcery.com>

	* doc/extend.texi (AMD GCN Function Attributes): New section.
	* doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
	* doc/invoke.texi (AMD GCN Options): New section.
	* doc/md.texi (Constraints for Particular Machines): Add AMD GCN.

From-SVN: r268146

Commit 1b7ee8b4 authored Jan 22, 2019 by Andrew Stubbs Committed by Andrew Stubbs Jan 22, 2019

Showing with 252 additions and 0 deletions

gcc/ChangeLog
+7 -0

gcc/doc/extend.texi
+91 -0

gcc/doc/install.texi
+21 -0

gcc/doc/invoke.texi
+39 -0

gcc/doc/md.texi
+94 -0

--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
+2019-01-22  Andrew Stubbs  <ams@codesourcery.com>
+	* doc/extend.texi (AMD GCN Function Attributes): New section.
+	* doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
+	* doc/invoke.texi (AMD GCN Options): New section.
+	* doc/md.texi (Constraints for Particular Machines): Add AMD GCN.
 2019-01-22  Eric Botcazou  <ebotcazou@adacore.com>
 	* config/sparc/sparc.c (sparc_delegitimize_address): Recognize the GOT

--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2393,6 +2393,7 @@ GCC plugins may provide their own attributes.
 @menu
 * Common Function Attributes::
 * AArch64 Function Attributes::
+* AMD GCN Function Attributes::
 * ARC Function Attributes::
 * ARM Function Attributes::
 * AVR Function Attributes::
@@ -3954,6 +3955,96 @@ Note that CPU tuning options and attributes such as the @option{-mcpu=},
 @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the
 architectural feature rules specified above.
+@node AMD GCN Function Attributes
+@subsection AMD GCN Function Attributes
+These function attributes are supported by the AMD GCN back end:
+@table @code
+@item amdgpu_hsa_kernel
+@cindex @code{amdgpu_hsa_kernel} function attribute, AMD GCN
+This attribute indicates that the corresponding function should be compiled as
+a kernel function, that is, an entry point that can be invoked from the host
+via the HSA runtime library.  By default, functions are callable only from
+other GCN functions.
+This attribute is implicitly applied to any function named @code{main}, using
+default parameters.
+Kernel functions may return an integer value, which will be written to a
+conventional place within the HSA "kernargs" region.
+The attribute parameters configure what values are passed into the kernel
+function by the GPU drivers, via the initial register state.  Some values are
+used by the compiler, and therefore forced on.  Enabling other options may
+break assumptions in the compiler and/or run-time libraries.
+@table @code
+@item private_segment_buffer
+Set @code{enable_sgpr_private_segment_buffer} flag.  Always on (required to
+locate the stack).
+@item dispatch_ptr
+Set @code{enable_sgpr_dispatch_ptr} flag.  Always on (required to locate the
+launch dimensions).
+@item queue_ptr
+Set @code{enable_sgpr_queue_ptr} flag.  Always on (required to convert address
+spaces).
+@item kernarg_segment_ptr
+Set @code{enable_sgpr_kernarg_segment_ptr} flag.  Always on (required to
+locate the kernel arguments, "kernargs").
+@item dispatch_id
+Set @code{enable_sgpr_dispatch_id} flag.
+@item flat_scratch_init
+Set @code{enable_sgpr_flat_scratch_init} flag.
+@item private_segment_size
+Set @code{enable_sgpr_private_segment_size} flag.
+@item grid_workgroup_count_X
+Set @code{enable_sgpr_grid_workgroup_count_x} flag.  Always on (required to
+use OpenACC/OpenMP).
+@item grid_workgroup_count_Y
+Set @code{enable_sgpr_grid_workgroup_count_y} flag.
+@item grid_workgroup_count_Z
+Set @code{enable_sgpr_grid_workgroup_count_z} flag.
+@item workgroup_id_X
+Set @code{enable_sgpr_workgroup_id_x} flag.
+@item workgroup_id_Y
+Set @code{enable_sgpr_workgroup_id_y} flag.
+@item workgroup_id_Z
+Set @code{enable_sgpr_workgroup_id_z} flag.
+@item workgroup_info
+Set @code{enable_sgpr_workgroup_info} flag.
+@item private_segment_wave_offset
+Set @code{enable_sgpr_private_segment_wave_byte_offset} flag.  Always on
+(required to locate the stack).
+@item work_item_id_X
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (can't be disabled).
+@item work_item_id_Y
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (required to enable
+vectorization).
+@item work_item_id_Z
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (required to use
+OpenACC/OpenMP).
+@end table
+@end table
 @node ARC Function Attributes
 @subsection ARC Function Attributes
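
Review note on the extend.texi hunk: the attribute description above could be exercised with something like the following C sketch. It is an illustration only, not part of the patch. It uses the attribute with no parameters, which the text implies is allowed since `main` gets it "using default parameters". The names `GCN_KERNEL`, `sum_to` and `sum_kernel` are hypothetical, and the `__AMDGCN__` guard is an assumption about the target's predefined macros; on other targets the attribute is simply dropped so the file still builds.

```c
#include <assert.h>

/* Assumption: the GCN back end predefines __AMDGCN__.  On any other
   target the attribute is omitted so this file compiles unchanged.  */
#ifdef __AMDGCN__
#define GCN_KERNEL __attribute__ ((amdgpu_hsa_kernel))
#else
#define GCN_KERNEL
#endif

/* Ordinary function.  On GCN it is callable only from other GCN
   functions, not from the host.  */
static int
sum_to (int n)
{
  assert (n >= 0);
  int total = 0;
  for (int i = 1; i <= n; i++)
    total += i;
  return total;
}

/* Hypothetical kernel entry point.  Per the documentation above, the
   integer return value is written to the HSA "kernargs" region.  */
GCN_KERNEL int
sum_kernel (void)
{
  return sum_to (10);
}
```

On a host build both functions behave as plain C, which makes the sketch easy to check before trying it on a GPU.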

--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3456,6 +3456,27 @@ This is a synonym for @samp{x86_64-*-solaris2.1[0-9]*}.
 @html
 <hr />
 @end html
+@anchor{amdgcn-unknown-amdhsa}
+@heading amdgcn-unknown-amdhsa
+AMD GCN GPU target.
+Instead of GNU Binutils, you will need to install LLVM 6, or later, and copy
+@file{bin/llvm-mc} to @file{amdgcn-unknown-amdhsa/bin/as},
+@file{bin/lld} to @file{amdgcn-unknown-amdhsa/bin/ld},
+@file{bin/llvm-nm} to @file{amdgcn-unknown-amdhsa/bin/nm}, and
+@file{bin/llvm-ar} to both @file{bin/amdgcn-unknown-amdhsa-ar} and
+@file{bin/amdgcn-unknown-amdhsa-ranlib}.
+Use Newlib (2019-01-16, or newer).
+To run the binaries, install the HSA Runtime from the
+@uref{https://rocm.github.io,,ROCm Platform}, and use
+@file{libexec/gcc/amdgcn-unknown-amdhsa/@var{version}/gcn-run} to launch them
+on the GPU.
+@html
+<hr />
+@end html
 @anchor{arc-x-elf32}
 @heading arc-*-elf32

--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -645,6 +645,9 @@ Objective-C and Objective-C++ Dialects}.
 -mfp-mode=@var{mode}  -mvect-double  -max-vect-align=@var{num} @gol
 -msplit-vecmove-early  -m1reg-@var{reg}}
+@emph{AMD GCN Options}
+@gccoptlist{-march=@var{gpu} -mtune=@var{gpu} -mstack-size=@var{bytes}}
 @emph{ARC Options}
 @gccoptlist{-mbarrel-shifter  -mjli-always @gol
 -mcpu=@var{cpu}  -mA6  -mARC600  -mA7  -mARC700 @gol
@@ -15481,6 +15484,7 @@ platform.
 @menu
 * AArch64 Options::
 * Adapteva Epiphany Options::
+* AMD GCN Options::
 * ARC Options::
 * ARM Options::
 * AVR Options::
@@ -16121,6 +16125,41 @@ purpose.  The default is @option{-m1reg-none}.
 @end table
+@node AMD GCN Options
+@subsection AMD GCN Options
+@cindex AMD GCN Options
+These options are defined specifically for the AMD GCN port.
+@table @gcctabopt
+@item -march=@var{gpu}
+@opindex march
+@itemx -mtune=@var{gpu}
+@opindex mtune
+Set architecture type or tuning for @var{gpu}.  Supported values for @var{gpu}
+are:
+@table @samp
+@opindex fiji
+@item fiji
+Compile for GCN3 Fiji devices (gfx803).
+@item gfx900
+Compile for GCN5 Vega 10 devices (gfx900).
+@end table
+@item -mstack-size=@var{bytes}
+@opindex mstack-size
+Specify how many @var{bytes} of stack space will be requested for each GPU
+thread (wave-front).  Beware that there may be many threads and limited memory
+available.  The size of the stack allocation may also have an impact on
+run-time performance.  The default is 32KB when using OpenACC or OpenMP, and
+1MB otherwise.
+@end table
 @node ARC Options
 @subsection ARC Options
 @cindex ARC options
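
Review note on the invoke.texi hunk: the warning about `-mstack-size` ("many threads and limited memory") is easier to judge with a rough estimate. The C sketch below multiplies the per-wave-front stack size by a wave-front count. It is a back-of-the-envelope illustration, not how the runtime allocates memory. The function name and the wave-front counts are hypothetical, and the byte values assume 1KB = 1024 bytes.

```c
#include <assert.h>
#include <limits.h>

/* Defaults quoted in the option description above
   (assuming 1KB = 1024 bytes and 1MB = 1024KB).  */
#define GCN_STACK_DEFAULT_OFFLOAD    (32ULL * 1024ULL)    /* OpenACC/OpenMP */
#define GCN_STACK_DEFAULT_STANDALONE (1024ULL * 1024ULL)  /* otherwise */

/* Rough total stack request: one stack of STACK_BYTES per wave-front.
   WAVEFRONTS is a hypothetical launch size chosen by the caller.  */
static unsigned long long
total_stack_bytes (unsigned long long stack_bytes,
                   unsigned long long wavefronts)
{
  /* Guard against overflow of the product.  */
  assert (wavefronts == 0 || stack_bytes <= ULLONG_MAX / wavefronts);
  return stack_bytes * wavefronts;
}
```

For example, 256 wave-fronts at the 1MB default would ask for 256MB in total under this estimate, against 8MB at the 32KB offload default.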
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1800,6 +1800,100 @@ DF modes
 @end table
+@item AMD GCN ---@file{config/gcn/constraints.md}
+@table @code
+@item I
+Immediate integer in the range @minus{}16 to 64
+@item J
+Immediate 16-bit signed integer
+@item Kf
+Immediate constant @minus{}1
+@item L
+Immediate 15-bit unsigned integer
+@item A
+Immediate constant that can be inlined in an instruction encoding: integer
+@minus{}16..64, or float 0.0, +/@minus{}0.5, +/@minus{}1.0, +/@minus{}2.0,
++/@minus{}4.0, 1.0/(2.0*PI)
+@item B
+Immediate 32-bit signed integer that can be attached to an instruction encoding
+@item C
+Immediate 32-bit integer in range @minus{}16..4294967295 (i.e. 32-bit unsigned
+integer or @samp{A} constraint)
+@item DA
+Immediate 64-bit constant that can be split into two @samp{A} constants
+@item DB
+Immediate 64-bit constant that can be split into two @samp{B} constants
+@item U
+Any @code{unspec}
+@item Y
+Any @code{symbol_ref} or @code{label_ref}
+@item v
+VGPR register
+@item Sg
+SGPR register
+@item SD
+SGPR registers valid for instruction destinations, including VCC, M0 and EXEC
+@item SS
+SGPR registers valid for instruction sources, including VCC, M0, EXEC and SCC
+@item Sm
+SGPR registers valid as a source for scalar memory instructions (excludes M0
+and EXEC)
+@item Sv
+SGPR registers valid as a source or destination for vector instructions
+(excludes EXEC)
+@item ca
+All condition registers: SCC, VCCZ, EXECZ
+@item cs
+Scalar condition register: SCC
+@item cV
+Vector condition register: VCC, VCC_LO, VCC_HI
+@item e
+EXEC register (EXEC_LO and EXEC_HI)
+@item RB
+Memory operand with address space suitable for @code{buffer_*} instructions
+@item RF
+Memory operand with address space suitable for @code{flat_*} instructions
+@item RS
+Memory operand with address space suitable for @code{s_*} instructions
+@item RL
+Memory operand with address space suitable for @code{ds_*} LDS instructions
+@item RG
+Memory operand with address space suitable for @code{ds_*} GDS instructions
+@item RD
+Memory operand with address space suitable for any @code{ds_*} instructions
+@item RM
+Memory operand with address space suitable for @code{global_*} instructions
+@end table
 @item ARC ---@file{config/arc/constraints.md}
 @table @code
 @item q
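
Review note on the md.texi hunk: the integer immediate constraints listed for AMD GCN can be restated as simple range checks. The C sketch below only mirrors the ranges given in the table. The `fits_*` names are hypothetical and are not the predicates defined in `config/gcn/constraints.md`; the floating-point part of `A` and the 64-bit `DA`/`DB` constraints are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'I': immediate integer in the range -16 to 64.  */
static bool fits_I (int64_t v) { return v >= -16 && v <= 64; }

/* 'J': immediate 16-bit signed integer.  */
static bool fits_J (int64_t v) { return v >= INT16_MIN && v <= INT16_MAX; }

/* 'Kf': immediate constant -1.  */
static bool fits_Kf (int64_t v) { return v == -1; }

/* 'L': immediate 15-bit unsigned integer.  */
static bool fits_L (int64_t v) { return v >= 0 && v <= 0x7fff; }

/* 'B': immediate 32-bit signed integer.  */
static bool fits_B (int64_t v) { return v >= INT32_MIN && v <= INT32_MAX; }

/* 'C': immediate in the range -16..4294967295, i.e. a 32-bit unsigned
   integer or the integer part of the 'A' constraint.  */
static bool fits_C (int64_t v) { return v >= -16 && v <= (int64_t) UINT32_MAX; }
```

Read this way, `C` is the union of the unsigned 32-bit range and the small negative values that `I` and `A` accept, which is why its lower bound is -16 rather than 0.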