Skip to content
Projects
Groups
Snippets
Help
This project
Loading...
Sign in / Register
Toggle navigation
R
riscv-gcc-1
Overview
Overview
Details
Activity
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
0
Issues
0
List
Board
Labels
Milestones
Merge Requests
0
Merge Requests
0
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
lvzhengyang
riscv-gcc-1
Commits
cdf6119d
Commit
cdf6119d
authored
Jan 12, 2016
by
James Norris
Committed by
James Norris
Jan 12, 2016
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
* libgomp.texi: Updates for OpenACC.
From-SVN: r232283
parent
5342156c
Hide whitespace changes
Inline
Side-by-side
Showing
2 changed files
with
1290 additions
and
0 deletions
+1290
-0
libgomp/ChangeLog
+4
-0
libgomp/libgomp.texi
+1286
-0
No files found.
libgomp/ChangeLog
View file @
cdf6119d
2016-01-12 James Norris <jnorris@codesourcery.com>
* libgomp.texi: Updates for OpenACC.
2016-01-11 Alexander Monakov <amonakov@ispras.ru>
* plugin/plugin-nvptx.c (link_ptx): Do not set CU_JIT_TARGET.
...
...
libgomp/libgomp.texi
View file @
cdf6119d
...
...
@@ -99,6 +99,16 @@ changed to GNU Offloading and Multi Processing Runtime Library.
interface
.
*
Environment
Variables
::
Influencing
runtime
behavior
with
environment
variables
.
*
Enabling
OpenACC
::
How
to
enable
OpenACC
for
your
applications
.
*
OpenACC
Runtime
Library
Routines
::
The
OpenACC
runtime
application
programming
interface
.
*
OpenACC
Environment
Variables
::
Influencing
OpenACC
runtime
behavior
with
environment
variables
.
*
CUDA
Streams
Usage
::
Notes
on
the
implementation
of
asynchronous
operations
.
*
OpenACC
Library
Interoperability
::
OpenACC
library
interoperability
with
the
NVIDIA
CUBLAS
library
.
*
The
libgomp
ABI
::
Notes
on
the
external
ABI
presented
by
libgomp
.
*
Reporting
Bugs
::
How
to
report
bugs
in
the
GNU
Offloading
and
Multi
Processing
Runtime
Library
.
...
...
@@ -1790,6 +1800,1272 @@ pools available and their worker threads run at priority four.
@c
---------------------------------------------------------------------
@c
Enabling
OpenACC
@c
---------------------------------------------------------------------
@node
Enabling
OpenACC
@chapter
Enabling
OpenACC
To
activate
the
OpenACC
extensions
for
C
/
C
++
and
Fortran
,
the
compile
-
time
flag
@option{
-
fopenacc
}
must
be
specified
.
This
enables
the
OpenACC
directive
@code{#
pragma
acc
}
in
C
/
C
++
and
@code{
!
$
accp
}
directives
in
free
form
,
@code{
c
$
acc
}
,
@code{
*
$
acc
}
and
@code{
!
$
acc
}
directives
in
fixed
form
,
@code{
!
$}
conditional
compilation
sentinels
in
free
form
and
@code{
c
$}
,
@code{
*
$}
and
@code{
!
$}
sentinels
in
fixed
form
,
for
Fortran
.
The
flag
also
arranges
for
automatic
linking
of
the
OpenACC
runtime
library
(
@ref{
OpenACC
Runtime
Library
Routines
}
).
A
complete
description
of
all
OpenACC
directives
accepted
may
be
found
in
the
@uref{
http
:
//www.openacc.org/, OpenACC} Application Programming
Interface
manual
,
version
2
.
0
.
Note
that
this
is
an
experimental
feature
and
subject
to
change
in
future
versions
of
GCC
.
See
@uref
{
https
:
//gcc.gnu.org/wiki/OpenACC} for more information.
@c
---------------------------------------------------------------------
@c
OpenACC
Runtime
Library
Routines
@c
---------------------------------------------------------------------
@node
OpenACC
Runtime
Library
Routines
@chapter
OpenACC
Runtime
Library
Routines
The
runtime
routines
described
here
are
defined
by
section
3
of
the
OpenACC
specifications
in
version
2
.
0
.
They
have
C
linkage
,
and
do
not
throw
exceptions
.
Generally
,
they
are
available
only
for
the
host
,
with
the
exception
of
@code
{
acc_on_device
},
which
is
available
for
both
the
host
and
the
acceleration
device
.
@menu
*
acc_get_num_devices
::
Get
number
of
devices
for
the
given
device
type
.
*
acc_set_device_type
::
Set
type
of
device
accelerator
to
use
.
*
acc_get_device_type
::
Get
type
of
device
accelerator
to
be
used
.
*
acc_set_device_num
::
Set
device
number
to
use
.
*
acc_get_device_num
::
Get
device
number
to
be
used
.
*
acc_async_test
::
Tests
for
completion
of
a
specific
asynchronous
operation
.
*
acc_async_test_all
::
Tests
for
completion
of
all
asychronous
operations
.
*
acc_wait
::
Wait
for
completion
of
a
specific
asynchronous
operation
.
*
acc_wait_all
::
Waits
for
completion
of
all
asyncrhonous
operations
.
*
acc_wait_all_async
::
Wait
for
completion
of
all
asynchronous
operations
.
*
acc_wait_async
::
Wait
for
completion
of
asynchronous
operations
.
*
acc_init
::
Initialize
runtime
for
a
specific
device
type
.
*
acc_shutdown
::
Shuts
down
the
runtime
for
a
specific
device
type
.
*
acc_on_device
::
Whether
executing
on
a
particular
device
*
acc_malloc
::
Allocate
device
memory
.
*
acc_free
::
Free
device
memory
.
*
acc_copyin
::
Allocate
device
memory
and
copy
host
memory
to
it
.
*
acc_present_or_copyin
::
If
the
data
is
not
present
on
the
device
,
allocate
device
memory
and
copy
from
host
memory
.
*
acc_create
::
Allocate
device
memory
and
map
it
to
host
memory
.
*
acc_present_or_create
::
If
the
data
is
not
present
on
the
device
,
allocate
device
memory
and
map
it
to
host
memory
.
*
acc_copyout
::
Copy
device
memory
to
host
memory
.
*
acc_delete
::
Free
device
memory
.
*
acc_update_device
::
Update
device
memory
from
mapped
host
memory
.
*
acc_update_self
::
Update
host
memory
from
mapped
device
memory
.
*
acc_map_data
::
Map
previously
allocated
device
memory
to
host
memory
.
*
acc_unmap_data
::
Unmap
device
memory
from
host
memory
.
*
acc_deviceptr
::
Get
device
pointer
associated
with
specific
host
address
.
*
acc_hostptr
::
Get
host
pointer
associated
with
specific
device
address
.
*
acc_is_present
::
Indiciate
whether
host
variable
/
array
is
present
on
device
.
*
acc_memcpy_to_device
::
Copy
host
memory
to
device
memory
.
*
acc_memcpy_from_device
::
Copy
device
memory
to
host
memory
.
API
routines
for
target
platforms
.
*
acc_get_current_cuda_device
::
Get
CUDA
device
handle
.
*
acc_get_current_cuda_context
::
Get
CUDA
context
handle
.
*
acc_get_cuda_stream
::
Get
CUDA
stream
handle
.
*
acc_set_cuda_stream
::
Set
CUDA
stream
handle
.
@end
menu
@node
acc_get_num_devices
@section
@code
{
acc_get_num_devices
}
--
Get
number
of
devices
for
given
device
type
@table
@asis
@item
@emph
{
Description
}
This
function
returns
a
value
indicating
the
number
of
devices
available
for
the
device
type
specified
in
@var
{
devicetype
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
int
acc_get_num_devices
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
integer
function
acc_get_num_devices
(
devicetype
)}
@item
@tab
@code
{
integer
(
kind
=
acc_device_kind
)
devicetype
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
1
.
@end
table
@node
acc_set_device_type
@section
@code
{
acc_set_device_type
}
--
Set
type
of
device
accelerator
to
use
.
@table
@asis
@item
@emph
{
Description
}
This
function
indicates
to
the
runtime
library
which
device
typr
,
specified
in
@var
{
devicetype
},
to
use
when
executing
a
parallel
or
kernels
region
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_set_device_type
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_set_device_type
(
devicetype
)}
@item
@tab
@code
{
integer
(
kind
=
acc_device_kind
)
devicetype
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
2
.
@end
table
@node
acc_get_device_type
@section
@code
{
acc_get_device_type
}
--
Get
type
of
device
accelerator
to
be
used
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
what
device
type
will
be
used
when
executing
a
parallel
or
kernels
region
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_device_t
acc_get_device_type
(
void
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_get_device_type
(
void
)}
@item
@tab
@code
{
integer
(
kind
=
acc_device_kind
)
acc_get_device_type
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
3
.
@end
table
@node
acc_set_device_num
@section
@code
{
acc_set_device_num
}
--
Set
device
number
to
use
.
@table
@asis
@item
@emph
{
Description
}
This
function
will
indicate
to
the
runtime
which
device
number
,
specified
by
@var
{
num
},
associated
with
the
specifed
device
type
@var
{
devicetype
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_set_device_num
(
int
num
,
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_set_device_num
(
devicenum
,
devicetype
)}
@item
@tab
@code
{
integer
devicenum
}
@item
@tab
@code
{
integer
(
kind
=
acc_device_kind
)
devicetype
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
4
.
@end
table
@node
acc_get_device_num
@section
@code
{
acc_get_device_num
}
--
Get
device
number
to
be
used
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
which
device
number
associated
with
the
specified
device
type
@var
{
devicetype
},
will
be
used
when
executing
a
parallel
or
kernels
region
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
int
acc_get_device_num
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_get_device_num
(
devicetype
)}
@item
@tab
@code
{
integer
(
kind
=
acc_device_kind
)
devicetype
}
@item
@tab
@code
{
integer
acc_get_device_num
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
5
.
@end
table
@node
acc_async_test
@section
@code
{
acc_async_test
}
--
Test
for
completion
of
a
specific
asynchronous
operation
.
@table
@asis
@item
@emph
{
Description
}
This
function
tests
for
completion
of
the
asynchrounous
operation
specified
in
@var
{
arg
}.
In
C
/
C
++
,
a
non
-
zero
value
will
be
returned
to
indicate
the
specified
asynchronous
operation
has
completed
.
While
Fortran
will
return
a
@code
{
true
}.
If
the
asynchrounous
operation
has
not
completed
,
C
/
C
++
returns
a
zero
and
Fortran
returns
a
@code
{
false
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
int
acc_async_test
(
int
arg
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_async_test
(
arg
)}
@item
@tab
@code
{
integer
(
kind
=
acc_handle_kind
)
arg
}
@item
@tab
@code
{
logical
acc_async_test
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
6
.
@end
table
@node
acc_async_test_all
@section
@code
{
acc_async_test_all
}
--
Tests
for
completion
of
all
asynchronous
operations
.
@table
@asis
@item
@emph
{
Description
}
This
function
tests
for
completion
of
all
asynchrounous
operations
.
In
C
/
C
++
,
a
non
-
zero
value
will
be
returned
to
indicate
all
asynchronous
operations
have
completed
.
While
Fortran
will
return
a
@code
{
true
}.
If
any
asynchronous
operation
has
not
completed
,
C
/
C
++
returns
a
zero
and
Fortran
returns
a
@code
{
false
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
int
acc_async_test_all
(
void
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_async_test
()}
@item
@tab
@code
{
logical
acc_get_device_num
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
7
.
@end
table
@node
acc_wait
@section
@code
{
acc_wait
}
--
Wait
for
completion
of
a
specific
asynchronous
operation
.
@table
@asis
@item
@emph
{
Description
}
This
function
waits
for
completion
of
the
asynchronous
operation
specified
in
@var
{
arg
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_wait
(
arg
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_wait
(
arg
)}
@item
@tab
@code
{
integer
(
acc_handle_kind
)
arg
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
8
.
@end
table
@node
acc_wait_all
@section
@code
{
acc_wait_all
}
--
Waits
for
completion
of
all
asynchronous
operations
.
@table
@asis
@item
@emph
{
Description
}
This
function
waits
for
the
completion
of
all
asynchronous
operations
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_wait_all
(
void
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_wait_async
()}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
10
.
@end
table
@node
acc_wait_all_async
@section
@code
{
acc_wait_all_async
}
--
Wait
for
completion
of
all
asynchronous
operations
.
@table
@asis
@item
@emph
{
Description
}
This
function
enqueues
a
wait
operation
on
the
queue
@var
{
async
}
for
any
and
all
asynchronous
operations
that
have
been
previously
enqueued
on
any
queue
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_wait_all_async
(
int
async
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_wait_all_async
(
async
)}
@item
@tab
@code
{
integer
(
acc_handle_kind
)
async
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
11
.
@end
table
@node
acc_wait_async
@section
@code
{
acc_wait_async
}
--
Wait
for
completion
of
asynchronous
operations
.
@table
@asis
@item
@emph
{
Description
}
This
function
enqueues
a
wait
operation
on
queue
@var
{
async
}
for
any
and
all
asynchronous
operations
enqueued
on
queue
@var
{
arg
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_wait_async
(
int
arg
,
int
async
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_wait_async
(
arg
,
async
)}
@item
@tab
@code
{
integer
(
acc_handle_kind
)
arg
,
async
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
9
.
@end
table
@node
acc_init
@section
@code
{
acc_init
}
--
Initialize
runtime
for
a
specific
device
type
.
@table
@asis
@item
@emph
{
Description
}
This
function
initializes
the
runtime
for
the
device
type
specified
in
@var
{
devicetype
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_init
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_init
(
devicetype
)}
@item
@tab
@code
{
integer
(
acc_device_kind
)
devicetype
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
12
.
@end
table
@node
acc_shutdown
@section
@code
{
acc_shutdown
}
--
Shuts
down
the
runtime
for
a
specific
device
type
.
@table
@asis
@item
@emph
{
Description
}
This
function
shuts
down
the
runtime
for
the
device
type
specified
in
@var
{
devicetype
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_shutdown
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_shutdown
(
devicetype
)}
@item
@tab
@code
{
integer
(
acc_device_kind
)
devicetype
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
13
.
@end
table
@node
acc_on_device
@section
@code
{
acc_on_device
}
--
Whether
executing
on
a
particular
device
@table
@asis
@item
@emph
{
Description
}
:
This
function
returns
whether
the
program
is
executing
on
a
particular
device
specified
in
@var
{
devicetype
}.
In
C
/
C
++
a
non
-
zero
value
is
returned
to
indicate
the
device
is
execiting
on
the
specified
device
type
.
In
Fortran
,
@code
{
true
}
will
be
returned
.
If
the
program
is
not
executing
on
the
specified
device
type
C
/
C
++
will
return
a
zero
,
while
Fortran
will
return
@code
{
false
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_on_device
(
acc_device_t
devicetype
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_on_device
(
devicetype
)}
@item
@tab
@code
{
integer
(
acc_device_kind
)
devicetype
}
@item
@tab
@code
{
logical
acc_on_device
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
14
.
@end
table
@node
acc_malloc
@section
@code
{
acc_malloc
}
--
Allocate
device
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
allocates
@var
{
len
}
bytes
of
device
memory
.
It
returns
the
device
address
of
the
allocated
memory
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
d_void
*
acc_malloc
(
size_t
len
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
15
.
@end
table
@node
acc_free
@section
@code
{
acc_free
}
--
Free
device
memory
.
@table
@asis
@item
@emph
{
Description
}
Free
previously
allocated
device
memory
at
the
device
address
@code
{
a
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_free
(
d_void
*
a
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
16
.
@end
table
@node
acc_copyin
@section
@code
{
acc_copyin
}
--
Allocate
device
memory
and
copy
host
memory
to
it
.
@table
@asis
@item
@emph
{
Description
}
In
C
/
C
++
,
this
function
allocates
@var
{
len
}
bytes
of
device
memory
and
maps
it
to
the
specified
host
address
in
@var
{
a
}.
The
device
address
of
the
newly
allocated
device
memory
is
returned
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_copyin
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_copyin
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_copyin
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
17
.
@end
table
@node
acc_present_or_copyin
@section
@code
{
acc_present_or_copyin
}
--
If
the
data
is
not
present
on
the
device
,
allocate
device
memory
and
copy
from
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
tests
if
the
host
data
specifed
by
@var
{
a
}
and
of
length
@var
{
len
}
is
present
or
not
.
If
it
is
not
present
,
then
device
memory
will
be
allocated
and
the
host
memory
copied
.
The
device
address
of
the
newly
allocated
device
memory
is
returned
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_present_or_copyin
(
h_void
*
a
,
size_t
len
);}
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_pcopyin
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_present_or_copyin
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_present_or_copyin
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_pcopyin
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_pcopyin
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
18
.
@end
table
@node
acc_create
@section
@code
{
acc_create
}
--
Allocate
device
memory
and
map
it
to
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
allocates
device
memory
and
maps
it
to
host
memory
specified
by
the
host
address
@var
{
a
}
with
a
length
of
@var
{
len
}
bytes
.
In
C
/
C
++
,
the
function
returns
the
device
address
of
the
allocated
device
memory
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_create
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_create
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_create
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
19
.
@end
table
@node
acc_present_or_create
@section
@code
{
acc_present_or_create
}
--
If
the
data
is
not
present
on
the
device
,
allocate
device
memory
and
map
it
to
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
tests
if
the
host
data
specifed
by
@var
{
a
}
and
of
length
@var
{
len
}
is
present
or
not
.
If
it
is
not
present
,
then
device
memory
will
be
allocated
and
mapped
to
host
memory
.
In
C
/
C
++
,
the
device
address
of
the
newly
allocated
device
memory
is
returned
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_present_or_create
(
h_void
*
a
,
size_t
len
)}
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_pcreate
(
h_void
*
a
,
size_t
len
)}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_present_or_create
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_present_or_create
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_pcreate
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_pcreate
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
20
.
@end
table
@node
acc_copyout
@section
@code
{
acc_copyout
}
--
Copy
device
memory
to
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
copies
mapped
device
memory
to
host
memory
which
is
specified
by
host
address
@var
{
a
}
for
a
length
@var
{
len
}
bytes
in
C
/
C
++
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_copyout
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_copyout
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_copyout
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
21
.
@end
table
@node
acc_delete
@section
@code
{
acc_delete
}
--
Free
device
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
frees
previously
allocated
device
memory
specified
by
the
device
address
@var
{
a
}
and
the
length
of
@var
{
len
}
bytes
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_delete
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_delete
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_delete
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
22
.
@end
table
@node
acc_update_device
@section
@code
{
acc_update_device
}
--
Update
device
memory
from
mapped
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
updates
the
device
copy
from
the
previously
mapped
host
memory
.
The
host
memory
is
specified
with
the
host
address
@var
{
a
}
and
a
length
of
@var
{
len
}
bytes
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_update_device
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_update_device
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_update_device
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
23
.
@end
table
@node
acc_update_self
@section
@code
{
acc_update_self
}
--
Update
host
memory
from
mapped
device
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
updates
the
host
copy
from
the
previously
mapped
device
memory
.
The
host
memory
is
specified
with
the
host
address
@var
{
a
}
and
a
length
of
@var
{
len
}
bytes
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_update_self
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_update_self
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@emph
{
Interface
}
:
@tab
@code
{
subroutine
acc_update_self
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
24
.
@end
table
@node
acc_map_data
@section
@code
{
acc_map_data
}
--
Map
previously
allocated
device
memory
to
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
maps
previously
allocated
device
and
host
memory
.
The
device
memory
is
specified
with
the
device
address
@var
{
d
}.
The
host
memory
is
specified
with
the
host
address
@var
{
h
}
and
a
length
of
@var
{
len
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_map_data
(
h_void
*
h
,
d_void
*
d
,
size_t
len
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
25
.
@end
table
@node
acc_unmap_data
@section
@code
{
acc_unmap_data
}
--
Unmap
device
memory
from
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
unmaps
previously
mapped
device
and
host
memory
.
The
latter
specified
by
@var
{
h
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_unmap_data
(
h_void
*
h
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
26
.
@end
table
@node
acc_deviceptr
@section
@code
{
acc_deviceptr
}
--
Get
device
pointer
associated
with
specific
host
address
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
the
device
address
that
has
been
mapped
to
the
host
address
specified
by
@var
{
h
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_deviceptr
(
h_void
*
h
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
27
.
@end
table
@node
acc_hostptr
@section
@code
{
acc_hostptr
}
--
Get
host
pointer
associated
with
specific
device
address
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
the
host
address
that
has
been
mapped
to
the
device
address
specified
by
@var
{
d
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_hostptr
(
d_void
*
d
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
28
.
@end
table
@node
acc_is_present
@section
@code
{
acc_is_present
}
--
Indicate
whether
host
variable
/
array
is
present
on
device
.
@table
@asis
@item
@emph
{
Description
}
This
function
indicates
whether
the
specified
host
address
in
@var
{
a
}
and
a
length
of
@var
{
len
}
bytes
is
present
on
the
device
.
In
C
/
C
++
,
a
non
-
zero
value
is
returned
to
indicate
the
presence
of
the
mapped
memory
on
the
device
.
A
zero
is
returned
to
indicate
the
memory
is
not
mapped
on
the
device
.
In
Fortran
,
two
(
2
)
forms
are
supported
.
In
the
first
form
,
@var
{
a
}
specifies
a
contiguous
array
section
.
The
second
form
@var
{
a
}
specifies
a
variable
or
array
element
and
@var
{
len
}
specifies
the
length
in
bytes
.
If
the
host
memory
is
mapped
to
device
memory
,
then
a
@code
{
true
}
is
returned
.
Otherwise
,
a
@code
{
false
}
is
return
to
indicate
the
mapped
memory
is
not
present
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
int
acc_is_present
(
h_void
*
a
,
size_t
len
);}
@end
multitable
@item
@emph
{
Fortran
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_is_present
(
a
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
logical
acc_is_present
}
@item
@emph
{
Interface
}
:
@tab
@code
{
function
acc_is_present
(
a
,
len
)}
@item
@tab
@code
{
type
,
dimension
(
:
[,
:
]...)
::
a
}
@item
@tab
@code
{
integer
len
}
@item
@tab
@code
{
logical
acc_is_present
}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
29
.
@end
table
@node
acc_memcpy_to_device
@section
@code
{
acc_memcpy_to_device
}
--
Copy
host
memory
to
device
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
copies
host
memory
specified
by
host
address
of
@var
{
src
}
to
device
memory
specified
by
the
device
address
@var
{
dest
}
for
a
length
of
@var
{
bytes
}
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_memcpy_to_device
(
d_void
*
dest
,
h_void
*
src
,
size_t
bytes
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
30
.
@end
table
@node
acc_memcpy_from_device
@section
@code
{
acc_memcpy_from_device
}
--
Copy
device
memory
to
host
memory
.
@table
@asis
@item
@emph
{
Description
}
This
function
copies
host
memory
specified
by
host
address
of
@var
{
src
}
from
device
memory
specified
by
the
device
address
@var
{
dest
}
for
a
length
of
@var
{
bytes
}
bytes
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_memcpy_from_device
(
d_void
*
dest
,
h_void
*
src
,
size_t
bytes
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
3
.
2
.
31
.
@end
table
@node
acc_get_current_cuda_device
@section
@code
{
acc_get_current_cuda_device
}
--
Get
CUDA
device
handle
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
the
CUDA
device
handle
.
This
handle
is
the
same
as
used
by
the
CUDA
Runtime
or
Driver
API
'
s
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
void
*
acc_get_current_cuda_device
(
void
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
A
.
2
.
1
.
1
.
@end
table
@node
acc_get_current_cuda_context
@section
@code
{
acc_get_current_cuda_context
}
--
Get
CUDA
context
handle
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
the
CUDA
context
handle
.
This
handle
is
the
same
as
used
by
the
CUDA
Runtime
or
Driver
API
'
s
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_get_current_cuda_context
(
void
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
A
.
2
.
1
.
2
.
@end
table
@node
acc_get_cuda_stream
@section
@code
{
acc_get_cuda_stream
}
--
Get
CUDA
stream
handle
.
@table
@asis
@item
@emph
{
Description
}
This
function
returns
the
CUDA
stream
handle
.
This
handle
is
the
same
as
used
by
the
CUDA
Runtime
or
Driver
API
'
s
.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_get_cuda_stream
(
void
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
A
.
2
.
1
.
3
.
@end
table
@node
acc_set_cuda_stream
@section
@code
{
acc_set_cuda_stream
}
--
Set
CUDA
stream
handle
.
@table
@asis
@item
@emph
{
Description
}
This
function
associates
the
stream
handle
specified
by
@var
{
stream
}
with
the
asynchronous
value
specified
by
@var
{
async
}.
@item
@emph
{
C
/
C
++
}
:
@multitable
@columnfractions
.
20
.
80
@item
@emph
{
Prototype
}
:
@tab
@code
{
acc_set_cuda_stream
(
int
async
void
*
stream
);}
@end
multitable
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
A
.
2
.
1
.
4
.
@end
table
@c
---------------------------------------------------------------------
@c
OpenACC
Environment
Variables
@c
---------------------------------------------------------------------
@node
OpenACC
Environment
Variables
@chapter
OpenACC
Environment
Variables
The
variables
@env
{
ACC_DEVICE_TYPE
}
and
@env
{
ACC_DEVICE_NUM
}
are
defined
by
section
4
of
the
OpenACC
specification
in
version
2
.
0
.
The
variable
@env
{
GCC_ACC_NOTIFY
}
is
used
for
diagnostic
purposes
.
@menu
*
ACC_DEVICE_TYPE
::
*
ACC_DEVICE_NUM
::
*
GCC_ACC_NOTIFY
::
@end
menu
@node
ACC_DEVICE_TYPE
@section
@code
{
ACC_DEVICE_TYPE
}
@table
@asis
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
4
.
1
.
@end
table
@node
ACC_DEVICE_NUM
@section
@code
{
ACC_DEVICE_NUM
}
@table
@asis
@item
@emph
{
Reference
}
:
@uref
{
http
:
//www.openacc.org/, OpenACC specification v2.0}, section
4
.
2
.
@end
table
@node
GCC_ACC_NOTIFY
@section
@code
{
GCC_ACC_NOTIFY
}
@table
@asis
@item
@emph
{
Description
}
:
Print
debug
information
pertaining
to
the
accelerator
.
@end
table
@c
---------------------------------------------------------------------
@c
CUDA
Streams
Usage
@c
---------------------------------------------------------------------
@node
CUDA
Streams
Usage
@chapter
CUDA
Streams
Usage
This
applies
to
the
@code
{
nvptx
}
plugin
only
.
The
library
provides
elements
that
perform
asynchronous
movement
of
data
and
asynchronous
operation
of
computing
constructs
.
This
asynchronous
functionality
is
implemented
by
making
use
of
CUDA
streams
@footnote
{
See
"Stream Management"
in
"CUDA Driver API"
,
TRM
-
06703
-
001
,
Version
5
.
5
,
for
additional
information
}.
The
primary
means
by
that
the
asychronous
functionality
is
accessed
is
through
the
use
of
those
OpenACC
directives
which
make
use
of
the
@code
{
async
}
and
@code
{
wait
}
clauses
.
When
the
@code
{
async
}
clause
is
first
used
with
a
directive
,
it
creates
a
CUDA
stream
.
If
an
@code
{
async
-
argument
}
is
used
with
the
@code
{
async
}
clause
,
then
the
stream
is
associated
with
the
specified
@code
{
async
-
argument
}.
Following
the
creation
of
an
association
between
a
CUDA
stream
and
the
@code
{
async
-
argument
}
of
an
@code
{
async
}
clause
,
both
the
@code
{
wait
}
clause
and
the
@code
{
wait
}
directive
can
be
used
.
When
either
the
clause
or
directive
is
used
after
stream
creation
,
it
creates
a
rendezvous
point
whereby
execution
waits
until
all
operations
associated
with
the
@code
{
async
-
argument
},
that
is
,
stream
,
have
completed
.
Normally
,
the
management
of
the
streams
that
are
created
as
a
result
of
using
the
@code
{
async
}
clause
,
is
done
without
any
intervention
by
the
caller
.
This
implies
the
association
between
the
@code
{
async
-
argument
}
and
the
CUDA
stream
will
be
maintained
for
the
lifetime
of
the
program
.
However
,
this
association
can
be
changed
through
the
use
of
the
library
function
@code
{
acc_set_cuda_stream
}.
When
the
function
@code
{
acc_set_cuda_stream
}
is
called
,
the
CUDA
stream
that
was
originally
associated
with
the
@code
{
async
}
clause
will
be
destroyed
.
Caution
should
be
taken
when
changing
the
association
as
subsequent
references
to
the
@code
{
async
-
argument
}
refer
to
a
different
CUDA
stream
.
@c
---------------------------------------------------------------------
@c
OpenACC
Library
Interoperability
@c
---------------------------------------------------------------------
@node
OpenACC
Library
Interoperability
@chapter
OpenACC
Library
Interoperability
@section
Introduction
The
OpenACC
library
uses
the
CUDA
Driver
API
,
and
may
interact
with
programs
that
use
the
Runtime
library
directly
,
or
another
library
based
on
the
Runtime
library
,
e
.
g
.,
CUBLAS
@footnote
{
See
section
2
.
26
,
"Interactions with the CUDA Driver API"
in
"CUDA Runtime API"
,
Version
5
.
5
,
and
section
2
.
27
,
"VDPAU
Interoperability"
,
in
"CUDA Driver API"
,
TRM
-
06703
-
001
,
Version
5
.
5
,
for
additional
information
on
library
interoperability
.}.
This
chapter
describes
the
use
cases
and
what
changes
are
required
in
order
to
use
both
the
OpenACC
library
and
the
CUBLAS
and
Runtime
libraries
within
a
program
.
@section
First
invocation
:
NVIDIA
CUBLAS
library
API
In
this
first
use
case
(
see
below
),
a
function
in
the
CUBLAS
library
is
called
prior
to
any
of
the
functions
in
the
OpenACC
library
.
More
specifically
,
the
function
@code{
cublasCreate
()
}
.
When
invoked
,
the
function
initializes
the
library
and
allocates
the
hardware
resources
on
the
host
and
the
device
on
behalf
of
the
caller
.
Once
the
initialization
and
allocation
has
completed
,
a
handle
is
returned
to
the
caller
.
The
OpenACC
library
also
requires
initialization
and
allocation
of
hardware
resources
.
Since
the
CUBLAS
library
has
already
allocated
the
hardware
resources
for
the
device
,
all
that
is
left
to
do
is
to
initialize
the
OpenACC
library
and
acquire
the
hardware
resources
on
the
host
.
Prior
to
calling
the
OpenACC
function
that
initializes
the
library
and
allocate
the
host
hardware
resources
,
you
need
to
acquire
the
device
number
that
was
allocated
during
the
call
to
@code{
cublasCreate
()
}
.
The
invoking
of
the
runtime
library
function
@code{
cudaGetDevice
()
}
accomplishes
this
.
Once
acquired
,
the
device
number
is
passed
along
with
the
device
type
as
parameters
to
the
OpenACC
library
function
@code{
acc_set_device_num
()
}
.
Once
the
call
to
@code{
acc_set_device_num
()
}
has
completed
,
the
OpenACC
library
uses
the
context
that
was
created
during
the
call
to
@code{
cublasCreate
()
}
.
In
other
words
,
both
libraries
will
be
sharing
the
same
context
.
@smallexample
/* Create the handle */
s
=
cublasCreate
(
&
h
)
;
if
(
s
!=
CUBLAS_STATUS_SUCCESS
)
@{
fprintf
(
stderr
,
"cublasCreate failed %d
\n
"
,
s
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Get the device number */
e
=
cudaGetDevice
(
&
dev
)
;
if
(
e
!=
cudaSuccess
)
@{
fprintf
(
stderr
,
"cudaGetDevice failed %d
\n
"
,
e
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Initialize OpenACC library and use device 'dev' */
acc_set_device_num
(
dev
,
acc_device_nvidia
)
;
@end
smallexample
@center
Use
Case
1
@section
First
invocation
:
OpenACC
library
API
In
this
second
use
case
(
see
below
),
a
function
in
the
OpenACC
library
is
called
prior
to
any
of
the
functions
in
the
CUBLAS
library
.
More
specificially
,
the
function
@code{
acc_set_device_num
()
}
.
In
the
use
case
presented
here
,
the
function
@code{
acc_set_device_num
()
}
is
used
to
both
initialize
the
OpenACC
library
and
allocate
the
hardware
resources
on
the
host
and
the
device
.
In
the
call
to
the
function
,
the
call
parameters
specify
which
device
to
use
and
what
device
type
to
use
,
i
.
e
.,
@code{
acc_device_nvidia
}
.
It
should
be
noted
that
this
is
but
one
method
to
initialize
the
OpenACC
library
and
allocate
the
appropriate
hardware
resources
.
Other
methods
are
available
through
the
use
of
environment
variables
and
these
will
be
discussed
in
the
next
section
.
Once
the
call
to
@code{
acc_set_device_num
()
}
has
completed
,
other
OpenACC
functions
can
be
called
as
seen
with
multiple
calls
being
made
to
@code{
acc_copyin
()
}
.
In
addition
,
calls
can
be
made
to
functions
in
the
CUBLAS
library
.
In
the
use
case
a
call
to
@code{
cublasCreate
()
}
is
made
subsequent
to
the
calls
to
@code{
acc_copyin
()
}
.
As
seen
in
the
previous
use
case
,
a
call
to
@code{
cublasCreate
()
}
initializes
the
CUBLAS
library
and
allocates
the
hardware
resources
on
the
host
and
the
device
.
However
,
since
the
device
has
already
been
allocated
,
@code{
cublasCreate
()
}
will
only
initialize
the
CUBLAS
library
and
allocate
the
appropriate
hardware
resources
on
the
host
.
The
context
that
was
created
as
part
of
the
OpenACC
initialization
is
shared
with
the
CUBLAS
library
,
similarly
to
the
first
use
case
.
@smallexample
dev
=
0
;
acc_set_device_num
(
dev
,
acc_device_nvidia
)
;
/* Copy the first set to the device */
d_X
=
acc_copyin
(
&
h_X
[
0
],
N
*
sizeof
(
float
))
;
if
(
d_X
==
NULL
)
@{
fprintf
(
stderr
,
"copyin error h_X
\n
"
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Copy the second set to the device */
d_Y
=
acc_copyin
(
&
h_Y1
[
0
],
N
*
sizeof
(
float
))
;
if
(
d_Y
==
NULL
)
@{
fprintf
(
stderr
,
"copyin error h_Y1
\n
"
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Create the handle */
s
=
cublasCreate
(
&
h
)
;
if
(
s
!=
CUBLAS_STATUS_SUCCESS
)
@{
fprintf
(
stderr
,
"cublasCreate failed %d
\n
"
,
s
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Perform saxpy using CUBLAS library function */
s
=
cublasSaxpy
(
h
,
N
,
&
alpha
,
d_X
,
1
,
d_Y
,
1
)
;
if
(
s
!=
CUBLAS_STATUS_SUCCESS
)
@{
fprintf
(
stderr
,
"cublasSaxpy failed %d
\n
"
,
s
)
;
exit
(
EXIT_FAILURE
)
;
@
}
/* Copy the results from the device */
acc_memcpy_from_device
(
&
h_Y1
[
0
],
d_Y
,
N
*
sizeof
(
float
))
;
@end
smallexample
@center
Use
Case
2
@section
OpenACC
library
and
environment
variables
There
are
two
environment
variables
associated
with
the
OpenACC
library
that
may
be
used
to
control
the
device
type
and
device
number
:
@env{
ACC_DEVICE_TYPE
}
and
@env{
ACC_DEVICE_NUM
}
,
respecively
.
These
two
environement
variables
can
be
used
as
an
alternative
to
calling
@code{
acc_set_device_num
()
}
.
As
seen
in
the
second
use
case
,
the
device
type
and
device
number
were
specified
using
@code{
acc_set_device_num
()
}
.
If
however
,
the
aforementioned
environment
variables
were
set
,
then
the
call
to
@code{
acc_set_device_num
()
}
would
not
be
required
.
The
use
of
the
environment
variables
is
only
relevant
when
an
OpenACC
function
is
called
prior
to
a
call
to
@code{
cudaCreate
()
}
.
If
@code{
cudaCreate
()
}
is
called
prior
to
a
call
to
an
OpenACC
function
,
then
you
must
call
@code{
acc_set_device_num
()
}@footnote{
More
complete
information
about
@env{
ACC_DEVICE_TYPE
}
and
@env{
ACC_DEVICE_NUM
}
can
be
found
in
sections
4
.
1
and
4
.
2
of
the
@uref{
http
:
//www.openacc.org/, OpenACC}
Application
Programming
Interface
”
,
Version
2
.
0
.
}
@c
---------------------------------------------------------------------
@c
The
libgomp
ABI
@c
---------------------------------------------------------------------
...
...
@@ -1814,6 +3090,7 @@ presented by libgomp. Only maintainers should need them.
*
Implementing
ORDERED
construct
::
*
Implementing
SECTIONS
construct
::
*
Implementing
SINGLE
construct
::
*
Implementing
OpenACC
'
s
PARALLEL
construct
::
@end
menu
...
...
@@ -2178,6 +3455,15 @@ becomes
@node
Implementing
OpenACC
'
s
PARALLEL
construct
@section
Implementing
OpenACC
'
s
PARALLEL
construct
@smallexample
void
GOACC_parallel
()
@end
smallexample
@c
---------------------------------------------------------------------
@c
Reporting
Bugs
@c
---------------------------------------------------------------------
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment