Commit 2eab9666 by Ilya Verbin

backport: Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host and libmyo-client.

Merge liboffloadmic from upstream, version 20150803.

liboffloadmic/
	* Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host
	and libmyo-client.  liboffloadmic_host loads them dynamically.
	* Makefile.in: Regenerate.
	* doc/doxygen/header.tex: Merge from upstream, version 20150803
	<https://openmprtl.org/sites/default/files/liboffload_oss_20150803.tgz>.
	* runtime/cean_util.cpp: Likewise.
	* runtime/cean_util.h: Likewise.
	* runtime/coi/coi_client.cpp: Likewise.
	* runtime/coi/coi_client.h: Likewise.
	* runtime/coi/coi_server.cpp: Likewise.
	* runtime/coi/coi_server.h: Likewise.
	* runtime/compiler_if_host.cpp: Likewise.
	* runtime/compiler_if_host.h: Likewise.
	* runtime/compiler_if_target.cpp: Likewise.
	* runtime/compiler_if_target.h: Likewise.
	* runtime/dv_util.cpp: Likewise.
	* runtime/dv_util.h: Likewise.
	* runtime/liboffload_error.c: Likewise.
	* runtime/liboffload_error_codes.h: Likewise.
	* runtime/liboffload_msg.c: Likewise.
	* runtime/liboffload_msg.h: Likewise.
	* runtime/mic_lib.f90: Likewise.
	* runtime/offload.h: Likewise.
	* runtime/offload_common.cpp: Likewise.
	* runtime/offload_common.h: Likewise.
	* runtime/offload_engine.cpp: Likewise.
	* runtime/offload_engine.h: Likewise.
	* runtime/offload_env.cpp: Likewise.
	* runtime/offload_env.h: Likewise.
	* runtime/offload_host.cpp: Likewise.
	* runtime/offload_host.h: Likewise.
	* runtime/offload_iterator.h: Likewise.
	* runtime/offload_myo_host.cpp: Likewise.
	* runtime/offload_myo_host.h: Likewise.
	* runtime/offload_myo_target.cpp: Likewise.
	* runtime/offload_myo_target.h: Likewise.
	* runtime/offload_omp_host.cpp: Likewise.
	* runtime/offload_omp_target.cpp: Likewise.
	* runtime/offload_orsl.cpp: Likewise.
	* runtime/offload_orsl.h: Likewise.
	* runtime/offload_table.cpp: Likewise.
	* runtime/offload_table.h: Likewise.
	* runtime/offload_target.cpp: Likewise.
	* runtime/offload_target.h: Likewise.
	* runtime/offload_target_main.cpp: Likewise.
	* runtime/offload_timer.h: Likewise.
	* runtime/offload_timer_host.cpp: Likewise.
	* runtime/offload_timer_target.cpp: Likewise.
	* runtime/offload_trace.cpp: Likewise.
	* runtime/offload_trace.h: Likewise.
	* runtime/offload_util.cpp: Likewise.
	* runtime/offload_util.h: Likewise.
	* runtime/ofldbegin.cpp: Likewise.
	* runtime/ofldend.cpp: Likewise.
	* runtime/orsl-lite/include/orsl-lite.h: Likewise.
	* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
	* runtime/use_mpss2.txt: Likewise.
	* include/coi/common/COIEngine_common.h: Merge from upstream, MPSS
	version 3.5
	<http://registrationcenter.intel.com/irc_nas/7445/mpss-src-3.5.tar>.
	* include/coi/common/COIEvent_common.h: Likewise.
	* include/coi/common/COIMacros_common.h: Likewise.
	* include/coi/common/COIPerf_common.h: Likewise.
	* include/coi/common/COIResult_common.h: Likewise.
	* include/coi/common/COISysInfo_common.h: Likewise.
	* include/coi/common/COITypes_common.h: Likewise.
	* include/coi/sink/COIBuffer_sink.h: Likewise.
	* include/coi/sink/COIPipeline_sink.h: Likewise.
	* include/coi/sink/COIProcess_sink.h: Likewise.
	* include/coi/source/COIBuffer_source.h: Likewise.
	* include/coi/source/COIEngine_source.h: Likewise.
	* include/coi/source/COIEvent_source.h: Likewise.
	* include/coi/source/COIPipeline_source.h: Likewise.
	* include/coi/source/COIProcess_source.h: Likewise.
	* include/myo/myo.h: Likewise.
	* include/myo/myoimpl.h: Likewise.
	* include/myo/myotypes.h: Likewise.
	* plugin/Makefile.am (myo_inc_dir): Remove.
	(libgomp_plugin_intelmic_la_CPPFLAGS): Do not define MYO_SUPPORT.
	(AM_CPPFLAGS): Likewise for offload_target_main.
	* plugin/Makefile.in: Regenerate.
	* runtime/emulator/coi_common.h: Update copyright years.
	(OFFLOAD_EMUL_KNC_NUM_ENV): Replace with ...
	(OFFLOAD_EMUL_NUM_ENV): ... this.
	(enum cmd_t): Add CMD_CLOSE_LIBRARY.
	* runtime/emulator/coi_device.cpp: Update copyright years.
	(COIProcessWaitForShutdown): Add space between string constants.
	Return handle to host in CMD_OPEN_LIBRARY.
	Support CMD_CLOSE_LIBRARY.
	* runtime/emulator/coi_device.h: Update copyright years.
	* runtime/emulator/coi_host.cpp: Update copyright years.
	(knc_engines_num): Replace with ...
	(num_engines): ... this.
	(init): Replace OFFLOAD_EMUL_KNC_NUM_ENV with OFFLOAD_EMUL_NUM_ENV.
	(COIEngineGetCount): Replace COI_ISA_KNC with COI_ISA_MIC, and
	knc_engines_num with num_engines.
	(COIEngineGetHandle): Likewise.
	(COIProcessCreateFromMemory): Add space between string constants.
	(COIProcessCreateFromFile): New function.
	(COIProcessLoadLibraryFromMemory): Rename arguments according to
	COIProcess_source.h.  Return handle, received from target.
	(COIProcessUnloadLibrary): New function.
	(COIPipelineClearCPUMask): New function.
	(COIPipelineSetCPUMask): New function.
	(COIEngineGetInfo): New function.
	* runtime/emulator/coi_host.h: Update copyright years.
	* runtime/emulator/coi_version_asm.h: Regenerate.
	* runtime/emulator/coi_version_linker_script.map: Regenerate.
	* runtime/emulator/myo_client.cpp: Update copyright years.
	* runtime/emulator/myo_service.cpp: Update copyright years.
	(myoArenaRelease): New function.
	(myoArenaAcquire): New function.
	(myoArenaAlignedFree): New function.
	(myoArenaAlignedMalloc): New function.
	* runtime/emulator/myo_service.h: Update copyright years.
	* runtime/emulator/myo_version_asm.h: Regenerate.
	* runtime/emulator/myo_version_linker_script.map: Regenerate.

From-SVN: r227532
parent 761f8e2f
2015-09-08 Ilya Verbin <ilya.verbin@intel.com>
* Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host
and libmyo-client. liboffloadmic_host loads them dynamically.
* Makefile.in: Regenerate.
* doc/doxygen/header.tex: Merge from upstream, version 20150803
<https://openmprtl.org/sites/default/files/liboffload_oss_20150803.tgz>.
* runtime/cean_util.cpp: Likewise.
* runtime/cean_util.h: Likewise.
* runtime/coi/coi_client.cpp: Likewise.
* runtime/coi/coi_client.h: Likewise.
* runtime/coi/coi_server.cpp: Likewise.
* runtime/coi/coi_server.h: Likewise.
* runtime/compiler_if_host.cpp: Likewise.
* runtime/compiler_if_host.h: Likewise.
* runtime/compiler_if_target.cpp: Likewise.
* runtime/compiler_if_target.h: Likewise.
* runtime/dv_util.cpp: Likewise.
* runtime/dv_util.h: Likewise.
* runtime/liboffload_error.c: Likewise.
* runtime/liboffload_error_codes.h: Likewise.
* runtime/liboffload_msg.c: Likewise.
* runtime/liboffload_msg.h: Likewise.
* runtime/mic_lib.f90: Likewise.
* runtime/offload.h: Likewise.
* runtime/offload_common.cpp: Likewise.
* runtime/offload_common.h: Likewise.
* runtime/offload_engine.cpp: Likewise.
* runtime/offload_engine.h: Likewise.
* runtime/offload_env.cpp: Likewise.
* runtime/offload_env.h: Likewise.
* runtime/offload_host.cpp: Likewise.
* runtime/offload_host.h: Likewise.
* runtime/offload_iterator.h: Likewise.
* runtime/offload_myo_host.cpp: Likewise.
* runtime/offload_myo_host.h: Likewise.
* runtime/offload_myo_target.cpp: Likewise.
* runtime/offload_myo_target.h: Likewise.
* runtime/offload_omp_host.cpp: Likewise.
* runtime/offload_omp_target.cpp: Likewise.
* runtime/offload_orsl.cpp: Likewise.
* runtime/offload_orsl.h: Likewise.
* runtime/offload_table.cpp: Likewise.
* runtime/offload_table.h: Likewise.
* runtime/offload_target.cpp: Likewise.
* runtime/offload_target.h: Likewise.
* runtime/offload_target_main.cpp: Likewise.
* runtime/offload_timer.h: Likewise.
* runtime/offload_timer_host.cpp: Likewise.
* runtime/offload_timer_target.cpp: Likewise.
* runtime/offload_trace.cpp: Likewise.
* runtime/offload_trace.h: Likewise.
* runtime/offload_util.cpp: Likewise.
* runtime/offload_util.h: Likewise.
* runtime/ofldbegin.cpp: Likewise.
* runtime/ofldend.cpp: Likewise.
* runtime/orsl-lite/include/orsl-lite.h: Likewise.
* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
* runtime/use_mpss2.txt: Likewise.
* include/coi/common/COIEngine_common.h: Merge from upstream, MPSS
version 3.5
<http://registrationcenter.intel.com/irc_nas/7445/mpss-src-3.5.tar>.
* include/coi/common/COIEvent_common.h: Likewise.
* include/coi/common/COIMacros_common.h: Likewise.
* include/coi/common/COIPerf_common.h: Likewise.
* include/coi/common/COIResult_common.h: Likewise.
* include/coi/common/COISysInfo_common.h: Likewise.
* include/coi/common/COITypes_common.h: Likewise.
* include/coi/sink/COIBuffer_sink.h: Likewise.
* include/coi/sink/COIPipeline_sink.h: Likewise.
* include/coi/sink/COIProcess_sink.h: Likewise.
* include/coi/source/COIBuffer_source.h: Likewise.
* include/coi/source/COIEngine_source.h: Likewise.
* include/coi/source/COIEvent_source.h: Likewise.
* include/coi/source/COIPipeline_source.h: Likewise.
* include/coi/source/COIProcess_source.h: Likewise.
* include/myo/myo.h: Likewise.
* include/myo/myoimpl.h: Likewise.
* include/myo/myotypes.h: Likewise.
* plugin/Makefile.am (myo_inc_dir): Remove.
(libgomp_plugin_intelmic_la_CPPFLAGS): Do not define MYO_SUPPORT.
(AM_CPPFLAGS): Likewise for offload_target_main.
* plugin/Makefile.in: Regenerate.
* runtime/emulator/coi_common.h: Update copyright years.
(OFFLOAD_EMUL_KNC_NUM_ENV): Replace with ...
(OFFLOAD_EMUL_NUM_ENV): ... this.
(enum cmd_t): Add CMD_CLOSE_LIBRARY.
* runtime/emulator/coi_device.cpp: Update copyright years.
(COIProcessWaitForShutdown): Add space between string constants.
Return handle to host in CMD_OPEN_LIBRARY.
Support CMD_CLOSE_LIBRARY.
* runtime/emulator/coi_device.h: Update copyright years.
* runtime/emulator/coi_host.cpp: Update copyright years.
(knc_engines_num): Replace with ...
(num_engines): ... this.
(init): Replace OFFLOAD_EMUL_KNC_NUM_ENV with OFFLOAD_EMUL_NUM_ENV.
(COIEngineGetCount): Replace COI_ISA_KNC with COI_ISA_MIC, and
knc_engines_num with num_engines.
(COIEngineGetHandle): Likewise.
(COIProcessCreateFromMemory): Add space between string constants.
(COIProcessCreateFromFile): New function.
(COIProcessLoadLibraryFromMemory): Rename arguments according to
COIProcess_source.h. Return handle, received from target.
(COIProcessUnloadLibrary): New function.
(COIPipelineClearCPUMask): New function.
(COIPipelineSetCPUMask): New function.
(COIEngineGetInfo): New function.
* runtime/emulator/coi_host.h: Update copyright years.
* runtime/emulator/coi_version_asm.h: Regenerate.
* runtime/emulator/coi_version_linker_script.map: Regenerate.
* runtime/emulator/myo_client.cpp: Update copyright years.
* runtime/emulator/myo_service.cpp: Update copyright years.
(myoArenaRelease): New function.
(myoArenaAcquire): New function.
(myoArenaAlignedFree): New function.
(myoArenaAlignedMalloc): New function.
* runtime/emulator/myo_service.h: Update copyright years.
* runtime/emulator/myo_version_asm.h: Regenerate.
* runtime/emulator/myo_version_linker_script.map: Regenerate.
2015-08-24 Nathan Sidwell <nathan@codesourcery.com>
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_version): New.
......@@ -17,11 +137,11 @@
* configure: Reflects renaming of configure.in to configure.ac
2015-07-17 Nathan Sidwell <nathan@acm.org>
Ilya Verbin <iverbin@gmail.com>
Ilya Verbin <ilya.verbin@intel.com>
* plugin/libgomp-plugin-intelmic.cpp (ImgDevAddrMap): Constify.
(offload_image, GOMP_OFFLOAD_load_image,
OMP_OFFLOAD_unload_image): Constify target data.
GOMP_OFFLOAD_unload_image): Constify target data.
2015-07-08 Thomas Schwinge <thomas@codesourcery.com>
......
......@@ -84,8 +84,6 @@ liboffloadmic_host_la_SOURCES = $(liboffloadmic_sources) \
liboffloadmic_host_la_CPPFLAGS = $(liboffloadmic_cppflags) -DHOST_LIBRARY=1
liboffloadmic_host_la_LDFLAGS = @lt_cv_dlopen_libs@ -version-info 5:0:0
liboffloadmic_host_la_LIBADD = libcoi_host.la libmyo-client.la
liboffloadmic_host_la_DEPENDENCIES = $(liboffloadmic_host_la_LIBADD)
liboffloadmic_target_la_SOURCES = $(liboffloadmic_sources) \
runtime/coi/coi_server.cpp \
......
......@@ -165,6 +165,7 @@ libmyo_service_la_LINK = $(LIBTOOL) --tag=CXX $(AM_LIBTOOLFLAGS) \
$(CXXFLAGS) $(libmyo_service_la_LDFLAGS) $(LDFLAGS) -o $@
@LIBOFFLOADMIC_HOST_FALSE@am_libmyo_service_la_rpath = -rpath \
@LIBOFFLOADMIC_HOST_FALSE@ $(toolexeclibdir)
liboffloadmic_host_la_LIBADD =
am__objects_1 = liboffloadmic_host_la-dv_util.lo \
liboffloadmic_host_la-liboffload_error.lo \
liboffloadmic_host_la-liboffload_msg.lo \
......@@ -445,8 +446,6 @@ liboffloadmic_host_la_SOURCES = $(liboffloadmic_sources) \
liboffloadmic_host_la_CPPFLAGS = $(liboffloadmic_cppflags) -DHOST_LIBRARY=1
liboffloadmic_host_la_LDFLAGS = @lt_cv_dlopen_libs@ -version-info 5:0:0
liboffloadmic_host_la_LIBADD = libcoi_host.la libmyo-client.la
liboffloadmic_host_la_DEPENDENCIES = $(liboffloadmic_host_la_LIBADD)
liboffloadmic_target_la_SOURCES = $(liboffloadmic_sources) \
runtime/coi/coi_server.cpp \
runtime/compiler_if_target.cpp \
......
......@@ -82,7 +82,7 @@ Notice revision \#20110804
Intel, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.
This document is Copyright \textcopyright 2014, Intel Corporation. All rights reserved.
This document is Copyright \textcopyright 2014-2015, Intel Corporation. All rights reserved.
\pagenumbering{roman}
\tableofcontents
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -64,7 +64,7 @@ extern "C" {
///////////////////////////////////////////////////////////////////////////////
///
/// List of ISA types of supported engines.
/// List of ISA types of supported engines.
///
typedef enum
{
......@@ -89,7 +89,7 @@ typedef enum
/// [out] The zero-based index of this engine in the collection of
/// engines of the ISA returned in out_pType.
///
/// @return COI_INVALID_POINTER if the any of the parameters are NULL.
/// @return COI_INVALID_POINTER if any of the parameters are NULL.
///
/// @return COI_SUCCESS
///
......
/*
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, version 2.1.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* Disclaimer: The codes contained in these modules may be specific
* to the Intel Software Development Platform codenamed Knights Ferry,
* and the Intel product codenamed Knights Corner, and are not backward
* compatible with other Intel products. Additionally, Intel will NOT
* support the codes or instruction set in future products.
*
* Intel offers no warranty of any kind regarding the code. This code is
* licensed on an "AS IS" basis and Intel is not obligated to provide
* any support, assistance, installation, training, or other services
* of any kind. Intel is also not obligated to provide any updates,
* enhancements or extensions. Intel specifically disclaims any warranty
* of merchantability, non-infringement, fitness for any particular
* purpose, and any other warranty.
*
* Further, Intel disclaims all liability of any kind, including but
* not limited to liability for infringement of any proprietary rights,
* relating to the use of the code, even if Intel is notified of the
* possibility of such liability. Except as expressly stated in an Intel
* license agreement provided with this code and agreed upon with Intel,
* no license, express or implied, by estoppel or otherwise, to any
* intellectual property rights is granted herein.
*/
#ifndef _COIEVENT_COMMON_H
#define _COIEVENT_COMMON_H
/** @ingroup COIEvent
* @addtogroup COIEventcommon
@{
* @file common/COIEvent_common.h
*/
#ifndef DOXYGEN_SHOULD_SKIP_THIS
#include "../common/COITypes_common.h"
#include "../common/COIResult_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#endif // DOXYGEN_SHOULD_SKIP_THIS
///////////////////////////////////////////////////////////////////////////////
///
/// Signal one shot user event. User events created on source can be
/// signaled from both sink and source. This fires the event and wakes up
/// threads waiting on COIEventWait.
///
/// Note: For events that are not registered or already signaled this call
/// will behave as a NOP. Users need to make sure that they pass valid
/// events on the sink side.
///
/// @param in_Event
/// Event Handle to be signaled.
///
/// @return COI_INVALID_HANDLE if in_Event was not a User event.
///
/// @return COI_ERROR if the signal fails to be sent from the sink.
///
/// @return COI_SUCCESS if the event was successfully signaled or ignored.
///
COIACCESSAPI
COIRESULT COIEventSignalUserEvent(COIEVENT in_Event);
///
///
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* _COIEVENT_COMMON_H */
/*! @} */
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -41,12 +41,17 @@
#ifndef _COIMACROS_COMMON_H
#define _COIMACROS_COMMON_H
#include <string.h>
#include "../source/COIPipeline_source.h"
#include "../common/COITypes_common.h"
/// @file common/COIMacros_common.h
/// Commonly used macros
// Note that UNUSED_ATTR means that it is "possibly" unused, not "definitely".
// This should compile out in release mode if indeed it is unused.
#define UNUSED_ATTR __attribute__((unused))
#include <sched.h>
#ifndef UNREFERENCED_CONST_PARAM
#define UNREFERENCED_CONST_PARAM(P) { void* x UNUSED_ATTR = \
(void*)(uint64_t)P; \
......@@ -66,4 +71,150 @@
#endif
/* The following are static inline definitions of functions used for manipulating
COI_CPU_MASK info (the COI_CPU_MASK type is declared as an array of 16 uint64_t's
in COITypes_common.h: "typedef uint64_t COI_CPU_MASK[16]").
These static inline functions are intended to be roughly the same as the Linux
CPU_* macros defined in sched.h, with one important difference in the
fundamental type: cpu_set_t versus COI_CPU_MASK.
The motivation for writing this code was to ease portability of the host side of COI
applications to both Windows and Linux.
*/
/* Roughly equivalent to CPU_ISSET(). */
static inline uint64_t COI_CPU_MASK_ISSET(int bitNumber, const COI_CPU_MASK cpu_mask)
{
if ((size_t)bitNumber < sizeof(COI_CPU_MASK)*8)
return ((cpu_mask)[bitNumber/64] & (((uint64_t)1) << (bitNumber%64)));
return 0;
}
/* Roughly equivalent to CPU_SET(). */
static inline void COI_CPU_MASK_SET(int bitNumber, COI_CPU_MASK cpu_mask)
{
if ((size_t)bitNumber < sizeof(COI_CPU_MASK)*8)
((cpu_mask)[bitNumber/64] |= (((uint64_t)1) << (bitNumber%64)));
}
/* Roughly equivalent to CPU_ZERO(). */
static inline void COI_CPU_MASK_ZERO(COI_CPU_MASK cpu_mask)
{
memset(cpu_mask,0,sizeof(COI_CPU_MASK));
}
/* Roughly equivalent to CPU_AND(). */
static inline void COI_CPU_MASK_AND(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] & src2[i];
}
/* Roughly equivalent to CPU_XOR(). */
static inline void COI_CPU_MASK_XOR(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] ^ src2[i];
}
/* Roughly equivalent to CPU_OR(). */
static inline void COI_CPU_MASK_OR(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] | src2[i];
}
/* Utility function for COI_CPU_MASK_COUNT() below. */
static inline int __COI_CountBits(uint64_t n)
{
int cnt=0;
for (;n;cnt++)
n &= (n-1);
return cnt;
}
/* Roughly equivalent to CPU_COUNT(). */
static inline int COI_CPU_MASK_COUNT(const COI_CPU_MASK cpu_mask)
{
int cnt=0;
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(cpu_mask[0]);
for(unsigned int i=0;i < loopIterations;++i)
{
cnt += __COI_CountBits(cpu_mask[i]);
}
return cnt;
}
/* Roughly equivalent to CPU_EQUAL(). */
static inline int COI_CPU_MASK_EQUAL(const COI_CPU_MASK cpu_mask1,const COI_CPU_MASK cpu_mask2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(cpu_mask1[0]);
for(unsigned int i=0;i < loopIterations;++i)
{
if (cpu_mask1[i] != cpu_mask2[i])
return 0;
}
return 1;
}
/* Utility function to translate from cpu_set * to COI_CPU_MASK. */
static inline void COI_CPU_MASK_XLATE(COI_CPU_MASK dest,const cpu_set_t *src)
{
COI_CPU_MASK_ZERO(dest);
#if 0
/* Slightly slower version than the following #else/#endif block. Left here only to
document the intent of the code. */
for(unsigned int i=0;i < sizeof(cpu_set_t)*8;++i)
if (CPU_ISSET(i,src))
COI_CPU_MASK_SET(i,dest);
#else
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)/sizeof(dest[0]);++i)
{
for(unsigned int j=0;j < 64;++j)
{
if (CPU_ISSET(i*64+j,src))
dest[i] |= ((uint64_t)1) << j;
}
}
#endif
}
/* Utility function to translate from COI_CPU_MASK to cpu_set *. */
static inline void COI_CPU_MASK_XLATE_EX(cpu_set_t *dest,const COI_CPU_MASK src)
{
CPU_ZERO(dest);
#if 0
/* Slightly slower version than the following #else/#endif block. Left here only to
document the intent of the code. */
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)*8;++i)
if (COI_CPU_MASK_ISSET(i,src))
CPU_SET(i,dest);
#else
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)/sizeof(src[0]);++i)
{
const uint64_t cpu_mask = src[i];
for(unsigned int j=0;j < 64;++j)
{
const uint64_t bit = ((uint64_t)1) << j;
if (bit & cpu_mask)
CPU_SET(i*64+j,dest);
}
}
#endif
}
#endif /* _COIMACROS_COMMON_H */
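The mask helpers added to COIMacros_common.h can be illustrated with a small standalone sketch. It is not part of the commit: the `demo_` names are hypothetical stand-ins that mirror the shape of `COI_CPU_MASK` (16 x 64 bits) and the SET/ISSET/ZERO/COUNT helpers, so that the block compiles without the COI headers. The only deliberate difference is the explicit check for negative bit numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for COI_CPU_MASK: 16 words of 64 bits = 1024 CPUs. */
typedef uint64_t demo_cpu_mask[16];

/* Clear every bit, like COI_CPU_MASK_ZERO(). */
static void demo_mask_zero(demo_cpu_mask m)
{
    memset(m, 0, sizeof(demo_cpu_mask));
}

/* Set one bit; out-of-range bit numbers are silently ignored. */
static void demo_mask_set(int bit, demo_cpu_mask m)
{
    if (bit >= 0 && (size_t)bit < sizeof(demo_cpu_mask) * 8)
        m[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Return 1 if the bit is set, 0 otherwise (including out of range). */
static int demo_mask_isset(int bit, const demo_cpu_mask m)
{
    if (bit >= 0 && (size_t)bit < sizeof(demo_cpu_mask) * 8)
        return (int)((m[bit / 64] >> (bit % 64)) & 1);
    return 0;
}

/* Count set bits in one word; each iteration clears the lowest set bit. */
static int demo_count_bits(uint64_t n)
{
    int cnt = 0;
    for (; n; cnt++)
        n &= n - 1;
    return cnt;
}

/* Count set bits across the whole mask, like COI_CPU_MASK_COUNT(). */
static int demo_mask_count(const demo_cpu_mask m)
{
    int cnt = 0;
    for (size_t i = 0; i < sizeof(demo_cpu_mask) / sizeof(m[0]); ++i)
        cnt += demo_count_bits(m[i]);
    return cnt;
}

/* Usage: select CPUs 0 and 65, then verify the word/bit layout. */
static void demo_mask_example(void)
{
    demo_cpu_mask m;
    demo_mask_zero(m);
    demo_mask_set(0, m);
    demo_mask_set(65, m);          /* word 1, bit 1 */
    assert(m[0] == 1 && m[1] == 2);
    assert(demo_mask_count(m) == 2);
}
```

Bit number n lives in word n/64 at position n%64, which is why the translation helpers in the header walk the mask as 16 words of 64 bits each.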
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -110,12 +110,13 @@ typedef enum COIRESULT
///< Offload Infrastructure on the host
///< is not compatible with the version
///< on the device.
COI_BAD_PORT, ///< The port that the host is set to
COI_BAD_PORT, ///< The port that the host is set to
///< connect to is invalid.
COI_AUTHENTICATION_FAILURE, ///< The daemon was unable to authenticate
///< the user that requested an engine.
///< Only reported if daemon is set up for
///< authorization.
///< authorization. Is also reported in
///< Windows if host can not find user.
COI_NUM_RESULTS ///< Reserved, do not use.
}
COIRESULT;
......
/*
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, version 2.1.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* Disclaimer: The codes contained in these modules may be specific
* to the Intel Software Development Platform codenamed Knights Ferry,
* and the Intel product codenamed Knights Corner, and are not backward
* compatible with other Intel products. Additionally, Intel will NOT
* support the codes or instruction set in future products.
*
* Intel offers no warranty of any kind regarding the code. This code is
* licensed on an "AS IS" basis and Intel is not obligated to provide
* any support, assistance, installation, training, or other services
* of any kind. Intel is also not obligated to provide any updates,
* enhancements or extensions. Intel specifically disclaims any warranty
* of merchantability, non-infringement, fitness for any particular
* purpose, and any other warranty.
*
* Further, Intel disclaims all liability of any kind, including but
* not limited to liability for infringement of any proprietary rights,
* relating to the use of the code, even if Intel is notified of the
* possibility of such liability. Except as expressly stated in an Intel
* license agreement provided with this code and agreed upon with Intel,
* no license, express or implied, by estoppel or otherwise, to any
* intellectual property rights is granted herein.
*/
#ifndef _COISYSINFO_COMMON_H
#define _COISYSINFO_COMMON_H
/** @ingroup COISysInfo
* @addtogroup COISysInfoCommon
@{
* @file common/COISysInfo_common.h
* This interface allows developers to query the platform for system level
* information. */
#ifndef DOXYGEN_SHOULD_SKIP_THIS
#include "../common/COITypes_common.h"
#include <assert.h>
#include <string.h>
#ifdef __cplusplus
extern "C" {
#endif
#endif // DOXYGEN_SHOULD_SKIP_THIS
#define INITIAL_APIC_ID_BITS 0xFF000000 // EBX[31:24] unique APIC ID
///////////////////////////////////////////////////////////////////////////////
/// \fn uint32_t COISysGetAPICID(void)
/// @return The Advanced Programmable Interrupt Controller (APIC) ID of
/// the hardware thread on which the caller is running.
///
/// @warning APIC IDs are unique to each hardware thread within a processor,
/// but may not be sequential.
COIACCESSAPI
uint32_t COISysGetAPICID(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of cores exposed by the processor on which the caller is
/// running. Returns 0 if there is an error loading the processor info.
COIACCESSAPI
uint32_t COISysGetCoreCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of hardware threads exposed by the processor on which
/// the caller is running. Returns 0 if there is an error loading processor
/// info.
COIACCESSAPI
uint32_t COISysGetHardwareThreadCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the hardware thread on which the caller is running.
///
/// The indexes of neighboring hardware threads will differ by a value of one
/// and are within the range zero through COISysGetHardwareThreadCount()-1.
/// Returns ((uint32_t)-1) if there was an error loading processor info.
COIACCESSAPI
uint32_t COISysGetHardwareThreadIndex(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the core on which the caller is running.
///
/// The indexes of neighboring cores will differ by a value of one and are
/// within the range zero through COISysGetCoreCount()-1. Returns ((uint32_t)-1)
/// if there was an error loading processor info.
COIACCESSAPI
uint32_t COISysGetCoreIndex(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of level 2 caches within the processor on which the
/// caller is running. Returns ((uint32_t)-1) if there was an error loading
/// processor info.
COIACCESSAPI
uint32_t COISysGetL2CacheCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the level 2 cache on which the caller is running.
/// Returns ((uint32_t)-1) if there was an error loading processor info.
///
/// The indexes of neighboring cores will differ by a value of one and are
/// within the range zero through COISysGetL2CacheCount()-1.
COIACCESSAPI
uint32_t COISysGetL2CacheIndex(void);
#ifdef __cplusplus
} /* extern "C" */
#endif
/*! @} */
#endif /* _COISYSINFO_COMMON_H */
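The `INITIAL_APIC_ID_BITS` mask in COISysInfo_common.h selects EBX[31:24]. The sketch below shows how such a mask would extract an APIC ID from a raw EBX value. It is an assumption-laden illustration, not the COI implementation: the diff does not show how `COISysGetAPICID()` obtains EBX, and the `demo_`/`DEMO_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit pattern as INITIAL_APIC_ID_BITS in the header: EBX[31:24]. */
#define DEMO_INITIAL_APIC_ID_BITS 0xFF000000u

/* Mask off the top byte of EBX and shift it down to an 8-bit ID. */
static uint32_t demo_apic_id_from_ebx(uint32_t ebx)
{
    return (ebx & DEMO_INITIAL_APIC_ID_BITS) >> 24;
}

/* Usage: an EBX value whose top byte is 0x2A yields APIC ID 0x2A. */
static void demo_apic_id_example(void)
{
    assert(demo_apic_id_from_ebx(0x2A100800u) == 0x2Au);
}
```

Because the field is 8 bits wide, the result is always in the range 0 to 255. As the header warns, the IDs are unique per hardware thread but may not be sequential.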
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -73,8 +73,8 @@ typedef struct coimapinst * COIMAPINSTANCE;
typedef uint64_t COI_CPU_MASK[16];
/**
* On Windows, coi_wchar_t is a uint32_t. On Windows, wchar_t is 16 bits wide, and on Linux it is 32 bits wide, so uint32_t is used for portability.
/**
* On Windows, coi_wchar_t is a uint32_t. On Windows, wchar_t is 16 bits wide, and on Linux it is 32 bits wide, so uint32_t is used for portability.
*/
typedef wchar_t coi_wchar_t;
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -45,7 +45,7 @@
* @addtogroup COIBufferSink
@{
* @file sink\COIBuffer_sink.h
* @file sink\COIBuffer_sink.h
*/
#ifndef DOXYGEN_SHOULD_SKIP_THIS
#include "../common/COITypes_common.h"
......@@ -54,29 +54,29 @@
#ifdef __cplusplus
extern "C" {
#endif
#endif
//////////////////////////////////////////////////////////////////////////////
///
/// Adds a reference to the memory of a buffer. The memory of the buffer
/// will remain on the device until both a corresponding COIBufferReleaseRef()
/// Adds a reference to the memory of a buffer. The memory of the buffer
/// will remain on the device until both a corresponding COIBufferReleaseRef()
/// call is made and the run function that delivered the buffer returns.
///
/// Intel® Coprocessor Offload Infrastructure (Intel® COI) streaming buffers should not be AddRef'd. Doing so may result in
/// unpredictable results or may cause the sink process to crash.
/// Running this API in a thread spawned within the run function is not
/// supported and will cause unpredictable results and may cause data corruption.
///
/// @warning 1.It is possible for enqueued run functions to be unable to
/// execute due to all card memory being occupied by addref'ed
/// @warning 1.It is possible for enqueued run functions to be unable to
/// execute due to all card memory being occupied by AddRef'd
/// buffers. As such, it is important that whenever a buffer is
/// addref'd that there be no dependencies on future run functions
/// AddRef'd that there be no dependencies on future run functions
/// for progress to be made towards releasing the buffer.
/// 2.It is important that AddRef is called within the scope of
/// run function that carries the buffer to be addref'ed.
/// 2.It is important that AddRef is called within the scope of
/// run function that carries the buffer to be AddRef'd.
///
/// @param in_pBuffer
/// [in] Pointer to the start of a buffer being addref'ed, that was
/// [in] Pointer to the start of a buffer being AddRef'd, that was
/// passed in at the start of the run function.
///
///
/// @return COI_SUCCESS if the buffer ref count was successfully incremented.
///
/// @return COI_INVALID_POINTER if the buffer pointer is NULL.
......@@ -90,30 +90,33 @@ COIBufferAddRef(
//////////////////////////////////////////////////////////////////////////////
///
/// Removes a reference to the memory of a buffer. The memory of the buffer
/// Removes a reference to the memory of a buffer. The memory of the buffer
/// will be eligible for being freed on the device when the following
/// conditions are met: the run function that delivered the buffer
/// returns, and the number of calls to COIBufferReleaseRef() matches the
/// returns, and the number of calls to COIBufferReleaseRef() matches the
/// number of calls to COIBufferAddRef().
//
/// Running this API in a thread spawned within the run function is not
/// supported and will cause unpredictable results and may cause data corruption.
///
/// @warning When a buffer is addref'ed it is assumed that it is in use and all
/// @warning When a buffer is AddRef'd it is assumed that it is in use and all
/// other operations on that buffer wait for ReleaseRef() to happen.
/// So you cannot pass the addref'ed buffer's handle to RunFunction
/// that calls ReleaseRef(). This is a circular dependency and will
/// cause a deadlock. Buffer's pointer (buffer's sink side
/// So you cannot pass the AddRef'd buffer's handle to RunFunction
/// that calls ReleaseRef(). This is a circular dependency and will
/// cause a deadlock. Buffer's pointer (buffer's sink side
/// address/pointer which is different than source side BUFFER handle)
/// needs to be stored somewhere to retrieve it later to use in
/// needs to be stored somewhere to retrieve it later to use in
/// ReleaseRef.
///
/// @param in_pBuffer
/// [in] Pointer to the start of a buffer previously addref'ed, that
/// [in] Pointer to the start of a buffer previously AddRef'd, that
/// was passed in at the start of the run function.
///
///
/// @return COI_SUCCESS if the buffer refcount was successfully decremented.
///
/// @return COI_INVALID_POINTER if the buffer pointer was invalid.
///
/// @return COI_INVALID_HANDLE if the buffer did not have COIBufferAddRef()
/// @return COI_INVALID_HANDLE if the buffer did not have COIBufferAddRef()
/// previously called on it.
///
COIRESULT
......@@ -123,7 +126,7 @@ COIBufferReleaseRef(
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif
#endif /* _COIBUFFER_SINK_H */
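The AddRef/ReleaseRef rules documented above can be sketched as a small bookkeeping model. This is only an illustration of the stated conditions (a buffer becomes eligible for freeing once the delivering run function has returned and every AddRef has a matching ReleaseRef). The `Mock*` types and `mock_*` functions are hypothetical stand-ins and are not part of the COI API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes standing in for COIRESULT values. */
typedef enum {
    MOCK_SUCCESS,
    MOCK_INVALID_POINTER,
    MOCK_INVALID_HANDLE
} MockResult;

/* Hypothetical model of the sink-side bookkeeping for one buffer. */
typedef struct {
    unsigned refs;              /* AddRef calls not yet released        */
    bool run_function_returned; /* the delivering run function is done  */
} MockSinkBuffer;

MockResult mock_buffer_add_ref(MockSinkBuffer *buf)
{
    if (buf == NULL)
        return MOCK_INVALID_POINTER;
    buf->refs++;
    return MOCK_SUCCESS;
}

MockResult mock_buffer_release_ref(MockSinkBuffer *buf)
{
    if (buf == NULL)
        return MOCK_INVALID_POINTER;
    if (buf->refs == 0)         /* no AddRef was previously made */
        return MOCK_INVALID_HANDLE;
    buf->refs--;
    return MOCK_SUCCESS;
}

/* Both documented conditions must hold before the memory can be freed. */
bool mock_buffer_can_be_freed(const MockSinkBuffer *buf)
{
    return buf->run_function_returned && buf->refs == 0;
}
```

In this model a buffer that is still AddRef'd keeps occupying card memory even after its run function returns, which is the situation the first warning describes.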
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -63,10 +63,11 @@ extern "C" {
/// main() function from exiting until it is directed to by the source. When
/// the shutdown message is received this function will stop any future run
/// functions from executing but will wait for any current run functions to
/// complete. All Intel® Coprocessor Offload Infrastructure (Intel® COI) resources will be cleaned up and no additional Intel® Coprocessor Offload Infrastructure (Intel® COI) APIs
/// should be called after this function returns. This function does not
/// invoke exit() so the application can perform any of its own cleanup once
/// this call returns.
/// complete. All Intel® Coprocessor Offload Infrastructure (Intel® COI)
/// resources will be cleaned up and no additional Intel® Coprocessor Offload
/// Infrastructure (Intel® COI) APIs should be called after this function
/// returns. This function does not invoke exit() so the application
/// can perform any of its own cleanup once this call returns.
///
/// @return COI_SUCCESS once the process receives the shutdown message.
///
......@@ -86,8 +87,9 @@ COIProcessWaitForShutdown();
/// from this call.
///
/// @return COI_SUCCESS once the proxy output has been flushed to and written
/// written by the host. Note that Intel® Coprocessor Offload Infrastructure (Intel® COI) on the source writes to stdout
/// and stderr, but does not flush this output.
/// written by the host. Note that Intel® Coprocessor Offload
/// Infrastructure (Intel® COI) on the source writes to stdout and
/// stderr, but does not flush this output.
/// @return COI_SUCCESS if the process was created without enabling
/// proxy IO this function.
///
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -75,7 +75,7 @@ typedef enum
///////////////////////////////////////////////////////////////////////////////
/// This structure returns information about an Intel(r) Xeon Phi(tm)
/// This structure returns information about an Intel(R) Xeon Phi(TM)
/// coprocessor.
/// A pointer to this structure is passed into the COIGetEngineInfo() function,
/// which fills in the data before returning to the caller.
......@@ -101,6 +101,7 @@ typedef struct COI_ENGINE_INFO
uint32_t CoreMaxFrequency;
/// The load percentage for each of the hardware threads on the engine.
/// Currently this is limited to reporting out a maximum of 1024 HW threads
uint32_t Load[COI_MAX_HW_THREADS];
/// The amount of physical memory managed by the OS.
......@@ -133,9 +134,9 @@ typedef struct COI_ENGINE_INFO
///////////////////////////////////////////////////////////////////////////////
///
/// Returns information related to a specified engine. Note that if Intel® Coprocessor Offload Infrastructure (Intel® COI) is
/// unable to query a value it will be returned as zero but the call will
/// still succeed.
/// Returns information related to a specified engine. Note that if Intel(R)
/// Coprocessor Offload Infrastructure (Intel(R) COI) is unable to query
/// a value it will be returned as zero but the call will still succeed.
///
///
/// @param in_EngineHandle
......@@ -173,14 +174,15 @@ COIEngineGetInfo(
///
/// Returns the number of engines in the system that match the provided ISA.
///
/// Note that while it is possible to enumerate different types of Intel(r)
/// Xeon Phi(tm) coprocessors on a single host this is not currently
/// supported. Intel® Coprocessor Offload Infrastructure (Intel® COI) makes an assumption that all Intel(r) Xeon Phi(tm)
/// coprocessors found in the system are the same architecture as the first
/// coprocessor device.
/// Note that while it is possible to enumerate different types of Intel(R)
/// Xeon Phi(TM) coprocessors on a single host this is not currently
/// supported. Intel(R) Coprocessor Offload Infrastructure (Intel(R) COI)
/// makes an assumption that all Intel(R) Xeon Phi(TM) coprocessors found
/// in the system are the same architecture as the first coprocessor device.
///
/// Also, note that this function returns the number of engines that Intel® Coprocessor Offload Infrastructure (Intel® COI)
/// is able to detect. Not all of them may be online.
/// Also, note that this function returns the number of engines that Intel(R)
/// Coprocessor Offload Infrastructure (Intel(R) COI) is able to detect. Not
/// all of them may be online.
///
/// @param in_ISA
/// [in] Specifies the ISA type of the engine requested.
......@@ -211,7 +213,7 @@ COIEngineGetCount(
///
/// @param in_EngineIndex
/// [in] An unsigned integer which specifies the zero-based position of
/// the engine in a collection of engines. The makeup of this
/// the engine in a collection of engines. The makeup of this
/// collection is defined by the in_ISA parameter.
///
/// @param out_pEngineHandle
......@@ -226,7 +228,8 @@ COIEngineGetCount(
///
/// @return COI_INVALID_POINTER if the out_pEngineHandle parameter is NULL.
///
/// @return COI_VERSION_MISMATCH if the version of Intel® Coprocessor Offload Infrastructure (Intel® COI) on the host is not
/// @return COI_VERSION_MISMATCH if the version of Intel(R) Coprocessor Offload
/// Infrastructure (Intel(R) COI) on the host is not
/// compatible with the version on the device.
///
/// @return COI_NOT_INITIALIZED if the engine requested exists but is offline.
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -59,12 +59,10 @@ extern "C" {
///
/// Special case event values which can be passed in to APIs to specify
/// how the API should behave. In COIBuffer APIs passing in NULL for the
/// completion event is the equivalent of passing COI_EVENT_SYNC. For
/// COIPipelineRunFunction passing in NULL is the equivalent of
/// COI_EVENT_ASYNC.
/// completion event is the equivalent of passing COI_EVENT_SYNC.
/// Note that passing COI_EVENT_ASYNC can be used when the caller wishes the
/// operation to be performed asynchronously but does not care when the
/// operation completes. This can be useful for opertions that by definition
/// operation completes. This can be useful for operations that by definition
/// must complete in order (DMAs, run functions on a single pipeline). If
/// the caller does care when the operation completes then they should pass
/// in a valid completion event which they can later wait on.
......@@ -72,6 +70,16 @@ extern "C" {
#define COI_EVENT_ASYNC ((COIEVENT*)1)
#define COI_EVENT_SYNC ((COIEVENT*)2)
//////////////////////////////////////////////////////////////////////////////
///
/// This can be used to initialize a COIEVENT to a known invalid state.
/// Using it is not required, but it can be useful when a program is
/// unsure whether the event will be initialized by the runtime.
/// Simply set the event to this value: COIEVENT event = COI_EVENT_INITIALIZER;
///
#define COI_EVENT_INITIALIZER { { 0, -1 } }
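A minimal sketch of how such an initializer might be used to detect an event that the runtime never filled in. It assumes the event is a struct wrapping two opaque 64-bit words, which is why the initializer has nested braces. `MockEvent`, `MOCK_EVENT_INITIALIZER` and `mock_event_is_unset` are hypothetical names for illustration, not COI definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for COIEVENT: two opaque 64-bit words. */
typedef struct {
    uint64_t opaque[2];
} MockEvent;

/* Same shape as COI_EVENT_INITIALIZER; -1 converts to UINT64_MAX. */
#define MOCK_EVENT_INITIALIZER { { 0, (uint64_t)-1 } }

/* Returns true while the event still holds the "known invalid" pattern. */
bool mock_event_is_unset(const MockEvent *ev)
{
    return ev->opaque[0] == 0 && ev->opaque[1] == UINT64_MAX;
}
```

A caller could initialize the event this way before an operation that may or may not produce one, then check the pattern before deciding whether there is anything to wait on.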
///////////////////////////////////////////////////////////////////////////////
///
/// Wait for an arbitrary number of COIEVENTs to be signaled as completed,
......@@ -94,17 +102,17 @@ extern "C" {
/// and returns immediately, -1 blocks indefinitely.
///
/// @param in_WaitForAll
/// [in] Boolean value specifying behavior. If true, wait for all
/// [in] Boolean value specifying behavior. If true, wait for all
/// events to be signaled, or for timeout, whichever happens first.
/// If false, return when any event is signaled, or at timeout.
///
/// @param out_pNumSignaled
/// [out] The number of events that were signaled. If in_NumEvents
/// [out] The number of events that were signaled. If in_NumEvents
/// is 1 or in_WaitForAll = True, this parameter is optional.
///
/// @param out_pSignaledIndices
/// [out] Pointer to an array of indicies into the original event
/// array. Those denoted have been signaled. The user must provide an
/// [out] Pointer to an array of indices into the original event
/// array. Those denoted have been signaled. The user must provide an
/// array that is no smaller than the in_Events array. If in_NumEvents
/// is 1 or in_WaitForAll = True, this parameter is optional.
///
......@@ -132,6 +140,10 @@ extern "C" {
/// @return COI_PROCESS_DIED if the remote process died. See COIProcessDestroy
/// for more details.
///
/// @return COI_<REAL ERROR> if only a single event is passed in, and that event
/// failed, COI will attempt to return the real error code that caused
/// the original operation to fail, otherwise COI_PROCESS_DIED is reported.
///
COIACCESSAPI
COIRESULT
COIEventWait(
......@@ -183,6 +195,103 @@ COIRESULT
COIEventUnregisterUserEvent(
COIEVENT in_Event);
//////////////////////////////////////////////////////////////////////////////
///
/// A callback that will be invoked to notify the user of an internal
/// runtime event completion.
///
/// As with any callback mechanism it is up to the user to make sure that
/// there are no possible deadlocks due to reentrancy (i.e. the callback being
/// invoked in the same context that triggered the notification) and also
/// that the callback does not slow down overall processing. If the user
/// performs too much work within the callback it could delay further
/// processing. The callback will be invoked prior to the signaling of
/// the corresponding COIEvent. For example, if a user is waiting
/// for a COIEvent associated with a run function completing they will
/// receive the callback before the COIEvent is marked as signaled.
///
/// @param in_Event
/// [in] The completion event that is associated with the
/// operation that is being notified.
///
/// @param in_Result
/// [in] The COIRESULT of the operation.
///
/// @param in_UserData
/// [in] Opaque data that was provided when the callback was
/// registered. Intel(R) Coprocessor Offload Infrastructure
/// (Intel(R) COI) simply passes this back to the user so that
/// they can interpret it as they choose.
///
typedef void (*COI_EVENT_CALLBACK)(
COIEVENT in_Event,
const COIRESULT in_Result,
const void* in_UserData);
//////////////////////////////////////////////////////////////////////////////
///
/// Registers any COIEVENT to receive a one time callback, when the event
/// is marked complete in the offload runtime. If the event has completed
/// before the COIEventRegisterCallback() is called then the callback will
/// immediately be invoked by the calling thread. When the event is
/// registered before the event completes, the runtime guarantees that
/// the callback will be invoked before COIEventWait() is notified of
/// the same event completing. In well written user code, this may provide
/// a slight performance advantage.
///
/// Users should treat the callback much like an interrupt routine with
/// regard to performance, designing it to be as short and non-blocking
/// as possible. Since the thread that runs the callback is
/// non-deterministic, blocking or stalling the callback may have severe
/// performance impacts on the offload runtime. Thus, it is important to not
/// create deadlocks between the callback and other signaling/waiting
/// mechanisms. It is recommended to never invoke COIEventWait() inside
/// a callback function, as this could lead to immediate deadlocks.
///
/// It is important to note that the runtime cannot distinguish between
/// already triggered events and invalid events. Thus the user needs to pass
/// in a valid event, or the callback will be invoked immediately.
/// Failed events will still receive a callback and the user can query
/// COIEventWait() after the callback for the failed return code.
///
/// If more than one callback is registered for the same event, only the
/// single most current callback will be used, i.e. the older one will
/// be replaced.
///
/// @param in_Event
/// [in] A valid single event handle to be registered to receive a callback.
///
/// @param in_Callback
/// [in] Pointer to a user function used to signal an
/// event completion.
///
/// @param in_UserData
/// [in] Opaque data to pass to the callback when it is invoked.
///
/// @param in_Flags
/// [in] Reserved parameter for future expansion, required to be zero for now.
///
/// @return COI_INVALID_HANDLE if in_Event is not a valid COIEVENT.
///
/// @return COI_INVALID_HANDLE if in_Callback is not a valid pointer.
///
/// @return COI_ARGUMENT_MISMATCH if the in_Flags is not zero.
///
/// @return COI_SUCCESS if the event is successfully registered.
///
COIACCESSAPI
COIRESULT
COIEventRegisterCallback(
const COIEVENT in_Event,
COI_EVENT_CALLBACK in_Callback,
const void* in_UserData,
const uint64_t in_Flags);
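The advice to keep callbacks short and non-blocking could look like the sketch below: the callback only records the result and sets a flag, leaving any real work to another thread. The callback signature mirrors COI_EVENT_CALLBACK as declared above, but `MockEvent`, `MockResult`, `Completion` and `mock_register_on_completed_event` are hypothetical stand-ins. The mock registration only models the documented case where the event has already completed, so the callback runs immediately on the calling thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct { uint64_t opaque[2]; } MockEvent; /* assumed event layout   */
typedef int MockResult;                           /* stand-in for COIRESULT */

typedef void (*MockEventCallback)(MockEvent in_Event,
                                  const MockResult in_Result,
                                  const void *in_UserData);

/* User data shared between the callback and the rest of the program. */
typedef struct {
    atomic_int done;
    atomic_int result;
} Completion;

/* Short, non-blocking callback: store the result, raise a flag, return. */
void on_complete(MockEvent in_Event, const MockResult in_Result,
                 const void *in_UserData)
{
    /* The object behind in_UserData must not actually be const. */
    Completion *c = (Completion *)in_UserData;
    (void)in_Event;
    atomic_store(&c->result, in_Result);
    atomic_store(&c->done, 1);
}

/* Hypothetical model of registering on an already completed event:
 * the callback is invoked at once by the calling thread. */
void mock_register_on_completed_event(MockEvent ev, MockEventCallback cb,
                                      const void *user_data)
{
    cb(ev, 0, user_data);
}
```

Nothing in the callback waits on another event or takes a lock, which avoids the deadlock scenarios the documentation warns about.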
#ifdef __cplusplus
} /* extern "C" */
#endif
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -59,12 +59,13 @@ extern "C" {
//////////////////////////////////////////////////////////////////////////////
/// These flags specify how a buffer will be used within a run function. They
/// allow Intel® Coprocessor Offload Infrastructure (Intel® COI) to make optimizations in how it moves data around the system.
/// These flags specify how a buffer will be used within a run function. They
/// allow the runtime to make optimizations in how it moves the data around.
/// These flags can affect the correctness of an application, so they must be
/// set properly. For example, if a buffer is used in a run function with the
/// COI_SINK_READ flag and then mapped on the source, Intel® Coprocessor Offload Infrastructure (Intel® COI) may use a previously
/// cached version of the buffer instead of retrieving data from the sink.
/// set properly. For example, if a buffer is used in a run function with the
/// COI_SINK_READ flag and then mapped on the source, the runtime may use a
/// previously cached version of the buffer instead of retrieving data from
/// the sink.
typedef enum COI_ACCESS_FLAGS
{
/// Specifies that the run function will only read the associated buffer.
......@@ -76,7 +77,23 @@ typedef enum COI_ACCESS_FLAGS
/// Specifies that the run function will overwrite the entire associated
/// buffer and therefore the buffer will not be synchronized with the
/// source before execution.
COI_SINK_WRITE_ENTIRE
COI_SINK_WRITE_ENTIRE,
/// Specifies that the run function will only read the associated buffer
/// and will maintain the reference count on the buffer after
/// run function exit.
COI_SINK_READ_ADDREF,
/// Specifies that the run function will write to the associated buffer
/// and will maintain the reference count on the buffer after
/// run function exit.
COI_SINK_WRITE_ADDREF,
/// Specifies that the run function will overwrite the entire associated
/// buffer and therefore the buffer will not be synchronized with the
/// source before execution and will maintain the reference count on the
/// buffer after run function exit.
COI_SINK_WRITE_ENTIRE_ADDREF
} COI_ACCESS_FLAGS;
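The relationship between the original access flags and the new *_ADDREF variants can be summarized with two small predicates. This is a sketch only: the enum below copies the flag names with a `MOCK_` prefix, and its numeric values are assumptions since the diff does not show them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of COI_ACCESS_FLAGS; numeric values are assumed. */
typedef enum {
    MOCK_SINK_READ = 1,
    MOCK_SINK_WRITE,
    MOCK_SINK_WRITE_ENTIRE,
    MOCK_SINK_READ_ADDREF,
    MOCK_SINK_WRITE_ADDREF,
    MOCK_SINK_WRITE_ENTIRE_ADDREF
} MockAccessFlags;

/* True for flags that maintain the buffer's reference count after the
 * run function exits. */
bool mock_flag_keeps_reference(MockAccessFlags f)
{
    switch (f) {
    case MOCK_SINK_READ_ADDREF:
    case MOCK_SINK_WRITE_ADDREF:
    case MOCK_SINK_WRITE_ENTIRE_ADDREF:
        return true;
    default:
        return false;
    }
}

/* True for flags under which the buffer is not synchronized with the
 * source before execution, because it will be overwritten entirely. */
bool mock_flag_skips_sync_before_run(MockAccessFlags f)
{
    return f == MOCK_SINK_WRITE_ENTIRE || f == MOCK_SINK_WRITE_ENTIRE_ADDREF;
}
```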
#define COI_PIPELINE_MAX_PIPELINES 512
......@@ -86,7 +103,7 @@ typedef enum COI_ACCESS_FLAGS
///////////////////////////////////////////////////////////////////////////////
///
/// Create a pipeline assoiated with a remote process. This pipeline can
/// Create a pipeline associated with a remote process. This pipeline can
/// then be used to execute remote functions and to share data using
/// COIBuffers.
///
......@@ -133,8 +150,8 @@ typedef enum COI_ACCESS_FLAGS
/// @return COI_TIME_OUT_REACHED if establishing the communication channel with
/// the remote pipeline timed out.
///
/// @return COI_RETRY if the pipeline cannot be created due to the number of
/// source-to-sink connections in use. A subsequent call to
/// @return COI_RETRY if the pipeline cannot be created due to the number of
/// source-to-sink connections in use. A subsequent call to
/// COIPipelineCreate may succeed if resources are freed up.
///
/// @return COI_PROCESS_DIED if in_Process died.
......@@ -149,7 +166,7 @@ COIPipelineCreate(
///////////////////////////////////////////////////////////////////////////////
///
/// Destroys the inidicated pipeline, releasing its resources.
/// Destroys the indicated pipeline, releasing its resources.
///
/// @param in_Pipeline
/// [in] Pipeline to destroy.
......@@ -175,22 +192,21 @@ COIPipelineDestroy(
///
/// 1. Proper care has to be taken while setting the input dependencies for
/// RunFunctions. Setting it incorrectly can lead to cyclic dependencies
/// and can cause the respective pipeline (as a result Intel® Coprocessor Offload Infrastructure (Intel® COI) Runtime) to
/// stall.
/// and can cause the respective pipeline to stall.
/// 2. RunFunctions can also segfault if enough memory space is not available
/// on the sink for the buffers passed in. Pinned buffers and buffers that
/// are AddRef'd need to be accounted for available memory space. In other
/// words, this memory is not available for use until it is freed up.
/// 3. Unexpected segmentation faults or erroneous behaviour can occur if
/// handles or data passed in to Runfunction gets destroyed before the
/// 3. Unexpected segmentation faults or erroneous behavior can occur if
/// handles or data passed in to Runfunction gets destroyed before the
/// RunFunction finishes.
/// For example, if a variable passed in as Misc data or the buffer gets
/// destroyed before the Intel® Coprocessor Offload Infrastructure (Intel® COI) runtime receives the completion notification
/// of the Runfunction, it can cause unexpected behaviour. So it is always
/// destroyed before the runtime receives the completion notification
/// of the Runfunction, it can cause unexpected behavior. So it is always
/// recommended to wait for RunFunction completion event before any related
/// destroy event occurs.
///
/// Intel® Coprocessor Offload Infrastructure (Intel® COI) Runtime expects users to handle such scenarios. COIPipelineRunFunction
/// The runtime expects users to handle such scenarios. COIPipelineRunFunction
/// returns COI_SUCCESS for above cases because it was queued up successfully.
/// Also if you try to destroy a pipeline with a stalled function then the
/// destroy call will hang. COIPipelineDestroy waits until all the functions
......@@ -240,7 +256,7 @@ COIPipelineDestroy(
/// [in] Pointer to user defined data, typically used to pass
/// parameters to Sink side functions. Should only be used for small
/// amounts of data since the data will be placed directly in the
/// Driver's command buffer. COIBuffers should be used to pass large
/// Driver's command buffer. COIBuffers should be used to pass large
/// amounts of data.
///
/// @param in_MiscDataLen
......@@ -250,8 +266,8 @@ COIPipelineDestroy(
///
/// @param out_pAsyncReturnValue
/// [out] Pointer to user-allocated memory where the return value from
/// the run function will be placed. This memory should not be read
/// until out_pCompletion has been signalled.
/// the run function will be placed. This memory should not be read
/// until out_pCompletion has been signaled.
///
/// @param in_AsyncReturnValueLen
/// [in] Size of the out_pAsyncReturnValue in bytes.
......@@ -259,11 +275,14 @@ COIPipelineDestroy(
/// @param out_pCompletion
/// [out] An optional pointer to a COIEVENT object
/// that will be signaled when this run function has completed
/// execution. The user may pass in NULL if they do not wish to signal
/// any COIEVENTs when this run function completes.
/// execution. The user may pass in NULL if they wish for this function
/// to be synchronous. Otherwise, if a COIEVENT object is passed in, the
/// function is asynchronous: it returns after enqueuing the RunFunction
/// and passes back the COIEVENT that will be signaled once the
/// RunFunction has completed.
///
/// @return COI_SUCCESS if the function was successfully placed in a
/// pipeline for future execution. Note that the actual
/// pipeline for future execution. Note that the actual
/// execution of the function will occur in the future.
///
/// @return COI_OUT_OF_RANGE if in_NumBuffers is greater than
......@@ -303,18 +322,10 @@ COIPipelineDestroy(
/// @return COI_ARGUMENT_MISMATCH if in_pReturnValue is non-NULL but
/// in_ReturnValueLen is zero.
///
/// @return COI_ARGUMENT_MISMATCH if a COI_BUFFER_STREAMING_TO_SOURCE buffer
/// is not passed with COI_SINK_WRITE_ENTIRE access flag.
///
/// @return COI_RESOURCE_EXHAUSTED if could not create a version for TO_SOURCE
/// streaming buffer. It can fail if enough memory is not available to
/// register. This call will succeed eventually when the registered
/// memory becomes available.
///
/// @return COI_RETRY if any input buffers, which are not pinned buffers,
/// are still mapped when passed to the run function.
///
/// @return COI_MISSING_DEPENDENCY if buffer was not created on the process
/// @return COI_MISSING_DEPENDENCY if buffer was not created on the process
/// associated with the pipeline that was passed in.
///
/// @return COI_OUT_OF_RANGE if any of the access flags in
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -459,11 +459,37 @@ extern MyoError myoiTargetSharedMallocTableRegister(
* return -1;
* }
* @endcode
* This initialization is required only in the client/host side
* of the application. The server/card side executable should be
* executed only on the second card in this case.
* This initialization is required only in the client/host side
* of the application. The server/card side executable should be
* executed only on the second card in this case.
*
* Another capability for the MyoiUserParams structure in MYO is specifying
* a remote procedure call to be executed on the host or card, immediately after
* myoiLibInit() completes. This capability is useful because some calls in
* MYO return immediately, but do not actually complete until after the MYO
* library is completely initialized on all peers. An example follows,
* showing how to cause MYO to execute the registered function named
* "PostMyoLibInitFunction" on the first card only:
* @code
* MyoiUserParams UserParas[64];
* UserParas[0].type = MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC;
* UserParas[0].nodeid = 1;
* SetPostLibInitFuncName(UserParas[1], "PostMyoLibInitFunction");
* UserParas[2].type = MYOI_USERPARAMS_LAST_MSG;
* if(MYO_SUCCESS != myoiLibInit(&UserParas, (void*)&myoiUserInit)) {
* printf("Failed to initialize MYO runtime\n");
* return -1;
* }
* @endcode
*
* Note, to cause PostMyoLibInitFunction to be executed on ALL cards,
* specify: MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES for the nodeid.
* That is:
* @code
* UserParas[0].nodeid = MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES;
* @endcode
*
* @param userInitFunc Shared variables and remote funtions are
* @param userInitFunc Shared variables and remote functions are
* registered in this routine, which is called by the runtime during
* library initialization.
* @return
......@@ -473,6 +499,22 @@ extern MyoError myoiTargetSharedMallocTableRegister(
MYOACCESSAPI
MyoError myoiLibInit(void * in_args, void *userInitFunc /*userInitFunc must be: MyoError (*userInitFunc)(void) */);
/** @fn extern MyoError myoiSupportsFeature(MyoFeatureType myoFeature)
* @brief Supports runtime query to determine whether a feature is supported
* by the myo that is installed on the system. This function is intended to
* support client code to query the myo library to determine whether its set
* of capabilities are able to support the client's needs.
*
* @param myoFeature The feature that is to be inquired about.
* @return
* MYO_SUCCESS; if the feature is supported.
* MYO_FEATURE_NOT_IMPLEMENTED if the feature is not supported.
*
* (For more information, please also see the declaration of the MyoFeatureType enum declaration.)
**/
MYOACCESSAPI
MyoError myoiSupportsFeature(MyoFeatureType myoFeature);
/** @fn void myoiLibFini()
* @brief Finalize the MYO library, all resources held by the runtime are
* released by this routine.
......@@ -519,17 +561,56 @@ MyoError myoiSetMemConsistent(void *in_pAddr, size_t in_Size);
EXTERN_C MYOACCESSAPI unsigned int myoiMyId; /* MYO_MYID if on accelerators */
EXTERN_C MYOACCESSAPI volatile int myoiInitFlag;
//! Structure of the array element that is passed to myoiLibInit() to initialize a subset of the available cards.
typedef struct{
//!type = MYOI_USERPARAMS_DEVID for each element in the array except the last element ; type = MYOI_USERPARAMS_LAST_MSG for the last element in the array.
//! Structure of the array element that is passed to myoiLibInit() to initialize a subset of the available cards, or
//! to specify a remote call function to be called after successful myo library initialization:
typedef struct {
//!type = MYOI_USERPARAMS_DEVID or MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC for each element in the array except
//!the last element, type should be: MYOI_USERPARAMS_LAST_MSG.
int type;
//!nodeid refers to the card index.
//! nodeid refers to the 'one-based' card index: 1 represents the first card, mic0, 2 represents the
//! second card, mic1, 3 represents the third card, mic2, and so on.
//! NOTE: for type == MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC, specifying MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES
//! for nodeid will execute the named function on each card in the system: mic0, mic1, mic2, ..., micn.
int nodeid;
}MyoiUserParams;
#define MYOI_USERPARAMS_DEVID 1
#define MYOI_USERPARAMS_LAST_MSG -1
} MyoiUserParams;
//!The following two types are dealt with entirely with just one MyoiUserParams structure:
//!MYOI_USERPARAMS_DEVID maps node ids.
#define MYOI_USERPARAMS_DEVID 1
//!MYOI_USERPARAMS_LAST_MSG terminates the array of MyoiUserParams.
#define MYOI_USERPARAMS_LAST_MSG -1
//!The following type requires setting the node id in a MyoiUserParams structure, and then following the struct
//!with a MyoiUserParamsPostLibInit union:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC 2
//!nodeid can be one of the following macros, or a number >=1, corresponding to the card number (1 == mic0,
//!2 == mic1, 3 == mic2, ....)
//!Setting nodeid to MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES causes the function to be called on all
//!cards:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES 0
//!Setting nodeid to MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE causes the function to be called on the
//!host instead of the card:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE -1
//!The postLibInit union contains two members that serve two different purposes:
//!1. It can be used to stipulate the name of the function to be remotely called from host to card, on successful
//!myo library initialization, (member postLibInitRemoveFuncName) using the type:
//!MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC. OR
//!2. It can be an actual function pointer (member name: postLibInitHostFuncAddress) that will be called on the host,
//!on successful myo library initialization, using the type: MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC, with nodeid:
//!MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE
typedef union {
const char *postLibInitRemoveFuncName;
void (*postLibInitHostFuncAddress)(void);
} MyoiUserParamsPostLibInit;
/* These are two macros to help get the information in a MyoiUserParamsPostLibInit union from a MyoiUserParams struct; */
#define GetPostLibInitFuncName(USERPARAMS) ((MyoiUserParamsPostLibInit *) (& (USERPARAMS)))->postLibInitRemoveFuncName
#define GetPostLibInitFuncAddr(USERPARAMS) ((MyoiUserParamsPostLibInit *) (& (USERPARAMS)))->postLibInitHostFuncAddress
/* These are two macros to help set the information in a MyoiUserParamsPostLibInit union from a MyoiUserParams struct; */
#define SetPostLibInitFuncName(USERPARAMS,FUNC_NAME) GetPostLibInitFuncName(USERPARAMS) = FUNC_NAME
#define SetPostLibInitFuncAddr(USERPARAMS,FUNC_ADDR) GetPostLibInitFuncAddr(USERPARAMS) = FUNC_ADDR
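The array layout described above (a type/nodeid element, followed by an element whose storage holds the function name, followed by a terminator) can be sketched as follows. This is an illustration under stated assumptions: `MockUserParams` and the `MOCK_*` constants mirror the definitions in this header, the `mock_*` helpers are hypothetical, and the name pointer is stored with `memcpy` rather than through the header's cast-based macros so that the sketch does not depend on pointer alignment or aliasing behavior.

```c
#include <assert.h>
#include <string.h>

/* Mirror of MyoiUserParams and the type constants shown above. */
typedef struct { int type; int nodeid; } MockUserParams;
#define MOCK_POST_MYO_LIB_INIT_FUNC 2
#define MOCK_LAST_MSG (-1)

/* The name pointer reuses the storage of one array element. */
_Static_assert(sizeof(const char *) <= sizeof(MockUserParams),
               "name pointer must fit in one array element");

/* Store the function-name pointer in the element's bytes. */
void mock_set_func_name(MockUserParams *slot, const char *name)
{
    memset(slot, 0, sizeof *slot);
    memcpy(slot, &name, sizeof name);
}

/* Read the function-name pointer back out of the element's bytes. */
const char *mock_get_func_name(const MockUserParams *slot)
{
    const char *name;
    memcpy(&name, slot, sizeof name);
    return name;
}

/* Fill three consecutive elements: request, function name, terminator. */
void mock_build_post_init_params(MockUserParams params[3], int nodeid,
                                 const char *func_name)
{
    params[0].type = MOCK_POST_MYO_LIB_INIT_FUNC;
    params[0].nodeid = nodeid;          /* 1 == mic0, 2 == mic1, ... */
    mock_set_func_name(&params[1], func_name);
    params[2].type = MOCK_LAST_MSG;     /* terminates the array */
    params[2].nodeid = 0;
}
```

The sketch stops short of calling myoiLibInit(); it only shows how the three elements relate to one another.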
#ifdef __cplusplus
}
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -74,7 +74,8 @@ typedef enum {
MYO_ALREADY_EXISTS, /*!< Already Exists */
MYO_EOF, /*!< EOF */
MYO_EOF, /*!< EOF */
MYO_FEATURE_NOT_IMPLEMENTED = -1, /*!< Feature not implemented (see myoiSupportsFeature()). */
} MyoError;
......@@ -84,6 +85,40 @@ typedef enum {
MYO_ARENA_OURS, /*!< Arena OURS Ownership */
} MyoOwnershipType;
/*! MYO Features */
typedef enum {
/*!< EVERY VALUE that is less than MYO_FEATURE_BEGIN is not implemented. */
MYO_FEATURE_BEGIN = 1, /*!< The first feature that is supported. */
MYO_FEATURE_POST_LIB_INIT = MYO_FEATURE_BEGIN, /*!< Allows specifying a function to be executed immediately */
/* after myoiLibInit() completes. This feature was implemented in version */
/* 3.3 of MPSS. */
/* MYO_FEATURE_FUTURE_CAPABILITY = 2, at some time in the future, as new features are added to MYO, new enumeration constants */
/* will be added to the MyoFeatureType, and the value of the new enumeration constant will be greater */
/* than the current value of MYO_FEATURE_LAST constant, and then the MYO_FEATURE_LAST constant too, */
/* will be changed to be the value of the new enumeration constant. For example, in April, 2014, */
/* the POST_LIB_INIT feature was implemented in version 3.3 of MPSS, and the MYO_FEATURE_BEGIN */
/* enumeration constant is the same as the MYO_FEATURE_LAST enumeration constant, and both are equal */
/* to 1. */
/* Suppose in December, 2014, a new feature is added to the MYO library, for version 3.4 of MPSS. */
/* Then, MYO_FEATURE_BEGIN enumeration constant will be still the value 1, but the MYO_FEATURE_LAST */
/* enumeration constant will be set to 2. */
/* At runtime, a client binary can determine whether the installed MYO supports a given */
/* capability. For example, suppose a future client binary queries version 3.3 of MYO for */
/* some future feature. Version 3.3 of MYO will report to the client that the feature is not */
/* implemented. Conversely, suppose the future client queries version 3.4 of MYO for the same */
/* future feature. Version 3.4 of MYO will report that the feature is */
/* supported. */
/* */
/* Date: | MYO_FEATURE_BEGIN: | MYO_FEATURE_LAST: | MPSS VERSION: | myoiSupportsFeature(MYO_FEATURE_FUTURE_CAPABILITY) */
/* ---------------+---------------------+--------------------+---------------+--------------------------------------------------- */
/* April, 2014 | 1 | 1 | 3.3 | MYO_FEATURE_NOT_IMPLEMENTED */
/* December, 2014 | 1 | 2 | 3.4 | MYO_SUCCESS */
/* ---------------+---------------------+--------------------+---------------+--------------------------------------------------- */
MYO_FEATURE_LAST = MYO_FEATURE_POST_LIB_INIT, /*!< The last feature that is supported. */
/*!< EVERY VALUE that is greater than MYO_FEATURE_LAST is not implemented. */
/*!< EVERY VALUE that is greater than or equal to MYO_FEATURE_BEGIN AND less than or equal to MYO_FEATURE_LAST is implemented. */
} MyoFeatureType; /* (For more information, please also see myoiSupportsFeature() function declaration.) */
/*************************************************************
* define the property of MYO Arena
***********************************************************/
......
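Note on the MyoFeatureType comments in this header: they state that a feature value is implemented exactly when it lies between MYO_FEATURE_BEGIN and MYO_FEATURE_LAST, inclusive. The following is a minimal C++ sketch of that range rule only. The enumerators are local stand-ins rather than the real MYO header, the helper names are hypothetical, the value 0 for success is an assumption, and the actual myoiSupportsFeature() implementation is not shown in this diff.

```cpp
// Local stand-ins for the enumerators shown in the diff (illustration only).
enum SketchResult {
    SKETCH_SUCCESS = 0,                   // assumed value for success
    SKETCH_FEATURE_NOT_IMPLEMENTED = -1   // value taken from the diff
};

enum SketchFeature {
    SKETCH_FEATURE_BEGIN = 1,
    SKETCH_FEATURE_POST_LIB_INIT = SKETCH_FEATURE_BEGIN,
    SKETCH_FEATURE_LAST = SKETCH_FEATURE_POST_LIB_INIT
};

// Hypothetical helper: true when the value lies in [BEGIN, LAST].
constexpr bool feature_in_supported_range(int feature)
{
    return feature >= SKETCH_FEATURE_BEGIN && feature <= SKETCH_FEATURE_LAST;
}

// Hypothetical query in the spirit of myoiSupportsFeature(): values outside
// the supported range are reported as not implemented.
constexpr int supports_feature_sketch(int feature)
{
    return feature_in_supported_range(feature)
        ? SKETCH_SUCCESS
        : SKETCH_FEATURE_NOT_IMPLEMENTED;
}
```

With this layout, adding a feature later only requires a new enumerator and moving the LAST constant; older values keep their meaning.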
......@@ -35,7 +35,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config
build_dir = $(top_builddir)
source_dir = $(top_srcdir)
coi_inc_dir = $(top_srcdir)/../include/coi
myo_inc_dir = $(top_srcdir)/../include/myo
include_src_dir = $(top_srcdir)/../../include
libgomp_src_dir = $(top_srcdir)/../../libgomp
libgomp_dir = $(build_dir)/../../libgomp
......@@ -53,12 +52,12 @@ target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$
if PLUGIN_HOST
toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la
libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp
libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0
else # PLUGIN_TARGET
plugin_includedir = $(libsubincludedir)
plugin_include_HEADERS = main_target_image.h
AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
AM_CXXFLAGS = $(CXXFLAGS)
AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic
endif
......
......@@ -305,7 +305,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config
build_dir = $(top_builddir)
source_dir = $(top_srcdir)
coi_inc_dir = $(top_srcdir)/../include/coi
myo_inc_dir = $(top_srcdir)/../include/myo
include_src_dir = $(top_srcdir)/../../include
libgomp_src_dir = $(top_srcdir)/../../libgomp
libgomp_dir = $(build_dir)/../../libgomp
......@@ -321,11 +320,11 @@ target_build_dir = $(accel_search_dir)/$(accel_target)$(MULTISUBDIR)/liboffloadm
target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$(MULTISUBDIR)
@PLUGIN_HOST_TRUE@toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0
@PLUGIN_HOST_FALSE@plugin_includedir = $(libsubincludedir)
@PLUGIN_HOST_FALSE@plugin_include_HEADERS = main_target_image.h
@PLUGIN_HOST_FALSE@AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
@PLUGIN_HOST_FALSE@AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
@PLUGIN_HOST_FALSE@AM_CXXFLAGS = $(CXXFLAGS)
@PLUGIN_HOST_FALSE@AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -34,7 +34,7 @@
// 1. allocate element of CeanReadRanges type
// 2. initialize it for reading consecutive contiguous ranges
// described by "ap" argument
CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap)
CeanReadRanges * init_read_ranges_arr_desc(const Arr_Desc *ap)
{
CeanReadRanges * res;
......@@ -57,6 +57,8 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap)
(ap->rank - rank) * sizeof(CeanReadDim));
if (res == NULL)
LIBOFFLOAD_ERROR(c_malloc);
res->arr_desc = const_cast<Arr_Desc*>(ap);
res->current_number = 0;
res->range_size = length;
res->last_noncont_ind = rank;
......@@ -82,7 +84,7 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap)
return res;
}
// check if ranges described by 1 argument could be transfered into ranges
// check if ranges described by 1 argument could be transferred into ranges
// described by 2-nd one
bool cean_ranges_match(
CeanReadRanges * read_rng1,
......@@ -118,7 +120,7 @@ bool get_next_range(
return true;
}
bool is_arr_desc_contiguous(const arr_desc *ap)
bool is_arr_desc_contiguous(const Arr_Desc *ap)
{
int64_t rank = ap->rank - 1;
int64_t length = ap->dim[rank].size;
......@@ -146,14 +148,22 @@ int64_t cean_get_transf_size(CeanReadRanges * read_rng)
}
static uint64_t last_left, last_right;
typedef void (*fpp)(const char *spaces, uint64_t low, uint64_t high, int esize);
typedef void (*fpp)(
const char *spaces,
uint64_t low,
uint64_t high,
int esize,
bool print_values
);
static void generate_one_range(
const char *spaces,
uint64_t lrange,
uint64_t rrange,
fpp fp,
int esize
int esize,
bool print_values
)
{
OFFLOAD_TRACE(3,
......@@ -168,20 +178,35 @@ static void generate_one_range(
// Extend previous range, don't print
}
else {
(*fp)(spaces, last_left, last_right, esize);
(*fp)(spaces, last_left, last_right, esize, print_values);
last_left = lrange;
}
}
last_right = rrange;
}
static bool element_is_contiguous(
uint64_t rank,
const struct Dim_Desc *ddp
)
{
if (rank == 1) {
return (ddp[0].lower == ddp[0].upper || ddp[0].stride == 1);
}
else {
return ((ddp[0].size == (ddp[1].upper-ddp[1].lower+1)*ddp[1].size) &&
element_is_contiguous(rank-1, ddp+1));
}
}
static void generate_mem_ranges_one_rank(
const char *spaces,
uint64_t base,
uint64_t rank,
const struct dim_desc *ddp,
const struct Dim_Desc *ddp,
fpp fp,
int esize
int esize,
bool print_values
)
{
uint64_t lindex = ddp->lindex;
......@@ -194,35 +219,40 @@ static void generate_mem_ranges_one_rank(
"generate_mem_ranges_one_rank(base=%p, rank=%lld, lindex=%lld, "
"lower=%lld, upper=%lld, stride=%lld, size=%lld, esize=%d)\n",
spaces, (void*)base, rank, lindex, lower, upper, stride, size, esize);
if (rank == 1) {
if (element_is_contiguous(rank, ddp)) {
uint64_t lrange, rrange;
if (stride == 1) {
lrange = base + (lower-lindex)*size;
rrange = lrange + (upper-lower+1)*size - 1;
generate_one_range(spaces, lrange, rrange, fp, esize);
}
else {
lrange = base + (lower-lindex)*size;
rrange = lrange + (upper-lower+1)*size - 1;
generate_one_range(spaces, lrange, rrange, fp, esize, print_values);
}
else {
if (rank == 1) {
for (int i=lower-lindex; i<=upper-lindex; i+=stride) {
uint64_t lrange, rrange;
lrange = base + i*size;
rrange = lrange + size - 1;
generate_one_range(spaces, lrange, rrange, fp, esize);
generate_one_range(spaces, lrange, rrange,
fp, esize, print_values);
}
}
}
else {
for (int i=lower-lindex; i<=upper-lindex; i+=stride) {
generate_mem_ranges_one_rank(
spaces, base+i*size, rank-1, ddp+1, fp, esize);
else {
for (int i=lower-lindex; i<=upper-lindex; i+=stride) {
generate_mem_ranges_one_rank(
spaces, base+i*size, rank-1, ddp+1,
fp, esize, print_values);
}
}
}
}
static void generate_mem_ranges(
const char *spaces,
const arr_desc *adp,
const Arr_Desc *adp,
bool deref,
fpp fp
fpp fp,
bool print_values
)
{
uint64_t esize;
......@@ -241,13 +271,13 @@ static void generate_mem_ranges(
// For c_cean_var the base addr is the address of the data
// For c_cean_var_ptr the base addr is dereferenced to get to the data
spaces, deref ? *((uint64_t*)(adp->base)) : adp->base,
adp->rank, &adp->dim[0], fp, esize);
(*fp)(spaces, last_left, last_right, esize);
adp->rank, &adp->dim[0], fp, esize, print_values);
(*fp)(spaces, last_left, last_right, esize, print_values);
}
// returns offset and length of the data to be transferred
void __arr_data_offset_and_length(
const arr_desc *adp,
const Arr_Desc *adp,
int64_t &offset,
int64_t &length
)
......@@ -284,11 +314,12 @@ void __arr_data_offset_and_length(
#if OFFLOAD_DEBUG > 0
void print_range(
static void print_range(
const char *spaces,
uint64_t low,
uint64_t high,
int esize
int esize,
bool print_values
)
{
char buffer[1024];
......@@ -297,7 +328,7 @@ void print_range(
OFFLOAD_TRACE(3, "%s print_range(low=%p, high=%p, esize=%d)\n",
spaces, (void*)low, (void*)high, esize);
if (console_enabled < 4) {
if (console_enabled < 4 || !print_values) {
return;
}
OFFLOAD_TRACE(4, "%s values:\n", spaces);
......@@ -340,8 +371,9 @@ void print_range(
void __arr_desc_dump(
const char *spaces,
const char *name,
const arr_desc *adp,
bool deref
const Arr_Desc *adp,
bool deref,
bool print_values
)
{
OFFLOAD_TRACE(2, "%s%s CEAN expression %p\n", spaces, name, adp);
......@@ -360,7 +392,7 @@ void __arr_desc_dump(
}
// For c_cean_var the base addr is the address of the data
// For c_cean_var_ptr the base addr is dereferenced to get to the data
generate_mem_ranges(spaces, adp, deref, &print_range);
generate_mem_ranges(spaces, adp, deref, &print_range, print_values);
}
}
#endif // OFFLOAD_DEBUG
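Note on the contiguity test used by generate_mem_ranges_one_rank: a section can be reported as a single memory range when each outer element is exactly as large as the full inner section, and the innermost dimension is either a single element or has stride 1. The following is a minimal C++ sketch of that recursive condition. The struct is a local copy of the fields the code uses for Dim_Desc (its exact layout in cean_util.h is partly hidden by the hunk), and the function name is hypothetical. The sketch moves to the next dimension with ddp + 1.

```cpp
#include <cstdint>

// Local stand-in for Dim_Desc (illustration only; field order assumed).
struct DimDescSketch {
    int64_t size;    // byte length of one element along this dimension
    int64_t lindex;  // lower index of the array
    int64_t lower;   // lower section bound
    int64_t upper;   // upper section bound
    int64_t stride;  // stride between selected elements
};

// Hypothetical helper mirroring the recursive condition in the diff.
constexpr bool section_is_contiguous(uint64_t rank, const DimDescSketch *ddp)
{
    return rank == 1
        // Innermost dimension: one element, or unit stride.
        ? (ddp[0].lower == ddp[0].upper || ddp[0].stride == 1)
        // Outer dimension: its element must be filled completely by the
        // section selected in the next dimension, which must itself pass.
        : (ddp[0].size ==
               (ddp[1].upper - ddp[1].lower + 1) * ddp[1].size &&
           section_is_contiguous(rank - 1, ddp + 1));
}
```

For a 4 x 8 array of 4-byte elements, selecting all 8 columns makes each 32-byte row a full inner section, so the test passes; selecting only columns 0 to 3 leaves gaps between rows, so it fails.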
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -32,9 +32,10 @@
#define CEAN_UTIL_H_INCLUDED
#include <stdint.h>
#include "offload_util.h"
// CEAN expression representation
struct dim_desc {
struct Dim_Desc {
int64_t size; // Length of data type
int64_t lindex; // Lower index
int64_t lower; // Lower section bound
......@@ -42,10 +43,10 @@ struct dim_desc {
int64_t stride; // Stride
};
struct arr_desc {
struct Arr_Desc {
int64_t base; // Base address
int64_t rank; // Rank of array
dim_desc dim[1];
Dim_Desc dim[1];
};
struct CeanReadDim {
......@@ -55,6 +56,7 @@ struct CeanReadDim {
};
struct CeanReadRanges {
Arr_Desc* arr_desc;
void * ptr;
int64_t current_number; // the number of ranges read
int64_t range_max_number; // number of contiguous ranges
......@@ -66,23 +68,23 @@ struct CeanReadRanges {
// array descriptor length
#define __arr_desc_length(rank) \
(sizeof(int64_t) + sizeof(dim_desc) * (rank))
(sizeof(int64_t) + sizeof(Dim_Desc) * (rank))
// returns offset and length of the data to be transferred
void __arr_data_offset_and_length(const arr_desc *adp,
DLL_LOCAL void __arr_data_offset_and_length(const Arr_Desc *adp,
int64_t &offset,
int64_t &length);
// determine whether the data array described by the argument is contiguous
bool is_arr_desc_contiguous(const arr_desc *ap);
DLL_LOCAL bool is_arr_desc_contiguous(const Arr_Desc *ap);
// allocate an element of CeanReadRanges type initialized
// to read consecutive contiguous ranges described by the "ap" argument
CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap);
DLL_LOCAL CeanReadRanges * init_read_ranges_arr_desc(const Arr_Desc *ap);
// check if ranges described by 1 argument could be transfered into ranges
// check if ranges described by 1 argument could be transferred into ranges
// described by 2-nd one
bool cean_ranges_match(
DLL_LOCAL bool cean_ranges_match(
CeanReadRanges * read_rng1,
CeanReadRanges * read_rng2
);
......@@ -90,27 +92,27 @@ bool cean_ranges_match(
// first argument - returned value by call to init_read_ranges_arr_desc.
// returns true if offset and length of next range is set successfully.
// returns false if no ranges remain.
bool get_next_range(
DLL_LOCAL bool get_next_range(
CeanReadRanges * read_rng,
int64_t *offset
);
// returns number of transfered bytes
int64_t cean_get_transf_size(CeanReadRanges * read_rng);
// returns number of transferred bytes
DLL_LOCAL int64_t cean_get_transf_size(CeanReadRanges * read_rng);
#if OFFLOAD_DEBUG > 0
// prints array descriptor contents to stderr
void __arr_desc_dump(
DLL_LOCAL void __arr_desc_dump(
const char *spaces,
const char *name,
const arr_desc *adp,
bool dereference);
const Arr_Desc *adp,
bool dereference,
bool print_values);
#define ARRAY_DESC_DUMP(spaces, name, adp, dereference, print_values) \
if (console_enabled >= 2) \
__arr_desc_dump(spaces, name, adp, dereference, print_values);
#else
#define __arr_desc_dump(
spaces,
name,
adp,
dereference)
#define ARRAY_DESC_DUMP(spaces, name, adp, dereference, print_values)
#endif // OFFLOAD_DEBUG
#endif // CEAN_UTIL_H_INCLUDED
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -50,6 +50,13 @@ COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*, const void*,
const char**, uint8_t, const char*,
uint64_t, const char*, const char*,
uint64_t, COIPROCESS*);
COIRESULT (*ProcessCreateFromFile)(COIENGINE, const char*,
int, const char**, uint8_t,
const char**, uint8_t, const char*,
uint64_t, const char*,COIPROCESS*);
COIRESULT (*ProcessSetCacheSize)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*);
COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t, int8_t*, uint32_t*);
COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t, const char**,
COIFUNCTION*);
......@@ -57,6 +64,8 @@ COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS, const void*, uint64_t,
const char*, const char*,
const char*, uint64_t, uint32_t,
COILIBRARY*);
COIRESULT (*ProcessUnloadLibrary)(COIPROCESS,
COILIBRARY);
COIRESULT (*ProcessRegisterLibraries)(uint32_t, const void**, const uint64_t*,
const char**, const uint64_t*);
......@@ -80,6 +89,13 @@ COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferReadMultiD)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferWriteMultiD)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*);
......@@ -92,6 +108,20 @@ COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t, uint8_t, uint32_t*,
uint64_t (*PerfGetCycleFrequency)(void);
COIRESULT (*PipelineClearCPUMask) (COI_CPU_MASK);
COIRESULT (*PipelineSetCPUMask) (COIPROCESS, uint32_t,
uint8_t, COI_CPU_MASK);
COIRESULT (*EngineGetInfo)(COIENGINE, uint32_t, COI_ENGINE_INFO*);
COIRESULT (*EventRegisterCallback)(
const COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t);
COIRESULT (*ProcessConfigureDMA)(const uint64_t, const int);
bool init(void)
{
#ifndef TARGET_WINNT
......@@ -140,6 +170,32 @@ bool init(void)
return false;
}
ProcessSetCacheSize =
(COIRESULT (*)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIProcessSetCacheSize", COI_VERSION1);
if (ProcessSetCacheSize == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessSetCacheSize");
#if 0 // for now disable as ProcessSetCacheSize is not available on < MPSS 3.4
fini();
return false;
#endif
}
ProcessCreateFromFile =
(COIRESULT (*)(COIENGINE, const char*, int, const char**, uint8_t,
const char**, uint8_t, const char*, uint64_t,
const char*, COIPROCESS*))
DL_sym(lib_handle, "COIProcessCreateFromFile", COI_VERSION1);
if (ProcessCreateFromFile == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessCreateFromFile");
fini();
return false;
}
ProcessDestroy =
(COIRESULT (*)(COIPROCESS, int32_t, uint8_t, int8_t*,
uint32_t*))
......@@ -173,6 +229,17 @@ bool init(void)
return false;
}
ProcessUnloadLibrary =
(COIRESULT (*)(COIPROCESS,
COILIBRARY))
DL_sym(lib_handle, "COIProcessUnloadLibrary", COI_VERSION1);
if (ProcessUnloadLibrary == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessUnloadLibrary");
fini();
return false;
}
ProcessRegisterLibraries =
(COIRESULT (*)(uint32_t, const void**, const uint64_t*, const char**,
const uint64_t*))
......@@ -295,6 +362,22 @@ bool init(void)
return false;
}
BufferReadMultiD =
(COIRESULT (*)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIBufferReadMultiD", COI_VERSION1);
// We accept that coi library has no COIBufferReadMultiD routine.
// So there is no check for zero value
BufferWriteMultiD =
(COIRESULT (*)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIBufferWriteMultiD", COI_VERSION1);
// We accept that coi library has no COIBufferWriteMultiD routine.
// So there is no check for zero value
BufferCopy =
(COIRESULT (*)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*,
......@@ -350,6 +433,47 @@ bool init(void)
return false;
}
PipelineClearCPUMask =
(COIRESULT (*)(COI_CPU_MASK))
DL_sym(lib_handle, "COIPipelineClearCPUMask", COI_VERSION1);
if (PipelineClearCPUMask == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIPipelineClearCPUMask");
fini();
return false;
}
PipelineSetCPUMask =
(COIRESULT (*)(COIPROCESS, uint32_t,uint8_t, COI_CPU_MASK))
DL_sym(lib_handle, "COIPipelineSetCPUMask", COI_VERSION1);
if (PipelineSetCPUMask == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIPipelineSetCPUMask");
fini();
return false;
}
EngineGetInfo =
(COIRESULT (*)(COIENGINE, uint32_t, COI_ENGINE_INFO*))
DL_sym(lib_handle, "COIEngineGetInfo", COI_VERSION1);
if (EngineGetInfo == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIEngineGetInfo");
fini();
return false;
}
EventRegisterCallback =
(COIRESULT (*)(COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t))
DL_sym(lib_handle, "COIEventRegisterCallback", COI_VERSION1);
ProcessConfigureDMA =
(COIRESULT (*)(const uint64_t, const int))
DL_sym(lib_handle, "COIProcessConfigureDMA", COI_VERSION1);
is_available = true;
return true;
......
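Note on the symbol lookups added to init(): most entry points are treated as required (a failed lookup calls fini() and returns false), while a few, such as COIBufferReadMultiD and COIBufferWriteMultiD, are accepted as missing and left as null pointers. The following is a minimal C++ sketch of that required-versus-optional pattern. It does not use the real loader: a small in-memory table stands in for the library's exports, lookup_sketch stands in for DL_sym, and every name in it is hypothetical. Whether and where liboffloadmic checks the optional pointers before calling them is not shown in this diff; the fallback here is an assumption.

```cpp
#include <cassert>
#include <cstring>

// Generic function-pointer type used to carry any resolved entry point.
typedef void (*generic_fn)();

// Stand-in routine "exported" by the pretend library (hypothetical).
static int required_impl(int x) { return x + 1; }

struct SymbolEntry {
    const char *name;
    generic_fn fn;
};

// Stand-in export table. "optional_routine" is deliberately absent,
// as it would be on an older runtime.
static const SymbolEntry exports[] = {
    { "required_routine", reinterpret_cast<generic_fn>(&required_impl) },
};

// Stand-in for the symbol lookup: null when the name is not exported.
static generic_fn lookup_sketch(const char *name)
{
    assert(name != nullptr);
    for (const SymbolEntry &e : exports) {
        if (std::strcmp(e.name, name) == 0)
            return e.fn;
    }
    return nullptr;
}

static int (*RequiredRoutine)(int) = nullptr;
static int (*OptionalRoutine)(int) = nullptr;

static bool init_sketch()
{
    RequiredRoutine =
        reinterpret_cast<int (*)(int)>(lookup_sketch("required_routine"));
    if (RequiredRoutine == nullptr)
        return false;  // required symbol missing: initialization fails

    // Optional symbol: a null result is accepted, so no failure here.
    OptionalRoutine =
        reinterpret_cast<int (*)(int)>(lookup_sketch("optional_routine"));
    return true;
}

// Callers test the optional pointer and fall back when it is null.
static int call_with_fallback(int x)
{
    return OptionalRoutine ? OptionalRoutine(x) : RequiredRoutine(x);
}
```

Keeping optional entry points nullable lets one binary run against both older and newer runtimes, at the cost of a null check at each call site.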
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -28,7 +28,7 @@
*/
// The interface betwen offload library and the COI API on the host
// The interface between offload library and the COI API on the host
#ifndef COI_CLIENT_H_INCLUDED
#define COI_CLIENT_H_INCLUDED
......@@ -54,16 +54,16 @@
// COI library interface
namespace COI {
extern bool init(void);
extern void fini(void);
DLL_LOCAL extern bool init(void);
DLL_LOCAL extern void fini(void);
extern bool is_available;
DLL_LOCAL extern bool is_available;
// pointers to functions from COI library
extern COIRESULT (*EngineGetCount)(COI_ISA_TYPE, uint32_t*);
extern COIRESULT (*EngineGetHandle)(COI_ISA_TYPE, uint32_t, COIENGINE*);
DLL_LOCAL extern COIRESULT (*EngineGetCount)(COI_ISA_TYPE, uint32_t*);
DLL_LOCAL extern COIRESULT (*EngineGetHandle)(COI_ISA_TYPE, uint32_t, COIENGINE*);
extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*,
DLL_LOCAL extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*,
const void*, uint64_t, int,
const char**, uint8_t,
const char**, uint8_t,
......@@ -71,12 +71,23 @@ extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*,
const char*,
const char*, uint64_t,
COIPROCESS*);
extern COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t,
DLL_LOCAL extern COIRESULT (*ProcessCreateFromFile)(COIENGINE, const char*, int,
const char**, uint8_t,
const char**,
uint8_t,
const char*,
uint64_t,
const char*,
COIPROCESS*);
DLL_LOCAL extern COIRESULT (*ProcessSetCacheSize)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t,
int8_t*, uint32_t*);
extern COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t,
DLL_LOCAL extern COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t,
const char**,
COIFUNCTION*);
extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS,
DLL_LOCAL extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS,
const void*,
uint64_t,
const char*,
......@@ -85,54 +96,80 @@ extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS,
uint64_t,
uint32_t,
COILIBRARY*);
extern COIRESULT (*ProcessRegisterLibraries)(uint32_t,
DLL_LOCAL extern COIRESULT (*ProcessUnloadLibrary)(COIPROCESS,
COILIBRARY);
DLL_LOCAL extern COIRESULT (*ProcessRegisterLibraries)(uint32_t,
const void**,
const uint64_t*,
const char**,
const uint64_t*);
extern COIRESULT (*PipelineCreate)(COIPROCESS, COI_CPU_MASK, uint32_t,
DLL_LOCAL extern COIRESULT (*PipelineCreate)(COIPROCESS, COI_CPU_MASK, uint32_t,
COIPIPELINE*);
extern COIRESULT (*PipelineDestroy)(COIPIPELINE);
extern COIRESULT (*PipelineRunFunction)(COIPIPELINE, COIFUNCTION,
DLL_LOCAL extern COIRESULT (*PipelineDestroy)(COIPIPELINE);
DLL_LOCAL extern COIRESULT (*PipelineRunFunction)(COIPIPELINE, COIFUNCTION,
uint32_t, const COIBUFFER*,
const COI_ACCESS_FLAGS*,
uint32_t, const COIEVENT*,
const void*, uint16_t, void*,
uint16_t, COIEVENT*);
extern COIRESULT (*BufferCreate)(uint64_t, COI_BUFFER_TYPE, uint32_t,
DLL_LOCAL extern COIRESULT (*BufferCreate)(uint64_t, COI_BUFFER_TYPE, uint32_t,
const void*, uint32_t,
const COIPROCESS*, COIBUFFER*);
extern COIRESULT (*BufferCreateFromMemory)(uint64_t, COI_BUFFER_TYPE,
DLL_LOCAL extern COIRESULT (*BufferCreateFromMemory)(uint64_t, COI_BUFFER_TYPE,
uint32_t, void*,
uint32_t, const COIPROCESS*,
COIBUFFER*);
extern COIRESULT (*BufferDestroy)(COIBUFFER);
extern COIRESULT (*BufferMap)(COIBUFFER, uint64_t, uint64_t,
DLL_LOCAL extern COIRESULT (*BufferDestroy)(COIBUFFER);
DLL_LOCAL extern COIRESULT (*BufferMap)(COIBUFFER, uint64_t, uint64_t,
COI_MAP_TYPE, uint32_t, const COIEVENT*,
COIEVENT*, COIMAPINSTANCE*, void**);
extern COIRESULT (*BufferUnmap)(COIMAPINSTANCE, uint32_t,
DLL_LOCAL extern COIRESULT (*BufferUnmap)(COIMAPINSTANCE, uint32_t,
const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*,
DLL_LOCAL extern COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*,
uint64_t, COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t,
DLL_LOCAL extern COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t,
COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t,
DLL_LOCAL extern COIRESULT (*BufferReadMultiD)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*BufferWriteMultiD)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t,
uint64_t, COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*);
extern COIRESULT (*BufferSetState)(COIBUFFER, COIPROCESS, COI_BUFFER_STATE,
DLL_LOCAL extern COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*);
DLL_LOCAL extern COIRESULT (*BufferSetState)(COIBUFFER, COIPROCESS, COI_BUFFER_STATE,
COI_BUFFER_MOVE_FLAG, uint32_t,
const COIEVENT*, COIEVENT*);
extern COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t,
DLL_LOCAL extern COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t,
uint8_t, uint32_t*, uint32_t*);
extern uint64_t (*PerfGetCycleFrequency)(void);
DLL_LOCAL extern uint64_t (*PerfGetCycleFrequency)(void);
DLL_LOCAL extern COIRESULT (*ProcessConfigureDMA)(const uint64_t, const int);
extern COIRESULT (*PipelineClearCPUMask)(COI_CPU_MASK);
extern COIRESULT (*PipelineSetCPUMask)(COIPROCESS, uint32_t,
uint8_t, COI_CPU_MASK);
extern COIRESULT (*EngineGetInfo)(COIENGINE, uint32_t, COI_ENGINE_INFO*);
extern COIRESULT (*EventRegisterCallback)(
const COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t);
const int DMA_MODE_READ_WRITE = 1;
} // namespace COI
#endif // COI_CLIENT_H_INCLUDED
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -38,6 +38,22 @@
#include "../offload_myo_target.h" // for __offload_myoLibInit/Fini
#endif // MYO_SUPPORT
#if !defined(CPU_COUNT)
// if CPU_COUNT is not defined count number of CPUs manually
static
int my_cpu_count(cpu_set_t const *cpu_set)
{
int res = 0;
for (int i = 0; i < sizeof(cpu_set_t) / sizeof(__cpu_mask); ++i) {
res += __builtin_popcountl(cpu_set->__bits[i]);
}
return res;
}
// Map CPU_COUNT to our function
#define CPU_COUNT(x) my_cpu_count(x)
#endif
COINATIVELIBEXPORT
void server_compute(
uint32_t buffer_count,
......@@ -118,6 +134,20 @@ void server_var_table_copy(
__offload_vars.table_copy(buffers[0], *static_cast<int64_t*>(misc_data));
}
COINATIVELIBEXPORT
void server_set_stream_affinity(
uint32_t buffer_count,
void** buffers,
uint64_t* buffers_len,
void* misc_data,
uint16_t misc_data_len,
void* return_data,
uint16_t return_data_len
)
{
/* kmp affinity is not supported by GCC. */
}
#ifdef MYO_SUPPORT
// temporary workaround for blocking behavior of myoiLibInit/Fini calls
COINATIVELIBEXPORT
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -28,7 +28,7 @@
*/
//The interface betwen offload library and the COI API on the target.
// The interface between offload library and the COI API on the target
#ifndef COI_SERVER_H_INCLUDED
#define COI_SERVER_H_INCLUDED
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -35,7 +35,7 @@
#include <alloca.h>
#endif // TARGET_WINNT
// Global counter on host.
// Global counter on host.
// This variable is used if P2OPT_offload_do_data_persistence == 2.
// The variable used to identify offload constructs contained in one procedure.
// Increment of OFFLOAD_CALL_COUNT is inserted at entries of HOST routines with
......@@ -72,7 +72,7 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE(
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
// initalize all devices is init_type is on_offload_all
// initialize all devices if init_type is on_offload_all
if (retval && __offload_init_type == c_init_on_offload_all) {
for (int i = 0; i < mic_engines_total; i++) {
mic_engines[i].init();
......@@ -241,7 +241,128 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1(
return ofld;
}
int offload_offload_wrap(
extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE2(
TARGET_TYPE target_type,
int target_number,
int is_optional,
_Offload_status* status,
const char* file,
uint64_t line,
const void** stream
)
{
bool retval;
OFFLOAD ofld;
// initialize status
if (status != 0) {
status->result = OFFLOAD_UNAVAILABLE;
status->device_number = -1;
status->data_sent = 0;
status->data_received = 0;
}
// make sure library is initialized
retval = __offload_init_library();
// OFFLOAD_TIMER_INIT must follow call to __offload_init_library
OffloadHostTimerData * timer_data = OFFLOAD_TIMER_INIT(file, line);
OFFLOAD_TIMER_START(timer_data, c_offload_host_total_offload);
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
// initialize all devices if init_type is on_offload_all
if (retval && __offload_init_type == c_init_on_offload_all) {
for (int i = 0; i < mic_engines_total; i++) {
mic_engines[i].init();
}
}
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_initialize);
OFFLOAD_TIMER_START(timer_data, c_offload_host_target_acquire);
if (target_type == TARGET_HOST) {
// Host always available
retval = true;
}
else if (target_type == TARGET_MIC) {
_Offload_stream handle = *(reinterpret_cast<_Offload_stream*>(stream));
Stream * stream = handle ? Stream::find_stream(handle, false) : NULL;
if (target_number >= -1) {
if (retval) {
// device number is defined by stream
if (stream) {
target_number = stream->get_device();
target_number = target_number % mic_engines_total;
}
// reserve device in ORSL
if (target_number != -1) {
if (is_optional) {
if (!ORSL::try_reserve(target_number)) {
target_number = -1;
}
}
else {
if (!ORSL::reserve(target_number)) {
target_number = -1;
}
}
}
// initialize device
if (target_number >= 0 &&
__offload_init_type == c_init_on_offload) {
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
mic_engines[target_number].init();
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_initialize);
}
}
else {
// fallback to CPU
target_number = -1;
}
if (!(target_number == -1 && handle == 0)) {
if (target_number < 0 || !retval) {
if (!is_optional && status == 0) {
LIBOFFLOAD_ERROR(c_device_is_not_available);
exit(1);
}
retval = false;
}
}
}
else {
LIBOFFLOAD_ERROR(c_invalid_device_number);
exit(1);
}
}
if (retval) {
ofld = new OffloadDescriptor(target_number, status,
!is_optional, false, timer_data);
OFFLOAD_TIMER_HOST_MIC_NUM(timer_data, target_number);
Offload_Report_Prolog(timer_data);
OFFLOAD_DEBUG_TRACE_1(2, timer_data->offload_number, c_offload_start,
"Starting offload: target_type = %d, "
"number = %d, is_optional = %d\n",
target_type, target_number, is_optional);
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_target_acquire);
}
else {
ofld = NULL;
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_target_acquire);
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_total_offload);
offload_report_free_data(timer_data);
}
return ofld;
}
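The TARGET_MIC branch above ends in one of three ways: a descriptor is created, NULL is returned so the caller can fall back, or the process exits with an error. A minimal sketch of that decision follows, assuming the device reservation step has already run and `target_number` holds its result. The enum, the function name, and the boolean parameters are hypothetical stand-ins for `handle`, `retval`, `is_optional`, and `status`. They are not part of liboffloadmic.

```cpp
// Possible outcomes of the acquisition step (names are hypothetical).
enum class AcquireOutcome { Proceed, ReturnNull, FatalExit };

// target_number: device chosen after the reservation step (-1 = none).
// has_stream:    a non-zero stream handle was passed in.
// lib_ok:        library initialization succeeded.
// is_optional:   CPU fall-back is allowed.
// has_status:    the caller supplied a status object for the result.
AcquireOutcome classify_acquire(int target_number, bool has_stream,
                                bool lib_ok, bool is_optional,
                                bool has_status)
{
    // "Any device, no stream" skips the availability check entirely.
    const bool unspecified = (target_number == -1 && !has_stream);
    if (!unspecified && (target_number < 0 || !lib_ok)) {
        // A mandatory offload with no status object cannot report failure.
        if (!is_optional && !has_status)
            return AcquireOutcome::FatalExit;
        return AcquireOutcome::ReturnNull;
    }
    return lib_ok ? AcquireOutcome::Proceed : AcquireOutcome::ReturnNull;
}
```

Under this reading, a mandatory offload only terminates the process when a specific device or stream was requested and no status object is available to carry the failure back.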
static int offload_offload_wrap(
OFFLOAD ofld,
const char *name,
int is_empty,
......@@ -252,12 +373,15 @@ int offload_offload_wrap(
const void **waits,
const void **signal,
int entry_id,
const void *stack_addr
const void *stack_addr,
OffloadFlags offload_flags
)
{
bool ret = ofld->offload(name, is_empty, vars, vars2, num_vars,
waits, num_waits, signal, entry_id, stack_addr);
if (!ret || signal == 0) {
waits, num_waits, signal, entry_id,
stack_addr, offload_flags);
if (!ret || (signal == 0 && ofld->get_stream() == 0 &&
!offload_flags.bits.omp_async)) {
delete ofld;
}
return ret;
......@@ -278,7 +402,7 @@ extern "C" int OFFLOAD_OFFLOAD1(
return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2,
num_waits, waits,
signal, NULL, NULL);
signal, 0, NULL, {0});
}
extern "C" int OFFLOAD_OFFLOAD2(
......@@ -298,7 +422,35 @@ extern "C" int OFFLOAD_OFFLOAD2(
return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2,
num_waits, waits,
signal, entry_id, stack_addr);
signal, entry_id, stack_addr, {0});
}
extern "C" int OFFLOAD_OFFLOAD3(
OFFLOAD ofld,
const char *name,
int is_empty,
int num_vars,
VarDesc *vars,
VarDesc2 *vars2,
int num_waits,
const void** waits,
const void** signal,
int entry_id,
const void *stack_addr,
OffloadFlags offload_flags,
const void** stream
)
{
// 1. if the source is compiled with -traceback then stream is 0
// 2. if offload has a stream clause then stream is address of stream value
if (stream) {
ofld->set_stream(*(reinterpret_cast<_Offload_stream *>(stream)));
}
return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2,
num_waits, waits,
signal, entry_id, stack_addr, offload_flags);
}
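The updated condition in offload_offload_wrap frees the descriptor immediately only when the offload failed, or when nothing asynchronous (a signal, a stream, or the OpenMP async flag) may still need it. A minimal sketch of that predicate follows. The function name and its boolean parameters are hypothetical stand-ins for `signal`, `ofld->get_stream()`, and `offload_flags.bits.omp_async`.

```cpp
// Hypothetical predicate mirroring the updated delete condition.
// offload_ok: result of ofld->offload(...)
// has_signal: a signal pointer was supplied (signal != 0)
// has_stream: the descriptor is bound to a stream
// omp_async:  the OpenMP async bit is set in the offload flags
bool descriptor_can_be_deleted(bool offload_ok, bool has_signal,
                               bool has_stream, bool omp_async)
{
    if (!offload_ok)
        return true;  // a failed offload never needs the descriptor again
    return !has_signal && !has_stream && !omp_async;
}
```

In the remaining cases the descriptor presumably has to outlive the call so that a later wait or stream operation can complete it.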
extern "C" int OFFLOAD_OFFLOAD(
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -39,9 +39,11 @@
#define OFFLOAD_TARGET_ACQUIRE OFFLOAD_PREFIX(target_acquire)
#define OFFLOAD_TARGET_ACQUIRE1 OFFLOAD_PREFIX(target_acquire1)
#define OFFLOAD_TARGET_ACQUIRE2 OFFLOAD_PREFIX(target_acquire2)
#define OFFLOAD_OFFLOAD OFFLOAD_PREFIX(offload)
#define OFFLOAD_OFFLOAD1 OFFLOAD_PREFIX(offload1)
#define OFFLOAD_OFFLOAD2 OFFLOAD_PREFIX(offload2)
#define OFFLOAD_OFFLOAD3 OFFLOAD_PREFIX(offload3)
#define OFFLOAD_CALL_COUNT OFFLOAD_PREFIX(offload_call_count)
......@@ -75,6 +77,26 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1(
uint64_t line
);
/*! \fn OFFLOAD_TARGET_ACQUIRE2
\brief Attempt to acquire the target.
\param target_type The type of target.
\param target_number The device number.
\param is_optional Whether CPU fall-back is allowed.
\param status Address of variable to hold offload status.
\param file Filename in which this offload occurred.
\param line Line number in the file where this offload occurred.
\param stream Pointer to stream value.
*/
extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE2(
TARGET_TYPE target_type,
int target_number,
int is_optional,
_Offload_status* status,
const char* file,
uint64_t line,
const void** stream
);
/*! \fn OFFLOAD_OFFLOAD1
\brief Run function on target using interface for old data persistence.
\param o Offload descriptor created by OFFLOAD_TARGET_ACQUIRE.
......@@ -127,6 +149,40 @@ extern "C" int OFFLOAD_OFFLOAD2(
const void *stack_addr
);
/*! \fn OFFLOAD_OFFLOAD3
\brief Run function on target. API introduced in 15.0 Update 1,
when the targetptr and preallocated features were introduced.
\param o Offload descriptor created by OFFLOAD_TARGET_ACQUIRE.
\param name Name of offload entry point.
\param is_empty If no code to execute (e.g. offload_transfer)
\param num_vars Number of variable descriptors.
\param vars Pointer to VarDesc array.
\param vars2 Pointer to VarDesc2 array.
\param num_waits Number of "wait" values.
\param waits Pointer to array of wait values.
\param signal Pointer to signal value or NULL.
\param entry_id A signature for the function doing the offload.
\param stack_addr The stack frame address of the function doing offload.
\param offload_flags Flags to indicate Fortran traceback, OpenMP async.
\param stream Pointer to stream value or NULL.
*/
extern "C" int OFFLOAD_OFFLOAD3(
OFFLOAD ofld,
const char *name,
int is_empty,
int num_vars,
VarDesc *vars,
VarDesc2 *vars2,
int num_waits,
const void** waits,
const void** signal,
int entry_id,
const void *stack_addr,
OffloadFlags offload_flags,
const void** stream
);
// Run function on target (obsolete).
// @param o OFFLOAD object
// @param name function name
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -102,8 +102,8 @@ CeanReadRanges * init_read_ranges_dv(const ArrDesc *dvp)
}
res = (CeanReadRanges *)malloc(
sizeof(CeanReadRanges) + (rank - i) * sizeof(CeanReadDim));
if (res == NULL)
LIBOFFLOAD_ERROR(c_malloc);
if (res == NULL)
LIBOFFLOAD_ERROR(c_malloc);
res -> last_noncont_ind = rank - i - 1;
count = 1;
for (; i < rank; i++) {
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -32,6 +32,7 @@
#define DV_UTIL_H_INCLUDED
#include <stdint.h>
#include "offload_util.h"
// Dope vector declarations
#define ArrDescMaxArrayRank 31
......@@ -64,18 +65,18 @@ typedef struct ArrDesc {
typedef ArrDesc* pArrDesc;
bool __dv_is_contiguous(const ArrDesc *dvp);
DLL_LOCAL bool __dv_is_contiguous(const ArrDesc *dvp);
bool __dv_is_allocated(const ArrDesc *dvp);
DLL_LOCAL bool __dv_is_allocated(const ArrDesc *dvp);
uint64_t __dv_data_length(const ArrDesc *dvp);
DLL_LOCAL uint64_t __dv_data_length(const ArrDesc *dvp);
uint64_t __dv_data_length(const ArrDesc *dvp, int64_t nelems);
DLL_LOCAL uint64_t __dv_data_length(const ArrDesc *dvp, int64_t nelems);
CeanReadRanges * init_read_ranges_dv(const ArrDesc *dvp);
DLL_LOCAL CeanReadRanges * init_read_ranges_dv(const ArrDesc *dvp);
#if OFFLOAD_DEBUG > 0
void __dv_desc_dump(const char *name, const ArrDesc *dvp);
DLL_LOCAL void __dv_desc_dump(const char *name, const ArrDesc *dvp);
#else // OFFLOAD_DEBUG
#define __dv_desc_dump(name, dvp)
#endif // OFFLOAD_DEBUG
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -62,8 +62,8 @@
/* Environment variable for target executable run command. */
#define OFFLOAD_EMUL_RUN_ENV "OFFLOAD_EMUL_RUN"
/* Environment variable for number ok KNC devices. */
#define OFFLOAD_EMUL_KNC_NUM_ENV "OFFLOAD_EMUL_KNC_NUM"
/* Environment variable for number of emulated devices. */
#define OFFLOAD_EMUL_NUM_ENV "OFFLOAD_EMUL_NUM"
/* Path to engine directory. */
......@@ -133,6 +133,7 @@ typedef enum
CMD_BUFFER_UNMAP,
CMD_GET_FUNCTION_HANDLE,
CMD_OPEN_LIBRARY,
CMD_CLOSE_LIBRARY,
CMD_RUN_FUNCTION,
CMD_SHUTDOWN
} cmd_t;
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -109,8 +109,8 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
strlen (PIPE_HOST_PATH) + strlen (mic_dir) + 1);
MALLOC (char *, pipe_target_path,
strlen (PIPE_TARGET_PATH) + strlen (mic_dir) + 1);
sprintf (pipe_host_path, "%s"PIPE_HOST_PATH, mic_dir);
sprintf (pipe_target_path, "%s"PIPE_TARGET_PATH, mic_dir);
sprintf (pipe_host_path, "%s" PIPE_HOST_PATH, mic_dir);
sprintf (pipe_target_path, "%s" PIPE_TARGET_PATH, mic_dir);
pipe_host = open (pipe_host_path, O_CLOEXEC | O_WRONLY);
if (pipe_host < 0)
COIERROR ("Cannot open target-to-host pipe.");
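The sprintf calls in this hunk gain a space between `"%s"` and the path macro. The commit does not state the reason. One plausible explanation is that, since C++11, an identifier written directly after a string literal is parsed as a user-defined literal suffix, so the macro is no longer expanded. The sketch below shows the spaced form. The macro value and the function name are hypothetical and chosen only for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical value; the real macro is defined elsewhere in the emulator.
#define PIPE_HOST_PATH "/pipe_host"

// Builds "<dir>/pipe_host" through adjacent string literal concatenation.
// The space before PIPE_HOST_PATH keeps the macro a separate token.
std::string make_pipe_host_path(const char *dir)
{
    char buf[256];
    std::snprintf(buf, sizeof buf, "%s" PIPE_HOST_PATH, dir);
    return std::string(buf);
}
```

With the space in place the preprocessor expands the macro first, and the two adjacent literals are then joined into the single format string `"%s/pipe_host"`.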
......@@ -237,6 +237,7 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
{
char *lib_path;
size_t len;
void *handle;
/* Receive data from host. */
READ (pipe_target, &len, sizeof (size_t));
......@@ -244,14 +245,28 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
READ (pipe_target, lib_path, len);
/* Open library. */
if (dlopen (lib_path, RTLD_LAZY | RTLD_GLOBAL) == 0)
handle = dlopen (lib_path, RTLD_LAZY | RTLD_GLOBAL);
if (handle == NULL)
COIERROR ("Cannot load %s: %s", lib_path, dlerror ());
/* Send data to host. */
WRITE (pipe_host, &handle, sizeof (void *));
/* Clean up. */
free (lib_path);
break;
}
case CMD_CLOSE_LIBRARY:
{
/* Receive data from host. */
void *handle;
READ (pipe_target, &handle, sizeof (void *));
dlclose (handle);
break;
}
case CMD_RUN_FUNCTION:
{
uint16_t misc_data_len, return_data_len;
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -40,8 +40,8 @@ extern char **environ;
char **tmp_dirs;
unsigned tmp_dirs_num = 0;
/* Number of KNC engines. */
long knc_engines_num;
/* Number of emulated MIC engines. */
long num_engines;
/* Mutex to sync parallel execution. */
pthread_mutex_t mutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
......@@ -116,8 +116,7 @@ __attribute__((constructor))
static void
init ()
{
if (read_long_env (OFFLOAD_EMUL_KNC_NUM_ENV, &knc_engines_num, 1)
== COI_ERROR)
if (read_long_env (OFFLOAD_EMUL_NUM_ENV, &num_engines, 1) == COI_ERROR)
exit (0);
}
......@@ -665,10 +664,10 @@ SYMBOL_VERSION (COIEngineGetCount, 1) (COI_ISA_TYPE isa,
COITRACE ("COIEngineGetCount");
/* Features of liboffload. */
assert (isa == COI_ISA_KNC);
assert (isa == COI_ISA_MIC);
/* Prepare output arguments. */
*count = knc_engines_num;
*count = num_engines;
return COI_SUCCESS;
}
......@@ -684,10 +683,10 @@ SYMBOL_VERSION (COIEngineGetHandle, 1) (COI_ISA_TYPE isa,
Engine *engine;
/* Features of liboffload. */
assert (isa == COI_ISA_KNC);
assert (isa == COI_ISA_MIC);
/* Check engine index. */
if (index >= knc_engines_num)
if (index >= num_engines)
COIERROR ("Wrong engine index.");
/* Create engine handle. */
......@@ -889,7 +888,7 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
/* Create directory for pipes to prevent names collision. */
MALLOC (char *, pipes_path, strlen (PIPES_PATH) + strlen (eng->dir) + 1);
sprintf (pipes_path, "%s"PIPES_PATH, eng->dir);
sprintf (pipes_path, "%s" PIPES_PATH, eng->dir);
if (mkdir (pipes_path, S_IRWXU) < 0)
COIERROR ("Cannot create folder %s.", pipes_path);
......@@ -900,8 +899,8 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
strlen (PIPE_TARGET_PATH) + strlen (eng->dir) + 1);
if (pipe_target_path == NULL)
COIERROR ("Cannot allocate memory.");
sprintf (pipe_host_path, "%s"PIPE_HOST_PATH, eng->dir);
sprintf (pipe_target_path, "%s"PIPE_TARGET_PATH, eng->dir);
sprintf (pipe_host_path, "%s" PIPE_HOST_PATH, eng->dir);
sprintf (pipe_target_path, "%s" PIPE_TARGET_PATH, eng->dir);
if (mkfifo (pipe_host_path, S_IRUSR | S_IWUSR) < 0)
COIERROR ("Cannot create pipe %s.", pipe_host_path);
if (mkfifo (pipe_target_path, S_IRUSR | S_IWUSR) < 0)
......@@ -1019,6 +1018,27 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
COIRESULT
SYMBOL_VERSION (COIProcessCreateFromFile, 1) (COIENGINE in_Engine,
const char *in_pBinaryName,
int in_Argc,
const char **in_ppArgv,
uint8_t in_DupEnv,
const char **in_ppAdditionalEnv,
uint8_t in_ProxyActive,
const char *in_Reserved,
uint64_t in_BufferSpace,
const char *in_LibrarySearchPath,
COIPROCESS *out_pProcess)
{
COITRACE ("COIProcessCreateFromFile");
/* liboffloadmic with GCC compiled binaries should never go here. */
assert (false);
return COI_ERROR;
}
COIRESULT
SYMBOL_VERSION (COIProcessDestroy, 1) (COIPROCESS process,
int32_t wait_timeout, // Ignored
uint8_t force,
......@@ -1129,38 +1149,39 @@ SYMBOL_VERSION (COIProcessGetFunctionHandles, 1) (COIPROCESS process,
COIRESULT
SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process,
const void *lib_buffer,
uint64_t lib_buffer_len,
const char *lib_name,
const char *lib_search_path,
const char *file_of_origin, // Ignored
uint64_t file_from_origin_offset, // Ignored
uint32_t flags, // Ignored
COILIBRARY *library) // Ignored
SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS in_Process,
const void *in_pLibraryBuffer,
uint64_t in_LibraryBufferLength,
const char *in_pLibraryName,
const char *in_LibrarySearchPath, // Ignored
const char *in_FileOfOrigin, // Ignored
uint64_t in_FileOfOriginOffset, // Ignored
uint32_t in_Flags, // Ignored
COILIBRARY *out_pLibrary)
{
COITRACE ("COIProcessLoadLibraryFromMemory");
const cmd_t cmd = CMD_OPEN_LIBRARY;
char *lib_path;
cmd_t cmd = CMD_OPEN_LIBRARY;
int fd;
FILE *file;
size_t len;
/* Convert input arguments. */
Process *proc = (Process *) process;
Process *proc = (Process *) in_Process;
/* Create target library file. */
MALLOC (char *, lib_path,
strlen (proc->engine->dir) + strlen (lib_name) + 2);
sprintf (lib_path, "%s/%s", proc->engine->dir, lib_name);
strlen (proc->engine->dir) + strlen (in_pLibraryName) + 2);
sprintf (lib_path, "%s/%s", proc->engine->dir, in_pLibraryName);
fd = open (lib_path, O_CLOEXEC | O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
if (fd < 0)
COIERROR ("Cannot create file %s.", lib_path);
file = fdopen (fd, "wb");
if (file == NULL)
COIERROR ("Cannot associate stream with file descriptor.");
if (fwrite (lib_buffer, 1, lib_buffer_len, file) != lib_buffer_len)
if (fwrite (in_pLibraryBuffer, 1, in_LibraryBufferLength, file)
!= in_LibraryBufferLength)
COIERROR ("Cannot write in file %s.", lib_path);
if (fclose (file) != 0)
COIERROR ("Cannot close file %s.", lib_path);
......@@ -1176,6 +1197,10 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process,
WRITE (proc->pipeline->pipe_target, &len, sizeof (size_t));
WRITE (proc->pipeline->pipe_target, lib_path, len);
/* Receive data from target. */
void *handle;
READ (proc->pipeline->pipe_host, &handle, sizeof (void *));
/* Finish critical section. */
if (pthread_mutex_unlock (&mutex) != 0)
COIERROR ("Cannot unlock mutex.");
......@@ -1183,6 +1208,7 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process,
/* Clean up. */
free (lib_path);
*out_pLibrary = (COILIBRARY) handle;
return COI_SUCCESS;
}
......@@ -1202,6 +1228,33 @@ SYMBOL_VERSION (COIProcessRegisterLibraries, 1) (uint32_t libraries_num,
}
COIRESULT
SYMBOL_VERSION (COIProcessUnloadLibrary, 1) (COIPROCESS in_Process,
COILIBRARY in_Library)
{
COITRACE ("COIProcessUnloadLibrary");
const cmd_t cmd = CMD_CLOSE_LIBRARY;
/* Convert input arguments. */
Process *proc = (Process *) in_Process;
/* Start critical section. */
if (pthread_mutex_lock (&mutex) != 0)
COIERROR ("Cannot lock mutex.");
/* Make target close library. */
WRITE (proc->pipeline->pipe_target, &cmd, sizeof (cmd_t));
WRITE (proc->pipeline->pipe_target, &in_Library, sizeof (void *));
/* Finish critical section. */
if (pthread_mutex_unlock (&mutex) != 0)
COIERROR ("Cannot unlock mutex.");
return COI_SUCCESS;
}
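COIProcessLoadLibraryFromMemory and COIProcessUnloadLibrary treat the value returned by the target's dlopen as an opaque token: it crosses the pipe as raw pointer bytes, is stored on the host as a COILIBRARY, and is later sent back unchanged for dlclose. A minimal sketch of that round trip follows, assuming a POSIX system and using a single anonymous pipe in place of the emulator's named pipes. The function name is hypothetical.

```cpp
#include <unistd.h>

// Writes an opaque handle into a pipe and reads it back, the same way
// the emulator ships a dlopen handle between target and host.
// Returns true when both transfers moved exactly sizeof(void *) bytes.
bool roundtrip_handle(void *handle, void **out)
{
    int fds[2];
    if (pipe(fds) != 0)
        return false;

    const ssize_t n = static_cast<ssize_t>(sizeof(void *));
    bool ok = write(fds[1], &handle, sizeof(void *)) == n
           && read(fds[0], out, sizeof(void *)) == n;

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

The receiving side must never dereference such a value. It is only meaningful in the address space of the process that produced it.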
uint64_t
SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) ()
{
......@@ -1210,5 +1263,51 @@ SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) ()
return (uint64_t) CYCLE_FREQUENCY;
}
COIRESULT
SYMBOL_VERSION (COIPipelineClearCPUMask, 1) (COI_CPU_MASK *in_Mask)
{
COITRACE ("COIPipelineClearCPUMask");
/* Looks like we have nothing to do here. */
return COI_SUCCESS;
}
COIRESULT
SYMBOL_VERSION (COIPipelineSetCPUMask, 1) (COIPROCESS in_Process,
uint32_t in_CoreID,
uint8_t in_ThreadID,
COI_CPU_MASK *out_pMask)
{
COITRACE ("COIPipelineSetCPUMask");
/* Looks like we have nothing to do here. */
return COI_SUCCESS;
}
COIRESULT
SYMBOL_VERSION (COIEngineGetInfo, 1) (COIENGINE in_EngineHandle,
uint32_t in_EngineInfoSize,
COI_ENGINE_INFO *out_pEngineInfo)
{
COITRACE ("COIEngineGetInfo");
out_pEngineInfo->ISA = COI_ISA_x86_64;
out_pEngineInfo->NumCores = 1;
out_pEngineInfo->NumThreads = 8;
out_pEngineInfo->CoreMaxFrequency = SYMBOL_VERSION(COIPerfGetCycleFrequency,1)() / 1000000;
out_pEngineInfo->PhysicalMemory = 1024;
out_pEngineInfo->PhysicalMemoryFree = 1024;
out_pEngineInfo->SwapMemory = 1024;
out_pEngineInfo->SwapMemoryFree = 1024;
out_pEngineInfo->MiscFlags = COI_ENG_ECC_DISABLED;
return COI_SUCCESS;
}
} // extern "C"
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -38,31 +38,54 @@
* intellectual property rights is granted herein.
*/
__asm__ (".symver COIBufferAddRef1,COIBufferAddRef@@COI_1.0");
__asm__ (".symver COIBufferCopy1,COIBufferCopy@@COI_1.0");
__asm__ (".symver COIBufferCreate1,COIBufferCreate@@COI_1.0");
__asm__ (".symver COIBufferCreateFromMemory1,COIBufferCreateFromMemory@@COI_1.0");
__asm__ (".symver COIBufferDestroy1,COIBufferDestroy@@COI_1.0");
__asm__ (".symver COIBufferGetSinkAddress1,COIBufferGetSinkAddress@@COI_1.0");
__asm__ (".symver COIBufferMap1,COIBufferMap@@COI_1.0");
__asm__ (".symver COIBufferRead1,COIBufferRead@@COI_1.0");
__asm__ (".symver COIBufferReleaseRef1,COIBufferReleaseRef@@COI_1.0");
__asm__ (".symver COIBufferSetState1,COIBufferSetState@@COI_1.0");
__asm__ (".symver COIBufferUnmap1,COIBufferUnmap@@COI_1.0");
__asm__ (".symver COIBufferWrite1,COIBufferWrite@@COI_1.0");
__asm__ (".symver COIEngineGetCount1,COIEngineGetCount@@COI_1.0");
__asm__ (".symver COIEngineGetHandle1,COIEngineGetHandle@@COI_1.0");
__asm__ (".symver COIEngineGetIndex1,COIEngineGetIndex@@COI_1.0");
__asm__ (".symver COIEventWait1,COIEventWait@@COI_1.0");
__asm__ (".symver COIPerfGetCycleFrequency1,COIPerfGetCycleFrequency@@COI_1.0");
__asm__ (".symver COIPipelineCreate1,COIPipelineCreate@@COI_1.0");
__asm__ (".symver COIPipelineDestroy1,COIPipelineDestroy@@COI_1.0");
__asm__ (".symver COIPipelineRunFunction1,COIPipelineRunFunction@@COI_1.0");
__asm__ (".symver COIPipelineStartExecutingRunFunctions1,COIPipelineStartExecutingRunFunctions@@COI_1.0");
__asm__ (".symver COIProcessCreateFromMemory1,COIProcessCreateFromMemory@@COI_1.0");
__asm__ (".symver COIProcessDestroy1,COIProcessDestroy@@COI_1.0");
__asm__ (".symver COIProcessGetFunctionHandles1,COIProcessGetFunctionHandles@@COI_1.0");
__asm__ (".symver COIProcessLoadLibraryFromMemory2,COIProcessLoadLibraryFromMemory@COI_2.0");
__asm__ (".symver COIProcessRegisterLibraries1,COIProcessRegisterLibraries@@COI_1.0");
__asm__ (".symver COIProcessWaitForShutdown1,COIProcessWaitForShutdown@@COI_1.0");
// Originally generated via:
// cd include;
// ctags -x --c-kinds=fp -R sink/ source/ common/ | grep -v COIX | awk '{print "__asm__(\".symver "$1"1,"$1"@@COI_1.0\");"}'
//
// These directives must have an associated linker script with VERSION stuff.
// See coi_version_linker_script.map
// Passed in as
// -Wl,--version-script coi_version_linker_script.map
// when building Intel(R) Coprocessor Offload Infrastructure (Intel(R) COI)
//
// See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info
//
// This is not strictly a .h file, so no need to #pragma once or anything.
// You must include these asm directives in the same translation unit as the
// one where the function body is.
// Otherwise we'd have to add this file to the list of files needed to build
// libcoi*, instead of including it in each of the api/*/*cpp files.
//
__asm__(".symver COIBufferAddRef1,COIBufferAddRef@@COI_1.0");
__asm__(".symver COIBufferCopy1,COIBufferCopy@@COI_1.0");
__asm__(".symver COIBufferCreate1,COIBufferCreate@@COI_1.0");
__asm__(".symver COIBufferCreateFromMemory1,COIBufferCreateFromMemory@@COI_1.0");
__asm__(".symver COIBufferDestroy1,COIBufferDestroy@@COI_1.0");
__asm__(".symver COIBufferGetSinkAddress1,COIBufferGetSinkAddress@@COI_1.0");
__asm__(".symver COIBufferMap1,COIBufferMap@@COI_1.0");
__asm__(".symver COIBufferRead1,COIBufferRead@@COI_1.0");
__asm__(".symver COIBufferReleaseRef1,COIBufferReleaseRef@@COI_1.0");
__asm__(".symver COIBufferSetState1,COIBufferSetState@@COI_1.0");
__asm__(".symver COIBufferUnmap1,COIBufferUnmap@@COI_1.0");
__asm__(".symver COIBufferWrite1,COIBufferWrite@@COI_1.0");
__asm__(".symver COIEngineGetCount1,COIEngineGetCount@@COI_1.0");
__asm__(".symver COIEngineGetHandle1,COIEngineGetHandle@@COI_1.0");
__asm__(".symver COIEngineGetIndex1,COIEngineGetIndex@@COI_1.0");
__asm__(".symver COIEngineGetInfo1,COIEngineGetInfo@@COI_1.0");
__asm__(".symver COIEventRegisterCallback1,COIEventRegisterCallback@@COI_1.0");
__asm__(".symver COIEventWait1,COIEventWait@@COI_1.0");
__asm__(".symver COIPerfGetCycleFrequency1,COIPerfGetCycleFrequency@@COI_1.0");
__asm__(".symver COIPipelineClearCPUMask1,COIPipelineClearCPUMask@@COI_1.0");
__asm__(".symver COIPipelineCreate1,COIPipelineCreate@@COI_1.0");
__asm__(".symver COIPipelineDestroy1,COIPipelineDestroy@@COI_1.0");
__asm__(".symver COIPipelineRunFunction1,COIPipelineRunFunction@@COI_1.0");
__asm__(".symver COIPipelineSetCPUMask1,COIPipelineSetCPUMask@@COI_1.0");
__asm__(".symver COIPipelineStartExecutingRunFunctions1,COIPipelineStartExecutingRunFunctions@@COI_1.0");
__asm__(".symver COIProcessCreateFromFile1,COIProcessCreateFromFile@@COI_1.0");
__asm__(".symver COIProcessCreateFromMemory1,COIProcessCreateFromMemory@@COI_1.0");
__asm__(".symver COIProcessDestroy1,COIProcessDestroy@@COI_1.0");
__asm__(".symver COIProcessGetFunctionHandles1,COIProcessGetFunctionHandles@@COI_1.0");
__asm__(".symver COIProcessLoadLibraryFromMemory2,COIProcessLoadLibraryFromMemory@COI_2.0");
__asm__(".symver COIProcessRegisterLibraries1,COIProcessRegisterLibraries@@COI_1.0");
__asm__(".symver COIProcessUnloadLibrary1,COIProcessUnloadLibrary@@COI_1.0");
__asm__(".symver COIProcessWaitForShutdown1,COIProcessWaitForShutdown@@COI_1.0");
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -38,6 +38,12 @@
* intellectual property rights is granted herein.
*/
/***
* See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info.
* Use this in conjunction with coi_version_asm.h.
* // Comments don't work in this file.
***/
COI_1.0
{
global:
......@@ -56,17 +62,23 @@ COI_1.0
COIEngineGetCount;
COIEngineGetHandle;
COIEngineGetIndex;
COIEngineGetInfo;
COIEventWait;
COIEventRegisterCallback;
COIPerfGetCycleFrequency;
COIPipelineClearCPUMask;
COIPipelineCreate;
COIPipelineDestroy;
COIPipelineRunFunction;
COIPipelineSetCPUMask;
COIPipelineStartExecutingRunFunctions;
COIProcessCreateFromFile;
COIProcessCreateFromMemory;
COIProcessDestroy;
COIProcessGetFunctionHandles;
COIProcessLoadLibraryFromMemory;
COIProcessRegisterLibraries;
COIProcessUnloadLibrary;
COIProcessWaitForShutdown;
local:
*;
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -155,5 +155,49 @@ SYMBOL_VERSION (myoiTargetFptrTableRegister, 1) (void *table,
return MYO_ERROR;
}
MYOACCESSAPI MyoError
SYMBOL_VERSION (myoArenaRelease, 1) (MyoArena in_Arena)
{
MYOTRACE ("myoArenaRelease");
assert (false);
return MYO_ERROR;
}
MYOACCESSAPI MyoError
SYMBOL_VERSION (myoArenaAcquire, 1) (MyoArena in_Arena)
{
MYOTRACE ("myoArenaAcquire");
assert (false);
return MYO_ERROR;
}
MYOACCESSAPI void
SYMBOL_VERSION (myoArenaAlignedFree, 1) (MyoArena in_Arena, void *in_pPtr)
{
MYOTRACE ("myoArenaAlignedFree");
assert (false);
}
MYOACCESSAPI void *
SYMBOL_VERSION (myoArenaAlignedMalloc, 1) (MyoArena in_Arena, size_t in_Size,
size_t in_Alignment)
{
MYOTRACE ("myoArenaAlignedMalloc");
assert (false);
return 0;
}
} // extern "C"
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -38,16 +38,24 @@
* intellectual property rights is granted herein.
*/
__asm__ (".symver myoAcquire1,myoAcquire@@MYO_1.0");
__asm__ (".symver myoRelease1,myoRelease@@MYO_1.0");
__asm__ (".symver myoSharedAlignedFree1,myoSharedAlignedFree@@MYO_1.0");
__asm__ (".symver myoSharedAlignedMalloc1,myoSharedAlignedMalloc@@MYO_1.0");
__asm__ (".symver myoSharedFree1,myoSharedFree@@MYO_1.0");
__asm__ (".symver myoSharedMalloc1,myoSharedMalloc@@MYO_1.0");
/* Symbol versioning (only functions are currently versioned).
Only Linux host-side code is currently versioned. */
#if (! defined MYO_MIC_CARD) && (! defined _WIN32)
__asm__ (".symver myoiLibInit1,myoiLibInit@@MYO_1.0");
__asm__ (".symver myoiLibFini1,myoiLibFini@@MYO_1.0");
__asm__ (".symver myoiMicVarTableRegister1,myoiMicVarTableRegister@@MYO_1.0");
__asm__ (".symver myoiRemoteFuncRegister1,myoiRemoteFuncRegister@@MYO_1.0");
__asm__ (".symver myoiTargetFptrTableRegister1,myoiTargetFptrTableRegister@@MYO_1.0");
__asm__(".symver myoArenaAlignedMalloc1,myoArenaAlignedMalloc@@MYO_1.0");
__asm__(".symver myoArenaAlignedFree1,myoArenaAlignedFree@@MYO_1.0");
__asm__(".symver myoArenaAcquire1,myoArenaAcquire@@MYO_1.0");
__asm__(".symver myoArenaRelease1,myoArenaRelease@@MYO_1.0");
__asm__(".symver myoAcquire1,myoAcquire@@MYO_1.0");
__asm__(".symver myoRelease1,myoRelease@@MYO_1.0");
__asm__(".symver myoSharedAlignedFree1,myoSharedAlignedFree@@MYO_1.0");
__asm__(".symver myoSharedAlignedMalloc1,myoSharedAlignedMalloc@@MYO_1.0");
__asm__(".symver myoSharedFree1,myoSharedFree@@MYO_1.0");
__asm__(".symver myoSharedMalloc1,myoSharedMalloc@@MYO_1.0");
__asm__(".symver myoiLibInit1,myoiLibInit@@MYO_1.0");
__asm__(".symver myoiLibFini1,myoiLibFini@@MYO_1.0");
__asm__(".symver myoiMicVarTableRegister1,myoiMicVarTableRegister@@MYO_1.0");
__asm__(".symver myoiRemoteFuncRegister1,myoiRemoteFuncRegister@@MYO_1.0");
__asm__(".symver myoiTargetFptrTableRegister1,myoiTargetFptrTableRegister@@MYO_1.0");
#endif
/*
* Copyright 2010-2013 Intel Corporation.
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......@@ -38,9 +38,17 @@
* intellectual property rights is granted herein.
*/
/***
* See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info.
***/
MYO_1.0
{
global:
myoArenaAlignedMalloc;
myoArenaAlignedFree;
myoArenaAcquire;
myoArenaRelease;
myoAcquire;
myoRelease;
myoSharedAlignedFree;
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -144,6 +144,9 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_process_create:
write_message(stderr, msg_c_process_create, args);
break;
case c_process_set_cache_size:
write_message(stderr, msg_c_process_set_cache_size, args);
break;
case c_process_wait_shutdown:
write_message(stderr, msg_c_process_wait_shutdown, args);
break;
......@@ -216,6 +219,9 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_zero_or_neg_transfer_size:
write_message(stderr, msg_c_zero_or_neg_transfer_size, args);
break;
case c_bad_ptr_mem_alloc:
write_message(stderr, msg_c_bad_ptr_mem_alloc, args);
break;
case c_bad_ptr_mem_range:
write_message(stderr, msg_c_bad_ptr_mem_range, args);
break;
......@@ -258,6 +264,39 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_report_unknown_trace_node:
write_message(stderr, msg_c_report_unknown_trace_node, args);
break;
case c_incorrect_affinity:
write_message(stderr, msg_c_incorrect_affinity, args);
break;
case c_cannot_set_affinity:
write_message(stderr, msg_c_cannot_set_affinity, args);
break;
case c_in_with_preallocated:
write_message(stderr, msg_c_in_with_preallocated, args);
break;
case c_report_no_host_exe:
write_message(stderr, msg_c_report_no_host_exe, args);
break;
case c_report_path_buff_overflow:
write_message(stderr, msg_c_report_path_buff_overflow, args);
break;
case c_create_pipeline_for_stream:
write_message(stderr, msg_c_create_pipeline_for_stream, args);
break;
case c_offload_no_stream:
write_message(stderr, msg_c_offload_no_stream, args);
break;
case c_get_engine_info:
write_message(stderr, msg_c_get_engine_info, args);
break;
case c_clear_cpu_mask:
write_message(stderr, msg_c_clear_cpu_mask, args);
break;
case c_set_cpu_mask:
write_message(stderr, msg_c_set_cpu_mask, args);
break;
case c_unload_library:
write_message(stderr, msg_c_unload_library, args);
break;
}
va_end(args);
}
......@@ -374,6 +413,10 @@ char const * report_get_message_str(error_types input_tag)
return (offload_get_message_str(msg_c_report_unregister));
case c_report_var:
return (offload_get_message_str(msg_c_report_var));
case c_report_stream:
return (offload_get_message_str(msg_c_report_stream));
case c_report_state_stream:
return (offload_get_message_str(msg_c_report_state_stream));
default:
LIBOFFLOAD_ERROR(c_report_unknown_trace_node);
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -68,6 +68,7 @@ typedef enum
c_get_engine_handle,
c_get_engine_index,
c_process_create,
c_process_set_cache_size,
c_process_get_func_handles,
c_process_wait_shutdown,
c_process_proxy_flush,
......@@ -91,6 +92,7 @@ typedef enum
c_event_wait,
c_zero_or_neg_ptr_len,
c_zero_or_neg_transfer_size,
c_bad_ptr_mem_alloc,
c_bad_ptr_mem_range,
c_different_src_and_dstn_sizes,
c_ranges_dont_match,
......@@ -103,6 +105,8 @@ typedef enum
c_unknown_binary_type,
c_multiple_target_exes,
c_no_target_exe,
c_incorrect_affinity,
c_cannot_set_affinity,
c_report_host,
c_report_target,
c_report_title,
......@@ -159,7 +163,24 @@ typedef enum
c_report_myosharedalignedfree,
c_report_myoacquire,
c_report_myorelease,
c_coipipe_max_number
c_report_myosupportsfeature,
c_report_myosharedarenacreate,
c_report_myosharedalignedarenamalloc,
c_report_myosharedalignedarenafree,
c_report_myoarenaacquire,
c_report_myoarenarelease,
c_coipipe_max_number,
c_in_with_preallocated,
c_report_no_host_exe,
c_report_path_buff_overflow,
c_create_pipeline_for_stream,
c_offload_no_stream,
c_get_engine_info,
c_clear_cpu_mask,
c_set_cpu_mask,
c_report_state_stream,
c_report_stream,
c_unload_library
} error_types;
enum OffloadHostPhase {
......@@ -260,15 +281,21 @@ enum OffloadTargetPhase {
c_offload_target_max_phase
};
#ifdef TARGET_WINNT
#define DLL_LOCAL
#else
#define DLL_LOCAL __attribute__((visibility("hidden")))
#endif
#ifdef __cplusplus
extern "C" {
#endif
void __liboffload_error_support(error_types input_tag, ...);
void __liboffload_report_support(error_types input_tag, ...);
char const *offload_get_message_str(int msgCode);
char const * report_get_message_str(error_types input_tag);
char const * report_get_host_stage_str(int i);
char const * report_get_target_stage_str(int i);
DLL_LOCAL void __liboffload_error_support(error_types input_tag, ...);
DLL_LOCAL void __liboffload_report_support(error_types input_tag, ...);
DLL_LOCAL char const *offload_get_message_str(int msgCode);
DLL_LOCAL char const * report_get_message_str(error_types input_tag);
DLL_LOCAL char const * report_get_host_stage_str(int i);
DLL_LOCAL char const * report_get_target_stage_str(int i);
#ifdef __cplusplus
}
#endif
......@@ -281,7 +308,7 @@ char const * report_get_target_stage_str(int i);
fprintf(stderr, "\t TEST for %s \n \t", nm); \
__liboffload_error_support(msg, __VA_ARGS__);
void write_message(FILE * file, int msgCode, va_list args_p);
DLL_LOCAL void write_message(FILE * file, int msgCode, va_list args_p);
#define LIBOFFLOAD_ERROR __liboffload_error_support
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -28,7 +28,6 @@
*/
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
......@@ -55,7 +54,7 @@
va_copy(args, args_p);
buf[0] = '\n';
vsnprintf(buf + 1, sizeof(buf) - 2,
MESSAGE_TABLE_NAME[ msgCode ], args);
MESSAGE_TABLE_NAME[ msgCode ], args);
strcat(buf, "\n");
va_end(args);
fputs(buf, file);
......
!
! Copyright (c) 2014 Intel Corporation. All Rights Reserved.
! Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
!
! Redistribution and use in source and binary forms, with or without
! modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -42,6 +42,13 @@
#include <stddef.h>
#include <omp.h>
#ifdef TARGET_WINNT
// <stdint.h> is not compatible with Windows
typedef unsigned long long int uint64_t;
#else
#include <stdint.h>
#endif // TARGET_WINNT
#ifdef __cplusplus
extern "C" {
#endif
......@@ -86,6 +93,8 @@ typedef struct {
size_t data_received; /* number of bytes received by host */
} _Offload_status;
typedef uint64_t _Offload_stream;
#define OFFLOAD_STATUS_INIT(x) \
((x).result = OFFLOAD_DISABLED)
......@@ -98,14 +107,57 @@ extern int _Offload_number_of_devices(void);
extern int _Offload_get_device_number(void);
extern int _Offload_get_physical_device_number(void);
/* Offload stream runtime interfaces */
extern _Offload_stream _Offload_stream_create(
int device, // MIC device number
int number_of_cpus // Cores allocated to the stream
);
extern int _Offload_stream_destroy(
int device, // MIC device number
_Offload_stream stream // stream handle
);
extern int _Offload_stream_completed(
int device, // MIC device number
_Offload_stream handle // stream handle
);
/*
* _Offload_shared_malloc/free are only supported when offload is enabled
* else they are defined to malloc and free
*/
#ifdef __INTEL_OFFLOAD
extern void* _Offload_shared_malloc(size_t size);
extern void _Offload_shared_free(void *ptr);
extern void* _Offload_shared_aligned_malloc(size_t size, size_t align);
extern void _Offload_shared_aligned_free(void *ptr);
#else
#include <malloc.h>
#define _Offload_shared_malloc(size) malloc(size)
#define _Offload_shared_free(ptr) free(ptr);
#if defined(_WIN32)
#define _Offload_shared_aligned_malloc(size, align) _aligned_malloc(size, align)
#define _Offload_shared_aligned_free(ptr) _aligned_free(ptr);
#else
#define _Offload_shared_aligned_malloc(size, align) memalign(align, size)
#define _Offload_shared_aligned_free(ptr) free(ptr);
#endif
#endif
extern int _Offload_signaled(int index, void *signal);
extern void _Offload_report(int val);
extern int _Offload_find_associated_mic_memory(
int target,
const void* cpu_addr,
void** cpu_base_addr,
uint64_t* buf_length,
void** mic_addr,
uint64_t* mic_buf_start_offset,
int* is_static
);
/* OpenMP API */
......@@ -343,7 +395,11 @@ namespace __offload {
shared_allocator<void>::const_pointer) {
/* Allocate from shared memory. */
void *ptr = _Offload_shared_malloc(s*sizeof(T));
if (ptr == 0) std::__throw_bad_alloc();
#if (defined(_WIN32) || defined(_WIN64)) /* Windows */
if (ptr == 0) throw std::bad_alloc();
#else
if (ptr == 0) std::__throw_bad_alloc();
#endif
return static_cast<pointer>(ptr);
} /* allocate */
......@@ -355,13 +411,13 @@ namespace __offload {
} /* deallocate */
template <typename _T1, typename _T2>
inline bool operator==(const shared_allocator<_T1> &,
inline bool operator==(const shared_allocator<_T1> &,
const shared_allocator<_T2> &) throw() {
return true;
} /* operator== */
template <typename _T1, typename _T2>
inline bool operator!=(const shared_allocator<_T1> &,
inline bool operator!=(const shared_allocator<_T1> &,
const shared_allocator<_T2> &) throw() {
return false;
} /* operator!= */
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -40,10 +40,6 @@
#include <string.h>
#include <memory.h>
#if (defined(LINUX) || defined(FREEBSD)) && !defined(__INTEL_COMPILER)
#include <mm_malloc.h>
#endif
#include "offload.h"
#include "offload_table.h"
#include "offload_trace.h"
......@@ -65,22 +61,24 @@
// The debug routines
// Host console and file logging
extern int console_enabled;
extern int offload_report_level;
DLL_LOCAL extern int console_enabled;
DLL_LOCAL extern int offload_report_level;
#define OFFLOAD_DO_TRACE (offload_report_level == 3)
extern const char *prefix;
extern int offload_number;
DLL_LOCAL extern const char *prefix;
DLL_LOCAL extern int offload_number;
#if !HOST_LIBRARY
extern int mic_index;
DLL_LOCAL extern int mic_index;
#define OFFLOAD_DO_TRACE (offload_report_level == 3)
#else
#define OFFLOAD_DO_TRACE (offload_report_enabled && (offload_report_level == 3))
#endif
#if HOST_LIBRARY
void Offload_Report_Prolog(OffloadHostTimerData* timer_data);
void Offload_Report_Epilog(OffloadHostTimerData* timer_data);
void offload_report_free_data(OffloadHostTimerData * timer_data);
void Offload_Timer_Print(void);
DLL_LOCAL void Offload_Report_Prolog(OffloadHostTimerData* timer_data);
DLL_LOCAL void Offload_Report_Epilog(OffloadHostTimerData* timer_data);
DLL_LOCAL void offload_report_free_data(OffloadHostTimerData * timer_data);
DLL_LOCAL void Offload_Timer_Print(void);
#ifndef TARGET_WINNT
#define OFFLOAD_DEBUG_INCR_OFLD_NUM() \
......@@ -130,7 +128,7 @@ void Offload_Timer_Print(void);
#define OFFLOAD_DEBUG_DUMP_BYTES(level, a, b) \
__dump_bytes(level, a, b)
extern void __dump_bytes(
DLL_LOCAL extern void __dump_bytes(
int level,
const void *data,
int len
......@@ -156,6 +154,17 @@ extern void *OFFLOAD_MALLOC(size_t size, size_t align);
// The Marshaller
// Flags describing an offload
//! Flags describing an offload
union OffloadFlags{
uint32_t flags;
struct {
uint32_t fortran_traceback : 1; //!< Fortran traceback requested
uint32_t omp_async : 1; //!< OpenMP asynchronous offload
} bits;
};
//! \enum Indicator for the type of entry on an offload item list.
enum OffloadItemType {
c_data = 1, //!< Plain data
......@@ -203,6 +212,44 @@ enum OffloadParameterType {
c_parameter_inout //!< Variable listed in "inout" clause
};
//! Flags describing an offloaded variable
union varDescFlags {
struct {
//! source variable has persistent storage
uint32_t is_static : 1;
//! destination variable has persistent storage
uint32_t is_static_dstn : 1;
//! has length for c_dv && c_dv_ptr
uint32_t has_length : 1;
//! persisted local scalar is in stack buffer
uint32_t is_stack_buf : 1;
//! "targetptr" modifier used
uint32_t targetptr : 1;
//! "preallocated" modifier used
uint32_t preallocated : 1;
//! Needs documentation
uint32_t is_pointer : 1;
//! buffer address is sent in data
uint32_t sink_addr : 1;
//! alloc displacement is sent in data
uint32_t alloc_disp : 1;
//! source data is noncontiguous
uint32_t is_noncont_src : 1;
//! destination data is noncontiguous
uint32_t is_noncont_dst : 1;
//! "OpenMP always" modifier used
uint32_t always_copy : 1;
//! "OpenMP delete" modifier used
uint32_t always_delete : 1;
//! CPU memory pinning/unpinning operation
uint32_t pin : 1;
};
uint32_t bits;
};
//! An Offload Variable descriptor
struct VarDesc {
//! OffloadItemTypes of source and destination
......@@ -230,27 +277,7 @@ struct VarDesc {
/*! Used by runtime as offset to data from start of MIC buffer */
uint32_t mic_offset;
//! Flags describing this variable
union {
struct {
//! source variable has persistent storage
uint32_t is_static : 1;
//! destination variable has persistent storage
uint32_t is_static_dstn : 1;
//! has length for c_dv && c_dv_ptr
uint32_t has_length : 1;
//! persisted local scalar is in stack buffer
uint32_t is_stack_buf : 1;
//! buffer address is sent in data
uint32_t sink_addr : 1;
//! alloc displacement is sent in data
uint32_t alloc_disp : 1;
//! source data is noncontiguous
uint32_t is_noncont_src : 1;
//! destination data is noncontiguous
uint32_t is_noncont_dst : 1;
};
uint32_t bits;
} flags;
varDescFlags flags;
//! Not used by compiler; set to 0
/*! Used by runtime as offset to base from data stored in a buffer */
int64_t offset;
......@@ -472,4 +499,16 @@ struct FunctionDescriptor
// Pointer to OffloadDescriptor.
typedef struct OffloadDescriptor *OFFLOAD;
// Use for setting affinity of a stream
enum affinity_type {
affinity_compact,
affinity_scatter
};
struct affinity_spec {
uint64_t sink_mask[16];
int affinity_type;
int num_cores;
int num_threads;
};
#endif // OFFLOAD_COMMON_H_INCLUDED
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -32,13 +32,16 @@
#define OFFLOAD_ENGINE_H_INCLUDED
#include <limits.h>
#include <bitset>
#include <list>
#include <set>
#include <map>
#include "offload_common.h"
#include "coi/coi_client.h"
#define SIGNAL_IS_REMOVED ((OffloadDescriptor *)-1)
const int64_t no_stream = -1;
// Address range
class MemRange {
public:
......@@ -157,6 +160,50 @@ private:
typedef std::list<PtrData*> PtrDataList;
class PtrDataTable {
public:
typedef std::set<PtrData> PtrSet;
PtrData* find_ptr_data(const void *ptr) {
m_ptr_lock.lock();
PtrSet::iterator res = list.find(PtrData(ptr, 0));
m_ptr_lock.unlock();
if (res == list.end()) {
return 0;
}
return const_cast<PtrData*>(res.operator->());
}
PtrData* insert_ptr_data(const void *ptr, uint64_t len, bool &is_new) {
m_ptr_lock.lock();
std::pair<PtrSet::iterator, bool> res =
list.insert(PtrData(ptr, len));
PtrData* ptr_data = const_cast<PtrData*>(res.first.operator->());
m_ptr_lock.unlock();
is_new = res.second;
if (is_new) {
// It's necessary to lock as soon as possible.
// unlock must be done at call site of insert_ptr_data at
// branch for is_new
ptr_data->alloc_ptr_data_lock.lock();
}
return ptr_data;
}
void remove_ptr_data(const void *ptr) {
m_ptr_lock.lock();
list.erase(PtrData(ptr, 0));
m_ptr_lock.unlock();
}
private:
PtrSet list;
mutex_t m_ptr_lock;
};
// Data associated with automatic variable
class AutoData {
public:
......@@ -186,7 +233,15 @@ public:
return _InterlockedDecrement(&ref_count);
#endif // TARGET_WINNT
}
long nullify_reference() {
#ifndef TARGET_WINNT
return __sync_lock_test_and_set(&ref_count, 0);
#else // TARGET_WINNT
return _InterlockedExchange(&ref_count,0);
#endif // TARGET_WINNT
}
long get_reference() const {
return ref_count;
}
......@@ -226,18 +281,39 @@ struct TargetImage
typedef std::list<TargetImage> TargetImageList;
// dynamic library and Image associated with lib
struct DynLib
{
DynLib(const char *_name, const void *_data,
COILIBRARY _lib) :
name(_name), data(_data), lib(_lib)
{}
// library name
const char* name;
// contents
const void* data;
COILIBRARY lib;
};
typedef std::list<DynLib> DynLibList;
// Data associated with persistent auto objects
struct PersistData
{
PersistData(const void *addr, uint64_t routine_num, uint64_t size) :
stack_cpu_addr(addr), routine_id(routine_num)
PersistData(const void *addr, uint64_t routine_num,
uint64_t size, uint64_t thread) :
stack_cpu_addr(addr), routine_id(routine_num), thread_id(thread)
{
stack_ptr_data = new PtrData(0, size);
}
// 1-st key value - begining of the stack at CPU
// 1-st key value - beginning of the stack at CPU
const void * stack_cpu_addr;
// 2-nd key value - identifier of routine invocation at CPU
uint64_t routine_id;
// 3-rd key value - thread identifier
uint64_t thread_id;
// corresponded PtrData; only stack_ptr_data->mic_buf is used
PtrData * stack_ptr_data;
// used to get offset of the variable in stack buffer
......@@ -246,6 +322,75 @@ struct PersistData
typedef std::list<PersistData> PersistDataList;
// Data associated with stream
struct Stream
{
Stream(int device, int num_of_cpus) :
m_number_of_cpus(num_of_cpus), m_pipeline(0), m_last_offload(0),
m_device(device)
{}
~Stream() {
if (m_pipeline) {
COI::PipelineDestroy(m_pipeline);
}
}
COIPIPELINE get_pipeline(void) {
return(m_pipeline);
}
int get_device(void) {
return(m_device);
}
int get_cpu_number(void) {
return(m_number_of_cpus);
}
void set_pipeline(COIPIPELINE pipeline) {
m_pipeline = pipeline;
}
OffloadDescriptor* get_last_offload(void) {
return(m_last_offload);
}
void set_last_offload(OffloadDescriptor* last_offload) {
m_last_offload = last_offload;
}
static Stream* find_stream(uint64_t handle, bool remove);
static _Offload_stream add_stream(int device, int number_of_cpus) {
m_stream_lock.lock();
all_streams[++m_streams_count] = new Stream(device, number_of_cpus);
m_stream_lock.unlock();
return(m_streams_count);
}
typedef std::map<uint64_t, Stream*> StreamMap;
static uint64_t m_streams_count;
static StreamMap all_streams;
static mutex_t m_stream_lock;
int m_device;
// number of cpus
int m_number_of_cpus;
// The pipeline associated with the stream
COIPIPELINE m_pipeline;
// The last offload that occurred via the stream
OffloadDescriptor* m_last_offload;
// Cpus used by the stream
std::bitset<COI_MAX_HW_THREADS> m_stream_cpus;
};
typedef std::map<uint64_t, Stream*> StreamMap;
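The handle scheme used by `Stream::add_stream` above (a counter incremented under a lock, with the new value used as the map key) can be sketched in isolation as follows. This is a minimal illustration, not the library's code: `StreamInfo` and `StreamRegistry` are hypothetical names, `std::mutex` stands in for the runtime's `mutex_t`, and all COI pipeline state is left out. One deliberate difference, chosen for this sketch only: the handle is captured while the lock is still held, so a concurrent caller cannot change the counter between the insert and the return.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for the diff's Stream; pipeline state is omitted.
struct StreamInfo {
    int device;
    int number_of_cpus;
};

// Hypothetical registry mirroring the counter-plus-map idea of add_stream().
class StreamRegistry {
public:
    // Registers a stream and returns its handle. The handle value is read
    // while the lock is held, so it always matches the inserted key.
    std::uint64_t add(int device, int number_of_cpus) {
        std::lock_guard<std::mutex> guard(m_lock);
        const std::uint64_t handle = ++m_count;
        assert(handle != 0);  // sketch assumption: 0 is never handed out
        m_streams[handle] = StreamInfo{device, number_of_cpus};
        return handle;
    }

    // Looks up a handle. Copies the entry to *out when out is non-null and
    // erases the entry when remove is true. Returns false if not found.
    bool find(std::uint64_t handle, bool remove, StreamInfo *out) {
        std::lock_guard<std::mutex> guard(m_lock);
        std::map<std::uint64_t, StreamInfo>::iterator it =
            m_streams.find(handle);
        if (it == m_streams.end()) {
            return false;
        }
        if (out != nullptr) {
            *out = it->second;
        }
        if (remove) {
            m_streams.erase(it);
        }
        return true;
    }

private:
    std::mutex m_lock;
    std::uint64_t m_count = 0;
    std::map<std::uint64_t, StreamInfo> m_streams;
};
```

Storing values rather than owning raw pointers keeps the sketch free of manual `delete` calls; the real `Stream` owns a pipeline and is destroyed explicitly in `~Engine()`.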
// class representing a single engine
struct Engine {
friend void __offload_init_library_once(void);
......@@ -275,9 +420,14 @@ struct Engine {
return m_process;
}
uint64_t get_thread_id(void);
// initialize device
void init(void);
// unload library
void unload_library(const void *data, const char *name);
// add new library
void add_lib(const TargetImage &lib)
{
......@@ -288,6 +438,7 @@ struct Engine {
}
COIRESULT compute(
_Offload_stream stream,
const std::list<COIBUFFER> &buffers,
const void* data,
uint16_t data_size,
......@@ -323,36 +474,28 @@ struct Engine {
// Memory association table
//
PtrData* find_ptr_data(const void *ptr) {
m_ptr_lock.lock();
PtrSet::iterator res = m_ptr_set.find(PtrData(ptr, 0));
m_ptr_lock.unlock();
if (res == m_ptr_set.end()) {
return 0;
}
return const_cast<PtrData*>(res.operator->());
return m_ptr_set.find_ptr_data(ptr);
}
PtrData* find_targetptr_data(const void *ptr) {
return m_targetptr_set.find_ptr_data(ptr);
}
PtrData* insert_ptr_data(const void *ptr, uint64_t len, bool &is_new) {
m_ptr_lock.lock();
std::pair<PtrSet::iterator, bool> res =
m_ptr_set.insert(PtrData(ptr, len));
PtrData* ptr_data = const_cast<PtrData*>(res.first.operator->());
m_ptr_lock.unlock();
return m_ptr_set.insert_ptr_data(ptr, len, is_new);
}
is_new = res.second;
if (is_new) {
// It's necessary to lock as soon as possible.
// unlock must be done at call site of insert_ptr_data at
// branch for is_new
ptr_data->alloc_ptr_data_lock.lock();
}
return ptr_data;
PtrData* insert_targetptr_data(const void *ptr, uint64_t len,
bool &is_new) {
return m_targetptr_set.insert_ptr_data(ptr, len, is_new);
}
void remove_ptr_data(const void *ptr) {
m_ptr_lock.lock();
m_ptr_set.erase(PtrData(ptr, 0));
m_ptr_lock.unlock();
m_ptr_set.remove_ptr_data(ptr);
}
void remove_targetptr_data(const void *ptr) {
m_targetptr_set.remove_ptr_data(ptr);
}
//
......@@ -396,7 +539,7 @@ struct Engine {
if (it != m_signal_map.end()) {
desc = it->second;
if (remove) {
m_signal_map.erase(it);
it->second = SIGNAL_IS_REMOVED;
}
}
}
......@@ -405,6 +548,14 @@ struct Engine {
return desc;
}
void stream_destroy(_Offload_stream handle);
COIPIPELINE get_pipeline(_Offload_stream stream);
StreamMap get_stream_map() {
return m_stream_map;
}
// stop device process
void fini_process(bool verbose);
......@@ -417,6 +568,11 @@ private:
{}
~Engine() {
for (StreamMap::iterator it = m_stream_map.begin();
it != m_stream_map.end(); it++) {
Stream * stream = it->second;
delete stream;
}
if (m_process != 0) {
fini_process(false);
}
......@@ -469,14 +625,24 @@ private:
// List of libraries to be loaded
TargetImageList m_images;
// var table
PtrSet m_ptr_set;
mutex_t m_ptr_lock;
// var tables
PtrDataTable m_ptr_set;
PtrDataTable m_targetptr_set;
// signals
SignalMap m_signal_map;
mutex_t m_signal_lock;
// streams
StreamMap m_stream_map;
mutex_t m_stream_lock;
int m_num_cores;
int m_num_threads;
std::bitset<COI_MAX_HW_THREADS> m_cpus;
// List of dynamic libraries to be registered
DynLibList m_dyn_libs;
// constants for accessing device function handles
enum {
c_func_compute = 0,
......@@ -487,6 +653,7 @@ private:
c_func_init,
c_func_var_table_size,
c_func_var_table_copy,
c_func_set_stream_affinity,
c_funcs_total
};
static const char* m_func_names[c_funcs_total];
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -146,7 +146,7 @@ void MicEnvVar::add_env_var(
else {
card = get_card(card_number);
if (!card) {
// definition for new card occured
// definition for new card occurred
card = new CardEnvVars(card_number);
card_spec_list.push_back(card);
}
......@@ -321,7 +321,7 @@ void MicEnvVar::mic_parse_env_var_list(
// Collect all definitions for the card with number "card_num".
// The returned result is vector of string pointers defining one
// environment variable. The vector is terminated by NULL pointer.
// In the begining of the vector there are env vars defined as
// In the beginning of the vector there are env vars defined as
// <mic-prefix>_<card-number>_<var>=<value>
// or
// <mic-prefix>_<card-number>_ENV=<env-vars>
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -32,6 +32,7 @@
#define OFFLOAD_ENV_H_INCLUDED
#include <list>
#include "offload_util.h"
// data structure and routines to parse MIC user environment and pass to MIC
......@@ -43,7 +44,7 @@ enum MicEnvVarKind
c_mic_card_env // for <mic-prefix>_<card-number>_ENV
};
struct MicEnvVar {
struct DLL_LOCAL MicEnvVar {
public:
MicEnvVar() : prefix(0) {}
~MicEnvVar();
......
This source diff could not be displayed because it is too large.
/*
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/*! \file
\brief Iterator of Variable tables list used by the runtime library
*/
#ifndef OFFLOAD_ITERATOR_H_INCLUDED
#define OFFLOAD_ITERATOR_H_INCLUDED
#include <iterator>
#include "offload_table.h"
// The following class is for iteration over var table.
// It was extracted and moved to this offload_iterator.h file from offload_table.h
// to solve the problem with compiling with VS 2010. The problem was in incompatibility
// of STL objects in VS 2010 with ones in later VS versions.
// var table list iterator
class Iterator : public std::iterator<std::input_iterator_tag,
VarTable::Entry> {
public:
Iterator() : m_node(0), m_entry(0) {}
explicit Iterator(TableList<VarTable>::Node *node) {
new_node(node);
}
Iterator& operator++() {
if (m_entry != 0) {
m_entry++;
while (m_entry->name == 0) {
m_entry++;
}
if (m_entry->name == reinterpret_cast<const char*>(-1)) {
new_node(m_node->next);
}
}
return *this;
}
bool operator==(const Iterator &other) const {
return m_entry == other.m_entry;
}
bool operator!=(const Iterator &other) const {
return m_entry != other.m_entry;
}
const VarTable::Entry* operator*() const {
return m_entry;
}
private:
void new_node(TableList<VarTable>::Node *node) {
m_node = node;
m_entry = 0;
while (m_node != 0) {
m_entry = m_node->table.entries;
while (m_entry->name == 0) {
m_entry++;
}
if (m_entry->name != reinterpret_cast<const char*>(-1)) {
break;
}
m_node = m_node->next;
m_entry = 0;
}
}
private:
TableList<VarTable>::Node *m_node;
const VarTable::Entry *m_entry;
};
#endif // OFFLOAD_ITERATOR_H_INCLUDED
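The traversal rule that `Iterator` encodes (skip entries whose `name` is null, treat a `name` equal to `(const char*)-1` as the end of one table, then continue with the next node) can be illustrated without the runtime headers. The sketch below is an assumption-laden stand-in: `Entry`, `Node`, `kEndOfTable`, and `collect_entries` are hypothetical names, and the real `VarTable::Entry` and `TableList<VarTable>::Node` carry more fields than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for VarTable::Entry; only the name field drives
// the traversal.
struct Entry {
    const char *name;
    int value;
};

// Sentinel tested by the Iterator in the diff: a name pointer equal to -1
// marks the end of a table.
static const char *const kEndOfTable = reinterpret_cast<const char *>(-1);

// Hypothetical stand-in for TableList<VarTable>::Node.
struct Node {
    const Entry *entries;
    Node *next;
};

// Collects every live entry across the node chain: null-named slots are
// skipped, and each table ends at its sentinel entry.
std::vector<const Entry *> collect_entries(const Node *node) {
    std::vector<const Entry *> out;
    for (; node != nullptr; node = node->next) {
        assert(node->entries != nullptr);  // every table has a sentinel
        for (const Entry *e = node->entries; e->name != kEndOfTable; ++e) {
            if (e->name != nullptr) {
                out.push_back(e);
            }
        }
    }
    return out;
}
```

A table that holds only null-named slots before its sentinel contributes nothing, which corresponds to `new_node()` moving on to `m_node->next` in the original class.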
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -34,67 +34,35 @@
#include <myotypes.h>
#include <myoimpl.h>
#include <myo.h>
#include "offload.h"
typedef MyoiSharedVarEntry SharedTableEntry;
//typedef MyoiHostSharedFptrEntry FptrTableEntry;
typedef struct {
//! Function Name
const char *funcName;
//! Function Address
void *funcAddr;
//! Local Thunk Address
void *localThunkAddr;
#ifdef TARGET_WINNT
// Dummy to pad up to 32 bytes
void *dummy;
#endif // TARGET_WINNT
} FptrTableEntry;
struct InitTableEntry {
#ifdef TARGET_WINNT
// Dummy to pad up to 16 bytes
// Function Name
const char *funcName;
#endif // TARGET_WINNT
void (*func)(void);
};
#ifdef TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable$a"
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable$z"
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START ".MyoSharedInitTable$a"
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END ".MyoSharedInitTable$z"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable$a"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable$z"
#else // TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START ".MyoSharedInitTable."
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END ".MyoSharedInitTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable."
#endif // TARGET_WINNT
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_END, read, write)
#include "offload.h"
// undefine the following since offload.h defines them to malloc and free if __INTEL_OFFLOAD
// is not defined which is the case when building the offload library
#undef _Offload_shared_malloc
#undef _Offload_shared_free
#undef _Offload_shared_aligned_malloc
#undef _Offload_shared_aligned_free
#include "offload_table.h"
// This function retained for compatibility with 15.0
extern "C" void __offload_myoRegisterTables(
InitTableEntry *init_table,
SharedTableEntry *shared_table,
FptrTableEntry *fptr_table
);
// Process shared variable, shared vtable and function and init routine tables.
// In .dlls/.sos these will be collected together.
// In the main program, all collected tables will be processed.
extern "C" bool __offload_myoProcessTables(
const void* image,
MYOInitTableList::Node *init_table,
MYOVarTableList::Node *shared_table,
MYOVarTableList::Node *shared_vtable,
MYOFuncTableList::Node *fptr_table
);
extern void __offload_myoFini(void);
extern bool __offload_myo_init_is_deferred(const void *image);
#endif // OFFLOAD_MYO_HOST_H_INCLUDED
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -44,7 +44,7 @@ static void CheckResult(const char *func, MyoError error) {
}
}
static void __offload_myo_shared_table_register(SharedTableEntry *entry)
static void __offload_myo_shared_table_process(SharedTableEntry *entry)
{
int entries = 0;
SharedTableEntry *t_start;
......@@ -68,7 +68,32 @@ static void __offload_myo_shared_table_register(SharedTableEntry *entry)
}
}
static void __offload_myo_fptr_table_register(
static void __offload_myo_shared_vtable_process(SharedTableEntry *entry)
{
int entries = 0;
SharedTableEntry *t_start;
OFFLOAD_DEBUG_TRACE(3, "%s(%p)\n", __func__, entry);
t_start = entry;
while (t_start->varName != 0) {
OFFLOAD_DEBUG_TRACE_1(4, 0, c_offload_mic_myo_shared,
"myo shared vtable entry name"
" = \"%s\" addr = %p\n",
t_start->varName, t_start->sharedAddr);
t_start++;
entries++;
}
if (entries > 0) {
OFFLOAD_DEBUG_TRACE(3, "myoiMicVarTableRegister(%p, %d)\n", entry,
entries);
CheckResult("myoiMicVarTableRegister",
myoiMicVarTableRegister(entry, entries));
}
}
static void __offload_myo_fptr_table_process(
FptrTableEntry *entry
)
{
......@@ -94,9 +119,22 @@ static void __offload_myo_fptr_table_register(
}
}
void __offload_myo_shared_init_table_process(InitTableEntry* entry)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%p)\n", __func__, entry);
for (; entry->func != 0; entry++) {
// Invoke the function to init the shared memory
OFFLOAD_DEBUG_TRACE(3, "Invoked a shared init function @%p\n",
(void *)(entry->func));
entry->func();
}
}
extern "C" void __offload_myoAcquire(void)
{
OFFLOAD_DEBUG_TRACE(3, "%s\n", __func__);
CheckResult("myoAcquire", myoAcquire());
}
......@@ -162,8 +200,35 @@ extern "C" void __offload_myoRegisterTables(
return;
}
__offload_myo_shared_table_register(shared_table);
__offload_myo_fptr_table_register(fptr_table);
__offload_myo_shared_table_process(shared_table);
__offload_myo_fptr_table_process(fptr_table);
}
extern "C" void __offload_myoProcessTables(
InitTableEntry* init_table,
SharedTableEntry *shared_table,
SharedTableEntry *shared_vtable,
FptrTableEntry *fptr_table
)
{
OFFLOAD_DEBUG_TRACE(3, "%s\n", __func__);
// one time registration of Intel(R) Cilk(TM) language entries
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
pthread_once(&once_control, __offload_myo_once_init);
// register module's tables
// check slot-1 of the function table because
// slot-0 is predefined with --vtable_initializer--
if (shared_table->varName == 0 &&
shared_vtable->varName == 0 &&
fptr_table[1].funcName == 0) {
return;
}
__offload_myo_shared_table_process(shared_table);
__offload_myo_shared_vtable_process(shared_vtable);
__offload_myo_fptr_table_process(fptr_table);
}
extern "C" void* _Offload_shared_malloc(size_t size)
......@@ -190,6 +255,46 @@ extern "C" void _Offload_shared_aligned_free(void *ptr)
myoSharedAlignedFree(ptr);
}
extern "C" void* _Offload_shared_aligned_arena_malloc(
MyoArena arena,
size_t size,
size_t align
)
{
OFFLOAD_DEBUG_TRACE(
3, "%s(%u, %lld, %lld)\n", __func__, arena, size, align);
return myoArenaAlignedMalloc(arena, size, align);
}
extern "C" void _Offload_shared_aligned_arena_free(
MyoArena arena,
void *ptr
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u, %p)\n", __func__, arena, ptr);
myoArenaAlignedFree(arena, ptr);
}
extern "C" void _Offload_shared_arena_acquire(
MyoArena arena
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u)\n", __func__, arena);
myoArenaAcquire(arena);
}
extern "C" void _Offload_shared_arena_release(
MyoArena arena
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u)\n", __func__, arena);
myoArenaRelease(arena);
}
// temporary workaround for blocking behavior of myoiLibInit/Fini calls
extern "C" void __offload_myoLibInit()
{
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -31,42 +31,38 @@
#ifndef OFFLOAD_MYO_TARGET_H_INCLUDED
#define OFFLOAD_MYO_TARGET_H_INCLUDED
#include <myotypes.h>
#include <myoimpl.h>
#include <myo.h>
#include "offload.h"
typedef MyoiSharedVarEntry SharedTableEntry;
typedef MyoiTargetSharedFptrEntry FptrTableEntry;
#ifdef TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable$a"
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable$z"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable$a"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable$z"
#else // TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable."
#endif // TARGET_WINNT
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_END, read, write)
#include "offload.h"
// undefine the following since offload.h defines them to malloc and free if __INTEL_OFFLOAD
// is not defined which is the case when building the offload library
#undef _Offload_shared_malloc
#undef _Offload_shared_free
#undef _Offload_shared_aligned_malloc
#undef _Offload_shared_aligned_free
#include "offload_table.h"
// This function retained for compatibility with 15.0
extern "C" void __offload_myoRegisterTables(
SharedTableEntry *shared_table,
FptrTableEntry *fptr_table
);
// Process shared variable, shared vtable and function and init routine tables.
// On the target side the contents of the tables are registered with MYO.
extern "C" void __offload_myoProcessTables(
InitTableEntry* init_table,
SharedTableEntry *shared_table,
SharedTableEntry *shared_vtable,
FptrTableEntry *fptr_table
);
extern "C" void __offload_myoAcquire(void);
extern "C" void __offload_myoRelease(void);
// Call the compiler-generated routines for initializing shared variables.
// This can only be done after shared memory allocation has been done.
extern void __offload_myo_shared_init_table_process(InitTableEntry* entry);
// temporary workaround for blocking behavior for myoiLibInit/Fini calls
extern "C" void __offload_myoLibInit();
extern "C" void __offload_myoLibFini();
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......
/*
Copyright (c) 2014 Intel Corporation. All Rights Reserved.
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
......@@ -86,7 +86,7 @@ static int omp_get_int_from_host(
return setting;
}
void omp_set_num_threads_lrb(
DLL_LOCAL void omp_set_num_threads_lrb(
void *ofld
)
{
......@@ -96,7 +96,7 @@ void omp_set_num_threads_lrb(
omp_set_num_threads(num_threads);
}
void omp_get_max_threads_lrb(
DLL_LOCAL void omp_get_max_threads_lrb(
void *ofld
)
{
......@@ -106,7 +106,7 @@ void omp_get_max_threads_lrb(
omp_send_int_to_host(ofld, num_threads);
}
void omp_get_num_procs_lrb(
DLL_LOCAL void omp_get_num_procs_lrb(
void *ofld
)
{
......@@ -116,7 +116,7 @@ void omp_get_num_procs_lrb(
omp_send_int_to_host(ofld, num_procs);
}
void omp_set_dynamic_lrb(
DLL_LOCAL void omp_set_dynamic_lrb(
void *ofld
)
{
......@@ -126,7 +126,7 @@ void omp_set_dynamic_lrb(
omp_set_dynamic(dynamic);
}
void omp_get_dynamic_lrb(
DLL_LOCAL void omp_get_dynamic_lrb(
void *ofld
)
{
......@@ -136,7 +136,7 @@ void omp_get_dynamic_lrb(
omp_send_int_to_host(ofld, dynamic);
}
void omp_set_nested_lrb(
DLL_LOCAL void omp_set_nested_lrb(
void *ofld
)
{
......@@ -146,7 +146,7 @@ void omp_set_nested_lrb(
omp_set_nested(nested);
}
void omp_get_nested_lrb(
DLL_LOCAL void omp_get_nested_lrb(
void *ofld
)
{
......@@ -156,7 +156,7 @@ void omp_get_nested_lrb(
omp_send_int_to_host(ofld, nested);
}
void omp_set_schedule_lrb(
DLL_LOCAL void omp_set_schedule_lrb(
void *ofld_
)
{
......@@ -180,7 +180,7 @@ void omp_set_schedule_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_get_schedule_lrb(
DLL_LOCAL void omp_get_schedule_lrb(
void *ofld_
)
{
......@@ -206,7 +206,7 @@ void omp_get_schedule_lrb(
// lock API functions
void omp_init_lock_lrb(
DLL_LOCAL void omp_init_lock_lrb(
void *ofld_
)
{
......@@ -224,7 +224,7 @@ void omp_init_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_destroy_lock_lrb(
DLL_LOCAL void omp_destroy_lock_lrb(
void *ofld_
)
{
......@@ -242,7 +242,7 @@ void omp_destroy_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_set_lock_lrb(
DLL_LOCAL void omp_set_lock_lrb(
void *ofld_
)
{
......@@ -260,7 +260,7 @@ void omp_set_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_unset_lock_lrb(
DLL_LOCAL void omp_unset_lock_lrb(
void *ofld_
)
{
......@@ -278,7 +278,7 @@ void omp_unset_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_test_lock_lrb(
DLL_LOCAL void omp_test_lock_lrb(
void *ofld_
)
{
......@@ -304,7 +304,7 @@ void omp_test_lock_lrb(
// nested lock API functions
void omp_init_nest_lock_lrb(
DLL_LOCAL void omp_init_nest_lock_lrb(
void *ofld_
)
{
......@@ -322,7 +322,7 @@ void omp_init_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_destroy_nest_lock_lrb(
DLL_LOCAL void omp_destroy_nest_lock_lrb(
void *ofld_
)
{
......@@ -340,7 +340,7 @@ void omp_destroy_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_set_nest_lock_lrb(
DLL_LOCAL void omp_set_nest_lock_lrb(
void *ofld_
)
{
......@@ -358,7 +358,7 @@ void omp_set_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_unset_nest_lock_lrb(
DLL_LOCAL void omp_unset_nest_lock_lrb(
void *ofld_
)
{
......@@ -376,7 +376,7 @@ void omp_unset_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld);
}
void omp_test_nest_lock_lrb(
DLL_LOCAL void omp_test_nest_lock_lrb(
void *ofld_
)
{
......
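Every `omp_*_lrb` entry point in the hunks above gains a `DLL_LOCAL` prefix. The definition of `DLL_LOCAL` is not shown in this part of the diff; the sketch below assumes it behaves like GCC/Clang hidden visibility on ELF targets and expands to nothing elsewhere. `DLL_LOCAL_DEMO`, `internal_helper` and `public_entry` are hypothetical names chosen for illustration.

```cpp
// Assumption: DLL_LOCAL maps to hidden visibility on GCC/Clang ELF targets
// and to nothing elsewhere. The real definition is not shown in this diff.
#if defined(__GNUC__) && !defined(_WIN32)
#define DLL_LOCAL_DEMO __attribute__((visibility("hidden")))
#else
#define DLL_LOCAL_DEMO
#endif

// Hidden: callable from inside the shared object, but not exported from it.
DLL_LOCAL_DEMO int internal_helper(int x)
{
    return x + 1;
}

// Default visibility: remains part of the library's exported interface.
extern "C" int public_entry(int x)
{
    return internal_helper(x) * 2;
}
```

Under that assumption, marking the `_lrb` routines this way would keep them out of the shared library's dynamic symbol table while leaving calls from within the library unaffected.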