Commit 2eab9666 by Ilya Verbin

backport: Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host and libmyo-client.

Merge liboffloadmic from upstream, version 20150803.

liboffloadmic/
	* Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host
	and libmyo-client.  liboffloadmic_host loads them dynamically.
	* Makefile.in: Regenerate.
	* doc/doxygen/header.tex: Merge from upstream, version 20150803
	<https://openmprtl.org/sites/default/files/liboffload_oss_20150803.tgz>.
	* runtime/cean_util.cpp: Likewise.
	* runtime/cean_util.h: Likewise.
	* runtime/coi/coi_client.cpp: Likewise.
	* runtime/coi/coi_client.h: Likewise.
	* runtime/coi/coi_server.cpp: Likewise.
	* runtime/coi/coi_server.h: Likewise.
	* runtime/compiler_if_host.cpp: Likewise.
	* runtime/compiler_if_host.h: Likewise.
	* runtime/compiler_if_target.cpp: Likewise.
	* runtime/compiler_if_target.h: Likewise.
	* runtime/dv_util.cpp: Likewise.
	* runtime/dv_util.h: Likewise.
	* runtime/liboffload_error.c: Likewise.
	* runtime/liboffload_error_codes.h: Likewise.
	* runtime/liboffload_msg.c: Likewise.
	* runtime/liboffload_msg.h: Likewise.
	* runtime/mic_lib.f90: Likewise.
	* runtime/offload.h: Likewise.
	* runtime/offload_common.cpp: Likewise.
	* runtime/offload_common.h: Likewise.
	* runtime/offload_engine.cpp: Likewise.
	* runtime/offload_engine.h: Likewise.
	* runtime/offload_env.cpp: Likewise.
	* runtime/offload_env.h: Likewise.
	* runtime/offload_host.cpp: Likewise.
	* runtime/offload_host.h: Likewise.
	* runtime/offload_iterator.h: Likewise.
	* runtime/offload_myo_host.cpp: Likewise.
	* runtime/offload_myo_host.h: Likewise.
	* runtime/offload_myo_target.cpp: Likewise.
	* runtime/offload_myo_target.h: Likewise.
	* runtime/offload_omp_host.cpp: Likewise.
	* runtime/offload_omp_target.cpp: Likewise.
	* runtime/offload_orsl.cpp: Likewise.
	* runtime/offload_orsl.h: Likewise.
	* runtime/offload_table.cpp: Likewise.
	* runtime/offload_table.h: Likewise.
	* runtime/offload_target.cpp: Likewise.
	* runtime/offload_target.h: Likewise.
	* runtime/offload_target_main.cpp: Likewise.
	* runtime/offload_timer.h: Likewise.
	* runtime/offload_timer_host.cpp: Likewise.
	* runtime/offload_timer_target.cpp: Likewise.
	* runtime/offload_trace.cpp: Likewise.
	* runtime/offload_trace.h: Likewise.
	* runtime/offload_util.cpp: Likewise.
	* runtime/offload_util.h: Likewise.
	* runtime/ofldbegin.cpp: Likewise.
	* runtime/ofldend.cpp: Likewise.
	* runtime/orsl-lite/include/orsl-lite.h: Likewise.
	* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
	* runtime/use_mpss2.txt: Likewise.
	* include/coi/common/COIEngine_common.h: Merge from upstream, MPSS
	version 3.5
	<http://registrationcenter.intel.com/irc_nas/7445/mpss-src-3.5.tar>.
	* include/coi/common/COIEvent_common.h: Likewise.
	* include/coi/common/COIMacros_common.h: Likewise.
	* include/coi/common/COIPerf_common.h: Likewise.
	* include/coi/common/COIResult_common.h: Likewise.
	* include/coi/common/COISysInfo_common.h: Likewise.
	* include/coi/common/COITypes_common.h: Likewise.
	* include/coi/sink/COIBuffer_sink.h: Likewise.
	* include/coi/sink/COIPipeline_sink.h: Likewise.
	* include/coi/sink/COIProcess_sink.h: Likewise.
	* include/coi/source/COIBuffer_source.h: Likewise.
	* include/coi/source/COIEngine_source.h: Likewise.
	* include/coi/source/COIEvent_source.h: Likewise.
	* include/coi/source/COIPipeline_source.h: Likewise.
	* include/coi/source/COIProcess_source.h: Likewise.
	* include/myo/myo.h: Likewise.
	* include/myo/myoimpl.h: Likewise.
	* include/myo/myotypes.h: Likewise.
	* plugin/Makefile.am (myo_inc_dir): Remove.
	(libgomp_plugin_intelmic_la_CPPFLAGS): Do not define MYO_SUPPORT.
	(AM_CPPFLAGS): Likewise for offload_target_main.
	* plugin/Makefile.in: Regenerate.
	* runtime/emulator/coi_common.h: Update copyright years.
	(OFFLOAD_EMUL_KNC_NUM_ENV): Replace with ...
	(OFFLOAD_EMUL_NUM_ENV): ... this.
	(enum cmd_t): Add CMD_CLOSE_LIBRARY.
	* runtime/emulator/coi_device.cpp: Update copyright years.
	(COIProcessWaitForShutdown): Add space between string constants.
	Return handle to host in CMD_OPEN_LIBRARY.
	Support CMD_CLOSE_LIBRARY.
	* runtime/emulator/coi_device.h: Update copyright years.
	* runtime/emulator/coi_host.cpp: Update copyright years.
	(knc_engines_num): Replace with ...
	(num_engines): ... this.
	(init): Replace OFFLOAD_EMUL_KNC_NUM_ENV with OFFLOAD_EMUL_NUM_ENV.
	(COIEngineGetCount): Replace COI_ISA_KNC with COI_ISA_MIC, and
	knc_engines_num with num_engines.
	(COIEngineGetHandle): Likewise.
	(COIProcessCreateFromMemory): Add space between string constants.
	(COIProcessCreateFromFile): New function.
	(COIProcessLoadLibraryFromMemory): Rename arguments according to
	COIProcess_source.h.  Return handle, received from target.
	(COIProcessUnloadLibrary): New function.
	(COIPipelineClearCPUMask): New function.
	(COIPipelineSetCPUMask): New function.
	(COIEngineGetInfo): New function.
	* runtime/emulator/coi_host.h: Update copyright years.
	* runtime/emulator/coi_version_asm.h: Regenerate.
	* runtime/emulator/coi_version_linker_script.map: Regenerate.
	* runtime/emulator/myo_client.cpp: Update copyright years.
	* runtime/emulator/myo_service.cpp: Update copyright years.
	(myoArenaRelease): New function.
	(myoArenaAcquire): New function.
	(myoArenaAlignedFree): New function.
	(myoArenaAlignedMalloc): New function.
	* runtime/emulator/myo_service.h: Update copyright years.
	* runtime/emulator/myo_version_asm.h: Regenerate.
	* runtime/emulator/myo_version_linker_script.map: Regenerate.

From-SVN: r227532
parent 761f8e2f
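The ChangeLog entry above says that liboffloadmic_host now loads libcoi_host and libmyo-client dynamically instead of listing them as link-time dependencies. The following is a minimal sketch of that general pattern using POSIX `dlopen`/`dlsym`, not the code from this commit. The helper `load_symbol` and the struct `dyn_symbol` are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical result type: both fields are NULL when loading failed. */
typedef struct {
    void *handle;   /* value returned by dlopen() */
    void *symbol;   /* value returned by dlsym() */
} dyn_symbol;

/* Hypothetical helper: open a shared library at run time and resolve one
   symbol from it.  Nothing is required at link time, so a missing library
   shows up as a NULL result instead of a link or startup failure. */
static dyn_symbol load_symbol(const char *library, const char *name)
{
    dyn_symbol result = { NULL, NULL };
    assert(library != NULL && name != NULL);

    result.handle = dlopen(library, RTLD_LAZY | RTLD_GLOBAL);
    if (result.handle == NULL)
        return result;              /* library not available */

    result.symbol = dlsym(result.handle, name);
    if (result.symbol == NULL) {    /* library present, symbol missing */
        dlclose(result.handle);
        result.handle = NULL;
    }
    return result;
}
```

A caller would check `result.symbol` before casting it to a function pointer, for instance after `load_symbol("libcoi_host.so.0", "COIEngineGetCount")`. The file name in that call is an assumption, not something stated in the commit. On older glibc versions the program must also be linked with `-ldl`.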
2015-09-08 Ilya Verbin <ilya.verbin@intel.com>
* Makefile.am (liboffloadmic_host_la_DEPENDENCIES): Remove libcoi_host
and libmyo-client. liboffloadmic_host loads them dynamically.
* Makefile.in: Regenerate.
* doc/doxygen/header.tex: Merge from upstream, version 20150803
<https://openmprtl.org/sites/default/files/liboffload_oss_20150803.tgz>.
* runtime/cean_util.cpp: Likewise.
* runtime/cean_util.h: Likewise.
* runtime/coi/coi_client.cpp: Likewise.
* runtime/coi/coi_client.h: Likewise.
* runtime/coi/coi_server.cpp: Likewise.
* runtime/coi/coi_server.h: Likewise.
* runtime/compiler_if_host.cpp: Likewise.
* runtime/compiler_if_host.h: Likewise.
* runtime/compiler_if_target.cpp: Likewise.
* runtime/compiler_if_target.h: Likewise.
* runtime/dv_util.cpp: Likewise.
* runtime/dv_util.h: Likewise.
* runtime/liboffload_error.c: Likewise.
* runtime/liboffload_error_codes.h: Likewise.
* runtime/liboffload_msg.c: Likewise.
* runtime/liboffload_msg.h: Likewise.
* runtime/mic_lib.f90: Likewise.
* runtime/offload.h: Likewise.
* runtime/offload_common.cpp: Likewise.
* runtime/offload_common.h: Likewise.
* runtime/offload_engine.cpp: Likewise.
* runtime/offload_engine.h: Likewise.
* runtime/offload_env.cpp: Likewise.
* runtime/offload_env.h: Likewise.
* runtime/offload_host.cpp: Likewise.
* runtime/offload_host.h: Likewise.
* runtime/offload_iterator.h: Likewise.
* runtime/offload_myo_host.cpp: Likewise.
* runtime/offload_myo_host.h: Likewise.
* runtime/offload_myo_target.cpp: Likewise.
* runtime/offload_myo_target.h: Likewise.
* runtime/offload_omp_host.cpp: Likewise.
* runtime/offload_omp_target.cpp: Likewise.
* runtime/offload_orsl.cpp: Likewise.
* runtime/offload_orsl.h: Likewise.
* runtime/offload_table.cpp: Likewise.
* runtime/offload_table.h: Likewise.
* runtime/offload_target.cpp: Likewise.
* runtime/offload_target.h: Likewise.
* runtime/offload_target_main.cpp: Likewise.
* runtime/offload_timer.h: Likewise.
* runtime/offload_timer_host.cpp: Likewise.
* runtime/offload_timer_target.cpp: Likewise.
* runtime/offload_trace.cpp: Likewise.
* runtime/offload_trace.h: Likewise.
* runtime/offload_util.cpp: Likewise.
* runtime/offload_util.h: Likewise.
* runtime/ofldbegin.cpp: Likewise.
* runtime/ofldend.cpp: Likewise.
* runtime/orsl-lite/include/orsl-lite.h: Likewise.
* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
* runtime/use_mpss2.txt: Likewise.
* include/coi/common/COIEngine_common.h: Merge from upstream, MPSS
version 3.5
<http://registrationcenter.intel.com/irc_nas/7445/mpss-src-3.5.tar>.
* include/coi/common/COIEvent_common.h: Likewise.
* include/coi/common/COIMacros_common.h: Likewise.
* include/coi/common/COIPerf_common.h: Likewise.
* include/coi/common/COIResult_common.h: Likewise.
* include/coi/common/COISysInfo_common.h: Likewise.
* include/coi/common/COITypes_common.h: Likewise.
* include/coi/sink/COIBuffer_sink.h: Likewise.
* include/coi/sink/COIPipeline_sink.h: Likewise.
* include/coi/sink/COIProcess_sink.h: Likewise.
* include/coi/source/COIBuffer_source.h: Likewise.
* include/coi/source/COIEngine_source.h: Likewise.
* include/coi/source/COIEvent_source.h: Likewise.
* include/coi/source/COIPipeline_source.h: Likewise.
* include/coi/source/COIProcess_source.h: Likewise.
* include/myo/myo.h: Likewise.
* include/myo/myoimpl.h: Likewise.
* include/myo/myotypes.h: Likewise.
* plugin/Makefile.am (myo_inc_dir): Remove.
(libgomp_plugin_intelmic_la_CPPFLAGS): Do not define MYO_SUPPORT.
(AM_CPPFLAGS): Likewise for offload_target_main.
* plugin/Makefile.in: Regenerate.
* runtime/emulator/coi_common.h: Update copyright years.
(OFFLOAD_EMUL_KNC_NUM_ENV): Replace with ...
(OFFLOAD_EMUL_NUM_ENV): ... this.
(enum cmd_t): Add CMD_CLOSE_LIBRARY.
* runtime/emulator/coi_device.cpp: Update copyright years.
(COIProcessWaitForShutdown): Add space between string constants.
Return handle to host in CMD_OPEN_LIBRARY.
Support CMD_CLOSE_LIBRARY.
* runtime/emulator/coi_device.h: Update copyright years.
* runtime/emulator/coi_host.cpp: Update copyright years.
(knc_engines_num): Replace with ...
(num_engines): ... this.
(init): Replace OFFLOAD_EMUL_KNC_NUM_ENV with OFFLOAD_EMUL_NUM_ENV.
(COIEngineGetCount): Replace COI_ISA_KNC with COI_ISA_MIC, and
knc_engines_num with num_engines.
(COIEngineGetHandle): Likewise.
(COIProcessCreateFromMemory): Add space between string constants.
(COIProcessCreateFromFile): New function.
(COIProcessLoadLibraryFromMemory): Rename arguments according to
COIProcess_source.h. Return handle, received from target.
(COIProcessUnloadLibrary): New function.
(COIPipelineClearCPUMask): New function.
(COIPipelineSetCPUMask): New function.
(COIEngineGetInfo): New function.
* runtime/emulator/coi_host.h: Update copyright years.
* runtime/emulator/coi_version_asm.h: Regenerate.
* runtime/emulator/coi_version_linker_script.map: Regenerate.
* runtime/emulator/myo_client.cpp: Update copyright years.
* runtime/emulator/myo_service.cpp: Update copyright years.
(myoArenaRelease): New function.
(myoArenaAcquire): New function.
(myoArenaAlignedFree): New function.
(myoArenaAlignedMalloc): New function.
* runtime/emulator/myo_service.h: Update copyright years.
* runtime/emulator/myo_version_asm.h: Regenerate.
* runtime/emulator/myo_version_linker_script.map: Regenerate.
2015-08-24 Nathan Sidwell <nathan@codesourcery.com>
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_version): New.
@@ -17,11 +137,11 @@
* configure: Reflects renaming of configure.in to configure.ac
2015-07-17 Nathan Sidwell <nathan@acm.org>
-Ilya Verbin <iverbin@gmail.com>
+Ilya Verbin <ilya.verbin@intel.com>
* plugin/libgomp-plugin-intelmic.cpp (ImgDevAddrMap): Constify.
(offload_image, GOMP_OFFLOAD_load_image,
-OMP_OFFLOAD_unload_image): Constify target data.
+GOMP_OFFLOAD_unload_image): Constify target data.
2015-07-08 Thomas Schwinge <thomas@codesourcery.com>
......
@@ -84,8 +84,6 @@ liboffloadmic_host_la_SOURCES = $(liboffloadmic_sources) \
liboffloadmic_host_la_CPPFLAGS = $(liboffloadmic_cppflags) -DHOST_LIBRARY=1
liboffloadmic_host_la_LDFLAGS = @lt_cv_dlopen_libs@ -version-info 5:0:0
-liboffloadmic_host_la_LIBADD = libcoi_host.la libmyo-client.la
-liboffloadmic_host_la_DEPENDENCIES = $(liboffloadmic_host_la_LIBADD)
liboffloadmic_target_la_SOURCES = $(liboffloadmic_sources) \
runtime/coi/coi_server.cpp \
......
@@ -165,6 +165,7 @@ libmyo_service_la_LINK = $(LIBTOOL) --tag=CXX $(AM_LIBTOOLFLAGS) \
$(CXXFLAGS) $(libmyo_service_la_LDFLAGS) $(LDFLAGS) -o $@
@LIBOFFLOADMIC_HOST_FALSE@am_libmyo_service_la_rpath = -rpath \
@LIBOFFLOADMIC_HOST_FALSE@ $(toolexeclibdir)
+liboffloadmic_host_la_LIBADD =
am__objects_1 = liboffloadmic_host_la-dv_util.lo \
liboffloadmic_host_la-liboffload_error.lo \
liboffloadmic_host_la-liboffload_msg.lo \
@@ -445,8 +446,6 @@ liboffloadmic_host_la_SOURCES = $(liboffloadmic_sources) \
liboffloadmic_host_la_CPPFLAGS = $(liboffloadmic_cppflags) -DHOST_LIBRARY=1
liboffloadmic_host_la_LDFLAGS = @lt_cv_dlopen_libs@ -version-info 5:0:0
-liboffloadmic_host_la_LIBADD = libcoi_host.la libmyo-client.la
-liboffloadmic_host_la_DEPENDENCIES = $(liboffloadmic_host_la_LIBADD)
liboffloadmic_target_la_SOURCES = $(liboffloadmic_sources) \
runtime/coi/coi_server.cpp \
runtime/compiler_if_target.cpp \
......
@@ -82,7 +82,7 @@ Notice revision \#20110804
Intel, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.
-This document is Copyright \textcopyright 2014, Intel Corporation. All rights reserved.
+This document is Copyright \textcopyright 2014-2015, Intel Corporation. All rights reserved.
\pagenumbering{roman}
\tableofcontents
......
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
@@ -89,7 +89,7 @@ typedef enum
/// [out] The zero-based index of this engine in the collection of
/// engines of the ISA returned in out_pType.
///
-/// @return COI_INVALID_POINTER if the any of the parameters are NULL.
+/// @return COI_INVALID_POINTER if any of the parameters are NULL.
///
/// @return COI_SUCCESS
///
......
/*
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, version 2.1.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* Disclaimer: The codes contained in these modules may be specific
* to the Intel Software Development Platform codenamed Knights Ferry,
* and the Intel product codenamed Knights Corner, and are not backward
* compatible with other Intel products. Additionally, Intel will NOT
* support the codes or instruction set in future products.
*
* Intel offers no warranty of any kind regarding the code. This code is
* licensed on an "AS IS" basis and Intel is not obligated to provide
* any support, assistance, installation, training, or other services
* of any kind. Intel is also not obligated to provide any updates,
* enhancements or extensions. Intel specifically disclaims any warranty
* of merchantability, non-infringement, fitness for any particular
* purpose, and any other warranty.
*
* Further, Intel disclaims all liability of any kind, including but
* not limited to liability for infringement of any proprietary rights,
* relating to the use of the code, even if Intel is notified of the
* possibility of such liability. Except as expressly stated in an Intel
* license agreement provided with this code and agreed upon with Intel,
* no license, express or implied, by estoppel or otherwise, to any
* intellectual property rights is granted herein.
*/
#ifndef _COIEVENT_COMMON_H
#define _COIEVENT_COMMON_H
/** @ingroup COIEvent
* @addtogroup COIEventcommon
@{
* @file common/COIEvent_common.h
*/
#ifndef DOXYGEN_SHOULD_SKIP_THIS
#include "../common/COITypes_common.h"
#include "../common/COIResult_common.h"
#ifdef __cplusplus
extern "C" {
#endif
#endif // DOXYGEN_SHOULD_SKIP_THIS
///////////////////////////////////////////////////////////////////////////////
///
/// Signal one shot user event. User events created on source can be
/// signaled from both sink and source. This fires the event and wakes up
/// threads waiting on COIEventWait.
///
/// Note: For events that are not registered or already signaled this call
/// will behave as a NOP. Users need to make sure that they pass valid
/// events on the sink side.
///
/// @param in_Event
/// Event Handle to be signaled.
///
/// @return COI_INVAILD_HANDLE if in_Event was not a User event.
///
/// @return COI_ERROR if the signal fails to be sent from the sink.
///
/// @return COI_SUCCESS if the event was successfully signaled or ignored.
///
COIACCESSAPI
COIRESULT COIEventSignalUserEvent(COIEVENT in_Event);
///
///
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* _COIEVENT_COMMON_H */
/*! @} */
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
@@ -41,12 +41,17 @@
#ifndef _COIMACROS_COMMON_H
#define _COIMACROS_COMMON_H
+#include <string.h>
+#include "../source/COIPipeline_source.h"
+#include "../common/COITypes_common.h"
/// @file common/COIMacros_common.h
/// Commonly used macros
// Note that UNUSUED_ATTR means that it is "possibly" unused, not "definitely".
// This should compile out in release mode if indeed it is unused.
#define UNUSED_ATTR __attribute__((unused))
+#include <sched.h>
#ifndef UNREFERENCED_CONST_PARAM
#define UNREFERENCED_CONST_PARAM(P) { void* x UNUSED_ATTR = \
(void*)(uint64_t)P; \
@@ -66,4 +71,150 @@
#endif
/* The following are static inline definitions of functions used for manipulating
COI_CPU_MASK info (The COI_CPU_MASK type is declared as an array of 16 uint64_t's
in COITypes_common.h "typedef uint64_t COI_CPU_MASK[16]").
These static inlined functions are intended on being roughly the same as the Linux
CPU_* macros defined in sched.h - with the important difference being a different
fundamental type difference: cpu_set_t versus COI_CPU_MASK.
The motivation for writing this code was to ease portability on the host side of COI
applications to both Windows and Linux.
*/
/* Roughly equivalent to CPU_ISSET(). */
static inline uint64_t COI_CPU_MASK_ISSET(int bitNumber, const COI_CPU_MASK cpu_mask)
{
if ((size_t)bitNumber < sizeof(COI_CPU_MASK)*8)
return ((cpu_mask)[bitNumber/64] & (((uint64_t)1) << (bitNumber%64)));
return 0;
}
/* Roughly equivalent to CPU_SET(). */
static inline void COI_CPU_MASK_SET(int bitNumber, COI_CPU_MASK cpu_mask)
{
if ((size_t)bitNumber < sizeof(COI_CPU_MASK)*8)
((cpu_mask)[bitNumber/64] |= (((uint64_t)1) << (bitNumber%64)));
}
/* Roughly equivalent to CPU_ZERO(). */
static inline void COI_CPU_MASK_ZERO(COI_CPU_MASK cpu_mask)
{
memset(cpu_mask,0,sizeof(COI_CPU_MASK));
}
/* Roughly equivalent to CPU_AND(). */
static inline void COI_CPU_MASK_AND(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] & src2[i];
}
/* Roughly equivalent to CPU_XOR(). */
static inline void COI_CPU_MASK_XOR(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] ^ src2[i];
}
/* Roughly equivalent to CPU_OR(). */
static inline void COI_CPU_MASK_OR(COI_CPU_MASK dst, const COI_CPU_MASK src1, const COI_CPU_MASK src2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(dst[0]);
for(unsigned int i=0;i<loopIterations;++i)
dst[i] = src1[i] | src2[i];
}
/* Utility function for COI_CPU_MASK_COUNT() below. */
static inline int __COI_CountBits(uint64_t n)
{
int cnt=0;
for (;n;cnt++)
n &= (n-1);
return cnt;
}
/* Roughly equivalent to CPU_COUNT(). */
static inline int COI_CPU_MASK_COUNT(const COI_CPU_MASK cpu_mask)
{
int cnt=0;
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(cpu_mask[0]);
for(unsigned int i=0;i < loopIterations;++i)
{
cnt += __COI_CountBits(cpu_mask[i]);
}
return cnt;
}
/* Roughly equivalent to CPU_EQUAL(). */
static inline int COI_CPU_MASK_EQUAL(const COI_CPU_MASK cpu_mask1,const COI_CPU_MASK cpu_mask2)
{
const unsigned int loopIterations = sizeof(COI_CPU_MASK) / sizeof(cpu_mask1[0]);
for(unsigned int i=0;i < loopIterations;++i)
{
if (cpu_mask1[i] != cpu_mask2[i])
return 0;
}
return 1;
}
/* Utility function to translate from cpu_set * to COI_CPU_MASK. */
static inline void COI_CPU_MASK_XLATE(COI_CPU_MASK dest,const cpu_set_t *src)
{
COI_CPU_MASK_ZERO(dest);
#if 0
/* Slightly slower version than the following #else/#endif block. Left here only to
document the intent of the code. */
for(unsigned int i=0;i < sizeof(cpu_set_t)*8;++i)
if (CPU_ISSET(i,src))
COI_CPU_MASK_SET(i,dest);
#else
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)/sizeof(dest[0]);++i)
{
for(unsigned int j=0;j < 64;++j)
{
if (CPU_ISSET(i*64+j,src))
dest[i] |= ((uint64_t)1) << j;
}
}
#endif
}
/* Utility function to translate from COI_CPU_MASK to cpu_set *. */
static inline void COI_CPU_MASK_XLATE_EX(cpu_set_t *dest,const COI_CPU_MASK src)
{
CPU_ZERO(dest);
#if 0
/* Slightly slower version than the following #else/#endif block. Left here only to
document the intent of the code. */
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)*8;++i)
if (COI_CPU_MASK_ISSET(i,src))
CPU_SET(i,dest);
#else
for(unsigned int i=0;i < sizeof(COI_CPU_MASK)/sizeof(src[0]);++i)
{
const uint64_t cpu_mask = src[i];
for(unsigned int j=0;j < 64;++j)
{
const uint64_t bit = ((uint64_t)1) << j;
if (bit & cpu_mask)
CPU_SET(i*64+j,dest);
}
}
#endif
}
#endif /* _COIMACROS_COMMON_H */
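The helpers added in this header store CPU number n in word n/64 at bit position n%64 of a 16-word array, giving 1024 addressable bits. The sketch below illustrates that layout and the clear-lowest-set-bit counting loop. It uses local stand-ins (`demo_cpu_mask`, `demo_mask_set`, `demo_mask_isset`, `demo_mask_count`) rather than the header itself, and it rejects negative bit numbers with an explicit check where the header relies on a cast to `size_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for "typedef uint64_t COI_CPU_MASK[16]". */
typedef uint64_t demo_cpu_mask[16];

enum { DEMO_MASK_BITS = sizeof(demo_cpu_mask) * 8 };
static_assert(sizeof(demo_cpu_mask) * 8 == 1024, "mask holds 1024 bits");

/* Set one bit; out-of-range bit numbers are silently ignored. */
static void demo_mask_set(int bit, demo_cpu_mask mask)
{
    if (bit >= 0 && bit < DEMO_MASK_BITS)
        mask[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Return 1 if the bit is set, 0 if it is clear or out of range. */
static int demo_mask_isset(int bit, const demo_cpu_mask mask)
{
    if (bit < 0 || bit >= DEMO_MASK_BITS)
        return 0;
    return (int)((mask[bit / 64] >> (bit % 64)) & 1);
}

/* Count set bits across all words. */
static int demo_mask_count(const demo_cpu_mask mask)
{
    int count = 0;
    for (size_t i = 0; i < sizeof(demo_cpu_mask) / sizeof(mask[0]); ++i)
        for (uint64_t word = mask[i]; word != 0; word &= word - 1)
            ++count;    /* each iteration clears the lowest set bit */
    return count;
}
```

A caller zeroes the mask with `memset` first, as `COI_CPU_MASK_ZERO` does, and then sets individual CPUs. Setting bit 65, for example, changes only word 1.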
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
@@ -115,7 +115,8 @@ typedef enum COIRESULT
COI_AUTHENTICATION_FAILURE, ///< The daemon was unable to authenticate
///< the user that requested an engine.
///< Only reported if daemon is set up for
-///< authorization.
+///< authorization. Is also reported in
+///< Windows if host can not find user.
COI_NUM_RESULTS ///< Reserved, do not use.
}
COIRESULT;
......
/*
* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, version 2.1.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* Disclaimer: The codes contained in these modules may be specific
* to the Intel Software Development Platform codenamed Knights Ferry,
* and the Intel product codenamed Knights Corner, and are not backward
* compatible with other Intel products. Additionally, Intel will NOT
* support the codes or instruction set in future products.
*
* Intel offers no warranty of any kind regarding the code. This code is
* licensed on an "AS IS" basis and Intel is not obligated to provide
* any support, assistance, installation, training, or other services
* of any kind. Intel is also not obligated to provide any updates,
* enhancements or extensions. Intel specifically disclaims any warranty
* of merchantability, non-infringement, fitness for any particular
* purpose, and any other warranty.
*
* Further, Intel disclaims all liability of any kind, including but
* not limited to liability for infringement of any proprietary rights,
* relating to the use of the code, even if Intel is notified of the
* possibility of such liability. Except as expressly stated in an Intel
* license agreement provided with this code and agreed upon with Intel,
* no license, express or implied, by estoppel or otherwise, to any
* intellectual property rights is granted herein.
*/
#ifndef _COISYSINFO_COMMON_H
#define _COISYSINFO_COMMON_H
/** @ingroup COISysInfo
* @addtogroup COISysInfoCommon
@{
* @file common/COISysInfo_common.h
* This interface allows developers to query the platform for system level
* information. */
#ifndef DOXYGEN_SHOULD_SKIP_THIS
#include "../common/COITypes_common.h"
#include <assert.h>
#include <string.h>
#ifdef __cplusplus
extern "C" {
#endif
#endif // DOXYGEN_SHOULD_SKIP_THIS
#define INITIAL_APIC_ID_BITS 0xFF000000 // EBX[31:24] unique APIC ID
///////////////////////////////////////////////////////////////////////////////
/// \fn uint32_t COISysGetAPICID(void)
/// @return The Advanced Programmable Interrupt Controller (APIC) ID of
/// the hardware thread on which the caller is running.
///
/// @warning APIC IDs are unique to each hardware thread within a processor,
/// but may not be sequential.
COIACCESSAPI
uint32_t COISysGetAPICID(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of cores exposed by the processor on which the caller is
/// running. Returns 0 if there is an error loading the processor info.
COIACCESSAPI
uint32_t COISysGetCoreCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of hardware threads exposed by the processor on which
/// the caller is running. Returns 0 if there is an error loading processor
/// info.
COIACCESSAPI
uint32_t COISysGetHardwareThreadCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the hardware thread on which the caller is running.
///
/// The indexes of neighboring hardware threads will differ by a value of one
/// and are within the range zero through COISysGetHardwareThreadCount()-1.
/// Returns ((uint32_t)-1) if there was an error loading processor info.
COIACCESSAPI
uint32_t COISysGetHardwareThreadIndex(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the core on which the caller is running.
///
/// The indexes of neighboring cores will differ by a value of one and are
/// within the range zero through COISysGetCoreCount()-1. Returns ((uint32_t)-1)
/// if there was an error loading processor info.
COIACCESSAPI
uint32_t COISysGetCoreIndex(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The number of level 2 caches within the processor on which the
/// caller is running. Returns ((uint32_t)-1) if there was an error loading
/// processor info.
COIACCESSAPI
uint32_t COISysGetL2CacheCount(void);
///////////////////////////////////////////////////////////////////////////////
///
/// @return The index of the level 2 cache on which the caller is running.
/// Returns ((uint32_t)-1) if there was an error loading processor info.
///
/// The indexes of neighboring cores will differ by a value of one and are
/// within the range zero through COISysGetL2CacheCount()-1.
COIACCESSAPI
uint32_t COISysGetL2CacheIndex(void);
#ifdef __cplusplus
} /* extern "C" */
#endif
/*! @} */
#endif /* _COISYSINFO_COMMON_H */
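The `INITIAL_APIC_ID_BITS` mask in this header is annotated as covering EBX[31:24], the field where CPUID leaf 1 reports the initial APIC ID. The sketch below shows the corresponding mask-and-shift arithmetic. The helper `apic_id_from_ebx` is hypothetical, and whether `COISysGetAPICID` is implemented this way is an assumption, since the diff only shows its declaration.

```c
#include <assert.h>
#include <stdint.h>

#define INITIAL_APIC_ID_BITS 0xFF000000 /* EBX[31:24], as in the header */

static_assert(INITIAL_APIC_ID_BITS == (0xFFu << 24), "mask covers EBX[31:24]");

/* Hypothetical helper: extract the 8-bit initial APIC ID from an EBX value
   as returned by CPUID leaf 1. */
static uint32_t apic_id_from_ebx(uint32_t ebx)
{
    return (ebx & INITIAL_APIC_ID_BITS) >> 24;
}
```

For an EBX value of `0x2A100800`, the helper yields `0x2A`; the lower 24 bits do not affect the result.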
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
......
/*
-* Copyright 2010-2013 Intel Corporation.
+* Copyright 2010-2015 Intel Corporation.
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published
@@ -62,19 +62,19 @@ extern "C" {
/// will remain on the device until both a corresponding COIBufferReleaseRef()
/// call is made and the run function that delivered the buffer returns.
///
-/// Intel® Coprocessor Offload Infrastructure (Intel® COI) streaming buffers should not be AddRef'd. Doing so may result in
-/// unpredictable results or may cause the sink process to crash.
+/// Running this API in a thread spawned within the run function is not
+/// supported and will cause unpredictable results and may cause data corruption.
///
/// @warning 1.It is possible for enqueued run functions to be unable to
-/// execute due to all card memory being occupied by addref'ed
+/// execute due to all card memory being occupied by AddRef'd
/// buffers. As such, it is important that whenever a buffer is
-/// addref'd that there be no dependencies on future run functions
+/// AddRef'd that there be no dependencies on future run functions
/// for progress to be made towards releasing the buffer.
/// 2.It is important that AddRef is called within the scope of
-/// run function that carries the buffer to be addref'ed.
+/// run function that carries the buffer to be AddRef'd.
///
/// @param in_pBuffer
-/// [in] Pointer to the start of a buffer being addref'ed, that was
+/// [in] Pointer to the start of a buffer being AddRef'd, that was
/// passed in at the start of the run function.
///
/// @return COI_SUCCESS if the buffer ref count was successfully incremented.
@@ -95,10 +95,13 @@ COIBufferAddRef(
/// conditions are met: the run function that delivered the buffer
/// returns, and the number of calls to COIBufferReleaseRef() matches the
/// number of calls to COIBufferAddRef().
+//
+/// Running this API in a thread spawned within the run function is not
+/// supported and will cause unpredictable results and may cause data corruption.
///
-/// @warning When a buffer is addref'ed it is assumed that it is in use and all
+/// @warning When a buffer is AddRef'd it is assumed that it is in use and all
/// other operations on that buffer waits for ReleaseRef() to happen. /// other operations on that buffer waits for ReleaseRef() to happen.
/// So you cannot pass the addref'ed buffer's handle to RunFunction /// So you cannot pass the AddRef'd buffer's handle to RunFunction
/// that calls ReleaseRef(). This is a circular dependency and will /// that calls ReleaseRef(). This is a circular dependency and will
/// cause a deadlock. Buffer's pointer (buffer's sink side /// cause a deadlock. Buffer's pointer (buffer's sink side
/// address/pointer which is different than source side BUFFER handle) /// address/pointer which is different than source side BUFFER handle)
...@@ -106,7 +109,7 @@ COIBufferAddRef( ...@@ -106,7 +109,7 @@ COIBufferAddRef(
/// ReleaseRef. /// ReleaseRef.
/// ///
/// @param in_pBuffer /// @param in_pBuffer
/// [in] Pointer to the start of a buffer previously addref'ed, that /// [in] Pointer to the start of a buffer previously AddRef'd, that
/// was passed in at the start of the run function. /// was passed in at the start of the run function.
/// ///
/// @return COI_SUCCESS if the buffer refcount was successfully decremented. /// @return COI_SUCCESS if the buffer refcount was successfully decremented.
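The lifetime rule in the COIBufferAddRef()/COIBufferReleaseRef() comments above (a buffer stays on the device until the delivering run function has returned and every AddRef has a matching ReleaseRef) can be sketched with a toy counter. This is only an illustrative model: `toy_buffer` and the `toy_*` helpers are hypothetical names, not part of the COI API, and the real runtime tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a sink-side buffer; not a COI type. */
typedef struct toy_buffer {
    uint32_t extra_refs;        /* AddRef calls not yet matched by ReleaseRef */
    bool     run_function_live; /* delivering run function has not returned */
} toy_buffer;

/* Models an AddRef made inside the run function that carries the buffer. */
static void toy_addref(toy_buffer *b)
{
    assert(b != NULL);
    b->extra_refs++;
}

/* Returns false when there is no outstanding AddRef left to release. */
static bool toy_releaseref(toy_buffer *b)
{
    assert(b != NULL);
    if (b->extra_refs == 0)
        return false;
    b->extra_refs--;
    return true;
}

/* Models the delivering run function returning. */
static void toy_run_function_returns(toy_buffer *b)
{
    assert(b != NULL);
    b->run_function_live = false;
}

/* The buffer may leave the device only when both conditions hold. */
static bool toy_can_leave_device(const toy_buffer *b)
{
    assert(b != NULL);
    return !b->run_function_live && b->extra_refs == 0;
}
```

In this model a buffer that was AddRef'd once still occupies device memory after its run function returns, which is the situation the first warning describes: progress toward releasing it must not depend on a later run function that needs that memory.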
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -63,10 +63,11 @@ extern "C" { ...@@ -63,10 +63,11 @@ extern "C" {
/// main() function from exiting until it is directed to by the source. When /// main() function from exiting until it is directed to by the source. When
/// the shutdown message is received this function will stop any future run /// the shutdown message is received this function will stop any future run
/// functions from executing but will wait for any current run functions to /// functions from executing but will wait for any current run functions to
/// complete. All Intel® Coprocessor Offload Infrastructure (Intel® COI) resources will be cleaned up and no additional Intel® Coprocessor Offload Infrastructure (Intel® COI) APIs /// complete. All Intel® Coprocessor Offload Infrastructure (Intel® COI)
/// should be called after this function returns. This function does not /// resources will be cleaned up and no additional Intel® Coprocessor Offload
/// invoke exit() so the application can perform any of its own cleanup once /// Infrastructure (Intel® COI) APIs should be called after this function
/// this call returns. /// returns. This function does not invoke exit() so the application
/// can perform any of its own cleanup once this call returns.
/// ///
/// @return COI_SUCCESS once the process receives the shutdown message. /// @return COI_SUCCESS once the process receives the shutdown message.
/// ///
...@@ -86,8 +87,9 @@ COIProcessWaitForShutdown(); ...@@ -86,8 +87,9 @@ COIProcessWaitForShutdown();
/// from this call. /// from this call.
/// ///
/// @return COI_SUCCESS once the proxy output has been flushed to and written /// @return COI_SUCCESS once the proxy output has been flushed to and written
/// written by the host. Note that Intel® Coprocessor Offload Infrastructure (Intel® COI) on the source writes to stdout /// written by the host. Note that Intel® Coprocessor Offload
/// and stderr, but does not flush this output. /// Infrastructure (Intel® COI) on the source writes to stdout and
/// stderr, but does not flush this output.
/// @return COI_SUCCESS if the process was created without enabling /// @return COI_SUCCESS if the process was created without enabling
/// proxy IO this function. /// proxy IO this function.
/// ///
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -75,7 +75,7 @@ typedef enum ...@@ -75,7 +75,7 @@ typedef enum
/////////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////////
/// This structure returns information about an Intel(r) Xeon Phi(tm) /// This structure returns information about an Intel(R) Xeon Phi(TM)
/// coprocessor. /// coprocessor.
/// A pointer to this structure is passed into the COIGetEngineInfo() function, /// A pointer to this structure is passed into the COIGetEngineInfo() function,
/// which fills in the data before returning to the caller. /// which fills in the data before returning to the caller.
...@@ -101,6 +101,7 @@ typedef struct COI_ENGINE_INFO ...@@ -101,6 +101,7 @@ typedef struct COI_ENGINE_INFO
uint32_t CoreMaxFrequency; uint32_t CoreMaxFrequency;
/// The load percentage for each of the hardware threads on the engine. /// The load percentage for each of the hardware threads on the engine.
/// Currently this is limited to reporting out a maximum of 1024 HW threads
uint32_t Load[COI_MAX_HW_THREADS]; uint32_t Load[COI_MAX_HW_THREADS];
/// The amount of physical memory managed by the OS. /// The amount of physical memory managed by the OS.
...@@ -133,9 +134,9 @@ typedef struct COI_ENGINE_INFO ...@@ -133,9 +134,9 @@ typedef struct COI_ENGINE_INFO
/////////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////////
/// ///
/// Returns information related to a specified engine. Note that if Intel® Coprocessor Offload Infrastructure (Intel® COI) is /// Returns information related to a specified engine. Note that if Intel(R)
/// unable to query a value it will be returned as zero but the call will /// Coprocessor Offload Infrastructure (Intel(R) COI) is unable to query
/// still succeed. /// a value it will be returned as zero but the call will still succeed.
/// ///
/// ///
/// @param in_EngineHandle /// @param in_EngineHandle
...@@ -173,14 +174,15 @@ COIEngineGetInfo( ...@@ -173,14 +174,15 @@ COIEngineGetInfo(
/// ///
/// Returns the number of engines in the system that match the provided ISA. /// Returns the number of engines in the system that match the provided ISA.
/// ///
/// Note that while it is possible to enumerate different types of Intel(r) /// Note that while it is possible to enumerate different types of Intel(R)
/// Xeon Phi(tm) coprocessors on a single host this is not currently /// Xeon Phi(TM) coprocessors on a single host this is not currently
/// supported. Intel® Coprocessor Offload Infrastructure (Intel® COI) makes an assumption that all Intel(r) Xeon Phi(tm) /// supported. Intel(R) Coprocessor Offload Infrastructure (Intel(R) COI)
/// coprocessors found in the system are the same architecture as the first /// makes an assumption that all Intel(R) Xeon Phi(TM) coprocessors found
/// coprocessor device. /// in the system are the same architecture as the first coprocessor device.
/// ///
/// Also, note that this function returns the number of engines that Intel® Coprocessor Offload Infrastructure (Intel® COI) /// Also, note that this function returns the number of engines that Intel(R)
/// is able to detect. Not all of them may be online. /// Coprocessor Offload Infrastructure (Intel(R) COI) is able to detect. Not
/// all of them may be online.
/// ///
/// @param in_ISA /// @param in_ISA
/// [in] Specifies the ISA type of the engine requested. /// [in] Specifies the ISA type of the engine requested.
...@@ -226,7 +228,8 @@ COIEngineGetCount( ...@@ -226,7 +228,8 @@ COIEngineGetCount(
/// ///
/// @return COI_INVALID_POINTER if the out_pEngineHandle parameter is NULL. /// @return COI_INVALID_POINTER if the out_pEngineHandle parameter is NULL.
/// ///
/// @return COI_VERSION_MISMATCH if the version of Intel® Coprocessor Offload Infrastructure (Intel® COI) on the host is not /// @return COI_VERSION_MISMATCH if the version of Intel(R) Coprocessor Offload
/// Infrastructure (Intel(R) COI) on the host is not
/// compatible with the version on the device. /// compatible with the version on the device.
/// ///
/// @return COI_NOT_INITIALIZED if the engine requested exists but is offline. /// @return COI_NOT_INITIALIZED if the engine requested exists but is offline.
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -59,12 +59,10 @@ extern "C" { ...@@ -59,12 +59,10 @@ extern "C" {
/// ///
/// Special case event values which can be passed in to APIs to specify /// Special case event values which can be passed in to APIs to specify
/// how the API should behave. In COIBuffer APIs passing in NULL for the /// how the API should behave. In COIBuffer APIs passing in NULL for the
/// completion event is the equivalent of passing COI_EVENT_SYNC. For /// completion event is the equivalent of passing COI_EVENT_SYNC.
/// COIPipelineRunFunction passing in NULL is the equivalent of
/// COI_EVENT_ASYNC.
/// Note that passing COI_EVENT_ASYNC can be used when the caller wishes the /// Note that passing COI_EVENT_ASYNC can be used when the caller wishes the
/// operation to be performed asynchronously but does not care when the /// operation to be performed asynchronously but does not care when the
/// operation completes. This can be useful for opertions that by definition /// operation completes. This can be useful for operations that by definition
/// must complete in order (DMAs, run functions on a single pipeline). If /// must complete in order (DMAs, run functions on a single pipeline). If
/// the caller does care when the operation completes then they should pass /// the caller does care when the operation completes then they should pass
/// in a valid completion event which they can later wait on. /// in a valid completion event which they can later wait on.
...@@ -72,6 +70,16 @@ extern "C" { ...@@ -72,6 +70,16 @@ extern "C" {
#define COI_EVENT_ASYNC ((COIEVENT*)1) #define COI_EVENT_ASYNC ((COIEVENT*)1)
#define COI_EVENT_SYNC ((COIEVENT*)2) #define COI_EVENT_SYNC ((COIEVENT*)2)
//////////////////////////////////////////////////////////////////////////////
///
/// This can be used to initialize a COIEVENT to a known invalid state.
/// This is not required to use, but can be useful in some cases
/// if a program is unsure if the event will be initialized by the runtime.
/// Simply set the event to this value: COIEVENT event = COI_EVENT_INITIALIZER;
///
#define COI_EVENT_INITIALIZER { { 0, -1 } }
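A minimal sketch of how such an initializer might be used to tell "never touched by the runtime" apart from a populated event. It assumes the event is a struct wrapping two 64-bit words, which is what the shape of `{ { 0, -1 } }` suggests; `toy_event`, `TOY_EVENT_INITIALIZER` and `toy_event_is_unset` are hypothetical stand-ins so the example compiles without the COI headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, standing in for the real COIEVENT. */
typedef struct toy_event {
    uint64_t opaque[2];
} toy_event;

/* Mirrors the shape of COI_EVENT_INITIALIZER: { { 0, -1 } }. */
#define TOY_EVENT_INITIALIZER { { 0, (uint64_t)-1 } }

/* Hypothetical helper: true while the event still holds the sentinel. */
static bool toy_event_is_unset(const toy_event *e)
{
    assert(e != NULL);
    return e->opaque[0] == 0 && e->opaque[1] == UINT64_MAX;
}
```

A caller could initialize the event this way before handing its address to an API that may or may not fill it in, then check the sentinel before deciding whether to wait on it.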
/////////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////////
/// ///
/// Wait for an arbitrary number of COIEVENTs to be signaled as completed, /// Wait for an arbitrary number of COIEVENTs to be signaled as completed,
...@@ -103,7 +111,7 @@ extern "C" { ...@@ -103,7 +111,7 @@ extern "C" {
/// is 1 or in_WaitForAll = True, this parameter is optional. /// is 1 or in_WaitForAll = True, this parameter is optional.
/// ///
/// @param out_pSignaledIndices /// @param out_pSignaledIndices
/// [out] Pointer to an array of indicies into the original event /// [out] Pointer to an array of indices into the original event
/// array. Those denoted have been signaled. The user must provide an /// array. Those denoted have been signaled. The user must provide an
/// array that is no smaller than the in_Events array. If in_NumEvents /// array that is no smaller than the in_Events array. If in_NumEvents
/// is 1 or in_WaitForAll = True, this parameter is optional. /// is 1 or in_WaitForAll = True, this parameter is optional.
...@@ -132,6 +140,10 @@ extern "C" { ...@@ -132,6 +140,10 @@ extern "C" {
/// @return COI_PROCESS_DIED if the remote process died. See COIProcessDestroy /// @return COI_PROCESS_DIED if the remote process died. See COIProcessDestroy
/// for more details. /// for more details.
/// ///
/// @return COI_<REAL ERROR> if only a single event is passed in, and that event
/// failed, COI will attempt to return the real error code that caused
/// the original operation to fail, otherwise COI_PROCESS_DIED is reported.
///
COIACCESSAPI COIACCESSAPI
COIRESULT COIRESULT
COIEventWait( COIEventWait(
...@@ -183,6 +195,103 @@ COIRESULT ...@@ -183,6 +195,103 @@ COIRESULT
COIEventUnregisterUserEvent( COIEventUnregisterUserEvent(
COIEVENT in_Event); COIEVENT in_Event);
//////////////////////////////////////////////////////////////////////////////
///
/// A callback that will be invoked to notify the user of an internal
/// runtime event completion.
///
/// As with any callback mechanism it is up to the user to make sure that
/// there are no possible deadlocks due to reentrancy (ie the callback being
/// invoked in the same context that triggered the notification) and also
/// that the callback does not slow down overall processing. If the user
/// performs too much work within the callback it could delay further
/// processing. The callback will be invoked prior to the signaling of
/// the corresponding COIEvent. For example, if a user is waiting
/// for a COIEvent associated with a run function completing they will
/// receive the callback before the COIEvent is marked as signaled.
///
/// @param in_Event
/// [in] The completion event that is associated with the
/// operation that is being notified.
///
/// @param in_Result
/// [in] The COIRESULT of the operation.
///
/// @param in_UserData
/// [in] Opaque data that was provided when the callback was
/// registered. Intel(R) Coprocessor Offload Infrastructure
/// (Intel(R) COI) simply passes this back to the user so that
/// they can interpret it as they choose.
///
typedef void (*COI_EVENT_CALLBACK)(
COIEVENT in_Event,
const COIRESULT in_Result,
const void* in_UserData);
//////////////////////////////////////////////////////////////////////////////
///
/// Registers any COIEVENT to receive a one time callback, when the event
/// is marked complete in the offload runtime. If the event has completed
/// before the COIEventRegisterCallback() is called then the callback will
/// immediately be invoked by the calling thread. When the event is
/// registered before the event completes, the runtime guarantees that
/// the callback will be invoked before COIEventWait() is notified of
/// the same event completing. In well written user code, this may provide
/// a slight performance advantage.
///
/// Users should treat the callback much like an interrupt routine with
/// regard to performance: design it to be as short and non-blocking as
/// possible. Since the thread that runs the callback is non-deterministic,
/// blocking or stalling the callback may have severe performance impacts
/// on the offload runtime. Thus, it is important to not
/// create deadlocks between the callback and other signaling/waiting
/// mechanisms. It is recommended to never invoke COIEventWait() inside
/// a callback function, as this could lead to immediate deadlocks.
///
/// It is important to note that the runtime cannot distinguish between
/// already triggered events and invalid events. Thus the user needs to pass
/// in a valid event, or the callback will be invoked immediately.
/// Failed events will still receive a callback and the user can query
/// COIEventWait() after the callback for the failed return code.
///
/// If more than one callback is registered for the same event, only the
/// single most current callback will be used, i.e. the older one will
/// be replaced.
///
/// @param in_Event
/// [in] A valid single event handle to be registered to receive a callback.
///
/// @param in_Callback
/// [in] Pointer to a user function used to signal an
/// event completion.
///
/// @param in_UserData
/// [in] Opaque data to pass to the callback when it is invoked.
///
/// @param in_Flags
/// [in] Reserved parameter for future expansion, required to be zero for now.
///
/// @return COI_INVALID_HANDLE if in_Event is not a valid COIEVENT
///
/// @return COI_INVALID_HANDLE if in_Callback is not a valid pointer.
///
/// @return COI_ARGUMENT_MISMATCH if the in_Flags is not zero.
///
/// @return COI_SUCCESS if the event is successfully registered.
///
COIACCESSAPI
COIRESULT
COIEventRegisterCallback(
const COIEVENT in_Event,
COI_EVENT_CALLBACK in_Callback,
const void* in_UserData,
const uint64_t in_Flags);
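The registration rules documented for COIEventRegisterCallback() (flags must be zero, a later registration replaces an earlier one, an already completed event triggers the callback immediately on the calling thread, and the callback runs before waiters see the event as signaled) can be illustrated with a small single-threaded model. Everything here is a hypothetical stand-in (`toy_event`, `toy_result`, `toy_register_callback`, `toy_complete`); it does not call or reproduce the real runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are COI types or result codes. */
typedef enum { TOY_SUCCESS, TOY_INVALID_HANDLE, TOY_ARGUMENT_MISMATCH } toy_result;
typedef void (*toy_callback)(int event_id, toy_result result, const void *user_data);

typedef struct toy_event {
    int          id;
    bool         completed;
    bool         signaled;  /* what a waiter would observe */
    toy_result   result;    /* outcome of the finished operation */
    toy_callback callback;  /* at most one registered callback */
    const void  *user_data;
} toy_event;

static toy_result toy_register_callback(toy_event *ev, toy_callback cb,
                                        const void *user_data, uint64_t flags)
{
    if (ev == NULL || cb == NULL)
        return TOY_INVALID_HANDLE;
    if (flags != 0)
        return TOY_ARGUMENT_MISMATCH;      /* reserved, must be zero */
    if (ev->completed) {
        cb(ev->id, ev->result, user_data); /* already done: call right away */
        return TOY_SUCCESS;
    }
    ev->callback = cb;                     /* replaces any earlier registration */
    ev->user_data = user_data;
    return TOY_SUCCESS;
}

static void toy_complete(toy_event *ev, toy_result result)
{
    assert(ev != NULL);
    ev->completed = true;
    ev->result = result;
    if (ev->callback != NULL) {
        toy_callback cb = ev->callback;
        ev->callback = NULL;               /* one-time callback */
        cb(ev->id, result, ev->user_data); /* runs before waiters are released */
    }
    ev->signaled = true;
}

/* Example callback: bumps a counter passed through user_data. */
static void toy_count_callback(int event_id, toy_result result, const void *user_data)
{
    (void)event_id;
    (void)result;
    ++*(int *)user_data;
}
```

The example callback only increments a counter, in line with the advice to keep callbacks short and non-blocking.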
#ifdef __cplusplus #ifdef __cplusplus
} /* extern "C" */ } /* extern "C" */
#endif #endif
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -60,11 +60,12 @@ extern "C" { ...@@ -60,11 +60,12 @@ extern "C" {
////////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////////
/// These flags specify how a buffer will be used within a run function. They /// These flags specify how a buffer will be used within a run function. They
/// allow Intel® Coprocessor Offload Infrastructure (Intel® COI) to make optimizations in how it moves data around the system. /// allow the runtime to make optimizations in how it moves the data around.
/// These flags can affect the correctness of an application, so they must be /// These flags can affect the correctness of an application, so they must be
/// set properly. For example, if a buffer is used in a run function with the /// set properly. For example, if a buffer is used in a run function with the
/// COI_SINK_READ flag and then mapped on the source, Intel® Coprocessor Offload Infrastructure (Intel® COI) may use a previously /// COI_SINK_READ flag and then mapped on the source, the runtime may use a
/// cached version of the buffer instead of retrieving data from the sink. /// previously cached version of the buffer instead of retrieving data from
/// the sink.
typedef enum COI_ACCESS_FLAGS typedef enum COI_ACCESS_FLAGS
{ {
/// Specifies that the run function will only read the associated buffer. /// Specifies that the run function will only read the associated buffer.
...@@ -76,7 +77,23 @@ typedef enum COI_ACCESS_FLAGS ...@@ -76,7 +77,23 @@ typedef enum COI_ACCESS_FLAGS
/// Specifies that the run function will overwrite the entire associated /// Specifies that the run function will overwrite the entire associated
/// buffer and therefore the buffer will not be synchronized with the /// buffer and therefore the buffer will not be synchronized with the
/// source before execution. /// source before execution.
COI_SINK_WRITE_ENTIRE COI_SINK_WRITE_ENTIRE,
/// Specifies that the run function will only read the associated buffer
/// and will maintain the reference count on the buffer after
/// run function exit.
COI_SINK_READ_ADDREF,
/// Specifies that the run function will write to the associated buffer
/// and will maintain the reference count on the buffer after
/// run function exit.
COI_SINK_WRITE_ADDREF,
/// Specifies that the run function will overwrite the entire associated
/// buffer and therefore the buffer will not be synchronized with the
/// source before execution and will maintain the reference count on the
/// buffer after run function exit.
COI_SINK_WRITE_ENTIRE_ADDREF
} COI_ACCESS_FLAGS; } COI_ACCESS_FLAGS;
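One way to read the six access flags is as combinations of three independent properties: whether the sink may write, whether the buffer is synchronized with the source before the run function executes, and whether the reference count is kept after the run function exits. The sketch below encodes that reading with a hypothetical `toy_access` enum that mirrors the flag names; the numeric values and the helper functions are assumptions made for illustration, not taken from the header.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the flag names; the values are not the header's. */
typedef enum toy_access {
    TOY_SINK_READ,
    TOY_SINK_WRITE,
    TOY_SINK_WRITE_ENTIRE,
    TOY_SINK_READ_ADDREF,
    TOY_SINK_WRITE_ADDREF,
    TOY_SINK_WRITE_ENTIRE_ADDREF
} toy_access;

/* True for the *_ADDREF variants, which keep the reference after exit. */
static bool toy_keeps_reference(toy_access f)
{
    assert((int)f <= (int)TOY_SINK_WRITE_ENTIRE_ADDREF);
    return f == TOY_SINK_READ_ADDREF || f == TOY_SINK_WRITE_ADDREF ||
           f == TOY_SINK_WRITE_ENTIRE_ADDREF;
}

/* False for the WRITE_ENTIRE variants, which skip the pre-run sync. */
static bool toy_syncs_before_run(toy_access f)
{
    return f != TOY_SINK_WRITE_ENTIRE && f != TOY_SINK_WRITE_ENTIRE_ADDREF;
}

/* False for the READ variants, where the run function only reads. */
static bool toy_sink_may_write(toy_access f)
{
    return f != TOY_SINK_READ && f != TOY_SINK_READ_ADDREF;
}
```

Under this reading, each new `_ADDREF` flag behaves like its base flag plus a retained reference, so the buffer would presumably still need a later release before its device memory is freed.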
#define COI_PIPELINE_MAX_PIPELINES 512 #define COI_PIPELINE_MAX_PIPELINES 512
...@@ -86,7 +103,7 @@ typedef enum COI_ACCESS_FLAGS ...@@ -86,7 +103,7 @@ typedef enum COI_ACCESS_FLAGS
/////////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////////
/// ///
/// Create a pipeline assoiated with a remote process. This pipeline can /// Create a pipeline associated with a remote process. This pipeline can
/// then be used to execute remote functions and to share data using /// then be used to execute remote functions and to share data using
/// COIBuffers. /// COIBuffers.
/// ///
...@@ -149,7 +166,7 @@ COIPipelineCreate( ...@@ -149,7 +166,7 @@ COIPipelineCreate(
/////////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////////
/// ///
/// Destroys the inidicated pipeline, releasing its resources. /// Destroys the indicated pipeline, releasing its resources.
/// ///
/// @param in_Pipeline /// @param in_Pipeline
/// [in] Pipeline to destroy. /// [in] Pipeline to destroy.
...@@ -175,22 +192,21 @@ COIPipelineDestroy( ...@@ -175,22 +192,21 @@ COIPipelineDestroy(
/// ///
/// 1. Proper care has to be taken while setting the input dependencies for /// 1. Proper care has to be taken while setting the input dependencies for
/// RunFunctions. Setting it incorrectly can lead to cyclic dependencies /// RunFunctions. Setting it incorrectly can lead to cyclic dependencies
/// and can cause the respective pipeline (as a result Intel® Coprocessor Offload Infrastructure (Intel® COI) Runtime) to /// and can cause the respective pipeline to stall.
/// stall.
/// 2. RunFunctions can also segfault if enough memory space is not available /// 2. RunFunctions can also segfault if enough memory space is not available
/// on the sink for the buffers passed in. Pinned buffers and buffers that /// on the sink for the buffers passed in. Pinned buffers and buffers that
/// are AddRef'd need to be accounted for available memory space. In other /// are AddRef'd need to be accounted for available memory space. In other
/// words, this memory is not available for use until it is freed up. /// words, this memory is not available for use until it is freed up.
/// 3. Unexpected segmentation faults or erroneous behaviour can occur if /// 3. Unexpected segmentation faults or erroneous behavior can occur if
/// handles or data passed in to Runfunction gets destroyed before the /// handles or data passed in to Runfunction gets destroyed before the
/// RunFunction finishes. /// RunFunction finishes.
/// For example, if a variable passed in as Misc data or the buffer gets /// For example, if a variable passed in as Misc data or the buffer gets
/// destroyed before the Intel® Coprocessor Offload Infrastructure (Intel® COI) runtime receives the completion notification /// destroyed before the runtime receives the completion notification
/// of the Runfunction, it can cause unexpected behaviour. So it is always /// of the Runfunction, it can cause unexpected behavior. So it is always
/// recommended to wait for RunFunction completion event before any related /// recommended to wait for RunFunction completion event before any related
/// destroy event occurs. /// destroy event occurs.
/// ///
/// Intel® Coprocessor Offload Infrastructure (Intel® COI) Runtime expects users to handle such scenarios. COIPipelineRunFunction /// The runtime expects users to handle such scenarios. COIPipelineRunFunction
/// returns COI_SUCCESS for above cases because it was queued up successfully. /// returns COI_SUCCESS for above cases because it was queued up successfully.
/// Also if you try to destroy a pipeline with a stalled function then the /// Also if you try to destroy a pipeline with a stalled function then the
/// destroy call will hang. COIPipelineDestroy waits until all the functions /// destroy call will hang. COIPipelineDestroy waits until all the functions
...@@ -251,7 +267,7 @@ COIPipelineDestroy( ...@@ -251,7 +267,7 @@ COIPipelineDestroy(
/// @param out_pAsyncReturnValue /// @param out_pAsyncReturnValue
/// [out] Pointer to user-allocated memory where the return value from /// [out] Pointer to user-allocated memory where the return value from
/// the run function will be placed. This memory should not be read /// the run function will be placed. This memory should not be read
/// until out_pCompletion has been signalled. /// until out_pCompletion has been signaled.
/// ///
/// @param in_AsyncReturnValueLen /// @param in_AsyncReturnValueLen
/// [in] Size of the out_pAsyncReturnValue in bytes. /// [in] Size of the out_pAsyncReturnValue in bytes.
...@@ -259,8 +275,11 @@ COIPipelineDestroy( ...@@ -259,8 +275,11 @@ COIPipelineDestroy(
/// @param out_pCompletion /// @param out_pCompletion
/// [out] An optional pointer to a COIEVENT object /// [out] An optional pointer to a COIEVENT object
/// that will be signaled when this run function has completed /// that will be signaled when this run function has completed
/// execution. The user may pass in NULL if they do not wish to signal /// execution. The user may pass in NULL if they wish for this function
/// any COIEVENTs when this run function completes. /// to be synchronous, otherwise if a COIEVENT object is passed in the
/// function is then asynchronous and closes after enqueuing the
/// RunFunction and passes back the COIEVENT that will be signaled
/// once the RunFunction has completed.
/// ///
/// @return COI_SUCCESS if the function was successfully placed in a /// @return COI_SUCCESS if the function was successfully placed in a
/// pipeline for future execution. Note that the actual /// pipeline for future execution. Note that the actual
...@@ -303,14 +322,6 @@ COIPipelineDestroy( ...@@ -303,14 +322,6 @@ COIPipelineDestroy(
/// @return COI_ARGUMENT_MISMATCH if in_pReturnValue is non-NULL but /// @return COI_ARGUMENT_MISMATCH if in_pReturnValue is non-NULL but
/// in_ReturnValueLen is zero. /// in_ReturnValueLen is zero.
/// ///
/// @return COI_ARGUMENT_MISMATCH if a COI_BUFFER_STREAMING_TO_SOURCE buffer
/// is not passed with COI_SINK_WRITE_ENTIRE access flag.
///
/// @return COI_RESOURCE_EXHAUSTED if could not create a version for TO_SOURCE
/// streaming buffer. It can fail if enough memory is not available to
/// register. This call will succeed eventually when the registered
/// memory becomes available.
///
/// @return COI_RETRY if any input buffers, which are not pinned buffers, /// @return COI_RETRY if any input buffers, which are not pinned buffers,
/// are still mapped when passed to the run function. /// are still mapped when passed to the run function.
/// ///
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -463,7 +463,33 @@ extern MyoError myoiTargetSharedMallocTableRegister( ...@@ -463,7 +463,33 @@ extern MyoError myoiTargetSharedMallocTableRegister(
* of the application. The server/card side executable should be * of the application. The server/card side executable should be
* executed only on the second card in this case. * executed only on the second card in this case.
* *
* @param userInitFunc Shared variables and remote funtions are * Another capability for the MyoiUserParams structure in MYO is specifying
* a remote procedure call to be executed on the host or card, immediately after
* myoiLibInit() completes. This capability is useful because some calls in
* MYO return immediately, but do not actually complete until after the MYO
* library is completely initialized on all peers. An example follows,
* showing how to cause MYO to execute the registered function named
* "PostMyoLibInitFunction" on the first card only:
* @code
* MyoiUserParams UserParas[64];
* UserParas[0].type = MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC;
* UserParas[0].nodeid = 1;
* SetPostLibInitFuncName(UserParas[1], "PostMyoLibInitFunction");
* UserParas[2].type = MYOI_USERPARAMS_LAST_MSG;
* if(MYO_SUCCESS != myoiLibInit(&UserParas, (void*)&myoiUserInit)) {
* printf("Failed to initialize MYO runtime\n");
* return -1;
* }
* @endcode
*
* Note, to cause PostMyoLibInitFunction to be executed on ALL cards,
* specify: MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES for the nodeid.
* That is:
* @code
* UserParas[0].nodeid = MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES;
* @endcode
*
* @param userInitFunc Shared variables and remote functions are
* registered in this routine, which is called by the runtime during * registered in this routine, which is called by the runtime during
* library initialization. * library initialization.
* @return * @return
...@@ -473,6 +499,22 @@ extern MyoError myoiTargetSharedMallocTableRegister( ...@@ -473,6 +499,22 @@ extern MyoError myoiTargetSharedMallocTableRegister(
MYOACCESSAPI MYOACCESSAPI
MyoError myoiLibInit(void * in_args, void *userInitFunc /*userInitFunc must be: MyoError (*userInitFunc)(void) */); MyoError myoiLibInit(void * in_args, void *userInitFunc /*userInitFunc must be: MyoError (*userInitFunc)(void) */);
/** @fn extern MyoError myoiSupportsFeature(MyoFeatureType myoFeature)
* @brief Supports runtime query to determine whether a feature is supported
* by the myo that is installed on the system. This function is intended to
* support client code to query the myo library to determine whether its set
* of capabilities are able to support the client's needs.
*
* @param myoFeature The feature that is to be inquired about.
* @return
* MYO_SUCCESS; if the feature is supported.
* MYO_FEATURE_NOT_IMPLEMENTED if the feature is not supported.
*
* (For more information, please also see the declaration of the MyoFeatureType enum.)
**/
MYOACCESSAPI
MyoError myoiSupportsFeature(MyoFeatureType myoFeature);
/** @fn void myoiLibFini() /** @fn void myoiLibFini()
* @brief Finalize the MYO library, all resources held by the runtime are * @brief Finalize the MYO library, all resources held by the runtime are
* released by this routine. * released by this routine.
...@@ -519,18 +561,57 @@ MyoError myoiSetMemConsistent(void *in_pAddr, size_t in_Size); ...@@ -519,18 +561,57 @@ MyoError myoiSetMemConsistent(void *in_pAddr, size_t in_Size);
EXTERN_C MYOACCESSAPI unsigned int myoiMyId; /* MYO_MYID if on accelerators */ EXTERN_C MYOACCESSAPI unsigned int myoiMyId; /* MYO_MYID if on accelerators */
EXTERN_C MYOACCESSAPI volatile int myoiInitFlag; EXTERN_C MYOACCESSAPI volatile int myoiInitFlag;
//! Structure of the array element that is passed to myoiLibInit() to initialize a subset of the available cards, or
//! Structure of the array element that is passed to myoiLibInit() to initialize a subset of the available cards. //! to specify a remote call function to be called after successful myo library initialization:
typedef struct{ typedef struct {
//!type = MYOI_USERPARAMS_DEVID for each element in the array except the last element ; type = MYOI_USERPARAMS_LAST_MSG for the last element in the array. //!type = MYOI_USERPARAMS_DEVID or MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC for each element in the array except
//!the last element, type should be: MYOI_USERPARAMS_LAST_MSG.
int type; int type;
//!nodeid refers to the card index. //! nodeid refers to the 'one-based' card index. Specifying, 1 represents the first card, mic0, 2 represents the
// second card, mic1, 3 represents the third card, mic2, ....).
// NOTE: for type == MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC, specifying MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES
// for nodeid, will execute the named function, on each card in the system, mic0, mic1, mic2, .... micn.
int nodeid; int nodeid;
}MyoiUserParams; } MyoiUserParams;
//!The following two types are dealt with entirely with just one MyoiUserParams structure:
//!MYOI_USERPARAMS_DEVID maps node ids.
#define MYOI_USERPARAMS_DEVID 1 #define MYOI_USERPARAMS_DEVID 1
//!MYOI_USERPARAMS_LAST_MSG terminates the array of MyoiUserParams.
#define MYOI_USERPARAMS_LAST_MSG -1 #define MYOI_USERPARAMS_LAST_MSG -1
//!The following type requires setting the node id in a MyoiUserParams structure, and then following the struct
//!with a MyoiUserParamsPostLibInit union:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC 2
//!nodeid can be one of the following macros, or a number >=1, corresponding to the card number (1 == mic0,
//!2 == mic1, 3 == mic2, ....)
//!Setting nodeid to MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES causes the function to be called on all
//!cards:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_ALL_NODES 0
//!Setting nodeid to MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE causes the function to be called on the
//!host instead of the card:
#define MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE -1
//!The postLibInit union contains two members that serve two different purposes:
//!1. It can be used to stipulate the name of the function to be remotely called from host to card, on successful
//!myo library initialization, (member postLibInitRemoveFuncName) using the type:
//!MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC. OR
//!2. It can be an actual function pointer (member name: postLibInitHostFuncAddress) that will be called on the host,
//!on successful myo library initialization, using the type: MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC, with nodeid:
//!MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC_HOST_NODE
typedef union {
const char *postLibInitRemoveFuncName;
void (*postLibInitHostFuncAddress)(void);
} MyoiUserParamsPostLibInit;
/* These are two macros to help get the information in a MyoiUserParamsPostLibInit union from a MyoiUserParams struct; */
#define GetPostLibInitFuncName(USERPARAMS) ((MyoiUserParamsPostLibInit *) (& (USERPARAMS)))->postLibInitRemoveFuncName
#define GetPostLibInitFuncAddr(USERPARAMS) ((MyoiUserParamsPostLibInit *) (& (USERPARAMS)))->postLibInitHostFuncAddress
/* These are two macros to help set the information in a MyoiUserParamsPostLibInit union from a MyoiUserParams struct; */
#define SetPostLibInitFuncName(USERPARAMS,FUNC_NAME) GetPostLibInitFuncName(USERPARAMS) = FUNC_NAME
#define SetPostLibInitFuncAddr(USERPARAMS,FUNC_ADDR) GetPostLibInitFuncAddr(USERPARAMS) = FUNC_ADDR
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
......
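The layout described in the comments above (a `MyoiUserParams` entry of type `MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC`, followed by one array slot that holds the union payload, followed by a `MYOI_USERPARAMS_LAST_MSG` terminator) can be sketched as below. This is only an illustration under stated assumptions: the struct and the three constants are local replicas of the declarations in the diff, while `set_post_init_name`, `get_post_init_name`, `ParamSummary` and `summarize` are hypothetical helpers that do not exist in MYO. The sketch copies the pointer with `memcpy` rather than using the header's pointer-cast macros, and it does not call `myoiLibInit()`.

```cpp
#include <cstring>
#include <vector>

// Local replicas of the declarations shown in the diff, so that the sketch
// compiles without the real MYO header.
struct MyoiUserParams {
    int type;
    int nodeid;
};
constexpr int MYOI_USERPARAMS_DEVID = 1;
constexpr int MYOI_USERPARAMS_LAST_MSG = -1;
constexpr int MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC = 2;

// The payload (a pointer) must fit in one array slot for this layout to work.
static_assert(sizeof(const char *) <= sizeof(MyoiUserParams),
              "payload pointer must fit in one MyoiUserParams slot");

// Hypothetical helper: store a function-name pointer in the slot that follows
// a POST_MYO_LIB_INIT_FUNC entry (memcpy instead of the header's cast).
inline void set_post_init_name(MyoiUserParams &slot, const char *name) {
    std::memset(&slot, 0, sizeof slot);
    std::memcpy(&slot, &name, sizeof name);
}

// Hypothetical helper: read the pointer back out of the payload slot.
inline const char *get_post_init_name(const MyoiUserParams &slot) {
    const char *name = nullptr;
    std::memcpy(&name, &slot, sizeof name);
    return name;
}

// Hypothetical summary of what a parameter array asks for.
struct ParamSummary {
    std::vector<int> device_nodes;        // one-based card indices (DEVID entries)
    int post_init_node = 0;               // nodeid of the post-init entry, if any
    const char *post_init_name = nullptr; // function name from the payload slot
};

// Walk the array up to the LAST_MSG terminator. A post-init entry consumes
// the following slot as its payload, so that slot is skipped.
inline ParamSummary summarize(const MyoiUserParams *p) {
    ParamSummary s;
    for (; p->type != MYOI_USERPARAMS_LAST_MSG; ++p) {
        if (p->type == MYOI_USERPARAMS_DEVID) {
            s.device_nodes.push_back(p->nodeid);
        } else if (p->type == MYOI_USERPARAMS_POST_MYO_LIB_INIT_FUNC) {
            s.post_init_node = p->nodeid;
            s.post_init_name = get_post_init_name(p[1]);
            ++p; // skip the payload slot
        }
    }
    return s;
}
```

Filling a three-element array the same way as the `@code` example in the header comment (entry, payload, terminator) and passing it to `summarize` yields node 1 and the name "PostMyoLibInitFunction".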
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -75,6 +75,7 @@ typedef enum { ...@@ -75,6 +75,7 @@ typedef enum {
MYO_ALREADY_EXISTS, /*!< Already Exists */ MYO_ALREADY_EXISTS, /*!< Already Exists */
MYO_EOF, /*!< EOF */ MYO_EOF, /*!< EOF */
MYO_FEATURE_NOT_IMPLEMENTED = -1, /*!< Feature not implemented (see myoiSupportsFeature()). */
} MyoError; } MyoError;
...@@ -84,6 +85,40 @@ typedef enum { ...@@ -84,6 +85,40 @@ typedef enum {
MYO_ARENA_OURS, /*!< Arena OURS Ownership */ MYO_ARENA_OURS, /*!< Arena OURS Ownership */
} MyoOwnershipType; } MyoOwnershipType;
/*! MYO Features */
typedef enum {
/*!< EVERY VALUE that is less than MYO_FEATURE_BEGIN is not implemented. */
MYO_FEATURE_BEGIN = 1, /*!< The first feature that is supported. */
MYO_FEATURE_POST_LIB_INIT = MYO_FEATURE_BEGIN, /*!< Allows specifying a function to be executed immediately */
/* after myoiLibInit() completes. This feature was implemented in version */
/* 3.3 of MPSS. */
/* MYO_FEATURE_FUTURE_CAPABILITY = 2, at some time in the future, as new features are added to MYO, new enumeration constants */
/* will be added to the MyoFeatureType, and the value of the new enumeration constant will be greater */
/* than the current value of MYO_FEATURE_LAST constant, and then the MYO_FEATURE_LAST constant too, */
/* will be changed to be the value of the new enumeration constant. For example, in April, 2014, */
/* the POST_LIB_INIT feature was implemented in version 3.3 of MPSS, and the MYO_FEATURE_BEGIN */
/* enumeration constant is the same as the MYO_FEATURE_LAST enumeration constant, and both are equal */
/* to 1. */
/* Suppose in December, 2014, a new feature is added to the MYO library, for version 3.4 of MPSS. */
/* Then, MYO_FEATURE_BEGIN enumeration constant will be still the value 1, but the MYO_FEATURE_LAST */
/* enumeration constant will be set to 2. */
/* At runtime, one client binary can determine if the MYO that is installed is capable of any */
/* capability. For example, suppose a future client binary queries version 3.3 of MYO if it is */
/* capable of some future feature. Version 3.3 of MYO will indicate that the feature is not */
/* implemented to the client. But, conversely, suppose the future client queries version 3.4 of MYO */
/* if it is capable of some future feature. Version 3.4 of MYO will indicate that the feature is */
/* supported. */
/* */
/* Date: | MYO_FEATURE_BEGIN: | MYO_FEATURE_LAST: | MPSS VERSION: | myoiSupportsFeature(MYO_FEATURE_FUTURE_CAPABILITY) */
/* ---------------+---------------------+--------------------+---------------+--------------------------------------------------- */
/* April, 2014 | 1 | 1 | 3.3 | MYO_FEATURE_NOT_IMPLEMENTED */
/* December, 2014 | 1 | 2 | 3.4 | MYO_SUCCESS */
/* ---------------+---------------------+--------------------+---------------+--------------------------------------------------- */
MYO_FEATURE_LAST = MYO_FEATURE_POST_LIB_INIT, /*!< The last feature that is supported. */
/*!< EVERY VALUE that is greater than MYO_FEATURE_LAST is not implemented. */
/*!< EVERY VALUE that is greater than or equal to MYO_FEATURE_BEGIN AND less than or equal to MYO_FEATURE_LAST is implemented. */
} MyoFeatureType; /* (For more information, please also see myoiSupportsFeature() function declaration.) */
/************************************************************* /*************************************************************
* define the property of MYO Arena * define the property of MYO Arena
***********************************************************/ ***********************************************************/
......
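The rule stated in the `MyoFeatureType` comments (a feature is implemented exactly when its value lies between `MYO_FEATURE_BEGIN` and `MYO_FEATURE_LAST`, inclusive) can be sketched as a simple range check. The enum values below are local replicas of those in the diff. `MYO_SUCCESS = 0` is an assumption, since the diff does not show that enumerator, and `supports_feature` is a hypothetical stand-in for illustration, not the actual implementation of `myoiSupportsFeature()`. The `last_feature` parameter models the value that `MYO_FEATURE_LAST` has in a given installed MYO version.

```cpp
// Local replicas of the values shown in the diff.
enum MyoFeatureValue {
    MYO_FEATURE_BEGIN = 1,
    MYO_FEATURE_POST_LIB_INIT = MYO_FEATURE_BEGIN,
    MYO_FEATURE_LAST = MYO_FEATURE_POST_LIB_INIT
};

enum MyoResult {
    MYO_SUCCESS = 0,                  // assumed value, not shown in the diff
    MYO_FEATURE_NOT_IMPLEMENTED = -1  // value taken from the diff
};

// Hypothetical range check: `feature` is the value the client asks about,
// `last_feature` is MYO_FEATURE_LAST as compiled into the installed library.
inline MyoResult supports_feature(int feature, int last_feature) {
    if (feature >= MYO_FEATURE_BEGIN && feature <= last_feature) {
        return MYO_SUCCESS;
    }
    return MYO_FEATURE_NOT_IMPLEMENTED;
}
```

With `last_feature` set to 1 (the MPSS 3.3 row of the table) a query for value 2 is rejected, and with `last_feature` set to 2 (the hypothetical 3.4 row) the same query succeeds, which is the forward-compatibility behaviour the comment describes.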
...@@ -35,7 +35,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config ...@@ -35,7 +35,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config
build_dir = $(top_builddir) build_dir = $(top_builddir)
source_dir = $(top_srcdir) source_dir = $(top_srcdir)
coi_inc_dir = $(top_srcdir)/../include/coi coi_inc_dir = $(top_srcdir)/../include/coi
myo_inc_dir = $(top_srcdir)/../include/myo
include_src_dir = $(top_srcdir)/../../include include_src_dir = $(top_srcdir)/../../include
libgomp_src_dir = $(top_srcdir)/../../libgomp libgomp_src_dir = $(top_srcdir)/../../libgomp
libgomp_dir = $(build_dir)/../../libgomp libgomp_dir = $(build_dir)/../../libgomp
...@@ -53,12 +52,12 @@ target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$ ...@@ -53,12 +52,12 @@ target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$
if PLUGIN_HOST if PLUGIN_HOST
toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la
libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp
libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0 libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0
else # PLUGIN_TARGET else # PLUGIN_TARGET
plugin_includedir = $(libsubincludedir) plugin_includedir = $(libsubincludedir)
plugin_include_HEADERS = main_target_image.h plugin_include_HEADERS = main_target_image.h
AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir) AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
AM_CXXFLAGS = $(CXXFLAGS) AM_CXXFLAGS = $(CXXFLAGS)
AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic
endif endif
......
...@@ -305,7 +305,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config ...@@ -305,7 +305,6 @@ ACLOCAL_AMFLAGS = -I ../.. -I ../../config
build_dir = $(top_builddir) build_dir = $(top_builddir)
source_dir = $(top_srcdir) source_dir = $(top_srcdir)
coi_inc_dir = $(top_srcdir)/../include/coi coi_inc_dir = $(top_srcdir)/../include/coi
myo_inc_dir = $(top_srcdir)/../include/myo
include_src_dir = $(top_srcdir)/../../include include_src_dir = $(top_srcdir)/../../include
libgomp_src_dir = $(top_srcdir)/../../libgomp libgomp_src_dir = $(top_srcdir)/../../libgomp
libgomp_dir = $(build_dir)/../../libgomp libgomp_dir = $(build_dir)/../../libgomp
...@@ -321,11 +320,11 @@ target_build_dir = $(accel_search_dir)/$(accel_target)$(MULTISUBDIR)/liboffloadm ...@@ -321,11 +320,11 @@ target_build_dir = $(accel_search_dir)/$(accel_target)$(MULTISUBDIR)/liboffloadm
target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$(MULTISUBDIR) target_install_dir = $(accel_search_dir)/lib/gcc/$(accel_target)/$(gcc_version)$(MULTISUBDIR)
@PLUGIN_HOST_TRUE@toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la @PLUGIN_HOST_TRUE@toolexeclib_LTLIBRARIES = libgomp-plugin-intelmic.la
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp @PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_SOURCES = libgomp-plugin-intelmic.cpp
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include @PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=1 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_src_dir) -I$(libgomp_dir) -I$(include_src_dir) -I$(target_prefix_dir)/include -I$(target_build_dir) -I$(target_install_dir)/include
@PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0 @PLUGIN_HOST_TRUE@libgomp_plugin_intelmic_la_LDFLAGS = -L$(liboffload_dir)/.libs -loffloadmic_host -version-info 1:0:0
@PLUGIN_HOST_FALSE@plugin_includedir = $(libsubincludedir) @PLUGIN_HOST_FALSE@plugin_includedir = $(libsubincludedir)
@PLUGIN_HOST_FALSE@plugin_include_HEADERS = main_target_image.h @PLUGIN_HOST_FALSE@plugin_include_HEADERS = main_target_image.h
@PLUGIN_HOST_FALSE@AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DMYO_SUPPORT -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(myo_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir) @PLUGIN_HOST_FALSE@AM_CPPFLAGS = $(CPPFLAGS) -DLINUX -DCOI_LIBRARY_VERSION=2 -DOFFLOAD_DEBUG=1 -DSEP_SUPPORT -DTIMING_SUPPORT -DHOST_LIBRARY=0 -I$(coi_inc_dir) -I$(liboffload_src_dir) -I$(libgomp_dir)
@PLUGIN_HOST_FALSE@AM_CXXFLAGS = $(CXXFLAGS) @PLUGIN_HOST_FALSE@AM_CXXFLAGS = $(CXXFLAGS)
@PLUGIN_HOST_FALSE@AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic @PLUGIN_HOST_FALSE@AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -34,7 +34,7 @@ ...@@ -34,7 +34,7 @@
// 1. allocate element of CeanReadRanges type // 1. allocate element of CeanReadRanges type
// 2. initialized it for reading consequently contiguous ranges // 2. initialized it for reading consequently contiguous ranges
// described by "ap" argument // described by "ap" argument
CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap) CeanReadRanges * init_read_ranges_arr_desc(const Arr_Desc *ap)
{ {
CeanReadRanges * res; CeanReadRanges * res;
...@@ -57,6 +57,8 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap) ...@@ -57,6 +57,8 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap)
(ap->rank - rank) * sizeof(CeanReadDim)); (ap->rank - rank) * sizeof(CeanReadDim));
if (res == NULL) if (res == NULL)
LIBOFFLOAD_ERROR(c_malloc); LIBOFFLOAD_ERROR(c_malloc);
res->arr_desc = const_cast<Arr_Desc*>(ap);
res->current_number = 0; res->current_number = 0;
res->range_size = length; res->range_size = length;
res->last_noncont_ind = rank; res->last_noncont_ind = rank;
...@@ -82,7 +84,7 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap) ...@@ -82,7 +84,7 @@ CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap)
return res; return res;
} }
// check if ranges described by 1 argument could be transfered into ranges // check if ranges described by 1 argument could be transferred into ranges
// described by 2-nd one // described by 2-nd one
bool cean_ranges_match( bool cean_ranges_match(
CeanReadRanges * read_rng1, CeanReadRanges * read_rng1,
...@@ -118,7 +120,7 @@ bool get_next_range( ...@@ -118,7 +120,7 @@ bool get_next_range(
return true; return true;
} }
bool is_arr_desc_contiguous(const arr_desc *ap) bool is_arr_desc_contiguous(const Arr_Desc *ap)
{ {
int64_t rank = ap->rank - 1; int64_t rank = ap->rank - 1;
int64_t length = ap->dim[rank].size; int64_t length = ap->dim[rank].size;
...@@ -146,14 +148,22 @@ int64_t cean_get_transf_size(CeanReadRanges * read_rng) ...@@ -146,14 +148,22 @@ int64_t cean_get_transf_size(CeanReadRanges * read_rng)
} }
static uint64_t last_left, last_right; static uint64_t last_left, last_right;
typedef void (*fpp)(const char *spaces, uint64_t low, uint64_t high, int esize);
typedef void (*fpp)(
const char *spaces,
uint64_t low,
uint64_t high,
int esize,
bool print_values
);
static void generate_one_range( static void generate_one_range(
const char *spaces, const char *spaces,
uint64_t lrange, uint64_t lrange,
uint64_t rrange, uint64_t rrange,
fpp fp, fpp fp,
int esize int esize,
bool print_values
) )
{ {
OFFLOAD_TRACE(3, OFFLOAD_TRACE(3,
...@@ -168,20 +178,35 @@ static void generate_one_range( ...@@ -168,20 +178,35 @@ static void generate_one_range(
// Extend previous range, don't print // Extend previous range, don't print
} }
else { else {
(*fp)(spaces, last_left, last_right, esize); (*fp)(spaces, last_left, last_right, esize, print_values);
last_left = lrange; last_left = lrange;
} }
} }
last_right = rrange; last_right = rrange;
} }
static bool element_is_contiguous(
uint64_t rank,
const struct Dim_Desc *ddp
)
{
if (rank == 1) {
return (ddp[0].lower == ddp[0].upper || ddp[0].stride == 1);
}
else {
return ((ddp[0].size == (ddp[1].upper-ddp[1].lower+1)*ddp[1].size) &&
element_is_contiguous(rank-1, ddp++));
}
}
static void generate_mem_ranges_one_rank( static void generate_mem_ranges_one_rank(
const char *spaces, const char *spaces,
uint64_t base, uint64_t base,
uint64_t rank, uint64_t rank,
const struct dim_desc *ddp, const struct Dim_Desc *ddp,
fpp fp, fpp fp,
int esize int esize,
bool print_values
) )
{ {
uint64_t lindex = ddp->lindex; uint64_t lindex = ddp->lindex;
...@@ -194,35 +219,40 @@ static void generate_mem_ranges_one_rank( ...@@ -194,35 +219,40 @@ static void generate_mem_ranges_one_rank(
"generate_mem_ranges_one_rank(base=%p, rank=%lld, lindex=%lld, " "generate_mem_ranges_one_rank(base=%p, rank=%lld, lindex=%lld, "
"lower=%lld, upper=%lld, stride=%lld, size=%lld, esize=%d)\n", "lower=%lld, upper=%lld, stride=%lld, size=%lld, esize=%d)\n",
spaces, (void*)base, rank, lindex, lower, upper, stride, size, esize); spaces, (void*)base, rank, lindex, lower, upper, stride, size, esize);
if (rank == 1) {
if (element_is_contiguous(rank, ddp)) {
uint64_t lrange, rrange; uint64_t lrange, rrange;
if (stride == 1) {
lrange = base + (lower-lindex)*size; lrange = base + (lower-lindex)*size;
rrange = lrange + (upper-lower+1)*size - 1; rrange = lrange + (upper-lower+1)*size - 1;
generate_one_range(spaces, lrange, rrange, fp, esize); generate_one_range(spaces, lrange, rrange, fp, esize, print_values);
} }
else { else {
if (rank == 1) {
for (int i=lower-lindex; i<=upper-lindex; i+=stride) { for (int i=lower-lindex; i<=upper-lindex; i+=stride) {
uint64_t lrange, rrange;
lrange = base + i*size; lrange = base + i*size;
rrange = lrange + size - 1; rrange = lrange + size - 1;
generate_one_range(spaces, lrange, rrange, fp, esize); generate_one_range(spaces, lrange, rrange,
} fp, esize, print_values);
} }
} }
else { else {
for (int i=lower-lindex; i<=upper-lindex; i+=stride) { for (int i=lower-lindex; i<=upper-lindex; i+=stride) {
generate_mem_ranges_one_rank( generate_mem_ranges_one_rank(
spaces, base+i*size, rank-1, ddp+1, fp, esize); spaces, base+i*size, rank-1, ddp+1,
fp, esize, print_values);
} }
} }
}
} }
static void generate_mem_ranges( static void generate_mem_ranges(
const char *spaces, const char *spaces,
const arr_desc *adp, const Arr_Desc *adp,
bool deref, bool deref,
fpp fp fpp fp,
bool print_values
) )
{ {
uint64_t esize; uint64_t esize;
...@@ -241,13 +271,13 @@ static void generate_mem_ranges( ...@@ -241,13 +271,13 @@ static void generate_mem_ranges(
// For c_cean_var the base addr is the address of the data // For c_cean_var the base addr is the address of the data
// For c_cean_var_ptr the base addr is dereferenced to get to the data // For c_cean_var_ptr the base addr is dereferenced to get to the data
spaces, deref ? *((uint64_t*)(adp->base)) : adp->base, spaces, deref ? *((uint64_t*)(adp->base)) : adp->base,
adp->rank, &adp->dim[0], fp, esize); adp->rank, &adp->dim[0], fp, esize, print_values);
(*fp)(spaces, last_left, last_right, esize); (*fp)(spaces, last_left, last_right, esize, print_values);
} }
// returns offset and length of the data to be transferred // returns offset and length of the data to be transferred
void __arr_data_offset_and_length( void __arr_data_offset_and_length(
const arr_desc *adp, const Arr_Desc *adp,
int64_t &offset, int64_t &offset,
int64_t &length int64_t &length
) )
...@@ -284,11 +314,12 @@ void __arr_data_offset_and_length( ...@@ -284,11 +314,12 @@ void __arr_data_offset_and_length(
#if OFFLOAD_DEBUG > 0 #if OFFLOAD_DEBUG > 0
void print_range( static void print_range(
const char *spaces, const char *spaces,
uint64_t low, uint64_t low,
uint64_t high, uint64_t high,
int esize int esize,
bool print_values
) )
{ {
char buffer[1024]; char buffer[1024];
...@@ -297,7 +328,7 @@ void print_range( ...@@ -297,7 +328,7 @@ void print_range(
OFFLOAD_TRACE(3, "%s print_range(low=%p, high=%p, esize=%d)\n", OFFLOAD_TRACE(3, "%s print_range(low=%p, high=%p, esize=%d)\n",
spaces, (void*)low, (void*)high, esize); spaces, (void*)low, (void*)high, esize);
if (console_enabled < 4) { if (console_enabled < 4 || !print_values) {
return; return;
} }
OFFLOAD_TRACE(4, "%s values:\n", spaces); OFFLOAD_TRACE(4, "%s values:\n", spaces);
...@@ -340,8 +371,9 @@ void print_range( ...@@ -340,8 +371,9 @@ void print_range(
void __arr_desc_dump( void __arr_desc_dump(
const char *spaces, const char *spaces,
const char *name, const char *name,
const arr_desc *adp, const Arr_Desc *adp,
bool deref bool deref,
bool print_values
) )
{ {
OFFLOAD_TRACE(2, "%s%s CEAN expression %p\n", spaces, name, adp); OFFLOAD_TRACE(2, "%s%s CEAN expression %p\n", spaces, name, adp);
...@@ -360,7 +392,7 @@ void __arr_desc_dump( ...@@ -360,7 +392,7 @@ void __arr_desc_dump(
} }
// For c_cean_var the base addr is the address of the data // For c_cean_var the base addr is the address of the data
// For c_cean_var_ptr the base addr is dereferenced to get to the data // For c_cean_var_ptr the base addr is dereferenced to get to the data
generate_mem_ranges(spaces, adp, deref, &print_range); generate_mem_ranges(spaces, adp, deref, &print_range, print_values);
} }
} }
#endif // OFFLOAD_DEBUG #endif // OFFLOAD_DEBUG
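The range generation in this file walks a strided section and, in `generate_one_range`, extends the previous range instead of emitting a new one when the next range starts right after it. A minimal one-dimensional sketch of that idea follows. It is not the library's code: `Range`, `Dim`, `add_range` and `section_ranges` are hypothetical names, `Dim` only mirrors the fields of `Dim_Desc` that the diff uses, and the sketch collects ranges into a vector rather than calling a print callback. It assumes a positive stride.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Inclusive byte range [first, second].
using Range = std::pair<uint64_t, uint64_t>;

// Hypothetical mirror of the Dim_Desc fields used by the range generator.
struct Dim {
    int64_t size;    // length of one element in bytes
    int64_t lindex;  // lower index of the array dimension
    int64_t lower;   // lower section bound
    int64_t upper;   // upper section bound
    int64_t stride;  // stride, assumed > 0 in this sketch
};

// Append [l, r], merging it into the previous range when it starts exactly
// one byte after that range ends.
inline void add_range(std::vector<Range> &out, uint64_t l, uint64_t r) {
    if (!out.empty() && out.back().second + 1 == l) {
        out.back().second = r;  // extend the previous range
    } else {
        out.emplace_back(l, r);
    }
}

// Byte ranges touched by a one-dimensional section starting at `base`.
inline std::vector<Range> section_ranges(uint64_t base, const Dim &d) {
    std::vector<Range> out;
    for (int64_t i = d.lower - d.lindex; i <= d.upper - d.lindex; i += d.stride) {
        uint64_t l = base + static_cast<uint64_t>(i) * static_cast<uint64_t>(d.size);
        add_range(out, l, l + static_cast<uint64_t>(d.size) - 1);
    }
    return out;
}
```

With stride 1 every element is adjacent to the previous one, so the whole section collapses into a single range. With a larger stride each element stays a separate range.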
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -32,9 +32,10 @@ ...@@ -32,9 +32,10 @@
#define CEAN_UTIL_H_INCLUDED #define CEAN_UTIL_H_INCLUDED
#include <stdint.h> #include <stdint.h>
#include "offload_util.h"
// CEAN expression representation // CEAN expression representation
struct dim_desc { struct Dim_Desc {
int64_t size; // Length of data type int64_t size; // Length of data type
int64_t lindex; // Lower index int64_t lindex; // Lower index
int64_t lower; // Lower section bound int64_t lower; // Lower section bound
...@@ -42,10 +43,10 @@ struct dim_desc { ...@@ -42,10 +43,10 @@ struct dim_desc {
int64_t stride; // Stride int64_t stride; // Stride
}; };
struct arr_desc { struct Arr_Desc {
int64_t base; // Base address int64_t base; // Base address
int64_t rank; // Rank of array int64_t rank; // Rank of array
dim_desc dim[1]; Dim_Desc dim[1];
}; };
struct CeanReadDim { struct CeanReadDim {
...@@ -55,6 +56,7 @@ struct CeanReadDim { ...@@ -55,6 +56,7 @@ struct CeanReadDim {
}; };
struct CeanReadRanges { struct CeanReadRanges {
Arr_Desc* arr_desc;
void * ptr; void * ptr;
int64_t current_number; // the number of ranges read int64_t current_number; // the number of ranges read
int64_t range_max_number; // number of contiguous ranges int64_t range_max_number; // number of contiguous ranges
...@@ -66,23 +68,23 @@ struct CeanReadRanges { ...@@ -66,23 +68,23 @@ struct CeanReadRanges {
// array descriptor length // array descriptor length
#define __arr_desc_length(rank) \ #define __arr_desc_length(rank) \
(sizeof(int64_t) + sizeof(dim_desc) * (rank)) (sizeof(int64_t) + sizeof(Dim_Desc) * (rank))
// returns offset and length of the data to be transferred // returns offset and length of the data to be transferred
void __arr_data_offset_and_length(const arr_desc *adp, DLL_LOCAL void __arr_data_offset_and_length(const Arr_Desc *adp,
int64_t &offset, int64_t &offset,
int64_t &length); int64_t &length);
// define if data array described by argument is contiguous one // define if data array described by argument is contiguous one
bool is_arr_desc_contiguous(const arr_desc *ap); DLL_LOCAL bool is_arr_desc_contiguous(const Arr_Desc *ap);
// allocate element of CeanReadRanges type initialized // allocate element of CeanReadRanges type initialized
// to read consequently contiguous ranges described by "ap" argument // to read consequently contiguous ranges described by "ap" argument
CeanReadRanges * init_read_ranges_arr_desc(const arr_desc *ap); DLL_LOCAL CeanReadRanges * init_read_ranges_arr_desc(const Arr_Desc *ap);
// check if ranges described by 1 argument could be transfered into ranges // check if ranges described by 1 argument could be transferred into ranges
// described by 2-nd one // described by 2-nd one
bool cean_ranges_match( DLL_LOCAL bool cean_ranges_match(
CeanReadRanges * read_rng1, CeanReadRanges * read_rng1,
CeanReadRanges * read_rng2 CeanReadRanges * read_rng2
); );
...@@ -90,27 +92,27 @@ bool cean_ranges_match( ...@@ -90,27 +92,27 @@ bool cean_ranges_match(
// first argument - returned value by call to init_read_ranges_arr_desc. // first argument - returned value by call to init_read_ranges_arr_desc.
// returns true if offset and length of next range is set successfuly. // returns true if offset and length of next range is set successfuly.
// returns false if the ranges is over. // returns false if the ranges is over.
bool get_next_range( DLL_LOCAL bool get_next_range(
CeanReadRanges * read_rng, CeanReadRanges * read_rng,
int64_t *offset int64_t *offset
); );
// returns number of transfered bytes // returns number of transferred bytes
int64_t cean_get_transf_size(CeanReadRanges * read_rng); DLL_LOCAL int64_t cean_get_transf_size(CeanReadRanges * read_rng);
#if OFFLOAD_DEBUG > 0 #if OFFLOAD_DEBUG > 0
// prints array descriptor contents to stderr // prints array descriptor contents to stderr
void __arr_desc_dump( DLL_LOCAL void __arr_desc_dump(
const char *spaces, const char *spaces,
const char *name, const char *name,
const arr_desc *adp, const Arr_Desc *adp,
bool dereference); bool dereference,
bool print_values);
#define ARRAY_DESC_DUMP(spaces, name, adp, dereference, print_values) \
if (console_enabled >= 2) \
__arr_desc_dump(spaces, name, adp, dereference, print_values);
#else #else
#define __arr_desc_dump( #define ARRAY_DESC_DUMP(spaces, name, adp, dereference, print_values)
spaces,
name,
adp,
dereference)
#endif // OFFLOAD_DEBUG #endif // OFFLOAD_DEBUG
#endif // CEAN_UTIL_H_INCLUDED #endif // CEAN_UTIL_H_INCLUDED
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -50,6 +50,13 @@ COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*, const void*, ...@@ -50,6 +50,13 @@ COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*, const void*,
const char**, uint8_t, const char*, const char**, uint8_t, const char*,
uint64_t, const char*, const char*, uint64_t, const char*, const char*,
uint64_t, COIPROCESS*); uint64_t, COIPROCESS*);
COIRESULT (*ProcessCreateFromFile)(COIENGINE, const char*,
int, const char**, uint8_t,
const char**, uint8_t, const char*,
uint64_t, const char*,COIPROCESS*);
COIRESULT (*ProcessSetCacheSize)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*);
COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t, int8_t*, uint32_t*); COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t, int8_t*, uint32_t*);
COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t, const char**, COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t, const char**,
COIFUNCTION*); COIFUNCTION*);
...@@ -57,6 +64,8 @@ COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS, const void*, uint64_t, ...@@ -57,6 +64,8 @@ COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS, const void*, uint64_t,
const char*, const char*, const char*, const char*,
const char*, uint64_t, uint32_t, const char*, uint64_t, uint32_t,
COILIBRARY*); COILIBRARY*);
COIRESULT (*ProcessUnloadLibrary)(COIPROCESS,
COILIBRARY);
COIRESULT (*ProcessRegisterLibraries)(uint32_t, const void**, const uint64_t*, COIRESULT (*ProcessRegisterLibraries)(uint32_t, const void**, const uint64_t*,
const char**, const uint64_t*); const char**, const uint64_t*);
...@@ -80,6 +89,13 @@ COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*, uint64_t, ...@@ -80,6 +89,13 @@ COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*); COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t, COI_COPY_TYPE, COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*); uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferReadMultiD)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferWriteMultiD)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t, COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*); COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*); COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*);
...@@ -92,6 +108,20 @@ COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t, uint8_t, uint32_t*, ...@@ -92,6 +108,20 @@ COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t, uint8_t, uint32_t*,
uint64_t (*PerfGetCycleFrequency)(void); uint64_t (*PerfGetCycleFrequency)(void);
COIRESULT (*PipelineClearCPUMask) (COI_CPU_MASK);
COIRESULT (*PipelineSetCPUMask) (COIPROCESS, uint32_t,
uint8_t, COI_CPU_MASK);
COIRESULT (*EngineGetInfo)(COIENGINE, uint32_t, COI_ENGINE_INFO*);
COIRESULT (*EventRegisterCallback)(
const COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t);
COIRESULT (*ProcessConfigureDMA)(const uint64_t, const int);
bool init(void) bool init(void)
{ {
#ifndef TARGET_WINNT #ifndef TARGET_WINNT
...@@ -140,6 +170,32 @@ bool init(void) ...@@ -140,6 +170,32 @@ bool init(void)
return false; return false;
} }
ProcessSetCacheSize =
(COIRESULT (*)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIProcessSetCacheSize", COI_VERSION1);
if (ProcessSetCacheSize == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessSetCacheSize");
#if 0 // for now disable as ProcessSetCacheSize is not available on < MPSS 3.4
fini();
return false;
#endif
}
ProcessCreateFromFile =
(COIRESULT (*)(COIENGINE, const char*, int, const char**, uint8_t,
const char**, uint8_t, const char*, uint64_t,
const char*, COIPROCESS*))
DL_sym(lib_handle, "COIProcessCreateFromFile", COI_VERSION1);
if (ProcessCreateFromFile == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessCreateFromFile");
fini();
return false;
}
ProcessDestroy = ProcessDestroy =
(COIRESULT (*)(COIPROCESS, int32_t, uint8_t, int8_t*, (COIRESULT (*)(COIPROCESS, int32_t, uint8_t, int8_t*,
uint32_t*)) uint32_t*))
...@@ -173,6 +229,17 @@ bool init(void) ...@@ -173,6 +229,17 @@ bool init(void)
return false; return false;
} }
ProcessUnloadLibrary =
(COIRESULT (*)(COIPROCESS,
COILIBRARY))
DL_sym(lib_handle, "COIProcessUnloadLibrary", COI_VERSION1);
if (ProcessUnloadLibrary == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIProcessUnloadLibrary");
fini();
return false;
}
ProcessRegisterLibraries = ProcessRegisterLibraries =
(COIRESULT (*)(uint32_t, const void**, const uint64_t*, const char**, (COIRESULT (*)(uint32_t, const void**, const uint64_t*, const char**,
const uint64_t*)) const uint64_t*))
...@@ -295,6 +362,22 @@ bool init(void) ...@@ -295,6 +362,22 @@ bool init(void)
return false; return false;
} }
BufferReadMultiD =
(COIRESULT (*)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIBufferReadMultiD", COI_VERSION1);
// We accept that coi library has no COIBufferReadMultiD routine.
// So there is no check for zero value
BufferWriteMultiD =
(COIRESULT (*)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*))
DL_sym(lib_handle, "COIBufferWriteMultiD", COI_VERSION1);
// We accept that coi library has no COIBufferWriteMultiD routine.
// So there is no check for zero value
BufferCopy = BufferCopy =
(COIRESULT (*)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t, (COIRESULT (*)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, uint64_t,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COI_COPY_TYPE, uint32_t, const COIEVENT*,
...@@ -350,6 +433,47 @@ bool init(void) ...@@ -350,6 +433,47 @@ bool init(void)
return false; return false;
} }
PipelineClearCPUMask =
(COIRESULT (*)(COI_CPU_MASK))
DL_sym(lib_handle, "COIPipelineClearCPUMask", COI_VERSION1);
if (PipelineClearCPUMask == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIPipelineClearCPUMask");
fini();
return false;
}
PipelineSetCPUMask =
(COIRESULT (*)(COIPROCESS, uint32_t,uint8_t, COI_CPU_MASK))
DL_sym(lib_handle, "COIPipelineSetCPUMask", COI_VERSION1);
if (PipelineSetCPUMask == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIPipelineSetCPUMask");
fini();
return false;
}
EngineGetInfo =
(COIRESULT (*)(COIENGINE, uint32_t, COI_ENGINE_INFO*))
DL_sym(lib_handle, "COIEngineGetInfo", COI_VERSION1);
if (EngineGetInfo == 0) {
OFFLOAD_DEBUG_TRACE(2, "Failed to find %s in COI library\n",
"COIEngineGetInfo");
fini();
return false;
}
EventRegisterCallback =
(COIRESULT (*)(COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t))
DL_sym(lib_handle, "COIEventRegisterCallback", COI_VERSION1);
ProcessConfigureDMA =
(COIRESULT (*)(const uint64_t, const int))
DL_sym(lib_handle, "COIProcessConfigureDMA", COI_VERSION1);
is_available = true; is_available = true;
return true; return true;
......
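The init() hunks above treat some COI entry points as required (a missing symbol makes init() call fini() and return false) and others, such as COIBufferReadMultiD and COIBufferWriteMultiD, as optional (no null check after the lookup). The following is a minimal sketch of that policy, not the library's code. It assumes a std::map as a stand-in for the loaded library, and SymbolTable, lookup, CoiEntryPoints and resolve are hypothetical names introduced only for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a loaded shared library: symbol name -> address.
using SymbolTable = std::map<std::string, void*>;

// Hypothetical helper playing the role of DL_sym(); yields nullptr when the
// symbol is absent.
static void* lookup(const SymbolTable& lib, const std::string& name) {
    auto it = lib.find(name);
    return it == lib.end() ? nullptr : it->second;
}

// Hypothetical holder for two resolved entry points.
struct CoiEntryPoints {
    void* engine_get_info = nullptr;      // required
    void* buffer_read_multi_d = nullptr;  // optional, may stay null
};

// Returns false when a required symbol is missing. Optional symbols are
// looked up without a check, so callers must test them before use.
static bool resolve(const SymbolTable& lib, CoiEntryPoints& out) {
    out.engine_get_info = lookup(lib, "COIEngineGetInfo");
    if (out.engine_get_info == nullptr) {
        return false;  // check the pointer that was just assigned
    }
    out.buffer_read_multi_d = lookup(lib, "COIBufferReadMultiD");
    return true;
}
```

With this shape, a caller that wants the multi-dimensional read path would first test buffer_read_multi_d against nullptr and otherwise fall back to the plain read.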
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -28,7 +28,7 @@ ...@@ -28,7 +28,7 @@
*/ */
// The interface betwen offload library and the COI API on the host // The interface between offload library and the COI API on the host
#ifndef COI_CLIENT_H_INCLUDED #ifndef COI_CLIENT_H_INCLUDED
#define COI_CLIENT_H_INCLUDED #define COI_CLIENT_H_INCLUDED
...@@ -54,16 +54,16 @@ ...@@ -54,16 +54,16 @@
// COI library interface // COI library interface
namespace COI { namespace COI {
extern bool init(void); DLL_LOCAL extern bool init(void);
extern void fini(void); DLL_LOCAL extern void fini(void);
extern bool is_available; DLL_LOCAL extern bool is_available;
// pointers to functions from COI library // pointers to functions from COI library
extern COIRESULT (*EngineGetCount)(COI_ISA_TYPE, uint32_t*); DLL_LOCAL extern COIRESULT (*EngineGetCount)(COI_ISA_TYPE, uint32_t*);
extern COIRESULT (*EngineGetHandle)(COI_ISA_TYPE, uint32_t, COIENGINE*); DLL_LOCAL extern COIRESULT (*EngineGetHandle)(COI_ISA_TYPE, uint32_t, COIENGINE*);
extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*, DLL_LOCAL extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*,
const void*, uint64_t, int, const void*, uint64_t, int,
const char**, uint8_t, const char**, uint8_t,
const char**, uint8_t, const char**, uint8_t,
...@@ -71,12 +71,23 @@ extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*, ...@@ -71,12 +71,23 @@ extern COIRESULT (*ProcessCreateFromMemory)(COIENGINE, const char*,
const char*, const char*,
const char*, uint64_t, const char*, uint64_t,
COIPROCESS*); COIPROCESS*);
extern COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t, DLL_LOCAL extern COIRESULT (*ProcessCreateFromFile)(COIENGINE, const char*, int,
const char**, uint8_t,
const char**,
uint8_t,
const char*,
uint64_t,
const char*,
COIPROCESS*);
DLL_LOCAL extern COIRESULT (*ProcessSetCacheSize)(COIPROCESS, uint64_t, uint32_t,
uint64_t, uint32_t, uint32_t,
const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*ProcessDestroy)(COIPROCESS, int32_t, uint8_t,
int8_t*, uint32_t*); int8_t*, uint32_t*);
extern COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t, DLL_LOCAL extern COIRESULT (*ProcessGetFunctionHandles)(COIPROCESS, uint32_t,
const char**, const char**,
COIFUNCTION*); COIFUNCTION*);
extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS, DLL_LOCAL extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS,
const void*, const void*,
uint64_t, uint64_t,
const char*, const char*,
...@@ -85,54 +96,80 @@ extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS, ...@@ -85,54 +96,80 @@ extern COIRESULT (*ProcessLoadLibraryFromMemory)(COIPROCESS,
uint64_t, uint64_t,
uint32_t, uint32_t,
COILIBRARY*); COILIBRARY*);
extern COIRESULT (*ProcessRegisterLibraries)(uint32_t,
DLL_LOCAL extern COIRESULT (*ProcessUnloadLibrary)(COIPROCESS,
COILIBRARY);
DLL_LOCAL extern COIRESULT (*ProcessRegisterLibraries)(uint32_t,
const void**, const void**,
const uint64_t*, const uint64_t*,
const char**, const char**,
const uint64_t*); const uint64_t*);
extern COIRESULT (*PipelineCreate)(COIPROCESS, COI_CPU_MASK, uint32_t, DLL_LOCAL extern COIRESULT (*PipelineCreate)(COIPROCESS, COI_CPU_MASK, uint32_t,
COIPIPELINE*); COIPIPELINE*);
extern COIRESULT (*PipelineDestroy)(COIPIPELINE); DLL_LOCAL extern COIRESULT (*PipelineDestroy)(COIPIPELINE);
extern COIRESULT (*PipelineRunFunction)(COIPIPELINE, COIFUNCTION, DLL_LOCAL extern COIRESULT (*PipelineRunFunction)(COIPIPELINE, COIFUNCTION,
uint32_t, const COIBUFFER*, uint32_t, const COIBUFFER*,
const COI_ACCESS_FLAGS*, const COI_ACCESS_FLAGS*,
uint32_t, const COIEVENT*, uint32_t, const COIEVENT*,
const void*, uint16_t, void*, const void*, uint16_t, void*,
uint16_t, COIEVENT*); uint16_t, COIEVENT*);
extern COIRESULT (*BufferCreate)(uint64_t, COI_BUFFER_TYPE, uint32_t, DLL_LOCAL extern COIRESULT (*BufferCreate)(uint64_t, COI_BUFFER_TYPE, uint32_t,
const void*, uint32_t, const void*, uint32_t,
const COIPROCESS*, COIBUFFER*); const COIPROCESS*, COIBUFFER*);
extern COIRESULT (*BufferCreateFromMemory)(uint64_t, COI_BUFFER_TYPE, DLL_LOCAL extern COIRESULT (*BufferCreateFromMemory)(uint64_t, COI_BUFFER_TYPE,
uint32_t, void*, uint32_t, void*,
uint32_t, const COIPROCESS*, uint32_t, const COIPROCESS*,
COIBUFFER*); COIBUFFER*);
extern COIRESULT (*BufferDestroy)(COIBUFFER); DLL_LOCAL extern COIRESULT (*BufferDestroy)(COIBUFFER);
extern COIRESULT (*BufferMap)(COIBUFFER, uint64_t, uint64_t, DLL_LOCAL extern COIRESULT (*BufferMap)(COIBUFFER, uint64_t, uint64_t,
COI_MAP_TYPE, uint32_t, const COIEVENT*, COI_MAP_TYPE, uint32_t, const COIEVENT*,
COIEVENT*, COIMAPINSTANCE*, void**); COIEVENT*, COIMAPINSTANCE*, void**);
extern COIRESULT (*BufferUnmap)(COIMAPINSTANCE, uint32_t, DLL_LOCAL extern COIRESULT (*BufferUnmap)(COIMAPINSTANCE, uint32_t,
const COIEVENT*, COIEVENT*); const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*, DLL_LOCAL extern COIRESULT (*BufferWrite)(COIBUFFER, uint64_t, const void*,
uint64_t, COI_COPY_TYPE, uint32_t, uint64_t, COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*); const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t, DLL_LOCAL extern COIRESULT (*BufferRead)(COIBUFFER, uint64_t, void*, uint64_t,
COI_COPY_TYPE, uint32_t, COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*); const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t, DLL_LOCAL extern COIRESULT (*BufferReadMultiD)(COIBUFFER, uint64_t,
void *, void *, COI_COPY_TYPE,
uint32_t, const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*BufferWriteMultiD)(COIBUFFER, const COIPROCESS,
uint64_t, void *, void *,
COI_COPY_TYPE, uint32_t, const COIEVENT*, COIEVENT*);
DLL_LOCAL extern COIRESULT (*BufferCopy)(COIBUFFER, COIBUFFER, uint64_t, uint64_t,
uint64_t, COI_COPY_TYPE, uint32_t, uint64_t, COI_COPY_TYPE, uint32_t,
const COIEVENT*, COIEVENT*); const COIEVENT*, COIEVENT*);
extern COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*); DLL_LOCAL extern COIRESULT (*BufferGetSinkAddress)(COIBUFFER, uint64_t*);
extern COIRESULT (*BufferSetState)(COIBUFFER, COIPROCESS, COI_BUFFER_STATE, DLL_LOCAL extern COIRESULT (*BufferSetState)(COIBUFFER, COIPROCESS, COI_BUFFER_STATE,
COI_BUFFER_MOVE_FLAG, uint32_t, COI_BUFFER_MOVE_FLAG, uint32_t,
const COIEVENT*, COIEVENT*); const COIEVENT*, COIEVENT*);
extern COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t, DLL_LOCAL extern COIRESULT (*EventWait)(uint16_t, const COIEVENT*, int32_t,
uint8_t, uint32_t*, uint32_t*); uint8_t, uint32_t*, uint32_t*);
extern uint64_t (*PerfGetCycleFrequency)(void); DLL_LOCAL extern uint64_t (*PerfGetCycleFrequency)(void);
DLL_LOCAL extern COIRESULT (*ProcessConfigureDMA)(const uint64_t, const int);
extern COIRESULT (*PipelineClearCPUMask)(COI_CPU_MASK);
extern COIRESULT (*PipelineSetCPUMask)(COIPROCESS, uint32_t,
uint8_t, COI_CPU_MASK);
extern COIRESULT (*EngineGetInfo)(COIENGINE, uint32_t, COI_ENGINE_INFO*);
extern COIRESULT (*EventRegisterCallback)(
const COIEVENT,
void (*)(COIEVENT, const COIRESULT, const void*),
const void*,
const uint64_t);
const int DMA_MODE_READ_WRITE = 1;
} // namespace COI } // namespace COI
#endif // COI_CLIENT_H_INCLUDED #endif // COI_CLIENT_H_INCLUDED
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -38,6 +38,22 @@ ...@@ -38,6 +38,22 @@
#include "../offload_myo_target.h" // for __offload_myoLibInit/Fini #include "../offload_myo_target.h" // for __offload_myoLibInit/Fini
#endif // MYO_SUPPORT #endif // MYO_SUPPORT
#if !defined(CPU_COUNT)
// if CPU_COUNT is not defined count number of CPUs manually
static
int my_cpu_count(cpu_set_t const *cpu_set)
{
int res = 0;
for (int i = 0; i < sizeof(cpu_set_t) / sizeof(__cpu_mask); ++i) {
res += __builtin_popcountl(cpu_set->__bits[i]);
}
return res;
}
// Map CPU_COUNT to our function
#define CPU_COUNT(x) my_cpu_count(x)
#endif
COINATIVELIBEXPORT COINATIVELIBEXPORT
void server_compute( void server_compute(
uint32_t buffer_count, uint32_t buffer_count,
...@@ -118,6 +134,20 @@ void server_var_table_copy( ...@@ -118,6 +134,20 @@ void server_var_table_copy(
__offload_vars.table_copy(buffers[0], *static_cast<int64_t*>(misc_data)); __offload_vars.table_copy(buffers[0], *static_cast<int64_t*>(misc_data));
} }
COINATIVELIBEXPORT
void server_set_stream_affinity(
uint32_t buffer_count,
void** buffers,
uint64_t* buffers_len,
void* misc_data,
uint16_t misc_data_len,
void* return_data,
uint16_t return_data_len
)
{
/* kmp affinity is not supported by GCC. */
}
#ifdef MYO_SUPPORT #ifdef MYO_SUPPORT
// temporary workaround for blocking behavior of myoiLibInit/Fini calls // temporary workaround for blocking behavior of myoiLibInit/Fini calls
COINATIVELIBEXPORT COINATIVELIBEXPORT
......
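The CPU_COUNT fallback added to coi_server.cpp above counts CPUs by summing the population count of every word in the mask. A minimal sketch of the same technique follows, assuming a GCC- or Clang-compatible compiler for __builtin_popcountl. MaskSet, mask_set and mask_count are hypothetical names; MaskSet stands in for glibc's cpu_set_t so the sketch does not depend on its private fields.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical fixed-size bit mask, standing in for cpu_set_t.
const std::size_t kWords = 16;
const std::size_t kBitsPerWord = sizeof(unsigned long) * CHAR_BIT;

struct MaskSet {
    unsigned long bits[kWords];
};

// Mark one CPU index as present in the mask.
static void mask_set(MaskSet& m, std::size_t cpu) {
    m.bits[cpu / kBitsPerWord] |= 1UL << (cpu % kBitsPerWord);
}

// Sum the number of set bits over every word of the mask.
static int mask_count(const MaskSet& m) {
    int total = 0;
    for (std::size_t i = 0; i < kWords; ++i) {
        total += __builtin_popcountl(m.bits[i]);
    }
    return total;
}
```

The loop index is a std::size_t here, which avoids the signed/unsigned comparison that an int index against a sizeof expression would produce.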
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -28,7 +28,7 @@ ...@@ -28,7 +28,7 @@
*/ */
//The interface betwen offload library and the COI API on the target. // The interface between offload library and the COI API on the target
#ifndef COI_SERVER_H_INCLUDED #ifndef COI_SERVER_H_INCLUDED
#define COI_SERVER_H_INCLUDED #define COI_SERVER_H_INCLUDED
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -72,7 +72,7 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE( ...@@ -72,7 +72,7 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE(
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize); OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
// initalize all devices is init_type is on_offload_all // initialize all devices is init_type is on_offload_all
if (retval && __offload_init_type == c_init_on_offload_all) { if (retval && __offload_init_type == c_init_on_offload_all) {
for (int i = 0; i < mic_engines_total; i++) { for (int i = 0; i < mic_engines_total; i++) {
mic_engines[i].init(); mic_engines[i].init();
...@@ -241,7 +241,128 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1( ...@@ -241,7 +241,128 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1(
return ofld; return ofld;
} }
int offload_offload_wrap( extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE2(
TARGET_TYPE target_type,
int target_number,
int is_optional,
_Offload_status* status,
const char* file,
uint64_t line,
const void** stream
)
{
bool retval;
OFFLOAD ofld;
// initialize status
if (status != 0) {
status->result = OFFLOAD_UNAVAILABLE;
status->device_number = -1;
status->data_sent = 0;
status->data_received = 0;
}
// make sure library is initialized
retval = __offload_init_library();
// OFFLOAD_TIMER_INIT must follow call to __offload_init_library
OffloadHostTimerData * timer_data = OFFLOAD_TIMER_INIT(file, line);
OFFLOAD_TIMER_START(timer_data, c_offload_host_total_offload);
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
// initialize all devices if init_type is on_offload_all
if (retval && __offload_init_type == c_init_on_offload_all) {
for (int i = 0; i < mic_engines_total; i++) {
mic_engines[i].init();
}
}
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_initialize);
OFFLOAD_TIMER_START(timer_data, c_offload_host_target_acquire);
if (target_type == TARGET_HOST) {
// Host always available
retval = true;
}
else if (target_type == TARGET_MIC) {
_Offload_stream handle = *(reinterpret_cast<_Offload_stream*>(stream));
Stream * stream = handle ? Stream::find_stream(handle, false) : NULL;
if (target_number >= -1) {
if (retval) {
// device number is defined by stream
if (stream) {
target_number = stream->get_device();
target_number = target_number % mic_engines_total;
}
// reserve device in ORSL
if (target_number != -1) {
if (is_optional) {
if (!ORSL::try_reserve(target_number)) {
target_number = -1;
}
}
else {
if (!ORSL::reserve(target_number)) {
target_number = -1;
}
}
}
// initialize device
if (target_number >= 0 &&
__offload_init_type == c_init_on_offload) {
OFFLOAD_TIMER_START(timer_data, c_offload_host_initialize);
mic_engines[target_number].init();
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_initialize);
}
}
else {
// fallback to CPU
target_number = -1;
}
if (!(target_number == -1 && handle == 0)) {
if (target_number < 0 || !retval) {
if (!is_optional && status == 0) {
LIBOFFLOAD_ERROR(c_device_is_not_available);
exit(1);
}
retval = false;
}
}
}
else {
LIBOFFLOAD_ERROR(c_invalid_device_number);
exit(1);
}
}
if (retval) {
ofld = new OffloadDescriptor(target_number, status,
!is_optional, false, timer_data);
OFFLOAD_TIMER_HOST_MIC_NUM(timer_data, target_number);
Offload_Report_Prolog(timer_data);
OFFLOAD_DEBUG_TRACE_1(2, timer_data->offload_number, c_offload_start,
"Starting offload: target_type = %d, "
"number = %d, is_optional = %d\n",
target_type, target_number, is_optional);
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_target_acquire);
}
else {
ofld = NULL;
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_target_acquire);
OFFLOAD_TIMER_STOP(timer_data, c_offload_host_total_offload);
offload_report_free_data(timer_data);
}
return ofld;
}
static int offload_offload_wrap(
OFFLOAD ofld, OFFLOAD ofld,
const char *name, const char *name,
int is_empty, int is_empty,
...@@ -252,12 +373,15 @@ int offload_offload_wrap( ...@@ -252,12 +373,15 @@ int offload_offload_wrap(
const void **waits, const void **waits,
const void **signal, const void **signal,
int entry_id, int entry_id,
const void *stack_addr const void *stack_addr,
OffloadFlags offload_flags
) )
{ {
bool ret = ofld->offload(name, is_empty, vars, vars2, num_vars, bool ret = ofld->offload(name, is_empty, vars, vars2, num_vars,
waits, num_waits, signal, entry_id, stack_addr); waits, num_waits, signal, entry_id,
if (!ret || signal == 0) { stack_addr, offload_flags);
if (!ret || (signal == 0 && ofld->get_stream() == 0 &&
!offload_flags.bits.omp_async)) {
delete ofld; delete ofld;
} }
return ret; return ret;
...@@ -278,7 +402,7 @@ extern "C" int OFFLOAD_OFFLOAD1( ...@@ -278,7 +402,7 @@ extern "C" int OFFLOAD_OFFLOAD1(
return offload_offload_wrap(ofld, name, is_empty, return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2, num_vars, vars, vars2,
num_waits, waits, num_waits, waits,
signal, NULL, NULL); signal, 0, NULL, {0});
} }
extern "C" int OFFLOAD_OFFLOAD2( extern "C" int OFFLOAD_OFFLOAD2(
...@@ -298,7 +422,35 @@ extern "C" int OFFLOAD_OFFLOAD2( ...@@ -298,7 +422,35 @@ extern "C" int OFFLOAD_OFFLOAD2(
return offload_offload_wrap(ofld, name, is_empty, return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2, num_vars, vars, vars2,
num_waits, waits, num_waits, waits,
signal, entry_id, stack_addr); signal, entry_id, stack_addr, {0});
}
extern "C" int OFFLOAD_OFFLOAD3(
OFFLOAD ofld,
const char *name,
int is_empty,
int num_vars,
VarDesc *vars,
VarDesc2 *vars2,
int num_waits,
const void** waits,
const void** signal,
int entry_id,
const void *stack_addr,
OffloadFlags offload_flags,
const void** stream
)
{
// 1. if the source is compiled with -traceback then stream is 0
// 2. if offload has a stream clause then stream is address of stream value
if (stream) {
ofld->set_stream(*(reinterpret_cast<_Offload_stream *>(stream)));
}
return offload_offload_wrap(ofld, name, is_empty,
num_vars, vars, vars2,
num_waits, waits,
signal, entry_id, stack_addr, offload_flags);
} }
extern "C" int OFFLOAD_OFFLOAD( extern "C" int OFFLOAD_OFFLOAD(
......
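The updated offload_offload_wrap above deletes the descriptor immediately only when the offload failed, or when it succeeded with no signal, no stream and no omp_async flag. A minimal sketch of that condition as a standalone predicate follows. OffloadOutcome and can_delete_descriptor are hypothetical names, and the reading that the descriptor is kept because an asynchronous completion path still needs it is an assumption, not something the diff states.

```cpp
#include <cstdint>

// Hypothetical flattened view of the values the wrapper inspects.
struct OffloadOutcome {
    bool succeeded;        // result of ofld->offload(...)
    bool has_signal;       // signal != 0
    std::uint64_t stream;  // ofld->get_stream(); 0 means no stream
    bool omp_async;        // offload_flags.bits.omp_async
};

// Mirrors: !ret || (signal == 0 && stream == 0 && !omp_async)
static bool can_delete_descriptor(const OffloadOutcome& o) {
    if (!o.succeeded) {
        return true;  // a failed offload is always cleaned up at once
    }
    return !o.has_signal && o.stream == 0 && !o.omp_async;
}
```

Before this change the condition was only `!ret || signal == 0`, so the stream and omp_async terms are the two new cases in which the descriptor outlives the call.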
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -39,9 +39,11 @@ ...@@ -39,9 +39,11 @@
#define OFFLOAD_TARGET_ACQUIRE OFFLOAD_PREFIX(target_acquire) #define OFFLOAD_TARGET_ACQUIRE OFFLOAD_PREFIX(target_acquire)
#define OFFLOAD_TARGET_ACQUIRE1 OFFLOAD_PREFIX(target_acquire1) #define OFFLOAD_TARGET_ACQUIRE1 OFFLOAD_PREFIX(target_acquire1)
#define OFFLOAD_TARGET_ACQUIRE2 OFFLOAD_PREFIX(target_acquire2)
#define OFFLOAD_OFFLOAD OFFLOAD_PREFIX(offload) #define OFFLOAD_OFFLOAD OFFLOAD_PREFIX(offload)
#define OFFLOAD_OFFLOAD1 OFFLOAD_PREFIX(offload1) #define OFFLOAD_OFFLOAD1 OFFLOAD_PREFIX(offload1)
#define OFFLOAD_OFFLOAD2 OFFLOAD_PREFIX(offload2) #define OFFLOAD_OFFLOAD2 OFFLOAD_PREFIX(offload2)
#define OFFLOAD_OFFLOAD3 OFFLOAD_PREFIX(offload3)
#define OFFLOAD_CALL_COUNT OFFLOAD_PREFIX(offload_call_count) #define OFFLOAD_CALL_COUNT OFFLOAD_PREFIX(offload_call_count)
...@@ -75,6 +77,26 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1( ...@@ -75,6 +77,26 @@ extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE1(
uint64_t line uint64_t line
); );
/*! \fn OFFLOAD_TARGET_ACQUIRE2
\brief Attempt to acquire the target.
\param target_type The type of target.
\param target_number The device number.
\param is_optional Whether CPU fall-back is allowed.
\param status Address of variable to hold offload status.
\param file Filename in which this offload occurred.
\param line Line number in the file where this offload occurred.
\param stream Pointer to stream value.
*/
extern "C" OFFLOAD OFFLOAD_TARGET_ACQUIRE2(
TARGET_TYPE target_type,
int target_number,
int is_optional,
_Offload_status* status,
const char* file,
uint64_t line,
const void** stream
);
/*! \fn OFFLOAD_OFFLOAD1 /*! \fn OFFLOAD_OFFLOAD1
\brief Run function on target using interface for old data persistence. \brief Run function on target using interface for old data persistence.
\param o Offload descriptor created by OFFLOAD_TARGET_ACQUIRE. \param o Offload descriptor created by OFFLOAD_TARGET_ACQUIRE.
...@@ -127,6 +149,40 @@ extern "C" int OFFLOAD_OFFLOAD2( ...@@ -127,6 +149,40 @@ extern "C" int OFFLOAD_OFFLOAD2(
const void *stack_addr const void *stack_addr
); );
/*! \fn OFFLOAD_OFFLOAD3
\brief Run function on target, API introduced in 15.0 Update 1
\brief when targetptr, preallocated feature was introduced.
\param o Offload descriptor created by OFFLOAD_TARGET_ACQUIRE.
\param name Name of offload entry point.
\param is_empty If no code to execute (e.g. offload_transfer)
\param num_vars Number of variable descriptors.
\param vars Pointer to VarDesc array.
\param vars2 Pointer to VarDesc2 array.
\param num_waits Number of "wait" values.
\param waits Pointer to array of wait values.
\param signal Pointer to signal value or NULL.
\param entry_id A signature for the function doing the offload.
\param stack_addr The stack frame address of the function doing offload.
\param offload_flags Flags to indicate Fortran traceback, OpenMP async.
\param stream Pointer to stream value or NULL.
*/
extern "C" int OFFLOAD_OFFLOAD3(
OFFLOAD ofld,
const char *name,
int is_empty,
int num_vars,
VarDesc *vars,
VarDesc2 *vars2,
int num_waits,
const void** waits,
const void** signal,
int entry_id,
const void *stack_addr,
OffloadFlags offload_flags,
const void** stream
);
// Run function on target (obsolete). // Run function on target (obsolete).
// @param o OFFLOAD object // @param o OFFLOAD object
// @param name function name // @param name function name
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -32,6 +32,7 @@ ...@@ -32,6 +32,7 @@
#define DV_UTIL_H_INCLUDED #define DV_UTIL_H_INCLUDED
#include <stdint.h> #include <stdint.h>
#include "offload_util.h"
// Dope vector declarations // Dope vector declarations
#define ArrDescMaxArrayRank 31 #define ArrDescMaxArrayRank 31
...@@ -64,18 +65,18 @@ typedef struct ArrDesc { ...@@ -64,18 +65,18 @@ typedef struct ArrDesc {
typedef ArrDesc* pArrDesc; typedef ArrDesc* pArrDesc;
bool __dv_is_contiguous(const ArrDesc *dvp); DLL_LOCAL bool __dv_is_contiguous(const ArrDesc *dvp);
bool __dv_is_allocated(const ArrDesc *dvp); DLL_LOCAL bool __dv_is_allocated(const ArrDesc *dvp);
uint64_t __dv_data_length(const ArrDesc *dvp); DLL_LOCAL uint64_t __dv_data_length(const ArrDesc *dvp);
uint64_t __dv_data_length(const ArrDesc *dvp, int64_t nelems); DLL_LOCAL uint64_t __dv_data_length(const ArrDesc *dvp, int64_t nelems);
CeanReadRanges * init_read_ranges_dv(const ArrDesc *dvp); DLL_LOCAL CeanReadRanges * init_read_ranges_dv(const ArrDesc *dvp);
#if OFFLOAD_DEBUG > 0 #if OFFLOAD_DEBUG > 0
void __dv_desc_dump(const char *name, const ArrDesc *dvp); DLL_LOCAL void __dv_desc_dump(const char *name, const ArrDesc *dvp);
#else // OFFLOAD_DEBUG #else // OFFLOAD_DEBUG
#define __dv_desc_dump(name, dvp) #define __dv_desc_dump(name, dvp)
#endif // OFFLOAD_DEBUG #endif // OFFLOAD_DEBUG
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -62,8 +62,8 @@ ...@@ -62,8 +62,8 @@
/* Environment variable for target executable run command. */ /* Environment variable for target executable run command. */
#define OFFLOAD_EMUL_RUN_ENV "OFFLOAD_EMUL_RUN" #define OFFLOAD_EMUL_RUN_ENV "OFFLOAD_EMUL_RUN"
/* Environment variable for number ok KNC devices. */ /* Environment variable for number of emulated devices. */
#define OFFLOAD_EMUL_KNC_NUM_ENV "OFFLOAD_EMUL_KNC_NUM" #define OFFLOAD_EMUL_NUM_ENV "OFFLOAD_EMUL_NUM"
/* Path to engine directory. */ /* Path to engine directory. */
...@@ -133,6 +133,7 @@ typedef enum ...@@ -133,6 +133,7 @@ typedef enum
CMD_BUFFER_UNMAP, CMD_BUFFER_UNMAP,
CMD_GET_FUNCTION_HANDLE, CMD_GET_FUNCTION_HANDLE,
CMD_OPEN_LIBRARY, CMD_OPEN_LIBRARY,
CMD_CLOSE_LIBRARY,
CMD_RUN_FUNCTION, CMD_RUN_FUNCTION,
CMD_SHUTDOWN CMD_SHUTDOWN
} cmd_t; } cmd_t;
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -109,8 +109,8 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) () ...@@ -109,8 +109,8 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
strlen (PIPE_HOST_PATH) + strlen (mic_dir) + 1); strlen (PIPE_HOST_PATH) + strlen (mic_dir) + 1);
MALLOC (char *, pipe_target_path, MALLOC (char *, pipe_target_path,
strlen (PIPE_TARGET_PATH) + strlen (mic_dir) + 1); strlen (PIPE_TARGET_PATH) + strlen (mic_dir) + 1);
sprintf (pipe_host_path, "%s"PIPE_HOST_PATH, mic_dir); sprintf (pipe_host_path, "%s" PIPE_HOST_PATH, mic_dir);
sprintf (pipe_target_path, "%s"PIPE_TARGET_PATH, mic_dir); sprintf (pipe_target_path, "%s" PIPE_TARGET_PATH, mic_dir);
pipe_host = open (pipe_host_path, O_CLOEXEC | O_WRONLY); pipe_host = open (pipe_host_path, O_CLOEXEC | O_WRONLY);
if (pipe_host < 0) if (pipe_host < 0)
COIERROR ("Cannot open target-to-host pipe."); COIERROR ("Cannot open target-to-host pipe.");
...@@ -237,6 +237,7 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) () ...@@ -237,6 +237,7 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
{ {
char *lib_path; char *lib_path;
size_t len; size_t len;
void *handle;
/* Receive data from host. */ /* Receive data from host. */
READ (pipe_target, &len, sizeof (size_t)); READ (pipe_target, &len, sizeof (size_t));
...@@ -244,14 +245,28 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) () ...@@ -244,14 +245,28 @@ SYMBOL_VERSION (COIProcessWaitForShutdown, 1) ()
READ (pipe_target, lib_path, len); READ (pipe_target, lib_path, len);
/* Open library. */ /* Open library. */
if (dlopen (lib_path, RTLD_LAZY | RTLD_GLOBAL) == 0) handle = dlopen (lib_path, RTLD_LAZY | RTLD_GLOBAL);
if (handle == NULL)
COIERROR ("Cannot load %s: %s", lib_path, dlerror ()); COIERROR ("Cannot load %s: %s", lib_path, dlerror ());
/* Send data to host. */
WRITE (pipe_host, &handle, sizeof (void *));
/* Clean up. */ /* Clean up. */
free (lib_path); free (lib_path);
break; break;
} }
case CMD_CLOSE_LIBRARY:
{
/* Receive data from host. */
void *handle;
READ (pipe_target, &handle, sizeof (void *));
dlclose (handle);
break;
}
case CMD_RUN_FUNCTION: case CMD_RUN_FUNCTION:
{ {
uint16_t misc_data_len, return_data_len; uint16_t misc_data_len, return_data_len;
......
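Note on the `sprintf` changes in this file: a space is added between `"%s"` and the path macro. In C++11 and later, a string literal followed directly by an identifier is parsed as a user-defined literal suffix, so `"%s"PIPE_HOST_PATH` no longer concatenates as intended. A minimal sketch of the corrected pattern follows. The macro value `DEMO_PIPE_HOST_PATH` and the helper `demo_make_pipe_path` are hypothetical and not part of the commit.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value; the real PIPE_HOST_PATH is defined elsewhere.  */
#define DEMO_PIPE_HOST_PATH "/pipes/host"

/* Build "<dir><DEMO_PIPE_HOST_PATH>" in a freshly allocated buffer.
   Returns NULL on allocation failure; the caller frees the result.  */
char *
demo_make_pipe_path (const char *dir)
{
  size_t size = strlen (DEMO_PIPE_HOST_PATH) + strlen (dir) + 1;
  char *path = (char *) malloc (size);
  if (path == NULL)
    return NULL;
  /* The adjacent literals "%s" and the macro expansion are joined at
     compile time; the space keeps the macro name a separate token.  */
  snprintf (path, size, "%s" DEMO_PIPE_HOST_PATH, dir);
  return path;
}
```

Calling `demo_make_pipe_path ("/tmp/mic0")` would yield `/tmp/mic0/pipes/host` under the assumed macro value.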
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -40,8 +40,8 @@ extern char **environ; ...@@ -40,8 +40,8 @@ extern char **environ;
char **tmp_dirs; char **tmp_dirs;
unsigned tmp_dirs_num = 0; unsigned tmp_dirs_num = 0;
/* Number of KNC engines. */ /* Number of emulated MIC engines. */
long knc_engines_num; long num_engines;
/* Mutex to sync parallel execution. */ /* Mutex to sync parallel execution. */
pthread_mutex_t mutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP; pthread_mutex_t mutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
...@@ -116,8 +116,7 @@ __attribute__((constructor)) ...@@ -116,8 +116,7 @@ __attribute__((constructor))
static void static void
init () init ()
{ {
if (read_long_env (OFFLOAD_EMUL_KNC_NUM_ENV, &knc_engines_num, 1) if (read_long_env (OFFLOAD_EMUL_NUM_ENV, &num_engines, 1) == COI_ERROR)
== COI_ERROR)
exit (0); exit (0);
} }
...@@ -665,10 +664,10 @@ SYMBOL_VERSION (COIEngineGetCount, 1) (COI_ISA_TYPE isa, ...@@ -665,10 +664,10 @@ SYMBOL_VERSION (COIEngineGetCount, 1) (COI_ISA_TYPE isa,
COITRACE ("COIEngineGetCount"); COITRACE ("COIEngineGetCount");
/* Features of liboffload. */ /* Features of liboffload. */
assert (isa == COI_ISA_KNC); assert (isa == COI_ISA_MIC);
/* Prepare output arguments. */ /* Prepare output arguments. */
*count = knc_engines_num; *count = num_engines;
return COI_SUCCESS; return COI_SUCCESS;
} }
...@@ -684,10 +683,10 @@ SYMBOL_VERSION (COIEngineGetHandle, 1) (COI_ISA_TYPE isa, ...@@ -684,10 +683,10 @@ SYMBOL_VERSION (COIEngineGetHandle, 1) (COI_ISA_TYPE isa,
Engine *engine; Engine *engine;
/* Features of liboffload. */ /* Features of liboffload. */
assert (isa == COI_ISA_KNC); assert (isa == COI_ISA_MIC);
/* Check engine index. */ /* Check engine index. */
if (index >= knc_engines_num) if (index >= num_engines)
COIERROR ("Wrong engine index."); COIERROR ("Wrong engine index.");
/* Create engine handle. */ /* Create engine handle. */
...@@ -889,7 +888,7 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine, ...@@ -889,7 +888,7 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
/* Create directory for pipes to prevent names collision. */ /* Create directory for pipes to prevent names collision. */
MALLOC (char *, pipes_path, strlen (PIPES_PATH) + strlen (eng->dir) + 1); MALLOC (char *, pipes_path, strlen (PIPES_PATH) + strlen (eng->dir) + 1);
sprintf (pipes_path, "%s"PIPES_PATH, eng->dir); sprintf (pipes_path, "%s" PIPES_PATH, eng->dir);
if (mkdir (pipes_path, S_IRWXU) < 0) if (mkdir (pipes_path, S_IRWXU) < 0)
COIERROR ("Cannot create folder %s.", pipes_path); COIERROR ("Cannot create folder %s.", pipes_path);
...@@ -900,8 +899,8 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine, ...@@ -900,8 +899,8 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
strlen (PIPE_TARGET_PATH) + strlen (eng->dir) + 1); strlen (PIPE_TARGET_PATH) + strlen (eng->dir) + 1);
if (pipe_target_path == NULL) if (pipe_target_path == NULL)
COIERROR ("Cannot allocate memory."); COIERROR ("Cannot allocate memory.");
sprintf (pipe_host_path, "%s"PIPE_HOST_PATH, eng->dir); sprintf (pipe_host_path, "%s" PIPE_HOST_PATH, eng->dir);
sprintf (pipe_target_path, "%s"PIPE_TARGET_PATH, eng->dir); sprintf (pipe_target_path, "%s" PIPE_TARGET_PATH, eng->dir);
if (mkfifo (pipe_host_path, S_IRUSR | S_IWUSR) < 0) if (mkfifo (pipe_host_path, S_IRUSR | S_IWUSR) < 0)
COIERROR ("Cannot create pipe %s.", pipe_host_path); COIERROR ("Cannot create pipe %s.", pipe_host_path);
if (mkfifo (pipe_target_path, S_IRUSR | S_IWUSR) < 0) if (mkfifo (pipe_target_path, S_IRUSR | S_IWUSR) < 0)
...@@ -1019,6 +1018,27 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine, ...@@ -1019,6 +1018,27 @@ SYMBOL_VERSION (COIProcessCreateFromMemory, 1) (COIENGINE engine,
COIRESULT COIRESULT
SYMBOL_VERSION (COIProcessCreateFromFile, 1) (COIENGINE in_Engine,
const char *in_pBinaryName,
int in_Argc,
const char **in_ppArgv,
uint8_t in_DupEnv,
const char **in_ppAdditionalEnv,
uint8_t in_ProxyActive,
const char *in_Reserved,
uint64_t in_BufferSpace,
const char *in_LibrarySearchPath,
COIPROCESS *out_pProcess)
{
COITRACE ("COIProcessCreateFromFile");
/* liboffloadmic with GCC compiled binaries should never go here. */
assert (false);
return COI_ERROR;
}
COIRESULT
SYMBOL_VERSION (COIProcessDestroy, 1) (COIPROCESS process, SYMBOL_VERSION (COIProcessDestroy, 1) (COIPROCESS process,
int32_t wait_timeout, // Ignored int32_t wait_timeout, // Ignored
uint8_t force, uint8_t force,
...@@ -1129,38 +1149,39 @@ SYMBOL_VERSION (COIProcessGetFunctionHandles, 1) (COIPROCESS process, ...@@ -1129,38 +1149,39 @@ SYMBOL_VERSION (COIProcessGetFunctionHandles, 1) (COIPROCESS process,
COIRESULT COIRESULT
SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process, SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS in_Process,
const void *lib_buffer, const void *in_pLibraryBuffer,
uint64_t lib_buffer_len, uint64_t in_LibraryBufferLength,
const char *lib_name, const char *in_pLibraryName,
const char *lib_search_path, const char *in_LibrarySearchPath, // Ignored
const char *file_of_origin, // Ignored const char *in_FileOfOrigin, // Ignored
uint64_t file_from_origin_offset, // Ignored uint64_t in_FileOfOriginOffset, // Ignored
uint32_t flags, // Ignored uint32_t in_Flags, // Ignored
COILIBRARY *library) // Ignored COILIBRARY *out_pLibrary)
{ {
COITRACE ("COIProcessLoadLibraryFromMemory"); COITRACE ("COIProcessLoadLibraryFromMemory");
const cmd_t cmd = CMD_OPEN_LIBRARY;
char *lib_path; char *lib_path;
cmd_t cmd = CMD_OPEN_LIBRARY;
int fd; int fd;
FILE *file; FILE *file;
size_t len; size_t len;
/* Convert input arguments. */ /* Convert input arguments. */
Process *proc = (Process *) process; Process *proc = (Process *) in_Process;
/* Create target library file. */ /* Create target library file. */
MALLOC (char *, lib_path, MALLOC (char *, lib_path,
strlen (proc->engine->dir) + strlen (lib_name) + 2); strlen (proc->engine->dir) + strlen (in_pLibraryName) + 2);
sprintf (lib_path, "%s/%s", proc->engine->dir, lib_name); sprintf (lib_path, "%s/%s", proc->engine->dir, in_pLibraryName);
fd = open (lib_path, O_CLOEXEC | O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR); fd = open (lib_path, O_CLOEXEC | O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
if (fd < 0) if (fd < 0)
COIERROR ("Cannot create file %s.", lib_path); COIERROR ("Cannot create file %s.", lib_path);
file = fdopen (fd, "wb"); file = fdopen (fd, "wb");
if (file == NULL) if (file == NULL)
COIERROR ("Cannot associate stream with file descriptor."); COIERROR ("Cannot associate stream with file descriptor.");
if (fwrite (lib_buffer, 1, lib_buffer_len, file) != lib_buffer_len) if (fwrite (in_pLibraryBuffer, 1, in_LibraryBufferLength, file)
!= in_LibraryBufferLength)
COIERROR ("Cannot write in file %s.", lib_path); COIERROR ("Cannot write in file %s.", lib_path);
if (fclose (file) != 0) if (fclose (file) != 0)
COIERROR ("Cannot close file %s.", lib_path); COIERROR ("Cannot close file %s.", lib_path);
...@@ -1176,6 +1197,10 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process, ...@@ -1176,6 +1197,10 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process,
WRITE (proc->pipeline->pipe_target, &len, sizeof (size_t)); WRITE (proc->pipeline->pipe_target, &len, sizeof (size_t));
WRITE (proc->pipeline->pipe_target, lib_path, len); WRITE (proc->pipeline->pipe_target, lib_path, len);
/* Receive data from target. */
void *handle;
READ (proc->pipeline->pipe_host, &handle, sizeof (void *));
/* Finish critical section. */ /* Finish critical section. */
if (pthread_mutex_unlock (&mutex) != 0) if (pthread_mutex_unlock (&mutex) != 0)
COIERROR ("Cannot unlock mutex."); COIERROR ("Cannot unlock mutex.");
...@@ -1183,6 +1208,7 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process, ...@@ -1183,6 +1208,7 @@ SYMBOL_VERSION (COIProcessLoadLibraryFromMemory, 2) (COIPROCESS process,
/* Clean up. */ /* Clean up. */
free (lib_path); free (lib_path);
*out_pLibrary = (COILIBRARY) handle;
return COI_SUCCESS; return COI_SUCCESS;
} }
...@@ -1202,6 +1228,33 @@ SYMBOL_VERSION (COIProcessRegisterLibraries, 1) (uint32_t libraries_num, ...@@ -1202,6 +1228,33 @@ SYMBOL_VERSION (COIProcessRegisterLibraries, 1) (uint32_t libraries_num,
} }
COIRESULT
SYMBOL_VERSION (COIProcessUnloadLibrary, 1) (COIPROCESS in_Process,
COILIBRARY in_Library)
{
COITRACE ("COIProcessUnloadLibrary");
const cmd_t cmd = CMD_CLOSE_LIBRARY;
/* Convert input arguments. */
Process *proc = (Process *) in_Process;
/* Start critical section. */
if (pthread_mutex_lock (&mutex) != 0)
COIERROR ("Cannot lock mutex.");
/* Make target close library. */
WRITE (proc->pipeline->pipe_target, &cmd, sizeof (cmd_t));
WRITE (proc->pipeline->pipe_target, &in_Library, sizeof (void *));
/* Finish critical section. */
if (pthread_mutex_unlock (&mutex) != 0)
COIERROR ("Cannot unlock mutex.");
return COI_SUCCESS;
}
uint64_t uint64_t
SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) () SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) ()
{ {
...@@ -1210,5 +1263,51 @@ SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) () ...@@ -1210,5 +1263,51 @@ SYMBOL_VERSION (COIPerfGetCycleFrequency, 1) ()
return (uint64_t) CYCLE_FREQUENCY; return (uint64_t) CYCLE_FREQUENCY;
} }
COIRESULT
SYMBOL_VERSION (COIPipelineClearCPUMask, 1) (COI_CPU_MASK *in_Mask)
{
COITRACE ("COIPipelineClearCPUMask");
/* Looks like we have nothing to do here. */
return COI_SUCCESS;
}
COIRESULT
SYMBOL_VERSION (COIPipelineSetCPUMask, 1) (COIPROCESS in_Process,
uint32_t in_CoreID,
uint8_t in_ThreadID,
COI_CPU_MASK *out_pMask)
{
COITRACE ("COIPipelineSetCPUMask");
/* Looks like we have nothing to do here. */
return COI_SUCCESS;
}
COIRESULT
SYMBOL_VERSION (COIEngineGetInfo, 1) (COIENGINE in_EngineHandle,
uint32_t in_EngineInfoSize,
COI_ENGINE_INFO *out_pEngineInfo)
{
COITRACE ("COIEngineGetInfo");
out_pEngineInfo->ISA = COI_ISA_x86_64;
out_pEngineInfo->NumCores = 1;
out_pEngineInfo->NumThreads = 8;
out_pEngineInfo->CoreMaxFrequency = SYMBOL_VERSION(COIPerfGetCycleFrequency,1)() / 1000000;
out_pEngineInfo->PhysicalMemory = 1024;
out_pEngineInfo->PhysicalMemoryFree = 1024;
out_pEngineInfo->SwapMemory = 1024;
out_pEngineInfo->SwapMemoryFree = 1024;
out_pEngineInfo->MiscFlags = COI_ENG_ECC_DISABLED;
return COI_SUCCESS;
}
} // extern "C" } // extern "C"
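Note on the library handle exchange in this file: `COIProcessLoadLibraryFromMemory` now reads a `void *` back from the target, and `COIProcessUnloadLibrary` sends it again later. The pointer travels as raw bytes and is only meaningful inside the process that called `dlopen`, so the host should treat it as an opaque token. The sketch below illustrates that byte-wise round trip with an ordinary `pipe`. The helper `demo_roundtrip_handle` is hypothetical, and it uses one process instead of the emulator's host/target pair.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: send an opaque handle through a pipe and read it
   back.  Returns 0 on success, -1 on pipe failure or a short transfer.  */
int
demo_roundtrip_handle (void *handle_in, void **handle_out)
{
  int fds[2];
  if (pipe (fds) < 0)
    return -1;

  int status = 0;
  /* The handle is written and read as sizeof (void *) raw bytes.  */
  if (write (fds[1], &handle_in, sizeof (void *))
      != (ssize_t) sizeof (void *))
    status = -1;
  else if (read (fds[0], handle_out, sizeof (void *))
	   != (ssize_t) sizeof (void *))
    status = -1;

  close (fds[0]);
  close (fds[1]);
  return status;
}
```

The receiving side never dereferences the value; it only hands it back to the side that created it.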
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -38,31 +38,54 @@ ...@@ -38,31 +38,54 @@
* intellectual property rights is granted herein. * intellectual property rights is granted herein.
*/ */
__asm__ (".symver COIBufferAddRef1,COIBufferAddRef@@COI_1.0"); // Originally generated via:
__asm__ (".symver COIBufferCopy1,COIBufferCopy@@COI_1.0"); // cd include;
__asm__ (".symver COIBufferCreate1,COIBufferCreate@@COI_1.0"); // ctags -x --c-kinds=fp -R sink/ source/ common/ | grep -v COIX | awk '{print "__asm__(\".symver "$1"1,"$1"@@COI_1.0\");"}'
__asm__ (".symver COIBufferCreateFromMemory1,COIBufferCreateFromMemory@@COI_1.0"); //
__asm__ (".symver COIBufferDestroy1,COIBufferDestroy@@COI_1.0"); // These directives must have an associated linker script with VERSION stuff.
__asm__ (".symver COIBufferGetSinkAddress1,COIBufferGetSinkAddress@@COI_1.0"); // See coi_version_linker_script.map
__asm__ (".symver COIBufferMap1,COIBufferMap@@COI_1.0"); // Passed in as
__asm__ (".symver COIBufferRead1,COIBufferRead@@COI_1.0"); // -Wl,--version-script coi_version_linker_script.map
__asm__ (".symver COIBufferReleaseRef1,COIBufferReleaseRef@@COI_1.0"); // when building Intel(R) Coprocessor Offload Infrastructure (Intel(R) COI)
__asm__ (".symver COIBufferSetState1,COIBufferSetState@@COI_1.0"); //
__asm__ (".symver COIBufferUnmap1,COIBufferUnmap@@COI_1.0"); // See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info
__asm__ (".symver COIBufferWrite1,COIBufferWrite@@COI_1.0"); //
__asm__ (".symver COIEngineGetCount1,COIEngineGetCount@@COI_1.0"); // This is not strictly a .h file, so no need to #pragma once or anything.
__asm__ (".symver COIEngineGetHandle1,COIEngineGetHandle@@COI_1.0"); // You must include these asm directives in the same translation unit as the
__asm__ (".symver COIEngineGetIndex1,COIEngineGetIndex@@COI_1.0"); // one where the function body is.
__asm__ (".symver COIEventWait1,COIEventWait@@COI_1.0"); // Otherwise we'd have to add this file to the list of files needed to build
__asm__ (".symver COIPerfGetCycleFrequency1,COIPerfGetCycleFrequency@@COI_1.0"); // libcoi*, instead of including it in each of the api/*/*cpp files.
__asm__ (".symver COIPipelineCreate1,COIPipelineCreate@@COI_1.0"); //
__asm__ (".symver COIPipelineDestroy1,COIPipelineDestroy@@COI_1.0"); __asm__(".symver COIBufferAddRef1,COIBufferAddRef@@COI_1.0");
__asm__ (".symver COIPipelineRunFunction1,COIPipelineRunFunction@@COI_1.0"); __asm__(".symver COIBufferCopy1,COIBufferCopy@@COI_1.0");
__asm__ (".symver COIPipelineStartExecutingRunFunctions1,COIPipelineStartExecutingRunFunctions@@COI_1.0"); __asm__(".symver COIBufferCreate1,COIBufferCreate@@COI_1.0");
__asm__ (".symver COIProcessCreateFromMemory1,COIProcessCreateFromMemory@@COI_1.0"); __asm__(".symver COIBufferCreateFromMemory1,COIBufferCreateFromMemory@@COI_1.0");
__asm__ (".symver COIProcessDestroy1,COIProcessDestroy@@COI_1.0"); __asm__(".symver COIBufferDestroy1,COIBufferDestroy@@COI_1.0");
__asm__ (".symver COIProcessGetFunctionHandles1,COIProcessGetFunctionHandles@@COI_1.0"); __asm__(".symver COIBufferGetSinkAddress1,COIBufferGetSinkAddress@@COI_1.0");
__asm__ (".symver COIProcessLoadLibraryFromMemory2,COIProcessLoadLibraryFromMemory@COI_2.0"); __asm__(".symver COIBufferMap1,COIBufferMap@@COI_1.0");
__asm__ (".symver COIProcessRegisterLibraries1,COIProcessRegisterLibraries@@COI_1.0"); __asm__(".symver COIBufferRead1,COIBufferRead@@COI_1.0");
__asm__ (".symver COIProcessWaitForShutdown1,COIProcessWaitForShutdown@@COI_1.0"); __asm__(".symver COIBufferReleaseRef1,COIBufferReleaseRef@@COI_1.0");
__asm__(".symver COIBufferSetState1,COIBufferSetState@@COI_1.0");
__asm__(".symver COIBufferUnmap1,COIBufferUnmap@@COI_1.0");
__asm__(".symver COIBufferWrite1,COIBufferWrite@@COI_1.0");
__asm__(".symver COIEngineGetCount1,COIEngineGetCount@@COI_1.0");
__asm__(".symver COIEngineGetHandle1,COIEngineGetHandle@@COI_1.0");
__asm__(".symver COIEngineGetIndex1,COIEngineGetIndex@@COI_1.0");
__asm__(".symver COIEngineGetInfo1,COIEngineGetInfo@@COI_1.0");
__asm__(".symver COIEventRegisterCallback1,COIEventRegisterCallback@@COI_1.0");
__asm__(".symver COIEventWait1,COIEventWait@@COI_1.0");
__asm__(".symver COIPerfGetCycleFrequency1,COIPerfGetCycleFrequency@@COI_1.0");
__asm__(".symver COIPipelineClearCPUMask1,COIPipelineClearCPUMask@@COI_1.0");
__asm__(".symver COIPipelineCreate1,COIPipelineCreate@@COI_1.0");
__asm__(".symver COIPipelineDestroy1,COIPipelineDestroy@@COI_1.0");
__asm__(".symver COIPipelineRunFunction1,COIPipelineRunFunction@@COI_1.0");
__asm__(".symver COIPipelineSetCPUMask1,COIPipelineSetCPUMask@@COI_1.0");
__asm__(".symver COIPipelineStartExecutingRunFunctions1,COIPipelineStartExecutingRunFunctions@@COI_1.0");
__asm__(".symver COIProcessCreateFromFile1,COIProcessCreateFromFile@@COI_1.0");
__asm__(".symver COIProcessCreateFromMemory1,COIProcessCreateFromMemory@@COI_1.0");
__asm__(".symver COIProcessDestroy1,COIProcessDestroy@@COI_1.0");
__asm__(".symver COIProcessGetFunctionHandles1,COIProcessGetFunctionHandles@@COI_1.0");
__asm__(".symver COIProcessLoadLibraryFromMemory2,COIProcessLoadLibraryFromMemory@COI_2.0");
__asm__(".symver COIProcessRegisterLibraries1,COIProcessRegisterLibraries@@COI_1.0");
__asm__(".symver COIProcessUnloadLibrary1,COIProcessUnloadLibrary@@COI_1.0");
__asm__(".symver COIProcessWaitForShutdown1,COIProcessWaitForShutdown@@COI_1.0");
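Note on how these directives relate to the function definitions: each `.symver` line names a suffixed symbol such as `COIEngineGetInfo1`, which matches the functions declared with `SYMBOL_VERSION (name, 1)` in the emulator sources. The sketch below assumes the macro is plain token pasting; that definition is not shown in this diff, and `DEMO_SYMBOL_VERSION` and `demoGetCount` are hypothetical names.

```c
/* Assumed definition: paste the version number onto the symbol name.  */
#define DEMO_SYMBOL_VERSION(symbol, version) symbol##version

/* Expands to a function named demoGetCount1.  A directive such as
     __asm__(".symver demoGetCount1,demoGetCount@@DEMO_1.0");
   would then export it under the unsuffixed, versioned name.  It is kept
   in a comment here because it generally needs a matching version script
   at link time.  */
int
DEMO_SYMBOL_VERSION (demoGetCount, 1) (void)
{
  return 4;
}
```

Under this assumption, the body and its `.symver` directive have to sit in the same translation unit, as the header comment requires.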
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -38,6 +38,12 @@ ...@@ -38,6 +38,12 @@
* intellectual property rights is granted herein. * intellectual property rights is granted herein.
*/ */
/***
* See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info.
* Use this in conjunction with coi_version_asm.h.
* // Comments don't work in this file.
***/
COI_1.0 COI_1.0
{ {
global: global:
...@@ -56,17 +62,23 @@ COI_1.0 ...@@ -56,17 +62,23 @@ COI_1.0
COIEngineGetCount; COIEngineGetCount;
COIEngineGetHandle; COIEngineGetHandle;
COIEngineGetIndex; COIEngineGetIndex;
COIEngineGetInfo;
COIEventWait; COIEventWait;
COIEventRegisterCallback;
COIPerfGetCycleFrequency; COIPerfGetCycleFrequency;
COIPipelineClearCPUMask;
COIPipelineCreate; COIPipelineCreate;
COIPipelineDestroy; COIPipelineDestroy;
COIPipelineRunFunction; COIPipelineRunFunction;
COIPipelineSetCPUMask;
COIPipelineStartExecutingRunFunctions; COIPipelineStartExecutingRunFunctions;
COIProcessCreateFromFile;
COIProcessCreateFromMemory; COIProcessCreateFromMemory;
COIProcessDestroy; COIProcessDestroy;
COIProcessGetFunctionHandles; COIProcessGetFunctionHandles;
COIProcessLoadLibraryFromMemory; COIProcessLoadLibraryFromMemory;
COIProcessRegisterLibraries; COIProcessRegisterLibraries;
COIProcessUnloadLibrary;
COIProcessWaitForShutdown; COIProcessWaitForShutdown;
local: local:
*; *;
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -155,5 +155,49 @@ SYMBOL_VERSION (myoiTargetFptrTableRegister, 1) (void *table, ...@@ -155,5 +155,49 @@ SYMBOL_VERSION (myoiTargetFptrTableRegister, 1) (void *table,
return MYO_ERROR; return MYO_ERROR;
} }
MYOACCESSAPI MyoError
SYMBOL_VERSION (myoArenaRelease, 1) (MyoArena in_Arena)
{
MYOTRACE ("myoArenaRelease");
assert (false);
return MYO_ERROR;
}
MYOACCESSAPI MyoError
SYMBOL_VERSION (myoArenaAcquire, 1) (MyoArena in_Arena)
{
MYOTRACE ("myoArenaAcquire");
assert (false);
return MYO_ERROR;
}
MYOACCESSAPI void
SYMBOL_VERSION (myoArenaAlignedFree, 1) (MyoArena in_Arena, void *in_pPtr)
{
MYOTRACE ("myoArenaAlignedFree");
assert (false);
}
MYOACCESSAPI void *
SYMBOL_VERSION (myoArenaAlignedMalloc, 1) (MyoArena in_Arena, size_t in_Size,
size_t in_Alignment)
{
MYOTRACE ("myoArenaAlignedMalloc");
assert (false);
return 0;
}
} // extern "C" } // extern "C"
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -38,16 +38,24 @@ ...@@ -38,16 +38,24 @@
* intellectual property rights is granted herein. * intellectual property rights is granted herein.
*/ */
__asm__ (".symver myoAcquire1,myoAcquire@@MYO_1.0"); /*Version for Symbols( only Functions currently versioned)
__asm__ (".symver myoRelease1,myoRelease@@MYO_1.0"); Only that Linux Host Side code is versioned currently*/
__asm__ (".symver myoSharedAlignedFree1,myoSharedAlignedFree@@MYO_1.0"); #if (! defined MYO_MIC_CARD) && (! defined _WIN32)
__asm__ (".symver myoSharedAlignedMalloc1,myoSharedAlignedMalloc@@MYO_1.0");
__asm__ (".symver myoSharedFree1,myoSharedFree@@MYO_1.0");
__asm__ (".symver myoSharedMalloc1,myoSharedMalloc@@MYO_1.0");
__asm__ (".symver myoiLibInit1,myoiLibInit@@MYO_1.0"); __asm__(".symver myoArenaAlignedMalloc1,myoArenaAlignedMalloc@@MYO_1.0");
__asm__ (".symver myoiLibFini1,myoiLibFini@@MYO_1.0"); __asm__(".symver myoArenaAlignedFree1,myoArenaAlignedFree@@MYO_1.0");
__asm__ (".symver myoiMicVarTableRegister1,myoiMicVarTableRegister@@MYO_1.0"); __asm__(".symver myoArenaAcquire1,myoArenaAcquire@@MYO_1.0");
__asm__ (".symver myoiRemoteFuncRegister1,myoiRemoteFuncRegister@@MYO_1.0"); __asm__(".symver myoArenaRelease1,myoArenaRelease@@MYO_1.0");
__asm__ (".symver myoiTargetFptrTableRegister1,myoiTargetFptrTableRegister@@MYO_1.0"); __asm__(".symver myoAcquire1,myoAcquire@@MYO_1.0");
__asm__(".symver myoRelease1,myoRelease@@MYO_1.0");
__asm__(".symver myoSharedAlignedFree1,myoSharedAlignedFree@@MYO_1.0");
__asm__(".symver myoSharedAlignedMalloc1,myoSharedAlignedMalloc@@MYO_1.0");
__asm__(".symver myoSharedFree1,myoSharedFree@@MYO_1.0");
__asm__(".symver myoSharedMalloc1,myoSharedMalloc@@MYO_1.0");
__asm__(".symver myoiLibInit1,myoiLibInit@@MYO_1.0");
__asm__(".symver myoiLibFini1,myoiLibFini@@MYO_1.0");
__asm__(".symver myoiMicVarTableRegister1,myoiMicVarTableRegister@@MYO_1.0");
__asm__(".symver myoiRemoteFuncRegister1,myoiRemoteFuncRegister@@MYO_1.0");
__asm__(".symver myoiTargetFptrTableRegister1,myoiTargetFptrTableRegister@@MYO_1.0");
#endif
/* /*
* Copyright 2010-2013 Intel Corporation. * Copyright 2010-2015 Intel Corporation.
* *
* This library is free software; you can redistribute it and/or modify it * This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published * under the terms of the GNU Lesser General Public License as published
...@@ -38,9 +38,17 @@ ...@@ -38,9 +38,17 @@
* intellectual property rights is granted herein. * intellectual property rights is granted herein.
*/ */
/***
* See http://sourceware.org/binutils/docs/ld/VERSION.html#VERSION for more info.
***/
MYO_1.0 MYO_1.0
{ {
global: global:
myoArenaAlignedMalloc;
myoArenaAlignedFree;
myoArenaAcquire;
myoArenaRelease;
myoAcquire; myoAcquire;
myoRelease; myoRelease;
myoSharedAlignedFree; myoSharedAlignedFree;
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -144,6 +144,9 @@ void __liboffload_error_support(error_types input_tag, ...) ...@@ -144,6 +144,9 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_process_create: case c_process_create:
write_message(stderr, msg_c_process_create, args); write_message(stderr, msg_c_process_create, args);
break; break;
case c_process_set_cache_size:
write_message(stderr, msg_c_process_set_cache_size, args);
break;
case c_process_wait_shutdown: case c_process_wait_shutdown:
write_message(stderr, msg_c_process_wait_shutdown, args); write_message(stderr, msg_c_process_wait_shutdown, args);
break; break;
...@@ -216,6 +219,9 @@ void __liboffload_error_support(error_types input_tag, ...) ...@@ -216,6 +219,9 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_zero_or_neg_transfer_size: case c_zero_or_neg_transfer_size:
write_message(stderr, msg_c_zero_or_neg_transfer_size, args); write_message(stderr, msg_c_zero_or_neg_transfer_size, args);
break; break;
case c_bad_ptr_mem_alloc:
write_message(stderr, msg_c_bad_ptr_mem_alloc, args);
break;
case c_bad_ptr_mem_range: case c_bad_ptr_mem_range:
write_message(stderr, msg_c_bad_ptr_mem_range, args); write_message(stderr, msg_c_bad_ptr_mem_range, args);
break; break;
...@@ -258,6 +264,39 @@ void __liboffload_error_support(error_types input_tag, ...) ...@@ -258,6 +264,39 @@ void __liboffload_error_support(error_types input_tag, ...)
case c_report_unknown_trace_node: case c_report_unknown_trace_node:
write_message(stderr, msg_c_report_unknown_trace_node, args); write_message(stderr, msg_c_report_unknown_trace_node, args);
break; break;
case c_incorrect_affinity:
write_message(stderr, msg_c_incorrect_affinity, args);
break;
case c_cannot_set_affinity:
write_message(stderr, msg_c_cannot_set_affinity, args);
break;
case c_in_with_preallocated:
write_message(stderr, msg_c_in_with_preallocated, args);
break;
case c_report_no_host_exe:
write_message(stderr, msg_c_report_no_host_exe, args);
break;
case c_report_path_buff_overflow:
write_message(stderr, msg_c_report_path_buff_overflow, args);
break;
case c_create_pipeline_for_stream:
write_message(stderr, msg_c_create_pipeline_for_stream, args);
break;
case c_offload_no_stream:
write_message(stderr, msg_c_offload_no_stream, args);
break;
case c_get_engine_info:
write_message(stderr, msg_c_get_engine_info, args);
break;
case c_clear_cpu_mask:
write_message(stderr, msg_c_clear_cpu_mask, args);
break;
case c_set_cpu_mask:
write_message(stderr, msg_c_set_cpu_mask, args);
break;
case c_unload_library:
write_message(stderr, msg_c_unload_library, args);
break;
} }
va_end(args); va_end(args);
} }
...@@ -374,6 +413,10 @@ char const * report_get_message_str(error_types input_tag) ...@@ -374,6 +413,10 @@ char const * report_get_message_str(error_types input_tag)
return (offload_get_message_str(msg_c_report_unregister)); return (offload_get_message_str(msg_c_report_unregister));
case c_report_var: case c_report_var:
return (offload_get_message_str(msg_c_report_var)); return (offload_get_message_str(msg_c_report_var));
case c_report_stream:
return (offload_get_message_str(msg_c_report_stream));
case c_report_state_stream:
return (offload_get_message_str(msg_c_report_state_stream));
default: default:
LIBOFFLOAD_ERROR(c_report_unknown_trace_node); LIBOFFLOAD_ERROR(c_report_unknown_trace_node);
......
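Note on the dispatch pattern extended in this file: every new `error_types` tag gets a `case` that forwards the caller's variadic arguments to a message template. The sketch below shows that tag-to-template dispatch in isolation. The tags, message texts, and `demo_format_error` are hypothetical, and it formats into a buffer with `vsnprintf` instead of calling `write_message`.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of error tags.  */
enum { demo_c_unload_library, demo_c_set_cpu_mask };

/* Map a tag to its message template; NULL for an unknown tag.  */
static const char *
demo_template (int tag)
{
  switch (tag)
    {
    case demo_c_unload_library:
      return "demo: cannot unload library (code %d)";
    case demo_c_set_cpu_mask:
      return "demo: cannot set CPU mask (code %d)";
    default:
      return NULL;
    }
}

/* Format the message for TAG into BUF.  Returns the vsnprintf result,
   or -1 when the tag has no template.  */
int
demo_format_error (char *buf, size_t size, int tag, ...)
{
  const char *fmt = demo_template (tag);
  if (fmt == NULL)
    return -1;
  va_list args;
  va_start (args, tag);
  int n = vsnprintf (buf, size, fmt, args);
  va_end (args);
  return n;
}
```

A tag added to the enum without a matching `case` falls through to the unknown-tag path, which is why the enum and the switch are updated together.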
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -68,6 +68,7 @@ typedef enum ...@@ -68,6 +68,7 @@ typedef enum
c_get_engine_handle, c_get_engine_handle,
c_get_engine_index, c_get_engine_index,
c_process_create, c_process_create,
c_process_set_cache_size,
c_process_get_func_handles, c_process_get_func_handles,
c_process_wait_shutdown, c_process_wait_shutdown,
c_process_proxy_flush, c_process_proxy_flush,
...@@ -91,6 +92,7 @@ typedef enum ...@@ -91,6 +92,7 @@ typedef enum
c_event_wait, c_event_wait,
c_zero_or_neg_ptr_len, c_zero_or_neg_ptr_len,
c_zero_or_neg_transfer_size, c_zero_or_neg_transfer_size,
c_bad_ptr_mem_alloc,
c_bad_ptr_mem_range, c_bad_ptr_mem_range,
c_different_src_and_dstn_sizes, c_different_src_and_dstn_sizes,
c_ranges_dont_match, c_ranges_dont_match,
...@@ -103,6 +105,8 @@ typedef enum ...@@ -103,6 +105,8 @@ typedef enum
c_unknown_binary_type, c_unknown_binary_type,
c_multiple_target_exes, c_multiple_target_exes,
c_no_target_exe, c_no_target_exe,
c_incorrect_affinity,
c_cannot_set_affinity,
c_report_host, c_report_host,
c_report_target, c_report_target,
c_report_title, c_report_title,
...@@ -159,7 +163,24 @@ typedef enum ...@@ -159,7 +163,24 @@ typedef enum
c_report_myosharedalignedfree, c_report_myosharedalignedfree,
c_report_myoacquire, c_report_myoacquire,
c_report_myorelease, c_report_myorelease,
c_coipipe_max_number c_report_myosupportsfeature,
c_report_myosharedarenacreate,
c_report_myosharedalignedarenamalloc,
c_report_myosharedalignedarenafree,
c_report_myoarenaacquire,
c_report_myoarenarelease,
c_coipipe_max_number,
c_in_with_preallocated,
c_report_no_host_exe,
c_report_path_buff_overflow,
c_create_pipeline_for_stream,
c_offload_no_stream,
c_get_engine_info,
c_clear_cpu_mask,
c_set_cpu_mask,
c_report_state_stream,
c_report_stream,
c_unload_library
} error_types; } error_types;
enum OffloadHostPhase { enum OffloadHostPhase {
...@@ -260,15 +281,21 @@ enum OffloadTargetPhase { ...@@ -260,15 +281,21 @@ enum OffloadTargetPhase {
c_offload_target_max_phase c_offload_target_max_phase
}; };
#ifdef TARGET_WINNT
#define DLL_LOCAL
#else
#define DLL_LOCAL __attribute__((visibility("hidden")))
#endif
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif #endif
void __liboffload_error_support(error_types input_tag, ...); DLL_LOCAL void __liboffload_error_support(error_types input_tag, ...);
void __liboffload_report_support(error_types input_tag, ...); DLL_LOCAL void __liboffload_report_support(error_types input_tag, ...);
char const *offload_get_message_str(int msgCode); DLL_LOCAL char const *offload_get_message_str(int msgCode);
char const * report_get_message_str(error_types input_tag); DLL_LOCAL char const * report_get_message_str(error_types input_tag);
char const * report_get_host_stage_str(int i); DLL_LOCAL char const * report_get_host_stage_str(int i);
char const * report_get_target_stage_str(int i); DLL_LOCAL char const * report_get_target_stage_str(int i);
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
...@@ -281,7 +308,7 @@ char const * report_get_target_stage_str(int i); ...@@ -281,7 +308,7 @@ char const * report_get_target_stage_str(int i);
fprintf(stderr, "\t TEST for %s \n \t", nm); \ fprintf(stderr, "\t TEST for %s \n \t", nm); \
__liboffload_error_support(msg, __VA_ARGS__); __liboffload_error_support(msg, __VA_ARGS__);
void write_message(FILE * file, int msgCode, va_list args_p); DLL_LOCAL void write_message(FILE * file, int msgCode, va_list args_p);
#define LIBOFFLOAD_ERROR __liboffload_error_support #define LIBOFFLOAD_ERROR __liboffload_error_support
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -28,7 +28,6 @@ ...@@ -28,7 +28,6 @@
*/ */
#include <stdarg.h> #include <stdarg.h>
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
......
! !
! Copyright (c) 2014 Intel Corporation. All Rights Reserved. ! Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
! !
! Redistribution and use in source and binary forms, with or without ! Redistribution and use in source and binary forms, with or without
! modification, are permitted provided that the following conditions ! modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -42,6 +42,13 @@ ...@@ -42,6 +42,13 @@
#include <stddef.h> #include <stddef.h>
#include <omp.h> #include <omp.h>
#ifdef TARGET_WINNT
// <stdint.h> is not compatible with Windows
typedef unsigned long long int uint64_t;
#else
#include <stdint.h>
#endif // TARGET_WINNT
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif #endif
...@@ -86,6 +93,8 @@ typedef struct { ...@@ -86,6 +93,8 @@ typedef struct {
size_t data_received; /* number of bytes received by host */ size_t data_received; /* number of bytes received by host */
} _Offload_status; } _Offload_status;
typedef uint64_t _Offload_stream;
#define OFFLOAD_STATUS_INIT(x) \ #define OFFLOAD_STATUS_INIT(x) \
((x).result = OFFLOAD_DISABLED) ((x).result = OFFLOAD_DISABLED)
...@@ -98,14 +107,57 @@ extern int _Offload_number_of_devices(void); ...@@ -98,14 +107,57 @@ extern int _Offload_number_of_devices(void);
extern int _Offload_get_device_number(void); extern int _Offload_get_device_number(void);
extern int _Offload_get_physical_device_number(void); extern int _Offload_get_physical_device_number(void);
/* Offload stream runtime interfaces */
extern _Offload_stream _Offload_stream_create(
int device, // MIC device number
int number_of_cpus // Cores allocated to the stream
);
extern int _Offload_stream_destroy(
int device, // MIC device number
_Offload_stream stream // stream handle
);
extern int _Offload_stream_completed(
int device, // MIC device number
_Offload_stream handle // stream handle
);
/*
* _Offload_shared_malloc/free are only supported when offload is enabled
* else they are defined to malloc and free
*/
#ifdef __INTEL_OFFLOAD
extern void* _Offload_shared_malloc(size_t size); extern void* _Offload_shared_malloc(size_t size);
extern void _Offload_shared_free(void *ptr); extern void _Offload_shared_free(void *ptr);
extern void* _Offload_shared_aligned_malloc(size_t size, size_t align); extern void* _Offload_shared_aligned_malloc(size_t size, size_t align);
extern void _Offload_shared_aligned_free(void *ptr); extern void _Offload_shared_aligned_free(void *ptr);
#else
#include <malloc.h>
#define _Offload_shared_malloc(size) malloc(size)
#define _Offload_shared_free(ptr) free(ptr);
#if defined(_WIN32)
#define _Offload_shared_aligned_malloc(size, align) _aligned_malloc(size, align)
#define _Offload_shared_aligned_free(ptr) _aligned_free(ptr);
#else
#define _Offload_shared_aligned_malloc(size, align) memalign(align, size)
#define _Offload_shared_aligned_free(ptr) free(ptr);
#endif
#endif
extern int _Offload_signaled(int index, void *signal); extern int _Offload_signaled(int index, void *signal);
extern void _Offload_report(int val); extern void _Offload_report(int val);
extern int _Offload_find_associated_mic_memory(
int target,
const void* cpu_addr,
void** cpu_base_addr,
uint64_t* buf_length,
void** mic_addr,
uint64_t* mic_buf_start_offset,
int* is_static
);
/* OpenMP API */ /* OpenMP API */
...@@ -343,7 +395,11 @@ namespace __offload { ...@@ -343,7 +395,11 @@ namespace __offload {
shared_allocator<void>::const_pointer) { shared_allocator<void>::const_pointer) {
/* Allocate from shared memory. */ /* Allocate from shared memory. */
void *ptr = _Offload_shared_malloc(s*sizeof(T)); void *ptr = _Offload_shared_malloc(s*sizeof(T));
#if (defined(_WIN32) || defined(_WIN64)) /* Windows */
if (ptr == 0) throw std::bad_alloc();
#else
if (ptr == 0) std::__throw_bad_alloc(); if (ptr == 0) std::__throw_bad_alloc();
#endif
return static_cast<pointer>(ptr); return static_cast<pointer>(ptr);
} /* allocate */ } /* allocate */
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -40,10 +40,6 @@ ...@@ -40,10 +40,6 @@
#include <string.h> #include <string.h>
#include <memory.h> #include <memory.h>
#if (defined(LINUX) || defined(FREEBSD)) && !defined(__INTEL_COMPILER)
#include <mm_malloc.h>
#endif
#include "offload.h" #include "offload.h"
#include "offload_table.h" #include "offload_table.h"
#include "offload_trace.h" #include "offload_trace.h"
...@@ -65,22 +61,24 @@ ...@@ -65,22 +61,24 @@
// The debug routines // The debug routines
// Host console and file logging // Host console and file logging
extern int console_enabled; DLL_LOCAL extern int console_enabled;
extern int offload_report_level; DLL_LOCAL extern int offload_report_level;
#define OFFLOAD_DO_TRACE (offload_report_level == 3)
extern const char *prefix; DLL_LOCAL extern const char *prefix;
extern int offload_number; DLL_LOCAL extern int offload_number;
#if !HOST_LIBRARY #if !HOST_LIBRARY
extern int mic_index; DLL_LOCAL extern int mic_index;
#define OFFLOAD_DO_TRACE (offload_report_level == 3)
#else
#define OFFLOAD_DO_TRACE (offload_report_enabled && (offload_report_level == 3))
#endif #endif
#if HOST_LIBRARY #if HOST_LIBRARY
void Offload_Report_Prolog(OffloadHostTimerData* timer_data); DLL_LOCAL void Offload_Report_Prolog(OffloadHostTimerData* timer_data);
void Offload_Report_Epilog(OffloadHostTimerData* timer_data); DLL_LOCAL void Offload_Report_Epilog(OffloadHostTimerData* timer_data);
void offload_report_free_data(OffloadHostTimerData * timer_data); DLL_LOCAL void offload_report_free_data(OffloadHostTimerData * timer_data);
void Offload_Timer_Print(void); DLL_LOCAL void Offload_Timer_Print(void);
#ifndef TARGET_WINNT #ifndef TARGET_WINNT
#define OFFLOAD_DEBUG_INCR_OFLD_NUM() \ #define OFFLOAD_DEBUG_INCR_OFLD_NUM() \
...@@ -130,7 +128,7 @@ void Offload_Timer_Print(void); ...@@ -130,7 +128,7 @@ void Offload_Timer_Print(void);
#define OFFLOAD_DEBUG_DUMP_BYTES(level, a, b) \ #define OFFLOAD_DEBUG_DUMP_BYTES(level, a, b) \
__dump_bytes(level, a, b) __dump_bytes(level, a, b)
extern void __dump_bytes( DLL_LOCAL extern void __dump_bytes(
int level, int level,
const void *data, const void *data,
int len int len
...@@ -156,6 +154,17 @@ extern void *OFFLOAD_MALLOC(size_t size, size_t align); ...@@ -156,6 +154,17 @@ extern void *OFFLOAD_MALLOC(size_t size, size_t align);
// The Marshaller // The Marshaller
// Flags describing an offload
//! Flags describing an offload
union OffloadFlags{
uint32_t flags;
struct {
uint32_t fortran_traceback : 1; //!< Fortran traceback requested
uint32_t omp_async : 1; //!< OpenMP asynchronous offload
} bits;
};
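Review note: the `OffloadFlags` union added above gives one 32-bit word two views, an integer for clearing or passing the whole word and named one-bit fields for individual flags. The following is a minimal sketch of how such a union might be used, not code from the commit. The helpers `make_async_flags` and `flags_empty` are hypothetical. Bit-field layout is implementation-defined, and reading the integer view after writing a bit-field relies on the union punning that GCC and Clang document, so the sketch only distinguishes zero from non-zero.

```cpp
#include <cstdint>

// Mirrors the OffloadFlags union from the diff.
union OffloadFlags {
    uint32_t flags;
    struct {
        uint32_t fortran_traceback : 1;  // Fortran traceback requested
        uint32_t omp_async         : 1;  // OpenMP asynchronous offload
    } bits;
};

// Hypothetical helper: flags word for an asynchronous OpenMP offload.
inline OffloadFlags make_async_flags() {
    OffloadFlags f;
    f.flags = 0;           // clear every bit, including the unused ones
    f.bits.omp_async = 1;  // then set the single flag of interest
    return f;
}

// Hypothetical helper: true when no flag bit is set.
// Reads the integer view (assumes GCC/Clang union punning semantics).
inline bool flags_empty(OffloadFlags f) {
    return f.flags == 0;
}
```

Clearing through `flags` first matters because the named fields cover only two of the 32 bits.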
//! \enum Indicator for the type of entry on an offload item list. //! \enum Indicator for the type of entry on an offload item list.
enum OffloadItemType { enum OffloadItemType {
c_data = 1, //!< Plain data c_data = 1, //!< Plain data
...@@ -203,6 +212,44 @@ enum OffloadParameterType { ...@@ -203,6 +212,44 @@ enum OffloadParameterType {
c_parameter_inout //!< Variable listed in "inout" clause c_parameter_inout //!< Variable listed in "inout" clause
}; };
//! Flags describing an offloaded variable
union varDescFlags {
struct {
//! source variable has persistent storage
uint32_t is_static : 1;
//! destination variable has persistent storage
uint32_t is_static_dstn : 1;
//! has length for c_dv && c_dv_ptr
uint32_t has_length : 1;
//! persisted local scalar is in stack buffer
uint32_t is_stack_buf : 1;
//! "targetptr" modifier used
uint32_t targetptr : 1;
//! "preallocated" modifier used
uint32_t preallocated : 1;
//! Needs documentation
uint32_t is_pointer : 1;
//! buffer address is sent in data
uint32_t sink_addr : 1;
//! alloc displacement is sent in data
uint32_t alloc_disp : 1;
//! source data is noncontiguous
uint32_t is_noncont_src : 1;
//! destination data is noncontiguous
uint32_t is_noncont_dst : 1;
//! "OpenMP always" modifier used
uint32_t always_copy : 1;
//! "OpenMP delete" modifier used
uint32_t always_delete : 1;
//! CPU memory pinning/unpinning operation
uint32_t pin : 1;
};
uint32_t bits;
};
//! An Offload Variable descriptor //! An Offload Variable descriptor
struct VarDesc { struct VarDesc {
//! OffloadItemTypes of source and destination //! OffloadItemTypes of source and destination
...@@ -230,27 +277,7 @@ struct VarDesc { ...@@ -230,27 +277,7 @@ struct VarDesc {
/*! Used by runtime as offset to data from start of MIC buffer */ /*! Used by runtime as offset to data from start of MIC buffer */
uint32_t mic_offset; uint32_t mic_offset;
//! Flags describing this variable //! Flags describing this variable
union { varDescFlags flags;
struct {
//! source variable has persistent storage
uint32_t is_static : 1;
//! destination variable has persistent storage
uint32_t is_static_dstn : 1;
//! has length for c_dv && c_dv_ptr
uint32_t has_length : 1;
//! persisted local scalar is in stack buffer
uint32_t is_stack_buf : 1;
//! buffer address is sent in data
uint32_t sink_addr : 1;
//! alloc displacement is sent in data
uint32_t alloc_disp : 1;
//! source data is noncontiguous
uint32_t is_noncont_src : 1;
//! destination data is noncontiguous
uint32_t is_noncont_dst : 1;
};
uint32_t bits;
} flags;
//! Not used by compiler; set to 0 //! Not used by compiler; set to 0
/*! Used by runtime as offset to base from data stored in a buffer */ /*! Used by runtime as offset to base from data stored in a buffer */
int64_t offset; int64_t offset;
...@@ -472,4 +499,16 @@ struct FunctionDescriptor ...@@ -472,4 +499,16 @@ struct FunctionDescriptor
// Pointer to OffloadDescriptor. // Pointer to OffloadDescriptor.
typedef struct OffloadDescriptor *OFFLOAD; typedef struct OffloadDescriptor *OFFLOAD;
// Use for setting affinity of a stream
enum affinity_type {
affinity_compact,
affinity_scatter
};
struct affinity_spec {
uint64_t sink_mask[16];
int affinity_type;
int num_cores;
int num_threads;
};
#endif // OFFLOAD_COMMON_H_INCLUDED #endif // OFFLOAD_COMMON_H_INCLUDED
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -32,13 +32,16 @@ ...@@ -32,13 +32,16 @@
#define OFFLOAD_ENGINE_H_INCLUDED #define OFFLOAD_ENGINE_H_INCLUDED
#include <limits.h> #include <limits.h>
#include <bitset>
#include <list> #include <list>
#include <set> #include <set>
#include <map> #include <map>
#include "offload_common.h" #include "offload_common.h"
#include "coi/coi_client.h" #include "coi/coi_client.h"
#define SIGNAL_IS_REMOVED ((OffloadDescriptor *)-1)
const int64_t no_stream = -1;
// Address range // Address range
class MemRange { class MemRange {
public: public:
...@@ -157,6 +160,50 @@ private: ...@@ -157,6 +160,50 @@ private:
typedef std::list<PtrData*> PtrDataList; typedef std::list<PtrData*> PtrDataList;
class PtrDataTable {
public:
typedef std::set<PtrData> PtrSet;
PtrData* find_ptr_data(const void *ptr) {
m_ptr_lock.lock();
PtrSet::iterator res = list.find(PtrData(ptr, 0));
m_ptr_lock.unlock();
if (res == list.end()) {
return 0;
}
return const_cast<PtrData*>(res.operator->());
}
PtrData* insert_ptr_data(const void *ptr, uint64_t len, bool &is_new) {
m_ptr_lock.lock();
std::pair<PtrSet::iterator, bool> res =
list.insert(PtrData(ptr, len));
PtrData* ptr_data = const_cast<PtrData*>(res.first.operator->());
m_ptr_lock.unlock();
is_new = res.second;
if (is_new) {
// It's necessary to lock as soon as possible.
// unlock must be done at call site of insert_ptr_data at
// branch for is_new
ptr_data->alloc_ptr_data_lock.lock();
}
return ptr_data;
}
void remove_ptr_data(const void *ptr) {
m_ptr_lock.lock();
list.erase(PtrData(ptr, 0));
m_ptr_lock.unlock();
}
private:
PtrSet list;
mutex_t m_ptr_lock;
};
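Review note: `PtrDataTable` above factors the lock-protected set out of `Engine` so that the same code can back both `m_ptr_set` and `m_targetptr_set`. The sketch below shows the same pattern with standard C++11 primitives. It is an illustration under assumptions: `Range` and `RangeTable` are hypothetical names, entries are ordered by start address only (the real `PtrData` ordering is not shown in this diff), and the per-entry `alloc_ptr_data_lock` step is omitted. Unlike the code in the diff, the sketch also keeps the lock held while comparing against `end()`.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <set>

// Hypothetical stand-in for PtrData, ordered by start address only.
struct Range {
    const void *addr;
    uint64_t    len;
    bool operator<(const Range &o) const {
        return std::less<const void*>()(addr, o.addr);
    }
};

class RangeTable {
public:
    // Returns the stored entry for ptr, or nullptr when absent.
    const Range *find(const void *ptr) {
        std::lock_guard<std::mutex> guard(m_lock);
        std::set<Range>::const_iterator it = m_set.find(Range{ptr, 0});
        return it == m_set.end() ? nullptr : &*it;
    }

    // Inserts an entry; is_new reports whether it was absent before.
    const Range *insert(const void *ptr, uint64_t len, bool &is_new) {
        std::lock_guard<std::mutex> guard(m_lock);
        std::pair<std::set<Range>::iterator, bool> res =
            m_set.insert(Range{ptr, len});
        is_new = res.second;
        return &*res.first;
    }

    // Erases the entry keyed by ptr, if any.
    void remove(const void *ptr) {
        std::lock_guard<std::mutex> guard(m_lock);
        m_set.erase(Range{ptr, 0});
    }

private:
    std::set<Range> m_set;
    std::mutex      m_lock;
};
```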
// Data associated with automatic variable // Data associated with automatic variable
class AutoData { class AutoData {
public: public:
...@@ -187,6 +234,14 @@ public: ...@@ -187,6 +234,14 @@ public:
#endif // TARGET_WINNT #endif // TARGET_WINNT
} }
long nullify_reference() {
#ifndef TARGET_WINNT
return __sync_lock_test_and_set(&ref_count, 0);
#else // TARGET_WINNT
return _InterlockedExchange(&ref_count,0);
#endif // TARGET_WINNT
}
long get_reference() const { long get_reference() const {
return ref_count; return ref_count;
} }
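Review note: `nullify_reference` above resets the count to zero in one atomic exchange and returns the value that was there before. The commit uses `__sync_lock_test_and_set` and `_InterlockedExchange`. The sketch below expresses the same operation with `std::atomic`, which is a substitution for illustration and not what the runtime does; `RefCount` is a hypothetical name.

```cpp
#include <atomic>

// Hypothetical reference counter built on std::atomic instead of intrinsics.
class RefCount {
public:
    explicit RefCount(long initial) : m_count(initial) {}

    // Increments the count and returns the new value.
    long add() { return ++m_count; }

    // Atomically sets the count to zero and returns the previous value.
    long nullify() { return m_count.exchange(0); }

    long get() const { return m_count.load(); }

private:
    std::atomic<long> m_count;
};
```

Returning the previous value lets the caller tell whether it was the one that actually dropped outstanding references.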
...@@ -226,18 +281,39 @@ struct TargetImage ...@@ -226,18 +281,39 @@ struct TargetImage
typedef std::list<TargetImage> TargetImageList; typedef std::list<TargetImage> TargetImageList;
// dynamic library and Image associated with lib
struct DynLib
{
DynLib(const char *_name, const void *_data,
COILIBRARY _lib) :
name(_name), data(_data), lib(_lib)
{}
// library name
const char* name;
// contents
const void* data;
COILIBRARY lib;
};
typedef std::list<DynLib> DynLibList;
// Data associated with persistent auto objects // Data associated with persistent auto objects
struct PersistData struct PersistData
{ {
PersistData(const void *addr, uint64_t routine_num, uint64_t size) : PersistData(const void *addr, uint64_t routine_num,
stack_cpu_addr(addr), routine_id(routine_num) uint64_t size, uint64_t thread) :
stack_cpu_addr(addr), routine_id(routine_num), thread_id(thread)
{ {
stack_ptr_data = new PtrData(0, size); stack_ptr_data = new PtrData(0, size);
} }
// 1-st key value - begining of the stack at CPU // 1-st key value - beginning of the stack at CPU
const void * stack_cpu_addr; const void * stack_cpu_addr;
// 2-nd key value - identifier of routine invocation at CPU // 2-nd key value - identifier of routine invocation at CPU
uint64_t routine_id; uint64_t routine_id;
// 3-rd key value - thread identifier
uint64_t thread_id;
// corresponded PtrData; only stack_ptr_data->mic_buf is used // corresponded PtrData; only stack_ptr_data->mic_buf is used
PtrData * stack_ptr_data; PtrData * stack_ptr_data;
// used to get offset of the variable in stack buffer // used to get offset of the variable in stack buffer
...@@ -246,6 +322,75 @@ struct PersistData ...@@ -246,6 +322,75 @@ struct PersistData
typedef std::list<PersistData> PersistDataList; typedef std::list<PersistData> PersistDataList;
// Data associated with stream
struct Stream
{
Stream(int device, int num_of_cpus) :
m_number_of_cpus(num_of_cpus), m_pipeline(0), m_last_offload(0),
m_device(device)
{}
~Stream() {
if (m_pipeline) {
COI::PipelineDestroy(m_pipeline);
}
}
COIPIPELINE get_pipeline(void) {
return(m_pipeline);
}
int get_device(void) {
return(m_device);
}
int get_cpu_number(void) {
return(m_number_of_cpus);
}
void set_pipeline(COIPIPELINE pipeline) {
m_pipeline = pipeline;
}
OffloadDescriptor* get_last_offload(void) {
return(m_last_offload);
}
void set_last_offload(OffloadDescriptor* last_offload) {
m_last_offload = last_offload;
}
static Stream* find_stream(uint64_t handle, bool remove);
static _Offload_stream add_stream(int device, int number_of_cpus) {
m_stream_lock.lock();
all_streams[++m_streams_count] = new Stream(device, number_of_cpus);
m_stream_lock.unlock();
return(m_streams_count);
}
typedef std::map<uint64_t, Stream*> StreamMap;
static uint64_t m_streams_count;
static StreamMap all_streams;
static mutex_t m_stream_lock;
int m_device;
// number of cpus
int m_number_of_cpus;
// The pipeline associated with the stream
COIPIPELINE m_pipeline;
// The last offload occurred via the stream
OffloadDescriptor* m_last_offload;
// Cpus used by the stream
std::bitset<COI_MAX_HW_THREADS> m_stream_cpus;
};
typedef std::map<uint64_t, Stream*> StreamMap;
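Review note: `Stream::add_stream` above hands out integer handles from a counter and stores the streams in a map keyed by handle. The sketch below shows that registry pattern in isolation. `StreamInfo` and `StreamRegistry` are hypothetical names, and the behaviour of `find` with `remove` set (the entry leaves the map and the caller owns the object) is an assumption, since the body of `find_stream` is not part of this diff. The sketch captures the handle while the lock is held, whereas the code in the diff reads `m_streams_count` after unlocking.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for the Stream struct in the diff.
struct StreamInfo {
    int device;
    int cpus;
};

class StreamRegistry {
public:
    ~StreamRegistry() {
        for (std::map<uint64_t, StreamInfo*>::iterator it = m_streams.begin();
             it != m_streams.end(); ++it) {
            delete it->second;
        }
    }

    // Creates a stream and returns its handle; handles start at 1.
    uint64_t add(int device, int cpus) {
        std::lock_guard<std::mutex> guard(m_lock);
        uint64_t handle = ++m_count;  // captured while the lock is held
        m_streams[handle] = new StreamInfo{device, cpus};
        return handle;
    }

    // Looks up a handle. With remove set, the entry is erased and the
    // caller takes ownership of the returned object (assumed semantics).
    StreamInfo *find(uint64_t handle, bool remove) {
        std::lock_guard<std::mutex> guard(m_lock);
        std::map<uint64_t, StreamInfo*>::iterator it = m_streams.find(handle);
        if (it == m_streams.end()) return nullptr;
        StreamInfo *s = it->second;
        if (remove) m_streams.erase(it);
        return s;
    }

private:
    std::map<uint64_t, StreamInfo*> m_streams;
    uint64_t   m_count = 0;
    std::mutex m_lock;
};
```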
// class representing a single engine // class representing a single engine
struct Engine { struct Engine {
friend void __offload_init_library_once(void); friend void __offload_init_library_once(void);
...@@ -275,9 +420,14 @@ struct Engine { ...@@ -275,9 +420,14 @@ struct Engine {
return m_process; return m_process;
} }
uint64_t get_thread_id(void);
// initialize device // initialize device
void init(void); void init(void);
// unload library
void unload_library(const void *data, const char *name);
// add new library // add new library
void add_lib(const TargetImage &lib) void add_lib(const TargetImage &lib)
{ {
...@@ -288,6 +438,7 @@ struct Engine { ...@@ -288,6 +438,7 @@ struct Engine {
} }
COIRESULT compute( COIRESULT compute(
_Offload_stream stream,
const std::list<COIBUFFER> &buffers, const std::list<COIBUFFER> &buffers,
const void* data, const void* data,
uint16_t data_size, uint16_t data_size,
...@@ -323,36 +474,28 @@ struct Engine { ...@@ -323,36 +474,28 @@ struct Engine {
// Memory association table // Memory association table
// //
PtrData* find_ptr_data(const void *ptr) { PtrData* find_ptr_data(const void *ptr) {
m_ptr_lock.lock(); return m_ptr_set.find_ptr_data(ptr);
PtrSet::iterator res = m_ptr_set.find(PtrData(ptr, 0));
m_ptr_lock.unlock();
if (res == m_ptr_set.end()) {
return 0;
} }
return const_cast<PtrData*>(res.operator->());
PtrData* find_targetptr_data(const void *ptr) {
return m_targetptr_set.find_ptr_data(ptr);
} }
PtrData* insert_ptr_data(const void *ptr, uint64_t len, bool &is_new) { PtrData* insert_ptr_data(const void *ptr, uint64_t len, bool &is_new) {
m_ptr_lock.lock(); return m_ptr_set.insert_ptr_data(ptr, len, is_new);
std::pair<PtrSet::iterator, bool> res =
m_ptr_set.insert(PtrData(ptr, len));
PtrData* ptr_data = const_cast<PtrData*>(res.first.operator->());
m_ptr_lock.unlock();
is_new = res.second;
if (is_new) {
// It's necessary to lock as soon as possible.
// unlock must be done at call site of insert_ptr_data at
// branch for is_new
ptr_data->alloc_ptr_data_lock.lock();
} }
return ptr_data;
PtrData* insert_targetptr_data(const void *ptr, uint64_t len,
bool &is_new) {
return m_targetptr_set.insert_ptr_data(ptr, len, is_new);
} }
void remove_ptr_data(const void *ptr) { void remove_ptr_data(const void *ptr) {
m_ptr_lock.lock(); m_ptr_set.remove_ptr_data(ptr);
m_ptr_set.erase(PtrData(ptr, 0)); }
m_ptr_lock.unlock();
void remove_targetptr_data(const void *ptr) {
m_targetptr_set.remove_ptr_data(ptr);
} }
// //
...@@ -396,7 +539,7 @@ struct Engine { ...@@ -396,7 +539,7 @@ struct Engine {
if (it != m_signal_map.end()) { if (it != m_signal_map.end()) {
desc = it->second; desc = it->second;
if (remove) { if (remove) {
m_signal_map.erase(it); it->second = SIGNAL_IS_REMOVED;
} }
} }
} }
...@@ -405,6 +548,14 @@ struct Engine { ...@@ -405,6 +548,14 @@ struct Engine {
return desc; return desc;
} }
void stream_destroy(_Offload_stream handle);
COIPIPELINE get_pipeline(_Offload_stream stream);
StreamMap get_stream_map() {
return m_stream_map;
}
// stop device process // stop device process
void fini_process(bool verbose); void fini_process(bool verbose);
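Review note: the `find_signal` change earlier in this file stops erasing a consumed signal and instead overwrites its descriptor with `SIGNAL_IS_REMOVED`, so later lookups can tell "never registered" apart from "already consumed". The sketch below isolates that tombstone pattern. `Descriptor`, `kRemoved`, and `SignalTable` are hypothetical names, and the locking that the real code performs around the map is left out.

```cpp
#include <cstdint>
#include <map>

struct Descriptor { int id; };  // hypothetical stand-in for OffloadDescriptor

// Sentinel for a consumed signal, analogous to SIGNAL_IS_REMOVED.
static Descriptor *const kRemoved =
    reinterpret_cast<Descriptor*>(static_cast<intptr_t>(-1));

class SignalTable {
public:
    void add(const void *signal, Descriptor *desc) { m_map[signal] = desc; }

    // Returns nullptr for an unknown signal, kRemoved for a consumed one,
    // or the live descriptor. With remove set, the entry stays in the map
    // but is overwritten by the tombstone.
    Descriptor *find(const void *signal, bool remove) {
        std::map<const void*, Descriptor*>::iterator it = m_map.find(signal);
        if (it == m_map.end()) return nullptr;
        Descriptor *desc = it->second;
        if (remove) it->second = kRemoved;
        return desc;
    }

private:
    std::map<const void*, Descriptor*> m_map;
};
```

A caller must check for the sentinel before dereferencing the result.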
...@@ -417,6 +568,11 @@ private: ...@@ -417,6 +568,11 @@ private:
{} {}
~Engine() { ~Engine() {
for (StreamMap::iterator it = m_stream_map.begin();
it != m_stream_map.end(); it++) {
Stream * stream = it->second;
delete stream;
}
if (m_process != 0) { if (m_process != 0) {
fini_process(false); fini_process(false);
} }
...@@ -469,14 +625,24 @@ private: ...@@ -469,14 +625,24 @@ private:
// List of libraries to be loaded // List of libraries to be loaded
TargetImageList m_images; TargetImageList m_images;
// var table // var tables
PtrSet m_ptr_set; PtrDataTable m_ptr_set;
mutex_t m_ptr_lock; PtrDataTable m_targetptr_set;
// signals // signals
SignalMap m_signal_map; SignalMap m_signal_map;
mutex_t m_signal_lock; mutex_t m_signal_lock;
// streams
StreamMap m_stream_map;
mutex_t m_stream_lock;
int m_num_cores;
int m_num_threads;
std::bitset<COI_MAX_HW_THREADS> m_cpus;
// List of dynamic libraries to be registered
DynLibList m_dyn_libs;
// constants for accessing device function handles // constants for accessing device function handles
enum { enum {
c_func_compute = 0, c_func_compute = 0,
...@@ -487,6 +653,7 @@ private: ...@@ -487,6 +653,7 @@ private:
c_func_init, c_func_init,
c_func_var_table_size, c_func_var_table_size,
c_func_var_table_copy, c_func_var_table_copy,
c_func_set_stream_affinity,
c_funcs_total c_funcs_total
}; };
static const char* m_func_names[c_funcs_total]; static const char* m_func_names[c_funcs_total];
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -146,7 +146,7 @@ void MicEnvVar::add_env_var( ...@@ -146,7 +146,7 @@ void MicEnvVar::add_env_var(
else { else {
card = get_card(card_number); card = get_card(card_number);
if (!card) { if (!card) {
// definition for new card occured // definition for new card occurred
card = new CardEnvVars(card_number); card = new CardEnvVars(card_number);
card_spec_list.push_back(card); card_spec_list.push_back(card);
} }
...@@ -321,7 +321,7 @@ void MicEnvVar::mic_parse_env_var_list( ...@@ -321,7 +321,7 @@ void MicEnvVar::mic_parse_env_var_list(
// Collect all definitions for the card with number "card_num". // Collect all definitions for the card with number "card_num".
// The returned result is vector of string pointers defining one // The returned result is vector of string pointers defining one
// environment variable. The vector is terminated by NULL pointer. // environment variable. The vector is terminated by NULL pointer.
// In the begining of the vector there are env vars defined as // In the beginning of the vector there are env vars defined as
// <mic-prefix>_<card-number>_<var>=<value> // <mic-prefix>_<card-number>_<var>=<value>
// or // or
// <mic-prefix>_<card-number>_ENV=<env-vars> // <mic-prefix>_<card-number>_ENV=<env-vars>
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -32,6 +32,7 @@ ...@@ -32,6 +32,7 @@
#define OFFLOAD_ENV_H_INCLUDED #define OFFLOAD_ENV_H_INCLUDED
#include <list> #include <list>
#include "offload_util.h"
// data structure and routines to parse MIC user environment and pass to MIC // data structure and routines to parse MIC user environment and pass to MIC
...@@ -43,7 +44,7 @@ enum MicEnvVarKind ...@@ -43,7 +44,7 @@ enum MicEnvVarKind
c_mic_card_env // for <mic-prefix>_<card-number>_ENV c_mic_card_env // for <mic-prefix>_<card-number>_ENV
}; };
struct MicEnvVar { struct DLL_LOCAL MicEnvVar {
public: public:
MicEnvVar() : prefix(0) {} MicEnvVar() : prefix(0) {}
~MicEnvVar(); ~MicEnvVar();
......
(The diff for one file is omitted here because it was too large to display.)
/*
Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/*! \file
\brief Iterator of Variable tables list used by the runtime library
*/
#ifndef OFFLOAD_ITERATOR_H_INCLUDED
#define OFFLOAD_ITERATOR_H_INCLUDED
#include <iterator>
#include "offload_table.h"
// The following class is for iteration over var table.
// It was extracted and moved to this offload_iterator.h file from offload_table.h
// to solve the problem with compiling with VS 2010. The problem was in incompatibility
// of STL objects in VS 2010 with ones in later VS versions.
// var table list iterator
class Iterator : public std::iterator<std::input_iterator_tag,
VarTable::Entry> {
public:
Iterator() : m_node(0), m_entry(0) {}
explicit Iterator(TableList<VarTable>::Node *node) {
new_node(node);
}
Iterator& operator++() {
if (m_entry != 0) {
m_entry++;
while (m_entry->name == 0) {
m_entry++;
}
if (m_entry->name == reinterpret_cast<const char*>(-1)) {
new_node(m_node->next);
}
}
return *this;
}
bool operator==(const Iterator &other) const {
return m_entry == other.m_entry;
}
bool operator!=(const Iterator &other) const {
return m_entry != other.m_entry;
}
const VarTable::Entry* operator*() const {
return m_entry;
}
private:
void new_node(TableList<VarTable>::Node *node) {
m_node = node;
m_entry = 0;
while (m_node != 0) {
m_entry = m_node->table.entries;
while (m_entry->name == 0) {
m_entry++;
}
if (m_entry->name != reinterpret_cast<const char*>(-1)) {
break;
}
m_node = m_node->next;
m_entry = 0;
}
}
private:
TableList<VarTable>::Node *m_node;
const VarTable::Entry *m_entry;
};
#endif // OFFLOAD_ITERATOR_H_INCLUDED
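Review note: the `Iterator` class above walks tables in which an entry with a null `name` is a hole to skip and an entry whose `name` equals `-1` marks the end of the table. The sketch below applies the same two-sentinel convention to a single table. `Entry`, `kEnd`, and `collect_names` are hypothetical names, and the chaining across `TableList` nodes is not reproduced.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct Entry {  // hypothetical stand-in for VarTable::Entry
    const char *name;
    void       *addr;
};

// Terminator: a name pointer equal to -1, as in the diff.
static const char *const kEnd =
    reinterpret_cast<const char*>(static_cast<intptr_t>(-1));

// Collects the names in one table, skipping null-name holes and stopping
// at the terminator entry.
inline std::vector<std::string> collect_names(const Entry *e) {
    std::vector<std::string> out;
    for (; e->name != kEnd; ++e) {
        if (e->name != 0) out.push_back(e->name);
    }
    return out;
}
```

Every table must end with the terminator entry, otherwise the walk runs past the array.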
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -34,67 +34,35 @@ ...@@ -34,67 +34,35 @@
#include <myotypes.h> #include <myotypes.h>
#include <myoimpl.h> #include <myoimpl.h>
#include <myo.h> #include <myo.h>
#include "offload.h"
typedef MyoiSharedVarEntry SharedTableEntry;
//typedef MyoiHostSharedFptrEntry FptrTableEntry;
typedef struct {
//! Function Name
const char *funcName;
//! Function Address
void *funcAddr;
//! Local Thunk Address
void *localThunkAddr;
#ifdef TARGET_WINNT
// Dummy to pad up to 32 bytes
void *dummy;
#endif // TARGET_WINNT
} FptrTableEntry;
struct InitTableEntry {
#ifdef TARGET_WINNT
// Dummy to pad up to 16 bytes
// Function Name
const char *funcName;
#endif // TARGET_WINNT
void (*func)(void);
};
#ifdef TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable$a"
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable$z"
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START ".MyoSharedInitTable$a"
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END ".MyoSharedInitTable$z"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable$a"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable$z"
#else // TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START ".MyoSharedInitTable."
#define OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END ".MyoSharedInitTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable."
#endif // TARGET_WINNT
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_INIT_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_END, read, write)
#include "offload.h"
// undefine the following since offload.h defines them to malloc and free if __INTEL_OFFLOAD
// is not defined which is the case when building the offload library
#undef _Offload_shared_malloc
#undef _Offload_shared_free
#undef _Offload_shared_aligned_malloc
#undef _Offload_shared_aligned_free
#include "offload_table.h"
// This function retained for compatibility with 15.0
extern "C" void __offload_myoRegisterTables( extern "C" void __offload_myoRegisterTables(
InitTableEntry *init_table, InitTableEntry *init_table,
SharedTableEntry *shared_table, SharedTableEntry *shared_table,
FptrTableEntry *fptr_table FptrTableEntry *fptr_table
); );
// Process shared variable, shared vtable and function and init routine tables.
// In .dlls/.sos these will be collected together.
// In the main program, all collected tables will be processed.
extern "C" bool __offload_myoProcessTables(
const void* image,
MYOInitTableList::Node *init_table,
MYOVarTableList::Node *shared_table,
MYOVarTableList::Node *shared_vtable,
MYOFuncTableList::Node *fptr_table
);
extern void __offload_myoFini(void); extern void __offload_myoFini(void);
extern bool __offload_myo_init_is_deferred(const void *image);
#endif // OFFLOAD_MYO_HOST_H_INCLUDED #endif // OFFLOAD_MYO_HOST_H_INCLUDED
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -44,7 +44,7 @@ static void CheckResult(const char *func, MyoError error) { ...@@ -44,7 +44,7 @@ static void CheckResult(const char *func, MyoError error) {
} }
} }
static void __offload_myo_shared_table_register(SharedTableEntry *entry) static void __offload_myo_shared_table_process(SharedTableEntry *entry)
{ {
int entries = 0; int entries = 0;
SharedTableEntry *t_start; SharedTableEntry *t_start;
...@@ -68,7 +68,32 @@ static void __offload_myo_shared_table_register(SharedTableEntry *entry) ...@@ -68,7 +68,32 @@ static void __offload_myo_shared_table_register(SharedTableEntry *entry)
} }
} }
static void __offload_myo_fptr_table_register( static void __offload_myo_shared_vtable_process(SharedTableEntry *entry)
{
int entries = 0;
SharedTableEntry *t_start;
OFFLOAD_DEBUG_TRACE(3, "%s(%p)\n", __func__, entry);
t_start = entry;
while (t_start->varName != 0) {
OFFLOAD_DEBUG_TRACE_1(4, 0, c_offload_mic_myo_shared,
"myo shared vtable entry name"
" = \"%s\" addr = %p\n",
t_start->varName, t_start->sharedAddr);
t_start++;
entries++;
}
if (entries > 0) {
OFFLOAD_DEBUG_TRACE(3, "myoiMicVarTableRegister(%p, %d)\n", entry,
entries);
CheckResult("myoiMicVarTableRegister",
myoiMicVarTableRegister(entry, entries));
}
}
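Note on the hunk above: the new `__offload_myo_shared_vtable_process` walks a table terminated by an entry whose `varName` is null, counts the entries, and registers the table only if it is non-empty. Below is a minimal sketch of that pattern. The type `DemoVarEntry` and the hook `demo_register` are hypothetical stand-ins, not the MYO types or `myoiMicVarTableRegister` itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a shared-variable table entry; the real
// SharedTableEntry is a typedef of a MYO type and is not reproduced here.
struct DemoVarEntry {
    const char *varName;    // null in the terminating sentinel entry
    void       *sharedAddr;
};

// Count entries up to, but not including, the null-named sentinel.
static int count_entries(const DemoVarEntry *entry)
{
    int entries = 0;
    for (const DemoVarEntry *t = entry; t->varName != 0; ++t) {
        ++entries;
    }
    return entries;
}

// Hypothetical registration hook standing in for the MYO registration call.
static int g_registered = 0;
static void demo_register(const DemoVarEntry *, int entries)
{
    g_registered += entries;
}

// Same control flow as the diff: count first, register only if non-empty.
static int process_table(const DemoVarEntry *entry)
{
    int entries = count_entries(entry);
    if (entries > 0) {
        demo_register(entry, entries);
    }
    return entries;
}
```

Passing a table that holds only the sentinel results in no registration call, which matches the `entries > 0` guard in the hunk.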
static void __offload_myo_fptr_table_process(
FptrTableEntry *entry FptrTableEntry *entry
) )
{ {
...@@ -94,9 +119,22 @@ static void __offload_myo_fptr_table_register( ...@@ -94,9 +119,22 @@ static void __offload_myo_fptr_table_register(
} }
} }
void __offload_myo_shared_init_table_process(InitTableEntry* entry)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%p)\n", __func__, entry);
for (; entry->func != 0; entry++) {
// Invoke the function to init the shared memory
OFFLOAD_DEBUG_TRACE(3, "Invoked a shared init function @%p\n",
(void *)(entry->func));
entry->func();
}
}
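Note on the hunk above: `__offload_myo_shared_init_table_process` calls each compiler-generated initializer in turn until it reaches an entry whose `func` is null. A minimal sketch of that loop follows. `DemoInitEntry`, `init_a`, and `init_b` are hypothetical names used only for illustration, and only the `func` member seen in the diff is modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for InitTableEntry; only the `func` member used by
// the loop in the diff is modelled.
struct DemoInitEntry {
    void (*func)(void);
};

static int g_init_calls = 0;
static void init_a(void) { ++g_init_calls; }
static void init_b(void) { ++g_init_calls; }

// Invoke every initializer until the null-function sentinel is reached,
// and report how many were invoked.
static int run_init_table(const DemoInitEntry *entry)
{
    int invoked = 0;
    for (; entry->func != 0; ++entry) {
        entry->func();
        ++invoked;
    }
    return invoked;
}
```

As the header comment elsewhere in this commit says, such initializers can only run after shared memory allocation has been done, so a caller would need to order the call accordingly.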
extern "C" void __offload_myoAcquire(void) extern "C" void __offload_myoAcquire(void)
{ {
OFFLOAD_DEBUG_TRACE(3, "%s\n", __func__); OFFLOAD_DEBUG_TRACE(3, "%s\n", __func__);
CheckResult("myoAcquire", myoAcquire()); CheckResult("myoAcquire", myoAcquire());
} }
...@@ -162,8 +200,35 @@ extern "C" void __offload_myoRegisterTables( ...@@ -162,8 +200,35 @@ extern "C" void __offload_myoRegisterTables(
return; return;
} }
__offload_myo_shared_table_register(shared_table); __offload_myo_shared_table_process(shared_table);
__offload_myo_fptr_table_register(fptr_table); __offload_myo_fptr_table_process(fptr_table);
}
extern "C" void __offload_myoProcessTables(
InitTableEntry* init_table,
SharedTableEntry *shared_table,
SharedTableEntry *shared_vtable,
FptrTableEntry *fptr_table
)
{
OFFLOAD_DEBUG_TRACE(3, "%s\n", __func__);
// one time registration of Intel(R) Cilk(TM) language entries
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
pthread_once(&once_control, __offload_myo_once_init);
// register module's tables
// check slot-1 of the function table because
// slot-0 is predefined with --vtable_initializer--
if (shared_table->varName == 0 &&
shared_vtable->varName == 0 &&
fptr_table[1].funcName == 0) {
return;
}
__offload_myo_shared_table_process(shared_table);
__offload_myo_shared_vtable_process(shared_vtable);
__offload_myo_fptr_table_process(fptr_table);
} }
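Note on the hunk above: the early return in `__offload_myoProcessTables` inspects slot 1 of the function table rather than slot 0, because slot 0 is predefined with `--vtable_initializer--`. The sketch below isolates that emptiness check. The struct and function names are hypothetical, and it assumes, as the diff does, that the function table always holds at least two entries.

```cpp
#include <cassert>

// Hypothetical, reduced entry types: only the name fields tested in the
// diff are modelled.
struct DemoSharedEntry { const char *varName; };
struct DemoFptrEntry   { const char *funcName; };

// True when a module contributes nothing to register. Slot 0 of the
// function table is always occupied by the predefined initializer entry,
// so the first user-supplied function, if any, is in slot 1.
static bool tables_are_empty(const DemoSharedEntry *shared_table,
                             const DemoSharedEntry *shared_vtable,
                             const DemoFptrEntry   *fptr_table)
{
    return shared_table->varName == 0 &&
           shared_vtable->varName == 0 &&
           fptr_table[1].funcName == 0;
}
```

Testing `fptr_table[0]` instead would never report an empty module, since the predefined entry is always present.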
extern "C" void* _Offload_shared_malloc(size_t size) extern "C" void* _Offload_shared_malloc(size_t size)
...@@ -190,6 +255,46 @@ extern "C" void _Offload_shared_aligned_free(void *ptr) ...@@ -190,6 +255,46 @@ extern "C" void _Offload_shared_aligned_free(void *ptr)
myoSharedAlignedFree(ptr); myoSharedAlignedFree(ptr);
} }
extern "C" void* _Offload_shared_aligned_arena_malloc(
MyoArena arena,
size_t size,
size_t align
)
{
OFFLOAD_DEBUG_TRACE(
3, "%s(%u, %lld, %lld)\n", __func__, arena, size, align);
return myoArenaAlignedMalloc(arena, size, align);
}
extern "C" void _Offload_shared_aligned_arena_free(
MyoArena arena,
void *ptr
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u, %p)\n", __func__, arena, ptr);
myoArenaAlignedFree(arena, ptr);
}
extern "C" void _Offload_shared_arena_acquire(
MyoArena arena
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u)\n", __func__, arena);
myoArenaAcquire(arena);
}
extern "C" void _Offload_shared_arena_release(
MyoArena arena
)
{
OFFLOAD_DEBUG_TRACE(3, "%s(%u)\n", __func__, arena);
myoArenaRelease(arena);
}
// temporary workaround for blocking behavior of myoiLibInit/Fini calls // temporary workaround for blocking behavior of myoiLibInit/Fini calls
extern "C" void __offload_myoLibInit() extern "C" void __offload_myoLibInit()
{ {
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -31,42 +31,38 @@ ...@@ -31,42 +31,38 @@
#ifndef OFFLOAD_MYO_TARGET_H_INCLUDED #ifndef OFFLOAD_MYO_TARGET_H_INCLUDED
#define OFFLOAD_MYO_TARGET_H_INCLUDED #define OFFLOAD_MYO_TARGET_H_INCLUDED
#include <myotypes.h>
#include <myoimpl.h>
#include <myo.h>
#include "offload.h"
typedef MyoiSharedVarEntry SharedTableEntry;
typedef MyoiTargetSharedFptrEntry FptrTableEntry;
#ifdef TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable$a"
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable$z"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable$a"
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable$z"
#else // TARGET_WINNT
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_START ".MyoSharedTable."
#define OFFLOAD_MYO_SHARED_TABLE_SECTION_END ".MyoSharedTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_START ".MyoFptrTable."
#define OFFLOAD_MYO_FPTR_TABLE_SECTION_END ".MyoFptrTable."
#endif // TARGET_WINNT
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_SHARED_TABLE_SECTION_END, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_START, read, write)
#pragma section(OFFLOAD_MYO_FPTR_TABLE_SECTION_END, read, write)
#include "offload.h"
// undefine the following since offload.h defines them to malloc and free if __INTEL_OFFLOAD
// is not defined which is the case when building the offload library
#undef _Offload_shared_malloc
#undef _Offload_shared_free
#undef _Offload_shared_aligned_malloc
#undef _Offload_shared_aligned_free
#include "offload_table.h"
// This function retained for compatibility with 15.0
extern "C" void __offload_myoRegisterTables( extern "C" void __offload_myoRegisterTables(
SharedTableEntry *shared_table, SharedTableEntry *shared_table,
FptrTableEntry *fptr_table FptrTableEntry *fptr_table
); );
// Process shared variable, shared vtable and function and init routine tables.
// On the target side the contents of the tables are registered with MYO.
extern "C" void __offload_myoProcessTables(
InitTableEntry* init_table,
SharedTableEntry *shared_table,
SharedTableEntry *shared_vtable,
FptrTableEntry *fptr_table
);
extern "C" void __offload_myoAcquire(void); extern "C" void __offload_myoAcquire(void);
extern "C" void __offload_myoRelease(void); extern "C" void __offload_myoRelease(void);
// Call the compiler-generated routines for initializing shared variables.
// This can only be done after shared memory allocation has been done.
extern void __offload_myo_shared_init_table_process(InitTableEntry* entry);
// temporary workaround for blocking behavior for myoiLibInit/Fini calls // temporary workaround for blocking behavior for myoiLibInit/Fini calls
extern "C" void __offload_myoLibInit(); extern "C" void __offload_myoLibInit();
extern "C" void __offload_myoLibFini(); extern "C" void __offload_myoLibFini();
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -86,7 +86,7 @@ static int omp_get_int_from_host( ...@@ -86,7 +86,7 @@ static int omp_get_int_from_host(
return setting; return setting;
} }
void omp_set_num_threads_lrb( DLL_LOCAL void omp_set_num_threads_lrb(
void *ofld void *ofld
) )
{ {
...@@ -96,7 +96,7 @@ void omp_set_num_threads_lrb( ...@@ -96,7 +96,7 @@ void omp_set_num_threads_lrb(
omp_set_num_threads(num_threads); omp_set_num_threads(num_threads);
} }
void omp_get_max_threads_lrb( DLL_LOCAL void omp_get_max_threads_lrb(
void *ofld void *ofld
) )
{ {
...@@ -106,7 +106,7 @@ void omp_get_max_threads_lrb( ...@@ -106,7 +106,7 @@ void omp_get_max_threads_lrb(
omp_send_int_to_host(ofld, num_threads); omp_send_int_to_host(ofld, num_threads);
} }
void omp_get_num_procs_lrb( DLL_LOCAL void omp_get_num_procs_lrb(
void *ofld void *ofld
) )
{ {
...@@ -116,7 +116,7 @@ void omp_get_num_procs_lrb( ...@@ -116,7 +116,7 @@ void omp_get_num_procs_lrb(
omp_send_int_to_host(ofld, num_procs); omp_send_int_to_host(ofld, num_procs);
} }
void omp_set_dynamic_lrb( DLL_LOCAL void omp_set_dynamic_lrb(
void *ofld void *ofld
) )
{ {
...@@ -126,7 +126,7 @@ void omp_set_dynamic_lrb( ...@@ -126,7 +126,7 @@ void omp_set_dynamic_lrb(
omp_set_dynamic(dynamic); omp_set_dynamic(dynamic);
} }
void omp_get_dynamic_lrb( DLL_LOCAL void omp_get_dynamic_lrb(
void *ofld void *ofld
) )
{ {
...@@ -136,7 +136,7 @@ void omp_get_dynamic_lrb( ...@@ -136,7 +136,7 @@ void omp_get_dynamic_lrb(
omp_send_int_to_host(ofld, dynamic); omp_send_int_to_host(ofld, dynamic);
} }
void omp_set_nested_lrb( DLL_LOCAL void omp_set_nested_lrb(
void *ofld void *ofld
) )
{ {
...@@ -146,7 +146,7 @@ void omp_set_nested_lrb( ...@@ -146,7 +146,7 @@ void omp_set_nested_lrb(
omp_set_nested(nested); omp_set_nested(nested);
} }
void omp_get_nested_lrb( DLL_LOCAL void omp_get_nested_lrb(
void *ofld void *ofld
) )
{ {
...@@ -156,7 +156,7 @@ void omp_get_nested_lrb( ...@@ -156,7 +156,7 @@ void omp_get_nested_lrb(
omp_send_int_to_host(ofld, nested); omp_send_int_to_host(ofld, nested);
} }
void omp_set_schedule_lrb( DLL_LOCAL void omp_set_schedule_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -180,7 +180,7 @@ void omp_set_schedule_lrb( ...@@ -180,7 +180,7 @@ void omp_set_schedule_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_get_schedule_lrb( DLL_LOCAL void omp_get_schedule_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -206,7 +206,7 @@ void omp_get_schedule_lrb( ...@@ -206,7 +206,7 @@ void omp_get_schedule_lrb(
// lock API functions // lock API functions
void omp_init_lock_lrb( DLL_LOCAL void omp_init_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -224,7 +224,7 @@ void omp_init_lock_lrb( ...@@ -224,7 +224,7 @@ void omp_init_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_destroy_lock_lrb( DLL_LOCAL void omp_destroy_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -242,7 +242,7 @@ void omp_destroy_lock_lrb( ...@@ -242,7 +242,7 @@ void omp_destroy_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_set_lock_lrb( DLL_LOCAL void omp_set_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -260,7 +260,7 @@ void omp_set_lock_lrb( ...@@ -260,7 +260,7 @@ void omp_set_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_unset_lock_lrb( DLL_LOCAL void omp_unset_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -278,7 +278,7 @@ void omp_unset_lock_lrb( ...@@ -278,7 +278,7 @@ void omp_unset_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_test_lock_lrb( DLL_LOCAL void omp_test_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -304,7 +304,7 @@ void omp_test_lock_lrb( ...@@ -304,7 +304,7 @@ void omp_test_lock_lrb(
// nested lock API functions // nested lock API functions
void omp_init_nest_lock_lrb( DLL_LOCAL void omp_init_nest_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -322,7 +322,7 @@ void omp_init_nest_lock_lrb( ...@@ -322,7 +322,7 @@ void omp_init_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_destroy_nest_lock_lrb( DLL_LOCAL void omp_destroy_nest_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -340,7 +340,7 @@ void omp_destroy_nest_lock_lrb( ...@@ -340,7 +340,7 @@ void omp_destroy_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_set_nest_lock_lrb( DLL_LOCAL void omp_set_nest_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -358,7 +358,7 @@ void omp_set_nest_lock_lrb( ...@@ -358,7 +358,7 @@ void omp_set_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_unset_nest_lock_lrb( DLL_LOCAL void omp_unset_nest_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
...@@ -376,7 +376,7 @@ void omp_unset_nest_lock_lrb( ...@@ -376,7 +376,7 @@ void omp_unset_nest_lock_lrb(
OFFLOAD_TARGET_LEAVE(ofld); OFFLOAD_TARGET_LEAVE(ofld);
} }
void omp_test_nest_lock_lrb( DLL_LOCAL void omp_test_nest_lock_lrb(
void *ofld_ void *ofld_
) )
{ {
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -36,7 +36,7 @@ ...@@ -36,7 +36,7 @@
namespace ORSL { namespace ORSL {
static bool is_enabled = false; static bool is_enabled = false;
static const ORSLTag my_tag = "Offload"; static const ORSLTag my_tag = (const ORSLTag) "Offload";
void init() void init()
{ {
......
/* /*
Copyright (c) 2014 Intel Corporation. All Rights Reserved. Copyright (c) 2014-2015 Intel Corporation. All Rights Reserved.
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions modification, are permitted provided that the following conditions
...@@ -28,17 +28,19 @@ ...@@ -28,17 +28,19 @@
*/ */
#include "offload_util.h"
#ifndef OFFLOAD_ORSL_H_INCLUDED #ifndef OFFLOAD_ORSL_H_INCLUDED
#define OFFLOAD_ORSL_H_INCLUDED #define OFFLOAD_ORSL_H_INCLUDED
// ORSL interface // ORSL interface
namespace ORSL { namespace ORSL {
extern void init(); DLL_LOCAL extern void init();
extern bool reserve(int device); DLL_LOCAL extern bool reserve(int device);
extern bool try_reserve(int device); DLL_LOCAL extern bool try_reserve(int device);
extern void release(int device); DLL_LOCAL extern void release(int device);
} // namespace ORSL } // namespace ORSL
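Note on the `DLL_LOCAL` additions above: the macro's definition is not part of this section. The header now includes `offload_util.h`, presumably for that definition. A common way to keep library-internal functions out of a shared object's exported symbol table is GCC's hidden visibility attribute. The sketch below assumes `DLL_LOCAL` works along those lines. `DEMO_LOCAL` and the `demo` namespace are hypothetical names.

```cpp
#include <cassert>

// Assumption: a macro like DLL_LOCAL expands to hidden visibility on
// GCC-compatible compilers and to nothing elsewhere. DEMO_LOCAL is a
// hypothetical name, not the macro from offload_util.h.
#if defined(__GNUC__)
#define DEMO_LOCAL __attribute__((visibility("hidden")))
#else
#define DEMO_LOCAL
#endif

namespace demo {
    // Declared in the same style as the ORSL declarations in the diff.
    DEMO_LOCAL extern bool reserve(int device);
} // namespace demo

// Code inside the same library calls the function normally; the attribute
// only affects whether the symbol is exported from the shared object.
bool demo::reserve(int device)
{
    return device >= 0;
}
```

When built into a shared library, a symbol marked this way is still callable from the library's other translation units but is not available for other modules to link against.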
......