Unverified Commit 708fd9a9 by MORITA Kazutaka Committed by GitHub

[DOCS] Migrate HLS documents from md to rst (#5419)

parent 1acad98e
<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements. See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership. The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License. You may obtain a copy of the License at -->
<!--- http://www.apache.org/licenses/LICENSE-2.0 -->
<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied. See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->
AOCL Backend Example
====================
TVM supports the Intel FPGA SDK for OpenCL, also known as AOCL. Here is a tutorial on how to use TVM with AOCL.

***Note***: This feature is still experimental. We cannot use AOCL to deploy end-to-end neural networks for now. In addition, we have only tested compilation in the emulation mode of AOCL.

We use two Python scripts for this tutorial.

- build.py - a script to synthesize an FPGA bitstream.
```python
import tvm
import tvm.contrib.cc
from tvm import te

tgt_host = "llvm"
tgt = "aocl_sw_emu"

n = te.var("n")
A = te.placeholder((n,), name='A')
B = te.placeholder((n,), name='B')
C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")

s = te.create_schedule(C.op)
# Split off a single outer iteration and bind it to the "pipeline"
# thread axis so the kernel is emitted as one pipelined loop.
px, x = s[C].split(C.op.axis[0], nparts=1)
s[C].bind(px, te.thread_axis("pipeline"))

fadd = tvm.build(s, [A, B, C], tgt, target_host=tgt_host, name="myadd")

# Save host object code, the device bitstream, and a shared library wrapper.
fadd.save("myadd.o")
fadd.imported_modules[0].save("myadd.aocx")
tvm.contrib.cc.create_shared("myadd.so", ["myadd.o"])
```
- run.py - a script to use the FPGA as an accelerator.
```python
import numpy as np
import tvm
import tvm.testing

tgt = "aocl_sw_emu"

fadd = tvm.runtime.load_module("myadd.so")
fadd_dev = tvm.runtime.load_module("myadd.aocx")
fadd.import_module(fadd_dev)

ctx = tvm.context(tgt, 0)

n = 1024
a = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
b = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
c = tvm.nd.array(np.zeros(n, dtype="float32"), ctx)

fadd(a, b, c)
tvm.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())
```
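The final assertion in run.py is an element-wise tolerance check of the device result against a NumPy reference. Stripped of the TVM parts, the check amounts to the following sketch (NumPy only; the seed is arbitrary):

```python
import numpy as np

n = 1024
rng = np.random.default_rng(0)
a = rng.random(n, dtype=np.float32)
b = rng.random(n, dtype=np.float32)

# What the "myadd" kernel computes on the device: c[i] = a[i] + b[i].
c = a + b

# assert_allclose passes when |actual - desired| <= atol + rtol * |desired|
# holds element-wise, which an exact float32 addition trivially satisfies.
np.testing.assert_allclose(c, a + b, rtol=1e-7)
```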
Setup
-----
- Install AOCL 17.1 on Ubuntu 16.04.4 LTS.
- Install BSP for your FPGA device.
- Install FPGA device driver.
- Create an ICD file at `/etc/OpenCL/vendors/Altera.icd` so that the OpenCL platform can be found. The file contains a single line, the path to the AOCL runtime library:
```
/opt/intelFPGA/17.1/hld/linux64/lib/libalteracl.so
```
- Create an FCD file, for example at `/opt/Intel/OpenCL/Boards/s5_ref.fcd`, so that your FPGA device can be found. The file contains a single line, the path to the board MMD library:
```
/opt/intelFPGA/17.1/hld/board/s5_ref/linux64/lib/libaltera_s5_ref_mmd.so
```
- Setup TVM with AOCL and OpenCL enabled.
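This step toggles two switches in TVM's cmake configuration. A minimal sketch of the relevant lines in `config.cmake` (copied from `cmake/config.cmake` into the build directory, assuming a standard cmake build of TVM):

```cmake
# Enable the OpenCL runtime and the AOCL codegen backend.
set(USE_OPENCL ON)
set(USE_AOCL ON)
```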
Emulation
---------
- Run software emulation
```bash
export CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA=1
python build.py
python run.py
```
- Run on FPGA devices (not tested)
  - Change the `tgt` value to "aocl -device=s5_ref" in build.py and run.py.
```bash
unset CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA
python build.py
python run.py
```
.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.
HLS Backend Example
===================

TVM supports Xilinx FPGA boards with SDAccel. Here is a tutorial on how to deploy TVM to an AWS F1 FPGA instance.

.. note::

    This feature is still experimental. We cannot use SDAccel to deploy end-to-end neural networks for now.

We use two Python scripts for this tutorial.

- build.py - a script to synthesize an FPGA bitstream.

  .. code:: python

      import tvm
      import tvm.contrib.cc
      from tvm import te

      tgt_host = "llvm"
      tgt = "sdaccel"

      n = te.var("n")
      A = te.placeholder((n,), name='A')
      B = te.placeholder((n,), name='B')
      C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")

      s = te.create_schedule(C.op)
      # Split off a single outer iteration and bind it to the "pipeline"
      # thread axis so the kernel is emitted as one pipelined loop.
      px, x = s[C].split(C.op.axis[0], nparts=1)
      s[C].bind(px, te.thread_axis("pipeline"))

      fadd = tvm.build(s, [A, B, C], tgt, target_host=tgt_host, name="myadd")

      # Save host object code, the device bitstream, and a shared library wrapper.
      fadd.save("myadd.o")
      fadd.imported_modules[0].save("myadd.xclbin")
      tvm.contrib.cc.create_shared("myadd.so", ["myadd.o"])

- run.py - a script to use the FPGA as an accelerator.

  .. code:: python

      import os

      import numpy as np
      import tvm
      import tvm.testing

      tgt = "sdaccel"

      fadd = tvm.runtime.load_module("myadd.so")
      if os.environ.get("XCL_EMULATION_MODE"):
          fadd_dev = tvm.runtime.load_module("myadd.xclbin")
      else:
          fadd_dev = tvm.runtime.load_module("myadd.awsxclbin")
      fadd.import_module(fadd_dev)

      ctx = tvm.context(tgt, 0)

      n = 1024
      a = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
      b = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
      c = tvm.nd.array(np.zeros(n, dtype="float32"), ctx)

      fadd(a, b, c)
      tvm.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())
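run.py chooses between the emulation bitstream (``.xclbin``) and the AWS AFI container (``.awsxclbin``) via the ``XCL_EMULATION_MODE`` environment variable. The selection logic in isolation (standard library only; ``pick_bitstream`` is a hypothetical helper name, not TVM API):

```python
import os

def pick_bitstream(env=None):
    """Return the device binary run.py would load under the given environment."""
    env = os.environ if env is None else env
    # The sw_emu / hw_emu flows set XCL_EMULATION_MODE; the real F1 flow
    # leaves it unset and uses the AWS-generated .awsxclbin instead.
    if env.get("XCL_EMULATION_MODE"):
        return "myadd.xclbin"
    return "myadd.awsxclbin"

print(pick_bitstream({"XCL_EMULATION_MODE": "1"}))  # myadd.xclbin
print(pick_bitstream({}))                           # myadd.awsxclbin
```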
Setup
-----

- Launch an instance using the FPGA Developer AMI. We don't need an F1 instance for emulation and synthesis, so it is recommended to use a lower-cost instance for them.
- Setup the AWS FPGA development kit.

  .. code:: bash

      git clone https://github.com/aws/aws-fpga.git
      cd aws-fpga
      source sdaccel_setup.sh
      source ${XILINX_SDX}/settings64.sh

- Setup TVM with OpenCL enabled.
Emulation
---------

- Create emconfig.json for emulation.

  .. code:: bash

      emconfigutil --platform ${AWS_PLATFORM} --nd 1

- Copy emconfig.json to the Python binary directory, because the current Xilinx toolkit assumes that the host binary and the emconfig.json file are in the same path.

  .. code:: bash

      cp emconfig.json $(dirname $(which python))

- Run software emulation.

  .. code:: bash

      export XCL_EMULATION_MODE=1
      export XCL_TARGET=sw_emu

      python build.py
      python run.py

- Run hardware emulation.

  .. code:: bash

      export XCL_EMULATION_MODE=1
      export XCL_TARGET=hw_emu

      python build.py
      python run.py
Synthesis
---------

- Run synthesis with the following script.

  .. code:: bash

      unset XCL_EMULATION_MODE
      export XCL_TARGET=hw

      python build.py

- Create an AWS FPGA image and upload it to AWS S3.

  .. code:: bash

      ${SDACCEL_DIR}/tools/create_sdaccel_afi.sh \
          -xclbin=myadd.xclbin -o=myadd \
          -s3_bucket=<bucket-name> -s3_dcp_key=<dcp-folder-name> \
          -s3_logs_key=<logs-folder-name>

  This also generates an awsxclbin file, which is necessary to use the AWS FPGA image on F1 instances.
Run
---

- Launch an Amazon EC2 F1 instance.
- Copy ``myadd.so``, ``myadd.awsxclbin``, and ``run.py`` to the F1 instance.
- Setup the AWS FPGA development kit.

  .. code:: bash

      git clone https://github.com/aws/aws-fpga.git
      cd aws-fpga
      source sdaccel_setup.sh

- Setup TVM with OpenCL enabled.
- Become root and setup environment variables.

  .. code:: bash

      sudo sh
      source ${INSTALL_ROOT}/setup.sh

- Run

  .. code:: bash

      python run.py
...target device without relying on RPC. See the following resources on how to do so.

   cpp_deploy
   android
   integrate
   hls