HLS Backend Example
===================

TVM supports Xilinx FPGA boards through SDAccel.  This tutorial shows how to deploy TVM to an AWS F1 FPGA instance.

***Note***: This feature is still experimental.  SDAccel cannot yet be used to deploy an end-to-end neural network.

This tutorial uses two Python scripts.

- build.py - a script to synthesize FPGA bitstream.
```python
import tvm

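# Host code is compiled with LLVM; the device kernel targets SDAccel.
tgt_host="llvm"
tgt="sdaccel"

# Element-wise vector addition over a symbolic length n.
n = tvm.var("n")
A = tvm.placeholder((n,), name='A')
B = tvm.placeholder((n,), name='B')
C = tvm.compute(A.shape, lambda i: A[i] + B[i], name="C")

s = tvm.create_schedule(C.op)
# Split with nparts=1 so the outer axis has extent 1, then bind it to "pipeline".
px, x = s[C].split(C.op.axis[0], nparts=1)

s[C].bind(px, tvm.thread_axis("pipeline"))

fadd = tvm.build(s, [A, B, C], tgt, target_host=tgt_host, name="myadd")

# Save the host object file and the device binary (xclbin) separately.
fadd.save("myadd.o")
fadd.imported_modules[0].save("myadd.xclbin")

# Link the host object into the shared library that run.py loads.
tvm.contrib.cc.create_shared("myadd.so", ["myadd.o"])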
```

- run.py - a script to use FPGA as an accelerator.
```python
import tvm
import numpy as np
import os

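tgt="sdaccel"

# Load the host library, then attach the device binary:
# the plain xclbin under emulation, the awsxclbin otherwise.
fadd = tvm.module.load("myadd.so")
if os.environ.get("XCL_EMULATION_MODE"):
    fadd_dev = tvm.module.load("myadd.xclbin")
else:
    fadd_dev = tvm.module.load("myadd.awsxclbin")
fadd.import_module(fadd_dev)

ctx = tvm.context(tgt, 0)

# Allocate random inputs and a zeroed output on the device.
n = 1024
a = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
b = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
c = tvm.nd.array(np.zeros(n, dtype="float32"), ctx)

# Run the kernel; the result is checked against NumPy on the host.
fadd(a, b, c)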
tvm.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())
```
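
The only branch in `run.py` is the choice of device binary, which depends on whether `XCL_EMULATION_MODE` is set. As a minimal sketch, that choice can be isolated in a small function and checked without TVM or an FPGA toolchain. The name `device_binary_name` and its parameters are hypothetical and not part of TVM.

```python
import os


def device_binary_name(env=None, stem="myadd"):
    """Return the device binary file name that run.py would load.

    Hypothetical helper: it only mirrors the if/else in run.py.
    """
    if env is None:
        env = os.environ
    # Any non-empty value of XCL_EMULATION_MODE selects the emulation binary.
    if env.get("XCL_EMULATION_MODE"):
        return stem + ".xclbin"
    # Otherwise run.py expects the AWS-specific binary.
    return stem + ".awsxclbin"
```

Passing a plain dictionary as `env` keeps the function independent of the real process environment, which makes the two cases easy to exercise by hand.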

Setup
-----

- Launch an instance using the FPGA Developer AMI.  Emulation and synthesis do not require an F1 instance, so a lower-cost instance type is recommended for those steps.

- Set up the AWS FPGA development kit.
```bash
git clone https://github.com/aws/aws-fpga.git
cd aws-fpga
source sdaccel_setup.sh
source ${XILINX_SDX}/settings64.sh
```

- Set up TVM with OpenCL enabled.
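
Later commands in this tutorial reference the environment variables `XILINX_SDX`, `AWS_PLATFORM`, and `SDACCEL_DIR`. A quick check that they are set after the setup steps might look like the sketch below. It assumes these three are the ones that matter for this tutorial, and `missing_env_vars` is a hypothetical helper name.

```python
import os

# Variables referenced by commands elsewhere in this tutorial (an assumption
# about what needs checking, not an exhaustive list).
REQUIRED_VARS = ("XILINX_SDX", "AWS_PLATFORM", "SDACCEL_DIR")


def missing_env_vars(env=None, names=REQUIRED_VARS):
    """Return the names in `names` that are unset or empty in `env`."""
    if env is None:
        env = os.environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("All expected variables are set.")
```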

Emulation
---------

- Create emconfig.json for emulation.
```bash
emconfigutil --platform ${AWS_PLATFORM} --nd 1
```

- Copy `emconfig.json` to the directory that contains the Python binary.  This is necessary because the current Xilinx toolkit assumes that the host binary and `emconfig.json` are in the same directory.
```bash
cp emconfig.json $(dirname $(which python))
```

- Run software emulation
```bash
export XCL_EMULATION_MODE=1
export XCL_TARGET=sw_emu

python build.py
python run.py
```

- Run hardware emulation
```bash
export XCL_EMULATION_MODE=1
export XCL_TARGET=hw_emu

python build.py
python run.py
```
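
The two emulation runs above differ only in the value of `XCL_TARGET`. As a rough sketch, they can be driven from a single Python script that sets the two variables and runs `build.py` and `run.py` in order. The function names `emulation_env` and `run_emulation` are hypothetical, and the sketch assumes `emconfig.json` has already been copied as described above.

```python
import os
import subprocess
import sys

# The two emulation targets used in this section.
EMULATION_TARGETS = ("sw_emu", "hw_emu")


def emulation_env(target, base=None):
    """Return a copy of `base` with the emulation variables set."""
    if target not in EMULATION_TARGETS:
        raise ValueError("unknown emulation target: %r" % (target,))
    env = dict(os.environ if base is None else base)
    env["XCL_EMULATION_MODE"] = "1"
    env["XCL_TARGET"] = target
    return env


def run_emulation(target, scripts=("build.py", "run.py")):
    """Run the tutorial scripts in order under the given emulation target."""
    env = emulation_env(target)
    for script in scripts:
        # check_call raises CalledProcessError if a script exits non-zero.
        subprocess.check_call([sys.executable, script], env=env)
```

Calling `run_emulation("sw_emu")` followed by `run_emulation("hw_emu")` corresponds to the two shell snippets. Building a fresh environment dictionary for each run avoids leaving `XCL_EMULATION_MODE` set in the calling process.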


Synthesis
---------

- Run synthesis with the following script.

```bash
unset XCL_EMULATION_MODE
export XCL_TARGET=hw

python build.py
```

- Create an AWS FPGA image and upload it to AWS S3.
```bash
${SDACCEL_DIR}/tools/create_sdaccel_afi.sh -xclbin=myadd.xclbin -o=myadd \
    -s3_bucket=<bucket-name> -s3_dcp_key=<dcp-folder-name> -s3_logs_key=<logs-folder-name>
```
This step also generates an awsxclbin file (`myadd.awsxclbin` here), which is required to use the AWS FPGA image on F1 instances.
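
When the image creation step is scripted, it can help to assemble the `create_sdaccel_afi.sh` command line in one place. The sketch below only builds the argument list from the flags shown above and does not run anything. `create_afi_command` is a hypothetical helper name, and the bucket and key values remain placeholders that you must supply.

```python
import os


def create_afi_command(sdaccel_dir, xclbin, output,
                       s3_bucket, s3_dcp_key, s3_logs_key):
    """Build the argument list for create_sdaccel_afi.sh.

    Hypothetical helper: it uses only the flags shown in this tutorial.
    """
    script = os.path.join(sdaccel_dir, "tools", "create_sdaccel_afi.sh")
    return [
        script,
        "-xclbin=" + xclbin,
        "-o=" + output,
        "-s3_bucket=" + s3_bucket,
        "-s3_dcp_key=" + s3_dcp_key,
        "-s3_logs_key=" + s3_logs_key,
    ]
```

The resulting list can be passed to `subprocess.check_call`, with `sdaccel_dir` taken from the `SDACCEL_DIR` environment variable.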

Run
---

- Launch an Amazon EC2 F1 instance.

- Copy `myadd.so`, `myadd.awsxclbin`, and `run.py` to the F1 instance.

- Set up the AWS FPGA development kit.
```bash
git clone https://github.com/aws/aws-fpga.git
cd aws-fpga
source sdaccel_setup.sh
```

- Set up TVM with OpenCL enabled.

- Become root and set up the environment variables.
```bash
sudo sh
source ${INSTALL_ROOT}/setup.sh
```

- Run
```bash
python run.py
```
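
Before running on the F1 instance, it may be worth confirming that the three files listed in the copy step are present in the working directory. A minimal sketch, with `missing_artifacts` as a hypothetical helper name:

```python
import os

# Files this tutorial copies to the F1 instance.
RUN_ARTIFACTS = ("myadd.so", "myadd.awsxclbin", "run.py")


def missing_artifacts(directory=".", names=RUN_ARTIFACTS):
    """Return the expected files that are not present in `directory`."""
    return [
        name for name in names
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing files: " + ", ".join(missing))
    else:
        print("All files needed by run.py are present.")
```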