- 12 Apr, 2020 2 commits
* [Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign
* Lint
Junru Shao committed
Zhi committed
- 11 Apr, 2020 1 commit
* [LLVM] Fix generation of LLVM intrinsics. The type list in the call to llvm::Intrinsic::getDeclaration is not the intrinsic's signature; it is the list of overloaded types. Without this fix, the updated unit test would cause the following error:
  TVMError: LLVM module verification failed with the following errors: Intrinsic name not mangled correctly for type arguments! Should be: llvm.ctlz.i32 i32 (i32, i1)* @llvm.ctlz.i32.i1
* Special handling for llvm.prefetch; signature matching for overloaded intrinsics only. The prefetch intrinsic returns void in LLVM, while it returns i32 in TVM. This case needs special handling, because rule-based intrinsic translation would create an invalid LLVM type. Do the signature matching only for overloaded intrinsics; it is not needed for non-overloaded ones, so this saves a bit of compile time.
* Include the intrinsic name in the error message
* Fix the number of arguments for llvm.fmuladd and llvm.pow
Krzysztof Parzyszek committed
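A minimal sketch of the corrected pattern, assuming a codegen with an llvm::Module* and an llvm::IRBuilder<> at hand (the EmitCtlz helper is hypothetical): only the overloaded types go into the list passed to getDeclaration, while the full argument list is still supplied at the call site.

```cpp
#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/Intrinsics.h>
#include <llvm/IR/Module.h>

// llvm.ctlz is overloaded on a single type (its operand type), so the type
// list must contain exactly that type. Passing the whole signature
// {i32, i1} would yield the badly mangled @llvm.ctlz.i32.i1 from the
// error message above.
llvm::Value* EmitCtlz(llvm::Module* module, llvm::IRBuilder<>* builder,
                      llvm::Value* x) {
  llvm::Function* fn = llvm::Intrinsic::getDeclaration(
      module, llvm::Intrinsic::ctlz, {builder->getInt32Ty()});
  // The i1 flag (is_zero_undef) is an argument, not an overloaded type.
  return builder->CreateCall(fn, {x, builder->getFalse()});
}
```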
- 10 Apr, 2020 3 commits
Tianqi Chen committed
* Use runtime::String
* Move string to tvm namespace
* Add const char* constructor
* Implicit cast from std::string
Zhi committed
* [RUNTIME] Initial implementation of Hexagon runtime support. This is only the TVM runtime; the FastRPC libraries, simulator driver, etc. will be provided in subsequent commits.
* Fix pylint complaints
* Fix some more pylint complaints
* Add link to the Hexagon SDK website
* Extract VTCM marker into a common variable
* Implement device->device memory copy
* Disable unsigned PDs by default
* Ensure that --hvx_length is present in sim_args if HVX is enabled
* Remove the line about clang from README.md; apparently things work with libstdc++
* Mention setting USE_RPC=OFF when building libtvm_runtime.so for Hexagon
* Remember to use codegen_hvx in validate_hvx_length
* Add a line about the minimum version of LLVM
Krzysztof Parzyszek committed
- 08 Apr, 2020 1 commit
In newer versions of LLVM, this header is no longer included by one of the already included headers in llvm_common.h, so include it explicitly.
Krzysztof Parzyszek committed
- 07 Apr, 2020 3 commits
This intrinsic was removed in LLVM 11.
Krzysztof Parzyszek committed
LLVM 11 added support for scalable vectors, and the number of elements in a vector is now represented by an llvm::ElementCount class, not just a number.
Krzysztof Parzyszek committed
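A sketch of one way the version split can be handled when creating fixed-width vector types (the VecType helper is hypothetical; TVM_LLVM_VERSION is TVM's LLVM version macro):

```cpp
#include <llvm/IR/DerivedTypes.h>

// Pre-LLVM 11, a vector type is created from a plain lane count; from
// LLVM 11 on, lane counts are llvm::ElementCount values, and the common
// non-scalable case is covered by llvm::FixedVectorType.
llvm::Type* VecType(llvm::Type* elem, unsigned lanes) {
#if TVM_LLVM_VERSION >= 110
  return llvm::FixedVectorType::get(elem, lanes);
#else
  return llvm::VectorType::get(elem, lanes);
#endif
}
```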
LLVM 11 introduces a separate class to represent alignment. The functions in IRBuilder that create aligned loads and stores and accept the alignment as an unsigned value have been deprecated (and now cause warnings to be emitted).
Krzysztof Parzyszek committed
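A sketch of the corresponding change for aligned loads (the AlignedLoad helper is hypothetical):

```cpp
#include <llvm/IR/IRBuilder.h>
#if TVM_LLVM_VERSION >= 110
#include <llvm/Support/Alignment.h>
#endif

llvm::Value* AlignedLoad(llvm::IRBuilder<>* builder, llvm::Value* ptr,
                         int align_bytes) {
#if TVM_LLVM_VERSION >= 110
  // LLVM 11+: wrap the alignment in llvm::Align; the unsigned overload
  // is deprecated and emits a warning.
  return builder->CreateAlignedLoad(ptr, llvm::Align(align_bytes));
#else
  return builder->CreateAlignedLoad(ptr, align_bytes);
#endif
}
```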
- 05 Apr, 2020 1 commit
* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. This PR migrates tvm.lower to return an IRModule of PrimFuncs instead of LoweredFuncs.
* Remove LoweredFunc
Tianqi Chen committed
- 03 Apr, 2020 2 commits
* [REFACTOR][TIR] Migrate most of the low-level build to use the Pass Manager:
  - SplitHostDevice
  - ThreadSync
  - BindDevice
  - LowerThreadAllreduce
  - Provide a temporary fix for printing IRModule with PrimFunc before the formal text printer lands
* Address comments, fix tests
* Fix relay tests
* Explicit move
Tianqi Chen committed
- Support vectorized casts
- It is incorrect to extract elements from int8x4 with 0x000000ff & (x >> i * 8), as this value is of type int in C/C++. If this expression is used for sign extensions, the sign bit will be wrong. Simply use C-style casts instead and the sign bits will just work.
Signed-off-by: Wei Pan <weip@nvidia.com>
Wei Pan committed
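A minimal standalone illustration of the sign-bit problem described above (the values are hypothetical; the packed int8x4 lives in a plain int):

```cpp
#include <cstdio>

int main() {
  int x = (int)0xFFFFFF80;  // lowest int8 lane holds -128
  int i = 0;
  // Mask-and-shift: the expression has type int and never sign-extends
  // the lane, so a negative lane comes out as 128.
  int wrong = 0x000000ff & (x >> (i * 8));
  // C-style cast: truncating to signed char keeps the sign bit, and the
  // promotion back to int sign-extends it, giving -128 as expected.
  int right = (signed char)(x >> (i * 8));
  printf("%d %d\n", wrong, right);  // prints: 128 -128
  return 0;
}
```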
- 02 Apr, 2020 2 commits
- Migrate LowerTVMBuiltin
- Migrate InferFragment, LowerThreadAllreduce
- Migrate ThreadSync
- Refactor target::Build to directly take IRModule
- Remove unused legacy functions
Tianqi Chen committed
Haozheng Fan committed
- 01 Apr, 2020 1 commit
* [TIR][TRANSFORM] Migrate LowerIntrin
* LowerDeviceStorageAccessInfo
* Migrate LowerWarpMemory
Tianqi Chen committed
- 26 Mar, 2020 1 commit
Ruizhe Zhao committed
- 24 Mar, 2020 1 commit
* [REFACTOR][TIR] Introduce PrimFuncPass
  - Introduce PrimFuncPass
  - Convert one pass to the unified Pass API
* Address comments
* Fix comments
Tianqi Chen committed
- 23 Mar, 2020 1 commit
* isfinite doc update
* isfinite expr
* isfinite schedule registration
* isfinite python binding
* relay register isfinite
* isfinite type relation
* intrin isfinite
* topi isfinite
* testcase topi isfinite
* tf frontend isfinite
* tf frontend isfinite testcase
* test case relay isfinite
* small fixes
* test forward tf isfinite
* test cases injective for cuda
* remove float16 test case
* add support for isinf
* remove unwanted import
* fix conflict
Mahesh Ambule committed
- 22 Mar, 2020 1 commit
- This allows emitting vectorized loads/stores for CUDA math intrinsics.
- A few intrinsics should be lowered as CUDAMath, not CUDAFastMath ones.
- Fixed the code block indentation.
Wei Pan committed
- 20 Mar, 2020 1 commit
As part of the unified IR refactor, this PR refactors the target codegen to use IRModule containing tir::PrimFuncs. In order to break the refactor into several steps without breaking the codebase, we built a conversion pass to convert Array<LoweredFunc> into IRModule. The follow-up refactors will gradually move the passes covered by IRModule up until we cover all the passes. Then we can remove the additional redundant concepts such as LoweredFunc.
Tianqi Chen committed
- 18 Mar, 2020 1 commit
* [CODEGEN][OPENCL] Explicitly cast min/max operands
* Retrigger CI
MORITA Kazutaka committed
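For context, the same class of error exists in standard C++, where std::max cannot deduce a single type from mixed operands; OpenCL's built-in min/max are similarly strict about operand types, hence the explicit casts this commit adds (a hedged analogy, not the emitted OpenCL itself):

```cpp
#include <algorithm>

float ClampBelowZero(float x) {
  // std::max(x, 0) fails to compile: T cannot be both float and int.
  // Casting the literal to the operand's type resolves it, mirroring
  // the explicit casts added to the OpenCL codegen.
  return std::max(x, static_cast<float>(0));
}
```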
- 16 Mar, 2020 1 commit
* Use a dummy func when no lowered_funcs exists in the Relay mod
* Dummy func -> CSourceModule with empty code str
* Added comments describing the empty CSourceModule
* Always import external modules w/o assertions
* Use CSourceModule as a fallback for LLVMModule
* Changed cond for target == llvm
* Create an empty LLVM module w/o using dummy func
* Avoid using IR str concat to create LLVM module
* Improved comments for codegen.LLVMModuleCreate
* Satisfy the linter for LLVMModuleCreate
Ruizhe Zhao committed
- 11 Mar, 2020 3 commits
* [intrin] exp2
* [intrin] exp10
* [intrin] log2/10
* [intrins] exp10
* [test] math intrin
Bing Xu committed
- This patch allows the CUDA backend to emit correct code for selects with vector conditions, which may be produced by floordiv op lowering, etc.
- This already works for the LLVM backend, as LLVM's select instruction supports vector conditions.
Signed-off-by: Wei Pan <weip@nvidia.com>
Wei Pan committed
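A sketch of what per-lane selection means, using hypothetical stand-ins for CUDA's built-in vector types (the real emitted code may differ):

```cpp
// Stand-ins for CUDA's int4/float4 vector types.
struct Int4 { int x, y, z, w; };
struct Float4 { float x, y, z, w; };

// LLVM's select instruction takes a vector condition directly, but the
// C/CUDA ternary operator does not, so the CUDA backend must emit one
// scalar select per lane.
Float4 SelectLanes(const Int4& cond, const Float4& a, const Float4& b) {
  Float4 r;
  r.x = cond.x ? a.x : b.x;
  r.y = cond.y ? a.y : b.y;
  r.z = cond.z ? a.z : b.z;
  r.w = cond.w ? a.w : b.w;
  return r;
}
```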
* Add relay operation relay.op.tan
* Update tan implementation in TVM
* Update tests
* Add shape function for tan
* Add missing main test to python/frontend/tensorflow/test_forward
* Revert, back to sin/cos
* Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Finally implement tan as sin/cos in llvm.
notoraptor committed
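The LLVM lowering mentioned in the last bullet relies on the standard identity, shown here as a plain C++ sketch (function name hypothetical):

```cpp
#include <cmath>

// tan(x) = sin(x) / cos(x): lowers tan to operations the backend
// already supports.
double TanViaSinCos(double x) { return std::sin(x) / std::cos(x); }
```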
- 10 Mar, 2020 1 commit
- 09 Mar, 2020 1 commit
雾雨魔理沙 committed
- 06 Mar, 2020 1 commit
* Add relay operation relay.op.tan
* Update tan implementation in TVM
* Update tests
* Add shape function for tan
* Add missing main test to python/frontend/tensorflow/test_forward
* Revert, back to sin/cos
* Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Try again to implement tan as sin/cos in llvm.
Yao Wang committed
- 25 Feb, 2020 1 commit
- llvm::StringRef to std::string conversion is explicit now.
Signed-off-by: Wei Pan <wpan11nv@nvidia.com>
wpan11nv committed
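A sketch of the explicit conversion (the helper name is hypothetical):

```cpp
#include <llvm/ADT/StringRef.h>
#include <string>

std::string ToStdString(llvm::StringRef ref) {
  // Older LLVM allowed an implicit conversion here; newer LLVM requires
  // an explicit one, e.g. via StringRef::str().
  return ref.str();
}
```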
- 21 Feb, 2020 1 commit
* Support cuda tensorcore subbyte int data type in auto tensorcore
* Add license
* Pass cpplint
* Fix code review comments
* Merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial
* Use master's new API
* Disable tuning when cuda is not enabled
* Address CR comment
* Do not run the tuning
* Fix test failure
* Fix cpplint error
* Fix bool type reduction bug
* Fix an index bug; fix the returned bytes value of int1/int4/uint4
* Fix typo
Orion34C committed
- 18 Feb, 2020 1 commit
- Move the related files to tvm.te
- Move build_module.py to tvm.driver
Tianqi Chen committed
- 16 Feb, 2020 1 commit
- Do not emit __shared__ etc. as part of the type for casting.
- Fix fp16 reduction kernels that fail to compile with: no operator "+" matches these operands (volatile half + volatile half). This patch inserts casts to remove the volatile type qualifier following volatile loads (fp16 only). CUDA fp16 library headers should add volatile member functions.
- Update have_fp16 to include compute 6.1 GPUs, which do support fp16, although their fp16 throughput is low. Updated tests.
Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
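The failure mode can be reproduced in plain C++ with any type whose operator+ is not volatile-qualified; CUDA's half behaves the same way (a sketch with a hypothetical Half stand-in; the generated CUDA uses C-style casts to the same effect):

```cpp
struct Half {  // stand-in for CUDA's half: operator+ is not volatile-qualified
  float v;
  Half operator+(const Half& o) const { return Half{v + o.v}; }
};

volatile Half red_buf[2];  // reduction buffers are declared volatile

Half Reduce() {
  // Error: no operator "+" matches these operands
  //        (volatile Half + volatile Half)
  // return red_buf[0] + red_buf[1];

  // Fix: strip the volatile qualifier right after the load, which is
  // what the inserted casts do for fp16 reductions.
  return const_cast<Half&>(red_buf[0]) + const_cast<Half&>(red_buf[1]);
}
```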
- 13 Feb, 2020 1 commit
Move the related target modules into tvm.target. API change:
- tvm.target.current_target -> tvm.target.Target.current
- tvm.datatype -> tvm.target.datatype
tqchen committed
- 11 Feb, 2020 1 commit
hlu1 committed
- 07 Feb, 2020 1 commit
* [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update. This PR updates tvm.runtime to use the new FFI style.
  - Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
  - API changes w.r.t. runtime.Module:
    - tvm.module.load -> tvm.runtime.load_module
    - tvm.module.enabled -> tvm.runtime.enabled
    - tvm.module.system_lib -> tvm.runtime.system_lib
  - Remove dep on api_internal from runtime
* Update module.load in the latest API
Tianqi Chen committed
- 04 Feb, 2020 1 commit
* [LINT] Fix -Wextra
* Fix virtual-dtor
Tianqi Chen committed
- 21 Jan, 2020 1 commit
* [REFACTOR] Establish printer in the source folder. As we move towards the unified IR, we will eventually want to build a unified printer for both relay and TIR. This PR isolates the printer component into a separate folder in src as a first step.
  - Refactored the Doc DSL using Object, cleaned up APIs
  - Isolated the metadata into a header
  - Moved the printer into relay_text_printer; added comments about further TODOs
* Rename NodePrinter -> ReprPrinter to distinguish it from other printers
Tianqi Chen committed
- 19 Jan, 2020 2 commits
This PR moves the codegen-related code into the target folder, as these are target-specific functionalities. We also adopt the term "compiler driver", common in compiler infrastructure such as Rust, GHC, and Clang. As a result, build_module is moved into the driver folder.
Tianqi Chen committed
TIR is the new namespace for the low-level IR for tensor-level optimizations and loop transformations. This PR establishes the namespace and files:
- lowered_func.h, buffer.h, data_layout.h -> tir/buffer.h, tir/data_layout.h, tir/lowered_func.h
- ir.h -> tir/expr.h, tir/stmt.h
- ir_functor_ext.h -> tir/expr_functor.h, tir/stmt_functor.h
Tianqi Chen committed