1. 05 Apr, 2020 1 commit
  2. 04 Apr, 2020 3 commits
  3. 03 Apr, 2020 8 commits
    • [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager. (#5225) · 75e936e1
      * [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager.
      
      - SplitHostDevice
      - ThreadSync
      - BindDevice
      - LowerThreadAllreduce
      - Provide a temp fix for printing IRModule with PrimFunc before the formal text printer.
      
      * Address comments, fix tests.
      
      * Fix relay tests
      
      * Explicit move
      Tianqi Chen committed
    • [RELAY] Non-recursive Graph Visitor and Rewriter (#4886) · 7de8a539
      * First pass at defining a non-recursive Graph Visitor and Rewriter
      
      autoformat
      
      remove a currently empty test until testing is solidified
      
      * Make CalcDep from Dead Code Elimination non-recursive
      
      * Partially working, not passing all tests yet
      
      passes tests when GetExprRefCount is disabled; I think I have a bug in visit counting
      
      fix GetExprRefCount
      
      Fix a subtle bug with nested recursive/non-recursive scopes
      
      * Refactor
      
      * improve comments
      
      * respond to review comments on comments
      
      * Fix a problem with default recursion for dataflow nodes
      
      mark DataflowVisitor methods as override
      
      * implement ScopeMutator
      
      * convert forward_rewrite to ScopeMutator, remove DataflowMutator
      
      * rewrite ExprRewriter and convert fast_math to use it
      
      * switch BiasAddSimplifier to ExprRewriter
      
      fix a clang warning
      
      fix cpp lint
      
      fix doc param error
      
      * respond to review comments
      
      * fix a typo in the iterative looping
      
      * add a regression test for GetExprRefCount issue
      
      * Normalize naming
      
      * fix lint
      
      * respond to review comments
      Matthew Brookhart committed
    • [RELAY][FIX] Fix hang in MergeCompilerRegions (#5227) · 54975a3f
      For certain network topologies, MCR could hang.
      This patch fixes that case.
      
      Change-Id: I3edd8a8a6b452b2b838b777720adea22a3b995b4
      mbaret committed
    • [KERAS]Upsample3d & ZeroPadding3d op (#5125) · b796c13c
      * [KERAS]upsampling3d and zeropadding3d op
      
      * [KERAS]upsampling3d and zeropadding3d test case
      
      * Review comments updated
      Samuel committed
    • [CodeGen][CUDA] Fix bugs (#5209) · 316ce055
      - Support vectorized casts
      
      - It is incorrect to extract elements from int8x4 with
      
         0x000000ff & (x >> i * 8)
      
        as this value is of type int in C/C++. If this expression
        is used for sign extension, the sign bit will be wrong.
        Use C-style casts instead, which preserve the sign bit.
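
The difference between the two extraction styles described above can be illustrated with a minimal C++ sketch. It is not the code from the patch: the helper names `ExtractLaneMasked` and `ExtractLaneCast` are hypothetical, the packed value is held in a `std::uint32_t` to keep the shifts well defined, and `static_cast<std::int8_t>` stands in for the C-style cast the commit message mentions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): mask-and-shift extraction.
// The result is an int in the range [0, 255], so a lane whose top bit is
// set comes back as a positive value and its sign is lost.
inline int ExtractLaneMasked(std::uint32_t packed, int lane) {
  assert(lane >= 0 && lane < 4);
  return 0x000000ff & (packed >> (lane * 8));
}

// Hypothetical helper (not from the patch): cast-based extraction.
// Narrowing to int8_t keeps the low 8 bits as a signed byte (two's
// complement, guaranteed since C++20 and universal in practice before);
// converting that byte back to int sign-extends it.
inline int ExtractLaneCast(std::uint32_t packed, int lane) {
  assert(lane >= 0 && lane < 4);
  return static_cast<std::int8_t>(packed >> (lane * 8));
}
```

For the packed value `0x80FF7F01u`, lane 2 holds the byte `0xFF`: the masked version yields 255, while the cast version yields -1, which is the value an int8 lane is meant to carry.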
      
      Signed-off-by: Wei Pan <weip@nvidia.com>
      Wei Pan committed
  4. 02 Apr, 2020 12 commits
  5. 01 Apr, 2020 8 commits
  6. 31 Mar, 2020 8 commits
    • [RELAY] Re-wrote the Graph Partitioner to support multiple outputs (#5143) · 14ae3a6e
      * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs
      
      Input : A Relay module that has functions with disjoint annotated regions
              using compiler_begin and compiler_end. There could be multiple outputs.
      
      Output : A Relay module with global functions for such disjoint annotated regions
               with calls inserted at the respective location
      
      Dependencies : AnnotatedRegionSet Utility class.
      
      Methodology :
            1) The AnnotatedRegionSet utility class is able to construct a collection of
               nodes that are bound by a given annotation -- here we use compiler_begin
               and compiler_end
            2) Initially, for each function in the module AnnotatedRegionSets are populated.
            3) Then, a Visitor pass traverses the function until a compiler_end node is
               encountered that belongs to a "region".
            4) When the first compiler_end of a given annotated region is found, a function is
               formed and inserted.
               a) if the region has multiple outputs, a Tuple node (capturing all outputs)
                  is returned.
            5) Thereafter, if we encounter another output of the same annotated region,
               the function has already been formed. Therefore, the pass looks up
               the function and inserts a TupleGetItemNode.
                a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet
                   as TupleGetItemNode index.
            6) Therefore, functions will be created for all annotated regions. The name for each
               global function is created using "Region" id and the compiler name.
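
Steps 4 to 6 above can be sketched with a toy C++ model. This is not the partitioner's implementation: `Region`, `GlobalFunctionName`, and `RewriteOutput` are hypothetical names, expressions are modelled as plain strings, and the `<compiler>_<id>` name format is an assumption (the message only says the name is built from the region id and the compiler name). The sketch also assumes that every output of a multi-output region, including the first one encountered, is read through a TupleGetItem.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for one annotated region: its id, its target compiler,
// and its outputs ("rets") in a fixed order.
struct Region {
  int id;
  std::string compiler;
  std::vector<std::string> rets;
};

// Assumed naming scheme: compiler name and region id joined by '_'.
std::string GlobalFunctionName(const Region& region) {
  return region.compiler + "_" + std::to_string(region.id);
}

// Returns the (string-modelled) expression that replaces one region output.
// A single-output region is replaced by the call itself; for a multi-output
// region the call returns a tuple, and the position of `ret` inside
// `region.rets` is used as the TupleGetItem index.
std::string RewriteOutput(const Region& region, const std::string& ret) {
  auto it = std::find(region.rets.begin(), region.rets.end(), ret);
  assert(it != region.rets.end());
  const std::string call = "call(" + GlobalFunctionName(region) + ")";
  if (region.rets.size() == 1) {
    return call;
  }
  const std::size_t index = static_cast<std::size_t>(it - region.rets.begin());
  return "TupleGetItem(" + call + ", " + std::to_string(index) + ")";
}
```

Because the index comes from the output's position in `rets`, every use site of the same region agrees on which tuple field it reads, regardless of the order in which the outputs are visited.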
      
      Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c
      
      * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs
      
          *removed the expected use-case as we are taking a broken-down PR approach
          *code style fixes
          *some trivial one liners
      
      * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs
      
          *fixed an implicit copy to a move
      
      * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs
      
          *code style changes for comments
          *renamed test case multiple outputs --> mixed single multiple outputs
              Since the existing test case checks for both single and multiple
              output scenarios
          *added a new test case with conv2d + batch_norm
          *some var name changes in the test
      
      * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs
      
      	*rebased
      manupa-arm committed
    • [Torch] Add support for split (#5174) · 430cb899
      * [Torch] Add support for split
      
      * fix
      
      * fix test class
      Wang Yucheng committed
    • [FRONTEND][KERAS]Max_pool3d and Averagepool3d operator support (#5085) · c97e41b0
      * [KERAS]Pool3d support added
      
      * Keras pool3d testcase added
      Samuel committed
    • [VTA] HW sources refactor (#5188) · 4683c3f5
      * refactor
      
      * path update
      Thierry Moreau committed