[TOPI] Improve conv2d_transpose schedule on X86 and CUDA (#3948)
* improve conv2d_transpose x86 performance by reusing conv2d schedule
* parallelize across batches to make large-batch conv2d and conv2d_transpose faster
* improve doc for autotvm.task.space.FallbackConfigEntity.fallback_with_reference_log
* add fallback schedule for schedule_conv2d_transpose_nchw_cuda
* fix pylint
* fix pylint
* unify conv2d_transpose declaration in topi.nn and topi.x86
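
For context on why a conv2d schedule can be reused for conv2d_transpose: a transposed convolution can be rewritten as an ordinary stride-1 convolution over a dilated, padded input with a spatially flipped kernel. The NumPy sketch below checks that identity numerically. It is not the TOPI code from this change, and the commit message does not say how the reuse is implemented. The function names, the NCHW data layout, the (in_channels, out_channels, kh, kw) kernel layout, and symmetric padding with pad <= kernel size - 1 are all assumptions made for illustration.

```python
import numpy as np


def conv2d_transpose_direct(x, w, stride, pad):
    """Reference transposed convolution (scatter form). Hypothetical helper.

    x: (N, C_in, H, W); w: (C_in, C_out, KH, KW); symmetric padding `pad`.
    """
    n, _, h, wd = x.shape
    _, co, kh, kw = w.shape
    full_h = (h - 1) * stride + kh
    full_w = (wd - 1) * stride + kw
    out = np.zeros((n, co, full_h, full_w), dtype=x.dtype)
    for i in range(h):
        for j in range(wd):
            # Each input pixel scatters a (C_out, KH, KW) patch into the output.
            patch = np.einsum("nc,cokl->nokl", x[:, :, i, j], w)
            out[:, :, i * stride:i * stride + kh,
                j * stride:j * stride + kw] += patch
    # Padding in a transposed convolution crops the full-size result.
    return out[:, :, pad:full_h - pad, pad:full_w - pad]


def conv2d_transpose_via_conv2d(x, w, stride, pad):
    """Same result, computed as a plain stride-1 conv2d. Hypothetical helper."""
    n, ci, h, wd = x.shape
    _, co, kh, kw = w.shape
    # 1. Dilate the input: insert (stride - 1) zeros between neighbours.
    dil = np.zeros((n, ci, (h - 1) * stride + 1, (wd - 1) * stride + 1),
                   dtype=x.dtype)
    dil[:, :, ::stride, ::stride] = x
    # 2. Zero-pad by (kernel - 1 - pad) on each spatial side.
    ph, pw = kh - 1 - pad, kw - 1 - pad
    padded = np.pad(dil, ((0, 0), (0, 0), (ph, ph), (pw, pw)))
    # 3. Flip the kernel spatially and swap its channel axes -> (C_out, C_in, KH, KW).
    wf = w[:, :, ::-1, ::-1].transpose(1, 0, 2, 3)
    # 4. Ordinary stride-1 convolution (cross-correlation) with no extra padding.
    oh = padded.shape[2] - kh + 1
    ow = padded.shape[3] - kw + 1
    out = np.zeros((n, co, oh, ow), dtype=x.dtype)
    for a in range(kh):
        for b in range(kw):
            out += np.einsum("nchw,oc->nohw",
                             padded[:, :, a:a + oh, b:b + ow], wf[:, :, a, b])
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5))
w = rng.standard_normal((3, 4, 3, 3))
ref = conv2d_transpose_direct(x, w, stride=2, pad=1)
alt = conv2d_transpose_via_conv2d(x, w, stride=2, pad=1)
print(ref.shape, np.allclose(ref, alt))
```

Because step 4 is a standard conv2d, any schedule tuned for conv2d can in principle be applied to it once the dilation and padding stages are handled separately.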