* [TOPI][OP] Use Thrust sort for argsort and topk

  The current GPU sort implementation (odd-even transposition sort) is too slow when the number of elements is large. This PR introduces a Thrust implementation of sort, which is much faster.

  Note that this change requires CMake 3.8 or later, since nvcc has to be used to compile Thrust code.

* cmake: make CUDA optional
* allow .cu files in the repository
* pylint fix and cleanup
* require cmake 3.8 only when thrust is enabled
* fix nvcc compiler error when passing -pthread
* add missing include
* add USE_THRUST option in config.cmake
* retrigger CI
* retrigger CI
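To illustrate why odd-even transposition sort scales poorly, here is a minimal Python sketch of an argsort built on it. It is not the code from this repository: the function names `odd_even_argsort` and `library_argsort` are hypothetical, and Python's built-in `sorted` only stands in for a tuned library sort such as Thrust's. The point is the phase count: the algorithm needs as many phases as there are elements, so even if every compare-exchange within a phase runs in parallel, the number of sequential steps grows linearly with the input size.

```python
def odd_even_argsort(values):
    """Return the indices that sort `values` ascending (illustrative sketch).

    Uses odd-even transposition sort: n phases, each comparing disjoint
    neighbor pairs. On a GPU the pairs of one phase could run in parallel,
    but the n phases themselves must run one after another.
    """
    n = len(values)
    keys = list(values)      # working copy of the keys
    idx = list(range(n))     # indices are permuted alongside the keys
    for phase in range(n):
        start = phase % 2    # even phases pair (0,1),(2,3)...; odd phases (1,2),(3,4)...
        for i in range(start, n - 1, 2):
            if keys[i] > keys[i + 1]:
                keys[i], keys[i + 1] = keys[i + 1], keys[i]
                idx[i], idx[i + 1] = idx[i + 1], idx[i]
    return idx


def library_argsort(values):
    """Argsort via the built-in sort, standing in for a fast library sort."""
    return sorted(range(len(values)), key=values.__getitem__)


if __name__ == "__main__":
    data = [5, 2, 9, 1, 5, 6]
    print(odd_even_argsort(data))
    print(library_argsort(data))
```

Both functions return the same index order because each is stable: equal keys are never swapped past one another. In this sketch, a top-k selection would simply take the first `k` entries of the argsort result.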
Name |
---|
cblas |
cublas |
cudnn |
dnnl |
edgetpu |
example_ext_runtime |
miopen |
mps |
nnpack |
random |
rocblas |
sort |
tflite |
thrust |