codegen_cuda.cc
24.4 KB
[CodeGen][CUDA] Vectorization for intrinsics (#5101) · 05b0f7e0
- This allows emitting vectorized loads/stores for CUDA math intrinsics.
- A few intrinsics should be lowered as CUDAMath rather than CUDAFastMath ones.
- Fixed the code block indentation.
Wei Pan committed
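
To illustrate the first bullet of the commit message, here is a minimal sketch of how a code generator might apply a scalar math intrinsic lane by lane, so that the surrounding loads and stores can stay vectorized. It is not the code from this commit: `EmitPerLaneCall`, its parameters, and the emitted variable names are hypothetical, and it assumes a `float` vector of 2 to 4 lanes whose lanes are addressed as `.x`, `.y`, `.z`, `.w`, as with CUDA's built-in `float2`/`float3`/`float4` types.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of TVM): returns CUDA source text that
// declares a float vector `dst` and fills each lane by calling the scalar
// intrinsic on the matching lane of `src`.
// Assumption: 2 <= lanes <= 4, so the vector type is float2, float3 or float4.
std::string EmitPerLaneCall(const std::string& intrinsic, int lanes,
                            const std::string& src, const std::string& dst) {
  static const char kLane[] = {'x', 'y', 'z', 'w'};
  std::ostringstream os;
  // Declare the destination vector, e.g. "float4 r;".
  os << "float" << lanes << " " << dst << ";\n";
  // Emit one scalar call per lane, e.g. "r.x = tanhf(v.x);".
  for (int i = 0; i < lanes; ++i) {
    os << dst << "." << kLane[i] << " = " << intrinsic << "(" << src << "."
       << kLane[i] << ");\n";
  }
  return os.str();
}
```

For example, `EmitPerLaneCall("tanhf", 4, "v", "r")` yields a `float4 r;` declaration followed by four assignments, one per lane. Keeping the value in a single vector variable is what lets the generated kernel read and write it with one vector load and one vector store.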