In cases where we have multiple models or threadpools active, spinning around `sched_yield()` may not be desirable, as it prevents the OS from effectively scheduling other threads.

Thus, allow users to conditionally disable this behaviour (via an environment variable `TVM_THREAD_POOL_SPIN_COUNT`, similar to existing environment flags for the thread pool such as `TVM_BIND_THREADS`, etc.).

This substantially improves tail latencies in some of our multi-tenant workloads in practice.

Unit tests have been added - on my laptop, running:

```
TVM_THREAD_POOL_SPIN_COUNT=0 ./build/threading_backend_test
TVM_THREAD_POOL_SPIN_COUNT=1 ./build/threading_backend_test
./build/threading_backend_test
```

gives https://gist.github.com/ajtulloch/1805ca6cbaa27f5d442d23f9d0021ce6 (i.e. 97ms -> <1ms after this change)
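For readers unfamiliar with the pattern, here is a minimal sketch of a spin-then-block wait with a configurable spin budget. It is not the code in this change: `SpinCountFromEnv`, `SpinThenBlockSignal`, and the fallback value passed by the caller are hypothetical names chosen for illustration, and only the environment variable name comes from the description above. A spin count of 0 skips the yield loop entirely and goes straight to a blocking wait.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdlib>
#include <mutex>
#include <thread>

// Hypothetical helper: read the spin budget from the environment.
// `fallback` is whatever the caller wants when the variable is unset;
// the real default used by the thread pool is not assumed here.
inline int SpinCountFromEnv(int fallback) {
  const char* val = std::getenv("TVM_THREAD_POOL_SPIN_COUNT");
  if (val == nullptr) return fallback;
  // atoi returns 0 for non-numeric input, which disables spinning.
  int parsed = std::atoi(val);
  return parsed < 0 ? 0 : parsed;
}

// Hypothetical one-shot signal: spin on yield() up to `spin_count` times,
// then fall back to blocking on a condition variable.
class SpinThenBlockSignal {
 public:
  explicit SpinThenBlockSignal(int spin_count) : spin_count_(spin_count) {}

  void Notify() {
    {
      // Set the flag under the mutex so a waiter cannot miss the wakeup.
      std::lock_guard<std::mutex> lock(mutex_);
      ready_.store(true, std::memory_order_release);
    }
    cv_.notify_one();
  }

  // Returns true if the flag was observed during the spin phase,
  // false if the caller took the blocking path.
  bool Wait() {
    for (int i = 0; i < spin_count_; ++i) {
      if (ready_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();  // typically sched_yield() on POSIX systems
    }
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return ready_.load(std::memory_order_acquire); });
    return false;
  }

 private:
  const int spin_count_;
  std::atomic<bool> ready_{false};
  std::mutex mutex_;
  std::condition_variable cv_;
};
```

A worker would construct the signal with `SpinThenBlockSignal signal(SpinCountFromEnv(fallback))` and call `signal.Wait()`. The trade-off is the one described above: spinning reduces wakeup latency for a single pool, while blocking leaves the core free for other threads when several pools share the machine.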
Name |
---|
.gitignore |
attrs_test.cc |
build_module_test.cc |
container_test.cc |
expr_test.cc |
ir_functor_test.cc |
ir_mutator_test.cc |
ir_simplify_test.cc |
ir_ssa_test.cc |
ir_visitor_test.cc |
packed_func_test.cc |
pattern_match_test.cc |
relay_build_module_test.cc |
relay_pass_type_infer_test.cc |
relay_transform_sequential.cc |
simple_passes_test.cc |
tensor_test.cc |
threading_backend_test.cc |
topi_ewise_test.cc |