When multiple models or thread pools are active, spinning around `sched_yield()` may not be desirable, as it prevents the OS from effectively scheduling other threads. This change therefore allows users to conditionally disable this behaviour via an environment variable, `TVM_THREAD_POOL_SPIN_COUNT`, similar to existing thread pool environment flags such as `TVM_BIND_THREADS`.

In practice, this substantially improves tail latencies in some of our multi-tenant workloads.

Unit tests have been added. On my laptop, running:

```
TVM_THREAD_POOL_SPIN_COUNT=0 ./build/threading_backend_test;
TVM_THREAD_POOL_SPIN_COUNT=1 ./build/threading_backend_test;
./build/threading_backend_test;
```

gives https://gist.github.com/ajtulloch/1805ca6cbaa27f5d442d23f9d0021ce6 (i.e. 97ms -> <1ms after this change).
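To illustrate the idea, here is a minimal sketch of a spin-then-block wait whose spin budget comes from `TVM_THREAD_POOL_SPIN_COUNT`. It is not the actual TVM implementation: `SpinCountFromEnv`, `SpinThenBlockEvent`, and the fallback value of 300000 are hypothetical names and numbers chosen for illustration, and `std::this_thread::yield()` stands in for `sched_yield()`.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <cstdlib>
#include <mutex>
#include <thread>

// Hypothetical helper: read the spin budget from the environment.
// The fallback is an illustrative assumption, not necessarily TVM's default.
inline uint32_t SpinCountFromEnv(uint32_t fallback = 300000) {
  const char* val = std::getenv("TVM_THREAD_POOL_SPIN_COUNT");
  if (val == nullptr || *val == '\0') return fallback;
  return static_cast<uint32_t>(std::strtoul(val, nullptr, 10));
}

// Hypothetical one-shot event: a waiter spins for at most `spin_count`
// iterations, then blocks on a condition variable so the OS can run
// other threads instead of this one.
class SpinThenBlockEvent {
 public:
  void Notify() {
    {
      // Set the flag under the mutex so a blocked waiter cannot miss it.
      std::lock_guard<std::mutex> lock(mu_);
      ready_.store(true, std::memory_order_release);
    }
    cv_.notify_one();
  }

  // Returns true if the event was observed during the spin phase,
  // false if the caller went through the blocking path.
  bool Wait(uint32_t spin_count) {
    for (uint32_t i = 0; i < spin_count; ++i) {
      if (ready_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();  // stand-in for sched_yield()
    }
    // Spin budget exhausted (or zero): sleep until notified.
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return ready_.load(std::memory_order_acquire); });
    return false;
  }

 private:
  std::atomic<bool> ready_{false};
  std::mutex mu_;
  std::condition_variable cv_;
};
```

With a spin count of 0, `Wait` skips the yield loop entirely and goes straight to the condition variable, which corresponds to the case where several pools would otherwise compete for the same cores.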