- Addresses occasional deadlocks in multi-threaded model server requests.
- Implements a single-threaded, single-server setup for testing purposes.
- Observed that the reward model's performance is acceptable and that it can be used.
- Temporarily adopts this single-threaded version.

Name |
---|
.gitignore |
LICENSE |
example_config.toml |
readme.qmd |
refs.bib |
step1_apps_test.py |
step1_evaluate_code.py |
step1_mk_prompt.py |
step1_sample_code.py |
step1_sort_split_dataset.py |
step2_prepare_preference_dataset.py |
step3_train_outcome_reward_model.py |
step4_test_reward_model.py |
step4_test_reward_model_test.py |
utils.py |
utils_metric.py |
utils_preference_dataset.py |
utils_vllm.py |