[misc] fix reward model issue with TokenClassification model and support running particular steps instead of epochs (#99)

* support user specify training steps
* fix typo
* update ci
* add ci
* fix reward model and write more ci script
* update ci
* lint
* align
* delete post training val
* fix script
Showing
.github/workflows/e2e_gsm8k.yml
0 → 100644
tests/e2e/run_qwen_gsm8k_function_rm.sh
0 → 100644
tests/e2e/run_qwen_gsm8k_model_rm.sh
0 → 100644