Add GRPO CI to FSDP and Megatron simple e2e. (#711)
For longer tests, see the `example/grpo_trainer` folder. The two backends align within 200 steps, but over more steps Megatron does not seem to reach loss convergence. TODO: run extended tests over longer step ranges to validate this further.
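One way to make "align within 200 steps" checkable is to compare the per-step metric curves logged by the two backends. The following is only a minimal sketch, not part of this commit: the function names, the 5% tolerance, and the idea of comparing plain lists of per-step values are all assumptions made for illustration.

```python
from typing import Sequence


def max_relative_gap(a: Sequence[float], b: Sequence[float],
                     steps: int = 200, eps: float = 1e-8) -> float:
    """Largest relative gap between two metric curves over the first `steps` steps.

    Hypothetical helper: `a` and `b` are per-step values (e.g. loss) from the
    two backends, already extracted from whatever logs the runs produce.
    """
    n = min(steps, len(a), len(b))
    if n == 0:
        raise ValueError("both curves need at least one step")
    # Normalise by the larger magnitude so the gap is scale-independent;
    # eps guards against division by zero when both values are 0.
    return max(
        abs(x - y) / max(abs(x), abs(y), eps)
        for x, y in zip(a[:n], b[:n])
    )


def curves_align(a: Sequence[float], b: Sequence[float],
                 steps: int = 200, tol: float = 0.05) -> bool:
    """True if the curves stay within `tol` relative gap (assumed threshold)."""
    return max_relative_gap(a, b, steps) <= tol


if __name__ == "__main__":
    # Toy values only, not measured results.
    fsdp_loss = [1.0, 0.8, 0.6]
    megatron_loss = [1.0, 0.82, 0.9]
    print(curves_align(fsdp_loss, megatron_loss, steps=2))
    print(curves_align(fsdp_loss, megatron_loss, steps=3))
```

Running the same check with a larger `steps` value would be one way to follow up on the TODO above, since the reported divergence only appears after the first 200 steps.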
Changed files (all newly added, mode 0 → 100644):
.github/workflows/e2e_grpo.yml
tests/e2e/run_deepseek_grpo.sh
tests/e2e/run_deepseek_grpo_megatron.sh
tests/e2e/run_qwen_grpo.sh
tests/e2e/run_qwen_grpo_megatron.sh