## Summary

This PR renames all `micro_batch_size` parameters to `micro_batch_size_per_gpu`.

**The core logic of setting batch sizes:**

- **All algorithmic parameters** (train batch size, PPO mini batch size) are global (from the perspective of the single controller) and are normalized in each Worker, as the sketch after this summary illustrates.
- **All performance-related parameters** (micro batch size, max token length in dynamic batch size) are local and represent the data size per GPU (i.e., per Worker).

## Main Changes

1. Update the scripts and configs, and delete the normalization for the micro batch size.
2. Fix CI for SFT.
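To make the convention concrete, here is a minimal sketch of the two kinds of parameters. The class and function names are hypothetical, not verl's actual Worker code; it only illustrates why global sizes are divided by the data-parallel world size while `*_per_gpu` sizes are used as-is.

```python
# Minimal sketch of the batch-size convention (hypothetical names,
# not the actual Worker implementation).

from dataclasses import dataclass


@dataclass
class BatchConfig:
    # Algorithmic parameters: global, as seen by the single controller.
    train_batch_size: int
    ppo_mini_batch_size: int
    # Performance parameter: already per GPU, used by each Worker as-is.
    ppo_micro_batch_size_per_gpu: int


def normalize_for_worker(cfg: BatchConfig, world_size: int) -> None:
    """Each Worker divides the global algorithmic sizes by its
    data-parallel world size; per-GPU sizes are left untouched."""
    assert cfg.ppo_mini_batch_size % world_size == 0, (
        "global mini batch size must be divisible by world size"
    )
    cfg.ppo_mini_batch_size //= world_size
    # cfg.ppo_micro_batch_size_per_gpu is NOT normalized: it already
    # describes the per-GPU data size, which is why this PR removes
    # the old micro batch size normalization.


cfg = BatchConfig(train_batch_size=1024,
                  ppo_mini_batch_size=256,
                  ppo_micro_batch_size_per_gpu=4)
normalize_for_worker(cfg, world_size=8)
print(cfg.ppo_mini_batch_size)           # 32 samples per GPU per mini batch
print(cfg.ppo_micro_batch_size_per_gpu)  # still 4, used directly
```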