| Name |
|---|
| config |
| README.md |
| main_ppo_split.py |
| run_deepseek7b_llm.sh |
| split_monkey_patch.py |
## Summary

This PR renames every `micro_batch_size` parameter to `micro_batch_size_per_gpu`.

**The core logic of setting batch sizes** (illustrated in the sketch below):

- **Algorithmic metrics** (train batch size, PPO mini batch size) are global, i.e., defined from the single-controller's perspective, and are normalized inside each Worker.
- **Performance-related parameters** (micro batch size, max token length under dynamic batch size) are local and describe the data size per GPU (i.e., per Worker).

## Main Changes

1. Update the scripts and configs and delete the normalization for the micro batch size.
2. Fix CI for SFT.
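To make the convention concrete, here is a minimal sketch of the per-Worker normalization described above. The `WorkerConfig` fields, the `normalize_for_worker` helper, and the `world_size` parameter are illustrative assumptions, not verl's actual API:

```python
from dataclasses import dataclass


@dataclass
class WorkerConfig:
    # Algorithmic, global sizes: defined from the single-controller's
    # perspective and normalized per Worker below.
    train_batch_size: int            # global
    ppo_mini_batch_size: int         # global
    # Performance-related, local size: already per GPU, used as-is.
    micro_batch_size_per_gpu: int


def normalize_for_worker(cfg: WorkerConfig, world_size: int) -> WorkerConfig:
    """Divide only the global (algorithmic) sizes by the number of GPUs.

    The per-GPU micro batch size is deliberately left untouched; this is
    the normalization that this PR deletes for micro batch sizes.
    """
    assert cfg.train_batch_size % world_size == 0
    assert cfg.ppo_mini_batch_size % world_size == 0
    return WorkerConfig(
        train_batch_size=cfg.train_batch_size // world_size,
        ppo_mini_batch_size=cfg.ppo_mini_batch_size // world_size,
        micro_batch_size_per_gpu=cfg.micro_batch_size_per_gpu,
    )


# Example: with 8 GPUs, a global train batch of 1024 becomes 128 per
# Worker, while the micro batch size stays 4 per GPU.
local = normalize_for_worker(
    WorkerConfig(train_batch_size=1024,
                 ppo_mini_batch_size=256,
                 micro_batch_size_per_gpu=4),
    world_size=8,
)
print(local)
```

The `_per_gpu` suffix makes the distinction visible in the config itself: any key carrying it is a local, per-GPU value that no Worker should divide again.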