- 27 Jan, 2025 5 commits
-
-
We set `max_num_batched_tokens` in the `.rollout` config, but it wasn't actually being passed to vLLM, potentially leaving the GPUs underutilized. This PR:
- properly passes `max_num_batched_tokens` from the config to vLLM
- sets `disable_log_stats` to False, so that vLLM performance statistics are displayed (to help spot issues)
Xingyao Wang committed -
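A minimal sketch of how rollout config values like these could be forwarded to the vLLM engine; the `rollout_config` object and default value below are illustrative, not verl's actual wiring.

```python
# Sketch only: forward rollout config values into the vLLM engine.
from vllm import LLM

def build_rollout_engine(model_path: str, rollout_config: dict) -> LLM:
    return LLM(
        model=model_path,
        # Without this, vLLM falls back to its own default and may batch
        # fewer tokens per step than the hardware allows.
        max_num_batched_tokens=rollout_config.get("max_num_batched_tokens", 8192),
        # Keep engine statistics visible so throughput issues can be spotted.
        disable_log_stats=False,
    )
```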
## Summary
This PR renames all micro_batch_size fields to micro_batch_size_per_gpu.

**The core logic of setting batch size:**
- **All algorithmic parameters** (train batch size, PPO mini batch size) are global (from the perspective of the single controller) and are normalized in each Worker.
- **All performance-related parameters** (micro batch size, max token length in dynamic batch size) are local parameters that represent the data size per GPU (i.e., per Worker).

## Main Changes
1. Change the scripts and config, and delete the normalization for micro_bsz
2. Fix CI for SFT
Guangming Sheng committed -
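A minimal sketch of the convention described in the commit above, assuming a hypothetical dict-style config; the field names are illustrative.

```python
# Sketch: global algorithmic batch sizes are divided by the data-parallel
# world size inside each worker, while *_per_gpu sizes are used as-is.
def normalize_batch_sizes(config: dict, dp_world_size: int):
    # Algorithmic sizes are global and get normalized per worker.
    train_bsz_per_dp_rank = config["train_batch_size"] // dp_world_size
    mini_bsz_per_dp_rank = config["ppo_mini_batch_size"] // dp_world_size
    # Performance knobs are already per-GPU and are not divided again.
    micro_bsz_per_gpu = config["ppo_micro_batch_size_per_gpu"]
    return train_bsz_per_dp_rank, mini_bsz_per_dp_rank, micro_bsz_per_gpu
```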
HL committed
-
HL committed
-
- As titled
Guangming Sheng committed
-
- 26 Jan, 2025 2 commits
-
-
minor fix
Ikko Eltociear Ashimine committed -
Guangming Sheng committed
-
- 25 Jan, 2025 1 commit
-
-
This PR adds support for LoRA (Low-Rank Adaptation) for efficient model fine-tuning.

### Changes
1. Added LoRA configuration support in trainer config
2. Modified FSDP wrapping policy to handle LoRA modules
3. Integrated with existing FSDP training infrastructure
4. Added peft dependency
5. Removed unused ring_attn_utils.py

### Features
- Configurable LoRA rank and alpha parameters
- Target module specification for selective adaptation
- Compatible with FSDP sharding strategy

### Testing
Tested with the Qwen2.5-0.5B-Instruct model on the GSM8K dataset using the provided example script.

### Dependencies
- Added `peft` package to requirements.txt

This PR is based on commit 902ddbe6 and has been merged with the latest upstream main branch.

---------
Co-authored-by: Jiayi Pan <i@jiayipan.me>
Co-authored-by: openhands <openhands@all-hands.dev>
Xingyao Wang committed
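A minimal sketch of enabling LoRA with `peft` before FSDP wrapping; the rank, alpha, and target modules below are illustrative choices, not verl's defaults.

```python
# Sketch: attach LoRA adapters to a causal LM with peft.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
lora_config = LoraConfig(
    r=16,                                   # LoRA rank (assumed value)
    lora_alpha=32,                          # scaling factor (assumed value)
    target_modules=["q_proj", "v_proj"],    # which linear layers to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```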
-
- 24 Jan, 2025 7 commits
-
-
HL committed
-
Chi Zhang committed
-
Chi Zhang committed
-
- Support training for several iterations in the SFT trainer - Add CI for the SFT trainer that trains one iteration.
Guangming Sheng committed -
- Force ref/rm to use CPUOffload. Fix the root FSDP unit not resharding weights after forward - HSDP support is on hold and asserts False for now.
Chi Zhang committed -
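A minimal sketch of wrapping a frozen reference/reward model with FSDP parameter CPU offload, as referenced above; it assumes `torch.distributed` is already initialized and omits the wrapping policy for brevity.

```python
# Sketch: keep the frozen model's parameters on CPU between forward passes.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, CPUOffload

def wrap_frozen_model(model: torch.nn.Module) -> FSDP:
    # ref/RM models are inference-only, so the extra host-to-device copies
    # are an acceptable trade-off for the memory savings.
    return FSDP(
        model,
        cpu_offload=CPUOffload(offload_params=True),
        device_id=torch.cuda.current_device(),
    )
```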
This reverts commit 19840945.
shengguangming committed -
shengguangming committed
-
- 23 Jan, 2025 3 commits
-
-
This PR supports:
- meta device init (which keeps the shared parameters)
- parallel pre-trained weight init for FSDP from a Hugging Face checkpoint

---------
Co-authored-by: zhiqi.0 <zhiqi.0@bytedance.com>
Zhiqi Lin committed -
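A minimal sketch of the meta-device pattern named above: build the model without allocating real weights, then materialize and load them later. The model name is illustrative, and the loading step is only described in a comment.

```python
# Sketch: construct the model on the meta device (no memory allocated,
# tied/shared parameters stay shared), defer weight materialization.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)
# Real weights would then be materialized by FSDP (e.g. via param_init_fn)
# and loaded, possibly rank-parallel, from the Hugging Face checkpoint.
```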
- Implement KL loss, GRPO outcome advantage, and best-of-n (BoN) rollouts - Provide scripts for DeepSeek and Qwen on GSM8k; more can be provided for other datasets - Support sequence balancing - Training qwen2-7b, the GSM8k score can reach 0.89
Guangming Sheng committed -
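A minimal sketch of a GRPO-style outcome advantage as commonly defined: each prompt is rolled out n times (best-of-n style) and the scalar rewards are normalized within that group; verl's exact implementation may differ.

```python
# Sketch: group-normalized outcome rewards used as per-response advantages.
import torch

def grpo_outcome_advantage(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, n_rollouts) scalar outcome rewards, n_rollouts > 1."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    # Every token of a response shares the same group-normalized advantage.
    return (rewards - mean) / (std + eps)
```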
- The actual DP size when using SP is (DP // SP), since each group of SP GPUs holds the same sequence, just different parts of it
Guangming Sheng committed
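A tiny sketch of the arithmetic in the commit above, with illustrative numbers.

```python
# Sketch: SP ranks share one sequence, so the effective DP size shrinks.
world_size = 16   # illustrative total number of GPUs
sp_size = 4       # sequence-parallel group size
assert world_size % sp_size == 0
real_dp_size = world_size // sp_size   # 4 groups, each sees a different batch
```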
-
- 22 Jan, 2025 1 commit
-
-
- Without multiproc: Train 1/2: 1%|▍ | 20/3934 [01:38<5:14:50, 4.83s/it], avg GPU utilization: 55%
- With multiproc: Train 1/2: 1%|▍ | 20/3934 [01:00<2:57:09, 2.72s/it], avg GPU utilization: 95%
hoshi-hiyouga committed
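The commit message does not show the code, but one common way to get this kind of speedup is multi-process data loading; a hypothetical sketch (the commit itself may use a different mechanism):

```python
# Sketch: worker processes prepare batches while the GPU trains.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 16))  # stand-in dataset
loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=8,     # spawn worker processes for preprocessing
    pin_memory=True,   # faster host-to-device copies
)
```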
-
- 21 Jan, 2025 3 commits
- 20 Jan, 2025 1 commit
-
-
HL committed
-
- 19 Jan, 2025 1 commit
-
-
Chi Zhang committed
-
- 18 Jan, 2025 4 commits
-
-
- Forbid uneven chunks for DataProto
Chi Zhang committed -
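A minimal sketch of the even-chunk check described above; `data` stands in for a DataProto-like batch container and is not verl's actual class.

```python
# Sketch: refuse to split a batch that does not divide evenly.
def chunk_evenly(data, num_chunks: int):
    assert len(data) % num_chunks == 0, (
        f"batch of size {len(data)} cannot be split evenly into {num_chunks} chunks"
    )
    chunk_size = len(data) // num_chunks
    return [data[i * chunk_size:(i + 1) * chunk_size] for i in range(num_chunks)]
```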
- Set use_reentrant=False to avoid duplicate all-gather in backward when gradient checkpointing is enabled - Optimize temperature computation by using an in-place op - Fix testing logic
Chi Zhang committed -
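A minimal sketch of enabling non-reentrant gradient checkpointing on a Hugging Face model, as referenced above (the model name is illustrative; recent transformers versions accept the kwargs shown):

```python
# Sketch: non-reentrant checkpointing avoids the duplicate all-gather
# that the reentrant variant can trigger under FSDP.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)
```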
Guangming Sheng committed
-
Chi Zhang committed
-
- 17 Jan, 2025 3 commits
-
-
- Add format script - Move save_checkpoint to a separate function - Add timing/step, response_length/clip_ratio, prompt_length/clip_ratio and critic/vf_explained_var metrics - The training step starts from 1
Chi Zhang committed -
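A minimal sketch of the critic/vf_explained_var metric named above, using the standard explained-variance definition (1.0 means the value function perfectly explains the returns); verl's exact formula may differ.

```python
# Sketch: how much of the return variance the value function explains.
import torch

def explained_variance(values: torch.Tensor, returns: torch.Tensor) -> torch.Tensor:
    var_returns = returns.var()
    return 1.0 - (returns - values).var() / (var_returns + 1e-8)
```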
- As titled
Guangming Sheng committed -
Guangming Sheng committed
-
- 16 Jan, 2025 2 commits
- 14 Jan, 2025 3 commits
-
-
As title
Chi Zhang committed -
Guangming Sheng committed
-
hoshi-hiyouga committed
-
- 13 Jan, 2025 2 commits
-
-
* add ci
* fix reward model and write more ci script
* support different flash_attn version with variable num returns
* update transformers rmpad workflow
* balance workload
* lint
* lint
Guangming Sheng committed -
[misc] fix reward model issue with TokenClassification model and support running particular steps instead of epochs (#99)
* support user-specified training steps
* fix typo
* update ci
* add ci
* fix reward model and write more ci script
* update ci
* lint
* align
* delete post-training val
* fix script
Guangming Sheng committed
-
- 12 Jan, 2025 1 commit
-
-
Guangming Sheng committed
-
- 11 Jan, 2025 1 commit
-
-
* update lightning link * Update verl_getting_started.ipynb
HL committed
-