- 13 Apr, 2025 1 commit
Yaoyu Zhu committed
- 12 Apr, 2025 1 commit
Yaoyu Zhu committed
- 11 Apr, 2025 3 commits
Yaoyu Zhu committed
Shi wenxuan committed
Shi wenxuan committed
- 10 Apr, 2025 3 commits
- 09 Apr, 2025 5 commits
- 08 Apr, 2025 4 commits
- 07 Apr, 2025 1 commit
ZhangXiaoyun committed
- 27 Mar, 2025 1 commit
Shawn/Yuxuan Tong committed
- 25 Mar, 2025 2 commits
shengguangming committed
Add tqdm progress bar to RayPPOTrainer for training visualization

This PR enhances the RayPPOTrainer class with a progress bar that visualizes training:

- Imported the tqdm module in verl/trainer/ppo/ray_trainer.py (line 27)
- Added progress bar initialization in the fit() method (line 781)
- Added progress updates during training iterations (line 931)
- Closed the progress bar at the end of training (line 928)

This provides real-time feedback on training progress, making it easier to monitor long-running training sessions.

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
HangZhang committed
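A minimal sketch of the tqdm pattern described in the entry above; the loop body, step count, and postfix keys are illustrative and this is not verl's actual RayPPOTrainer.fit() code:

```python
# Illustrative sketch only; not verl's actual RayPPOTrainer.fit() implementation.
from tqdm import tqdm


def fit(total_training_steps: int = 100) -> None:
    # Initialize the bar once at the start of training.
    progress_bar = tqdm(total=total_training_steps, desc="Training Progress")
    for global_step in range(1, total_training_steps + 1):
        # ... generate rollouts, compute advantages, update the policy ...
        progress_bar.update(1)                       # advance once per iteration
        progress_bar.set_postfix({"step": global_step})
    progress_bar.close()                             # clean up at the end of training


if __name__ == "__main__":
    fit(10)
```

Keying a single bar to the total step count gives an ETA for long-running sessions without adding extra logging overhead.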
- 23 Mar, 2025 6 commits
Jiawei Liu committed
# What does this PR do?

This PR does essentially the same thing as this [PR](https://github.com/volcengine/verl/pull/386), but replaces the rollout engine with sglang.
mlmz committed
# Intro

Support Megatron checkpoints for model, optimizer states, and RNG states, with a new layer of abstraction, `MegatronCheckpointManager`, analogous to the FSDP one. Also add checkpoint tests.

# Involved Issues and PRs

This solves issues #682 and #605, including PRs #510, #634, #368, and #330. Thanks for the great efforts of @uygnef, @ShareLer, and @caaatch22 in these contributions.

# TODOs

- [ ] Support the Megatron dist checkpointing mechanism; for now torch.save/load is used to store/restore model weights.
- [x] Quick: also store the model in HF format.

---------

Co-authored-by: caaatch22 <mr.liumingjie@gmail.com>
Co-authored-by: Yu Feng <admin@fengyu.org>
Co-authored-by: ShareLer <sharele@163.com>
Blue Space committed
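A minimal sketch of what a checkpoint-manager abstraction of this kind can look like; the class and method names are hypothetical, not verl's actual `MegatronCheckpointManager` API:

```python
# Hypothetical sketch of a checkpoint-manager abstraction; names and signatures
# are illustrative, not verl's actual MegatronCheckpointManager.
import os
import torch


class CheckpointManagerSketch:
    def __init__(self, model, optimizer, lr_scheduler=None):
        self.model = model
        self.optimizer = optimizer
        self.lr_scheduler = lr_scheduler

    def save_checkpoint(self, path: str, step: int) -> None:
        os.makedirs(path, exist_ok=True)
        state = {
            "step": step,
            "model": self.model.state_dict(),
            "optimizer": self.optimizer.state_dict(),
            "rng": {
                "torch": torch.get_rng_state(),
                "cuda": torch.cuda.get_rng_state_all() if torch.cuda.is_available() else None,
            },
        }
        if self.lr_scheduler is not None:
            state["lr_scheduler"] = self.lr_scheduler.state_dict()
        # Per the PR's TODO, plain torch.save/load is used for now rather than
        # Megatron's dist checkpointing mechanism.
        torch.save(state, os.path.join(path, f"checkpoint_step_{step}.pt"))

    def load_checkpoint(self, ckpt_file: str) -> int:
        state = torch.load(ckpt_file, map_location="cpu")
        self.model.load_state_dict(state["model"])
        self.optimizer.load_state_dict(state["optimizer"])
        torch.set_rng_state(state["rng"]["torch"])
        if state["rng"]["cuda"] is not None and torch.cuda.is_available():
            torch.cuda.set_rng_state_all(state["rng"]["cuda"])
        if self.lr_scheduler is not None and "lr_scheduler" in state:
            self.lr_scheduler.load_state_dict(state["lr_scheduler"])
        return state["step"]
```

Bundling model, optimizer, and RNG state behind one save/load pair is what lets different backends expose the same checkpointing surface; the dist-checkpointing TODO would only change how the state dict is serialized.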
For longer tests, check the `example/grpo_trainer` folder: the two backends align within 200 steps, but over more steps Megatron does not seem to reach loss convergence. TODO: extended testing over longer runs is required for further validation.
Blue Space committed
It should skip special tokens here, just like TRL does: https://github.com/huggingface/trl/blob/fc2b041b58f6fbe766dceaec819bc5a8f9d209da/trl/trainer/grpo_trainer.py#L597

With `skip_special_tokens=False`, a completion

```
<think>...</think><answer>....</answer>
```

will be decoded as something like

```
<think>...</think><answer>....</answer><|im_end|><|endoftext|>
```

which makes a typical `format_reward_func` pattern fail to match:

```python
r"^<think>.*?</think>\s*<answer>.*?</answer>$"
```
G.O.D committed
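A small sketch of the pitfall described above, assuming a Qwen-style chat tokenizer; the model name and completion string are illustrative:

```python
# Illustrative sketch of the decoding pitfall; model name and completion are hypothetical.
import re
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

completion = "<think>reasoning</think><answer>42</answer>"
ids = tokenizer(completion)["input_ids"] + [tokenizer.eos_token_id]

format_reward_pattern = r"^<think>.*?</think>\s*<answer>.*?</answer>$"

# With skip_special_tokens=False, trailing special tokens such as <|im_end|> /
# <|endoftext|> survive the round trip and break the anchored regex.
kept = tokenizer.decode(ids, skip_special_tokens=False)
print(bool(re.match(format_reward_pattern, kept)))      # expected: False

# With skip_special_tokens=True, special tokens are dropped and the regex matches.
stripped = tokenizer.decode(ids, skip_special_tokens=True)
print(bool(re.match(format_reward_pattern, stripped)))  # expected: True
```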
Haoyang Zou committed
- 22 Mar, 2025 1 commit
- 21 Mar, 2025 4 commits
Junrong Lin committed
Prevents training hangs by validating `num_key_value_heads % ulysses_sequence_parallel_size == 0` before training.
Yu Feng committed
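A minimal sketch of such a pre-flight check; the function and attribute names are assumptions for illustration, not verl's exact validation code:

```python
# Hypothetical sketch: fail fast if GQA KV heads cannot be evenly sharded across
# Ulysses sequence-parallel ranks; attribute names are illustrative.
def validate_ulysses_config(model_config, ulysses_sequence_parallel_size: int) -> None:
    if ulysses_sequence_parallel_size <= 1:
        return  # no sequence parallelism, nothing to check
    num_key_value_heads = getattr(
        model_config, "num_key_value_heads", model_config.num_attention_heads
    )
    if num_key_value_heads % ulysses_sequence_parallel_size != 0:
        raise ValueError(
            f"num_key_value_heads ({num_key_value_heads}) must be divisible by "
            f"ulysses_sequence_parallel_size ({ulysses_sequence_parallel_size}); "
            "otherwise head sharding stalls and training hangs."
        )
```

Running a check like this once before launching the workers turns a silent distributed hang into an immediate, explanatory error.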
## What does this PR do?

Add documentation for using vLLM 0.8 in verl.

## Who can review?

@eric-haibin-lin
hoshi-hiyouga committed
HL committed
- 20 Mar, 2025 8 commits
Adding Openmanus-RL: an LLM agent RL tuning repo built with verl
Kunlun Zhu committed
Add `verl` as the `framework` parameter in the SwanLab config table, so that more developers can see that a training run comes from `verl`.
Ze-Yi LIN committed
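A sketch of what tagging the framework in a SwanLab run config can look like; the project name and the config keys other than `framework` are hypothetical, and this is not verl's actual logger code:

```python
# Illustrative only; project name and keys other than "framework" are hypothetical.
import swanlab

swanlab.init(
    project="verl-grpo-demo",
    config={
        "framework": "verl",  # the tag surfaced in SwanLab's config table
        "algorithm": "grpo",
        "total_training_steps": 200,
    },
)
swanlab.log({"reward": 0.0})
```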
HL committed
https://github.com/volcengine/verl/issues/680

Changes:

- Move math-verify to the optional dependencies. It can now be installed via `cd verl && pip install -e .[math]`.
- Revert to the naive verifier for the math dataset. Users can switch to math-verify or provide a custom `compute_score` function.
Yuyang Ding committed
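A minimal sketch of declaring such an optional extra with setuptools; the package name `verl-example` and the bare `math-verify` requirement are placeholders, not verl's actual packaging metadata:

```python
# Illustrative setup.py fragment; names and pins are placeholders.
from setuptools import find_packages, setup

setup(
    name="verl-example",
    packages=find_packages(),
    install_requires=[
        # core dependencies only -- math-verify is intentionally not listed here
    ],
    extras_require={
        # pulled in only by `pip install -e .[math]`
        "math": ["math-verify"],
    },
)
```

With this layout, a plain `pip install -e .` leaves math-verify out and the naive verifier is used, while `pip install -e .[math]` opts into the stricter checker.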
Shawn/Yuxuan Tong committed
Shawn/Yuxuan Tong committed
Shawn/Yuxuan Tong committed
Shawn/Yuxuan Tong committed