- 05 Mar, 2025 3 commits
Add support for downloading models from ModelScope by setting `VERL_USE_MODELSCOPE=True`. Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Hong Zhang committed
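A minimal sketch of using the new switch, assuming ModelScope's public `snapshot_download` entry point (the model name below is only an example, not from this commit):

```python
import os

# Opt in to ModelScope downloads before any model loading happens.
os.environ["VERL_USE_MODELSCOPE"] = "True"

if os.environ.get("VERL_USE_MODELSCOPE", "False").lower() == "true":
    # ModelScope mirrors many Hugging Face repos; snapshot_download
    # returns a local directory usable wherever an HF path is expected.
    from modelscope import snapshot_download  # pip install modelscope

    model_path = snapshot_download("Qwen/Qwen2.5-7B-Instruct")
else:
    model_path = "Qwen/Qwen2.5-7B-Instruct"  # resolved via the Hugging Face hub
```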
HL committed
Calculate MFU in update_actor/update_critic when using Megatron workers.
Mingjie LIU committed
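A back-of-the-envelope sketch of an MFU estimate (the exact FLOPs accounting inside the Megatron workers is not shown here; the 6·N·T approximation and the H100 peak figure are assumptions):

```python
def estimate_mfu(num_params: float, tokens_per_step: float,
                 step_time_s: float, num_gpus: int,
                 peak_flops_per_gpu: float = 989e12) -> float:
    """Rough MFU = achieved FLOPs / peak FLOPs, using the common
    ~6 FLOPs per parameter per token forward+backward approximation."""
    achieved = 6.0 * num_params * tokens_per_step / step_time_s
    return achieved / (peak_flops_per_gpu * num_gpus)

# e.g. a 7B model, 65536 tokens/step, 2.1 s/step on 8 GPUs -> ~17%
print(f"MFU ~ {estimate_mfu(7e9, 65536, 2.1, 8):.1%}")
```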
- 04 Mar, 2025 5 commits
hoshi-hiyouga committed
This PR continues the work of #448, in order to support e2e CI for Ascend NPU.
Shuqiao Li committed
Add DeepRetrieval to README. Awesome work!
Patrick Jiang committed
## What does this PR do?

1. Separate the prompt part and the response part in the reward manager to avoid reward leakage from the format reward (a sketch follows this entry).
2. Update the reward score function for the Geometry3k dataset.
3. Update the content in the README file.

## Who can review?

@vermouth1992 @PeterSH6
hoshi-hiyouga committed
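A minimal sketch of the leakage fix in item 1 (function names are illustrative, not verl's actual reward-manager API): score the format on the decoded response only, so format tags present in the prompt cannot earn reward:

```python
def split_prompt_response(tokenizer, input_ids, prompt_len: int):
    """Decode prompt and response separately so reward functions
    only ever see the model-generated response text."""
    prompt_text = tokenizer.decode(input_ids[:prompt_len], skip_special_tokens=True)
    response_text = tokenizer.decode(input_ids[prompt_len:], skip_special_tokens=True)
    return prompt_text, response_text

def format_reward(response_text: str) -> float:
    # Checking the full sequence would also match tags that appear in
    # the prompt (few-shot examples, system instructions, ...).
    return 1.0 if "<answer>" in response_text and "</answer>" in response_text else 0.0
```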
Add ReSearch to README. Awesome work!
Mingyang Chen committed
- 03 Mar, 2025 5 commits
Shuqiao Li committed
## What does this PR do?

This PR migrates the RL-on-VLMs feature from our [EasyR1](https://github.com/hiyouga/EasyR1) fork back to veRL. We have validated this feature using the Qwen2.5-VL 7B model on 8×H100 GPUs. The configuration and data processing script are provided with this PR for easy reproduction.

## How to reproduce?

1. Download and preprocess the dataset
```bash
python3 examples/data_preprocess/geo3k.py --local_dir ~/data/geo3k
```
2. Start GRPO training
```bash
bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh
```

## Dependencies

- vllm>=0.7.3
- transformers>=4.49.0
- [qwen-vl-utils](https://pypi.org/project/qwen-vl-utils/)
- [mathruler](https://pypi.org/project/mathruler/)

## Major Changes

### New dataflow for multimodal RL

In this PR, we introduce two new concepts in the dataflow: `multi_modal_data` and `multi_modal_inputs`. The former holds the multi-modal features required by the **rollout** worker (such as vLLM), while the latter holds the multi-modal features required by the **actor/critic** worker (such as an HF model). They differ because the rollout and actor workers have their own data format requirements. Taking Qwen2-VL + Hugging Face + vLLM as an example, the data structures are:

- **multi_modal_data**: {"image": [PIL.Image, PIL.Image, ...]}
- **multi_modal_inputs**: {"pixel_values": torch.Tensor, "image_grid_thw": torch.Tensor}

Both are converted to numpy objects and placed in the non-tensor batch in DataProto. Because this design is model-agnostic, it can easily be extended to other modalities/VLMs (see the sketch after this entry).

### Other changes

- Data
  - Support pre-processing the [Geometry3k](https://huggingface.co/datasets/hiyouga/geometry3k) dataset.
  - Support `config.data.image_key`, which should be **a list of Pillow images**.
- Actor/Ref/Critic
  - Support `multi_modal_inputs`.
  - Process position ids to adapt to the m-rope.
- Rollout
  - Update the dtensor weight loader to adapt to the Qwen2-VL architecture in vLLM 0.7+.
  - Support `multi_modal_data`.
  - Use `raw_prompt_ids` as the vLLM inputs to **avoid unpadding** the input ids.
- Reward Manager
  - Add **mathruler** for more accurate math scores on the Geometry3k dataset.
- Models
  - Support calculating the position ids for the m-rope in Qwen2-VL.
  - Support removing padding in flash attention 2 for the m-rope (transformers itself **does not support it**).
- Sharding Manager
  - Support all-gathering the non-tensor batch.
- FSDP Workers / Checkpoint Merger
  - Support `AutoModelForVision2Seq` at model initialization.

Note: Ulysses parallelism is not completed yet. We will support it in the next update.

## Performance

We provide the estimated MFU of the language model part for H100 GPUs. These values are lower than the actual ones because **we did not compute the FLOPs of the vision tower part**.

- `remove_padding=False`: MFU ~7%
- `remove_padding=True`: MFU ~20%

[Figure: training and test reward score curves]

## Who can review?

@vermouth1992 @PeterSH6
hoshi-hiyouga committed
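A minimal sketch of the two structures for Qwen2-VL, as referenced above (the prompt template is simplified and `example.png` is a placeholder; `AutoProcessor` and the returned keys follow the Hugging Face Qwen2-VL processor):

```python
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
image = Image.open("example.png").convert("RGB")

# Rollout side (vLLM): raw PIL images, keyed by modality.
multi_modal_data = {"image": [image]}

# Actor/critic side (HF model): processor outputs ready for the transformer.
text = "<|vision_start|><|image_pad|><|vision_end|>Describe this image."
inputs = processor(text=[text], images=[image], return_tensors="pt")
multi_modal_inputs = {
    "pixel_values": inputs["pixel_values"],
    "image_grid_thw": inputs["image_grid_thw"],
}
```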
Forgot to update params in generation.yaml (#259).
BearBiscuit committed
# Support Megatron mcore 0.11

## Description

This PR introduces official support for Megatron mcore 0.11 with the following updates:

- Upgraded Megatron to version `core_r0.11.0`
- Applied compatibility patch `patches/mcore_r0.11.patch`
- Removed legacy version support for a cleaner implementation

Special thanks to @chendong-1998 for the original Megatron upgrade from 0.4 to 0.6 (#93f6a7e).

## Compatibility Notes

The current implementation requires careful handling due to dependency conflicts:

- `megatron-core==0.11.0` requires torch>=2.6
- `vllm==0.6.3` requires torch==2.4

Installation constraints:

1. Must use vllm's torch dependency (2.4) as the baseline
2. Do NOT run `pip install -e .` in the mcore directory (it will upgrade torch to 2.6)
3. Apply the compatibility patch manually after installation

## Testing

Tested with `verl/examples/ppo_trainer/run_deepseek_megatron.sh`.

Signed-off-by: chendong-1998 <chendong136@huawei.com>
Co-authored-by: chendong-1998 <chendong136@huawei.com>
Co-authored-by: gaoziyuan <gaoziyuan.955@bytedance.com>
Co-authored-by: Sion Gao <gaoziyuan19@mails.ucas.ac.cn>
Yan Bai committed
HL committed
- 02 Mar, 2025 6 commits
Reverts volcengine/verl#314
Chi Zhang committed
Weizhe Chen committed
ZSL98 committed
Specify the IP address when calling the bind method.
Willem Jiang committed
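A minimal sketch of the point with a plain socket (the actual call site in verl is not shown in this commit): bind to a concrete IP rather than the wildcard address:

```python
import socket

# Resolve this node's address; in a cluster this should be the IP
# that peers were told to connect to.
host_ip = socket.gethostbyname(socket.gethostname())

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Binding to a specific interface (instead of "" / 0.0.0.0) ensures the
# service listens on the address that was actually advertised.
sock.bind((host_ip, 12345))
sock.listen()
```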
Guangming Sheng committed
Now APIs can be displayed. [Screenshot of the rendered API page omitted]
HL committed
- 01 Mar, 2025 2 commits
Lumeng Wu committed
Because of ongoing updates in vLLM, veRL currently cannot integrate directly with the nightly build of vLLM. The new DP feature in the nightly version can no longer be bypassed by simply adjusting the `data_parallel_size` parameter, and resolving this requires further investigation. As a temporary workaround, I recommend a customized installation of vLLM if the V1 engine is required. I have updated the relevant documentation to reflect this guidance.
ZSL98 committed
- 28 Feb, 2025 3 commits
Validation should not have shuffling.
Shawn/Yuxuan Tong committed
This is an enhancement for the single batch strategy for `val_dataloader`, making https://github.com/volcengine/verl/pull/353 more robust.
Shawn/Yuxuan Tong committed
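A minimal sketch of both points with a plain PyTorch DataLoader (verl's dataloader construction is not reproduced here): no shuffling for validation, and a single batch covering the whole validation set:

```python
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    def __len__(self):
        return 1000
    def __getitem__(self, i):
        return i

val_dataset = ToyDataset()

# shuffle=False keeps validation deterministic across runs;
# batch_size=len(val_dataset) yields the entire set as a single batch.
val_dataloader = DataLoader(val_dataset, batch_size=len(val_dataset), shuffle=False)
```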
Willem Jiang committed
- 27 Feb, 2025 6 commits
Add TensorBoard to the Tracking backends. Users can set the environment variable `TENSORBOARD_DIR` to specify the TensorBoard log path.
Hongji Zhu committed
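A minimal sketch of how a TensorBoard backend might consume the variable (the fallback directory is an assumption; only the `TENSORBOARD_DIR` name comes from this commit):

```python
import os
from torch.utils.tensorboard import SummaryWriter

# Fall back to a default location when TENSORBOARD_DIR is unset.
log_dir = os.environ.get("TENSORBOARD_DIR", "tensorboard_log")
writer = SummaryWriter(log_dir=log_dir)

writer.add_scalar("train/reward", 0.42, global_step=1)
writer.close()
```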
Chi Zhang committed
The current training script uses the same file for both training and evaluation, which is presumably incorrect.
yaguang committed
[ckpt] Replace DataLoader with StatefulDataLoader to support resuming training with SequentialSampler (#389). Tries to resolve [issue #356](https://github.com/volcengine/verl/issues/356). As suggested in the issue discussion, I replaced the default DataLoader with StatefulDataLoader, which provides `state_dict` and `load_state_dict` methods that support resuming the iterator position for mid-epoch checkpointing.
alexchiu committed
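A minimal sketch of the resume pattern (`StatefulDataLoader` is torchdata's real API, available in torchdata >= 0.8; the checkpoint wiring around it is an assumption):

```python
from torch.utils.data import Dataset
from torchdata.stateful_dataloader import StatefulDataLoader

class RangeDataset(Dataset):
    def __len__(self):
        return 100
    def __getitem__(self, i):
        return i

dataset = RangeDataset()
loader = StatefulDataLoader(dataset, batch_size=8)

it = iter(loader)
next(it)  # consume one batch, then checkpoint mid-epoch
dl_state = loader.state_dict()  # captures the iterator position

# On resume, a freshly constructed loader picks up where it left off.
resumed = StatefulDataLoader(dataset, batch_size=8)
resumed.load_state_dict(dl_state)
for batch in resumed:
    pass  # iteration continues from the saved position
```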
Thanks @HillZhang1999. Related issue: https://github.com/volcengine/verl/issues/189

`(main_task pid=3523385) ValueError: max_num_batched_tokens (8192) is smaller than max_model_len (9216). This effectively limits the maximum sequence length to max_num_batched_tokens and makes vLLM reject longer sequences. Please increase max_num_batched_tokens or decrease max_model_len.`

When `enable_chunked_prefill` is activated, the aforementioned issue is concealed. Please increase `max_num_batched_tokens` or decrease `max_model_len`.
Guangming Sheng committed
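A minimal sketch of the constraint when constructing a vLLM engine (model name and sizes are illustrative; the kwargs are forwarded to vLLM's engine arguments):

```python
from vllm import LLM

max_model_len = 9216

# Without chunked prefill, vLLM requires
# max_num_batched_tokens >= max_model_len; otherwise it raises the
# ValueError quoted above and rejects longer sequences.
llm = LLM(
    model="Qwen/Qwen2-7B-Instruct",
    max_model_len=max_model_len,
    max_num_batched_tokens=max_model_len,  # keep >= max_model_len
    enable_chunked_prefill=False,
)
```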
Chi Zhang committed
- 26 Feb, 2025 2 commits
apis: add DataProto to the documentation page; use `copy_to_local` instead of `copy_local_path_from_hdfs` (#358)
HL committed
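A minimal sketch of the renamed helper (the import path is an assumption based on verl's utils layout; the HDFS path is a placeholder):

```python
# Hypothetical usage; import path assumed, not confirmed by this commit.
from verl.utils.fs import copy_to_local

# Returns a local copy whether the source is on HDFS or already local.
local_path = copy_to_local("hdfs://namenode/models/my-model")
```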
As titled.
Guangming Sheng committed
- 25 Feb, 2025 4 commits
See issue: https://github.com/volcengine/verl/issues/342
Mingjie Liu committed
#369. Co-authored-by: Thom <zhangyi@zhangyideMacBook-Pro.local>
_T_L_R_ committed
kriswang committed
Chi Zhang committed
- 24 Feb, 2025 4 commits
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
湛露先生 committed
BearBiscuit committed
Close #312. Add support for Ulysses SP for transformers >= 4.48. I've tested transformers 4.45.0, 4.46.0, 4.47.0, 4.48.0 and 4.49.0 with sp=2, using the following script in my local env:

```bash
#!/bin/bash
set -ex

VERSIONS=("4.45.0" "4.46.0" "4.47.0" "4.48.0" "4.49.0")

for version in "${VERSIONS[@]}"; do
    echo "Testing with Transformers version ${version}"
    echo "----------------------------------------"
    pip install "transformers==${version}"
    PYTHONPATH=./ torchrun --nproc_per_node=2 tests/model/test_transformers_ulysses.py
    echo "----------------------------------------"
    echo "Completed testing for version ${version}"
    echo ""
done
```
zhou fan committed
Fix issue [#331](https://github.com/volcengine/verl/issues/331).
BearBiscuit committed