1. 21 Mar, 2025 4 commits
  2. 20 Mar, 2025 5 commits
  3. 19 Mar, 2025 1 commit
  4. 18 Mar, 2025 3 commits
  5. 17 Mar, 2025 4 commits
  6. 16 Mar, 2025 3 commits
  7. 15 Mar, 2025 2 commits
  8. 14 Mar, 2025 8 commits
  9. 13 Mar, 2025 6 commits
    • fix: remove redundant broadcast in fsdp vllm postprocess (#577) · f7e183e4
      Remove the redundant broadcast in the FSDP vllm postprocess, since the vllm
      output on each TP rank should be identical.
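      The idea, as a hedged sketch (simplified names, not the actual verl code):
      every TP rank runs the same vLLM sampling and therefore already holds
      identical outputs, so the post-generation broadcast from the TP source rank
      can be dropped.
      ```python
      import torch.distributed as dist

      def postprocess_outputs(outputs, tp_group=None):  # tp_group: hypothetical handle
          # Before: results were broadcast from the TP source rank to every rank.
          # dist.broadcast_object_list(outputs, src=0, group=tp_group)
          # After: each rank simply keeps its local outputs, which are already identical.
          return outputs
      ```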
      Joel committed
    • fix: remove redundant torch.cuda.empty_cache() (#575) · 3fc3e2b7
      #556 attempted to remove unnecessary empty_cache calls, but doing so causes
      a CUDA OOM at vllm wake_up:
      ```text
        File "/opt/tiger/ray/session_2025-03-13_12-11-30_408315_2895/runtime_resources/working_dir_files/_ray_pkg_a64b690733067c5c/verl/workers/fsdp_workers.py", line 481, in generate_sequences
          with self.rollout_sharding_manager:
        File "/opt/tiger/ray/session_2025-03-13_12-11-30_408315_2895/runtime_resources/working_dir_files/_ray_pkg_a64b690733067c5c/verl/workers/sharding_manager/fsdp_vllm.py", line 82, in __enter__
          self.inference_engine.wake_up()
        File "/usr/local/lib/python3.11/dist-packages/vllm/entrypoints/llm.py", line 1244, in wake_up
          self.llm_engine.wake_up()
        File "/usr/local/lib/python3.11/dist-packages/vllm/engine/llm_engine.py", line 1859, in wake_up
          self.model_executor.wake_up()
        File "/usr/local/lib/python3.11/dist-packages/vllm/executor/executor_base.py", line 216, in wake_up
          self.collective_rpc("wake_up")
        File "/usr/local/lib/python3.11/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
          answer = run_method(self.driver_worker, method, args, kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.11/dist-packages/vllm/utils.py", line 2196, in run_method
          return func(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.11/dist-packages/vllm/worker/worker.py", line 140, in wake_up
          allocator.wake_up()
        File "/usr/local/lib/python3.11/dist-packages/vllm/device_allocator/cumem.py", line 207, in wake_up
          create_and_map(handle)
        File "/usr/local/lib/python3.11/dist-packages/vllm/device_allocator/cumem.py", line 75, in create_and_map
          python_create_and_map(*allocation_handle)
      RuntimeError: CUDA Error: out of memory at /workspace/csrc/cumem_allocator.cpp:62
      ```
      This PR removes all redundant `torch.cuda.empty_cache()` calls in the FSDP
      worker and only empties the cache before vllm wake_up and after vllm sleep,
      since vllm has its own caching memory allocator
      [CuMemAllocator](https://github.com/vllm-project/vllm/blob/v0.7.3/vllm/device_allocator/cumem.py#L103).
      Outside the vllm scope, we should avoid emptying the cache so that PyTorch
      can use its caching allocator to speed up memory allocations.
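      A minimal sketch of the intended pattern (names simplified from
      fsdp_vllm.py; this is illustrative, not the exact implementation):
      ```python
      import torch

      class RolloutShardingManager:
          """Empty the PyTorch cache only around vllm wake_up/sleep, nowhere else."""

          def __init__(self, inference_engine):
              self.inference_engine = inference_engine

          def __enter__(self):
              # Release PyTorch's cached blocks so CuMemAllocator can map its pool.
              torch.cuda.empty_cache()
              self.inference_engine.wake_up()

          def __exit__(self, exc_type, exc_value, exc_tb):
              self.inference_engine.sleep(level=1)
              # Reclaim fragments once vllm has released its memory.
              torch.cuda.empty_cache()
      ```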
      
      - [x] Cleanup FSDP worker torch.cuda.empty_cache()
      - [ ] Cleanup Megatron worker torch.cuda.empty_cache()
      Joel committed
    • [bugfix] PRIME filter overlong prompts & padding side incorrect & use xformers (#570) · 9bb02d27
      ### Description
      - fix filter_overlong_prompts setting in PRIME
      
      - fix incorrect padding side for Qwen in PRIME

      - When I used the PRIME recipe to train Qwen-series models, I got
      “*ValueError: You are attempting to perform batched generation with
      padding_side='right' this may lead to unexpected behaviour for Flash
      Attention version of Qwen2. Make sure to call tokenizer.padding_side =
      'left' before tokenizing the input.*” So I set `use_cache=False` when
      calling the model to compute output logits.
      
      - fix CUDA error with vllm v0.6.3

      - When running PRIME, I sometimes got *CUDA error: an illegal memory
      access was encountered*. Following
      https://github.com/vllm-project/vllm/issues/10389, I set
      `VLLM_ATTENTION_BACKEND=XFORMERS`. A sketch of both workarounds is shown
      below.
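      A hedged sketch of the two workarounds (illustrative only; function and
      variable names here are not the actual PRIME code):
      ```python
      import os
      import torch

      # Workaround for the vllm v0.6.3 illegal-memory-access error (vllm issue #10389):
      # force the xformers attention backend before vllm is initialized.
      os.environ["VLLM_ATTENTION_BACKEND"] = "XFORMERS"

      def compute_logits(model, input_ids, attention_mask):
          """Compute output logits without the KV cache, avoiding the
          padding_side='right' batched-generation error for Flash-Attention Qwen2."""
          with torch.no_grad():
              output = model(input_ids=input_ids,
                             attention_mask=attention_mask,
                             use_cache=False)
          return output.logits
      ```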
      CajZella committed
    • [bugfix] fix: generation script (#542) · 79e072f1
      # Description
      - Corrected dummy size to avoid faulty communication.
      - Fixed batch number calculation.
      - Adjusted worker group role to alleviate memory overhead.
      - Added ray.init() to prevent failures when registering workers.
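      The last item amounts to explicitly initializing Ray before the worker
      groups are created; a minimal sketch:
      ```python
      import ray

      # Start (or attach to) a Ray runtime explicitly so worker registration
      # does not depend on implicit initialization elsewhere.
      if not ray.is_initialized():
          ray.init()
      ```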
      Dai, Weinan committed
    • [rollout] feat: support sampling in validation stage (#553) · d5de9f4c
      Currently, greedy decoding is applied in the validation stage. However, in
      some reasoning tasks, we may need to generate n samples and average the
      scores.

      In this PR, we support non-greedy sampling parameters during validation by
      specifying `val_kwargs` in the `actor_rollout_ref.rollout` config field, as
      sketched below.
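      A hypothetical sketch of what such validation-time overrides could look
      like; the exact keys under `val_kwargs` are assumptions here, not the
      verified verl schema:
      ```python
      # Validation-only sampling overrides (illustrative values).
      val_kwargs = {
          "do_sample": True,   # sample instead of greedy decoding
          "temperature": 0.6,
          "top_p": 0.95,
          "n": 4,              # generate n responses per prompt and average the scores
      }

      def get_sampling_params(default_params: dict, validate: bool) -> dict:
          """Apply the validation overrides only during the validation stage."""
          params = dict(default_params)
          if validate:
              params.update(val_kwargs)
          return params
      ```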
      
      
      **Future work**
      - [ ] Merge `vllm_rollout_spmd.py` and `vllm_rollout.py` into one file.
      Guangming Sheng committed
  10. 12 Mar, 2025 4 commits
    • refactor: remove custom vllm weight loader and use model.load_weights directly (#543) · 6680185c
      As we're moving to vllm>=0.7.3, we should remove `verl/third_party`
      completely in the future.
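      The gist, as a hedged sketch (argument names are illustrative, not the
      exact verl signature): feed the updated weights straight into the vllm
      model's own loader instead of going through a custom loader in
      `verl/third_party`.
      ```python
      def sync_weights_to_vllm(vllm_model, state_dict):
          # vllm models accept an iterable of (name, tensor) pairs here.
          vllm_model.load_weights(state_dict.items())
      ```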
      Joel committed
    • Add Math-Verify Support (#545) · d4a00ef0
      # Description
      
      https://github.com/volcengine/verl/issues/287,
      https://github.com/volcengine/verl/issues/295.
      This PR introduces support for
      [Math-Verify](https://github.com/huggingface/Math-Verify) as a new
      rule-based reward scorer, significantly improving evaluation accuracy.
      
      # Key changes
      
      - Added `math-verify` to the installation dependencies.
      - Introduced `reward_score/math_verify.py` and updated
      `reward_score/__init__.py`.
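      An illustrative sketch of what a Math-Verify based scorer can look like
      (not the exact contents of `reward_score/math_verify.py`):
      ```python
      from math_verify import parse, verify

      def compute_score(model_output: str, ground_truth: str) -> float:
          """Return 1.0 if the extracted answer matches the gold answer, else 0.0."""
          gold = parse(ground_truth)
          answer = parse(model_output)
          return 1.0 if verify(gold, answer) else 0.0
      ```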
      
      # Test
      
      Comparison between the existing scorer in `math.py` and the newly added
      `math_verify.py`, using Qwen2.5-Math-7B-Instruct:
      
      ```
      # Use scorer in math.py (original)
      {'val/test_score/DigitalLearningGmbH/MATH-lighteval': 0.803}
      
      # Use scorer in math_verify.py (newly added)
      {'val/test_score/DigitalLearningGmbH/MATH-lighteval': 0.8338}
      ```
      
      Test scripts:
      
      ```bash
      set -x
      
      # Data Process
      python examples/data_preprocess/math_dataset.py --local_dir /workspace/datasets/math
      
      # Evaluation
      export CUDA_VISIBLE_DEVICES=4,5,6,7
      export VLLM_ATTENTION_BACKEND=XFORMERS
      
      math_train_path=/workspace/datasets/math/train.parquet
      math_test_path=/workspace/datasets/math/test.parquet
      
      python3 -m verl.trainer.main_ppo \
          data.train_files="$math_train_path" \
          data.val_files="$math_test_path" \
          data.max_prompt_length=2048 \
          data.max_response_length=2048 \
          actor_rollout_ref.model.path=Qwen/Qwen2.5-Math-7B-Instruct \
          actor_rollout_ref.rollout.tensor_model_parallel_size=1 \
          actor_rollout_ref.rollout.name=vllm \
          actor_rollout_ref.rollout.gpu_memory_utilization=0.6 \
          actor_rollout_ref.rollout.n=1 \
          actor_rollout_ref.rollout.temperature=0 \
          trainer.logger=['console'] \
          trainer.project_name='test-math-verify' \
          trainer.experiment_name='test-math-verify' \
          +trainer.val_before_train=True \
          trainer.n_gpus_per_node=4 \
          trainer.nnodes=1 \
          trainer.total_epochs=0 \
          data.train_batch_size=1024 \
          actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=1 \
          actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=1 \
          actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=1 \
          algorithm.adv_estimator=grpo $@
      ```
      Yuyang Ding committed