1. 27 Feb, 2025 2 commits
  2. 26 Feb, 2025 2 commits
  3. 25 Feb, 2025 4 commits
  4. 24 Feb, 2025 5 commits
  5. 23 Feb, 2025 2 commits
  6. 22 Feb, 2025 2 commits
  7. 21 Feb, 2025 2 commits
  8. 20 Feb, 2025 1 commit
  9. 19 Feb, 2025 7 commits
  10. 18 Feb, 2025 3 commits
  11. 17 Feb, 2025 3 commits
    • Fix wrong args description. (#294) · 0dfcb7f9
      1. Fix wrong notes description.
      2. Fix wrong code path.
      
      Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
      湛露先生 committed
    • [misc] feat: support offload parameter and optimizer during rollout (#284) · 9db52329
      - Fixed FSDP1 model offload.
      - With `actor_rollout_ref.actor.fsdp_config.param_offload=True` and
      `actor_rollout_ref.actor.fsdp_config.optimizer_offload=True`, GPU memory
      utilization can be increased to 0.9.
      - With actor, critic, and reference offload all enabled, there is only
      one model copy in GPU memory at a time, so we can further increase
      `micro_batch_size_per_gpu` or `max_token_per_gpu`.
      
      **Specifically:**
      - During rollout, only the rollout model and KV cache are in GPU memory.
      - During critic value computation, only the critic model stays in GPU
      memory, while its optimizer and other model states reside in CPU main
      memory.
      - During actor update, the actor model and optimizer are stored on the
      GPU, while the reference model, critic model, and critic optimizer are
      offloaded to CPU.
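      The offload flags above can be passed as Hydra-style overrides on the
      trainer command line. A minimal sketch, assuming the standard
      `verl.trainer.main_ppo` entrypoint; the `gpu_memory_utilization` flag
      path is an assumption, not confirmed by this commit:

```shell
# Sketch: enable actor parameter and optimizer offload (flags quoted in
# the commit above), then let the rollout engine claim more GPU memory.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.actor.fsdp_config.param_offload=True \
    actor_rollout_ref.actor.fsdp_config.optimizer_offload=True \
    actor_rollout_ref.rollout.gpu_memory_utilization=0.9
```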
      Guangming Sheng committed
    • Enhancement: Support for `extra_info` in Reward Calculation (#266) · f0e5bdf0
      
      #### **Summary**  
      This update enhances the reward computation process by introducing an
      additional `extra_info` parameter. This allows users to pass in more
      contextual information when calculating rewards, improving flexibility
      for different datasets.
      
      #### **Changes Made**  
      - **Updated `_default_compute_score`** to accept an `extra_info`
      argument:
        ```python
        def _default_compute_score(data_source, solution_str, ground_truth,
                                   extra_info):
        ```
      - **Modified the reward manager (`naive.py`)** to pass `extra_info` from
      `data_item.non_tensor_batch` to `compute_score`:
        ```python
        extra_info = data_item.non_tensor_batch['extra_info']
        score = self.compute_score(
            data_source=data_source,
            solution_str=sequences_str,
            ground_truth=ground_truth,
            extra_info=extra_info,
        )
        ```
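      To illustrate how a scorer might consume this metadata, here is a toy,
      hypothetical score function (not verl's actual default); the
      `difficulty` key and the function body are purely illustrative:

```python
# Hypothetical scorer: exact-match reward, optionally weighted by a
# per-sample "difficulty" value carried in extra_info.
def compute_score(data_source, solution_str, ground_truth, extra_info=None):
    score = 1.0 if solution_str.strip() == ground_truth.strip() else 0.0
    if extra_info and "difficulty" in extra_info:
        score *= extra_info["difficulty"]
    return score

print(compute_score("toy_ds", "42", "42", {"difficulty": 0.5}))  # 0.5
```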
        
      #### **Why This Change?**  
      - Some datasets require additional context beyond `data_source`,
      `solution_str`, and `ground_truth` for accurate reward computation.
      - The new `extra_info` field allows users to pass custom metadata,
      ideally in dictionary form, as specified in the [official
      documentation](https://verl.readthedocs.io/en/latest/preparation/prepare_data.html).
      - This change maintains compatibility with existing dataset processing
      scripts, as they already include the `extra_info` field.
      
      #### **Impact**  
      - **Improved flexibility**: Users can now pass additional contextual
      information, making reward computation more adaptable to different
      datasets.
      - **Backward compatibility**: Since all example datasets already include
      `extra_info`, this update should integrate seamlessly.
      
      Taiwei Shi committed
  12. 16 Feb, 2025 1 commit
  13. 15 Feb, 2025 6 commits