Files · 5a400bf25a0f50788dbca2d77071cc00ce651b59 · ZhangXiaoyun / verl

[ckpt] feat: integrate checkpoint resume in RL ray trainer (#222) · 5a400bf2

**Features:**
- Save actor and critic checkpoints:
  - Model
  - Optimizer
  - lr_scheduler
  - rng_state
  - dataloader
- A checkpoint is complete only when the dataloader, actor, and critic (if
any) states have all been saved
- By default, the dataset itself is not saved; only the dataloader
(with sampler) state is stored
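
To illustrate why storing only the dataloader (with sampler) state can be enough, here is a minimal sketch in plain Python. It is not veRL's implementation: `ResumableSampler` and its fields are hypothetical names, and it assumes the shuffle order can be rebuilt from a seed and an epoch counter, so the saved state is a small dict rather than the data.

```python
import random


class ResumableSampler:
    """Hypothetical shuffling index sampler whose position can be saved."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0
        self.position = 0  # number of indices already yielded this epoch

    def _order(self):
        # The permutation depends only on (seed, epoch), so it can be
        # rebuilt after a restart without storing any dataset contents.
        rng = random.Random(self.seed * 1_000_003 + self.epoch)
        order = list(range(self.num_samples))
        rng.shuffle(order)
        return order

    def __iter__(self):
        order = self._order()
        while self.position < self.num_samples:
            index = order[self.position]
            self.position += 1
            yield index
        # Epoch finished: the next iteration uses a fresh permutation.
        self.epoch += 1
        self.position = 0

    def state_dict(self):
        return {"seed": self.seed, "epoch": self.epoch, "position": self.position}

    def load_state_dict(self, state):
        self.seed = state["seed"]
        self.epoch = state["epoch"]
        self.position = state["position"]


# Consume part of an epoch, then save only the sampler state.
sampler = ResumableSampler(num_samples=10, seed=42)
it = iter(sampler)
seen = [next(it) for _ in range(4)]
saved = sampler.state_dict()

# A fresh sampler restored from that state continues where the first stopped.
resumed = ResumableSampler(num_samples=10)
resumed.load_state_dict(saved)
rest = list(resumed)
```

Together, `seen` and `rest` cover every index exactly once, in the same order an uninterrupted run would have produced.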

**Usage:**
- Supported resume modes: auto, disable, and resume_from_path
  - auto: veRL automatically looks for the latest checkpoint in
    `trainer.default_local_dir`
  - disable: veRL always trains from scratch
  - resume_from_path: when `resume_from_path` is set to True, the user only
    needs to set `resume_mode` to the path of the checkpoint to load
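
The three resume modes can be sketched as a small path-resolution function. This is an illustration under stated assumptions, not the trainer's actual code: the function names are hypothetical, and the `global_step_<N>` directory naming is assumed purely for the example.

```python
import os
import re
import tempfile

# Assumed (hypothetical) naming scheme for per-step checkpoint directories.
STEP_DIR = re.compile(r"^global_step_(\d+)$")


def find_latest_checkpoint(local_dir):
    """Return the step directory with the highest step number, or None."""
    if not os.path.isdir(local_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(local_dir):
        match = STEP_DIR.match(name)
        path = os.path.join(local_dir, name)
        if match and os.path.isdir(path) and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


def resolve_resume_path(resume_mode, default_local_dir, resume_from_path=False):
    """Return the checkpoint directory to load, or None to train from scratch."""
    if resume_mode == "disable":
        return None
    if resume_mode == "auto":
        # Falls back to None (train from scratch) if nothing has been saved yet.
        return find_latest_checkpoint(default_local_dir)
    if resume_from_path:
        # In this mode, resume_mode itself carries the checkpoint path.
        if not os.path.isdir(resume_mode):
            raise FileNotFoundError(f"checkpoint not found: {resume_mode}")
        return resume_mode
    raise ValueError(f"unsupported resume_mode: {resume_mode!r}")


# Exercise all three modes against a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for step in (10, 200, 30):
        os.makedirs(os.path.join(root, f"global_step_{step}"))
    latest_name = os.path.basename(resolve_resume_path("auto", root))
    explicit_name = os.path.basename(
        resolve_resume_path(
            os.path.join(root, "global_step_10"), root, resume_from_path=True
        )
    )
    scratch = resolve_resume_path("disable", root)
```

Comparing step numbers as integers rather than as strings matters here: a lexicographic sort would rank `global_step_30` above `global_step_200`.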

**TODO:**
- Support SFT resume in the next PR
- Support uploader

**Relevant issues:**
- https://github.com/volcengine/verl/issues/76
- https://github.com/volcengine/verl/issues/143

committed Feb 08, 2025


Name
.github/workflows
docker
docs
examples
patches
scripts
tests
verl
.gitignore
.readthedocs.yaml
.style.yapf
LICENSE
Notice.txt
README.md
pyproject.toml
requirements.txt
setup.py

README.md