verl supports various backends. The currently available configurations are described below.
Training backends
------------------
We recommend using the **FSDP** backend to investigate, research and prototype different models, datasets and RL algorithms. The guide for using the FSDP backend can be found in :doc:`FSDP Workers<../workers/fsdp_workers>`.
For users who pursue better scalability, we recommend using the **Megatron-LM** backend. Currently, we support Megatron-LM v0.4 [1]_. The guide for using the Megatron-LM backend can be found in :doc:`Megatron-LM Workers<../workers/megatron_workers>`.
Install from docker image
-------------------------
...
...
We provide pre-built Docker images for quick setup.
Image and tag: ``verlai/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te1.7-v0.0.3``. See the files under ``docker/`` for the NGC-based image or if you want to build your own image.
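For example, you can fetch the pre-built image ahead of time (the tag is the one listed above):

.. code:: bash

   # pull the pre-built verl image
   docker pull verlai/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te1.7-v0.0.3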
1. Launch the desired Docker image (an illustrative command is sketched below):
...
...
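As a sketch of step 1, a typical launch command might look like the following; the GPU flags, shared-memory size and mount path are placeholders to adapt, not verl's documented invocation.

.. code:: bash

   # illustrative launch: adjust GPU flags, shm size and mounts to your machine
   docker run --gpus all -it --rm \
       --shm-size=10g \
       -v "$(pwd)":/workspace/verl \
       verlai/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te1.7-v0.0.3 \
       /bin/bash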
To install the latest version of verl and customize your own post-training jobs, clone the repository and install it from source in editable mode:

.. code:: bash

   cd verl
   pip3 install -e .
You can also install verl using ``pip3 install``:
.. code:: bash

   # directly install from pypi
   pip3 install verl
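Either way, a quick check confirms the installation (nothing verl-specific is assumed here beyond the package name):

.. code:: bash

   # verify the package is importable and inspect its metadata
   python3 -c "import verl"
   pip3 show verl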
Dependencies
------------
verl requires Python >= 3.9 and CUDA >= 12.1.
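To confirm these requirements on your machine, something like the following works (``nvcc`` availability assumes a local CUDA toolkit install):

.. code:: bash

   python3 --version   # expect 3.9 or newer
   nvcc --version      # expect CUDA 12.1 or newer
   nvidia-smi          # check driver and GPU visibility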
verl supports various backends; we currently release FSDP and Megatron-LM for actor training and vLLM for rollout generation.

The following dependencies are required for all backends (PyTorch FSDP and Megatron-LM). The pros, cons and extension guide for using the PyTorch FSDP backend can be found in :doc:`FSDP Workers<../workers/fsdp_workers>`.
.. code:: bash

   # install torch, with an illustrative pin matching the cu124/torch 2.4.0 image above [or skip this step and let vllm install the correct version for you]
   pip3 install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
The pros, cons and extension guide for using the Megatron-LM backend can be found in :doc:`Megatron-LM Workers<../workers/megatron_workers>`. To use it, apply verl's Megatron v0.4 patch from inside your Megatron-LM checkout and install it:
.. code:: bash

   # inside the Megatron-LM checkout: apply verl's patch, install, and expose it on PYTHONPATH
   cp ../verl/patches/megatron_v4.patch .
   git apply megatron_v4.patch
   pip3 install -e .
   export PYTHONPATH=$PYTHONPATH:$(pwd)
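A minimal sanity check after patching, assuming the editable install and ``PYTHONPATH`` setup above succeeded:

.. code:: bash

   # should import without errors if Megatron-LM core v0.4 is on the path
   python3 -c "import megatron.core"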
.. [1] Megatron v0.4 is supported with verl's patches, which fix issues such as the virtual pipeline hang. It will soon be updated to the latest version of upstream Megatron-LM without patches.
Step 2: Download a model for post-training
-------------------------------------------
Usually we recommend starting with an "instruct" model variant so that the model follows instructions. In this example, we start with the ``Qwen2.5-0.5B-Instruct`` model.
If you start from a "base" model variant, doing SFT before RL is recommended. Refer to the :doc:`Complete GSM8K Example<../examples/gsm8k_example>`, the `sft directory <https://github.com/volcengine/verl/blob/main/examples/sft/gsm8k>`_ and `SFT Trainer <https://github.com/volcengine/verl/blob/main/verl/trainer/fsdp_sft_trainer.py>`_ for further details.
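If you prefer to fetch the model ahead of time, one option is the Hugging Face CLI (the target directory below is just an example; verl can also resolve the model ID at runtime via ``transformers``):

.. code:: bash

   # pre-download the checkpoint from the Hugging Face Hub (local path is illustrative)
   huggingface-cli download Qwen/Qwen2.5-0.5B-Instruct --local-dir "$HOME/models/Qwen2.5-0.5B-Instruct"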