verl: Commit 55093874 (unverified)
Authored Feb 16, 2025 by HL; committed by GitHub on Feb 16, 2025.
example: fix remove padding flags for gemma example. update v0.2 install docs (#290)
Parent: 0c32cf78
Showing 2 changed files with 5 additions and 8 deletions:
  docs/start/install.rst               (+3, -6)
  examples/ppo_trainer/run_gemma.sh    (+2, -2)
docs/start/install.rst (view file @ 55093874)
@@ -38,9 +38,8 @@ Image and tag: ``verlai/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te1.7-v0.0.3`
 .. code:: bash

     # install the nightly version (recommended)
     git clone https://github.com/volcengine/verl && cd verl && pip3 install -e .
-    # or install from pypi via `pip3 install verl`
+    # install the stable version
+    pip3 install verl

 3. Setup Megatron (optional)
@@ -83,9 +82,7 @@ own post-training jobs.
     # install verl together with some lightweight dependencies in setup.py
     pip3 install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
     pip3 install flash-attn --no-build-isolation
-    git clone https://github.com/volcengine/verl.git
-    cd verl
-    pip3 install -e .
+    pip3 install verl

 Megatron is optional. It's dependencies can be setup as below:
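For quick reference, the two install paths the updated docs describe reduce to the sketch below. This is assembled from the hunks above, not taken verbatim from the file; the docs text elided between the hunks is unchanged.

    # stable release from PyPI (presumably the v0.2 the commit message refers to)
    pip3 install verl

    # nightly version (recommended), installed from source
    git clone https://github.com/volcengine/verl && cd verl && pip3 install -e .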
examples/ppo_trainer/run_gemma.sh (view file @ 55093874)
@@ -9,7 +9,7 @@ python3 -m verl.trainer.main_ppo \
     data.max_response_length=512 \
     actor_rollout_ref.model.path=google/gemma-2-2b-it \
     actor_rollout_ref.actor.optim.lr=1e-6 \
-    actor_rollout_ref.model.use_remove_padding=True \
+    actor_rollout_ref.model.use_remove_padding=False \
     actor_rollout_ref.actor.ppo_mini_batch_size=128 \
     actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=4 \
     actor_rollout_ref.actor.fsdp_config.param_offload=False \
@@ -22,7 +22,7 @@ python3 -m verl.trainer.main_ppo \
     actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=4 \
     actor_rollout_ref.ref.fsdp_config.param_offload=True \
     critic.optim.lr=1e-5 \
-    critic.model.use_remove_padding=True \
+    critic.model.use_remove_padding=False \
     critic.model.path=google/gemma-2-2b-it \
     critic.model.enable_gradient_checkpointing=False \
     critic.ppo_micro_batch_size_per_gpu=4 \
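Both hunks apply the same fix: use_remove_padding flips from True to False, once for the actor/rollout model and once for the critic. The commit message only says it fixes the flags; presumably the remove-padding (sequence-packing) path did not support the Gemma-2 architecture in verl at the time, so the example disables it. A minimal way to confirm a local checkout has the fixed flags (the grep below is my sketch, not part of the commit):

    grep use_remove_padding examples/ppo_trainer/run_gemma.sh
    # expected output after this commit:
    #     actor_rollout_ref.model.use_remove_padding=False \
    #     critic.model.use_remove_padding=False \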