Unverified Commit 16b1984a by 湛露先生 Committed by GitHub

fix: typo (#243)

Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
parent 7e073496
......@@ -130,7 +130,7 @@ See `source code <https://github.com/volcengine/verl/blob/main/verl/trainer/ppo/
- In this function, the rollout model will perform auto-regressive
generation and the actor model will recompute the old log prob for the
-generetad response.
+generated response.
3. Update actor model
......