Unverified Commit 0dfcb7f9 by 湛露先生 Committed by GitHub

Fix wrong args descriptions (#294)

1. Fix wrong notes descriptions.
2. Fix wrong code paths.

Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
parent 9db52329
@@ -66,7 +66,7 @@ Here, ``SampleGenerator`` can be viewed as a multi-process pulled up by
 the control flow to call. The implementation details inside can use any
 inference engine including vllm, sglang and huggingface. Users can
 largely reuse the code in
-verl/verl/trainer/ppo/rollout/vllm_rollout/vllm_rollout.py and we won't
+verl/verl/workers/rollout/vllm_rollout/vllm_rollout.py and we won't
 go into details here.
 **ReferencePolicy inference**
......
@@ -159,8 +159,8 @@ whether it's a model-based RM or a function-based RM
 - Note that the pre-defined ``RewardModelWorker`` only supports models
 with the structure of huggingface
 ``AutoModelForSequenceClassification``. If it's not this model, you
-need to define your own RewardModelWorker in `FSDP Workers <https://github.com/volcengine/verl/blob/main/verl/trainer/ppo/workers/fsdp_workers.py>`_
-and `Megatron-LM Workers <https://github.com/volcengine/verl/blob/main/verl/trainer/ppo/workers/megatron_workers.py>`_.
+need to define your own RewardModelWorker in `FSDP Workers <https://github.com/volcengine/verl/blob/main/verl/workers/fsdp_workers.py>`_
+and `Megatron-LM Workers <https://github.com/volcengine/verl/blob/main/verl/workers/megatron_workers.py>`_.
 - If it's a function-based RM, the users are required to specify the
 reward function for each dataset.
......
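The hunk above notes that a function-based RM needs a reward function specified per dataset. A minimal, self-contained sketch of that pattern (the registry and function names here are illustrative assumptions, not verl's actual API):

```python
# Hypothetical sketch of per-dataset reward functions for a function-based RM.
# REWARD_FN_REGISTRY, gsm8k_reward, and compute_reward are illustrative names,
# not part of verl's API.

def gsm8k_reward(response: str, ground_truth: str) -> float:
    # Toy exact-match reward: 1.0 if the answer matches, else 0.0.
    return 1.0 if response.strip() == ground_truth.strip() else 0.0

REWARD_FN_REGISTRY = {
    "gsm8k": gsm8k_reward,
}

def compute_reward(dataset: str, response: str, ground_truth: str) -> float:
    # Dispatch to the reward function registered for this dataset.
    return REWARD_FN_REGISTRY[dataset](response, ground_truth)
```

In practice each dataset row would carry a tag naming which reward function applies to it.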
@@ -77,7 +77,7 @@ def merge_megatron_ckpt_llama(wrapped_models, config, is_value_model=False, dtyp
 """Merge sharded parameters of a Megatron module into a merged checkpoint.
 Args:
-    wrapped_modelss (list of megatron.model.DistributedDataParallel):
+    wrapped_models (list of megatron.model.DistributedDataParallel):
     The local DDP wrapped megatron modules.
 dtype (str or None):
 The data type of state_dict. if None, the data type of the original parameters
......
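The corrected docstring describes merging sharded parameters from a list of DDP-wrapped modules into one checkpoint. Conceptually, the merge gathers each parameter's shards in rank order and concatenates them. A toy, stdlib-only sketch of that idea, using plain lists in place of tensors (this is not verl's actual `merge_megatron_ckpt_llama` logic):

```python
# Toy illustration of merging sharded parameter state dicts.
# Real code would concatenate torch tensors along the sharded dimension;
# plain Python lists stand in for tensors here.

def merge_sharded_state_dicts(shards):
    """Concatenate each parameter's shards, in rank order, into one dict."""
    merged = {}
    for name in shards[0]:
        merged[name] = [x for shard in shards for x in shard[name]]
    return merged
```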
@@ -77,7 +77,7 @@ def create_huggingface_actor(model_name: str, override_config_kwargs=None, autom
 Args:
     model_name:
-    actor_override_config_kwargs:
+    override_config_kwargs:
 Returns:
......
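The corrected docstring names ``override_config_kwargs``; the usual pattern behind such a parameter is to start from the model's default config and overlay user-supplied fields before constructing the actor. A hedged, self-contained sketch of that pattern (``build_config`` is a hypothetical helper, not verl's code):

```python
# Illustrative config-override pattern; build_config is hypothetical.

def build_config(defaults: dict, override_config_kwargs=None) -> dict:
    # Copy the defaults, then overlay any user-supplied overrides.
    cfg = dict(defaults)
    cfg.update(override_config_kwargs or {})
    return cfg
```

The copy-then-update order keeps defaults intact while letting any user-supplied key win.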