Commit 018b0d73 by Chi Zhang, committed by GitHub

[misc] chore: refactor and add several metrics (#111)

- Add format script
- Move save_checkpoint to a separate function
- Add timing/step, response_length/clip_ratio, prompt_length/clip_ratio
and critic/vf_explained_var metrics
- The training step starts from 1
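
The commit message names the new metrics but does not define them. As a rough illustration only, the sketch below shows one common way such values could be computed. The formulas are assumptions rather than the commit's actual implementation: explained variance is taken as 1 - Var(returns - values) / Var(returns), and a clip ratio is taken as the fraction of sequences whose length reached the configured maximum. The helper names `explained_variance` and `clip_ratio` are hypothetical.

```python
from statistics import pvariance


def explained_variance(values, returns):
    """Assumed definition: 1 - Var(returns - values) / Var(returns).

    Returns nan when the returns are constant, since the ratio is undefined.
    """
    var_returns = pvariance(returns)
    if var_returns == 0:
        return float("nan")
    residuals = [r - v for r, v in zip(returns, values)]
    return 1.0 - pvariance(residuals) / var_returns


def clip_ratio(lengths, max_length):
    """Assumed definition: fraction of sequences that hit max_length."""
    return sum(1 for n in lengths if n >= max_length) / len(lengths)


# Hypothetical batch: a critic that predicts the returns exactly,
# and four responses of which two reached the 512-token limit.
metrics = {
    "critic/vf_explained_var": explained_variance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    "response_length/clip_ratio": clip_ratio([512, 300, 512, 100], max_length=512),
}
print(metrics)
```

Under these assumed definitions, an explained variance near 1 means the value function tracks the returns closely, and a high clip ratio suggests many sequences are being truncated at the length limit.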
parent ff0c7ccd
#!/bin/bash
pip3 install --upgrade yapf
yapf -ir -vv --style ./.style.yapf verl tests single_controller examples
\ No newline at end of file
@@ -132,7 +132,7 @@ trainer:
nnodes: 1
n_gpus_per_node: 8
save_freq: -1
-test_freq: 2
+test_freq: -1
critic_warmup: 0
default_hdfs_dir: ~/experiments/gsm8k/ppo/${trainer.experiment_name}
default_local_dir: checkpoints/${trainer.project_name}/${trainer.experiment_name}