Unverified Commit 3fe77fa7 by Xingyao Wang Committed by GitHub

docs: Add LigerKernel performance tuning documentation (#178)

This PR adds documentation for the LigerKernel option in a new
performance tuning section, addressing the comment from
volcengine/verl#173.

Changes:
- Created new performance tuning section in docs
- Documented LigerKernel option for SFT
- Added performance tuning section to documentation index

Related to volcengine/verl#173

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: HL <linhaibin.eric@gmail.com>
parent 13762f43
@@ -11,6 +11,8 @@ In this section, we will discuss how to tune the performance of all the stages i
4. Utilize Ulysses Sequence Parallel for Long Context Training
5. LigerKernel for SFT performance optimization

Rollout Generation Tuning
-------------------------
@@ -119,3 +121,20 @@ To utilize this technique, users can set ``ulysses_sequence_parallel_size>1`` in
We support using different ``ulysses_sequence_parallel_size`` values for different models.
To train on long sequences (>32k tokens), users may need to decrease ``*micro_batch_size_per_gpu`` and ``*max_token_len_per_gpu`` to avoid OOM.
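As a sketch, enabling Ulysses sequence parallelism is a one-line config change; the exact key path below is illustrative and depends on your trainer config layout:

.. code-block:: yaml

   # Illustrative sketch: shard each sequence across 2 GPUs for the actor.
   # Set the analogous key in the ref/critic/reward model configs as needed.
   actor_rollout_ref:
     actor:
       ulysses_sequence_parallel_size: 2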

LigerKernel for SFT
-------------------

LigerKernel is a set of high-performance kernels for LLM training that can improve throughput and reduce memory usage, which makes it particularly useful for Supervised Fine-Tuning (SFT). To enable LigerKernel in your SFT training:

1. In your SFT configuration file (e.g., ``verl/trainer/config/sft_trainer.yaml``), set the ``use_liger`` parameter:

   .. code-block:: yaml

      model:
        use_liger: True  # Enable LigerKernel for SFT

2. The default value is ``False``. Enable it only when you want to use LigerKernel's optimizations.
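Alternatively, the flag can be toggled per run instead of editing the YAML. The launch sketch below assumes verl's hydra-style command-line overrides; the module path, GPU count, and model name are illustrative and may differ in your setup:

.. code-block:: shell

   # Illustrative sketch: override use_liger on the command line
   # (module path and model are assumptions, not verbatim from this doc).
   torchrun --nproc_per_node=8 -m verl.trainer.fsdp_sft_trainer \
       model.partial_pretrain=Qwen/Qwen2.5-0.5B-Instruct \
       model.use_liger=True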