docs: add reference for tiny-zero

Commit c17e6c62 (unverified), authored Jan 26, 2025 by HL, committed by GitHub on Jan 26, 2025

Showing 2 additions and 1 deletion

README.md
+2 -1

--- a/README.md
+++ b/README.md
@@ -98,9 +98,10 @@ If you find the project helpful, please cite:

 verl is inspired by the design of Nemo-Aligner, Deepspeed-chat and OpenRLHF. The project is adopted and supported by Anyscale, Bytedance, LMSys.org, Shanghai AI Lab, Tsinghua University, UC Berkeley, UCLA, UIUC, and University of Hong Kong.

-## Publications Using veRL
+## Awesome work using veRL
 - [Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization](https://arxiv.org/abs/2410.09302)
 - [Flaming-hot Initiation with Regular Execution Sampling for Large Language Models](https://arxiv.org/abs/2410.21236)
 - [Process Reinforcement Through Implicit Rewards](https://github.com/PRIME-RL/PRIME/)
+- [TinyZero](https://github.com/Jiayi-Pan/TinyZero): a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks

 We are HIRING! Send us an [email](mailto:haibin.lin@bytedance.com) if you are interested in internship/FTE opportunities in MLSys/LLM reasoning/multimodal alignment.