nvidia/Llama3-70B-PPO-Chat


zhilinw committed on Jun 14

Commit 2f90845, 1 parent: 9add37e

Update README.md

Files changed (1)

README.md +1 -1


@@ -34,8 +34,8 @@ You can train the model using [NeMo Aligner](https://github.com/NVIDIA/NeMo-Alig
 ## References
+* [HelpSteer2](https://arxiv.org/abs/2406.08673)
 * [PPO method](https://arxiv.org/abs/2203.02155)
-* [HelpSteer](https://arxiv.org/abs/2311.09528)
 * [Llama 3: Open Foundation and Instruct Models](https://ai.meta.com/blog/meta-llama-3/) <br>
 * [Meta's Llama 3 Webpage](https://llama.meta.com/llama3/) <br>
 * [Meta's Llama 3 Model Card](https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md) <br>