
zephyr_0.1

The DPO-trained model initialized from alignment-handbook/zephyr-7b-sft-full, trained on 10% of the HuggingFaceH4/ultrafeedback_binarized data, as described in the paper "Weak-to-Strong Extrapolation Expedites Alignment".

Model size: 7.24B params · Tensor type: BF16 · Format: Safetensors
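
Below is a minimal loading and generation sketch using the standard Hugging Face transformers chat workflow. It assumes this checkpoint inherits a chat template from its zephyr-7b-sft-full base; the prompt and generation settings are illustrative, not prescribed by this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chujiezheng/zephyr_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",
)

# Zephyr-style models are chat models; apply_chat_template formats the prompt.
messages = [{"role": "user", "content": "Explain DPO in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading in BF16 matches the stored tensor type and roughly halves memory relative to FP32; adjust `device_map` and generation parameters to your hardware and use case.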
