Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -112,6 +112,12 @@ This model is intended for researchers, developers, and organizations seeking a
  ## Training Data
  The `Locutusque/Hyperion-3.0-Mistral-7B-DPO` model was fine-tuned on a carefully curated dataset of 20,000 preference pairs, of which 4,000 examples were used for fine-tuning. These examples were generated by GPT-4 to ensure the highest quality and relevance across various domains, including programming, medical texts, mathematical problems, and reasoning tasks. The training data was further optimized using Direct Preference Optimization (DPO) to align the model's outputs with human preferences and improve overall performance.

+ ## Quants
+
+ ExLlamaV2: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-exl2
+
+ GGUF: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-GGUF
+
  ## Evaluation Results
  mmlu flan cot 5-shot
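
The Training Data context above describes fine-tuning on GPT-4-generated preference pairs with DPO. The sketch below only illustrates what such preference data looks like and how it is commonly passed to TRL's `DPOTrainer`; it is not the author's training script. The base model name, hyperparameters, and the toy dataset are assumptions, and keyword names differ slightly across TRL versions.

```python
# Illustrative DPO sketch (assumed setup, not the card author's recipe).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base model for the sketch
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Each preference pair: a prompt, the preferred answer, and a rejected answer.
train_dataset = Dataset.from_dict({
    "prompt": ["What does DPO optimize?"],
    "chosen": ["DPO optimizes the policy directly on preference pairs, without training a separate reward model."],
    "rejected": ["DPO is a database management system."],
})

args = DPOConfig(output_dir="hyperion-dpo-sketch", beta=0.1, per_device_train_batch_size=1)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL releases call this `tokenizer`
)
trainer.train()
```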
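
As a rough usage sketch for the GGUF quants added in this change: the snippet below downloads one quantized file with `huggingface_hub` and runs it with `llama-cpp-python`. The exact filename (a Q4_K_M variant here), context size, and prompt are assumptions; check the GGUF repo for the quantization level you actually want.

```python
# Sketch: load one GGUF quant with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="bartowski/Hyperion-3.0-Mistral-7B-DPO-GGUF",
    filename="Hyperion-3.0-Mistral-7B-DPO-Q4_K_M.gguf",  # assumed filename
)

llm = Llama(model_path=model_path, n_ctx=4096)

out = llm(
    "Explain Direct Preference Optimization in one paragraph.",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```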