# mistral-nemo-gutenberg-12B-v4-exl2
This repository contains various EXL2 quantisations of nbeerbower/mistral-nemo-gutenberg-12B-v4.
Quantisations available:
| Branch | Description | Recommended |
|---|---|---|
| 2.0-bpw | 2 bits per weight | Low quality - smallest available quantisation |
| 3.0-bpw | 3 bits per weight | |
| 4.0-bpw | 4 bits per weight | ✔️ - Recommended for low-VRAM environments |
| 5.0-bpw | 5 bits per weight | |
| 6.0-bpw | 6 bits per weight | ✔️ - Best quality/VRAM balance |
| 6.5-bpw | 6.5 bits per weight | ✔️ - Near-perfect quality, slightly higher VRAM usage |
| 8.0-bpw | 8 bits per weight | Best available quality - almost always unnecessary |
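As a rough guide to choosing a branch, the weight-only memory footprint of a ~12B-parameter model scales linearly with bits per weight. The sketch below is a back-of-the-envelope estimate only (the parameter count is approximate, and real VRAM usage also includes the KV cache and activation buffers, so treat these numbers as lower bounds):

```python
# Rough weight-only VRAM estimate for a ~12B-parameter model at the
# EXL2 bitrates offered above. Illustrative arithmetic, not measured usage.
PARAMS = 12_000_000_000  # approximate parameter count (assumption)

def weight_gib(bpw: float, params: int = PARAMS) -> float:
    """Size of the quantised weights alone, in GiB."""
    return params * bpw / 8 / 1024**3

for bpw in (2.0, 4.0, 6.0, 6.5, 8.0):
    print(f"{bpw:>4} bpw ≈ {weight_gib(bpw):.1f} GiB")
```

This is why 4.0-bpw suits low-VRAM setups while 8.0-bpw is rarely worth the extra memory over 6.5-bpw.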
## Original README
TheDrummer/Rocinante-12B-v1 finetuned on jondurbin/gutenberg-dpo-v0.1.
### Method
Finetuned using an A100 on Google Colab for 3 epochs.