Model Card for Mistral-sci-phi

This model is a fine-tuned version of Mistral-7B, trained with the PEFT library on top of an INT4-quantized base model for memory- and compute-efficient fine-tuning.

Model Details

Model Description

Mistral-sci-phi is fine-tuned from the Mistral-7B base model using the PEFT library on top of an INT4-quantized base, which keeps memory requirements and the size of the trained adapter weights small. It was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset from the Hugging Face Hub, a corpus of textbook-style text, which orients it toward instructional and explanatory text generation.

  • Developed by: Arturo de Pablo
  • Trained by: IZX, Hyper88
  • Model type: Causal Language Model
  • Language(s) (NLP): English
  • License: [More Information Needed]
  • Finetuned from model: mistralai/Mistral-7B-v0.1

Model Sources

[More Information Needed]

Uses

Direct Use

The model can be used directly for English text generation and related natural language tasks.

Downstream Use

It can also be integrated as the text-generation component of larger systems and applications.

Out-of-Scope Use

The model is English-only and was fine-tuned on textbook-style text; it should not be relied on for other languages, for safety-critical decisions, or as an authoritative source of facts without verification.

Bias, Risks, and Limitations

The model inherits the biases and limitations of the base Mistral-7B model, including the possibility of generating inaccurate or biased text. Users should account for these risks when deploying it.

Recommendations

Users should evaluate the model's performance and biases in their specific use case and make adjustments as necessary.

How to Get Started with the Model

The model can be loaded for inference with the Hugging Face Transformers and PEFT libraries.
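
A minimal sketch of loading the adapter with Transformers and PEFT, assuming the adapter is published under a Hub repository ID such as "IZX/Mistral-sci-phi" (a placeholder; substitute the actual repository) and that the base model is loaded in 4-bit to mirror the training setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "IZX/Mistral-sci-phi"  # placeholder: replace with the actual adapter repo ID

# Load the base model in 4-bit (NF4) to mirror the INT4 training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# Attach the fine-tuned PEFT adapter on top of the quantized base.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Explain the difference between mitosis and meiosis."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```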

Training Details

Training Data

The model was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset available on the Hugging Face Hub.
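
A minimal sketch of loading the training corpus with the datasets library; the split name and column schema are assumptions and should be checked against the dataset card:

```python
from datasets import load_dataset

# Load the fine-tuning corpus from the Hugging Face Hub.
dataset = load_dataset("emrgnt-cmplxty/sciphi-textbooks-are-all-you-need", split="train")

# Inspect the schema before training; the exact column names depend on the dataset card.
print(dataset)
print(dataset[0])
```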

Training Procedure

The base model was loaded with INT4 (4-bit) quantization and fine-tuned with PEFT adapters, which reduces GPU memory requirements compared to full-precision fine-tuning; a configuration sketch follows the hyperparameter list below.

Training Hyperparameters

  • Learning rate: 2e-4
  • Batch size: 12
  • Epochs: 3
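
A hedged sketch of the fine-tuning setup, assuming a QLoRA-style configuration: a 4-bit quantized base model with LoRA adapters trained through PEFT. Only the learning rate, batch size, and epoch count come from this card; the LoRA rank, target modules, sequence length, and text column name are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "mistralai/Mistral-7B-v0.1"

# 4-bit (NF4) quantized base model, matching the INT4 setup described on the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter settings below are assumptions, not values reported on the card.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Tokenize the SciPhi textbooks dataset; the text column name is an assumption.
text_column = "completion"
dataset = load_dataset("emrgnt-cmplxty/sciphi-textbooks-are-all-you-need", split="train")

def tokenize(batch):
    return tokenizer(batch[text_column], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Hyperparameters listed on the card: learning rate 2e-4, batch size 12, 3 epochs.
args = TrainingArguments(
    output_dir="mistral-sci-phi",
    learning_rate=2e-4,
    per_device_train_batch_size=12,
    num_train_epochs=3,
    logging_steps=50,
    bf16=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```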

Evaluation

Testing Data, Factors & Metrics

[More Information Needed]

Results

[More Information Needed]

Environmental Impact

Detailed emissions figures are not reported. Fine-tuning a 4-bit quantized base model with PEFT adapters requires substantially less compute than full fine-tuning, which limits the training footprint.

Technical Specifications

Model Architecture and Objective

The model uses the Mistral-7B decoder-only transformer architecture with a causal language modeling objective. Fine-tuning adds PEFT adapter weights on top of the quantized base model.

Compute Infrastructure

Compute was sponsored by izx.ai.

Software

  • PEFT 0.6.0.dev0

More Information

For more details, visit the model repository.

Model Card Authors

[More Information Needed]

Model Card Contact

https://discord.gg/KGCeKP4ng9
