mistral-sft-spin-filter

This model is a fine-tuned version of AmberYifan/mistral-safe-sft-full on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1558
  • Rewards/real: 10.9847
  • Rewards/generated: -8.6527
  • Rewards/accuracies: 0.9750
  • Rewards/margins: 19.6374
  • Logps/generated: -349.4182
  • Logps/real: -127.9232
  • Logits/generated: -2.7690
  • Logits/real: -2.7222
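
The reward figures above appear internally consistent: the reported margin equals the real reward minus the generated reward. The short check below uses only the values listed above. The variable names are illustrative, and the relationship margin = real - generated is inferred from the numbers; the card itself does not state it.

```python
# Values copied from the evaluation results above.
rewards_real = 10.9847
rewards_generated = -8.6527
reported_margin = 19.6374

# Assumption: the margin is the real reward minus the generated reward.
computed_margin = rewards_real - rewards_generated

print(f"computed margin: {computed_margin:.4f}")
print(f"matches reported: {abs(computed_margin - reported_margin) < 1e-4}")
```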

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • total_train_batch_size: 12
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
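
As a rough illustration of how these settings combine, the sketch below recomputes the effective batch size and traces a linear schedule with warmup. It is a sketch under stated assumptions, not the training code: gradient accumulation is assumed to be 1 (the card does not list it), the total step count is estimated from the last logged row of the training results (step 1800 at epoch 0.9212), and `linear_warmup_lr` is a hypothetical helper that only mimics the usual shape of a linear warmup and decay schedule.

```python
import math

# Hyperparameters copied from the list above.
learning_rate = 5e-07
train_batch_size = 4
num_devices = 3
warmup_ratio = 0.1

# Assumption: gradient accumulation is 1 (the card does not list it).
gradient_accumulation_steps = 1
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Assumption: total optimizer steps estimated from the last logged row of the
# training results (step 1800 at epoch 0.9212); the card does not state it.
estimated_total_steps = round(1800 / 0.9212)


def linear_warmup_lr(step, total_steps, peak_lr=learning_rate, ratio=warmup_ratio):
    """Linear warmup to peak_lr, then linear decay to zero (hypothetical helper)."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
for step in (0, 100, 1000, estimated_total_steps):
    print(step, linear_warmup_lr(step, estimated_total_steps))
```

With these inputs the effective batch size comes out to 12, which agrees with the listed total_train_batch_size.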

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.177 | 0.1024 | 200 | 0.1924 | 6.8795 | -3.7641 | 0.9875 | 10.6436 | -300.5322 | -168.9753 | -3.0388 | -3.0073 |
| 0.1792 | 0.2047 | 400 | 0.1701 | 9.3881 | -2.3728 | 0.9625 | 11.7610 | -286.6194 | -143.8888 | -2.6922 | -2.6519 |
| 0.1591 | 0.3071 | 600 | 0.1697 | 10.1693 | -3.2928 | 0.9500 | 13.4621 | -295.8188 | -136.0773 | -2.6930 | -2.6297 |
| 0.1193 | 0.4094 | 800 | 0.1567 | 10.4715 | -4.6410 | 0.9625 | 15.1125 | -309.3010 | -133.0556 | -2.8476 | -2.7928 |
| 0.1532 | 0.5118 | 1000 | 0.1644 | 10.6372 | -6.4904 | 0.9625 | 17.1277 | -327.7956 | -131.3978 | -2.8507 | -2.7919 |
| 0.1475 | 0.6141 | 1200 | 0.1439 | 10.6120 | -8.0436 | 0.9750 | 18.6556 | -343.3268 | -131.6498 | -2.7844 | -2.7264 |
| 0.1095 | 0.7165 | 1400 | 0.1604 | 10.8843 | -8.2313 | 0.9625 | 19.1156 | -345.2036 | -128.9272 | -2.8188 | -2.7682 |
| 0.1391 | 0.8188 | 1600 | 0.1659 | 10.9471 | -7.9452 | 0.9625 | 18.8923 | -342.3433 | -128.2997 | -2.7646 | -2.7111 |
| 0.1322 | 0.9212 | 1800 | 0.1558 | 10.9847 | -8.6527 | 0.9750 | 19.6374 | -349.4182 | -127.9232 | -2.7690 | -2.7222 |
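
The evaluation results reported at the top of this card match the last logged row (step 1800), which is not the row with the lowest validation loss. A minimal sketch for locating the lowest-loss row, using only the step and validation-loss values from the results above (the variable names are illustrative):

```python
# (step, validation loss) pairs copied from the training results above.
validation_log = [
    (200, 0.1924),
    (400, 0.1701),
    (600, 0.1697),
    (800, 0.1567),
    (1000, 0.1644),
    (1200, 0.1439),
    (1400, 0.1604),
    (1600, 0.1659),
    (1800, 0.1558),
]

# Pick the logged step with the smallest validation loss.
best_step, best_loss = min(validation_log, key=lambda row: row[1])
final_step, final_loss = validation_log[-1]

print(f"lowest validation loss: {best_loss} at step {best_step}")
print(f"final logged loss: {final_loss} at step {final_step}")
```

Whether a checkpoint from the lowest-loss step was saved is not stated in the card, so this only identifies the row, not an available artifact.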

Framework versions

  • Transformers 4.43.3
  • Pytorch 2.2.2+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 7.24B params (Safetensors), tensor type BF16