
pointwise-reward-gemma-2b-it-ultrafeedback-binarized-unpaired-20240908-161341

This model is a fine-tuned version of google/gemma-2b-it on an unspecified dataset (the model name suggests a binarized, unpaired variant of UltraFeedback). It achieves the following results on the evaluation set:

  • Loss: 0.6146
  • Accuracy: 0.6613

Model description

More information needed

Intended uses & limitations

More information needed
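Pending an official usage guide, here is a minimal, hedged sketch of scoring a single prompt/response pair. It assumes the model was trained as a sequence classifier with one logit (`num_labels=1`, the usual setup for pointwise reward models on binarized, unpaired data) and that the repository id from the model tree section of this card loads with `AutoModelForSequenceClassification`; the `logit_to_prob` helper simply applies a sigmoid, on the further assumption that the head was trained with a binary cross-entropy objective.

```python
import math


def logit_to_prob(logit: float) -> float:
    """Sigmoid: map a raw reward logit to a [0, 1] score.

    Assumes the single-logit head was trained with a BCE objective;
    if not, treat the raw logit as an uncalibrated score instead.
    """
    return 1.0 / (1.0 + math.exp(-logit))


def score(prompt: str, response: str, model, tokenizer) -> float:
    """Return the scalar pointwise reward logit for one prompt/response pair."""
    import torch

    messages = [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits  # expected shape (1, 1) for num_labels=1
    return logits[0, 0].item()


# Usage sketch (downloads ~2.5 GB of weights; requires transformers + torch):
#   from transformers import AutoModelForSequenceClassification, AutoTokenizer
#   model_id = "sahandrez/pointwise-reward-gemma-2b-it-ultrafeedback"
#   tokenizer = AutoTokenizer.from_pretrained(model_id)
#   model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()
#   logit = score("What is the capital of France?", "Paris.", model, tokenizer)
#   print(logit, logit_to_prob(logit))
```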

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1.0
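For reference, `lr_scheduler_type: linear` decays the learning rate from its initial value to zero over the course of training. A small sketch of that schedule follows; the total step count of ~2610 is inferred from the results table (step 2600 corresponds to epoch 0.9962), and since the card reports no warmup steps, none are assumed.

```python
def linear_lr(step: int, base_lr: float = 5e-6, total_steps: int = 2610) -> float:
    """Linear decay with no warmup: lr falls from base_lr at step 0
    to zero at total_steps, then stays at zero."""
    remaining = max(0.0, (total_steps - step) / total_steps)
    return base_lr * remaining


print(linear_lr(0))     # initial rate: 5e-06
print(linear_lr(1305))  # halfway through training: 2.5e-06
print(linear_lr(2610))  # end of training: 0.0
```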

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 0.6762        | 0.0383 | 100  | 0.6744          | 0.5958   |
| 0.6868        | 0.0766 | 200  | 0.6609          | 0.5969   |
| 0.6505        | 0.1149 | 300  | 0.6463          | 0.6218   |
| 0.6422        | 0.1533 | 400  | 0.6511          | 0.6089   |
| 0.6279        | 0.1916 | 500  | 0.6358          | 0.6309   |
| 0.647         | 0.2299 | 600  | 0.6379          | 0.6261   |
| 0.6532        | 0.2682 | 700  | 0.6309          | 0.6445   |
| 0.6373        | 0.3065 | 800  | 0.6313          | 0.6477   |
| 0.6083        | 0.3448 | 900  | 0.6270          | 0.6357   |
| 0.6004        | 0.3831 | 1000 | 0.6229          | 0.6477   |
| 0.6247        | 0.4215 | 1100 | 0.6228          | 0.6477   |
| 0.6592        | 0.4598 | 1200 | 0.6228          | 0.6532   |
| 0.6656        | 0.4981 | 1300 | 0.6194          | 0.6558   |
| 0.6273        | 0.5364 | 1400 | 0.6191          | 0.6606   |
| 0.6337        | 0.5747 | 1500 | 0.6178          | 0.6543   |
| 0.6056        | 0.6130 | 1600 | 0.6185          | 0.6642   |
| 0.6418        | 0.6513 | 1700 | 0.6168          | 0.6598   |
| 0.6438        | 0.6897 | 1800 | 0.6172          | 0.6661   |
| 0.6115        | 0.7280 | 1900 | 0.6149          | 0.6525   |
| 0.6158        | 0.7663 | 2000 | 0.6150          | 0.6598   |
| 0.6192        | 0.8046 | 2100 | 0.6151          | 0.6642   |
| 0.6271        | 0.8429 | 2200 | 0.6147          | 0.6595   |
| 0.5889        | 0.8812 | 2300 | 0.6148          | 0.6573   |
| 0.6344        | 0.9195 | 2400 | 0.6147          | 0.6595   |
| 0.6123        | 0.9579 | 2500 | 0.6146          | 0.6569   |
| 0.6196        | 0.9962 | 2600 | 0.6146          | 0.6613   |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model details

  • Format: Safetensors
  • Model size: 2.51B params
  • Tensor type: BF16

Model tree for sahandrez/pointwise-reward-gemma-2b-it-ultrafeedback

  • Base model: google/gemma-2b-it (this model is a direct fine-tune)