pointwise-reward-gemma-2b-it-ultrafeedback-binarized-unpaired-20240908-161341

This model is a fine-tuned version of google/gemma-2b-it on an unknown dataset. At the last logged evaluation (step 2600, epoch 0.9962) it reaches a validation loss of 0.6146 and an accuracy of 0.6613 on the evaluation set.

Model description

More information needed

The training hyperparameters are not listed here. Training and evaluation metrics, logged every 100 steps over roughly one epoch:

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.6762	0.0383	100	0.6744	0.5958
0.6868	0.0766	200	0.6609	0.5969
0.6505	0.1149	300	0.6463	0.6218
0.6422	0.1533	400	0.6511	0.6089
0.6279	0.1916	500	0.6358	0.6309
0.6470	0.2299	600	0.6379	0.6261
0.6532	0.2682	700	0.6309	0.6445
0.6373	0.3065	800	0.6313	0.6477
0.6083	0.3448	900	0.6270	0.6357
0.6004	0.3831	1000	0.6229	0.6477
0.6247	0.4215	1100	0.6228	0.6477
0.6592	0.4598	1200	0.6228	0.6532
0.6656	0.4981	1300	0.6194	0.6558
0.6273	0.5364	1400	0.6191	0.6606
0.6337	0.5747	1500	0.6178	0.6543
0.6056	0.6130	1600	0.6185	0.6642
0.6418	0.6513	1700	0.6168	0.6598
0.6438	0.6897	1800	0.6172	0.6661
0.6115	0.7280	1900	0.6149	0.6525
0.6158	0.7663	2000	0.6150	0.6598
0.6192	0.8046	2100	0.6151	0.6642
0.6271	0.8429	2200	0.6147	0.6595
0.5889	0.8812	2300	0.6148	0.6573
0.6344	0.9195	2400	0.6147	0.6595
0.6123	0.9579	2500	0.6146	0.6569
0.6196	0.9962	2600	0.6146	0.6613
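
The table can be scanned for the best checkpoint with a few lines of Python. The sketch below is only an illustration: the `ROWS` list is copied by hand from the Step, Validation Loss, and Accuracy columns above, and the helper names `best_by_accuracy` and `best_by_loss` are hypothetical, not part of any training library. It assumes that ties are resolved in favor of the earliest step.

```python
# (step, validation_loss, accuracy) copied from the results table above.
ROWS = [
    (100, 0.6744, 0.5958), (200, 0.6609, 0.5969), (300, 0.6463, 0.6218),
    (400, 0.6511, 0.6089), (500, 0.6358, 0.6309), (600, 0.6379, 0.6261),
    (700, 0.6309, 0.6445), (800, 0.6313, 0.6477), (900, 0.6270, 0.6357),
    (1000, 0.6229, 0.6477), (1100, 0.6228, 0.6477), (1200, 0.6228, 0.6532),
    (1300, 0.6194, 0.6558), (1400, 0.6191, 0.6606), (1500, 0.6178, 0.6543),
    (1600, 0.6185, 0.6642), (1700, 0.6168, 0.6598), (1800, 0.6172, 0.6661),
    (1900, 0.6149, 0.6525), (2000, 0.6150, 0.6598), (2100, 0.6151, 0.6642),
    (2200, 0.6147, 0.6595), (2300, 0.6148, 0.6573), (2400, 0.6147, 0.6595),
    (2500, 0.6146, 0.6569), (2600, 0.6146, 0.6613),
]


def best_by_accuracy(rows):
    """Return the row with the highest accuracy (earliest step wins ties)."""
    return max(rows, key=lambda row: row[2])


def best_by_loss(rows):
    """Return the row with the lowest validation loss (earliest step wins ties)."""
    return min(rows, key=lambda row: row[1])


best_acc = best_by_accuracy(ROWS)
best_loss = best_by_loss(ROWS)

print(f"highest accuracy: {best_acc[2]:.4f} at step {best_acc[0]}")
print(f"lowest validation loss: {best_loss[1]:.4f} at step {best_loss[0]}")
```

On these numbers the highest accuracy (0.6661) occurs at step 1800, while the lowest validation loss (0.6146) is first reached at step 2500, so the two criteria select different checkpoints. The differences between the late checkpoints are small, and both metrics have largely plateaued after about step 1900.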