A wider BabyBERTa model trained with curriculum learning and layer stacking for the Strict-Small track of the BabyLM Challenge.