whisper-base-ko-1 / README.md
metadata
language:
  - ko
license: apache-2.0
base_model: openai/whisper-base
tags:
  - whisper-event
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_16_0
metrics:
  - wer
model-index:
  - name: Whisper Base Korean
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_16_0 ko
          type: mozilla-foundation/common_voice_16_0
          config: ko
          split: test
          args: ko
        metrics:
          - name: Wer
            type: wer
            value: 45.5026455026455

Whisper Base Korean

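This model is a fine-tuned version of openai/whisper-base on the ko (Korean) configuration of the mozilla-foundation/common_voice_16_0 dataset. On the evaluation set it achieves the following results, which match the step-1000 row of the training results table (WER is reported in percent):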

  • Loss: 0.6687
  • Wer: 45.5026
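For readers unfamiliar with the metric, the sketch below shows how a word error rate in percent can be computed from a reference and a hypothesis transcript. It is a minimal illustration, assuming whitespace tokenization and no text normalization; the card does not say which normalization, if any, the evaluation applied, and `word_error_rate` is a hypothetical helper name, not part of the training script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (substitutions + deletions + insertions) / N.

    N is the number of whitespace-separated words in the reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("오늘 날씨가 정말 좋다", "오늘 날씨 정말 좋다"))  # 25.0
```

Because Korean spacing conventions vary, a whitespace-based WER can penalize transcripts that differ only in spacing, which is worth keeping in mind when reading the figure above.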

Model description

More information needed

Intended uses & limitations

More information needed
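Until this section is filled in, the following is a minimal transcription sketch using the Transformers `pipeline` API. The repository id `arun100/whisper-base-ko-1` is an assumption inferred from the page header, and `transcribe` and the audio path are hypothetical names used only for illustration.

```python
# Assumption: the checkpoint is published under this Hub id (inferred, not stated in the card).
MODEL_ID = "arun100/whisper-base-ko-1"


def transcribe(audio_path: str) -> str:
    """Transcribe a Korean audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(
        audio_path,
        # Pin the decoder to Korean transcription instead of relying on language detection.
        generate_kwargs={"language": "korean", "task": "transcribe"},
    )
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path; decoding audio files requires ffmpeg.
    print(transcribe("sample.wav"))
```

With a test-set WER of about 45.5, output from this model is likely to need manual review before use.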

Training and evaluation data

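Per the model metadata, evaluation used the test split of the ko configuration of mozilla-foundation/common_voice_16_0. More information needed on the training data.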

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 10000
  • mixed_precision_training: Native AMP
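Two of the values above follow from the others, and the sketch below works them out: the total batch size is the per-device batch size times the gradient accumulation steps, and the linear scheduler ramps the learning rate up over the warmup steps and then decays it linearly to zero. This is a plain-Python illustration of the standard linear-warmup schedule, assuming a single training device; `linear_schedule_lr` is a hypothetical helper, not the Trainer's own code.

```python
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 2
NUM_DEVICES = 1  # assumption: consistent with total_train_batch_size = 64

# Number of examples contributing to each optimizer update.
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-6,
    warmup_steps: int = 50,
    total_steps: int = 10_000,
) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(total_train_batch_size)  # 64
for step in (0, 25, 50, 5_000, 10_000):
    print(step, linear_schedule_lr(step))
```

The peak rate of 1e-06 is reached at step 50, so almost the entire run is spent in the decay phase.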

Training results

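| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|---------------|--------|-------|-----------------|---------|
| 0.0149        | 133.0  | 1000  | 0.6687          | 45.5026 |
| 0.0048        | 266.0  | 2000  | 0.7148          | 47.7633 |
| 0.0024        | 399.0  | 3000  | 0.7484          | 48.4848 |
| 0.0014        | 533.0  | 4000  | 0.7774          | 49.0139 |
| 0.0009        | 666.0  | 5000  | 0.8037          | 48.8215 |
| 0.0006        | 799.0  | 6000  | 0.8269          | 49.4468 |
| 0.0004        | 933.0  | 7000  | 0.8482          | 49.3987 |
| 0.0003        | 1066.0 | 8000  | 0.8662          | 54.6417 |
| 0.0003        | 1199.0 | 9000  | 0.8800          | 49.9278 |
| 0.0003        | 1333.0 | 10000 | 0.8856          | 49.8316 |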
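The results above can be read programmatically to confirm which checkpoint the headline numbers come from. The sketch below copies the logged rows into a list and picks the step with the lowest WER; the variable names are illustrative, and the data is exactly what the table reports.

```python
# (training_loss, epoch, step, validation_loss, wer) as logged during training.
rows = [
    (0.0149, 133.0, 1000, 0.6687, 45.5026),
    (0.0048, 266.0, 2000, 0.7148, 47.7633),
    (0.0024, 399.0, 3000, 0.7484, 48.4848),
    (0.0014, 533.0, 4000, 0.7774, 49.0139),
    (0.0009, 666.0, 5000, 0.8037, 48.8215),
    (0.0006, 799.0, 6000, 0.8269, 49.4468),
    (0.0004, 933.0, 7000, 0.8482, 49.3987),
    (0.0003, 1066.0, 8000, 0.8662, 54.6417),
    (0.0003, 1199.0, 9000, 0.8800, 49.9278),
    (0.0003, 1333.0, 10000, 0.8856, 49.8316),
]

# Checkpoint with the lowest word error rate.
best = min(rows, key=lambda row: row[4])
print(f"best step: {best[2]}, val loss: {best[3]}, WER: {best[4]}")

# Validation loss rises at every evaluation while training loss keeps falling.
val_losses = [row[3] for row in rows]
val_loss_always_rises = all(a < b for a, b in zip(val_losses, val_losses[1:]))
print("validation loss rises at every evaluation:", val_loss_always_rises)
```

The best WER and validation loss both occur at the first evaluation (step 1000). After that, training loss approaches zero while validation loss climbs steadily, a pattern consistent with overfitting over the very large number of epochs.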

Framework versions

  • Transformers 4.38.0.dev0
  • Pytorch 2.1.2+cu121
  • Datasets 2.16.2.dev0
  • Tokenizers 0.15.0