---
license: apache-2.0
tags:
- automatic-speech-recognition
- whisper
- romanian
datasets:
- readerbench/echo
metrics:
- wer
model-index:
- name: whisper-ro
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Echo
      type: readerbench/echo
      config: ro
    metrics:
    - name: WER
      type: wer
      value: 0.08668345828147764
---

# whisper-ro

This model is a fine-tuned version of
[openai/whisper-small](https://huggingface.co/openai/whisper-small) on the
[Echo dataset](https://huggingface.co/datasets/readerbench/echo), a large
open-source Romanian dataset.

Word error rate (WER, %) on Romanian evaluation sets:

| Test set | Whisper small | Whisper large-v2 | Fine-tuned small <br/><small>(this model)</small> |
|:--------:|:-------------:|:----------------:|:-------------------------------------------------:|
| Common Voice | 33.2 | 15.8 | 12.2 |
| FLEURS | 29.8 | 14.4 | 10.9 |
| VoxPopuli | 28.6 | 14.4 | 9.4 |
| Echo | >100 | >100 | 8.6 |
| RSC | 38.6 | 28.5 | 5.4 |

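WER is the word-level Levenshtein (edit) distance between the reference transcript and the model's hypothesis, divided by the number of reference words. A minimal self-contained sketch of the metric (the example sentences are illustrative, not from the dataset):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by
    the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words, one row at a time.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev = d[0]
        d[0] = i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            sub = prev + (ref[i - 1] != hyp[j - 1])  # substitution (or match)
            d[j] = min(sub, d[j] + 1, d[j - 1] + 1)  # vs. deletion / insertion
            prev = cur
    return d[-1] / len(ref)

print(wer("ana are mere", "ana are pere"))  # 1 substitution / 3 words = 0.333...
```

In practice the numbers above would come from a library implementation (e.g. the Hugging Face `evaluate` package's `wer` metric) applied after text normalization.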
### Training hyperparameters

The following hyperparameters were used during training:

- `learning_rate`: 1e-05
- `train_batch_size`: 128
- `eval_batch_size`: 128
- `seed`: 42
- `distributed_type`: multi-GPU
- `num_devices`: 2
- `total_train_batch_size`: 256
- `total_eval_batch_size`: 256
- `optimizer`: Adam with betas=(0.9,0.999) and epsilon=1e-08
- `lr_scheduler_type`: linear
- `lr_scheduler_warmup_steps`: 500
- `num_epochs`: 20.0
- `mixed_precision_training`: Native AMP
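As a rough guide, the settings above could be expressed with `transformers`' `Seq2SeqTrainingArguments`; this is a sketch, not the exact training script — `output_dir` is illustrative, and `fp16=True` assumes the "Native AMP" run used fp16 rather than bf16:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical mapping of the hyperparameters above onto
# Seq2SeqTrainingArguments; output_dir is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-ro",
    learning_rate=1e-5,
    per_device_train_batch_size=128,  # 2 GPUs -> total train batch size 256
    per_device_eval_batch_size=128,   # 2 GPUs -> total eval batch size 256
    seed=42,
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=20.0,
    fp16=True,                        # "Native AMP" (assuming fp16, not bf16)
)
```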