---
license: apache-2.0
tags:
- automatic-speech-recognition
- whisper
- romanian
datasets:
- readerbench/echo
metrics:
- wer
model-index:
- name: whisper-ro
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Echo
      type: readerbench/echo
      config: ro
    metrics:
    - name: WER
      type: wer
      value: 0.08668345828147764
---

# whisper-ro

This model is a fine-tuned version of
[openai/whisper-small](https://huggingface.co/openai/whisper-small) on the
[Echo dataset](https://huggingface.co/datasets/readerbench/echo), a large
open-source Romanian dataset.

Word error rate (WER, %) on Romanian evaluation sets:

| Test set | Whisper small | Whisper large-v2 | Fine-tuned small <br/><small>(this model)</small> |
|:--------:|:-------------:|:----------------:|:-------------------------------------------------:|
| Common Voice | 33.2 | 15.8 | 12.2 |
| FLEURS | 29.8 | 14.4 | 10.9 |
| VoxPopuli | 28.6 | 14.4 | 9.4 |
| Echo | >100 | >100 | 8.6 |
| RSC | 38.6 | 28.5 | 5.4 |

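WER is the word-level Levenshtein (edit) distance between the reference transcript and the model's hypothesis, divided by the number of reference words. A minimal self-contained sketch of the metric (the example sentences are illustrative, not from the dataset):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by
    the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words, one row at a time.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev = d[0]
        d[0] = i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            sub = prev + (ref[i - 1] != hyp[j - 1])  # substitution (or match)
            d[j] = min(sub, d[j] + 1, d[j - 1] + 1)  # vs. deletion / insertion
            prev = cur
    return d[-1] / len(ref)

print(wer("ana are mere", "ana are pere"))  # 1 substitution / 3 words = 0.333...
```

In practice the numbers above would come from a library implementation (e.g. the Hugging Face `evaluate` package's `wer` metric) applied after text normalization.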
### Training hyperparameters

The following hyperparameters were used during training:

- `learning_rate`: 1e-05
- `train_batch_size`: 128
- `eval_batch_size`: 128
- `seed`: 42
- `distributed_type`: multi-GPU
- `num_devices`: 2
- `total_train_batch_size`: 256
- `total_eval_batch_size`: 256
- `optimizer`: Adam with betas=(0.9,0.999) and epsilon=1e-08
- `lr_scheduler_type`: linear
- `lr_scheduler_warmup_steps`: 500
- `num_epochs`: 20.0
- `mixed_precision_training`: Native AMP
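As a rough guide, the settings above could be expressed with `transformers`' `Seq2SeqTrainingArguments`; this is a sketch, not the exact training script — `output_dir` is illustrative, and `fp16=True` assumes the "Native AMP" run used fp16 rather than bf16:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical mapping of the hyperparameters above onto
# Seq2SeqTrainingArguments; output_dir is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-ro",
    learning_rate=1e-5,
    per_device_train_batch_size=128,  # 2 GPUs -> total train batch size 256
    per_device_eval_batch_size=128,   # 2 GPUs -> total eval batch size 256
    seed=42,
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=20.0,
    fp16=True,                        # "Native AMP" (assuming fp16, not bf16)
)
```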