Conformer-CTC Belarusian (MLX)

NVIDIA's Conformer-CTC Large model for Belarusian speech recognition, packaged for MLX inference on Apple Silicon.

Original model: nvidia/stt_be_conformer_ctc_large

Results

Dataset	WER	Speed
CommonVoice 24.0 test (500 samples)	7.58%	8.2 samples/s
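
For readers who want to reproduce a figure like the WER above, here is a minimal sketch of corpus-level word error rate in plain Python. The function names are hypothetical and are not part of mlx-conformer. The card does not say how the 7.58% was computed, so treat both the corpus-level aggregation (total word errors divided by total reference words) and the absence of any text normalization as assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words (assumed aggregation)."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_errors += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    return total_errors / total_words
```

Lowercasing and punctuation stripping, if wanted, would be applied to both strings before calling `corpus_wer`.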

Usage

pip install mlx numpy pyyaml torch

git clone https://github.com/molind/mlx-conformer
cd mlx-conformer

python mlx_conformer.py \
    --download nvidia/stt_be_conformer_ctc_large \
    --output models

python mlx_conformer.py --model models/stt_be_conformer_ctc_large --audio test.mp3
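
To transcribe a folder of files, one option is to wrap the command above in a small Python loop. This is a sketch only: `build_command` and `transcribe_all` are hypothetical helpers, only the `--model` and `--audio` flags shown above are used, and it assumes the script writes the transcript to standard output, which the card does not confirm.

```python
import subprocess
from pathlib import Path


def build_command(model_dir, audio_path, script="mlx_conformer.py"):
    """Build the same CLI invocation shown above for a single audio file."""
    return ["python", script, "--model", str(model_dir), "--audio", str(audio_path)]


def transcribe_all(model_dir, audio_dir, pattern="*.mp3"):
    """Run the CLI once per matching file and collect whatever it prints.

    Assumption: the transcript is written to stdout.
    """
    results = {}
    for audio_path in sorted(Path(audio_dir).glob(pattern)):
        proc = subprocess.run(
            build_command(model_dir, audio_path),
            capture_output=True,
            text=True,
            check=True,  # raise if the script exits with an error
        )
        results[audio_path.name] = proc.stdout.strip()
    return results
```

Each call starts a fresh process and reloads the model, so this is convenient rather than fast.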

Architecture

18 Conformer layers, d_model=512, 8 heads
Conv kernel size 31, 4x subsampling
128-token BPE vocabulary plus a CTC blank token
~120M parameters
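
The figures above imply a 129-way output per encoder frame (128 BPE tokens plus blank) and roughly one encoder frame per four feature frames. The sketch below illustrates greedy CTC decoding and the frame-count arithmetic under stated assumptions: the blank is taken to be the last index (128), and the feature hop is taken to be 10 ms. Neither is stated in this card, and the function names are hypothetical rather than taken from mlx-conformer.

```python
BLANK_ID = 128  # assumption: blank is the last index, after the 128 BPE tokens


def ctc_greedy_collapse(frame_ids, blank_id=BLANK_ID):
    """Collapse per-frame argmax ids: merge repeats, then drop blanks."""
    tokens = []
    prev = None
    for token_id in frame_ids:
        if token_id != prev and token_id != blank_id:
            tokens.append(token_id)
        prev = token_id
    return tokens


def encoder_frames(duration_ms, hop_ms=10, subsampling=4):
    """Approximate encoder output length, ignoring padding effects.

    Assumption: features are extracted with a 10 ms hop.
    """
    feature_frames = duration_ms // hop_ms
    return feature_frames // subsampling
```

Under these assumptions, one second of audio yields about 25 encoder frames, and a blank between two identical ids is what lets the decoder emit the same token twice in a row.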

License

Original model by NVIDIA, licensed under CC-BY-4.0.
