
Uzbek Matcha-TTS Model Card

Model Overview

This is a fine-tuned Matcha-TTS model (a fast, probabilistic TTS architecture based on conditional flow matching) optimized for Uzbek speech synthesis. The model was trained on male voice data and exported to ONNX format for broad compatibility across devices and platforms.

Model Details

  • Model Type: Text-to-Speech (TTS)
  • Base Architecture: Matcha-TTS
  • Language: Uzbek (uz)
  • Voice Gender: Male
  • Format: ONNX (Open Neural Network Exchange)
  • License: [Specify your license]

Key Features

Cross-Platform Compatibility

  • Desktop: Windows, macOS, Linux
  • Mobile: iOS, Android
  • Hardware: CPU and GPU support
  • Edge Deployment: Optimized for on-device inference

Technical Advantages

  • ONNX Format: Enables deployment on any device with ONNX Runtime (see the session-setup sketch after this list)
  • Efficient Inference: Optimized for real-time speech synthesis
  • No Internet Required: Fully offline capable
  • Lightweight: Suitable for resource-constrained devices
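As a minimal sketch of the CPU/GPU support mentioned above, the snippet below creates an ONNX Runtime session that uses CUDA when onnxruntime-gpu is installed and falls back to the CPU otherwise. The model filename matches the example in the Usage section and is an assumption if your local copy is named differently.

import onnxruntime as ort

# Request CUDA first, but only keep providers that this onnxruntime build supports.
available = ort.get_available_providers()
providers = [p for p in ("CUDAExecutionProvider", "CPUExecutionProvider") if p in available]

session = ort.InferenceSession(
    "matcha_tts_uzbek_male.onnx",  # assumed filename, matching the Usage example below
    providers=providers,
)

print("Active execution providers:", session.get_providers())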

Intended Use

Primary Use Cases

  • Uzbek language text-to-speech applications
  • Accessibility tools for Uzbek speakers
  • Voice assistants and chatbots
  • Educational content creation
  • Audiobook narration in Uzbek

Out-of-Scope Uses

  • Voice cloning or impersonation
  • Generation of misleading audio content
  • Any use that violates privacy or consent

Training Data

  • Voice Gender: Male speaker(s)
  • Language: Uzbek

Performance

The model generates natural-sounding Uzbek speech with male voice characteristics. Key characteristics:

  • Sample Rate: 22.05 kHz
  • Latency: optimized for low-latency, real-time inference (a real-time-factor sketch follows this list)
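Real-time factor (RTF), the wall-clock synthesis time divided by the duration of the generated audio, is the usual way to quantify the latency claim above. A minimal sketch, assuming a mono float waveform at the 22.05 kHz sample rate listed here; the synth() call and the example text in the commented usage are hypothetical placeholders for the actual inference code shown later.

import time
import numpy as np

SAMPLE_RATE = 22050  # matches the 22.05 kHz output listed above

def real_time_factor(audio: np.ndarray, synthesis_seconds: float) -> float:
    """RTF < 1.0 means the model synthesizes audio faster than real time."""
    audio_seconds = audio.shape[-1] / SAMPLE_RATE
    return synthesis_seconds / audio_seconds

# Hypothetical usage (synth is a placeholder for your inference function):
# start = time.perf_counter()
# audio = synth("Salom, dunyo!")
# print(f"RTF: {real_time_factor(audio, time.perf_counter() - start):.3f}")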

Usage

Installation

pip install onnxruntime
# For GPU support
pip install onnxruntime-gpu

Basic Example

import onnxruntime as ort
import numpy as np

# Load the model
session = ort.InferenceSession("matcha_tts_uzbek_male.onnx")

# Prepare input text
text = "Salom, dunyo!"  # "Hello, world!" in Uzbek

# The ONNX graph does not accept raw strings: the text must first be converted
# to the integer token/phoneme IDs used during training, and the exact input
# names depend on how the model was exported. Inspect them with:
#   print([inp.name for inp in session.get_inputs()])
# [Add specific preprocessing and inference code based on your implementation]
token_ids = ...  # placeholder: result of your text-to-token preprocessing
outputs = session.run(None, {session.get_inputs()[0].name: token_ids})

# Get audio output
audio = outputs[0]
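If the model was exported end-to-end so that outputs[0] is already a waveform (an assumption; some Matcha-TTS exports return a mel spectrogram that still needs a separate vocoder), the audio can be written to disk at the 22.05 kHz sample rate listed above, for example with the soundfile package:

import numpy as np
import soundfile as sf  # pip install soundfile

# Assumes `audio` is a float32 waveform shaped (1, num_samples) or (num_samples,).
waveform = np.asarray(audio, dtype=np.float32).squeeze()
sf.write("salom_dunyo.wav", waveform, 22050)  # output filename is arbitrary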

Limitations

  • The model is trained specifically for male voice synthesis
  • Performance may vary with complex or uncommon Uzbek words
  • Pronunciation of foreign words may not be accurate
  • The model's output quality depends on input text quality and formatting

Ethical Considerations

  • This model should not be used to generate deceptive or misleading audio content
  • Users should respect privacy and obtain consent when using synthetic voices in public contexts
  • Consider disclosing when audio is AI-generated in appropriate contexts

Changelog

Version 1.0.0

  • Initial release
  • Fine-tuned for Uzbek male voice
  • ONNX export for cross-platform compatibility

Acknowledgments

  • Based on the Matcha-TTS architecture
  • Thanks to the Uzbek language community for support and feedback
