---
license: mit
library_name: mlx
tags:
- mlx
- audio
- speech-enhancement
- noise-suppression
- deepfilternet
- apple-silicon
base_model: DeepFilterNet/DeepFilterNet2
pipeline_tag: audio-to-audio
---

# DeepFilterNet2 - MLX

MLX-compatible weights for [DeepFilterNet2](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement model that suppresses background noise in audio.

This is a direct conversion of the original PyTorch weights to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon.

## Origin

- **Original model:** [DeepFilterNet2](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schroeter
- **Paper:** [DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio](https://arxiv.org/abs/2205.05474)
- **License:** MIT (same as the original)
- **Conversion:** PyTorch -> `safetensors` via the included `convert_deepfilternet.py` script

No fine-tuning or quantization was applied. The weights are converted directly from the original checkpoint.

## Files

| File | Description |
|---|---|
| `config.json` | Model architecture configuration |
| `model.safetensors` | Pre-converted weights (~8.9 MB, float32) |
| `convert_deepfilternet.py` | Conversion script (PyTorch -> MLX safetensors) |
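
The size and dtype listed for `model.safetensors` can be checked without loading any tensors, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing each tensor. The following is a minimal sketch using only the standard library; the helper names `read_safetensors_header` and `summarize` are hypothetical and not part of this repository.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length (unsigned 64-bit, little-endian).
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Return (total parameter count, set of dtype strings) for all tensors."""
    total = 0
    dtypes = set()
    for info in header.values():
        count = 1
        for dim in info["shape"]:
            count *= dim
        total += count
        dtypes.add(info["dtype"])
    return total, dtypes
```

Calling `summarize(read_safetensors_header("DeepFilterNet2-MLX/model.safetensors"))` on a downloaded copy (the local path is an assumption) should report a single dtype, `F32`, if the weights are float32 throughout as the table states.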

## Model Details

| Parameter | Value |
|---|---|
| Sample rate | 48 kHz |
| FFT size | 960 samples |
| Hop size | 480 samples |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |
| Embedding hidden dim | 256 |
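
The timing and frequency resolution implied by these parameters follow from simple arithmetic. The sketch below works them out; the variable names are illustrative, and the last line assumes the DF bins are the lowest 96 bins of a one-sided spectrum, which this card does not state explicitly.

```python
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per analysis frame
HOP_SIZE = 480        # samples between consecutive frames
DF_BINS = 96

frame_ms = 1000 * FFT_SIZE / SAMPLE_RATE    # 20.0 ms per frame
hop_ms = 1000 * HOP_SIZE / SAMPLE_RATE      # 10.0 ms between frames
frames_per_second = SAMPLE_RATE / HOP_SIZE  # 100.0 frames per second
overlap = 1 - HOP_SIZE / FFT_SIZE           # 0.5, i.e. 50 % frame overlap
freq_bins = FFT_SIZE // 2 + 1               # 481 bins in a one-sided spectrum
bin_width_hz = SAMPLE_RATE / FFT_SIZE       # 50.0 Hz per bin

# Assumption: deep filtering covers the lowest DF_BINS bins.
df_upper_hz = DF_BINS * bin_width_hz        # 4800.0 Hz
```

Under that assumption, deep filtering would cover roughly the lowest 4.8 kHz of the spectrum, with the remaining band handled through the 32 ERB-band gains.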

## Usage

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

// Load the converted weights by Hub repository ID.
let model = try await DeepFilterNetModel.fromPretrained("iky1e/DeepFilterNet2-MLX")

// `audioArray` is the noisy input signal, supplied by the caller.
let enhanced = try model.enhance(audioArray)
```

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load the converted weights by Hub repository ID.
model = DeepFilterNetModel.from_pretrained("iky1e/DeepFilterNet2-MLX")

# Enhance a noisy recording given as a file path.
enhanced = model.enhance("noisy.wav")
```
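
Because the model operates at 48 kHz, it can be useful to check an input file's sample rate before enhancing it. This card does not say whether `enhance` resamples internally, so the following is only a precautionary sketch using Python's standard `wave` module (PCM WAV files only); `wav_info` and `needs_resampling` are hypothetical helpers, not part of mlx-audio.

```python
import wave

EXPECTED_SAMPLE_RATE = 48_000  # Hz, the model's sample rate


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def needs_resampling(path, expected_rate=EXPECTED_SAMPLE_RATE):
    """Return True when the file's sample rate differs from the expected rate."""
    rate, _, _ = wav_info(path)
    return rate != expected_rate
```

If `needs_resampling("noisy.wav")` returns `True`, resampling the file to 48 kHz with an audio tool of your choice before calling `enhance` avoids relying on unspecified behavior.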

## Converting from PyTorch

To regenerate the weights from the original PyTorch checkpoint, run the included conversion script:

```bash
python convert_deepfilternet.py \
  --input /path/to/DeepFilterNet2 \
  --output ./DeepFilterNet2-MLX \
  --name DeepFilterNet2
```

## Citation

```bibtex
@inproceedings{schroeter2022deepfilternet2,
  title = {{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author = {Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year = {2022},
}
```