FP8-Block Quantized Models - an nm-testing Collection

Updated Nov 17, 2025

A collection of state-of-the-art FP8 block-quantized models.

RedHatAI/Qwen3-8B-FP8-block
Text Generation • 8B • Updated Dec 31, 2025 • 48

RedHatAI/Qwen3-32B-FP8-block
Text Generation • 33B • Updated Oct 24, 2025 • 11

RedHatAI/Qwen3-14B-FP8-block
Text Generation • 15B • Updated Oct 24, 2025 • 12

RedHatAI/Llama-3.1-8B-Instruct-FP8-block
Text Generation • 8B • Updated Oct 29, 2025 • 7

nm-testing/Qwen3-VL-235B-A22B-Instruct-FP8-BLOCK
Text Generation • Updated Oct 27, 2025

nm-testing/Llama-4-Scout-17B-16E-Instruct-BLOCK-FP8
Text Generation • 109B • Updated Oct 27, 2025 • 6 • 1

RedHatAI/Llama-3.3-70B-Instruct-FP8-block
Text Generation • 71B • Updated Oct 24, 2025 • 2.04k

nm-testing/Llama-4-Maverick-17B-128E-Instruct-block-FP8
Text Generation • Updated Oct 27, 2025 • 8

nm-testing/Qwen3-30B-A3B-FP8-block
Text Generation • 3B • Updated Oct 27, 2025 • 8

nm-testing/granite-4.0-h-small-FP8-block
Text Generation • 32B • Updated Nov 17, 2025 • 7