Block Diffusion for Flash Speculative Decoding
AI & ML interests: Efficient AI
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Models (32)
z-lab/Kimi-K2.5-DFlash
Text Generation • Updated
z-lab/paroquant-checkpoints
Updated
z-lab/gemma-4-31B-it-PARO
6B • Updated
z-lab/Qwen3.5-27B-DFlash
Text Generation • 2B • Updated • 687 • 10
z-lab/Qwen3.5-0.8B-PARO
Image-Text-to-Text • 0.4B • Updated • 701 • 1
z-lab/Llama-2-7b-hf-PARO
Text Generation • 1B • Updated • 318 • 1
z-lab/DeepSeek-R1-Distill-Llama-8B-PARO
Text Generation • 1B • Updated • 663 • 1
z-lab/Qwen3.5-35B-A3B-PARO
Image-Text-to-Text • 6B • Updated • 191 • 4
z-lab/Qwen3.5-27B-PARO
Image-Text-to-Text • 6B • Updated • 2.42k • 15
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 54.5k • 41