Article Introducing North Mini Code: Cohere’s First Model For Developers CohereLabs • 12 days ago • 71
Article Fine-tune FLUX.2 [klein] with a LoRA under 60 minutes black-forest-labs • 17 days ago • 24
Qwen3.5 Collection Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 6 days ago • 161
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI ggerganov, ngxson, allozaur, lysandre, victor, julien-c • Feb 20 • 507
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 59
Hibiki-Zero Collection Streaming speech translation without the need for word-level alignments • 4 items • Updated May 9 • 4
CASA Collection CASA: Cross-Attention over Self-Attention for Efficient Vision-Language Fusion on long-context streaming inputs • 6 items • Updated Mar 9 • 8
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 100
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 119
Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24, 2025 • 43