NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 51
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published Dec 23, 2025 • 43
TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation Paper • 2604.19473 • Published Apr 21
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 11 days ago • 168
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8 Text Generation • 50B • Updated Oct 15, 2025 • 243k • 28
Llama Nemotron Collection Open, Production-ready Enterprise Models • 12 items • Updated 11 days ago • 78
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models Paper • 2505.24133 • Published May 30, 2025 • 2