ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 4 days ago • 210
Watch Before You Answer: Learning from Visually Grounded Post-Training Paper • 2604.05117 • Published 7 days ago • 32
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published 14 days ago • 30
EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning Paper • 2603.12698 • Published about 1 month ago • 1
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 24 days ago • 66
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 26 days ago • 94
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 26 days ago • 94
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 24 days ago • 66
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated 1 day ago • 463k • 326
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction Paper • 2602.13294 • Published Feb 9 • 13