reasoning-gym Collection Datasets generated using https://github.com/open-thought/reasoning-gym (with Qwen3-instruct templates) • 15 items • Updated 17 days ago
The Smol Training Playbook 📚: The secrets to building world-class LLMs • 3.17k
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 449
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 94
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 267
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129