Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 189
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models Paper • 2503.04813 • Published Mar 4, 2025 • 2
Dr. Zero: Self-Evolving Search Agents without Training Data Paper • 2601.07055 • Published 11 days ago • 19
Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners Paper • 2601.02996 • Published 16 days ago • 5
GARDO: Reinforcing Diffusion Models without Reward Hacking Paper • 2512.24138 • Published 23 days ago • 29
DiRL: An Efficient Post-Training Framework for Diffusion Language Models Paper • 2512.22234 • Published about 1 month ago • 20
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 23 days ago • 59
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published 23 days ago • 50
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published about 1 month ago • 61
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models Paper • 2512.19995 • Published about 1 month ago • 16
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 80
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 89
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 211
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 77
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253