LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 1 day ago • 101
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 8 days ago • 73
DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch Paper • 2606.10728 • Published 9 days ago • 33
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published Apr 29 • 52
Toward Autonomous Long-Horizon Engineering for ML Research Paper • 2604.13018 • Published Apr 14 • 34
SWE Agent Series Collection Models trained by SWE-Master and SWE-World, including both policy models and verifiers. • 13 items • Updated Mar 23 • 4
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? Paper • 2603.03194 • Published Mar 3 • 57