AutoTrainess: Teaching Language Models to Improve Language Models Autonomously Paper • 2606.31551 • Published 5 days ago • 14
Dockerless: Environment-Free Program Verifier for Coding Agents Paper • 2606.28436 • Published 9 days ago • 103
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Paper • 2412.21199 • Published Dec 30, 2024 • 13
AlphaResearch: Accelerating New Algorithm Discovery with Language Models Paper • 2511.08522 • Published Nov 11, 2025 • 19
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks Paper • 2507.01001 • Published Jul 1, 2025 • 47