EarthSE: A Benchmark for Evaluating Earth Scientific Exploration Capability of LLMs Paper • 2505.17139 • Published May 22 • 2
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines Paper • 2509.21320 • Published Sep 25 • 101
ResearchGPT: Benchmarking and Training LLMs for End-to-End Computer Science Research Workflows Paper • 2510.20279 • Published Oct 23
FlowSearch: Advancing deep research with dynamic structured knowledge flow Paper • 2510.08521 • Published Oct 9
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 7 days ago • 103