Article TRL v1.0: Post-Training Library Built to Move with the Field qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 53
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes Paper • 2510.16380 • Published Oct 18, 2025 • 2
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation Paper • 2511.15958 • Published Nov 20, 2025 • 1
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6, 2025 • 36
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv spisakjo, darktex, zkwentz, mortimerp9, Sanyam, Hamid-Nazeri, Pankit01, emre0, lewtun, reach-vb • Oct 23, 2025 • 162
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness Paper • 2308.08708 • Published Aug 17, 2023 • 6
Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks Paper • 2505.12845 • Published May 19, 2025 • 1
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus Paper • 2411.12498 • Published Nov 19, 2024 • 2
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 54
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2, 2024 • 69
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model Paper • 2312.11370 • Published Dec 18, 2023 • 20
Prompting Is Programming: A Query Language for Large Language Models Paper • 2212.06094 • Published Dec 12, 2022 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023 • 2