-
Self-Taught Self-Correction for Small Language Models
Paper • 2503.08681 • Published • 15 -
Self-Improving Robust Preference Optimization
Paper • 2406.01660 • Published • 20 -
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
Paper • 2503.00735 • Published • 23 -
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Paper • 2407.19594 • Published • 21
Collections
Collections including paper arxiv:2501.05727
-
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
Paper • 2501.03916 • Published • 16 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper • 2501.04682 • Published • 99 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 95 -
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper • 2501.05366 • Published • 102
-
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
Paper • 2501.02955 • Published • 44 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 109 -
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Paper • 2501.12380 • Published • 84 -
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
Paper • 2501.09781 • Published • 27
-
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 25 -
Evaluating Language Models as Synthetic Data Generators
Paper • 2412.03679 • Published • 47 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 117
-
Agents for self-driving laboratories applied to quantum computing
Paper • 2412.07978 • Published • 1 -
Towards Scientific Discovery with Generative AI: Progress, Opportunities, and Challenges
Paper • 2412.11427 • Published • 3 -
AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions
Paper • 2411.18015 • Published • 1 -
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper • 2501.04306 • Published • 35
-
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
Paper • 2410.22304 • Published • 18 -
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
Paper • 2410.19609 • Published • 18 -
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Paper • 2411.00412 • Published • 10 -
Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning
Paper • 2410.02052 • Published • 9
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 60 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 48