Collections

Discover the best community collections!

Collections including paper arxiv:2510.03632

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Paper • 2510.03632 • Published Oct 4, 2025 • 42
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

Paper • 2309.17179 • Published Sep 29, 2023 • 2
First Finish Search: Efficient Test-Time Scaling in Large Language Models

Paper • 2505.18149 • Published May 23, 2025 • 1

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 71
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3, 2025 • 24
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Paper • 2509.00930 • Published Aug 31, 2025 • 5

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

Paper • 2310.08582 • Published Oct 12, 2023 • 3
Autonomous Tree-search Ability of Large Language Models

Paper • 2310.10686 • Published Oct 14, 2023 • 2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 10
PathFinder: Guided Search over Multi-Step Reasoning Paths

Paper • 2312.05180 • Published Dec 8, 2023 • 10

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Paper • 2309.10150 • Published Sep 18, 2023 • 25
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15, 2024 • 38
MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Paper • 2510.03632 • Published Oct 4, 2025 • 42

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 19
MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Paper • 2510.03632 • Published Oct 4, 2025 • 42
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Paper • 2509.25849 • Published Sep 30, 2025 • 48
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Paper • 2509.23808 • Published Sep 28, 2025 • 47

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 123 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 38
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Ada-Instruct: Adapting Instruction Generators for Complex Reasoning

Paper • 2310.04484 • Published Oct 6, 2023 • 5
Diversity of Thought Improves Reasoning Abilities of Large Language Models

Paper • 2310.07088 • Published Oct 11, 2023 • 5
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
Democratizing Reasoning Ability: Tailored Learning from Large Language Model

Paper • 2310.13332 • Published Oct 20, 2023 • 16

about 16 hours ago

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 39
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 56
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 25
