-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 106 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 78 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 44 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 45
Collections
Collections including paper arxiv:2602.10560
-
Agentic Uncertainty Quantification
Paper • 2601.15703 • Published • 9 -
From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models
Paper • 2601.15690 • Published • 4 -
Agentic Confidence Calibration
Paper • 2601.15778 • Published • 6 -
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning
Paper • 2602.10560 • Published • 29
-
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 160 -
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
Paper • 2601.07832 • Published • 52 -
Motion Attribution for Video Generation
Paper • 2601.08828 • Published • 71 -
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
Paper • 2601.19895 • Published • 25
-
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
Paper • 2602.08234 • Published • 69 -
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning
Paper • 2602.10560 • Published • 29 -
SimpleMem: Efficient Lifelong Memory for LLM Agents
Paper • 2601.02553 • Published • 37 -
Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation
Paper • 2602.02007 • Published • 16