Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11, 2025 • 47
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks Paper • 2503.09572 • Published Mar 12, 2025 • 2
Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example Paper • 2408.06318 • Published Aug 12, 2024 • 1
OdysseyBench: Evaluating LLM Agents on Long-Horizon Complex Office Application Workflows Paper • 2508.09124 • Published Aug 12, 2025 • 3
Speech Evals Collection Synthesized speech evals generated by MistralAI from popular text evaluation datasets to evaluate spoken-language reasoning capabilities of Audio LLMs • 3 items • Updated Nov 28, 2025 • 12
Article 5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub Jul 15, 2025 • 24
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 766
Article Qwen2-VL-OCR-2B-Instruct and VisionOCR-3B-061125 for precise recognition of [messy] handwriting. Jun 17, 2025 • 11
Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings Jun 2, 2025 • 27