HRM-Text: Efficient Pretraining Beyond Scaling • Paper • 2605.20613 • Published 26 days ago • 315
VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation • Paper • 2605.16079 • Published about 1 month ago • 28
MMSkills: Towards Multimodal Skills for General Visual Agents • Paper • 2605.13527 • Published May 14 • 118
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization • Paper • 2605.15824 • Published about 1 month ago • 65
GUI-G^2: Gaussian Reward Modeling for GUI Grounding • Paper • 2507.15846 • Published Jul 21, 2025 • 135