Agent Skills Should Go Beyond Text: The Case for Visual Skills Paper • 2606.01414 • Published 22 days ago • 10
MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents Paper • 2605.18652 • Published May 18 • 8
Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion Paper • 2603.15614 • Published Mar 16 • 6
MIRA: Multimodal Iterative Reasoning Agent for Image Editing Paper • 2511.21087 • Published Nov 26, 2025 • 10