LSVOS 2025 Challenge Report: Recent Advances in Complex Video Object Segmentation Paper • 2510.11063 • Published Oct 13, 2025 • 1
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything Paper • 2401.10228 • Published Jan 18, 2024
RecTok: Reconstruction Distillation along Rectified Flow Paper • 2512.13421 • Published Dec 15, 2025 • 5
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing Paper • 2512.11715 • Published Dec 12, 2025
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World Paper • 2512.10958 • Published Dec 11, 2025 • 1
Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future Paper • 2512.16760 • Published Dec 18, 2025 • 15
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation Paper • 2412.03255 • Published Dec 4, 2024
Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models Paper • 2602.01842 • Published Feb 2, 2026 • 3
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation Paper • 2312.07526 • Published Apr 8, 2024
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World Paper • 2506.24102 • Published Jun 30, 2025 • 1
One Flight Over the Gap: A Survey from Perspective to Panoramic Vision Paper • 2509.04444 • Published Sep 4, 2025
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models Paper • 2508.12081 • Published Aug 16, 2025
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training Paper • 2510.11712 • Published Oct 13, 2025 • 31
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs Paper • 2510.18876 • Published Oct 21, 2025 • 37
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published Oct 23, 2025 • 56
From Masks to Worlds: A Hitchhiker's Guide to World Models Paper • 2510.20668 • Published Oct 23, 2025 • 8
PairUni: Pairwise Training for Unified Multimodal Language Models Paper • 2510.25682 • Published Oct 29, 2025 • 15