QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published 4 days ago • 95
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning Paper • 2512.02835 • Published 17 days ago • 9
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout Paper • 2511.20649 • Published 23 days ago • 45
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published 23 days ago • 152
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models Paper • 2511.11007 • Published Nov 14 • 15
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding Paper • 2511.13026 • Published Nov 17 • 25
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14 • 112
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 118
Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation Paper • 2510.19592 • Published Oct 22 • 12
Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation Paper • 2510.19592 • Published Oct 22 • 12
Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation Paper • 2510.19592 • Published Oct 22 • 12 • 2
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22 • 39