The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Paper: arXiv 2501.08549
Sitong Gong 1, Yunzhi Zhuge 1, Lu Zhang 1, Zongxin Yang 2, Pingping Zhang 1, Huchuan Lu 1
CVPR 2025
1 Dalian University of Technology 2 Harvard University
You can find the code at: https://github.com/SitongGong/VRS-HQ
Base model: Chat-UniVi/Chat-UniVi