ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association
Paper: arXiv:2509.01584
ViSTA-SLAM is a real-time monocular visual SLAM system that operates without requiring camera intrinsics. At its core, the system employs a lightweight symmetric two-view association (STA) model as the frontend, which simultaneously estimates relative camera poses and regresses local pointmaps from only two RGB images.
In the backend, ViSTA-SLAM constructs a specially designed Sim(3) pose graph that incorporates loop closures to correct accumulated drift. In the authors' experiments, the approach outperforms current methods in both camera tracking and dense 3D reconstruction quality.
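To illustrate why a Sim(3) graph with loop closures helps against drift, here is a minimal sketch. It is a generic illustration and not the paper's pose graph design. A Sim(3) element (s, R, t) maps a point x to s·R·x + t, so, unlike a rigid SE(3) transform, it can also carry the scale error that a monocular system accumulates. The class `Sim3`, the helper `loop_residual`, and the 2% per-edge scale error are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


@dataclass
class Sim3:
    """Similarity transform x -> s * R @ x + t (hypothetical helper class)."""
    s: float
    R: list
    t: list

    def __matmul__(self, other):
        # Composition: apply `other` first, then `self`.
        Rt = matvec(self.R, other.t)
        return Sim3(self.s * other.s,
                    matmul(self.R, other.R),
                    [self.s * Rt[i] + self.t[i] for i in range(3)])

    def inverse(self):
        Rt = transpose(self.R)
        v = matvec(Rt, self.t)
        return Sim3(1.0 / self.s, Rt, [-x / self.s for x in v])


def loop_residual(edges):
    """Compose relative edges around a closed loop.

    For a drift-free loop the result is the identity; any deviation is the
    error that a loop-closure constraint would ask the optimizer to remove.
    """
    acc = Sim3(1.0, rot_z(0.0), [0.0, 0.0, 0.0])
    for edge in edges:
        acc = acc @ edge
    return acc


# Square trajectory: move one unit forward, then turn 90 degrees, four times.
exact = Sim3(1.0, rot_z(math.pi / 2), [1.0, 0.0, 0.0])
# Same motion, but each edge overestimates scale by 2% (assumed value).
drifting = Sim3(1.02, rot_z(math.pi / 2), [1.0, 0.0, 0.0])

closed = loop_residual([exact] * 4)
drifted = loop_residual([drifting] * 4)
print("scale after exact loop:   ", closed.s)
print("scale after drifting loop:", drifted.s)
```

In the drifting loop, the per-edge scale error compounds to 1.02^4 (about 1.082), and the composed translation no longer returns to the origin. A loop-closure edge in a Sim(3) graph exposes this residual, including its scale component, so the optimizer can spread the correction over the whole trajectory.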
@misc{zhang2025vistaslam,
  title={{ViSTA-SLAM}: Visual {SLAM} with Symmetric Two-view Association},
  author={Ganlin Zhang and Shenhan Qian and Xi Wang and Daniel Cremers},
  year={2025},
  eprint={2509.01584},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.01584},
}