MoCA3D: Monocular 3D Bounding Box Prediction in the Image Plane
This release provides the pretrained MoCA3D checkpoint used in the following paper:
MoCA3D: Monocular 3D Bounding Box Prediction in the Image Plane
Changwoo Jeon, Rishi Upadhyay, Achuta Kadambi (arXiv:2603.19538v1, 2026).
Abstract:
Monocular 3D object understanding is reformulated from a 2D RoI-to-3D lifting task into a pixel-space geometry recovery task. MoCA3D predicts projected 3D box corners and per-corner depths without requiring camera intrinsics at inference time, using dense corner heatmaps and depth maps from a tight 2D box input. It is class-agnostic, optimized for image-plane geometry fidelity, and evaluated with Pixel-Aligned Geometry (PAG) metrics (projected-corner and depth consistency).
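To make the output format described above concrete, the following is a minimal sketch of how dense corner heatmaps and depth maps could be turned into projected corners and per-corner depths. It is not the released implementation: the function name `decode_corners`, the `(8, H, W)` tensor layout (one channel per box corner), and the simple argmax peak-picking are all assumptions made for illustration, and the actual checkpoint may use a different layout or a sub-pixel decoding scheme.

```python
import numpy as np


def decode_corners(heatmaps, depth_maps):
    """Pick one pixel per corner and read the depth at that pixel.

    Hypothetical layout (an assumption, not taken from the paper):
      heatmaps:   (K, H, W) array, one score map per box corner (K = 8).
      depth_maps: (K, H, W) array, one depth map per box corner.

    Returns:
      corners_xy: (K, 2) array of (x, y) pixel coordinates.
      depths:     (K,) array of depths sampled at those pixels.
    """
    k, h, w = heatmaps.shape
    # Index of the highest-scoring pixel in each flattened corner channel.
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    # Convert flat indices back to (row, column) = (y, x).
    ys, xs = np.unravel_index(flat_idx, (h, w))
    # Sample each corner's depth map at that corner's peak location.
    depths = depth_maps[np.arange(k), ys, xs]
    corners_xy = np.stack([xs, ys], axis=1).astype(np.float32)
    return corners_xy, depths
```

Note that this decoding uses only image-plane quantities, which is consistent with the claim that no camera intrinsics are needed at inference time; intrinsics would only be required if the corners were later unprojected into metric 3D coordinates.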