Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints

Abstract

Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry (VO) and simultaneous localization and mapping (SLAM), where classic methods consisting of hand-crafted features and sampling-based outlier rejection have been the dominant choice for over a decade. Although multiple works propose to replace these modules with learning-based counterparts, most have not yet been as accurate, robust and generalizable as conventional methods. In this paper, we design an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection, while directly optimizing for the geometric pose objective. We show both quantitatively and qualitatively that pose estimation performance on par with the classic pipeline can be achieved. Moreover, we show that end-to-end training can significantly improve the key components of the pipeline, which leads to better generalizability to unseen datasets compared to existing learning-based methods.

Publication
International Conference on Intelligent Robots and Systems (IROS), 2020
Date