Efficient, consistent, and persistent visual-inertial navigation

Date
2025
Publisher
University of Delaware
Abstract
Visual-inertial navigation systems (VINS) have become ubiquitous because they provide high-quality 3D motion tracking, and they remain at the center of simultaneous localization and mapping (SLAM) research. Deployment platforms continue to fall in cost and shrink in size, enabling mass production of consumer devices (e.g., smartphones, virtual- and augmented-reality headsets, and micro aerial vehicles (MAVs)). A key barrier to the wider deployment of VINS is meeting the accuracy and computational demands of long-term persistent state estimation (e.g., hours of continuous operation in a common global frame). Developing a computationally efficient VINS that can incorporate loop-closure information to reduce estimator drift and increase accuracy over long estimation periods with persistent maps remains a crucial challenge, which this thesis seeks to address.
We first introduce a state-of-the-art open-source filter-based VINS research framework, termed OpenVINS, which leverages cutting-edge extended Kalman filter (EKF) estimation techniques and demonstrates accurate and consistent state estimation in which both the mean and the uncertainty of the state are recovered at each timestep. We then focus on improving this visual-inertial odometry (VIO) with additional loop-closure information by efficiently tracking large planar geometric primitives in the environment, using a novel minimal plane representation termed the Closest Point (CP) plane. We show that including such CP planes, which can be tracked for long periods thanks to their large spatial extent and the proposed tracking algorithm, reduces long-term drift in both simulation and real-world experiments. We next turn to the visual-inertial simultaneous localization and mapping (VI-SLAM) task and to how consistent, long-term persistent localization can be performed without computational complexity growing unboundedly over time. We show that the Schmidt-Kalman filter (SKF) methodology can be combined with two different measurement models, including a novel 2D-to-2D method for indirect loop closure to historical poses, to bound long-term drift while complexity grows only linearly with the size of the historical map. We then show that the proposed Schmidt-EKF for VI-SLAM (SEVIS) can be coupled with a secondary optimization thread, which enables relinearization, to perform large-scale estimation. Finally, we apply the loop-closure and measurement-constraint techniques developed above to distributed multi-robot cooperative localization (CL). We show that covariance intersection (CI) can be leveraged efficiently for distributed VI-SLAM, limiting long-term drift without requiring robots to visit the same locations simultaneously to form cross-robot constraints. The resulting distributed CL estimator achieves state-of-the-art accuracy, consistency, and efficiency in both simulation and real-world experiments.
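As a rough illustration of the closest-point idea named in the abstract, the sketch below assumes the common convention in which a plane is the set of points p with n·p = d (unit normal n, distance d > 0 from the frame origin), so that its closest point to the origin is d·n. The function names are hypothetical and chosen for illustration only; this is not the thesis's or OpenVINS's implementation.

```python
import math


def plane_to_cp(normal, distance):
    """Return the closest-point (CP) form d * n of the plane n.p = d.

    The normal is normalized first, so any non-zero vector is accepted.
    """
    norm = math.sqrt(sum(c * c for c in normal))
    if norm == 0.0:
        raise ValueError("normal must be non-zero")
    return [distance * c / norm for c in normal]


def cp_to_plane(cp):
    """Recover (unit normal, distance) from a CP vector.

    The CP form is singular when the plane passes through the origin,
    because the direction of a zero-length vector is undefined.
    """
    distance = math.sqrt(sum(c * c for c in cp))
    if distance == 0.0:
        raise ValueError("CP form is singular for planes through the origin")
    return [c / distance for c in cp], distance


def point_to_plane_residual(cp, point):
    """Signed distance of a 3D point from the plane encoded by cp."""
    normal, distance = cp_to_plane(cp)
    return sum(n * p for n, p in zip(normal, point)) - distance
```

Under this assumed convention the three numbers of the CP vector carry both the plane's orientation and its distance, which is why it can serve as a minimal, three-parameter plane representation; the price is the singularity at zero distance noted in the comments.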
Keywords
Visual-inertial navigation