
Fusing Stereo Camera and Low-Cost Inertial Measurement Unit for Autonomous Navigation in a Tightly-Coupled Approach

Published online by Cambridge University Press:  16 December 2014

Zhiwen Xian
Affiliation:
(Department of Automatic Control, College of Mechatronics and Automation, National University of Defense Technology)
Xiaoping Hu*
Affiliation:
(Department of Automatic Control, College of Mechatronics and Automation, National University of Defense Technology)
Junxiang Lian
Affiliation:
(Department of Automatic Control, College of Mechatronics and Automation, National University of Defense Technology)

Abstract

Accurate motion estimation is a major task in autonomous navigation. The integration of Inertial Navigation Systems (INS) and the Global Positioning System (GPS) can provide accurate location estimation, but cannot be used in a GPS-denied environment. In this paper, we present a tightly-coupled approach to integrating a stereo camera and a low-cost inertial sensor. This approach takes advantage of the inertial sensor's fast response and the visual sensor's slow drift. In contrast to previous approaches, features both near and far from the camera are taken into consideration simultaneously in the visual-inertial approach. Near features are parameterised as three-dimensional (3D) Cartesian points, which provide range and heading information, whereas far features are initialised as Inverse Depth (ID) points, which provide bearing information. In addition, the inertial sensor biases and a stationary alignment are taken into account. The algorithm employs an Iterative Extended Kalman Filter (IEKF) to estimate the motion of the system, the biases of the inertial sensors and the tracked features over time. An outdoor experiment is presented to validate the proposed algorithm and its accuracy.
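
As a rough illustration of the dual feature parameterisation described in the abstract, the sketch below converts an Inverse Depth feature into a 3D Cartesian point. The axis convention, function name and switching heuristic are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def id_to_cartesian(anchor, theta, phi, rho):
    """Convert an Inverse Depth (ID) feature to a 3D Cartesian point.

    anchor -- camera centre at the feature's first observation (world frame)
    theta  -- azimuth of the observation ray (rad)
    phi    -- elevation of the observation ray (rad)
    rho    -- inverse depth (1/m); rho near 0 encodes a very distant point
    """
    # Unit ray direction; this particular axis convention is an assumption.
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return anchor + m / rho

# A far feature (small rho) stays in ID form because its depth is poorly
# observable and only its bearing is useful; a near feature can be switched
# to Cartesian form once its depth uncertainty has collapsed.
p_near = id_to_cartesian(np.zeros(3), theta=0.1, phi=0.05, rho=0.5)  # ~2 m away
```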

Information

Type
Research Article
Copyright
Copyright © The Royal Institute of Navigation 2014 
Figure 1. The VIS, which consists of the stereo camera and the MIMU, and the relationship between the world {W}, left and right camera {CL}, {CR}, and IMU {I} reference frames. The stereo camera and the MIMU are rigidly attached.
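
A minimal sketch of how the rigid attachment in Figure 1 is typically exploited: the constant, calibrated IMU-to-camera extrinsic is chained with the time-varying world-to-IMU pose to map points into the camera frame. All transforms and numerical values here are hypothetical placeholders.

```python
import numpy as np

# Each transform is an (R, t) pair mapping a point from the source frame
# into the destination frame: p_dst = R @ p_src + t.
def compose(T_ab, T_bc):
    """Chain two transforms: frame c -> frame b -> frame a."""
    (R_ab, t_ab), (R_bc, t_bc) = T_ab, T_bc
    return R_ab @ R_bc, R_ab @ t_bc + t_ab

# Because the stereo camera and the MIMU are rigidly attached, the
# IMU-to-left-camera extrinsic T_CL_I is constant (calibrated once),
# and only the world-to-IMU pose T_I_W has to be estimated over time.
T_I_W  = (np.eye(3), np.array([1.0, 2.0, 0.0]))   # filter estimate (example values)
T_CL_I = (np.eye(3), np.array([0.1, 0.0, 0.0]))   # fixed calibration (example values)
T_CL_W = compose(T_CL_I, T_I_W)                   # maps world points into {CL}
```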

Figure 2. The flowchart of the tight integration system of the stereo camera and the MIMU. The flowchart is divided into four colour-coded parts: the inertial navigation part (black); image processing (blue), consisting of feature detection, tracking and outlier rejection; the IEKF part (green); and the system state management part (red).
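
Of the four parts in Figure 2, the IEKF (green) is the least standard, since it relinearises the non-linear camera measurement model at each iteration. The following is a generic iterated-EKF update sketch, not the authors' exact implementation; the toy measurement at the end is purely illustrative.

```python
import numpy as np

def iekf_update(x, P, z, h, H_jac, R, iters=3):
    """One Iterated EKF measurement update (Gauss-Newton form).

    Unlike a single EKF step, the measurement model h is relinearised
    about the refreshed estimate on every iteration, which helps with
    strongly non-linear observations such as camera projections.
    """
    x_i = x.copy()
    for _ in range(iters):
        H = H_jac(x_i)                                # Jacobian at current iterate
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x_i = x + K @ (z - h(x_i) - H @ (x - x_i))    # IEKF iterate
    H = H_jac(x_i)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    P_new = (np.eye(len(x)) - K @ H) @ P              # covariance at final iterate
    return x_i, P_new

# Toy example: scalar measurement z = x0**2 of a 2D state.
x, P = np.array([1.0, 0.0]), np.eye(2)
h = lambda x: np.array([x[0] ** 2])
H_jac = lambda x: np.array([[2 * x[0], 0.0]])
x_new, P_new = iekf_update(x, P, np.array([1.2]), h, H_jac, R=np.array([[0.01]]))
```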

Figure 3. (a) Data collection system, which consists of a laptop and the visual-inertial system. (b) A sample image from the outdoor data collection, containing both near (circles) and far (crosses) features.

Figure 4. Estimated path in the horizontal plane for the iner-only-uncomp (sensor biases ignored), iner-only-comp (sensor biases compensated), vis-only and vis-iner solutions. The iner-only-uncomp and iner-only-comp solutions exceed the scale of the map after 70 and 76 seconds, respectively. The path estimated by vis-iner agrees well with the known path (black line) and shows smaller errors in position and heading than the vis-only solution.

Figure 5. Estimated velocity comparison for the iner-only-uncomp (magenta dotted line), iner-only-comp (blue solid line), vis-only (green dash-dotted line) and vis-iner (red dashed line) solutions. After the 61-second alignment period, the velocity estimated by iner-only-uncomp diverges rapidly, while iner-only-comp remains stable only for a short time. The vis-only and vis-iner solutions give similar results; nevertheless, vis-iner is smoother than vis-only.

Figure 6. Estimated position comparison for the iner-only-uncomp (magenta dotted line), iner-only-comp (blue solid line), vis-only (green dash-dotted line) and vis-iner (red dashed line) solutions. The left column shows that the vis-iner result is a significant improvement over the iner-only results. The zoomed-in position results are shown in the right column. Compared to the vis-only result, vis-iner achieves higher estimation precision, especially in the height direction (z-axis).

Figure 7. Performance comparison using near, far, and combined near-and-far (N&F) features. (a) shows the estimated path in the horizontal plane and (b) gives the 3D trajectory estimate.

Figure 8. The features in the image (a) and their position and variance estimates in the world frame (b). Most features in image (a) are matched successfully, while a few failed and were replaced by newly detected ones. In the 3D view (b), the red crosses indicate the position estimates of the currently tracked features, and the blue and green ellipses indicate the feature variances. Note that the blue and green ellipses look like bold lines radiating from a common centre (in our case, the camera), a result of the large uncertainty in the range direction.
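
The ray-like appearance of the ellipses can be reproduced by propagating the Inverse Depth uncertainties to Cartesian space to first order: the inverse-depth noise stretches the ellipsoid along the viewing ray, while the angular noise only widens it slightly across the ray. A minimal sketch, assuming the same (illustrative) ID convention as the earlier example:

```python
import numpy as np

def id_cartesian_cov(theta, phi, rho, sigmas):
    """First-order covariance of p = anchor + m(theta, phi) / rho."""
    c, s = np.cos, np.sin
    m = np.array([c(phi) * s(theta), -s(phi), c(phi) * c(theta)])
    dm_dtheta = np.array([c(phi) * c(theta), 0.0, -c(phi) * s(theta)])
    dm_dphi = np.array([-s(phi) * s(theta), -c(phi), -s(phi) * c(theta)])
    # Jacobian of p with respect to (theta, phi, rho)
    J = np.column_stack([dm_dtheta / rho, dm_dphi / rho, -m / rho**2])
    return J @ np.diag(np.asarray(sigmas) ** 2) @ J.T

# A feature ~20 m away with modest angular and inverse-depth noise
# (all values illustrative):
P = id_cartesian_cov(theta=0.0, phi=0.0, rho=0.05, sigmas=[0.01, 0.01, 0.02])
w, V = np.linalg.eigh(P)
# The dominant eigenvector V[:, -1] is (almost) the viewing ray [0, 0, 1]:
# the ellipsoid is stretched far more along the range direction (~8 m here)
# than across it (~0.2 m), producing the ray-like ellipses of Figure 8(b).
```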

Figure 9. The execution time for processing 1 s of data, which consists of 100 inertial measurements and 10 image pairs. (a) shows the execution time versus the real test time and (b) shows the main steps and their execution times.