Article

A Novel Loosely Coupled Collaborative Localization Method Utilizing Integrated IMU-Aided Cameras for Multiple Autonomous Robots

Cheng Liu, Tao Wang, Zhi Li, Shu Li and Peng Tian
1 State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, China
2 Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401100, China
3 School of Electrical Engineering, Liaoning University of Technology, Jinzhou 121000, China
4 School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 401100, China
* Authors to whom correspondence should be addressed.
Sensors 2025, 25(10), 3086; https://doi.org/10.3390/s25103086
Submission received: 7 April 2025 / Revised: 1 May 2025 / Accepted: 10 May 2025 / Published: 13 May 2025

Abstract

IMUs (inertial measurement units) and cameras are popular sensors for autonomous localization because they are easy to integrate. This article proposes a collaborative localization method, the CICEKF (collaborative IMU-aided camera extended Kalman filter), with a loosely coupled, two-step structure for the autonomous locomotion estimation of collaborative robots. The first step performs single-robot localization estimation, fusing the IMU and visual measurement data at the velocity level, which improves robustness and adapts to different visual measurement approaches without redesigning the visual optimization process. The second step estimates the relative configuration of multiple robots, further fusing the individual motion information to estimate the relative translation and rotation reliably. Simulations and experiments demonstrate that both steps of the filter can accomplish locomotion estimation tasks, either standalone or collaboratively.

1. Introduction

1.1. Motivation

The autonomous localization of multiple robots operating on the ground and in the air is well explored and has been achieved with different sensor setups, such as visual measurement [1], GNSS (global navigation satellite system) [2], and visual–inertial equipment [3]. However, for search and rescue robots facing emergency or indoor situations [4,5], collaborative autonomous navigation in unknown and GNSS-denied environments [6] aimed at closed-loop group motion control [7,8] remains challenging because accuracy must be maintained against environmental uncertainty. Visual–inertial approaches are popular for solving the localization problem of both individual robots and groups because of their robustness and real-time performance against interference [9,10]. Although vision-based odometry and SLAM (simultaneous localization and mapping) methods have achieved great accuracy thanks to their batch-optimization capability [11,12,13], vision-only approaches are inherently weak at solving the collaborative localization problem, owing to the lack of a common world coordinate frame to serve as an anchor [14,15]. On the other hand, IMUs offer good real-time responsiveness, while their inherent drift degrades long-term performance.
Therefore, it is necessary to explore approaches with an open and flexible data fusion framework that improve the accuracy of cooperative group localization, balancing the responsiveness of IMU sensors with the accuracy of visual measurement for collaborative agents in GNSS-restricted situations.

1.2. Related Work

The research on autonomous localization based on the data fusion of the IMU and visual measurement mainly focuses on tightly and loosely coupled approaches.
The tightly coupled methods are typically biased towards visual measurement, integrating the IMU’s real-time gyroscope and accelerometer output into the batch-optimization process. State-of-the-art single-robot visual–inertial localization methods, such as ORB-SLAM v3 [16] and MSCKF 2.0 [17], have shown great robustness and accuracy. Nguyen et al. [18] proposed a two-step optimization-based collaborative visual–inertial-range localization method that estimates the relative transformation and the single-robot localization separately. Tian et al. [19] proposed a distributed multi-robot system for robust and dense metric-semantic SLAM that detects loop closures and performs distributed trajectory estimation when communication among individuals is available. Although such tightly coupled methods achieve great performance in tests thanks to post-optimization, they usually lack openness, owing to their fixed optimization structure, and real-time performance, owing to their high computational demand. This makes them inefficient at handling uncertain topologies and the differing mounted sensors of multiple agents.
Loosely coupled methods are naturally suited to relative localization because of their flexible algorithm structure and their timeliness, which comes from emphasizing the IMU update [20]. Their main weakness is unsatisfactory accuracy owing to the inevitable drift of the IMU measurement, which must be handled carefully.
Kelly and Sukhatme [21] achieved the self-calibration of a monocular visual–inertial system with a specifically designed UKF (unscented Kalman filter). Following a similar approach, Weiss and Siegwart [22] introduced the estimation of the world coordinate frame drift of the visual measurement into a multi-level EKF (extended Kalman filter) for a better trajectory estimation for a single drone. Based on further research, Achtelik and Weiss [23] achieved the estimation of the relative configuration of two drones by utilizing a multi-level EKF to handle the loosely coupled visual–inertial data fusion. Researchers have introduced new filter techniques, such as the invariant EKF based on Lie algebra [24], the multi-state constraint Kalman filter [25], and the equivariant filter [26,27], into the trajectory estimation with visual–inertial equipment.
These loosely coupled methods usually treat visual measurement as a black box, thus achieving a flexible structure with great portability across different sensor setups. However, because they only consider the IMU update at the acceleration and velocity levels, the final position-level correction with the visual measurement result may lose accuracy when sudden visual measurement instability is encountered.

1.3. Our Approach

This article presents a loosely coupled filter-based approach, the CICEKF, with two steps aiming at both the autonomous localization of a single robot and the collaborative estimation of relative multi-robot localization. The time complexity of the method is $O(n)$. The SR-CICEKF (single-robot CICEKF) step fuses the stereo-vision and IMU data at the velocity level from the integrated sensors for robust locomotion estimation. The MR-CICEKF (multi-robot CICEKF) step fuses the outcomes of the SR-CICEKF from multiple robots to estimate the real-time relative position and attitude from each slave robot to the master robot. The two steps can run separately as standalone filters or serially as a complete CICEKF. Furthermore, to demonstrate the feasibility of the CICEKF, its observability is proven.
This article is organized as follows: the design of the state vectors of the CICEKF is described in Section 2; a detailed description of the propagation and update processes of the filter is provided in Section 3; the local weak observability of the CICEKF is demonstrated in Section 4; a simulation with quintic curves is presented in Section 5; real-world experiments are carried out and analyzed in Section 6; and the conclusions of this study are provided in Section 7.

2. Design of the State Vector

2.1. Definition of Variables

To focus the discussion on the filter-based data fusion process, the stereo vision is treated as black-box autonomous locomotion measurement equipment. It is assumed that each robot is equipped with an IMU-aided stereo-vision sensor suite, of which the relative position and attitude of the IMU and the cameras are already calibrated without further drifting. The locomotive variables and relative transformations are defined as shown in Figure 1, and the coordinate frames and the notation of variables used in this study are provided in Table 1.
The fixed index of the master robot is set as 0, and the indexes of the slave robots are numbered from 1 to $n$. $^{n}p_{ci}$ and $^{n}\bar{q}_{ci}$ represent the calibrated relative position and attitude, respectively, between the IMU and the stereo-vision coordinate frames of the $n$-th robot. The $n$-th robot’s linear translation $^{n}p_{wc}$ and rotation $^{n}\bar{q}_{wc}$, obtained from the black-box stereo vision, are measured with respect to the world coordinate frame. $^{n}p_{wi}$ and $^{n}\bar{q}_{wi}$ are the linear translation and the rotation of the IMU inspected in the world coordinate frame, respectively. The coupled position and attitude fusion results of each IMU–camera suite, i.e., $^{n}p_{wic}$ and $^{n}\bar{q}_{wic}$, are attached to its IMU coordinate frame to simplify the filtering process. The raw measurement outputs of the $n$-th IMU are the linear acceleration $^{n}a_{mi}$ and the angular velocity $^{n}\omega_{mi}$, which are measured in its body-following coordinate frame. The relative configuration consists of the relative position and attitude between the $n$-th slave robot and the master robot, which are $^{0n}p_{r}$ and $^{0n}\bar{q}_{r}$, respectively. SI units are used throughout this article. This design of the master/slave relationship among multiple agents allows the topology to be modified conveniently in practical situations.
In this study, the drift of the vision’s world coordinate frame is not considered, since it is slow enough to be treated as noise. Furthermore, all noise variables are modeled as random walks. The small-angle assumption is applied: letting $\theta$ be the rotation angle corresponding to $\bar{q}$, the error of $\bar{q}$ is written as $\delta\bar{q} = [q_0, \delta q^T]^T \approx [1, \tfrac{1}{2}\delta\theta^T]^T$ when $\theta \ll 1$ [28].

2.2. Construction of the SR-CICEKF State Vector

From this point onwards, for the filter construction considering the IMU-aided stereo-vision suite of any single robot, the left corner markers (robot indexes) are omitted for simplification, e.g., $^{0}p_{wi} = p_{wi}$.
Thus, the vector with 29 elements representing the state of the SR-CICEKF is defined as follows:
$$X = \left[\, p_{wic}^T \;\; v_{wc}^T \;\; v_{wi}^T \;\; \bar{q}_{wic}^T \;\; \bar{q}_{wi}^T \;\; \omega_{cc}^T \;\; \omega_{ii}^T \;\; b_{ai}^T \;\; b_{\omega i}^T \,\right]^T,$$
where $v_{wc}$ is the derivative of $p_{wc}$; $v_{wi}$ is integrated from $a_{mi}$, which possesses a bias $b_{ai}$; $\omega_{cc}$ is the angular velocity of the stereo-vision body, inspected in the IMU coordinate frame rather than in the vision coordinate frame; and $\omega_{ii}$ is the angular velocity of the IMU body in its own coordinate frame, with a bias $b_{\omega i}$. Additionally, $\dot{v}_{wi} = a_{wi} - g = R_{wi}\left(a_{mi} - b_{ai} - n_{a_{mi}}\right) - g$ and $\omega_{ii} = \omega_{mi} - b_{\omega i} - n_{\omega_{mi}}$, where $n_{a_{mi}}$ and $n_{\omega_{mi}}$ are the measurement noises, and $g$ is the gravity vector with respect to the world coordinate frame.

2.2.1. Coupling Process for the SR-CICEKF

The independent coupling coefficients of the linear and angular velocities are $\mu_v \in [0,1]$ and $\mu_\omega \in [0,1]$, respectively, ensuring adjustability while enhancing motion estimation accuracy without affecting observability.
The linear velocity coupling is straightforward and written as follows:
$$v_{wic} = \mu_v v_{wi} + (1-\mu_v)R_{ci}v_{wc}.$$
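As a concrete illustration, the velocity-level coupling above can be written in a few lines of NumPy; the function and variable names below are illustrative and not taken from the authors’ implementation:

```python
import numpy as np

def couple_linear_velocity(v_wi, v_wc, R_ci, mu_v=0.5):
    """Velocity-level fusion: v_wic = mu_v * v_wi + (1 - mu_v) * R_ci @ v_wc.

    v_wi : (3,) linear velocity integrated from the IMU, world frame
    v_wc : (3,) linear velocity from the black-box stereo vision
    R_ci : (3, 3) calibrated camera-to-IMU rotation matrix
    mu_v : coupling coefficient in [0, 1]
    """
    return mu_v * np.asarray(v_wi) + (1.0 - mu_v) * np.asarray(R_ci) @ np.asarray(v_wc)
```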
The coupling process of the angular velocities cannot be conducted directly, since the rotations of the IMU and vision bodies are inspected in different coordinate frames. Assuming $\bar{q}_{wic} = \bar{q}_{wi}^{\,\mu_\omega}\otimes\bar{q}_{wc}^{\,1-\mu_\omega}$, the following equations are obtained, referring to [29]:
$$\frac{d}{dt}\bar{q}_{wi}^{\,\mu_\omega} = \frac{1}{2}\,\bar{q}_{wi}^{\,\mu_\omega}\otimes\left(\mu_\omega\bar{\omega}_{ii}\right),$$
$$\frac{d}{dt}\bar{q}_{wc}^{\,1-\mu_\omega} = \frac{1}{2}\,\bar{q}_{wc}^{\,1-\mu_\omega}\otimes\left((1-\mu_\omega)\bar{\omega}_{cc}\right),$$
where $\bar{q}_{wc}$ is obtained by inspecting the rotation of the stereo-vision body in the IMU coordinate frame, and $\bar{q}^{\,\mu}$ is the $\mu$-th power of $\bar{q}$, i.e., a unit quaternion that scales the rotation angle around the same axis by $\mu \in [0,1]$ [30].
According to Appendix A, Equations (3) and (4), q ¯ w i c can thus be derived as follows:
$$\dot{\bar{q}}_{wic} = \frac{d}{dt}\bar{q}_{wi}^{\,\mu_\omega}\otimes\bar{q}_{wc}^{\,1-\mu_\omega} + \bar{q}_{wi}^{\,\mu_\omega}\otimes\frac{d}{dt}\bar{q}_{wc}^{\,1-\mu_\omega} = \frac{\mu_\omega}{2}\,\bar{q}_{wi}^{\,\mu_\omega}\otimes\bar{\omega}_{ii}\otimes\bar{q}_{wc}^{\,1-\mu_\omega} + \frac{1-\mu_\omega}{2}\,\bar{q}_{wic}\otimes\bar{\omega}_{cc},$$
where $\bar{\omega} = [0, \omega^T]^T$.
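The fractional quaternion power used above can be implemented by scaling the rotation angle about the same axis. A minimal sketch (scalar-first Hamilton quaternions, NumPy; the helper names are illustrative):

```python
import numpy as np

def quat_mul(q, p):
    """Hamilton product q ⊗ p for scalar-first quaternions [q0, qx, qy, qz]."""
    q0, qv = q[0], np.asarray(q[1:])
    p0, pv = p[0], np.asarray(p[1:])
    return np.concatenate(([q0 * p0 - qv @ pv],
                           q0 * pv + p0 * qv + np.cross(qv, pv)))

def quat_power(q, mu):
    """mu-th power of a unit quaternion: the rotation angle is scaled by mu, same axis."""
    q0 = np.clip(q[0], -1.0, 1.0)
    theta = 2.0 * np.arccos(q0)                     # full rotation angle of q
    if np.isclose(theta, 0.0):
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = np.asarray(q[1:]) / np.sin(theta / 2.0)
    return np.concatenate(([np.cos(mu * theta / 2.0)], np.sin(mu * theta / 2.0) * axis))
```

With these helpers, the coupled attitude $\bar{q}_{wic}$ can be formed by multiplying $\bar{q}_{wi}^{\,\mu_\omega}$ and $\bar{q}_{wc}^{\,1-\mu_\omega}$, and the coupled attitude derivative above can be integrated numerically step by step.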

2.2.2. Simplification of the SR-CICEKF State Vector

To simplify the further discussion, the SR-CICEKF state vector in Equation (1) is rewritten as follows:
$$X = \left[\, p_{ic}^T \;\; v_c^T \;\; v_i^T \;\; \bar{q}_{ic}^T \;\; \bar{q}_i^T \;\; \omega_c^T \;\; \omega_i^T \;\; b_a^T \;\; b_\omega^T \,\right]^T.$$
The corresponding derivatives are as follows:
$$\begin{aligned}
\dot{p}_{ic} &= v_{ic} = \mu_v v_i + (1-\mu_v)R_{ci}v_c, & \dot{v}_c &= n_{v_c}, & \dot{v}_i &= a_i - g = R_i\left(a_{mi} - b_{ai} - n_{a_{mi}}\right) - g,\\
\dot{\bar{q}}_{ic} &= \frac{\mu_\omega}{2}\,\bar{q}_i^{\,\mu_\omega}\otimes\bar{\omega}_i\otimes\bar{q}_c^{\,1-\mu_\omega} + \frac{1-\mu_\omega}{2}\,\bar{q}_{ic}\otimes\bar{\omega}_c, & \dot{\bar{q}}_i &= \frac{1}{2}\bar{q}_i\otimes\bar{\omega}_i, & \dot{\omega}_c &= n_{\omega_c},\\
\dot{\omega}_i &= \frac{d}{dt}\left(\omega_{mi} - b_{\omega i} - n_{\omega_{mi}}\right), & \dot{b}_{ai} &= n_{b_{ai}}, & \dot{b}_{\omega i} &= n_{b_{\omega i}},
\end{aligned}$$
where $\bar{q}_c = \bar{q}_{wc}\otimes\bar{q}_{ci}\otimes\bar{q}_{wc}^{\,*}$.

2.2.3. Error of the SR-CICEKF State Vector

The error state contributes to the propagation matrices and is essential for the filtering process. $\tilde{X}$ contains 27 elements, written as follows:
$$\tilde{X} = \left[\, \Delta p_{ic}^T \;\; \Delta v_c^T \;\; \Delta v_i^T \;\; \delta\theta_{ic}^T \;\; \delta\theta_i^T \;\; \Delta\omega_c^T \;\; \Delta\omega_i^T \;\; \Delta b_{ai}^T \;\; \Delta b_{\omega i}^T \,\right]^T.$$
The update rate of this filter-based approach is determined by the IMU refresh rate. Thus, small high-order terms in the process can be omitted, such as $\delta q\,\Delta\omega$, $\delta q\,\delta q$, and $\delta q\,n$, since the IMU can usually run faster than 100 Hz.
To build the connection from the state vector to the propagation process, the derivative form of X ˜ should be inspected.
The derivative of the coupled translation error can be obtained directly as follows:
$$\Delta\dot{p}_{ic} = \mu_v\Delta v_i + (1-\mu_v)R_{ci}\Delta v_c.$$
Let $\hat{a}_i = a_{mi} - \hat{b}_{ai}$ and $\Delta b_{ai} = b_{ai} - \hat{b}_{ai}$, and according to $R_i = R(\bar{q}_i) \approx \hat{R}_i\left(I_3 + \lfloor\delta\theta_i\rfloor_\times\right)$, which follows from Appendix A, the derivative of the IMU velocity error is
$$\Delta\dot{v}_i = a_i - \hat{a}_i = R_i\left(a_{mi} - b_{ai} - n_{a_{mi}}\right) - g - \hat{R}_i\left(a_{mi} - \hat{b}_{ai}\right) + g \approx -\hat{R}_i\lfloor\hat{a}_i\rfloor_\times\delta\theta_i - \hat{R}_i\Delta b_{ai} - n_{a_{mi}},$$
where the higher-order terms are omitted.
According to Appendix A, the following relationships hold:
$$\bar{q}_{ic}^{\,*} = \bar{q}_c^{\,(1-\mu_\omega)*}\otimes\bar{q}_i^{\,\mu_\omega *},\qquad \bar{q}_i^{\,\mu_\omega *}\otimes\bar{q}_i^{\,\mu_\omega} = 1,\qquad \hat{\bar{q}}_c^{\,(1-\mu_\omega)*}\otimes\bar{p}_{ic}\otimes\hat{\bar{q}}_c^{\,1-\mu_\omega} = \left[0,\ \left(\hat{R}_{\mu_\omega c}^{\,T}\,p_{ic}\right)^T\right]^T,$$
where the shorthand $\hat{R}_{\mu_\omega c} = \hat{R}\!\left(\hat{\bar{q}}_c^{\,1-\mu_\omega}\right)$ is used.
By substituting Equation (11) into Equation (5), the following is obtained:
$$\delta\dot{\bar{q}}_{ic} \approx \frac{\mu_\omega}{2}\begin{bmatrix}0\\ -2\lfloor R_{\mu_\omega c}^{\,T}\hat{\omega}_i\rfloor_\times\delta q_{ic} + q_0 R_{\mu_\omega c}^{\,T}\Delta\omega_i\end{bmatrix} + \frac{1-\mu_\omega}{2}\begin{bmatrix}0\\ -2\lfloor\hat{\omega}_c\rfloor_\times\delta q_{ic} + q_0\Delta\omega_c\end{bmatrix},$$
where Δ ω ¯ = ω ¯ ω ¯ ^ is applied, and the high-order terms are disregarded.
Following the small-angle assumption, let $\delta\bar{q}_{ic} = [q_0, \delta q_{ic}^T]^T \approx [1, \tfrac{1}{2}\delta\theta_{ic}^T]^T$, which yields
$$\delta\dot{\theta}_{ic} = \mu_\omega\left(-\lfloor R_{\mu_\omega c}^{\,T}\hat{\omega}_i\rfloor_\times\delta\theta_{ic} + R_{\mu_\omega c}^{\,T}\Delta\omega_i\right) + (1-\mu_\omega)\left(-\lfloor\hat{\omega}_c\rfloor_\times\delta\theta_{ic} + \Delta\omega_c\right).$$
Similarly,
$$\delta\dot{\bar{q}}_i \approx \begin{bmatrix}0\\ -\lfloor\hat{\omega}_i\rfloor_\times\delta q_i + \tfrac{1}{2}q_{i0}\Delta\omega_i\end{bmatrix},$$
where $\hat{\omega}_i = \omega_{mi} - \hat{b}_{\omega i}$ and $\Delta b_{\omega i} = b_{\omega i} - \hat{b}_{\omega i}$ are applied.
Following the small-angle assumption, let $\delta\bar{q}_i = [q_0, \delta q_i^T]^T \approx [1, \tfrac{1}{2}\delta\theta_i^T]^T$, which yields
$$\delta\dot{\theta}_i = -\lfloor\hat{\omega}_i\rfloor_\times\delta\theta_i + \Delta\omega_i = -\lfloor\hat{\omega}_i\rfloor_\times\delta\theta_i - \Delta b_{\omega i} - n_{\omega_{mi}}.$$
The other terms in the error state do not change compared with the state vector, since they are obtained directly from the measurement. Therefore,
$$\Delta\dot{v}_c = n_{v_c},\quad \Delta\dot{\omega}_c = n_{\omega_c},\quad \Delta\dot{\omega}_i = \frac{d}{dt}\left(\Delta b_{\omega i} + n_{\omega_{mi}}\right) = n_{\omega_i},\quad \Delta\dot{b}_{ai} = n_{b_{ai}},\quad \Delta\dot{b}_{\omega i} = n_{b_{\omega i}}.$$

2.3. Construction of the MR-CICEKF State Vector

One primary purpose of this study is to achieve a relative configuration between two robots. Therefore, the IMU coordinate frame, which is inherent and fixed after calibration, is chosen as the anchorage to establish the filter. For the conditions with multiple robots running simultaneously, the relative configuration chain can be easily developed with a convenient adjustment to the topology.
The construction process of the MR-CICEKF state equations is partially similar to that of the SR-CICEKF. The input of the MR-CICEKF is obtained from the output of the SR-CICEKF, and the state vector with 19 elements is written as follows:
$$Y = \left[\, {}^{0}p_r^T \;\; {}^{1}v_{wi}^T \;\; {}^{0}v_{wi}^T \;\; {}^{0}\bar{q}_r^T \;\; {}^{1}\omega_{ii}^T \;\; {}^{0}\omega_{ii}^T \,\right]^T,$$
where $^{0}p_r$ and $^{0}\bar{q}_r$ have been described previously; $^{0}v_{wi}$ and $^{0}\omega_{ii}$ are the IMU’s linear and angular velocities obtained after applying the SR-CICEKF to the master robot’s sensor suite, respectively; and $^{1}v_{wi}$ and $^{1}\omega_{ii}$ are the corresponding outputs of the slave robot.
The derivative form of Y connects the variables in the state vector. The derivative of the relative translation is obtained as follows:
$${}^{1}p_{wi} - {}^{0}p_{wi} = R\!\left({}^{0}\bar{q}_r\right){}^{0}p_r \;\Rightarrow\; \frac{d}{dt}\left({}^{1}p_{wi} - {}^{0}p_{wi}\right) = R_r\lfloor{}^{0}\omega_{ii}\rfloor_\times\,{}^{0}p_r + R_r\,{}^{0}\dot{p}_r \;\Rightarrow\; {}^{0}\dot{p}_r = {}^{r}v_{wi} - \lfloor{}^{0}\omega_{ii}\rfloor_\times\,{}^{0}p_r,$$
where $^{r}v_{wi} = R_r^T\left({}^{1}v_{wi} - {}^{0}v_{wi}\right)$ is the relative linear velocity of the robots.
According to $^{0}\bar{q}_r = {}^{0}\bar{q}_{wi}^{\,*}\otimes{}^{1}\bar{q}_{wi}$, the derivative of the relative rotation is written as follows:
$${}^{0}\dot{\bar{q}}_r = \frac{d}{dt}{}^{0}\bar{q}_{wi}^{\,*}\otimes{}^{1}\bar{q}_{wi} + {}^{0}\bar{q}_{wi}^{\,*}\otimes\frac{d}{dt}{}^{1}\bar{q}_{wi} = \frac{1}{2}\left({}^{0}\bar{q}_r\otimes{}^{1}\bar{\omega}_{ii} - {}^{0}\bar{\omega}_{ii}\otimes{}^{0}\bar{q}_r\right).$$

2.3.1. Simplification of the MR-CICEKF State Vector

To simplify the further discussion, the MR-CICEKF state vector is rewritten as follows:
$$Y = \left[\, p_r^T \;\; v_{i0}^T \;\; v_{i1}^T \;\; \bar{q}_r^T \;\; \omega_{i0}^T \;\; \omega_{i1}^T \,\right]^T.$$
Additionally, $^{r}v_{wi} = v_r$. The derivatives of the relative motion variables are written as follows:
$$\dot{p}_r = v_r - \lfloor\omega_{i0}\rfloor_\times p_r = R_r^T\left(v_{i1} - v_{i0}\right) - \lfloor\omega_{i0}\rfloor_\times p_r,\quad \dot{\bar{q}}_r = \frac{1}{2}\left(\bar{q}_r\otimes\bar{\omega}_{i1} - \bar{\omega}_{i0}\otimes\bar{q}_r\right),\quad \dot{v}_{i0} = n_{v_{i0}},\quad \dot{v}_{i1} = n_{v_{i1}},\quad \dot{\omega}_{i0} = n_{\omega_{i0}},\quad \dot{\omega}_{i1} = n_{\omega_{i1}}.$$
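For illustration, the relative configuration that the MR-CICEKF estimates can also be computed directly from two absolute poses under a common convention (master index 0, slave index 1). The sketch below uses hypothetical helper names, and the exact frame assignment may differ slightly from the paper’s derivation:

```python
import numpy as np

def quat_conj(q):
    """Conjugate of a scalar-first quaternion."""
    return np.concatenate(([q[0]], -np.asarray(q[1:])))

def quat_to_rot(q):
    """Rotation matrix of a unit scalar-first quaternion."""
    q0, x, y, z = q
    return np.array([[1 - 2*(y*y + z*z), 2*(x*y - q0*z),     2*(x*z + q0*y)],
                     [2*(x*y + q0*z),     1 - 2*(x*x + z*z), 2*(y*z - q0*x)],
                     [2*(x*z - q0*y),     2*(y*z + q0*x),     1 - 2*(x*x + y*y)]])

def relative_configuration(p_wi0, q_wi0, p_wi1, q_wi1):
    """Relative translation and rotation of the slave robot with respect to the master,
    resolved in the master IMU frame; quat_mul is the helper from the earlier sketch."""
    p_r = quat_to_rot(q_wi0).T @ (np.asarray(p_wi1) - np.asarray(p_wi0))
    q_r = quat_mul(quat_conj(q_wi0), q_wi1)      # q_r = q_wi0^* ⊗ q_wi1
    return p_r, q_r
```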

2.3.2. Error of the MR-CICEKF State Vector

$\tilde{Y}$ contains 18 elements and is written as follows:
$$\tilde{Y} = \left[\, \Delta p_r^T \;\; \Delta v_{i0}^T \;\; \Delta v_{i1}^T \;\; \delta\theta_r^T \;\; \Delta\omega_{i0}^T \;\; \Delta\omega_{i1}^T \,\right]^T.$$
Following Appendix A, $R_r^T = \left(I_3 - \lfloor\delta\theta_r\rfloor_\times\right)\hat{R}_r^T$ holds. According to $p_r = \hat{p}_r + \Delta p_r$, and omitting high-order terms,
$$\Delta\dot{p}_r = R_r^T\left(v_{i1} - v_{i0}\right) - \lfloor\omega_{i0}\rfloor_\times p_r - \hat{R}_r^T\left(\hat{v}_{i1} - \hat{v}_{i0}\right) + \lfloor\hat{\omega}_{i0}\rfloor_\times\hat{p}_r \approx \hat{R}_r^T\Delta v_{i1} - \hat{R}_r^T\Delta v_{i0} + \lfloor\hat{R}_r^T\left(v_{i1} - v_{i0}\right)\rfloor_\times\delta\theta_r + \lfloor\hat{p}_r\rfloor_\times\Delta\omega_{i0} - \lfloor\hat{\omega}_{i0}\rfloor_\times\Delta p_r,$$
where $\omega_{i0} = \hat{\omega}_{i0} + \Delta\omega_{i0}$, $v_{i0} = \hat{v}_{i0} + \Delta v_{i0}$, and $v_{i1} = \hat{v}_{i1} + \Delta v_{i1}$ are applied.
Resembling the deduction process of Equations (11) and (12), the error of the relative rotation is obtained as follows:
$$\delta\dot{\bar{q}}_r = \begin{bmatrix}0\\ -\lfloor\hat{\omega}_{i1}\rfloor_\times\delta q_r\end{bmatrix} + \begin{bmatrix}0\\ \tfrac{1}{2}\Delta\omega_{i1}\end{bmatrix} - \begin{bmatrix}0\\ \tfrac{1}{2}\hat{R}_r^T\Delta\omega_{i0}\end{bmatrix}.$$
After applying the small-angle assumption, the result is as follows:
$$\delta\dot{\theta}_r = -\lfloor\hat{\omega}_{i1}\rfloor_\times\delta\theta_r + \Delta\omega_{i1} - \hat{R}_r^T\Delta\omega_{i0}.$$
Since the linear and angular velocities are generated from the SR-CICEKF process, which can be regarded as a measurement process,
$$\Delta v_{i0} = n_{v_{i0}},\quad \Delta v_{i1} = n_{v_{i1}},\quad \Delta\omega_{i0} = n_{\omega_{i0}},\quad \Delta\omega_{i1} = n_{\omega_{i1}}.$$

2.4. Relationships Among the Variables in the CICEKF

By analyzing the construction of the state vectors, it can be concluded that the SR-CICEKF and the MR-CICEKF are tightly connected in the inheritance of variables and the synchronous update, which are assembled as an entire CICEKF process. The relationships among the variables in the CICEKF states are shown in Figure 2. The whole filter is divided into three parts: data preparation, filter process, and data output. The data preparation contains the data input and fusion for both the master and slave robots as individuals. The filter process is separated into two steps, which are SR-CICEKF and MR-CICEKF.

3. Propagation and Update of the CICEKF

The propagation process is key to connecting different layers of the filter. The update process introduces measurement data to correct the entire data fusion process.

3.1. Propagation and Measurement of the SR-CICEKF

3.1.1. Propagation of the SR-CICEKF

Let the continuous state transition matrix of the SR-CICEKF be $F_c$ and the continuous noise transition matrix be $G_c$, both assumed constant over each integration time step. By linearizing the continuous-time error state of the CICEKF, the following is obtained [31]:
$$\dot{\tilde{X}} = F_c\tilde{X} + G_c n_X,$$
where the state noise vector is $n_X = \left[\, n_{v_c}^T \;\; n_{a_{mi}}^T \;\; n_{\omega_c}^T \;\; n_{\omega_{mi}}^T \;\; n_{b_{ai}}^T \;\; n_{b_{\omega i}}^T \,\right]^T$.
By applying Taylor’s formula, the discrete form of Equation (27) is as follows [31]:
$$F_d = \exp\left(F_c\Delta t\right) = I_d + F_c\Delta t + \frac{1}{2!}F_c^2\Delta t^2 + \cdots,\qquad Q_d = \int_t^{t+\Delta t} F_d(\tau)\,G_c Q_c G_c^T\,F_d(\tau)^T\,d\tau,$$
where $F_d$ is the discrete form of $F_c$, and the continuous noise covariance matrix $Q_c = \mathrm{diag}\left(\sigma_{n_{v_c}}^2, \sigma_{n_{a_i}}^2, \sigma_{n_{\omega_c}}^2, \sigma_{n_{\omega_i}}^2, \sigma_{n_{b_{ai}}}^2, \sigma_{n_{b_{\omega i}}}^2\right)$ is a diagonal matrix, with $Q_d$ being its discrete form.
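A minimal sketch of this discretization, truncating the series after the second-order term and approximating the covariance integral with a single quadrature point (NumPy; names are illustrative):

```python
import numpy as np

def discretize(F_c, G_c, Q_c, dt):
    """F_d ≈ I + F_c*dt + (F_c*dt)^2 / 2!, and Q_d ≈ F_d G_c Q_c G_c^T F_d^T * dt."""
    n = F_c.shape[0]
    F_d = np.eye(n) + F_c * dt + (F_c @ F_c) * dt**2 / 2.0
    Q_d = F_d @ G_c @ Q_c @ G_c.T @ F_d.T * dt   # one-point approximation of the integral
    return F_d, Q_d
```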
By considering the first-order expansion, after computing the Jacobian $\partial\dot{\tilde{X}}/\partial\tilde{X}$ to obtain $F_c$, the expression of $F_d$ can be written as follows:
$$F_d = \begin{bmatrix}
I_3 & (1-\mu_v)R_{ci}\Delta t & \mu_v\Delta t\,I_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & I_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & I_3 & 0_3 & -\hat{R}_i\lfloor\hat{a}_i\rfloor_\times\Delta t & 0_3 & 0_3 & -\hat{R}_i\Delta t & 0_3\\
0_3 & 0_3 & 0_3 & A & 0_3 & (1-\mu_\omega)\Delta t\,I_3 & \mu_\omega R_{\mu_\omega c}^T\Delta t & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & I_3 - \lfloor\hat{\omega}_i\rfloor_\times\Delta t & 0_3 & 0_3 & 0_3 & -\Delta t\,I_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3
\end{bmatrix}_{27\times 27},$$
where $A = I_3 - \left(\mu_\omega\lfloor R_{\mu_\omega c}^T\hat{\omega}_i\rfloor_\times + (1-\mu_\omega)\lfloor\hat{\omega}_c\rfloor_\times\right)\Delta t$.
Following Equations (28) and (29), $Q_d$ can be obtained from $G_c$, which is derived according to $\partial\dot{\tilde{X}}/\partial n_X$ and written as follows:
$$G_c = \begin{bmatrix}
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
I_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & -\hat{R}_i & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & -I_3 & 0_3 & 0_3\\
0_3 & 0_3 & I_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & I_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3
\end{bmatrix}_{27\times 18}.$$

3.1.2. Measurement of the SR-CICEKF

For an autonomous robot, the keyframes of the visual measurement can provide high localization accuracy thanks to post-batch optimization with visual features. Let the measurement vector of the SR-CICEKF at the $k$-th step be $z_k$, with its error form $\tilde{z}_k = \left[\, \tilde{z}_p^T \;\; \tilde{\bar{z}}_q^T \;\; \tilde{z}_v^T \;\; \tilde{z}_\omega^T \,\right]^T$, where $\tilde{z}_p$ is the error of the linear position, the quaternion $\tilde{\bar{z}}_q$ is the error of the rotation, $\tilde{z}_v$ is the error of the linear velocity, and $\tilde{z}_\omega$ is the error of the angular velocity. The measurement noise of the variables is written compactly as $n_m = \left[\, n_p^T \;\; n_\theta^T \;\; n_v^T \;\; n_\omega^T \,\right]^T$.
Following $\tilde{x} = x - \hat{x}$ and $\delta\bar{q} = \hat{\bar{q}}^{\,*}\otimes\bar{q}$, the expressions of $\tilde{z}_p$, $\tilde{\bar{z}}_q$, and $\tilde{z}_v$ in terms of the variables in the error state vector of the SR-CICEKF can be deduced as follows:
$$\begin{aligned}
\tilde{z}_p &= \Delta p_{ic} + \hat{R}_{ic}\lfloor p_{ic}\rfloor_\times\delta\theta_{ic} + n_p,\\
\tilde{\bar{z}}_q &= \hat{\bar{q}}_{ic}^{\,*}\otimes\bar{q}_{ci}\otimes\bar{q}_c = \delta\bar{q}_{ic} \approx \begin{bmatrix}1\\ \tfrac{1}{2}\delta\theta_{ic}\end{bmatrix} + n_\theta,\\
\tilde{z}_v &= -\mu_v\Delta t\,\hat{R}_i\lfloor\hat{a}_i\rfloor_\times\delta\theta_i - \mu_v\Delta t\,\hat{R}_i\Delta b_{ai} + (1-\mu_v)R_{ci}\Delta v_c + n_v.
\end{aligned}$$
In particular, the angular velocities of the two sensors cannot be directly subtracted, since they do not share the same coordinate frame. Therefore, $\tilde{z}_\omega$ is obtained indirectly from Equation (15) as follows:
$$\tilde{z}_\omega = \delta\dot{\theta}_{ic}\Delta t + n_\omega = -\Delta t\left(\mu_\omega\lfloor R_{\mu_\omega c}^T\hat{\omega}_i\rfloor_\times + (1-\mu_\omega)\lfloor\hat{\omega}_c\rfloor_\times\right)\delta\theta_{ic} + (1-\mu_\omega)\Delta t\,\Delta\omega_c + \mu_\omega\Delta t\,R_{\mu_\omega c}^T\Delta\omega_i + n_\omega.$$
The measurement matrix $H_k$ can be recovered from $\tilde{z}_k \approx H_k\tilde{X}_k + n_m$ and is written as follows:
$$H_k = \begin{bmatrix}
I_3 & 0_3 & 0_3 & \hat{R}_{ic}\lfloor p_{ic}\rfloor_\times & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & \tfrac{1}{2}I_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & \Delta t(1-\mu_v)R_{ci} & 0_3 & 0_3 & -\mu_v\Delta t\,\hat{R}_i\lfloor\hat{a}_i\rfloor_\times & 0_3 & 0_3 & -\mu_v\Delta t\,\hat{R}_i & 0_3\\
0_3 & 0_3 & 0_3 & B & 0_3 & (1-\mu_\omega)\Delta t\,I_3 & \mu_\omega\Delta t\,R_{\mu_\omega c}^T & 0_3 & 0_3
\end{bmatrix}_{12\times 27},$$
where $B = -\Delta t\left(\mu_\omega\lfloor R_{\mu_\omega c}^T\hat{\omega}_i\rfloor_\times + (1-\mu_\omega)\lfloor\hat{\omega}_c\rfloor_\times\right)$.

3.2. Propagation and Measurement of the MR-CICEKF

The processes of the propagation and measurement of the MR-CICEKF are similar to those of the SR-CICEKF and focus on relative motion estimation.
Following the deduction of the propagation matrix from Equations (27) and (28), let the noise vector of the state variables of the MR-CICEKF be $n_r = \left[\, n_{v_{i0}}^T \;\; n_{v_{i1}}^T \;\; n_{\omega_{i0}}^T \;\; n_{\omega_{i1}}^T \,\right]^T$. The discrete relative state transition matrix is written as follows:
$$F_{dr} = \begin{bmatrix}
I_3 - \lfloor\hat{\omega}_{i0}\rfloor_\times\Delta t & -\hat{R}_r^T\Delta t & \hat{R}_r^T\Delta t & \lfloor\hat{R}_r^T\left(v_{i1}-v_{i0}\right)\rfloor_\times\Delta t & \lfloor\hat{p}_r\rfloor_\times\Delta t & 0_3\\
0_3 & I_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & I_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & I_3 - \lfloor\hat{\omega}_{i1}\rfloor_\times\Delta t & -\hat{R}_r^T\Delta t & \Delta t\,I_3\\
0_3 & 0_3 & 0_3 & 0_3 & I_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 & I_3
\end{bmatrix}_{18\times 18}.$$
The discrete noise covariance matrix is $Q_{dr} = \int_t^{t+\Delta t} F_{dr}(\tau)\,G_{cr}Q_{cr}G_{cr}^T\,F_{dr}(\tau)^T\,d\tau$, where
$$G_{cr} = \begin{bmatrix}
0_3 & 0_3 & 0_3 & 0_3\\
I_3 & 0_3 & 0_3 & 0_3\\
0_3 & I_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & I_3 & 0_3\\
0_3 & 0_3 & 0_3 & I_3
\end{bmatrix}_{18\times 12}.$$
By defining the measurement noise vector of the MR-CICEKF as $n_{rm} = \left[\, n_{rp}^T \;\; n_{r\theta}^T \,\right]^T$ and letting the measurement error of the MR-CICEKF at the $k$-th step be $\tilde{z}_k^r = \left[\, \tilde{z}_{rp}^T \;\; \tilde{z}_{rq}^T \,\right]^T$, the following is obtained:
$$\tilde{z}_{rp} = {}^{1}p_c - {}^{0}p_c - \hat{p}_r + n_{rp},\qquad \tilde{z}_{rq} = \delta\bar{q}_r = {}^{0}\bar{q}_c^{\,*}\otimes{}^{1}\bar{q}_c\otimes\hat{\bar{q}}_r^{\,*} \approx \begin{bmatrix}1\\ \tfrac{1}{2}\delta\theta_r\end{bmatrix} + n_{r\theta}.$$
By inspecting the relationship between the measurement and the desired output of the filter through $\tilde{z}_k^r \approx H_k^r\tilde{Y}_k + n_{rm}$, the measurement matrix $H_k^r$ can be deduced as follows:
$$H_k^r = \begin{bmatrix}
I_3 & 0_3 & 0_3 & 0_3 & 0_3 & 0_3\\
0_3 & 0_3 & 0_3 & \tfrac{1}{2}I_3 & 0_3 & 0_3
\end{bmatrix}_{6\times 18}.$$
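Because $H_k^r$ is sparse and constant, it can be assembled once. A minimal sketch (NumPy), assuming the 18-element error-state ordering $\left[\Delta p_r, \Delta v_{i0}, \Delta v_{i1}, \delta\theta_r, \Delta\omega_{i0}, \Delta\omega_{i1}\right]$:

```python
import numpy as np

def mr_measurement_matrix():
    """6x18 measurement matrix: identity on the relative-translation error and
    0.5*I on the relative-attitude (small-angle) error, zeros elsewhere."""
    H = np.zeros((6, 18))
    H[0:3, 0:3] = np.eye(3)          # relative translation block
    H[3:6, 9:12] = 0.5 * np.eye(3)   # relative attitude block
    return H
```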

3.3. Entire CICEKF Process

From the analysis above, it can be concluded that the CICEKF has two major EKF-based loops to update and correct the relevant states. The first loop, i.e., the SR-CICEKF, is shown in Algorithm 1, and the second loop, i.e., the MR-CICEKF, is shown in Algorithm 2, where $R_k$ and $R_k^r$ are the measurement noise matrices pre-established as described in [32,33], $K_k$ and $K_k^r$ are the Kalman gains, $P_{k+1|k}$ and $P_{k+1|k}^r$ are the prior error covariance matrices, and $P_{k+1|k+1}$ and $P_{k+1|k+1}^r$ are the posterior error covariance matrices.
The first loop governs the propagation, measurement, and update of a single robot’s motion state by fusing the measurement data of the stereo vision and the IMU at the velocity level. The second loop focuses on the estimation of the relative positions and rotations of multiple slave robots with respect to the master robot. This dual-loop design keeps either algorithm independent for application in different motion estimation situations. When the two algorithms work serially to yield $\hat{X}_{k+1}$ and $\hat{Y}_{k+1}$ successively, the entire algorithm complexity can be restricted to $O(n)$.
Algorithm 1: SR-CICEKF
Input: $\dot{\tilde{X}}_k$, $\hat{X}_k$, $H_k$;
1 While true do
2   Update $F_d$, $Q_d$ according to $\dot{\tilde{X}}_k$ and $\hat{X}_k$; (Propagation process in Section 3.1.1)
3   Update $P_{k+1|k} = F_d P_{k|k} F_d^T + Q_d$;
4   Update $K_k = P_{k+1|k} H_k^T\left(H_k P_{k+1|k} H_k^T + R_k\right)^{-1}$;
5   Calculate $\tilde{z}_k$ according to $H_k$; (Measurement process in Section 3.1.2)
6   Update the current state $\hat{X}_{k+1|k+1} = \hat{X}_{k+1|k} + \hat{\tilde{X}}_k$;
7   Update $P_{k+1|k+1} = \left(I_{27\times 27} - K_k H_k\right)P_{k+1|k}\left(I_{27\times 27} - K_k H_k\right)^T + K_k R_k K_k^T$;
8   $k = k + 1$;
9 Output: $\hat{X}_{k+1}$.
Algorithm 2: MR-CICEKF
Input: $\dot{\tilde{Y}}_k$, $\hat{Y}_k$, $H_k^r$, $^{0}\hat{X}_k$, $^{1}\hat{X}_k$;
1 While true do
2   Update $\hat{Y}_k$ according to $^{0}\hat{X}_k$ and $^{1}\hat{X}_k$;
3   Update $F_{dr}$, $Q_{dr}$ according to $\dot{\tilde{Y}}_k$ and $\hat{Y}_k$; (Propagation process in Section 3.2)
4   Update $P_{k+1|k}^r = F_{dr} P_{k|k}^r F_{dr}^T + Q_{dr}$;
5   Update $K_k^r = P_{k+1|k}^r (H_k^r)^T\left(H_k^r P_{k+1|k}^r (H_k^r)^T + R_k^r\right)^{-1}$;
6   Calculate $\tilde{z}_k^r$ according to $H_k^r$; (Measurement process in Section 3.2)
7   Update the current state $\hat{Y}_{k+1|k+1} = \hat{Y}_{k+1|k} + \hat{\tilde{Y}}_k$;
8   Update $P_{k+1|k+1}^r = \left(I_{18\times 18} - K_k^r H_k^r\right)P_{k+1|k}^r\left(I_{18\times 18} - K_k^r H_k^r\right)^T + K_k^r R_k^r (K_k^r)^T$;
9   $k = k + 1$;
10 Output: $\hat{Y}_{k+1}$.
When updating each current state, the quaternion error is recovered from the angular error using the following equations [28]:
$$\delta\hat{q}_k \approx \tfrac{1}{2}\delta\hat{\theta}_k,\qquad \hat{\bar{q}}_{k+1} = \begin{cases}\left[\sqrt{1-\delta\hat{q}_k^T\delta\hat{q}_k},\ \delta\hat{q}_k^T\right]^T & \text{if}\ \ \delta\hat{q}_k^T\delta\hat{q}_k \le 1,\\[4pt] \dfrac{1}{\sqrt{1+\delta\hat{q}_k^T\delta\hat{q}_k}}\left[1,\ \delta\hat{q}_k^T\right]^T & \text{if}\ \ \delta\hat{q}_k^T\delta\hat{q}_k > 1.\end{cases}$$
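The measurement-update steps of both loops share the same error-state EKF pattern. The sketch below combines the gain computation, the Joseph-form covariance update, and the quaternion recovery above (NumPy); the index slice and the way the nominal quaternion is passed in are illustrative assumptions, not the authors’ data structures:

```python
import numpy as np

def error_state_update(P_pred, H, R_meas, z_tilde, q_nom, theta_slice, quat_mul):
    """One EKF measurement update followed by quaternion error injection."""
    S = H @ P_pred @ H.T + R_meas
    K = P_pred @ H.T @ np.linalg.inv(S)                  # Kalman gain
    dx = K @ z_tilde                                     # estimated error state
    I_KH = np.eye(P_pred.shape[0]) - K @ H
    P_post = I_KH @ P_pred @ I_KH.T + K @ R_meas @ K.T   # Joseph form

    dq_vec = 0.5 * dx[theta_slice]                       # delta_q ≈ 0.5 * delta_theta
    nrm = dq_vec @ dq_vec
    if nrm <= 1.0:
        delta_q = np.concatenate(([np.sqrt(1.0 - nrm)], dq_vec))
    else:
        delta_q = np.concatenate(([1.0], dq_vec)) / np.sqrt(1.0 + nrm)
    q_new = quat_mul(q_nom, delta_q)                     # inject the attitude error into the nominal quaternion
    return dx, P_post, q_new
```

Here, quat_mul is the Hamilton-product helper from the earlier sketch; vector-valued parts of the nominal state are corrected additively with the matching entries of dx.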

4. Nonlinear Observability Analysis

4.1. Observability Analysis of the SR-CICEKF

When the filter is treated as a nonlinear multi-input, multi-output system, a prerequisite for the filter to achieve a correct estimation is that the system is observable. Researchers have proven that a filter possessing local weak observability can function properly, which is equivalent to the constructed observability matrix having full rank [21,34]. To simplify the discussion, a virtual measured angular velocity $\omega_{mic}$ and its bias $b_{\omega ic}$ are assumed, partially coupled from the IMU and visual measurements. Since $\bar{q}_{ic}$ is in the filter state and is thus treated as paired with $\omega_{mic}$, this assumption does not affect the observability analysis. Thus, following Appendix A, the nonlinear system of the SR-CICEKF can be expressed as follows:
$$f(X) = \dot{X} = f_0 + f_1\omega_{mi} + f_2\omega_{mic} + f_3 a_{mi},$$
where, for a general unit quaternion $\bar{q}$, $\Xi(\bar{q}) = \left[\, -q \;\;\; q_0 I_{3\times 3} + \lfloor q\rfloor_\times \,\right]^T$ [21], and
$$\begin{aligned}
f_0 &= \left[\, \left(\mu_v v_i + (1-\mu_v)R_{ci}v_c\right)^T \;\; 0_{1\times 3} \;\; \left(-R_i b_{ai} - g\right)^T \;\; \left(-\tfrac{1}{2}\Xi(\bar{q}_{ic})b_{\omega ic}\right)^T \;\; \left(-\tfrac{1}{2}\Xi(\bar{q}_i)b_{\omega i}\right)^T \;\; 0_{1\times 12} \,\right]^T,\\
f_1 &= \left[\, 0_{3\times 13} \;\; \tfrac{1}{2}\Xi(\bar{q}_i)^T \;\; 0_{3\times 12} \,\right]^T,\\
f_2 &= \left[\, 0_{3\times 9} \;\; \tfrac{1}{2}\Xi(\bar{q}_{ic})^T \;\; 0_{3\times 16} \,\right]^T,\\
f_3 &= \left[\, 0_{3\times 6} \;\; R_i^T \;\; 0_{3\times 20} \,\right]^T.
\end{aligned}$$
The measurement functions for the SR-CICEKF are designed as $h(X) = \left[\, h_1^T, \ldots, h_8^T \,\right]^T$, with $h_1 = \left(p_{ic} - R_{ic}p_{ic}\right)\lambda$, $h_2 = \bar{q}_{ic}$, $h_3 = \bar{q}_i$, $h_4 = \bar{q}_{ic}^T\bar{q}_{ic}$, $h_5 = \bar{q}_i^T\bar{q}_i$, $h_6 = v_c$, $h_7 = \omega_c$, and $h_8 = \omega_i$.
Following the Lie derivative rules in Appendix A and in [21], the observability matrix is as follows:
$$\Omega = \begin{bmatrix}
L^0 h_1\\ L^0 h_2\\ L^0 h_3\\ L^0 h_4\\ L^0 h_5\\ L^0 h_6\\ L^0 h_7\\ L^0 h_8\\ L_{f_0}^1 h_1\\ L_{f_0}^1 h_3\\ L_{f_0}^2 h_1
\end{bmatrix} = \begin{bmatrix}
I_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & G_{[1,4]} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3}\\
0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & I_{4\times 4} & 0_{4\times 4} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3}\\
0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 4} & I_{4\times 4} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3}\\
0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3} & G_{[4,4]} & 0_{1\times 4} & 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3}\\
0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 4} & G_{[5,4]} & 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3}\\
0_{3\times 3} & I_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 4} & I_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 4} & 0_{3\times 3} & I_{3\times 3} & 0_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & G_{[9,2]} & \mu_v I_{3\times 3} & G_{[9,4]} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3} & G_{[9,8]} & 0_{3\times 3}\\
0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & 0_{4\times 4} & 0_{4\times 4} & G_{[10,6]} & 0_{4\times 3} & 0_{4\times 3} & 0.5\,\Xi(\bar{q}_i)\\
0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & G_{[11,4]} & G_{[11,5]} & 0_{3\times 3} & 0_{3\times 3} & \mu_v R_i & G_{[11,9]}
\end{bmatrix}_{31\times 29},$$
where the block columns correspond to $\left(p_{ic}, v_c, v_i, \bar{q}_{ic}, \bar{q}_i, \omega_c, \omega_i, b_a, b_\omega\right)$,
where the matrices G with the row and column indexes as subscripts do not contribute to the rank analysis.
To analyze the column rank, block Gaussian elimination can be applied: once a block is an identity matrix, the corresponding block column can be eliminated. Thus, only the last two block columns are needed for the analysis, which simplifies to
$$\Omega' = \begin{bmatrix} 0 & 0.5\,\Xi(\bar{q}_i)\\ \mu_v R_i & G_{[11,9]} \end{bmatrix}.$$
Therefore, if $\mu_v$ is non-zero and any axis of the IMU is excited in any direction, $\Omega'$ has full column rank, and $\Omega$ thus has full column rank. This means that the SR-CICEKF has local weak observability [21,34]. These full-rank conditions are easily fulfilled in practice.
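Numerically, the weak-observability condition reduces to a column-rank test, which can be checked directly; the toy blocks below only illustrate the structure of $\Omega'$ and are not the paper’s values:

```python
import numpy as np

def has_full_column_rank(M, tol=1e-9):
    """Numerical full-column-rank check used for the weak-observability condition."""
    return np.linalg.matrix_rank(M, tol=tol) == M.shape[1]

# toy illustration of the simplified two-block-column matrix Omega'
mu_v, R_i = 0.5, np.eye(3)
Xi = np.vstack([-np.array([0.1, 0.2, 0.3]), 0.99 * np.eye(3)])   # placeholder 4x3 Xi(q) block
omega_prime = np.block([[np.zeros((4, 3)), 0.5 * Xi],
                        [mu_v * R_i,       np.zeros((3, 3))]])    # the G block is taken as zero here
print(has_full_column_rank(omega_prime))                          # True when mu_v != 0
```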

4.2. Observability Analysis of the MR-CICEKF

Resembling the previous analysis, the nonlinear system representing the measurement results of the MR-CICEKF can be expressed as follows:
$$f(Y) = \dot{Y} = \left[\, \dot{p}_r^T \;\; 0_{1\times 3} \;\; 0_{1\times 3} \;\; \dot{\bar{q}}_r^T \;\; 0_{1\times 3} \;\; 0_{1\times 3} \,\right]^T.$$
The measurement functions for the MR-CICEKF are designed as $h_r(Y) = \left[\, h_{r1}^T, \ldots, h_{r6}^T \,\right]^T$, with $h_{r1} = p_r$, $h_{r2} = \bar{q}_r$, $h_{r3} = v_{i0}$, $h_{r4} = v_{i1}$, $h_{r5} = \omega_{i0}$, and $h_{r6} = \omega_{i1}$.
After applying Lie derivative rules, the observability matrix of the MR-CICEKF system is constructed as follows:
$$\Theta = \begin{bmatrix}
L^0 h_{r1}\\ L^0 h_{r2}\\ L^0 h_{r3}\\ L^0 h_{r4}\\ L^0 h_{r5}\\ L^0 h_{r6}
\end{bmatrix} = \begin{bmatrix}
I_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3}\\
0_{4\times 3} & 0_{4\times 3} & 0_{4\times 3} & I_{4\times 4} & 0_{4\times 3} & 0_{4\times 3}\\
0_{3\times 3} & I_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & 0_{3\times 3} & I_{3\times 3} & 0_{3\times 4} & 0_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & I_{3\times 3} & 0_{3\times 3}\\
0_{3\times 3} & 0_{3\times 3} & 0_{3\times 3} & 0_{3\times 4} & 0_{3\times 3} & I_{3\times 3}
\end{bmatrix}_{19\times 19},$$
where the block columns correspond to $\left(p_r, v_{i0}, v_{i1}, \bar{q}_r, \omega_{i0}, \omega_{i1}\right)$.
All the block columns in Θ have full column rank when the single-robot measurement yields non-zero outcomes, which means that the system has local weak observability [21,34].

5. Data Test

Two sets of motion simulations were performed to test the feasibility of the CICEKF algorithm. The motion curve was chosen as quintic, which is convenient for calculating the linear accelerations.

5.1. SR-CICEKF Simulation Test

To simplify the simulation for an individual sensor suite, the coordinate frames of the IMU and the stereo vision are assumed to coincide, and the biases are omitted. Once the parameters of the quintic curve are chosen, the position coordinates of the curve are treated as the position ground truth. By analyzing the projections of the curve onto the three standard planes, the roll, pitch, and yaw orientations of adjacent points can be obtained from the assigned starting and ending points, which yields the orientation ground truth. The ideal linear acceleration, linear velocity, and angular velocity data are then obtained by differentiating the positions and orientations and are used as the filter input after being corrupted with Gaussian noise.
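A minimal sketch of how such simulation inputs can be generated (NumPy); the quintic coefficients and noise levels are illustrative, not the values used in the paper:

```python
import numpy as np

def quintic_inputs(t, coeffs, sigma_a=0.05, sigma_v=0.02, seed=0):
    """Ground-truth positions from a per-axis quintic polynomial, analytic velocity and
    acceleration, and noise-corrupted copies to feed the filter.

    t      : (N,) time stamps
    coeffs : (6, 3) polynomial coefficients, one column per axis (p = c0 + c1 t + ... + c5 t^5)
    """
    rng = np.random.default_rng(seed)
    p = np.vander(t, 6, increasing=True) @ coeffs
    d1 = np.polynomial.polynomial.polyder(coeffs, 1)      # velocity coefficients, (5, 3)
    d2 = np.polynomial.polynomial.polyder(coeffs, 2)      # acceleration coefficients, (4, 3)
    v = np.vander(t, 5, increasing=True) @ d1
    a = np.vander(t, 4, increasing=True) @ d2
    v_noisy = v + sigma_v * rng.standard_normal(v.shape)
    a_noisy = a + sigma_a * rng.standard_normal(a.shape)
    return p, v_noisy, a_noisy
```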
The position and orientation curves of the generated quintic trajectory along all three axes are shown in Figure 3 and Figure 4, respectively. It should be noted that the trajectory changes over a short time span of 10 s with a 1000 Hz update rate, which means the simulation data change quite quickly; for example, the maximum angular velocity exceeds 6 rad/s, and the maximum linear acceleration exceeds 11 m/s².
$P_{k|k}$ becomes asymptotically stable after a complete run with randomized initial values. By reusing the stabilized $P_{k|k}$ in other runs, the initial convergence of the filter can be greatly accelerated. $R_k$ is designed as a symmetric matrix containing small random values. The coupling coefficients are set as $\mu_v = 0.5$ and $\mu_\omega = 0.5$. The initial position and orientation are both set as [0, 0, 0].
The three-dimensional position curves of the SR-CICEKF simulation are shown in Figure 5, with the circle mark as the starting point and the star markers as the ending points. The errors between the virtual measurement values and the filtered values of the position and orientation are shown in Figure 6 and Figure 7, respectively. The statistical results of RMSE, mean errors, and STD of the norm errors involving the three axes of the position and the orientation are listed in Table 2.
By inspecting the curves and the statistical values, it can be concluded that the SR-CICEKF converges quickly under severe data changes and does not require careful parameter tuning for the first estimate of $P_{k|k}$. This indicates that the filter is robust, and the observability of the fusion process is confirmed indirectly.

5.2. MR-CICEKF Simulation

The goal of the simulation is to estimate the relative translation and rotation between the master and slave robots. The MR-CICEKF simulation utilizes the same quintic data as the target trajectories of both the master and slave robots. The relative translation of the two robots is set as [1, 1, 1], and the relative rotation is set as [0, 0, 0]. The MR-CICEKF is tested independently of the SR-CICEKF; thus, the performance of the MR-CICEKF is not affected by any inaccuracy produced by the SR-CICEKF process. The configurations of $P_{k|k}^r$ and $R_k^r$ are similar to those of the SR-CICEKF simulation, and the simulation input is corrupted with Gaussian noise.
Figure 8 depicts the three-dimensional position curves of the MR-CICEKF simulation, containing the ground truth of the master and slave robots’ trajectories, and the filtered slave robot trajectory, which is achieved by adding the estimated p r and q ¯ r to the master trajectory. The circle marks are the starting points, and the star markers are the ending points. The statistical values of RMSE, mean errors, and STD of the norm errors involving the three axes of the position and the orientation obtained after comparing the ground truth with the filtered slave robot trajectories are listed in Table 3.
The statistical values indicate that the final error is on the level of the small white noise added. Thus, it can be concluded that the MR-CICEKF possesses good estimation accuracy, with the main error caused by omitting high-order terms during the state transition. This feature can significantly simplify further application, since the MR-CICEKF contributes little error to the entire CICEKF process.

5.3. Dataset Test

To further analyze the performance of the CICEKF in real-world scenarios, a test was performed with the EuRoC dataset. Only the SR-CICEKF part was tested, since the MR-CICEKF had already shown great performance. The state-of-the-art visual SLAM method ORB-SLAM V3 [16] was utilized as the black-box visual measurement algorithm. The SR-CICEKF runs at 200 Hz, synchronous with the IMU update rate of the dataset. By employing the EVO tools [35], the RPE (relative pose error) of the trajectories can be computed directly.
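For reference, the translational RPE used in the comparison can be stated compactly. The sketch below is a standalone re-implementation of the metric (the reported numbers were obtained with the EVO tools), assuming timestamp-associated 4×4 homogeneous pose matrices:

```python
import numpy as np

def translational_rpe(T_gt, T_est, delta=1):
    """RMS translational relative pose error over a fixed frame interval `delta`."""
    errors = []
    for i in range(len(T_gt) - delta):
        rel_gt = np.linalg.inv(T_gt[i]) @ T_gt[i + delta]
        rel_est = np.linalg.inv(T_est[i]) @ T_est[i + delta]
        err = np.linalg.inv(rel_gt) @ rel_est          # residual relative motion
        errors.append(np.linalg.norm(err[:3, 3]))
    return float(np.sqrt(np.mean(np.square(errors))))
```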
The stereo vision of ORB-SLAM V3 runs at 20 Hz, and all measurement frames were treated as keyframes. The coefficients were set as $\mu_v = 0.9$ and $\mu_\omega = 0.5$. The initial covariance matrix $P_{k|k}$ was tuned following the aforementioned strategy. The SR-CICEKF, stereo ORB-SLAM V3 and its IMU variant, and the stereo MSCKF [17] were run on the EuRoC room dataset under the same computer configuration. Figure 9 shows the comparison of the estimated trajectory of the SR-CICEKF and the ground truth. Figure 10 shows the dataset test scenarios of the algorithms separately. The RPE comparison results between each algorithm and the ground truth, obtained with the EVO tools, are listed in Table 4.
From the comparison, it can be concluded that fusing IMU information with the purely visual black-box measurement through the proposed SR-CICEKF algorithm greatly improves the accuracy. Although the accuracy of the approach does not reach that of the optimization-based method, the fusion computational complexity is as low as $O(n)$. This can benefit the fusion with other visual measurement algorithms to improve trajectory estimation performance without affecting computing efficiency. Furthermore, it can be combined with the MR-CICEKF for the simultaneous estimation of multiple agents’ trajectories and collaborative configurations.

6. Experiments

Two experiments were designed to focus the discussion on the feasibility of the proposed method. First, an Intel RealSense D435i, which contains a stereo camera pair and an IMU, was mounted on a wheeled robot performing locomotion. The SR-CICEKF estimated the locomotion trajectory, while the ground truth was measured by the NOKOV motion capture system. Then, the D435i camera data were utilized for collaborative locomotion estimation. The experimental scenarios (Supplementary Materials) are shown in Figure 11.
For both parts of the CICEKF, the black-box stereo visual algorithm is ORB-SLAM V3 operating at 30 Hz, while the IMU update rate is set to 200 Hz. The calibration of the cameras and IMUs was conducted using Kalibr tools [36] to obtain the intrinsic and extrinsic parameters and the fixed configuration of the integrated sensors.
During the experiment, we noticed that ORB-SLAM V3 and its IMU variant might not be stable under our experimental conditions, owing to the calibration accuracy and the lack of sufficient IMU excitation, since a slow ground vehicle was used as the motion platform. Therefore, to demonstrate the performance of the SR-CICEKF, the estimated trajectory was compared with those of stereo ORB-SLAM V3 and the stereo MSCKF. The coefficients were set as $\mu_v = 0.5$ and $\mu_\omega = 0.5$. The comparison of the measured trajectory and the ground truth is shown in Figure 12. The RPE comparison results between each algorithm and the ground truth in the real scenario are listed in Table 5.
For the MR-CICEKF experiment, the two D435i cameras were calibrated with Kalibr tools to achieve a fixed relative configuration as both the ground truth and the initial guess of the filter. The relative configuration estimation of the MR-CICEKF was compared with the ground truth. The norm errors of the relative configuration estimation are listed in Table 6.
From the experimental results, it can be concluded that the proposed algorithm can not only improve the accuracy of pure visual measurement algorithms but also reliably estimate the relative configuration of multiple robots.

7. Conclusions

This study introduces a novel motion estimation approach with a loosely coupled, dual-step structure for the autonomous and collaborative localization of multiple locomotive robots. Both steps of the algorithm can run standalone, serially, or in parallel while maintaining a computational complexity of $O(n)$, which allows pure vision measurement techniques to be improved conveniently once an IMU is introduced. Both the simulations and the experiments demonstrate that fusing data at the velocity level without post-optimization contributes greatly to the robot trajectory estimation accuracy. Owing to current experimental limitations, further research will focus on systematically examining how the coupling coefficients affect filter performance and stability across various integrated sensors and high-speed motion units.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s25103086/s1.

Author Contributions

Conceptualization, Z.L. and S.L.; methodology, C.L.; software, Z.L. and T.W.; validation, C.L.; formal analysis, C.L.; investigation, Z.L. and S.L.; resources, Z.L. and S.L.; data curation, C.L. and T.W.; writing—original draft preparation, Z.L. and C.L.; writing—review and editing, T.W. and S.L.; visualization, Z.L. and P.T.; supervision, S.L.; project administration, Z.L. and S.L.; funding acquisition, T.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was sponsored by the National Key Research and Development Program of China [Grant number 2022YFC3320503] and the Foundation for Innovative Research Groups of the National Natural Science Foundation of China [Grant number 12221002], funded by Tao Wang.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The source code presented in this study is available upon request from the corresponding author.

Acknowledgments

The authors acknowledge the anonymous reviewers for their helpful comments on the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Assuming that the rotation $\bar{q}$ is accomplished during a certain period and that the corresponding body angular velocity is $\omega$, $\dot{\bar{q}} = \frac{1}{2}\bar{q}\otimes\bar{\omega}$ holds, with $\bar{\omega} = [0, \omega^T]^T$ [28]. The $\mu$-th power of $\bar{q}$ is written as $\bar{q}^{\,\mu}$, which is a unit quaternion that scales the rotation angle around the same axis by $\mu \in [0,1]$ [30], and $\frac{d}{dt}\bar{q}^{\,\mu} = \frac{1}{2}\bar{q}^{\,\mu}\otimes(\mu\bar{\omega})$ holds [29].
Rotating a three-element vector $p$ by a rotation $\bar{q}$ is written as $\bar{q}\otimes\bar{p}\otimes\bar{q}^{\,*}$, which corresponds to $R(\bar{q})p$, with $\bar{p} = [0, p^T]^T$.
For general quaternions, the error between the measurement and the expectation is defined as $\delta\bar{q}$ through $\bar{q} = \hat{\bar{q}}\otimes\delta\bar{q}$. The derivative form is $\delta\dot{\bar{q}} = \hat{\bar{q}}^{\,*}\otimes\dot{\bar{q}} - \hat{\bar{q}}^{\,*}\otimes\dot{\hat{\bar{q}}}\otimes\delta\bar{q}$ [23].
According to $R(\delta\bar{q}) \approx I_3 + \lfloor\delta\theta\rfloor_\times$ [28], $R(\bar{q}) = \hat{R}(\hat{\bar{q}})R(\delta\bar{q}) \approx \hat{R}(\hat{\bar{q}})\left(I_3 + \lfloor\delta\theta\rfloor_\times\right)$ holds.
Assuming two quaternions $\bar{q} = [q_0, q^T]^T$ and $\bar{p} = [p_0, p^T]^T$, the quaternion multiplication can be written as follows [28]:
$$\bar{q}\otimes\bar{p} = \begin{bmatrix} q_0 & -q^T\\ q & q_0 I_3 + \lfloor q\rfloor_\times \end{bmatrix}\bar{p} = \begin{bmatrix} p_0 & -p^T\\ p & p_0 I_3 - \lfloor p\rfloor_\times \end{bmatrix}\bar{q}.$$
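The two multiplication matrices above translate directly into code; a minimal sketch (scalar-first Hamilton convention, NumPy; the function names are illustrative):

```python
import numpy as np

def left_quat_matrix(q):
    """4x4 matrix L(q) such that q ⊗ p = L(q) @ p (scalar-first quaternions)."""
    q0, x, y, z = q
    return np.array([[q0, -x, -y, -z],
                     [x,  q0, -z,  y],
                     [y,  z,  q0, -x],
                     [z, -y,  x,  q0]])

def right_quat_matrix(p):
    """4x4 matrix R(p) such that q ⊗ p = R(p) @ q."""
    p0, x, y, z = p
    return np.array([[p0, -x, -y, -z],
                     [x,  p0,  z, -y],
                     [y, -z,  p0,  x],
                     [z,  y, -x,  p0]])
```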
The Lie derivative of the measurement function $h(x)$ with respect to the vector field $f(x)$ is written as follows [21]:
$$L_f h(x) = \nabla_f h(x) = \frac{\partial h(x)}{\partial x}f(x).$$
The $k$-th-order Lie derivative of $h(x)$ with respect to $f(x)$ is written as follows:
$$L_f^k h(x) = \frac{\partial L_f^{k-1} h(x)}{\partial x}f(x).$$
In particular, the zeroth-order Lie derivative of $h(x)$ is the measurement function itself, i.e., $L^0 h(x) \equiv h(x)$.

References

  1. Servières, M.; Renaudin, V.; Dupuis, A.; Antigny, N. Visual and Visual-Inertial SLAM: State of the Art, Classification, and Experimental Benchmarking. J. Sens. 2021, 2021, 2054828. [Google Scholar] [CrossRef]
  2. Sun, Z.; Gao, W.; Tao, X.; Pan, S.; Wu, P.; Huang, H. Semi-tightly coupled robust model for GNSS/UWB/INS integrated positioning in challenging environments. Remote Sens. 2024, 16, 2108. [Google Scholar] [CrossRef]
  3. He, M.; Zhu, C.; Huang, Q.; Ren, B.; Liu, J. A review of monocular visual odometry. Vis. Comput. 2020, 36, 1053–1065. [Google Scholar] [CrossRef]
  4. Elamin, A.; El-Rabbany, A.; Jacob, S. Event-Based Visual/Inertial Odometry for UAV Indoor Navigation. Sensors 2024, 25, 61. [Google Scholar] [CrossRef] [PubMed]
  5. Hu, C.; Huang, P.; Wang, W. Tightly coupled visual-inertial-UWB indoor localization system with multiple position-unknown anchors. IEEE Robot. Autom. Lett. 2023, 9, 351–358. [Google Scholar] [CrossRef]
  6. Tonini, A.; Castelli, M.; Bates, J.S.; Lin, N.N.N.; Painho, M. Visual-Inertial Method for Localizing Aerial Vehicles in GNSS-Denied Environments. Appl. Sci. 2024, 14, 9493. [Google Scholar] [CrossRef]
  7. Nemec, D.; Šimák, V.; Janota, A.; Hruboš, M.; Bubeníková, E. Precise localization of the mobile wheeled robot using sensor fusion of odometry, visual artificial landmarks and inertial sensors. Robot. Auton. Syst. 2019, 112, 168–177. [Google Scholar] [CrossRef]
  8. Li, Z.; You, B.; Ding, L.; Gao, H.; Huang, F. Trajectory Tracking Control for WMRs with the Time-Varying Longitudinal Slippage Based on a New Adaptive SMC Method. Int. J. Aerosp. Eng. 2019, 2019, 4951538. [Google Scholar] [CrossRef]
  9. Tschopp, F.; Riner, M.; Fehr, M.; Bernreiter, L.; Furrer, F.; Novkovic, T.; Pfrunder, A.; Cadena, C.; Siegwart, R.; Nieto, J. Versavis—An open versatile multi-camera visual-inertial sensor suite. Sensors 2020, 20, 1439. [Google Scholar] [CrossRef]
  10. Reginald, N.; Al-Buraiki, O.; Choopojcharoen, T.; Fidan, B.; Hashemi, E. Visual-Inertial-Wheel Odometry with Slip Compensation and Dynamic Feature Elimination. Sensors 2025, 25, 1537. [Google Scholar] [CrossRef]
  11. Sun, X.; Zhang, C.; Zou, L.; Li, S. Real-Time Optimal States Estimation with Inertial and Delayed Visual Measurements for Unmanned Aerial Vehicles. Sensors 2023, 23, 9074. [Google Scholar] [CrossRef] [PubMed]
  12. Lajoie, P.-Y.; Beltrame, G. Swarm-slam: Sparse decentralized collaborative simultaneous localization and mapping framework for multi-robot systems. IEEE Robot. Autom. Lett. 2023, 9, 475–482. [Google Scholar] [CrossRef]
  13. Wang, S.; Wang, Y.; Li, D.; Zhao, Q. Distributed relative localization algorithms for multi-robot networks: A survey. Sensors 2023, 23, 2399. [Google Scholar] [CrossRef]
  14. Cai, Y.; Shen, Y. An integrated localization and control framework for multi-agent formation. IEEE Trans. Signal Process. 2019, 67, 1941–1956. [Google Scholar] [CrossRef]
  15. Yan, Y.; Zhang, B.; Zhou, J.; Zhang, Y.; Liu, X. Real-time localization and mapping utilizing multi-sensor fusion and visual–IMU–wheel odometry for agricultural robots in unstructured, dynamic and GPS-denied greenhouse environments. Agronomy 2022, 12, 1740. [Google Scholar] [CrossRef]
  16. Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.; Tardós, J.D. Orb-slam3: An accurate open-source library for visual, visual–inertial, and multimap slam. IEEE Trans. Robot. 2021, 37, 1874–1890. [Google Scholar] [CrossRef]
  17. Li, M.; Mourikis, A.I. High-precision, consistent EKF-based visual-inertial odometry. Int. J. Robot. Res. 2013, 32, 690–711. [Google Scholar] [CrossRef]
  18. Nguyen, T.H.; Nguyen, T.-M.; Xie, L. Flexible and resource-efficient multi-robot collaborative visual-inertial-range localization. IEEE Robot. Autom. Lett. 2021, 7, 928–935. [Google Scholar] [CrossRef]
  19. Tian, Y.; Chang, Y.; Arias, F.H.; Nieto-Granda, C.; How, J.P.; Carlone, L. Kimera-multi: Robust, distributed, dense metric-semantic slam for multi-robot systems. IEEE Trans. Robot. 2022, 38, 2022–2038. [Google Scholar] [CrossRef]
  20. Zhang, Z.; Zhao, J.; Huang, C.; Li, L. Learning visual semantic map-matching for loosely multi-sensor fusion localization of autonomous vehicles. IEEE Trans. Intell. Veh. 2022, 8, 358–367. [Google Scholar] [CrossRef]
  21. Kelly, J.; Sukhatme, G.S. Visual-inertial sensor fusion: Localization, mapping and sensor-to-sensor self-calibration. Int. J. Robot. Res. 2011, 30, 56–79. [Google Scholar] [CrossRef]
  22. Weiss, S.; Siegwart, R. Real-time metric state estimation for modular vision-inertial systems. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 4531–4537. [Google Scholar]
  23. Achtelik, M.W.; Weiss, S.; Chli, M.; Dellaerty, F.; Siegwart, R. Collaborative stereo. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, 25–30 September 2011; pp. 2242–2248. [Google Scholar]
  24. Brossard, M.; Bonnabel, S.; Barrau, A. Invariant Kalman filtering for visual inertial SLAM. In Proceedings of the 2018 21st International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 2021–2028. [Google Scholar]
  25. Sun, W.; Li, Y.; Ding, W.; Zhao, J. A novel visual inertial odometry based on interactive multiple model and multistate constrained Kalman filter. IEEE Trans. Instrum. Meas. 2023, 73, 5000110. [Google Scholar] [CrossRef]
  26. Fornasier, A.; Ng, Y.; Mahony, R.; Weiss, S. Equivariant filter design for inertial navigation systems with input measurement biases. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 4333–4339. [Google Scholar]
  27. van Goor, P.; Mahony, R. An equivariant filter for visual inertial odometry. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 14432–14438. [Google Scholar]
  28. Trawny, N.; Roumeliotis, S.I. Indirect Kalman Filter for 3D Attitude Estimation; University of Minnesota, Department of Computer Science & Engineering, Technical Report: Minneapolis, MN, USA, 2005; Volume 2. [Google Scholar]
  29. Liu, C.; Wang, T.; Li, Z.; Tian, P. A Novel Real-Time Autonomous Localization Algorithm Based on Weighted Loosely Coupled Visual–Inertial Data of the Velocity Layer. Appl. Sci. 2025, 15, 989. [Google Scholar] [CrossRef]
  30. Dam, E.B.; Koch, M.; Lillholm, M. Quaternions, Interpolation and Animation; Datalogisk Institut, Københavns Universitet: Copenhagen, Denmark, 1998; Volume 2. [Google Scholar]
  31. Maybeck, P. Stochastic Models, Estimation, and Control; Academic Press: Cambridge, MA, USA, 1982. [Google Scholar]
  32. Beder, C.; Steffen, R. Determining an initial image pair for fixing the scale of a 3d reconstruction from an image sequence. In Proceedings of the Joint Pattern Recognition Symposium, Hong Kong, China, 17–19 August 2006; pp. 657–666. [Google Scholar]
  33. Eudes, A.; Lhuillier, M. Error propagations for local bundle adjustment. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 2411–2418. [Google Scholar]
  34. Hermann, R.; Krener, A. Nonlinear controllability and observability. IEEE Trans. Autom. Control 1977, 22, 728–740. [Google Scholar] [CrossRef]
  35. Grupp, M. evo: Python Package for the Evaluation of Odometry and SLAM. Available online: https://github.com/MichaelGrupp/evo (accessed on 5 April 2025).
  36. Rehder, J.; Nikolic, J.; Schneider, T.; Hinzmann, T.; Siegwart, R. Extending kalibr: Calibrating the extrinsics of multiple IMUs and of individual axes. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 4304–4311. [Google Scholar]
Figure 1. Relationship of multi-robot coordinate frames.
Figure 2. Relationships among the variables in the CICEKF state vector.
Figure 3. Position curves of the generated quintic trajectory.
Figure 4. Orientation curves of the generated quintic trajectory.
Figure 5. Three-dimensional position curves of the SR-CICEKF simulation.
Figure 6. Position error curves of the SR-CICEKF simulation.
Figure 7. Orientation error curves of the SR-CICEKF simulation.
Figure 8. Three-dimensional position curves of the MR-CICEKF simulation.
Figure 9. Comparison of the estimation trajectories of the SR-CICEKF and the ground truth.
Figure 10. Dataset test scenarios: (a) stereo ORB-SLAM V3, (b) stereo ORB-SLAM V3 with the IMU, and (c) stereo MSCKF.
Figure 11. Experimental scenarios.
Figure 12. Comparison of the estimated trajectory and the ground truth of the SR-CICEKF.
Table 1. Coordinate frames and notations.
Symbol | Description
$w$ | Coordinate frame of the fixed world
$i$ | Coordinate frame attached to the IMU’s rigid body
$c$ | Coordinate frame attached to the rigid body of the dual-camera stereo vision
$ic$ | Coordinate frame attached to the virtual rigid body of the IMU-aided camera system
$x_{AB}$ | $x$ represents a general variable vector; $A$ and $B$ are the reference coordinate frames, e.g., $p_{AB}$ denotes the linear translation of the coordinate frame $B$ measured with respect to the coordinate frame $A$
$^{CD}x$ | $x$ represents a general variable vector; $C$ and $D$ are the robot indexes, e.g., $^{CD}p$ denotes the linear translation of the $D$-th robot measured with respect to the $C$-th robot; in particular, $^{C}x = {}^{CC}x$
$\lfloor x\rfloor_\times$ | Skew-symmetric matrix of $x$, for which $\lfloor x\rfloor_\times y = x\times y$ holds [28]
$p$ | The linear translation vector of a rigid body along the 3 axes, whose quasi-quaternion description is $\bar{p} = [0, p^T]^T$
$\bar{q}$ | The unit quaternion following the Hamilton notation [21], written as $\bar{q} = [q_0, q_1, q_2, q_3]^T = [q_0, q^T]^T$
$\bar{q}^*$ | The conjugate of $\bar{q}$, for which $\bar{q}\otimes\bar{q}^* = 1$ holds
$b$ | The uncertain bias of the measurement result
$R$ | The equivalent rotation matrix of the quaternion $\bar{q}$, e.g., $R_{wi} = R(\bar{q}_{wi})$
$n$ | White Gaussian noise vector with zero mean and covariance $\sigma^2$
$\dot{x}$, $\hat{x}$ | The first-order time derivative and the estimated form of the vector $x$, respectively
$\tilde{x}$ | The error form of the vector $x$, defined as $\tilde{x} = x - \hat{x}$
$\delta\bar{q}$ | The error of the quaternion $\bar{q}$, for which $\delta\bar{q} = \hat{\bar{q}}^*\otimes\bar{q}$ holds
Table 2. The norm errors between the ground truth and the SR-CICEKF simulation results.
Translation RMSE | Translation Mean Error | Translation STD | Orientation RMSE | Orientation Mean Error | Orientation STD
0.1593 m | 0.0429 m | 0.1535 m | 0.106 rad | 0.0187 rad | 0.1044 rad
Table 3. The norm errors between the ground truth and the filtered slave robot trajectories.
Translation RMSE | Translation Mean Error | Translation STD | Orientation RMSE | Orientation Mean Error | Orientation STD
0.01872 m | 5.286 × 10⁻⁶ m | 0.0187 m | 0.0016 rad | 4.95 × 10⁻⁷ rad | 0.0016 rad
Table 4. Norm errors of the translations between the ground truth and the results of the dataset tests.
Method | Translation RMSE | Translation Mean Error | Translation STD
SR-CICEKF | 0.004211 m | 0.003982 m | 0.001371 m
Stereo ORB-SLAM V3 | 0.03844 m | 0.03428 m | 0.01739 m
Stereo ORB-SLAM V3 with the IMU | 0.003236 m | 0.002785 m | 0.001648 m
Stereo MSCKF | 0.0556 m | 0.048994 m | 0.02629 m
Table 5. Norm errors of the translations between the ground truth and the experimental results.
Method | Translation RMSE | Translation Mean Error | Translation STD
SR-CICEKF | 0.00459 m | 0.006448 m | 0.004528 m
Stereo ORB-SLAM V3 | 0.01723 m | 0.01275 m | 0.01159 m
Stereo MSCKF | 0.01745 m | 0.01292 m | 0.01173 m
Table 6. Norm errors of the relative configuration estimation.
Translation RMSE | Translation Mean Error | Translation STD | Orientation RMSE | Orientation Mean Error | Orientation STD
0.009398 m | 0.0001 m | 0.0094 m | 0.032 rad | 0.00602 rad | 0.03145 rad
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
