Next Article in Journal
Marine Online Platforms of Services to Public End-Users—The Innovation of the ODYSSEA Project
Previous Article in Journal
Detection of Multidecadal Changes in Vegetation Dynamics and Association with Intra-Annual Climate Variability in the Columbia River Basin
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Lie Group Modelling for an EKF-Based Monocular SLAM Algorithm

1
Thales LAS France, Rue Pierre Gilles de Gennes, 91470 Limours, France
2
Institut Supérieur de l’Aéronautique et de l’Espace (ISAE-SUPAERO), University of Toulouse, 31400 Toulouse, France
*
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(3), 571; https://doi.org/10.3390/rs14030571
Submission received: 29 November 2021 / Revised: 18 January 2022 / Accepted: 20 January 2022 / Published: 25 January 2022
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

:
This paper addresses the problem of monocular Simultaneous Localization And Mapping on Lie groups using fiducial patterns. For that purpose, we propose a reformulation of the classical camera model as a model on matrix Lie groups. Thus, we define an original-state vector containing the camera pose and the set of transformations from the world frame to each pattern, which constitutes the map’s state. Each element of the map’s state, as well as the camera pose, are intrinsically constrained to evolve on the matrix Lie group S E ( 3 ) . Filtering is then performed by an extended Kalman filter dedicated to matrix Lie groups to solve the visual SLAM process (LG-EKF-VSLAM). This algorithm has been evaluated in different scenarios based on simulated data as well as real data. The results show that the LG-EKF-VSLAM can improve the absolute position and orientation accuracy, compared to a classical EKF visual SLAM (EKF-VSLAM).

1. Introduction

The problem of simultaneously localizing a mobile robot and constructing a map from sensor measurements has arisen in many research works during the last decades [1,2]. Simultaneous Localization And Mapping (SLAM) can be seen as an estimation problem, where the state includes the pose of the robot and the position of the detected landmarks. This estimation problem is usually solved by one of the following two methods:
  • the bundle adjustment approach: this formulates the SLAM as an optimization problem, where the set of all measurements (or a part of) is used to estimate the desired system state [3,4];
  • the filtering approach: this is classically based on a Kalman filter and its variants (such as the extended Kalman filter, i.e., EKF). It provides an estimate of the state and its associated covariance matrix at each epoch.
When a visual sensor is used for SLAM, the approach is usually referred to as visual-SLAM (VSLAM) [5]. In this context, the classical landmarks are provided by well-known corner-detection algorithms, such as Harris [6], FAST [7], SIFT, BRIEF, SURF, or ORB. The use of such 2D features admit two main disadvantages:
(1)
the number of detected landmarks in each image can be high (≥500), making the map huge for big environments. This impacts directly the size of the state vector, which increases dramatically, making a filter-based method with 2D features not well-suited for long-term navigation;
(2)
each measurement must be associated to each detection with a data association algorithm [8]. This step remains a challenge and false matching can cause the filter to diverge. Moreover, it can be difficult to treat it from a computational point of view.
To overcome these difficulties, we propose in this study the use of fiducial or coded markers as landmarks [9,10,11]. The main advantage of such patterns is that they reduce the number of landmarks while solving the association step, thanks to labelled detections. Moreover, even if it is required to instrument the environment with coded patterns, a priori knowledge of the absolute position of one of the artificial landmarks would allow precise absolute localization.
For this study, the chosen patterns are constituted of a constellation of circular shapes (see Figure 1) but can be changed for other rigid patterns such as ARTag [12] without loss of generality. By assuming that the patterns have a known given size L, the pose of a pattern in the global frame can be represented as a 3D affine transformation. The latter is composed of a rotation and a translation and can be associated to the set of matrices defined in the special Euclidean group S E ( 3 ) , which has the structure of a Lie group (LG). The aim of this article is to rewrite the camera model in order to express the camera pose, as well as the pose of each landmark, in S E ( 3 ) to reformulate the VSLAM problem as a recursive estimation problem on LGs.

1.1. Formulating VSLAM on Lie Groups

The problem of filtering on LGs has recieved a lot of attention in recent years, particularly in applications such as visual navigation and robotics to estimate the rotation and transformation matrices of a rigid body [13,14]. The interest of the LG framework is to overcome a complex Euclidean parametrization (as Euler angles or quaternions) by using a unified and elegant formalism, taking into account the geometric properties of these matrices. Consequently, the LG structure of the state vector admits advantages, compared to the classical VSLAM extended Kalman filter (EKF-VSLAM):
  • We can rewrite a compact camera model, allowing us to overcome the high non-linearities created by rotation matrices parametrization;
  • The analytical development of quantities of interest, such as Jacobian matrices, are intrinsic to LGs and are consequently less difficult to compute and to implement.
To apply filtering on LGs, several dedicated analytical algorithms have been developed in the literature, such as the Lie Group-Extended Kalman Filter (LG-EKF) [15], the Invariant Kalman filter [16,17], the Unscented Kalman filter (LG-UKF) [18], the information Kalman filter (LG-IKF) [19], but also Monte-Carlo filtering, such as particle filters [20]. In the context of SLAM, several works have been made. In [21], a VSLAM on LG is proposed and based on a LG-UKF. The unknown state is assumed to be lying on the LG S E 2 + p ( 3 ) . In [22], a SLAM on LGs is developed that combines LG-EKF and LG-IKF. In [23], a robust hybrid VSLAM is proposed to detect the loop closures: the map and the pose are constructed in a Euclidean framework, but the loop closures are determined by estimating 3D map similarities with a LG modelling. All of these techniques have in common the assumption that the map is constructed by estimating Euclidean parameters (3D points).
In this study, we propose a LG-EKF-VSLAM which estimates both the state and the map on LGs. The advantages of this method are that it intrinsically takes into account the geometry of the pose, and it copes with the high non-linearities introduced by the camera model. Consequently, the proposed approach should theoretically improve the pose-estimation performance.

1.2. Contribution

Our contribution is two-fold:
  • Theoretical: we define a new matrix state containing the camera pose and the pattern transformations on S E ( 3 ) , which constitutes the map. Consequently, the associated state space is S E ( 3 ) 1 + K ( 1 + K direct products of S E ( 3 ) ), where K is the number of current patterns;
  • Algorithmic: the newly detected patterns are initialized by a S E ( 3 ) transformation, obtained by minimizing a criterion through a Gauss–Newton algorithm on LGs.

1.3. Organization

This paper is organized as follows: the EKF-VSLAM associated with the camera observation model in the context of coded patterns is detailed in Section 2. In Section 3, a background on LGs is introduced, and the considered camera model is detailed and rewritten on LG S E ( 3 ) . The proposed LG-VSLAM, based on LG-EKF, is then explained in Section 4. Section 5 is dedicated to the performance evaluation. The proposed algorithm is implemented and assessed with simulated and real data and compared to a classical Euclidean EKF-SLAM. Finally, conclusions and remarks are drawn in Section 6.

2. The Euclidean Approach

In this section, we present the classical Euclidean pinhole camera model. Particularly, we provide the different Euclidean parametrizations of unknown variables and the way the EKF-VSLAM is applied.

2.1. Coded Patterns

Consider a mobile robot equipped with a monocular camera which tries to localize itself by using coded patterns spread out in its close environment. In our case, each coded pattern is a 2D square surface, composed of four circles of the same size (see Figure 1). We define the center of the bottom left ellipse as the origin of the pattern’s local frame. In this frame, the 3 D coordinates { q i } i = 1 4 of the centers of each ellipse can be computed, knowing the inter-circle distance L. Consequently, every pattern j can be fully characterized by:
  • a rotation matrix between the local frame and a world frame, R W P ( j ) ;
  • the position of the origin of the local frame expressed in this world frame, p W P ( j ) ;
  • the distance L between the center of two ellipses;
  • an ID coded by the circular bar-codes.

2.2. Camera Observation Model

At each instant k, a fiducial pattern detector provides the 2D coordinates of the center of each ellipse, u k , i ( j ) , associated to the 3D point q i of pattern j in the world frame. According to the pinhole camera model, the reprojection function provides the mathematical relation linking { u k , i ( j ) } i = 1 4 and { q i , C ( j ) } i = 1 4 , such as:
u k , i ( j ) = f ( q i , C ( j ) ) 3 ν x q i , C ( j ) 1 + x 0 , ν y q i , C ( j ) 2 + y 0 + n k , i ( j ) ,
with f being the focal distance of the camera, ( x 0 , y 0 ) being the principal point of the image and parameters ν x and ν y , the pixel densities on the horizontal and the vertical axes, respectively. ( v ) l denotes the l t h component of vector v . The random process, n k , i ( j ) , is considered as white Gaussian noise on R 2 with covariance N , n k , i ( j ) N R 2 ( 0 , N ) . As illustrated in Figure 2, q i , C ( j ) explicitly depends on the rotation matrix R W C ( k ) and the rotation between the frame of the coded pattern and the world frame R W P ( j ) . In a classical Euclidean framework, they can be parametrized by:
  • Euler angles: the unknown variables are six angles { θ ( k ) , ϕ ( k ) , ψ ( k ) , θ ( j ) ϕ ( j ) , ψ ( j ) } , such as:
    R W C ( k ) = R x ( θ ( k ) ) R y ( ϕ ( k ) ) R z ( ψ ( k ) )
    R W P ( j ) = R x ( θ ( j ) ) R y ( ϕ ( j ) ) R z ( ψ ( j ) ) ,
    where R u denotes an elementary 3D rotation according to the axis u;
  • Quaternion: In this case R W C ( k ) and R W P ( j ) can be written as a complex combination of quaternion coefficients with the form z = a + b i + c j + d k .
Through this observation model and a given dynamic model, it is possible to estimate the global state x k containing:
  • the pose of the camera θ ( k ) , ϕ ( k ) , ψ ( k ) , p W C ( k ) R 6 ;
  • the set of parameters of the patterns { θ ( j ) , ϕ ( j ) , ψ ( j ) , p W P ( j ) } j = 1 K R 6 K , where K is the number of detected patterns,
by performing a classical EKF-VSLAM [24].

2.3. Why EKF-SLAM on Lie Groups?

Using the EKF-VSLAM with coded patterns and classical camera model has some drawbacks due to the high non-linearity of the observation model. Indeed, we observe that the unknown variables are linked to the observations by the composition of multiple inverse and trigonometric functions for the Euler parametrization. For the quaternion parametrization, the non-linearity is introduced by the composition of inverse and multiple square functions. Thus, the linearization in the update step of the EKF introduces errors. This can induce some numerical problems in the Jacobian calculation and degrade the performance of the estimation.
In the following, we propose to resolve the SLAM problem by performing the filtering on LGs. Indeed, this formalism enables us to overcome these problems. Particularly:
the camera observation model can be written in a compact way and depends “linearly” on the rotation matrix. Consequently, the approximation at order 1 is more valid;
The Jacobian of the observation model can be computed analytically by considering the geometrical structure of the rotation matrices and does not depend on any parametrization.

3. Background on Lie Groups

3.1. Elementary Notions

A LG G is a set which has the properties of a group and a smooth manifold [25,26]. In this work, we deal with matrix LGs; thus, every element of G is a matrix belonging to R n × n .
Due to its group structure, G has an internal composition law which is associative. Furthermore, there is an identity element defined, so that each element of the group has an inverse;
Due to its manifold structure, it is possible to compute the derivative and the integral of two elements, or the inverse of one element.
Furthermore, the properties of Lie Groups allow us to define a tangent vector space T M G at every point M G . It is then possible to associate any point of T I G to any point of T X G , thanks to the operator called left tangent application (see Figure 3). The latter is defined, between two tangent spaces, by:
L X Y L : T X G T Y G V Y X 1 V
Consequently, we can work with the tangent space to the identity I n × n , also called the Lie algebra, denoted as g . This space is a vector space with a dimension of m and defines the dimension of G as manifold.
We can establish a relation between G and g through the exponential mapping, Exp G : g G , and its reciprocal, the logarithm mapping, Log G : G g . They are locally bijective. The mathematical expressions of these operators can be found in [25,27,28].
On the other hand, g can be directly linked to the Euclidean space R m through an isomorphism . G and its inverse . G , such as a = a G and a = a G , a R m , a g . To condense these notations, we define a = Log G ( X ) = Log G ( X ) and X = Exp G ( a ) = Exp G a G , with X G and a R m . Figure 4 shows, geometrically, the relationship between a LG G, its Lie algebra g , and R m .

3.2. Non-Commutativity and Jacobian

3.2.1. Non-Commutativity

In general, a LG is non-commutative and ( a , b ) R m 2 :
Log G ( Exp G ( a ) Exp G ( b ) ) a + b .
The adjoint representation allows one to take into account the non-commutativity of LGs. It is an application A d G ( . ) : G R m × m , such that X G , a R m , X Exp G ( a ) = Exp G ( A d G ( X ) a ) X .
The non-commutativity of LGs can be also explicited through the Baker–Campbell–Hausdorff formula (BCH):
Log G ( Exp G ( a ) Exp G ( a + b ) ) = ϕ G ( a ) b + O ( b 2 ) . ,
and depends of the left Jacobian matrix, which is written:
ϕ G ( a ) = n = 0 + a d G ( a ) n ( n + 1 ) ! ,
where a d G : R m R m × m is an adjoint operator characterizing the non-commutativity directly on the Lie algebra.

3.2.2. Jacobian on LGs

We can define the notion of Jacobian according to a function. Let us consider f : G R an integrable function. We define LG Jacobian of f by the following expression:
D X f = f ( X Exp G ( δ ) ) δ δ = 0 ,
which can be seen as a generalization of the classical directional derivative.

3.3. Lie Groups of Interest

Two useful LGs in our application are introduced in this part.

3.3.1. The Special Orthogonal LG S O ( n )

The LG S O ( n ) is the group of rotation matrices of dimension n. Every element R of S O ( n ) belongs to the following set:
S O ( n ) = { R R n × n | R R = I , | R | = 1 } ,
where | R | is the determinant of matrix R . Its Lie algebra is the set of skew-symmetric matrices:
so ( n ) = { [ w ] × | w R m } ,
where w is the vector of R m isomorph to the element of [ w ] × . The notation [ . ] × refers to the skew-symmetric matrix operator. In the case of n = 3 , every element of S O ( 3 ) is a 3D rotation matrix, which can be classically parametrized in different ways (quaternions or Euler angles), as seen previously. With the LG formalism, we may parametrize it with an angle vector θ R m ( m = 3 ) . It is written, according to the Rodrigues formula [29], as:
R = I 3 × 3 + θ × θ sin ( θ ) + θ × 2 θ 2 1 cos ( θ ) .
Every element of its Lie algebra has the following structure:
[ θ ] × = 0 θ 3 θ 2 θ 3 0 θ 1 θ 2 θ 1 0 .

3.3.2. The Special Euclidean LG S E ( n )

The LG S E ( n ) corresponds to the semi-direct product between S O ( n ) and R n :
S E ( n ) = A = R x 0 1 × n 1 | R S O ( n ) , x R n .
Its Lie algebra is:
se ( n ) = [ w ] × u 0 1 × n 0 | w R n ( n 1 ) 2 , u R n .

3.4. Camera Observation Model on LGs

The observation camera model (1) can be written in a compact way on the LG S E ( 3 ) . Indeed, the camera rotation R W C ( k ) and the associated translation p W C ( k ) constitute the rigid-body transformation of the camera and has the following form:
X W C ( k ) = R W C ( k ) p W C ( k ) 0 1 × 3 1 S E ( 3 ) .
In a similar way for the landmarks, the rotation matrix R W P ( j ) and the associated translation p W P ( j ) are written as:
X W P ( j ) = R W P ( j ) p W P ( j ) 0 1 × 3 1 S E ( 3 ) .
According to Figure 2, we can express the successive S E ( 3 ) transformations between the camera frame C and the pattern frame P in the following way:
q i , W ( j ) , 1 = X W P ( j ) q i , 1
q i , C ( j ) , 1 = X W C ( k ) 1 q i , W ( j ) , 1 .
Consequently, the pinhole model (1) can be written as:
u k , i ( j ) , 1 = h ( X W C ( k ) , X W P ( j ) ) + n k , i ( j ) , 0 ,
where:
h ( X W C ( k ) , X W P ( j ) ) = K I 3 × 4 X W C ( k ) 1 X W P ( j ) q i , 1 I 3 × 4 X W C ( k ) 1 X W P ( j ) q i , 1 3 ,
and K corresponds to the camera’s calibration matrix:
K = ν x f 0 x 0 0 ν y f y 0 0 0 1 .
It is also interesting to note that the model can be written by substituting L:
u k , i ( j ) , 1 = h L ( X W C ( k ) , X W P ( j ) ; L ) + n k , i ( j ) , 0 ,
where:
h L ( X W C ( k ) , X W P ( j ) , L ) = K I 3 × 4 X W C ( k ) 1 X W P ( j ) L e i , 1 I 3 × 4 X W C ( k ) 1 X W P ( j ) L e i , 1 3
e 1 = 0 , 0 , 0   24 e 2 = 1 , 0 , 0   25 e 3 = 0 , 1 , 0   26 e 4 = 1 , 1 , 0   .    27
This reformulation allows us to define an inverse problem for L. Thus, when the latter is unknown, it is also possible to estimate it conjointly with the camera pose.

4. SLAM Filtering Problem on Lie Groups

In the state-of-the-art VSLAM approaches (Euclidean or Lie groups), the aim is to obtain a recursive estimation of both the pose of the camera and the pose of each pattern in the world frame. The unknown state contains, at each instant, the pose parameters (on S E ( 3 ) or on R 6 ) and the set of 3D points on R 3 . Contrary to the Euclidean VSLAM approach, we propose, in the same way as [21], to define the pose parameters as a S E ( 3 ) matrix, but instead of estimating the positions of the 3D points that compose each pattern, we take advantage of the coded patterns approach in order to estimate the S E ( 3 ) transformation X W P ( j ) . Consequently, if the pattern size is known, the estimation of this transformation provides directly the estimation of the 3D position of the center of each ellipse in that pattern. In this way, the global state belongs to several direct products of S E ( 3 ) . To achieve this estimation, we perform SLAM-extended Kalman filtering on LGs (LG-EKF-VSLAM) based on the LG-EKF proposed in [23]. Classically, on a Euclidean space, an extended Kalman filter is a recursive filtering algorithm which enables the approximation of the posterior distribution of parameters of interest by a Gaussian distribution. The LG-EKF can be seen as a generalization of the latter in the case where the parameters are lying on LG.
The main difference with the algorithm proposed in [23] is that we express the landmark initialization phase (mean and covariance) using a concentrated Gaussian on LG.

4.1. Gaussian Distribution on LGs

To perform filtering on LGs, probability distributions on these manifolds have to be defined. Some have already been introduced in the literature, in particular for smooth manifolds [30,31]. We focus, in this paper, on the Gaussian distributions (CGD) on LGs, introduced in [32], which generalizes the concept of Euclidean Gaussian multivariate probability distribution functions (pdf). Indeed, they have similar properties, such as minimizing the entropy under some constraints.
It is possible to obtain a realization from a CGD thanks to a Euclidean Gaussian sample. Indeed, if ϵ is a Gaussian vector with covariance matrix P , and μ is a matrix defined on a LG G of intrinsic dimension m, then X = μ Exp G ( ϵ ) is distributed according to a left-concentrated Gaussian pdf (CGD) (in a similar way, we can define a right CGD where sample X can be written X = Exp G ( ϵ ) μ ) on G, with parameters μ G and P R m × m . We note that X N G L ( X ; μ , P ) . If μ is close to I , the expression of this pdf can be approximated by:
p ( X ) 1 ( 2 π ) m | P | exp 1 2 | | Log G ( μ 1 X ) | | P 2 ,
where | P | is the determinant of the matrix P and . P 2 is the Mahalanobis distance.

4.2. Unknown State and Evolution Model

The unknown state is constituted of the camera pose X W C ( k ) S E ( 3 ) and the transformation X W P ( j ) of every pattern j , j 1 , m k , where m k is the number of already-detected patterns from an initial instant to instant k.
Due to the dynamics of the robot, some other Euclidean parameters (for instance, the vehicle velocity or the ellipse size) may have to be integrated in the state. Thus, to keep the LG compact in form, such additional parameters can be added into a vector g ( k ) R p . Thus, the global state (without landmarks) of our VSLAM problem can be written as:
X a ( k ) = X W C ( k ) 0 4 × p + 1 0 p + 1 × p + 1 I p × p g ( k ) 0 1 × p 1 S E ( 3 ) × R p .
As a consequence, the LG unknown global state X ( k ) , containing X a ( k ) and the set of pattern transformations, can be written as:
X ( k ) = X a ( k ) 0 ( 5 + p ) × 4 0 ( 5 + p ) × 4 0 4 × ( 5 + p ) X W P ( 1 ) 0 4 × 4 0 4 × ( 5 + p ) 0 4 × 4 X W P ( m k ) ,
with X ( k ) G = S E ( 3 ) × R p × S E ( 3 ) m k being a square matrix with size d k X = 5 + p + 4 m k .
To perform the filtering, we have to define the evolution model of this state. By assuming that the patterns are static, the global evolution model can be written as:
Π ( X ( k ) ) = Π ( X ( k 1 ) ) Exp G ( Ω ( X ( k 1 ) , w ( k 1 ) ) + n ( k ) ) ,
where Π is a function which expresses the state on the LG G = S O ( 3 ) × R 3 × R p × S E ( 3 ) m k and defines two separated dynamic models on the rotation R W C ( k ) and the position p W C ( k ) .
Ω : G R dim ( G ) = Ω X ( X a ( k ) ) , 0 1 × 6 m k is a dynamic evolution function (where dim ( G ) corresponds to the dimension of G );
w ( k 1 ) is a control input;
n ( k ) = n a ( k ) , 0 1 × 6 m k , where n a ( k ) is a white Gaussian noise on R dim ( G ) , with covariance matrix Q a ( k ) .

4.3. Principle of LG-EKF

The aim of the LG-EKF is to recursively approximate the posterior distribution of the unknown parameters defined on LG by a left CGD (note that it also possible to define the LG-EKF-SLAM with a right CGD). Precisely, in our case, we want to approach p ( X ( k ) | u ( 1 : k ) ) by:
p ( X ( k ) | u ( 1 : k ) ) N G L ( X ( k ) ; X ^ ( k | k ) , P ( k | k ) ) ,
where u ( 1 : k ) is the set of all collected measurements from instant 1 to k. In the same way as the classical EKF, the estimation is done in two steps: prediction and update.

4.3.1. The Prediction Step

The objective of this step is to estimate the distribution p ( X ( k ) | u ( 1 : k 1 ) ) (i.e., estimating the values { X ^ ( k | k 1 ) , P ( k | k 1 ) } ), given a prior on the distribution p ( X ( k 1 ) | u ( 1 : k 1 ) ) (defined by { X ^ ( k 1 | k 1 ) , P ( k 1 | k 1 ) } ). As n a ( k ) is zero-mean, the prediction of X ^ ( k 1 | k 1 ) is given, thanks to the noise-free dynamic model (30) on LG, by:
Π ( X ^ ( k | k 1 ) ) = Π ( X ^ ( k 1 | k 1 ) ) Exp G ( Ω ( X ^ ( k 1 | k 1 ) ) )
and P ( k | k 1 ) is given by:
P ( k | k 1 ) = F ( k ) P ( k 1 | k 1 ) F ( k ) + ϕ G ( Ω ( X ^ ( k 1 | k 1 ) ) ) Q ( k ) ϕ G ( Ω ( X ^ ( k 1 | k 1 ) ) ) ,
where:
F ( k ) = A d G ( Exp G ( Ω ( X ^ ( k 1 | k 1 ) ) ) + ϕ G ( Ω ( X ^ ( k 1 | k 1 ) ) ) C ( k 1 )
C ( k 1 ) = D X ^ ( k 1 | k 1 ) Ω
Q ( k ) = Q a ( k ) 0 9 × 6 m k 0 6 m k × 9 0 6 m k × 6 m k .

4.3.2. The Update Step

The update step takes advantage of the camera observation u k , i ( j ) provided at instant k of the j th already-known pattern, in order to approximate:
p ( X ( k ) | u ( 1 : k ) ) N G L ( X ( k ) ; X ^ ( k | k ) , P ( k | k ) ) .
The parameters { X ^ ( k | k ) , P ( k | k ) } are written as:
X ^ ( k | k ) = X ^ ( k | k 1 ) Exp G ( K ( k ) m ( k ) ) ,
P ( k | k ) = ϕ G ( K ( k ) m ( k ) ) P ( k | k , ) ϕ G ( K ( k ) m ( k ) ) ,
where:
( m ( k ) = u ˜ k ( j ) h ˜ ( X ^ a ( k | k 1 ) , X ^ W P ( j ) ) Innovation vector K ( k ) = P ( k | k 1 ) J ( k ) ×
J ( k ) P ( k | k 1 ) J ( k ) + N ˜ 1
P ( k | k , ) = ( I K ( k ) J ( k ) ) P ( k | k 1 ) ,
and:
N ˜ = N 0 0 0
u ˜ k ( j ) = u ˜ k , 1 ( j ) , , u ˜ k , 4 ( j )
with u ˜ k , i ( j ) = u k , i ( j ) , 1 h ˜ ( X ^ a ( k | k 1 ) , X ^ W P ( j ) ) =
h X ^ a k | k 1 , X ^ W P j , , h X ^ a k | k 1 , X ^ W P j repeated   4   times   for   the   4   measurements
J k J c ( k ) 0 2 × 6 ( j 1 ) J p ( j ) 0 2 × 6 ( m k j ) I 4 × 1
is the Jacobian matrix of h ˜ , according to X ^ ( k | k 1 ) , with :
J c ( k ) = D X ^ a ( k | k 1 ) h
J p ( j ) = D X ^ W P ( j ) h ,
and defines the Kronecker product .
Remark 1.
It is important to note that if several patterns are detected at the same instant, the update step is recursively realized for each of them.

4.4. Initialization of Pattern on S E ( 3 )

If a new pattern is detected by the camera, the step is called the mapping step. As a new object X W P ( j ) is detected, the current state X ( k ) is augmented as follows:
X p ( k ) = X ( k ) 0 d k 1 X × 4 0 4 × d k 1 X X W P ( j ) .
Consequently, the associated covariance to this state can be written as:
P a ( k | k ) = P ( k | k ) P C P ( k | k ) P C P ( k | k ) P W P ( k | k ) ,
where:
  • P C P ( k | k ) corresponds to the correlation between the new pattern and the previous state;
  • P W P ( k | k ) is the covariance associated to the new pattern j.
At this step, we need to have an initial estimation of the pose of the pattern and of its associated covariance matrix. Classically, in a Euclidean framework, the initialization of a pattern is determined by finding the solution of the inverse of the observation model, and the covariance matrix is approximated by computing correlations using associated Jacobians [33].
In the case of the camera model, this approach is not feasible due to its analytical complexity, and numeric approaches must be considered. For instance, it is possible to find the initial state by implementing a local bundle adjustment, which optimizes a criterion built from the observation model. The covariance matrix of pattern j can be obtained by a Gauss–Laplace approximation [34], and then the global covariance matrix of the new state is updated according to Equation (39). In this work, we propose an adaptation of this method, using the LG formalism.
The initialization of X W P ( j ) can be obtained by minimizing the anti-logarithm (opposite of the logarithm) of the posterior distribution p ( X W P ( j ) | { u k , i ( j ) } i = 1 4 , u ( 1 : k 1 ) , X ^ W C ( k | k 1 ) ) , associated to the measurement j. Due to its non-tractability, we propose approaching it by a left CGD with parameters { X ^ W P , P W P } :
p ( X W P ( j ) | { u k , i ( j ) } i = 1 4 , u ( 1 : k 1 ) , X ^ W C ( k | k 1 ) )
N S E ( 3 ) L ( X W P ( j ) ; X ^ W P , P W P ) ,
where X ^ W P is the estimated pose of the new pattern and P W P its associated covariance.

4.4.1. Computation of X ^ W P

X ^ W P can be determined by maximizing the posterior distribution (52). Due to the Bayes rule, we show that:
p ( X W P ( j ) | { u k , i ( j ) } i = 1 4 , u ( 1 : k 1 ) , X ^ W C ( k | k 1 ) )
i = 1 4 p ( u k , i ( j ) | X ^ W C ( k | k 1 ) , u ( 1 : k 1 ) , X W P ( j ) )
i = 1 4 N R 2 u k , i ( j ) ; h p ( X ^ W C ( k | k 1 ) , X W P ( j ) ) , N ,
where h p is a function conserving the two first components of h . Consequently, we seek to find the minima of the following criterion corresponding to the anti-logarithm of (54):
X ^ W P = argmin X W P S E ( 3 ) 1 2 ϕ ( X W P ) Σ 2 ,
where:
ϕ X W P i = u k , i j h p X ^ W C k | k 1 , X W P i 1 , , 4 ( 56 ) = N I 4 × 4 ( 57 )
and defines an optimization problem on S E ( 3 ) . A local minima is obtained thanks to a Gauss–Newton algorithm on LG [35]. At each iteration l, the updated recursive formula is written as:
X W P ( l ) = X W P ( l 1 ) Exp S E ( 3 ) ( δ ( l ) ) ,
where δ ( l ) is a descent direction, computed in the following way:
δ ( l ) = J ϕ ( l 1 ) Σ 1 J ϕ ( l 1 ) 1 J ϕ ( l 1 ) Σ 1 ϕ ( l 1 ) ,
with:
J ϕ ( l 1 ) = D X W P ( l 1 ) ϕ
ϕ ( l 1 ) = ϕ ( X W P ( l 1 ) ) .

4.4.2. Computation of P W P

The initial covariance of X W P is obtained by using a Gauss–Laplace approximation on LG. It consists to approximate ϕ around X ^ W P under the form:
ϕ ( X W P ) ϕ ( X ^ W P ) + J ϕ Log S E ( 3 ) ( X W P 1 X ^ W P ) .
As X ^ W P is a critical point of ϕ ( X ) Σ 2 , we obtain:
p ( X W P ( j ) | u ( 1 : k 1 ) , X ^ W C ( k | k 1 ) )
exp 1 2 Log S E ( 3 ) ( X W P 1 X ^ W P ) J ϕ Σ 1 J ϕ Log S E ( 3 ) ( X W P 1 X ^ W P ) .
By identification, with the expression of the CGD, it implies that P W P is approximated by:
P W P = J ϕ Σ 1 J ϕ .
To obtain the cross-correlation matrix P C P ( k | k ) , as in (51), we perform an update step of the increased state (50) with the measurement u k , i ( j ) .

5. Experiments

In this section, we compare our proposed approach, the LG-EKF-VSLAM, to the standard EKF-VSLAM algorithm while using coded patterns. We first carried out two Monte-Carlo simulations in order to evaluate the usefulness of performing LG modelling for monocular visual SLAM. In the first scenario, we considered that the distance between the center of two ellipse on the coded patterns, L, is perfectly known. For the second simulation, L is added to the state vector as a parameter to be estimated.
We then evaluated the LG-EKF-VSLAM with real data obtained within our laboratory, since, to the best of our knowledge, there are no available public benchmarks with coded patterns.

5.1. Experiments with Simulated Data

In the simulated scenarios, we considered a wheeled robot following a trajectory generated according to a constant velocity model. The control input is given by a simulated angular velocity w ( k ) . The unknown state is constituted of the camera pose X W C ( k ) and the linear velocity v ( k ) :
X a ( k ) = X W C ( k ) 0 4 × 4 0 4 × 4 I 3 × 3 v ( k ) 0 1 × 3 1 S E ( 3 ) × R 3 .
The discrete dynamic model is written as:
v ( k ) = v ( k 1 ) + n v δ t
p W C ( k ) = p W C ( k 1 ) + v ( k 1 ) δ t + n p δ t
R W C ( k ) = R W C ( k 1 ) Exp S O ( 3 ) ( w ( k 1 ) δ t + n r δ t ) ,
where δ t is the sampling rate. n v , n w , n p , and n r are four white Gaussian noises on linear and angular velocity, position, and orientation, respectively. These equations can be written on G with the form (30), where:
Ω X ( X a ( k 1 ) ) = w ( k 1 ) δ t , v ( k 1 ) δ t , 0 1 × 6 m k
n a ( k ) = n r δ t , n p δ t , n v δ t .
The obtained trajectory is generated during T = 885 s with δ t = 1 s , and is shown in Figure 5. The positions of each coded pattern are defined with a null z-axis and are randomly distributed all along the ground-truth trajectory. Their orientations are obtained by sampling from a CGD on S O ( 3 ) , with mean I 3 and covariance 0 . 2 2 I 3 .
The camera measurements are simulated according to the model (22), with:
K = 200 0 240 0 200 320 0 0 1 , N = σ n 2 I 2 σ n = 10 1 pixel , L = 5 m .
The Jacobian matrices C ( k 1 ) , J ( k ) , and ϕ G ( . ) of the LG-EKF-VSLAM (from Equations (7), (35), and (47), respectively) have been implemented according to the formulation given in Appendix A.1, Appendix B, and Appendix C.
The performance of the method must be evaluated on a state belonging to S E ( 3 ) . Consequently, we studied it by separating the estimator associated with the rotation matrix R ^ W C ( k | k ) and the estimator associated with the position p ^ W C ( k | k ) .
(1)
The quality of R ^ W C ( k | k ) can be evaluated by using two intrinsic metrics on LGs, which enables us to compute the rotation error directly on S O ( 3 ) :
a RMSE (Root Mean Square Error) metric, which can be seen as a generalization of the classical Euclidean RMSE. The classical distance is substituted by the geodesic distance on S O ( 3 ) [36]:
R M S E R = 1 N r T k = 1 T i = 1 N r Log S O ( 3 ) ( R W C ( k ) 1 R ^ W C ( k | k ) , i ) 2 ,
where R ^ ( k | k ) , i corresponds to the Monte-Carlo realization i of the rotation matrix estimator at epoch k. N r is the number of realizations;
a mean RPE (Relative Pose Error) metric to evaluate the relative pose error between two consecutive epochs for several realizations of the algorithm.
R P E R = 1 N r T k = 1 T i = 1 N r Log S O ( 3 ) ( R W C ( k 1 ) 1 R W C ( k ) )
Log S O ( 3 ) ( R ^ W C ( k 1 | k 1 ) , i 1 R ^ W C ( k | k ) , i ) ;
(2)
As the position parameter p W C ( k ) is Euclidean, the quality of its estimator is determined with the classical Euclidean RPE and RMSE.
These performances are compared to a Euclidean EKF-VSLAM. In the latter, the rotation matrix is parametrized by Euler angles θ ( k ) , ψ ( k ) , ϕ ( k ) . Consequently, the unknown state is a vector which belongs to R 6 + p + 6 m k . The estimation error on rotation and position are directly evaluated by a classical RPE and RMSE between the vector θ ( k ) , ψ ( k ) , ϕ ( k ) , p W C ( k ) and the estimated vector θ ^ ( k ) , ψ ^ ( k ) , ϕ ^ ( k ) , p ^ W C ( k ) .

5.1.1. Case 1: L Perfectly Known

According to Table 1, we observe that the proposed method slightly outperforms the classical EKF-VSLAM. Two arguments can explain these results:
  • Firstly, the LG modelling takes into account the intrinsic properties of the unknown state belonging to S E ( 3 ) . Consequently, the indeterminacy of the Euler angle estimates is deleted;
  • Secondly, through this modelling, the observation model is freed from the non-linearity introduced by the cosine and sine functions of the rotation matrix in the Euclidean modelling. In this way, the approximation of the covariance estimation error is more precise.

5.1.2. Case 2: L Is Misspecified

In order to challenge the proposed algorithm, we assume that the distance between two ellipses of a pattern is incorrectly specified. Consequently, L is initialized with an initial error on the true value (5 m) equal to 1 m, and added to the state:
X a ( k ) 0 0 L S E ( 3 ) × R 4 .
In order to estimate L with the LG-EKF-VSLAM, the observation model that explicitly involves L is considered. From a numerical point of view, we also have to compute the Jacobian of the observation model according to L.
From the obtained results given in Table 2 and Figure 6, we can observe that the LG-EKF-VSLAM achieves the best results. Indeed, compared to the EKF-VSLAM, it is able to asymptotically converge to the true value of L, in spite of high initialization errors (here, 1 m ) . Indeed, the estimation error becomes inferior to 0.1 m . Moreover, the precision of the estimator, after convergence, is better than for the Euclidean EKF-VSLAM (see Table 2).

5.2. Experiment with Real Data

5.2.1. Experiment Setup

To validate the methodology presented in the previous section (Section 4), we performed an experiment with a monocular camera (PointGrey Blackfly from Flir @7fps), with a 3.8–13-mm Varifocal lens, embedded on a mobile cart. Additionally, an Optitrack motion-capture system provided the ground truth for the pose of the camera and the coded patterns. The camera’s reference trajectory and the true positions of the coded patterns (i.e., the center of the bottom-left ellipse) are represented in Figure 8. The camera’s calibration matrix is given by:
K = 1250.94 0 667.14 0 1251.77 532.89 0 0 1 .
In this experiment, 12 coded patterns, each printed on a A 3 paper, with L = 0.1415 m , were placed in different places of the room, as illustrated in Figure 7 and Figure 8. The camera moved in a circular motion and the fiducial pattern-detection algorithm provided observations for each frame. The state X a ( k ) contains the pose, but also the linear and angular velocity w ( k ) . Indeed, in our experiment, no angular sensor is available; thus, we used a constant velocity model. Consequently, we have:
X a ( k ) = X W C ( k ) 0 4 × 4 0 4 × 4 0 4 × 4 I 3 × 3 v ( k ) 0 1 × 3 1 0 4 × 4 0 4 × 4 0 4 × 4 I 3 × 3 w ( k ) 0 1 × 3 1 S E ( 3 ) × R 6 .
To take into account the circular motion of the mobile in the filter, we propose leveraging a position-prediction model correlating its position and orientation. Indeed, the correlations between these two variables are especially high in the turns.
The proposed model has the following structure:
p W C ( k ) = p W C ( k 1 ) + R W C ( k 1 ) v ( k 1 ) 1 , 0 , 0 δ t + n p δ t .
The translation velocity and rotation matrix are assumed to follow the evolution models (66) and (68). Moreover, we suppose that the angular velocity follows a random walk process with a Gaussian white noise n w . Consequently, the function Ω X ( X a ( k 1 ) ) is written as:
Ω X ( X a ( k 1 ) ) =
( w ( k 1 ) ) δ t , R W C ( k 1 ) v ( k 1 ) 1 , 0 , 0 δ t , 0 1 × 6 .
The computation of the Jacobian of Ω X is provided in Appendix A.2.

5.2.2. Obtained Results

In order to use the method with real data, we first have to define the noise parameters. Concerning the observation model, the covariance noise is taken to be sufficiently high in order to take into account the uncertainties introduced by non-analytical phenomenons. Consequently, we take N = σ n 2 I 3 with σ n = 2 pixels. Concerning the dynamic model, the standard deviation of noises n r , n p , n v , and n w are, respectively, fixed at σ p = 10 4 δ t m . s 1 σ r = 10 2 δ t rad . s 2 , σ v = 10 6 δ t m . s 2 , and σ w = 10 4 δ t rad . s 2 .
In order to quantify the estimation error of the trajectory, we use the Optitrack reference system and compare it with the estimated trajectory given by each of the two methods. This is done by computing the absolute error on the position at each instant of the trajectory, as represented in Figure 9.
In the two approaches, we first observe, through the Figure 10 and Figure 11, that the estimated position is particularly degraded in some parts of the trajectory, especially between the coded patterns 9 and 11 and 6 and 7. This is due to the pattern detector, which does not provide any detection because of blur and specular reflection on the objects. Consequently, during these instants, the estimation is only realized by dead-reckoning navigation, and the non-linearities of the models increases the pose-estimation error. In Figure 9, we observe that the performance of the two approaches are quite similar in terms of absolute error estimation, even if the LG-EKF-VSLAM performs a little bit better all over the trajectory. This is confirmed through Figure 11. Indeed, we observe that during the last turn of the trajectory, the EKF-VSLAM tends to deviate from the ground truth, implying a higher estimation error.

6. Conclusions

In this paper, we propose an original modelling on LGs in order to perform monocular VSLAM. After reformulating the classical pinhole on LGs, we propose a EKF-VSLAM algorithm on LG for coded-pattern landmarks. We take into account the particularity of the SLAM problem by proposing a LG method to initialize new patterns. Our algorithm has been compared with a classical EKF-VSLAM using a Euler parametrization on simulated data as well as on real-world data. Both the simulation and the experiments showed the validity and the performance improvement provided by the proposed LG-EKF-VSLAM, compared to the state-of-the-art solution. In addition, we have shown that a misspecified value on the size of the coded pattern has a large impact on the estimation solution. When this parameter is taken into account as a new state in the estimation process, the EKF-VSLAM reduces the estimation error, but is outperformed by the LG-EKF-VSLAM, which converges asymptotically to the true value of L. This is mostly due to the LG formalism, which copes better with system non-linearities.
A pertinent perspective would be to propose a LG-based approach by taking into account the lack of knowledge of non-estimated parameters models. In our case, L could be considered uncertain, but not estimated. There exist Euclidean filtering approaches dedicated to this problem [37]; consequently, it would be pertinent to adapt them for parameters lying on LGs.
Moreover, in a general way, it would be interesting to adapt the formalism of LG (especially the optimization method) to other applications in computer vision and signal processing. For instance, we can quote the well-known problems of Structure of Motion (SfM) [38] and image registration [39].
From an application point of view, another perspective would be to fuse the camera measurements with, for instance, information from Ultra-Wide-Band sensors or with a Global Navigation Satellite System for outdoor navigation, in order to improve the performance of the algorithm, especially in situations where camera frames can be lost or where no patterns are present in the close environment of the vehicle.

Author Contributions

Conceptualization, S.L. and D.V.; methodology, S.L.; S.L. designed the models; writing—original draft preparation, S.L.; writing—review and editing, S.L., G.P. and D.V.; experiments were designed by G.P. and D.V., and were equally realized with S.L.; supervision, G.P. and D.V.; project administration, G.P.; funding acquisition, G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the DGA/AID (French Secretary of Defense) under grant n 2018.60.0072.00.470.75.01.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request to the corresponding author. The data belong to ISAE-SUPAERO and are not publicly available.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Computation of C(k−1)

It should be remembered that:
C ( k 1 ) = Ω ( X ^ ( k 1 | k 1 ) Exp G ( δ ) ) δ δ = 0 .

Appendix A.1. Case of Ω X ( X a ( k ) ) from (69)

We know that Ω X : G R 9 is defined by:
Ω X ( X ( k 1 ) ) = w ( k 1 ) δ t , v ( k 1 ) δ t , 0 1 × 3 .
Consequently, the Jacobian of Ω , according to X ( k 1 ) , is written as:
C ( k 1 ) = J R , J p , J v , 0 9 × 6 m k 1 R 9 × 9 ,
with:
J R = Ω X ( R ^ W C ( k 1 | k 1 ) Exp S O ( 3 ) ( δ R ) , . , . δ R δ R = 0
J p = Ω X ( . , p ^ W C ( k 1 | k 1 ) + δ p , . ) δ p δ p = 0
J v = Ω X ( . , . , v ^ ( k 1 | k 1 ) + δ v ) δ v δ v = 0 .
Due to the structure of Ω X , we show that:
J R = 0 9 × 3
J p = 0 9 × 3
J v = 0 6 × 3 , δ t I 3 × 3 .

Appendix A.2. Case of Ω X ( X a ( k ) ) from (78)

We know that Ω X : G R 12 is defined by:
Ω X ( X a ( k 1 ) ) =
( w ( k 1 ) ) , R ( k 1 ) v ( k 1 ) 1 , 0 , 0 δ t , 0 1 × 6 .
The Jacobian has the following structure:
C ( k 1 ) = J R , J p , J v , J w , 0 12 × 6 m k 1 R 12 × 12 ,
where J R , J p , and J v are defined by (A4), (A5), and (A6). We show that:
J R = 0 3 × 3 , G so ( 3 ) ( 1 ) u ( k 1 ) δ t , G so ( 3 ) ( 6 ) u ( k 1 ) δ t , 0 3 × 6
u ( k 1 ) = v ^ ( k 1 | k 1 ) 1 , 0 , 0
J p = 0 12 × 3
J v = 0 3 × 3 , R ^ W C ( k 1 | k 1 ) 1 , 0 , 0 δ t , 0 3 × 6
J w = 0 3 × 9 , I 3 × 3 δ t ,
where { G so ( 3 ) ( l ) } l = 1 3 G so ( 3 ) ( l ) R 3 × 3 is a vector basis of Lie algebra so ( 3 ) .

Appendix B. Computation of J(k)

It should be remembered that:
J ( k ) = J c ( k ) 0 2 × 6 ( j 1 ) J p ( j ) 0 2 × 6 ( m k j ) I 4 × 1 .
Consequently, we have to compute two quantities: J c ( k ) and J p ( j ) .

Appendix B.1. Computation of J c ( k )

J c ( k ) corresponds to the Jacobian of the observation, according to X ^ a ( k | k 1 ) . Consequently:
J c ( k ) = h ( X ^ a ( k | k 1 ) Exp G ( δ c ) , X ^ W P ( j ) ) δ δ = 0
= h ( X ^ W C ( k | k 1 ) Exp S E ( 3 ) ( δ c ) ) δ c , h ( . , g ( k ) + δ g ) δ g 0 p × 1 δ c = 0 , δ g = 0 .
Furthermore:
h ( X ^ W C ( k | k 1 ) Exp S E ( 3 ) ( δ c ) , X ^ W P ( j ) ) δ c δ c = 0 = K I 3 × 4 X ^ W C ( k | k 1 ) Exp S E ( 3 ) ( δ c ) 1 X ^ W P ( j ) q i , 1 δ c I 3 × 4 X ^ W C ( k ) Exp S E ( 3 ) ( δ c ) 1 X ^ W P ( j ) q i , 1 3 δ c = 0 .
In order to compute this Jacobian, we need to know:
the classical derivative product rule according to δ c R 6 ;
the fact that Exp S E ( 3 ) ( δ c ) δ c l δ c = 0 = G se ( 3 ) ( l ) l 1 , 6 , where G se ( 3 ) ( l ) R 4 × 4 is the lth vector basis of Lie algebra se ( 3 ) .
Consequently, by computing:
d ( l ) = h ( X W C ( k ) Exp S E ( 3 ) ( δ c ) , X W P ( j ) ) δ c l δ c = 0 R 3 ,
J c ( k ) can be obtained in the following way:
J c ( k ) = d ( 1 ) , , d ( 6 ) , 0 p × 1 .
d ( l ) is, thus, obtained thanks to the product rule:
d ( l ) = 1 α 2 { K I 3 × 4 G se ( 3 ) ( l ) X W C ( k ) 1 X W P ( j ) q i , 1 α + I 3 × 4 G se ( 3 ) ( l ) X W C ( k ) 1 X W P ( j ) q i , 1 3 β } ,
with:
α = I 3 × 4 X ^ W C k 1 X ^ W P j q i , 1 3 ( A 23 ) β = K I 3 × 4 X ^ W C k 1 X ^ W P j q i , 1 3 . ( A 24 )

Appendix B.2. Computation of J p ( j )

J p ( j ) corresponds to the Jacobian of the observation, according to X ^ W P ( j ) . Thus:
J p ( j ) = h ( X ^ a ( k | k 1 ) , X ^ W P ( j ) Exp S E ( 3 ) ( δ p ) ) δ p δ p = 0
= K I 3 × 4 X ^ W C ( k ) 1 X ^ W P ( j ) Exp S E ( 3 ) ( δ p ) q i , 1 δ p I 3 × 4 X ^ W C ( k ) 1 X ^ W P ( j ) Exp S E ( 3 ) ( δ p ) q i , 1 3 δ p = 0 .
By using the classical derivative product rule, according to δ p R 6 , we obtain:
f ( l ) = h ( X ^ W C ( k ) , X ^ W P ( j ) Exp S E ( 3 ) ( δ p ) ) δ p l δ p = 0 = 1 α 2 { K I 3 × 4 X ^ W C ( k ) 1 X ^ W P ( j ) G se ( 3 ) ( l ) q i , 1 α I 3 × 4 X ^ W C ( k ) 1 X ^ W P ( j ) G se ( 3 ) ( l ) q i , 1 3 β } .
Thus, J p ( j ) is given by:
J p ( j ) = f ( 1 ) , , f ( 6 ) .

Appendix C. Expression of ϕ G ( . )

Appendix C.1. Case of SO(3)

The left Jacobian of S O ( 3 ) is given by the following expression [26]:
ϕ S O ( 3 ) ( a ) = I + 1 cos ( a ) a 2 a × + a sin ( a ) a 3 a × 2 .

Appendix C.2. Case of SE(3)

The left Jacobian of S O ( 3 ) is given by the following expression [32]:
ϕ S E ( 3 ) ( a ) = ϕ S O ( 3 ) ( a r ) Q ( a ) 0 3 × 3 ϕ S O ( 3 ) ( a p ) ,
with a = a r R 3 , a p R 3 and
Q ( a ) = 1 2 a p × + a r sin ( a r ) a r 3 × a r × a p × + a p × a r × + a r × a p × a r × 1 a r 1 2 cos ( a r ) a r 4 × a r × 2 a p + a p a r × 2 3 a r × a r × a r × 2 1 2 1 a r cos ( a r ) a r 4 3 a r sin ( a r ) a r 3 6 a r 5 × a r × a p × a r × 2 + a r × 2 a p × a r × .

Appendix C.3. Case of G

In our case, the LG of interest is written as G = S E ( 3 ) × R p × S E ( 3 ) m k , such that:
ϕ G ( a ) = ϕ S E ( 3 ) ( a 1 ) 0 0 0 ϕ R p ( a 2 ) 0 0 0 ϕ S E ( 3 ) m k ( a 3 ) ,
with a = a 1 R 6 , a 2 R 3 , a 3 R 6 m k . As R p is a commutative LG, ϕ R p ( a 2 ) = I p × p . Furthermore, ϕ S E ( 3 ) m k ( a 3 ) is the block diagonal concatenation of m k repetitions of ϕ S E ( 3 ) ( . ) .

References

  1. Engel, J.; Schöps, T.; Cremers, D. LSD-SLAM: Large-Scale Direct Monocular SLAM. In Computer Vision—ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Springer International Publishing: Cham, Switzerland, 2014; pp. 834–849. [Google Scholar]
  2. Khairuddin, A.R.; Talib, M.S.; Haron, H. Review on simultaneous localization and mapping (SLAM). In Proceedings of the IEEE International Conference on Control System, Computing and Engineering, Penang, Malaysia, 27–29 November 2015; pp. 85–90. [Google Scholar]
  3. Triggs, B.; McLauchlan, P.; Hartley, R.; Fitzgibbon, A. Bundle Adjustment—A Modern Synthesis. In Proceedings of the International Workshop on Vision Algorithms, Corfu, Greece, 21–22 September 1999. [Google Scholar]
  4. Mouragnon, E.; Lhuillier, M.; Dhome, M.; Dekeyser, F.; Sayd, P. 3D reconstruction of complex structures with bundle adjustment: An incremental approach. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation, Orlando, FL, USA, 15–19 May 2006; pp. 3055–3061. [Google Scholar]
  5. Guivant, J.; Nebot, E. Optimization of the simultaneous localization and map-building algorithm for real-time implementation. IEEE Trans. Robot. Autom. 2001, 17, 242–257. [Google Scholar] [CrossRef] [Green Version]
  6. Harris, C.; Stephens, M. A combined corner and edge detector. In Proceedings of the Fourth Alvey Vision Conference, Manchester, UK, 31 August–2 September 1988; pp. 147–151. [Google Scholar]
  7. Rosten, E.; Drummond, T. Machine learning for high-speed corner detection. In Proceedings of the European Conference on Computer Vision, Graz, Austria, 7–13 May 2006. [Google Scholar]
  8. Bowman, S.L.; Atanasov, N.; Daniilidis, K.; Pappas, G.J. Probabilistic data association for semantic SLAM. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 1722–1729. [Google Scholar]
  9. Lim, H.; Lee, Y.S. Real-time single camera SLAM using fiducial markers. In Proceedings of the ICCAS-SICE, Fukuoka, Japan, 18–21 August 2009; pp. 177–182. [Google Scholar]
  10. Houben, S.; Droeschel, D.; Behnke, S. Joint 3D laser and visual fiducial marker based SLAM for a micro aerial vehicle. In Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Baden-Baden, Germany, 19–21 September 2016; pp. 609–614. [Google Scholar]
  11. Kalaitzakis, M.; Cain, B.; Carroll, S.; Ambrosi, A.; Whitehead, C.; Vitzilaios, N. Fiducial Markers for Pose Estimation Overview, Applications and Experimental Comparison of the ARTag, AprilTag, ArUco and STag Markers. J. Intell. Robot. Syst. 2021, 101, 71. [Google Scholar] [CrossRef]
  12. Fiala, M. ARTag, a fiducial marker system using digital techniques. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005; Volume 2, pp. 590–596. [Google Scholar]
  13. Pilté, M.; Bonnabel, S.; Barbaresco, F. Tracking the Frenet-Serret frame associated to a highly maneuvering target in 3D. In Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control (CDC), Melbourne, VIC, Australia, 12–15 December 2017; pp. 1969–1974. [Google Scholar]
  14. Mika, D.; Jozwik, J. Lie groups methods in blind signal processing. Sensors 2020, 20, 440. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Bourmaud, G.; Giremus, A.; Berthoumieu, Y.; Mégret, R. Discrete Extended Kalman Filter on Lie groups. In Proceedings of the EUSIPCO, Marrakech, Morocco, 9–13 September 2013. [Google Scholar]
  16. Bonnabel, S.; Martin, P.; Salaün, E. Invariant Extended Kalman Filter: Theory and application to a velocity-aided attitude estimation problem. In Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference, Shanghai, China, 15–18 December 2009; pp. 1297–1304. [Google Scholar]
  17. Wang, J.; Zhang, C.; Wu, J.; Liu, M. An Improved Invariant Kalman Filter for Lie Groups Attitude Dynamics with Heavy-Tailed Process Noise. Machines 2021, 9, 182. [Google Scholar] [CrossRef]
  18. Brossard, M.; Bonnabel, S.; Condomines, J.P. Unscented Kalman filtering on Lie groups. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 2485–2491. [Google Scholar]
  19. Ćesić, J.; Marković, I.; Bukal, M.; Petrović, I. Extended information filter on matrix Lie groups. Automatica 2017, 82, 226–234. [Google Scholar] [CrossRef]
  20. Marjanovic, G.; Solo, V. An engineer’s guide to particle filtering on matrix Lie groups. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 3969–3973. [Google Scholar]
  21. Brossard, M.; Bonnabel, S.; Barrau, A. Invariant Kalman Filtering for Visual Inertial SLAM. In Proceedings of the 2018 21st International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 2021–2028. [Google Scholar]
  22. Lenac, K.; Marković, I.; Petrović, I. Exactly sparse delayed state filter on Lie groups for long-term pose graph SLAM. Int. J. Robot. Res. 2018, 37, 585–610. [Google Scholar] [CrossRef]
  23. Bourmaud, G.; Mégret, R. Robust large scale monocular visual SLAM. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1638–1647. [Google Scholar]
  24. Hui, C.; Shiwei, M. Visual SLAM based on EKF filtering algorithm from omnidirectional camera. In Proceedings of the 2013 IEEE 11th International Conference on Electronic Measurement Instruments, Harbin, China, 16–19 August 2013; Volume 2, pp. 660–663. [Google Scholar]
  25. Selig, J.M. Geometrical Methods in Robotics; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
  26. Solà, J.; Deray, J.; Atchuthan, D. A micro Lie theory for state estimation in robotics. arXiv 2018, arXiv:1812.01537. [Google Scholar]
  27. Faraut, J. Analysis on Lie Groups: An Introduction; Cambridge University Press: Cambridge, UK, 2008. [Google Scholar]
  28. Chirikjian, G. Information-Theoretic Inequalities On Unimodular Lie Groups. J. Geom. Mech. 2010, 2, 119–158. [Google Scholar] [CrossRef] [PubMed]
  29. Koks, D. A Roundabout Route to Geometric Algebra. In Explorations in Mathematical Physics: The Concepts Behind an Elegant Language; Springer Science: Berlin/Heidelberg, Germany, 2006; Chapter 4; pp. 147–184. [Google Scholar]
  30. Said, S.; Bombrun, L.; Berthoumieu, Y.; Manton, J.H. Riemannian Gaussian Distributions on the Space of Symmetric Positive Definite Matrices. IEEE Trans. Inf. Theory 2017, 63, 2153–2170. [Google Scholar] [CrossRef] [Green Version]
  31. Pennec, X. Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. J. Math. Imaging Vis. 2006, 25, 127–155. [Google Scholar] [CrossRef] [Green Version]
  32. Barfoot, T.; Furgale, P.T. Associating Uncertainty With Three-Dimensional Poses For Use In Estimation Problem. IEEE Trans. Robot. 2014, 30, 679–693. [Google Scholar] [CrossRef]
  33. Sola, J.; Monin, A.; Devy, M.; Lemaire, T. Undelayed initialization in bearing only SLAM. In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, AB, Canada, 2–6 August 2005; pp. 2499–2504. [Google Scholar]
  34. Bishop, C. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  35. Bourmaud, G.; Mégret, R.; Giremus, A.; Berthoumieu, Y. From Intrinsic Optimization to Iterated Extended Kalman Filtering on Lie Groups. J. Math. Imaging Vis. 2016, 55, 284–303. [Google Scholar] [CrossRef] [Green Version]
  36. Huynh, D.Q. Metrics for 3D Rotations: Comparison and Analysis. J. Math. Imaging Vis. 2009, 35, 155–164. [Google Scholar] [CrossRef]
  37. Chauchat, P.; Vilà-Valls, J.; Chaumette, E. Robust Linearly Constrained Invariant Filtering for a Class of Mismatched Nonlinear Systems. IEEE Control Syst. Lett. 2022, 6, 223–228. [Google Scholar] [CrossRef]
  38. Ozyesil, O.; Voroninski, V.; Basri, R.; Singer, A. A Survey of Structure from Motion. Acta Numerica 2017, 26, 305–364. [Google Scholar] [CrossRef]
  39. Zitová, B.; Flusser, J. Image registration methods: A survey. Image Vis. Comput. 2003, 21, 977–1000. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Representation of the coded pattern and its local frame. The detection algorithm detects the red points q 1 = 0 , 0 , 0 , q 2 = 0 , L , 0 , q 3 = L , 0 , 0 , and q 4 = L , L , 0 .
Figure 1. Representation of the coded pattern and its local frame. The detection algorithm detects the red points q 1 = 0 , 0 , 0 , q 2 = 0 , L , 0 , q 3 = L , 0 , 0 , and q 4 = L , L , 0 .
Remotesensing 14 00571 g001
Figure 2. Geometrical representation of the projection camera model.
Figure 2. Geometrical representation of the projection camera model.
Remotesensing 14 00571 g002
Figure 3. Geometrical relationship between T I G and T X G through the left tangent application.
Figure 3. Geometrical relationship between T I G and T X G through the left tangent application.
Remotesensing 14 00571 g003
Figure 4. Relation between the Lie group, the Lie algebra, and the Euclidean space. For each element a R m , we associate an element a in Lie algebra, which is converted in a element X of LG G, thanks to the operator Exp G ( . ) .
Figure 4. Relation between the Lie group, the Lie algebra, and the Euclidean space. For each element a R m , we associate an element a in Lie algebra, which is converted in a element X of LG G, thanks to the operator Exp G ( . ) .
Remotesensing 14 00571 g004
Figure 5. Simulated true trajectory and ellipse center positions of 9 coded labelled patterns.
Figure 5. Simulated true trajectory and ellipse center positions of 9 coded labelled patterns.
Remotesensing 14 00571 g005
Figure 6. Evolution of the RMSE on L for N r = 500 realizations.
Figure 6. Evolution of the RMSE on L for N r = 500 realizations.
Remotesensing 14 00571 g006
Figure 7. First image frame integrated in the detector: it computes the ellipse coordinates of the pattern with labels 18, 19, and 21.
Figure 7. First image frame integrated in the detector: it computes the ellipse coordinates of the pattern with labels 18, 19, and 21.
Remotesensing 14 00571 g007
Figure 8. Ground truth obtained by an Optitrack motion capture system in the XY–plane, superimposed to the position of the center of the bottom-left ellipse of each coded pattern.
Figure 8. Ground truth obtained by an Optitrack motion capture system in the XY–plane, superimposed to the position of the center of the bottom-left ellipse of each coded pattern.
Remotesensing 14 00571 g008
Figure 9. Absolute position error obtained for the two methods.
Figure 9. Absolute position error obtained for the two methods.
Remotesensing 14 00571 g009
Figure 10. Estimated trajectory for EKF–VSLAM.
Figure 10. Estimated trajectory for EKF–VSLAM.
Remotesensing 14 00571 g010
Figure 11. Estimated trajectory for LG–EKF–VSLAM.
Figure 11. Estimated trajectory for LG–EKF–VSLAM.
Remotesensing 14 00571 g011
Table 1. Simulation results with known L. Obtained performance estimation for N r = 500 realizations.
Table 1. Simulation results with known L. Obtained performance estimation for N r = 500 realizations.
LG-EKF-VSLAMEKF-VSLAM
RMSE position (m)0.2980.427
RMSE orientation (rad)0.00480.0055
RPE position (m)0.01720.0205
RPE orientation (rad) 5.12 × 10 4 6 × 10 4
Table 2. Simulation results with misspecified L. Obtained performance estimation for N r = 500 realizations.
Table 2. Simulation results with misspecified L. Obtained performance estimation for N r = 500 realizations.
LG-EKF-VSLAMEKF-VSLAM
RMSE position (m)0.7581.257
RMSE orientation (rad)0.01580.065
RPE position (m)0.03980.0924
RPE orientation (rad) 7.2 × 10 4 7.7 × 10 4
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Labsir, S.; Pages, G.; Vivet, D. Lie Group Modelling for an EKF-Based Monocular SLAM Algorithm. Remote Sens. 2022, 14, 571. https://doi.org/10.3390/rs14030571

AMA Style

Labsir S, Pages G, Vivet D. Lie Group Modelling for an EKF-Based Monocular SLAM Algorithm. Remote Sensing. 2022; 14(3):571. https://doi.org/10.3390/rs14030571

Chicago/Turabian Style

Labsir, Samy, Gaël Pages, and Damien Vivet. 2022. "Lie Group Modelling for an EKF-Based Monocular SLAM Algorithm" Remote Sensing 14, no. 3: 571. https://doi.org/10.3390/rs14030571

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop