
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Simultaneous Localization and Mapping (SLAM) is perhaps the most fundamental problem to solve in robotics in order to build truly autonomous mobile robots. The sensors have a large impact on the algorithms used for SLAM. Early SLAM approaches focused on the use of range sensors such as sonar rings or lasers. However, cameras have become more and more widely used, because they yield a lot of information and are well suited to embedded systems: they are light, cheap and power-saving. Unlike range sensors, which provide range and angular information, a camera is a projective sensor that measures the bearing of image features; therefore depth information (range) cannot be obtained in a single step. This fact has prompted the emergence of a new family of SLAM algorithms: the Bearing-Only SLAM methods, which rely mainly on special feature initialization techniques in order to enable the use of bearing sensors (such as cameras) in SLAM systems. In this work a novel and robust method, called concurrent initialization, is presented.

Simultaneous Localization and Mapping (SLAM) is perhaps the most fundamental problem to solve in robotics in order to build truly autonomous mobile robots. SLAM addresses the problem of how a mobile robot can operate in an a priori unknown environment, using only its on-board sensors, while simultaneously building a map of the environment and localizing itself within it.

The robot’s sensors have a large impact on the algorithm used for SLAM. Early SLAM approaches focused on the use of range sensors such as sonar rings or lasers, e.g., [

The aforementioned issues have prompted recent work to move towards the use of cameras as the primary sensing modality. Cameras have become increasingly interesting to the robotics research community because they yield a lot of information for data association, although this problem remains open. Cameras are well suited to embedded systems: they are light, cheap and power-saving. Using vision, a robot can localize itself using common objects as landmarks.

On the other hand, while range sensors (

In this context, a camera connected to a computer becomes a position sensor that can be applied to different fields such as robotics (motion estimation for generally moving robots such as humanoids), wearable robotics (motion estimation for camera-equipped devices worn by humans), tele-presence (head motion estimation using an outward-looking camera), or television (camera motion estimation for live augmented reality) [

Bearing-Only SLAM has usually been associated with vision-based SLAM systems, possibly because cameras are by far the most popular bearing sensors used in robotics. In that sense, the use of alternative bearing sensors (

In recent years several important improvements and variants of this kind of method have appeared [

In this work a novel and robust method called concurrent initialization is presented.

Bearing-Only SLAM has received most of its attention in the current decade; therefore many of the related approaches are very recent. In [

There are also some works that use estimation techniques other than the EKF in Bearing-Only SLAM, such as Particle Filter (PF) based methods. In a series of papers, Kwok uses Particle Filters: in [

Some of the most notably advances on Bearing-Only SLAM have been presented by Davison [

Jensfelt in [

In [

In [

Montiel

On the other hand, experiments show that initial fixed parameters can affect the robustness of the method, especially when an initial metric reference is used in order to recover/set the scale of the map. This fact motivated the authors to develop in [

Let us consider a bearing sensor, with a limited field of view, moving freely in 2D. The sensor state x̂_{v} = (x_{v}, y_{v}, θ_{v}, v_{x}, v_{y}, v_{θ})^{T} is composed of the sensor position (x_{v}, y_{v}), its orientation θ_{v}, and its linear and angular velocities (v_{x}, v_{y}, v_{θ}).

At every step, unknown linear and angular accelerations are assumed, modeled as zero-mean Gaussian processes a^{W} and α^{W} with known covariance, producing an impulse of linear and angular velocity:

V^{W} = a^{W} Δt,  Ω^{W} = α^{W} Δt

The sensor motion prediction model is the standard unconstrained constant-velocity model:

x̂_{v,k+1} = (x_{v} + (v_{x} + V_{x})Δt, y_{v} + (v_{y} + V_{y})Δt, θ_{v} + (v_{θ} + Ω)Δt, v_{x} + V_{x}, v_{y} + V_{y}, v_{θ} + Ω)^{T}

An Extended Kalman Filter propagates the sensor pose and velocity estimates, as well as feature estimates.
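As a concrete illustration, the prediction step described above can be sketched as follows; the function and variable names, the Δt value and the impulse handling are assumptions made for this sketch, not the authors' implementation:

```python
DT = 1.0 / 30.0  # time between frames (illustrative value, not from the text)

def predict_sensor(xv, V, Omega, dt=DT):
    """Constant-velocity motion prediction for the planar bearing sensor.

    xv = [x, y, theta, vx, vy, vtheta]; V = (Vx, Vy) and Omega are the
    linear/angular velocity impulses produced by the unknown accelerations.
    """
    x, y, th, vx, vy, vth = xv
    vx, vy, vth = vx + V[0], vy + V[1], vth + Omega
    return [x + vx * dt, y + vy * dt, th + vth * dt, vx, vy, vth]

# With zero impulses the sensor coasts at its current velocity:
xv_next = predict_sensor([0.0, 0.0, 0.0, 3.0, 0.0, 0.0], (0.0, 0.0), 0.0)
print(round(xv_next[0], 6))  # -> 0.1 (x advances by vx * dt)
```

In a full EKF the same function would also propagate the covariance through the Jacobian of this model.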

The complete state x̂, which includes the features ŷ, is composed of:

x̂ = (x̂_{v}, ŷ_{1}, ŷ_{2}, …, ŷ_{n})^{T}

Each feature is coded with the inverse depth parameterization ŷ_{i} = (x_{i}, y_{i}, θ_{i}, ρ_{i})^{T}, where (x_{i}, y_{i}) is the sensor position from which the feature was first observed and θ_{i} represents the azimuth (with respect to the world reference W) of the ray traced from that position to the feature. The point depth d_{i} along the ray is coded by its inverse, ρ_{i} = 1/d_{i}.

The use of an inverse depth parameterization for bearing-only SLAM improves the linearity of the measurement equation even for small changes in the sensor position (corresponding to small changes in the parallax angle); this allows a Gaussian distribution to cover the uncertainty in depth over a range spanning from nearby to infinity. The relevance of a good Gaussian uncertainty representation in an EKF-based scheme is well known [

The different locations of the sensor, along with the locations of the already mapped features, are used to predict the angle h_{i} at which feature i is expected to be measured from the current sensor pose:

h_{i} = atan2( (y_{i} + (1/ρ_{i}) sin θ_{i}) − y_{v}, (x_{i} + (1/ρ_{i}) cos θ_{i}) − x_{v} ) − θ_{v}

atan2 is a two-argument function that computes the arctangent of y/x, using the signs of both arguments to determine the quadrant of the result.
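A minimal sketch of this bearing prediction (2D inverse depth case) might look as follows; variable names and numeric values are illustrative assumptions:

```python
import math

def predict_bearing(sensor, feat):
    """Predicted bearing h_i of an inverse-depth feature feat = (x_i, y_i,
    theta_i, rho_i), seen from the sensor pose sensor = (x_v, y_v, theta_v)."""
    xv, yv, thv = sensor
    xi, yi, thi, rho = feat
    d = 1.0 / rho                                            # depth along the ray
    px, py = xi + d * math.cos(thi), yi + d * math.sin(thi)  # world-frame point
    return math.atan2(py - yv, px - xv) - thv

# Feature first observed from the origin along +x, 2 m away (rho = 0.5):
h = predict_bearing((0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.5))    # -> 0.0
h2 = predict_bearing((0.0, -2.0, 0.0), (0.0, 0.0, 0.0, 0.5))  # -> pi/4
```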

In implementations using real data, the feature search can be constrained to regions around the predicted h_{i}, defined by the innovation covariance S_{i} = H_{i}P_{k+1}H_{i}^{T} + R, where H_{i} is the Jacobian of the measurement model with respect to the state, P_{k+1} is the prior state covariance, and R is the measurement noise covariance; measurements z_{i} are then incorporated through the standard EKF update.

As stated before, depth information cannot be obtained from a single measurement when bearing sensors are used. To infer the depth of a feature, the sensor must observe it repeatedly as it moves freely through its environment, measuring the angle to the feature at each pose. The difference between angle measurements is the feature parallax; parallax is the key that allows feature depth to be estimated. In indoor sequences, centimeters of translation are enough to produce parallax; the more distant the feature, the farther the sensor has to travel to produce it. Therefore, in order to incorporate new features into the map, special feature initialization techniques are needed to enable the use of bearing sensors in SLAM systems.
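To make the role of parallax concrete, the following sketch triangulates the depth of a feature from the base-line and the parallax angle via the law of sines; the numbers are illustrative only, not from the text:

```python
import math

def depth_from_parallax(baseline, gamma, alpha):
    """Depth of the feature from the first observation point, by the law of
    sines in the triangle (first pose, current pose, feature). gamma is the
    angle between the feature ray and the base-line at the current pose,
    alpha is the parallax angle at the feature."""
    return baseline * math.sin(gamma) / math.sin(alpha)

# Indoors, a 10 cm base-line and ~6 degrees of parallax already fix the
# depth of a feature roughly one meter away:
d = depth_from_parallax(0.10, math.radians(90.0), math.radians(6.0))
print(round(d, 3))  # -> 0.957
```

Note how the same 6° of parallax would require a ten-times-longer base-line for a feature ten meters away, which is why distant features need more sensor travel.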

Let us consider two methods, which represent the main approaches (undelayed and delayed) for addressing the initialization problem.

For the inverse depth (ID) undelayed method presented in [ ], features are added to the map the first time they are observed, using an initial inverse depth whose uncertainty covers depths from a close distance d_{min} up to infinity.

Using the inverse depth parameterization, while the feature is observed at low parallax it will be used mainly to determine the sensor orientation, and its depth will remain quite uncertain; if the sensor translation is able to produce a big enough parallax, then the feature depth estimate will be improved.

For the ID-Undelayed method, a new feature ŷ_{i} is initialized from the current sensor pose (x_{v}, y_{v}, θ_{v}), the first bearing measurement z, and an initial depth d_{min} expressed as inverse depth:

In [ ], initial values derived from d_{min} are proposed for the inverse depth ρ_{i}, together with an initial standard deviation σ_{ρ} = 0.25. The new system state x̂ is formed by simply appending the new feature ŷ_{i} to the end of the state vector:

The state covariance after feature initialization is defined by:

In experiments using the undelayed initialization, it often happens that the inverse depth becomes negative after a Kalman update, because the observation noise predominates over the depth update; there are, however, simple ways to handle this problem. Moreover, when an initial metric reference is used in order to recover/set the scale of the map (very relevant for robotics applications), the initial fixed parameters (inverse depth and uncertainty) must be tuned in order to ensure convergence.
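As an illustration of one such simple remedy, the sketch below initializes a feature in the undelayed fashion and clamps a negative inverse depth after an update; the value ρ_{0} = 0.1 and the clamping constant are assumptions for this sketch, not values taken from the text:

```python
RHO_0 = 0.1   # assumed initial inverse depth (illustrative)
EPS = 1e-6    # small positive bound for the clamp (illustrative)

def init_undelayed(x_v, y_v, theta_v, z, rho0=RHO_0):
    """ID-Undelayed initialization sketch: the new feature ray starts at the
    current sensor position with azimuth theta_v + z and a fixed rho0."""
    return [x_v, y_v, theta_v + z, rho0]

def clamp_inverse_depth(feat, eps=EPS):
    """One simple remedy for a negative inverse depth after a Kalman update:
    clamp it to a small positive value (pushing the feature toward infinity)."""
    out = list(feat)
    if out[3] < eps:
        out[3] = eps
    return out

y = init_undelayed(1.0, 2.0, 0.3, z=0.1)
y[3] = -0.02                  # observation noise drove the update negative
y = clamp_inverse_depth(y)    # inverse depth restored to a tiny positive value
```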


The issues mentioned above suggest that the initial inverse depth and its associated uncertainty for new features could be estimated before they are added to the system state, instead of using a fixed initial depth and uncertainty. In [

For the ID-Delayed method, a new feature ŷ_{new} is initialized as follows.

When a feature is detected for the first time, a candidate point λ_{i} is stored as:

The values x_{1}, y_{1} and θ_{1} represent the current robot pose; σ_{1}^{x}, σ_{1}^{y} and σ_{1}^{θ} represent their associated variances, taken from the state covariance matrix P_{k}; and z_{1} is the first bearing measurement of the landmark. In subsequent instants, the parallax of the candidate is estimated and compared against a threshold α_{min}.

The parallax α is estimated using:

α = π − (β + γ)

where β is the angle defined at the first observation position and γ is the angle defined at the current position. The computation uses both the data stored in the candidate point λ_{i} (x_{1}, y_{1}, θ_{1}, z_{1} and the variances σ_{1}^{x}, σ_{1}^{y}, σ_{1}^{θ}) and the current sensor state (x_{v}, y_{v}, θ_{v}, σ_{v}^{x}, σ_{v}^{y}, σ_{v}^{θ}).

For each candidate point λ_{i}, every time a new bearing measurement z of the landmark is available, the angles β and γ are recomputed. The angle β is defined by the directional vector h_{1} and the base-line vector b_{1}:

β = cos^{−1}( (h_{1} · b_{1}) / (‖h_{1}‖ ‖b_{1}‖) )

where (h_{1} · b_{1}) is the dot product between h_{1} and b_{1}. The directional vector h_{1}, expressed in the absolute frame W, points from the first sensor location in the direction in which the landmark was observed for the first time, and is computed using the data stored in λ_{i} (denoting the bearing θ_{1} + z_{1}). The angle γ is defined by the directional vector h_{2} and the vector b_{2}, which represents the base-line in the opposite direction of the sensor trajectory:

γ = cos^{−1}( (h_{2} · b_{2}) / (‖h_{2}‖ ‖b_{2}‖) )

h_{2}, expressed in the absolute frame W, is computed in a similar way as h_{1}, but using the current sensor pose and the current measurement z of λ_{i}. b_{1} is the vector representing the base-line between the robot position (x_{1}, y_{1}) stored in λ_{i} and the current position (x_{v}, y_{v}); b_{2} is equal to b_{1} but points in the opposite direction. The base-line b is the magnitude of b_{1} (or b_{2}):

b = ‖b_{1}‖

If α > α_{min}, then λ_{i} is initialized as a new map feature ŷ_{i}. The threshold α_{min} can be established depending on the accuracy of the bearing sensor; depth uncertainty is reasonably well minimized once a few degrees of parallax have been produced.
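The β/γ construction above can be sketched numerically as follows; vector names mirror the text, while the example landmark position and poses are assumptions for illustration:

```python
import math

def parallax(p1, ray1, pv, ray2):
    """Parallax between two observations of the same landmark (sketch).

    p1: sensor position at the first observation; ray1: world-frame azimuth
    of the feature ray there (theta_1 + z_1); pv, ray2: same for the current
    pose. beta and gamma are the angles each (unit) ray makes with the
    base-line; the parallax is the angle left at the landmark:
    alpha = pi - (beta + gamma).
    """
    b1 = (pv[0] - p1[0], pv[1] - p1[1])        # base-line vector
    b2 = (-b1[0], -b1[1])                      # same base-line, opposite direction
    nb = math.hypot(*b1)
    h1 = (math.cos(ray1), math.sin(ray1))      # unit directional vectors
    h2 = (math.cos(ray2), math.sin(ray2))
    beta = math.acos((h1[0] * b1[0] + h1[1] * b1[1]) / nb)
    gamma = math.acos((h2[0] * b2[0] + h2[1] * b2[1]) / nb)
    return math.pi - (beta + gamma)

# Landmark at (1, 1): first seen from the origin (ray at 45 deg), then from
# (1, 0) (ray at 90 deg); the remaining angle at the landmark is 45 deg.
a = parallax((0.0, 0.0), math.radians(45), (1.0, 0.0), math.radians(90))
print(round(math.degrees(a), 6))  # -> 45.0
```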

For a new feature ŷ_{i}, the values of (x_{i}, y_{i}, θ_{i}, ρ_{i}) are computed from the data stored in λ_{i} and from the estimated parallax:

x_{i} = x_{1},  y_{i} = y_{1},  θ_{i} = θ_{1} + z_{1},  ρ_{i} = sin α / (b sin γ)

The variance σ_{ρ} of the new inverse depth ρ_{new} is derived from the diagonal error covariance matrix R_{i}:

For reasons of simplicity, R_{i} is assumed diagonal, composed of the measurement variance σ_{z} (for z_{1} and the current measurement z) and the variances σ_{1}^{x}, σ_{1}^{y}, σ_{1}^{θ} stored in λ_{i}; σ_{z} is constant and therefore is not stored previously in λ_{i}.

The new state covariance matrix, after initialization, is:

Note that, unlike the ID-Undelayed method, there is no implicit fixed initial uncertainty in depth σ_{ρ} (

In an Undelayed approach, a feature is added to the map the first time it is observed, with its depth modeled with a huge uncertainty. In that sense, this new feature does not provide any depth information. However, at this stage the benefit of the Undelayed approach is that features provide information about the sensor orientation from the beginning.

On the other hand, it can be useful to wait until the sensor movement produces some degrees of parallax (gathering depth information) in order to improve robustness, especially when an initial metric reference is used for recovering scale. Moreover, when cameras are used in real cluttered environments, the delay can be used to efficiently reject weak features, so that only the best features are initialized as new landmarks in the map.

This section presents a novel and robust method, called concurrent initialization, for adding new features in Bearing-Only SLAM systems.

When a feature is detected for the first time, it is immediately initialized in the map as a feature ŷ_{L(i)}, which is composed of the 3-dimensional state vector:

ŷ_{L(i)} = (x_{i}, y_{i}, θ_{i})^{T}

where x_{i} = x_{v}, y_{i} = y_{v} and θ_{i} = θ_{v} + z are taken from the current sensor pose and the first bearing measurement z. ŷ_{L(i)} defines a directional vector, expressed in the absolute frame W, which represents the direction of the landmark from the sensor when it was observed for the first time. The covariance matrix P is updated in the same manner as for a regular EKF feature initialization.

Parallel to the system state (represented by the state vector x̂), a state vector x̂_{can} is used to estimate (via an additional linear Kalman filter) the feature depth of each landmark ŷ_{L}. The state x̂_{can} is not directly correlated with the map.

Every time a new feature ŷ_{L(i)} is initialized in x̂, the state x̂_{can} is augmented as:
where λ_{i} is a 3-dimensional vector composed of:

For each λ_{i}, α_{i} is the estimated parallax, Δα_{i} is the rate of change in parallax, and ρ_{i} is the estimated inverse depth of the landmark.

The covariance matrix of x̂_{can}, P_{can}, is augmented simply by:

The three initial values of λ_{i} are set to zero, and the initial values of R_{c} have been heuristically determined as: R_{c} =

While the sensor moves through its environment, it can repeatedly observe a landmark ŷ_{L(i)}, generating at each iteration a new angle measurement that is used (through the state x̂_{can}) to infer the landmark depth. For each new measurement z_{i} of ŷ_{L(i)}, an iteration of the filter is executed.

The state transition model for each λ_{i} is:

A process noise w_{k} with covariance Q_{k} is assumed; in the experiments, values of Q_{k} = diag(10e^{−7}, 10e^{−9}) have been used.
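A sketch of this per-candidate linear filter is shown below; the transition matrix A and the direct observation of (α, ρ) are assumed forms consistent with the description above (a constant-velocity model in parallax plus a constant inverse depth), not the authors' exact equations, and all numeric values are illustrative:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],      # alpha   <- alpha + d_alpha
              [0.0, 1.0, 0.0],      # d_alpha <- d_alpha
              [0.0, 0.0, 1.0]])     # rho     <- rho
H = np.array([[1.0, 0.0, 0.0],      # z_alpha observes alpha
              [0.0, 0.0, 1.0]])     # z_rho observes rho
Q = np.diag([1e-7, 1e-7, 1e-9])     # process noise (illustrative values)

def kf_step(lam, P, z, R):
    """One predict/update cycle of the linear KF over lambda_i."""
    lam, P = A @ lam, A @ P @ A.T + Q           # predict
    S = H @ P @ H.T + R                         # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    lam = lam + K @ (z - H @ lam)               # update
    P = (np.eye(3) - K @ H) @ P
    return lam, P

lam, P = np.zeros(3), np.eye(3)
for _ in range(50):                             # repeated (z_alpha, z_rho) pairs
    lam, P = kf_step(lam, P, np.array([0.2, 0.5]), np.diag([1e-4, 1e-2]))
print(round(float(lam[2]), 2))  # -> 0.5 (inverse depth estimate converges)
```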

The measurement prediction model is directly obtained from the state. On the other hand, the measurements z_{can} used to update x̂_{can} are computed from: (i) the feature ŷ_{L(i)}, (ii) the sensor state x̂_{v}, and (iii) the current bearing measurement z_{i}.

The measurements of parallax z_{α} and inverse depth z_{ρ} are computed as in the ID-Delayed method, except that x_{1}, y_{1} and θ_{1} are now taken from ŷ_{L(i)}, so x_{1} = x_{i}, y_{1} = y_{i} and θ_{1} = θ_{i}.

The implicit uncertainties in the estimation of the measurements z_{can} are propagated, together with the sensor noise σ_{z}, in order to obtain the measurement noise covariance R_{can}:

R_{can} = ∇g P_{t} ∇g^{T}

where ∇g is the Jacobian of the function computing z_{can} with respect to its inputs. P_{t} is formed by:

All the components of P_{t}, except σ_{z} (the error variance of the bearing sensor) are taken directly from the covariance matrix P of the system state x̂. P_{x̂v} is the submatrix of P corresponding to the covariance of the sensor state x̂_{v}. P_{ŷL(i)} is the covariance of the feature ŷ_{L(i)}. P_{x̂vŷL(i)} and P_{ŷL(i)x̂v} are the correlations between x̂_{v} and ŷ_{L(i)}.

R_{can} is used in the Kalman update equations for estimating the innovation covariance matrix S_{i}.

Features expressed in the form ŷ_{L(i)} are very useful for estimating the sensor orientation θ_{v}, although they do not provide information about the sensor position (x_{v}, y_{v}). For close features ŷ_{L(i)}, a small sensor translation is enough to produce some parallax and thus to infer depth.

The state x̂_{can} encloses the parallax α_{i} and inverse depth ρ_{i} estimated for each feature ŷ_{L(i)}. The figure shows the evolution of the estimates α_{i} (upper plot) and ρ_{i} (lower plot) in x̂_{can} (taken from a simulation) for a distant and a close feature.

A minimum parallax threshold α_{min} is used to decide when a feature ŷ_{L(i)} is updated to the form ŷ_{i}. Distant features will not produce parallax and will therefore remain expressed as ŷ_{L(i)}, but they keep contributing to the estimation of the sensor orientation θ_{v}.

On the other hand, if α_{i} > α_{min}, then the feature is updated to its inverse depth form:

ŷ_{i} = (x_{i}, y_{i}, θ_{i}, ρ_{i})^{T}

where ρ_{i} is taken from x̂_{can}. The covariance matrix P is transformed by the corresponding Jacobian J:

P_{new} = J P J^{T}

with σ_{ρ}^{y}, the variance of the inverse depth estimation for ŷ_{L(i)}, taken from P_{can}.

When a feature ŷ_{L(i)} is updated as ŷ_{i}, then its corresponding values will be removed from the linear Kalman filter responsible for estimating the state x̂_{can}.

At any time, the map can include both kind of features ŷ_{L(i)} and ŷ_{i}. Thus each kind of feature has its own measurement prediction model:

- For features ŷ_{i}, the bearing measurement h_{i} is predicted using the inverse depth measurement model described above.

- Features ŷ_{L(i)} are assumed to be very far from the sensor, and therefore their corresponding bearing measurement h_{i} is predicted as if the landmark lay at infinity along its ray, h_{i} = θ_{i} − θ_{v}.

- For near features ŷ_{L(i)}, the above bearing measurement prediction is inaccurate. To account for this, the measurement noise variance σ_{z} for features ŷ_{L(i)} (in the Kalman update equations) is multiplied by a high value (values on the order of 10^{10} were used in experiments). An interesting issue, to be treated in further work, could be the dynamical estimation of this parameter.

Both kinds of measurement prediction models must be used together when several measurements z_{i} are processed in the same update. For example, if the sensor observes features ŷ_{L(2)} and ŷ_{3} simultaneously, then:

Several simulations have been executed in order to test the performance of the proposed method in relation to other approaches. Simulations are extremely helpful when different methods are compared with each other, because numerous variables inherent to real environments are obviated (e.g., data association) or become constants (

Values of a^{W}_{x} = 4 m/s^{2}, a^{W}_{y} = 4 m/s^{2} and α^{W}_{θ} = 2 rad/s^{2} were used. The only input sensor of the system is a noisy sensor capable of measuring the bearing of features, with a limited field of view of 110° (emulating a 2-DOF camera). A standard deviation of σ = 1° is considered for the sensor readings.

At the beginning of the sequence (plot a), new features are initialized as ŷ_{L(i)}, each defining a ray (in cyan color). Note that three feature points (in yellow color) have been previously added to the map as a priori known landmarks in order to define/set the scale of the world. Around step 100 (plot b), a feature ŷ_{L(i)} has been transformed into a ŷ_{i} feature (blue color). Also note that some uncertainty (especially in depth) remains at the moment of the transformation. By the last step, at 250 iterations (plot c), distant features remain parameterized as ŷ_{L(i)} but still contribute to the estimation of the sensor orientation.

In order to show the performance of the Concurrent method proposed in this article, a comparative study between the ID-Undelayed method [

The consistency of the filter is evaluated by means of the average normalized estimation error squared (NEES), where the NEES at step k is defined as:

ε_{k} = (x_{k} − x̂_{k})^{T} P_{k}^{−1} (x_{k} − x̂_{k})

Four different tests were carried out to compare the methods under diverse conditions. The tests vary the noise accelerations (a^{W}_{x}, a^{W}_{y} and α^{W}_{θ}) and the time between “frames” (Δt).

In the experiments, the same initial values ρ_{ini} and σ_{ρ}, and a parallax threshold α_{min} = 10°, were used for both the ID-Delayed and Concurrent methods. The NEES was estimated over the 3-dimensional robot pose. The average NEES for each method was estimated using N = 20 Monte Carlo runs. In the simulations the sensor was moved 3 meters every Δt.
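The consistency metric used here can be sketched as follows; the diagonal covariance and the numeric values are illustrative assumptions, not experimental data:

```python
def nees(x_true, x_est, p_diag):
    """Normalized estimation error squared for one run at one step:
    eps_k = e^T P^{-1} e, here with a diagonal pose covariance for brevity."""
    return sum((t - e) ** 2 / v for t, e, v in zip(x_true, x_est, p_diag))

def average_nees(runs):
    """Average NEES over N Monte Carlo runs; for a consistent filter it stays
    near the state dimension (3 for the planar robot pose)."""
    return sum(nees(t, e, p) for t, e, p in runs) / len(runs)

P = (0.04, 0.04, 0.01)                           # illustrative pose variances
runs = [((1.0, 2.0, 0.1), (1.2, 2.0, 0.1), P),   # NEES = 1.0
        ((1.0, 2.0, 0.1), (1.0, 1.8, 0.2), P)]   # NEES = 2.0
print(round(average_nees(runs), 6))  # -> 1.5
```

When the average NEES rises above its chi-square consistency bound, the filter has become optimistic.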

MATLAB code was run using a 1.73 GHz Pentium M laptop.

For some runs of the algorithms, filter divergence could occur.

The figure compares the average NEES ε̄_{k} obtained with each method.

As would be expected, all the methods become optimistic after a certain time [ ], when the average NEES ε̄_{k} exceeds its consistency bound. Nevertheless, the Concurrent method maintains consistency for longer, presumably because depth information is gathered while features are still parameterized as ŷ_{L(i)} in the state x̂. On the other hand, the Concurrent method is less sensitive to the parameter Δt.

This work proposes a novel and robust approach for initializing new features in SLAM systems based on bearing sensors. First, an overview of the problem is given and the most relevant related work is presented. Most of the methods in the literature can be classified into two categories: Undelayed and Delayed methods. In that sense, an analysis of two representative methods of this taxonomy is also presented. Undelayed methods provide orientation information from the moment a feature is first detected; Delayed methods, on the other hand, wait until some depth information has been gathered, improving convergence.

The approach proposed in this article, the concurrent initialization method, takes the best of both the Undelayed and Delayed approaches. The key is to use two kinds of feature representations concurrently, covering both the undelayed and delayed stages of the estimation. The simulation results, based on the average NEES test, showed that the Concurrent method maintains filter consistency satisfactorily. Moreover, observing the percentage of convergence, the Concurrent method also appears to be robust.

It is important to mention that the complexity of the Concurrent method is higher than that of the other methods (

This research was conducted in the Automatic Control Department, Universitat Politecnica de Catalunya (UPC) and supported by the Spanish Ministry of Science and Innovation, UbROB Project, CICYT number DPI2007-61452.

Inverse depth parameterization.

Simulation of a point reconstruction from observations with different parallax.

Some issues of the ID-Undelayed initialization.

ID-Delayed parameterization.

Concurrent initialization process.

Parallax and depth estimations for a distant and a close feature.

Sequence illustrating the concurrent initialization process: (a) beginning of initialization; (b) initialization process at step 100; (c) initialization process at step 250.

SLAM simulation using the concurrent initialization method.

Comparison of the average NEES for ID-Undelayed, ID-Delayed and Concurrent methods.

Summary of methods.

Work | Type | Technique | Filter
---|---|---|---
Deans [ ] | Delayed | Bundle adjustment | EKF
Strelow [ ] | Delayed | Triangulation | IEKF
Bailey [ ] | Delayed | Triangulation | EKF
Davison [ ] | Delayed | Multi-Hypotheses | EKF
Kwok (a) [ ] | Delayed | Multi-Hypotheses | PF
Kwok (b) [ ] | Undelayed | Multi-Hypotheses | PF
Sola [ ] | Undelayed | Multi-Hypotheses | EKF
Lemaire [ ] | Delayed | Multi-Hypotheses | EKF
Jensfelt [ ] | Delayed | Triangulation | EKF
Eade [ ] | Delayed | Single Hypothesis | FastSLAM
Montiel [ ] | Undelayed | Single Hypothesis | EKF
Munguía [ ] | Delayed | Triang. - S. Hypothesis | EKF
Concurrent (this work) | Delayed-Undelayed | KF-Triang. - S. Hypothesis | EKF

Setup of the tests.

Test | a^{W}_{x} (m/s^{2}) | a^{W}_{y} (m/s^{2}) | α^{W}_{θ} (rad/s^{2}) | Δt (s)
---|---|---|---|---
1 | 4 | 4 | 2 | 1/30
2 | 4 | 4 | 2 | 1/120
3 | 6 | 6 | 3 | 1/30
4 | 6 | 6 | 3 | 1/120

Execution time & Failed attempts.

Method | Time (Δt = 1/30) | Time (Δt = 1/120) | Fails (Test 1) | Fails (Test 2) | Fails (Test 3) | Fails (Test 4)
---|---|---|---|---|---|---
ID-Undelayed | 76s | 302s | 2 | 1 | 0 | 2
ID-Delayed | 60s | 235s | 9 | 2 | 11 | 4
Concurrent | 94s* | 376s* | 0 | 0 | 0 | 0