3.1. SLAM Solution with Extended Kalman Filter (EKF-SLAM)
Autonomous navigation is among the main areas of research in the mobile robotics field and often requires a SLAM solution. SLAM algorithms allow a robot to map its environment while concurrently localizing itself within that environment. The trajectory of the robot and the positions of the landmarks in the map are estimated without the need for a priori knowledge.
By considering a landmark-based description of the environment, SLAM can be treated as a state estimation problem for an uncertain dynamic system, based on noisy measurements. Consider a robot (here, the NAO robot) moving in an unknown environment, observing a number of landmarks through its embedded sensors (
Figure 1).
The robot requires a sequence of control signals {u_1, u_2, …, u_t} to move from pose x_{t−1} to x_t using the IMU (inertial measurement unit), which has some uncertainty in its measurements, and the robot takes observations using its external sensors, e.g., lasers. The robot pose and the landmark positions are estimated concurrently. The goal in every iteration is to find the posterior, i.e., the estimated robot position (or the whole trajectory) together with all landmark positions, while reducing the motion and observation errors. The environment map, containing a list of objects, can be represented in a single vector.
EKF-SLAM was the first algorithm developed by Smith et al. to solve the SLAM problem, and it is still one of the most influential and widely used SLAM solutions [34]. It solves the online SLAM problem where the map is feature-based. The EKF essentially linearizes the non-linear functions around the current state, which is accomplished by a first-order Taylor expansion. After this linearization, a classical Kalman filter is employed to estimate the state. EKF-SLAM estimates the state from a series of noisy measurements (movements and observations), where the noise distribution is assumed Gaussian. The probability density function (PDF) of the estimated state is Gaussian as well:

p(x_t) = N(x_t; μ_t, Σ_t)
where N(x_t; μ_t, Σ_t) is a Gaussian probability density with mean μ_t and covariance matrix Σ_t. Therefore, for every iteration of the EKF, the uncertainty of the state x_t is represented through a column vector μ_t of size n and a covariance matrix Σ_t of size n × n, where n is the dimension of the state. In EKF-SLAM, the probability density of the state transition p(x_t | x_{t−1}, u_t) must be Gaussian; this means that, given the state x_{t−1} at time t − 1 and the movement u_t at time t, the transition of the state can be written as [35]:

x_t = g(u_t, x_{t−1}) + ε_t
where ε_t is an additive Gaussian noise with zero mean and covariance matrix R_t that models the uncertainty in the state transition, i.e., p(ε_t) ∼ N(0, R_t). Since the motion model depends only on the previous robot pose and on the current movement u_t, the map vector m can be omitted from the equation, i.e.:

p(x_t | x_{t−1}, u_t, m) = p(x_t | x_{t−1}, u_t)
The function g does not modify the map. The probability density function of the observation z_t must be Gaussian as well, that is:

z_t = h(x_t, j) + δ_t

where h is a function that maps the current state into an expected observation ẑ_t given the landmark m_j associated with z_t, and δ_t is an additive Gaussian noise with zero mean and covariance matrix Q_t that models the observation noise, i.e., p(δ_t) ∼ N(0, Q_t).
In the 2D case, the state x_t at time t can be represented by the following column vector:

x_t = (x, y, θ, m_{1,x}, m_{1,y}, s_1, …, m_{N,x}, m_{N,y}, s_N)^T

where (x, y, θ) denotes the robot's pose, (m_{i,x}, m_{i,y}) are the coordinates of the i-th landmark and s_i is its distinctive signature, i = 1, …, N. The state vector contains only the landmark positions (m_{i,x}, m_{i,y}) and the robot pose; SLAM alone is not expected to convey more information about the landmarks' description in the environment.
In the extended Kalman filter, the g and h functions are linearized using a first-order Taylor expansion around the current state of the system in order to apply the Kalman equations. In the case of the state transition function g, we can write:

g(u_t, x_{t−1}) ≈ g(u_t, μ_{t−1}) + G_t (x_{t−1} − μ_{t−1})

where G_t is the n × n Jacobian matrix of the function g (n is the dimension of the state). The Jacobian usually depends on u_t and μ_{t−1}:

G_t = ∂g(u_t, μ_{t−1}) / ∂x_{t−1}
The Kalman filter is completed in two steps: prediction and correction [36,37]. In the SLAM problem, during the prediction step the state at time t is updated according to the motion model:

μ̄_t = g(u_t, μ_{t−1})
Σ̄_t = G_t Σ_{t−1} G_t^T + R_t

where μ̄_t and Σ̄_t are the predicted mean and covariance of the state at time t, respectively. During the correction step, for every observation z_t^i associated with the landmark j, an expected observation ẑ_t^i = h(μ̄_t, j) is computed together with a corresponding Kalman gain (11) that specifies the degree to which the incoming observation corrects the current state estimate (Equations (12) and (13)):

K_t = Σ̄_t H_t^T (H_t Σ̄_t H_t^T + Q_t)^{−1}
μ_t = μ̄_t + K_t (z_t^i − ẑ_t^i)
Σ_t = (I − K_t H_t) Σ̄_t

where H_t is the Jacobian of the function h:

H_t = ∂h(μ̄_t, j) / ∂x_t
The standard formulation of the EKF-SLAM solution is not robust to incorrect association of observations to landmarks: an accurate data association is then desirable [
38].
In this work, the noise in the motion model is modeled by white Gaussian noise with zero mean and covariance N(0, R_t), with R_t = diag[0.015, 0.015, 0.0001], and the observation noise by N(0, Q_t), with Q_t = diag[0.008, 0.001].
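As a concrete illustration of the prediction and correction equations above, the following Python sketch (our own simplified notation, not the implementation used in this work) runs one EKF-SLAM cycle for a planar robot and a single range-bearing landmark, using the motion and observation noise covariances quoted above.

```python
import numpy as np

# State: [x, y, theta, m_x, m_y]; R_t and Q_t are the values quoted in the text.
R_t = np.diag([0.015, 0.015, 0.0001])   # motion noise (robot part of the state)
Q_t = np.diag([0.008, 0.001])           # observation noise (range, bearing)

def predict(mu, Sigma, v, w):
    """Prediction step: mu_bar = g(u, mu), Sigma_bar = G Sigma G^T + R."""
    x, y, th = mu[:3]
    mu_bar = mu.copy()
    mu_bar[0] = x + v * np.cos(th)
    mu_bar[1] = y + v * np.sin(th)
    mu_bar[2] = th + w
    G = np.eye(5)                         # Jacobian of g; landmark part is identity
    G[0, 2] = -v * np.sin(th)
    G[1, 2] =  v * np.cos(th)
    R_full = np.zeros((5, 5)); R_full[:3, :3] = R_t
    return mu_bar, G @ Sigma @ G.T + R_full

def correct(mu, Sigma, z):
    """Correction step with a range-bearing observation of the landmark."""
    dx, dy = mu[3] - mu[0], mu[4] - mu[1]
    q = dx**2 + dy**2
    z_hat = np.array([np.sqrt(q), np.arctan2(dy, dx) - mu[2]])  # expected observation
    sq = np.sqrt(q)
    H = np.array([[-dx/sq, -dy/sq, 0, dx/sq, dy/sq],
                  [ dy/q,  -dx/q, -1, -dy/q, dx/q]])
    K = Sigma @ H.T @ np.linalg.inv(H @ Sigma @ H.T + Q_t)  # Kalman gain (11)
    mu_new = mu + K @ (z - z_hat)                            # mean update (12)
    Sigma_new = (np.eye(5) - K @ H) @ Sigma                  # covariance update (13)
    return mu_new, Sigma_new

mu = np.array([0.0, 0.0, 0.0, 2.0, 1.0])
Sigma = np.eye(5) * 0.1
mu, Sigma = predict(mu, Sigma, v=0.5, w=0.0)
mu, Sigma = correct(mu, Sigma, z=np.array([np.hypot(1.5, 1.0), np.arctan2(1.0, 1.5)]))
```

Each correction contracts the covariance, reflecting how the incoming observation tightens the state estimate.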
3.2. Ellipsoidal Set Membership Filter Method for SLAM (Ellipsoidal SLAM)
The approach here is to cast the nonlinear dynamics in a way that is suitable for implementation within the linear set-membership filter framework. Specifically, the nonlinear dynamics are linearized about the current estimate in a manner similar to the EKF. The remaining terms are then bounded and incorporated into the algorithm as additions to the process or sensor noise bounds [
39]. This allows the solution to be guaranteed for nonlinear systems so long as the bound on the nonlinear term is guaranteed. The proposed method is compatible with many current robust control methods that require hard bounds, as well as with planning algorithms that require guaranteed uncertainty information. It is also computationally efficient for online recursive implementation. The new algorithm is termed the extended set membership filter (ESMF) [
11].
Figure 2 shows how the ESMF works.
Consider a general discrete nonlinear system described as:

x_{k+1} = f(x_k, w_k)

with its nonlinear output:

y_k = h(x_k, v_k)

where x_k ∈ R^n is the system state and y_k ∈ R^m is the measurement output, f and h are both general nonlinear functions, and the initial state x_0 is known to be bounded by an ellipsoid given as:

x_0 ∈ E(x̂_0, P_0) = {x : (x − x̂_0)^T P_0^{−1} (x − x̂_0) ≤ 1}

where x̂_0 is the center of the ellipsoid. w_k ∈ R^n and v_k ∈ R^m are the process and measurement noise, respectively, which are bounded at each time step k and satisfy the following inequalities:

w_k^T Q_k^{−1} w_k ≤ 1,  v_k^T R_k^{−1} v_k ≤ 1

where Q_k and R_k are symmetric and positive definite matrices.
At time step k, the goal is to characterize a set of states, represented by a minimized ellipsoid, that are consistent with the available measurements and the a priori bound constraints; the true state is guaranteed to be contained in the resultant compact ellipsoid set.
Note that no assumptions on the structure of the noise are made except the bounds; hence, many types of uncertainties are included within this framework including Gaussian and non-Gaussian uncertainties.
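The bounded-noise assumption can be made concrete with a small helper (an illustrative sketch with assumed bound matrices, not part of the filter implementation): a vector w is an admissible noise sample exactly when it satisfies the ellipsoidal inequality w^T Q^{−1} w ≤ 1, regardless of its underlying distribution.

```python
import numpy as np

def in_ellipsoid(x, c, P):
    """True if x lies in the ellipsoid E(c, P) = {x : (x-c)^T P^{-1} (x-c) <= 1}."""
    d = np.asarray(x, dtype=float) - np.asarray(c, dtype=float)
    return float(d @ np.linalg.solve(P, d)) <= 1.0

# Bounded (not necessarily Gaussian) noise: any w with w^T Q^{-1} w <= 1 is admissible.
Q = np.diag([0.04, 0.01])                      # example bound matrix (assumed values)
print(in_ellipsoid([0.1, 0.05], [0, 0], Q))    # inside the noise bound -> True
print(in_ellipsoid([0.3, 0.0], [0, 0], Q))     # outside the noise bound -> False
```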
Assuming that f and h are continuously differentiable, for all estimated values x̂_k, linearizing Equation (15) around the current state estimate x̂_k at time step k using a Taylor expansion yields:

x_{k+1} = f(x̂_k, w_k) + A_k (x_k − x̂_k) + H.O.T., with A_k = ∂f/∂x evaluated at x̂_k

The extended set membership filter (ESMF) method considers the higher-order terms (H.O.T., i.e., the remainder, which is ignored in the traditional EKF) as a part of the process noise, and thus the linearized Equation (17) can be rewritten with the remainder absorbed into the noise term, where the remainder is the Lagrange form of the truncation error, which involves the next-order derivative of f. The remainder can take on any value over the interval for which f is defined. Thus, it can be bounded by simply defining the state interval and evaluating the derivative over it using interval mathematics. Using interval analysis, the Lagrange remainder term can be expressed as:

R(X_k) = ½ (X_k − x̂_k)^T ∇²f(X_k) (X_k − x̂_k)

where X_k is the state interval bound in which x_k is defined.
For a one-state (i.e., n = 1) linearization case with a first-order approximation (i.e., nr = 1), the remainder reduces to the scalar form of the state function in (9).
The remainder, or H.O.T., can be bounded in several ways. One approach is to choose the process and sensor noise ellipsoids large enough to ensure that they bound both the original noise and the H.O.T. The approach here is to bound the H.O.T. using interval mathematics at each time step. The interval bound can then be over-bounded by an ellipsoid and combined with the original process noise bound. This approach is more amenable to online implementation than similar approaches. Compared to the EKF, the ESMF thus bounds the linearization error. In addition, the only restriction on the ESMF is that the Hessian and Jacobian must be continuous over the set of states. The procedure for bounding the general remainder is shown pictorially in Figure 3 for a two-state case.
It is now possible to recursively estimate state ellipsoid sets of any nonlinear system for which the Jacobian and the Hessian are well defined.
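For the scalar case, the interval bounding of the Lagrange remainder can be sketched as follows (our own illustration; in general the bound on the second derivative would itself be obtained with interval arithmetic rather than supplied analytically):

```python
import numpy as np

def remainder_bound(fpp_bound, x_hat, interval):
    """Bound |R| = |f''(xi)| (x - x_hat)^2 / 2 over all xi, x in the interval.

    fpp_bound: maximum of |f''| over the interval (here supplied analytically);
    interval: (lo, hi) range in which the state is known to lie.
    """
    lo, hi = interval
    dev = max(abs(lo - x_hat), abs(hi - x_hat))  # worst-case |x - x_hat|
    return 0.5 * fpp_bound * dev**2

# Example: f(x) = sin(x), so |f''| = |sin(x)| <= 1 everywhere.
b = remainder_bound(1.0, x_hat=0.5, interval=(0.3, 0.8))
# The true linearization error at any x in [0.3, 0.8] never exceeds b.
for x in np.linspace(0.3, 0.8, 11):
    err = abs(np.sin(x) - (np.sin(0.5) + np.cos(0.5) * (x - 0.5)))
    assert err <= b
```

The resulting scalar bound plays the role of the extra "process noise" that the ESMF adds on top of the original noise ellipsoid.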
An adaptive ellipsoidal SLAM algorithm can be summarized as follows (Algorithm 1):
Algorithm 1. Adaptive Ellipsoidal SLAM
Start
Initialization - k = 0
Get Observation - get_observations; looking for NAO markers
while not_stop
  Prediction Step (4) - Check for a safe distance to move by sonar. Move command
    No safe distance - turn 180 degrees
    Predict the ellipsoid center and shape matrix from the motion model
  Get Observation (5) - Looking for NAO markers by turning the head in 15-degree steps
    If NAO finds NAO markers, transform them to the robot frame
  Data Association
  Correction Step - Update the state ellipsoid with the bounded observation
  Map Step - Add new NAO marks to the map
    No NAO mark - turn NAO by 180 degrees and go to step (5)
  Check if the iteration number is achieved
    No - go to step (4)
  k = k + 1
End
where the filter parameters are chosen online to minimize the size of the state ellipsoid.
The obtained EKF/ellipsoidal SLAM results are still not very satisfactory, presumably due to the range measurement bias and other imperfections such as false positives in the data association and landmark description. In SLAM, the state vector contains only landmark positions and robot poses; SLAM alone is not expected to convey more information about landmark details. The map representation does not include any information about the landmark description in the environment, and this leads to false positives in the data association.
Accurate data association and landmark identification are very important in order to avoid convergence towards a wrong SLAM solution. Also, the computational cost of EKF-SLAM is quadratic with respect to the number of landmarks, i.e., O(N²), as a result of maintaining a full covariance matrix of the state estimate. Moreover, the correction step of the EKF touches every single element of the covariance matrix, with a computational cost of O(N²), which makes the problem intractable for cases with hundreds of landmarks.
To improve the EKF/ellipsoidal SLAM performance, including mapping, in terms of reducing the computational effort, improving landmark identification and recognition, simplifying the data association problem and improving the trajectory control of the algorithm, the robotic augmented reality (RAR) technique is introduced in this project. An overview of augmented reality is included in the next section. Augmented reality is a recent technology which enables users to obtain additional preloaded information from the observation of a particular object.
3.3. Robotic Augmented Reality (RAR)
Researchers in many areas have been inspired by augmented reality (AR), including the medical, military, and entertainment fields. However, only a few applications are found in robotics research, especially in the robotic navigation area. In this work, augmented reality technology has been integrated with EKF/ellipsoidal SLAM algorithms to improve their performance in real time. A review of RAR is introduced in the next section.
The fundamental idea of augmented reality (AR) is to mix the view of the real environment with virtual or additional computer-generated graphic content to improve our perception of the surroundings. An example of AR application for mobile devices to obtain information in the environment is shown in
Figure 4.
Augmented reality is a subset of the more general area of mixed reality (MR) [
40], which refers to a multi-axis spectrum of areas that cover virtual reality (VR), telepresence, augmented reality (AR) and other related technologies [
41].
There are two main classes of augmented reality: marker-based and marker-less. In a marker-based augmented reality application, the images to be recognized are provided beforehand, which significantly simplifies the process of image recognition. Adding a few small, easy-to-understand bits of information to a real scene can help guide the robot to easily perform many tasks.
Augmented reality can be used to improve the development of robot and robot applications, such as robot navigation and human and robot interactions (HRIs). More and more studies have been conducted on AR for humanoid robots and some of the applications of AR to humanoid robots have been in path planning [
42,
43], where AR is used for drawing guide paths to provide a simple and intuitive method for interactively directing the navigation of a humanoid robot through complex terrain.
The objective of augmented reality technology is to enhance the information acquisition of practical world by augmenting with computer-generated sensory inputs. We have integrated augmented reality with the EKF/ellipsoidal SLAM algorithms to improve SLAM performance and accuracy. SLAM performance is improved by presenting preloaded information to the robot through different AR markers placed in the robot’s indoor environment, and this information will help in the mapping process, simplifying the data association problem and reducing computational costs.
Figure 5 demonstrates the outline of the navigation strategy. The same AR navigation module will be used as a part of a simultaneous localization and mapping system, which is developed for the same humanoid robot platform [
44].
The new SLAM state vector can be represented by the following column vector, including the AR marker IDs:

x_t = (x, y, θ, m_{1,x}, m_{1,y}, ID_1, …, m_{N,x}, m_{N,y}, ID_N)^T
This new state vector contains more information about the landmark IDs, which solves the landmark identification problem and simplifies the data association problem, because the corresponding landmark can be easily matched once the marker ID is re-observed, without having to compute the Mahalanobis distance frequently. This will optimize path planning and reduce computational costs.
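The simplification can be sketched as follows (hypothetical helper names, not the project's actual code): an observation carrying a marker ID resolves with a dictionary lookup, while an unlabeled observation falls back to nearest-neighbour Mahalanobis gating over all landmarks.

```python
import numpy as np

landmark_index = {}   # marker ID -> position of the landmark in the state vector

def associate(obs_id, z, landmarks, S_inv, gate=9.0):
    """Associate an observation with a landmark.

    With an AR marker ID the match is a dictionary lookup; without one we
    fall back to nearest-neighbour Mahalanobis gating (threshold `gate`).
    """
    if obs_id is not None and obs_id in landmark_index:
        return landmark_index[obs_id]            # O(1) match, no distance test
    best, best_d = None, gate
    for j, m in enumerate(landmarks):
        v = z - m
        d = float(v @ S_inv @ v)                 # squared Mahalanobis distance
        if d < best_d:
            best, best_d = j, d
    return best                                   # None if nothing passes the gate

landmark_index[7] = 0                             # marker ID 7 is landmark 0
landmarks = [np.array([2.0, 1.0]), np.array([5.0, 3.0])]
S_inv = np.eye(2)
assert associate(7, np.array([2.1, 1.0]), landmarks, S_inv) == 0
assert associate(None, np.array([4.9, 3.1]), landmarks, S_inv) == 1
```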
3.3.1. Robotic Augmented Reality Implementation on the NAO Robot
The augmented reality and landmark-based SLAM approach is used for humanoid (NAO) robot applications in indoor environments.
In order to explain the details of the proposed algorithm, we start this section with a description of the NAO humanoid robot and how the NAO marks are employed. This background information is required before introducing distance estimation for the NAO, which essentially provides a monocular vision system. We will then explain the NAO odometry problem and go through the derivation of the SLAM for this system.
Platform and Software (NAO Humanoid Robot)
NAO is a small humanoid robot, developed by the French company Aldebaran, designed to be versatile and affordable for educational research at universities and research centers, and it is also a good candidate for social robot applications such as museums, dance performances, etc. [
45].
The main version of the NAO is the H25, which has 25 degrees of freedom. The NAO is 58 cm tall and weighs about 5 kg. The robot also has an Intel ATOM Z530 1.6 GHz processor with 1 GB of RAM, as well as an Ethernet port, a Wi-Fi connection and a USB port. The Aldebaran documentation [45] provides a summary of the NAO robot and its main features, including movement, sensing, communication and acting. Gouaillier [46] describes the details of the mechatronic design of the NAO.
Figure 6 shows the NAO robot’s main sensors.
In order to perceive its environment, the NAO is equipped with several sensors, such as two cameras, four microphones, two sonars, two bumpers, and an accelerometer, to name a few. The two video cameras are located on its head; one is in the robot's forehead (top camera) to view straight ahead of the robot, and the other one (lower camera) is located at its mouth to view the ground in front of the robot. In the NAO H25, which is the model used in our project, the video cameras provide up to 1280 × 960 resolution at 30 frames per second (fps). Cameras are crucial sensors for autonomous navigation.
The NAO cameras and their field of view and their positioning are described in
Figure 7.
It can be seen that there is no considerable overlap between the two video cameras’ fields of view. Therefore, the cameras do not provide stereo vision. This work utilized the NAO straight ahead camera as a monocular vision system (top camera) for perception of the world.
Each camera is an Aptina MT9M114 with a maximum resolution of 1280 × 960 at an associated frequency of 5 Hz, and a maximum frequency of 30 Hz at the 640 × 480 Video Graphics Array (VGA) resolution [45]. The resolution of the image ranges between QQVGA (80 × 60) and 4VGA (1280 × 960). The focal length of the camera is fixed, and its aperture angle is 60.9° horizontally and 47.6° vertically.
The acquisition system of the camera is a rolling shutter, that is to say, the image is acquired line by line. This acquisition mode involves vertical deformation, which affects the image when the robot moves, since the speed of movement of the head is greater than the acquisition speed of the columns. Settings such as white balance, brightness, or exposure time are adjustable. In practice, when the robot is moving, the images are subject to deformation because of strong kinetics and because the head is not stabilized except when the robot has both feet on the ground and is not making a step.
NAO has a stable walking algorithm, described in [45]. A simple control system maintains its equilibrium (notably using the inertial unit and the Magnetic Rotary Encoder (MRE) sensors), but there is no feedback on the actual movement of the robot; the position control is thus open loop.
On the software side, the NAO robot is equipped with the NAOqi operating system by the Softbank Robotics Community. The NAOqi OS [
45] is the operating system of the robot, an embedded GNU/Linux distribution based on Gentoo Linux. It was developed by Aldebaran to specifically fit the needs of the robot by providing and running programs and libraries that together transform it into a social robot. The main software of the NAOqi OS is NAOqi, which is responsible for running behaviors on the robot but can also be used to test code on a simulated robot on a PC. The NAOqi framework currently supports five programming languages: C++, Python, URBI, Java and Matlab. It has also been tested in the Microsoft .NET framework for the C#, F# and Visual Basic programming languages. Amongst the specified programming languages, Python and C++ are the most developed for the NAO, with extensive libraries. This work was mainly developed in the Python language on the Windows platform. The implementation methods were typical software engineering approaches, which included planning, developing, debugging and testing.
NAO Markers
The NAO's vision recognition engine can be used to recognize different objects. Some of the built-in vision functions and modules provided by Aldebaran are used, such as redBallDetection to recognize balls and faceDetection to recognize faces. ALLandMarkDetection is a vision module with which the robot recognizes special landmarks with specific patterns on them. We refer to those landmarks as "NAO marks". One can use the ALLandMarkDetection module for various applications. For example, markers can be placed at different locations in the robot's field of action. Depending on which landmark the robot detects, information about the robot's location can be inferred. Coupled with other sensory information, it is possible to build a rather robust localization module. This information will help with the estimation of the camera pose in the NAO space and the camera pose in the world space. Let us first discuss how NAO marks are located in the robot's camera coordinates [
45].
Aldebaran robotics provides a total of 29 markers. Each of these markers has a unique shape that distinguishes it from the others. As shown in
Figure 8, a NAO marker is a black circle with a white pattern inside it. Encoded in the shape of the white pattern is the unique identity of the marker. When the NAO camera detects a NAO mark, it acquires useful information from it, as we will see in the next section [
14].
As stated in the Aldebaran documents [45], the vision recognition module offers a simple approach to perceiving the environment, but it presents several challenges. The first is that the total number of NAO marks is limited, so if an application requires more of them, markers need to be reused. The second is that these NAO marks require sufficient illumination, as correct decoding depends on contrast differences in the image. Proper illumination must be in the range of 100 lux to 500 lux; lighting conditions below this range often result in misidentification of the marker, or in some cases no detection at all. The third is that the tilt of the marker's plane relative to the camera must be within ±60 degrees for detectability. For optimal performance, the NAO mark must also be in the direct line of sight of the robot. The final limitation is the size of the marker within the image and the range of detection by the camera. The minimum size is approximately 0.035 rad, which corresponds to 14 pixels in a Quarter Video Graphics Array (QVGA) image, while the maximum size is approximately 0.40 rad, or 160 pixels, within the QVGA image. At these marker image size ranges, with a real marker size of 108.54 mm, the distance range for detection is from 30 cm to about 200 cm [
A NAO mark can be correctly detected only if its plane is inclined by no more than 60° relative to the camera. On the other hand, an advantage of the NAO mark is that in-plane rotation has no influence on detection.
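The detection constraints listed above can be collected into a small check (an illustrative helper of our own, not part of the NAOqi API):

```python
def nao_mark_detectable(lux, tilt_deg, angular_size_rad):
    """Check the detection constraints quoted from the Aldebaran documentation:
    illumination 100-500 lux, marker-plane tilt within +/-60 degrees, and
    apparent angular size between ~0.035 and ~0.40 rad (QVGA image)."""
    return (100 <= lux <= 500
            and abs(tilt_deg) <= 60
            and 0.035 <= angular_size_rad <= 0.40)

assert nao_mark_detectable(300, 20, 0.1)          # well inside every range
assert not nao_mark_detectable(50, 20, 0.1)       # too dark
assert not nao_mark_detectable(300, 75, 0.1)      # marker tilted too far
assert not nao_mark_detectable(300, 20, 0.5)      # marker too close / too large
```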
It should also be stated that the NAO can recognize Quick Response (QR) codes; a QR code is a matrix bar code that can store a large amount of useful information. The NAO can be programmed to detect a QR code sign; the robot can then move closer to read it and finally work out its meaning. The AR markers (or NAO markers) do not have to be placed in any particular location, as the NAO is programmed to recognize them and find where they are as it walks in its environment, localizing them with respect to its pose as well as the global frame.
Our real-time experiments show that QR codes have more or less the same issues as the NAO markers. Firstly, the image resolution must be set according to the distance between the camera and the code itself. Secondly, the maximum distance at which the NAO is able to detect a QR code is about 2 m, and this distance decreases when the QR code is not in the straight view of the camera. Our experiments show, however, that NAO marks work very well as landmarks and produce better performance. By improving the NAO mark size and color, as shown in the figure below, the NAO is able to detect a mark up to 3.0–3.4 m away from any camera angle.
Figure 9 shows the NAO mark and QR codes detected by the NAO, together with their distances.
The coordinates of the marker acquired by the robot's camera are computed using the shape information of the marker. First, we need to know the physical dimension of the detected NAO mark.
The marker size of the printed marker used in our study shown in
Figure 10 is 195 mm, with some information augmented to it. With this, we can calculate the distance of the marker from the robot, as explained below.
In Figure 11, the variable S, the distance of the marker from the camera, can be calculated using the angular size (a) and the marker size (m), as shown in the following equation. The parameters sizeX and sizeY, for which identical values are given in the image of detected markers, are the dimensions from the center to the edge on the horizontal and vertical image axes, respectively.
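Assuming the standard pinhole/angular-size relation, the distance S follows from the angular size a and the physical marker size m as S = (m/2)/tan(a/2). The following sketch (illustrative code of our own, not the NAOqi implementation) uses the 195 mm marker from our study:

```python
import math

MARKER_SIZE_M = 0.195   # printed marker size from the text (195 mm)

def marker_distance(angular_size_rad, marker_size=MARKER_SIZE_M):
    """Distance S to a marker of physical size m subtending the full angle a:
    S = (m / 2) / tan(a / 2), the standard pinhole/angular-size relation."""
    return (marker_size / 2.0) / math.tan(angular_size_rad / 2.0)

# A 195 mm marker subtending 0.1 rad is roughly 1.95 m from the camera.
print(round(marker_distance(0.1), 2))
```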
In Equation (21), the angles alpha and beta are used to obtain the transformation from the robot to the landmark frame. To obtain the coordinate of the marker in the robot frame, a transformation from the camera frame to the robot frame is required. This transformation is performed using methods defined in the transform class of the implementation. The computation for this transformation is presented in Equation (22):
The result is a transformation matrix which includes the (x, y, z) coordinates of the landmark in the robot frame.
Figure 12 explains the transformation between the global frame of the world and local frames of the NAO robot and landmark (NAO mark).
In general, a 2D transformation between two frames is a combination of rotation and translation, written as:

[x_global; y_global] = [cos θ  −sin θ; sin θ  cos θ] [x_robot; y_robot] + [x_r; y_r]

where "robot" indicates the marker coordinates in the robot frame (x_robot, y_robot), and x_r and y_r give the robot's location in the global frame. The angle θ denotes the orientation of the robot in the global frame. The equation can be rewritten as:

x_global = x_robot cos θ − y_robot sin θ + x_r
y_global = x_robot sin θ + y_robot cos θ + y_r
It is now possible to obtain the global coordinates of any NAO mark in the environment.
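The rotation-plus-translation transformation above can be written directly as a short function (an illustrative sketch of our own):

```python
import math

def robot_to_global(x_m, y_m, robot_pose):
    """Transform marker coordinates (x_m, y_m) from the robot frame to the
    global frame given robot_pose = (x_r, y_r, theta): rotation then translation."""
    x_r, y_r, th = robot_pose
    gx = math.cos(th) * x_m - math.sin(th) * y_m + x_r
    gy = math.sin(th) * x_m + math.cos(th) * y_m + y_r
    return gx, gy

# Robot at (1, 2) facing 90 degrees; a marker 1 m straight ahead in the
# robot frame (x forward) lands at (1, 3) in the global frame.
gx, gy = robot_to_global(1.0, 0.0, (1.0, 2.0, math.pi / 2))
```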
NAO Robot Odometry Problem
Bipedal walking robots present specific challenges depending on the environment in which they operate; they rarely achieve the desired trajectory because of the deviation generated during walking [
47]. This problem is due to different circumstances such as robot manufacturing tolerances, wear and tear of mechanical parts, variations in floor flatness, partial hardware failures, out-of-spec components, the friction forces generated during motion, etc. [
14].
The odometry provided by the NAO robot firmware is computed from the robot model, i.e., step length and walk angle only, with no data from the inertial unit used (although it is available on the robot). The robot pose outputs from odometry are relative to some starting point and so only relative transformations to the previous poses are used in our localization system. The produced poses are known to be biased with an additive error, which is not negligible for legged robots. It was noted in the preliminary experiments that the odometry alone cannot be used for reliable robot navigation, as the resulting robot pose deviates greatly from the ground truth.
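The effect of such biased odometry can be illustrated with a minimal dead-reckoning sketch (the bias value below is assumed for illustration only): composing relative pose increments that carry a small systematic heading error quickly bends the estimated trajectory away from the commanded straight line.

```python
import math

def compose(pose, delta):
    """Compose an odometry increment (dx, dy, dth), expressed in the current
    robot frame, onto a global pose (x, y, theta)."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + math.cos(th) * dx - math.sin(th) * dy,
            y + math.sin(th) * dx + math.cos(th) * dy,
            th + dth)

# An uncompensated drift of 1 degree per "straight" 10 cm step curves the
# dead-reckoned trajectory, mimicking the slip seen on the real robot.
pose = (0.0, 0.0, 0.0)
for _ in range(20):
    pose = compose(pose, (0.10, 0.0, math.radians(1.0)))  # biased increments
# After 2 m of commanded straight walking, y has drifted noticeably from 0.
```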
Figure 13 shows the discrepancy between the actual robot trace and the reported odometry for a straight walk command. According to the odometry, the robot moved along a straight line as commanded, whereas in reality the robot slipped to the left [
48].
The NAO's deviation during walking leads to errors in the robot's position and orientation after a few steps. Experiments show that the NAO's orientation error is large compared to its position error; as a result, all landmarks (NAO mark locations) will have inaccurate position values, leading to a completely inaccurate map. Perception sensors which provide information about the environment (e.g., the camera) also have uncertainty in their measurements. Due to this high uncertainty in motion and perception, the estimation of the robot's position and orientation and the building of an accurate map can be improved using filtering methods like the Kalman filter and its non-linear extensions, such as the extended Kalman filter (EKF) or the ellipsoidal SLAM, which is the goal of this paper.
The augmented reality and vision-based probabilistic landmark-based SLAM approach is implemented for humanoid (NAO) robot applications in indoor environments. It should be noted that the NAO's mono-camera system is used to capture NAO marks as landmarks for SLAM applications, as this humanoid robot does not have stereo vision, and due to its rather closed architecture it is not straightforward to interface external hardware (such as a stereo camera) to this platform. Adding additional hardware also affects the walking stability of the NAO. Finally, our solution alleviates the additional computational costs of image processing. In addition, the sonar sensor is used to find a safe distance for the NAO to move ahead and to avoid obstacles while performing the SLAM algorithm. In the first stage, the application uses the NAO's visual recognition ability to recognize some of the NAO markers and obtain preloaded augmented information. This information can later be used to simplify the data association problem in the EKF/ellipsoidal SLAMs and to help navigate a dense environment with multiple landmarks and obstacles. In the next stage, the same AR module is transplanted into the vision-based autonomous humanoid robot to determine its position with respect to the environment. Markers whose locations are mapped based on the estimated location of the robot are accurately updated in the update step of the EKF/ellipsoidal SLAMs.
Initially, the EKF/ellipsoidal SLAMs perform regular map initialization processes and then start their regular motion function. However, before the EKF/ellipsoidal SLAMs conduct the observation function using the camera sensor, the NAO calls the landmark detection module to find out whether there are NAO marks in the camera's current field of view. This process is shown in
Figure 14, and the algorithm below illustrates the process. The NAO starts turning its head in 15-degree increments while looking for NAO markers. At each head turn, once the vision system detects any NAO marks, it gets the NAO mark's data and all associated information and at the same time calculates the marker's coordinates with respect to the robot's frame. The designed system recognizes the location of a NAO marker from the image sequence taken from the environment using the NAO's camera and adds the location information to the user's view in the form of 2D objects and other information content using augmented reality (AR).
The pseudo-code for the augmented EKF/ellipsoidal SLAM of the whole project with augmented reality is presented below (Algorithm 2).
Algorithm 2. Augmented EKF/Ellipsoidal SLAM
Start
Initialization - EKF/ellipsoidal SLAM initialization, NAO robot initialization
Get Observation - looking for NAO markers by turning the NAO's head in 15-degree steps
  No NAO markers - turn NAO by 180 degrees
while not_stop
  Prediction Step (4) - Check for a safe distance to move by sonar. Move command
    No safe distance - turn NAO by 180 degrees
  Get Observation (5) - Looking for NAO markers by turning the head in 15-degree steps
    If NAO finds NAO markers, transform them to the robot frame
      Augmented NAO markers present - go to step (6)
      No - turn NAO by 180 degrees and go to step (5)
  Data Association (6) - NAO marker matching and data-association simplification
  Correction Step - Run the standard EKF/ellipsoidal SLAM update step
  Augmented Map - Add new NAO markers to the map
  Check if the iteration number is achieved
    No - go to step (4)
End
The NAO looks for augmented NAO markers in its environment, and the landmark detection module reports the detected NAO markers by circling them, with each mark's ID displayed next to it. Once the right NAO marker is detected and the mark ID is retrieved, extra information corresponding to the mark ID number can be obtained, which is inspired by augmented reality. The sonar checks the available space before the robot moves in the next prediction step.
As we can see from the algorithm, the NAO starts turning its head in 15-degree steps, gets the NAO marker's data at each step and at the same time calculates the marker's coordinates with respect to the robot's frame. Using the augmented reality algorithms, whether the robot sees normal NAO markers or augmented NAO markers, it tries to find its location, simplifies the data association and then goes to the update step. If the robot does not see any augmented NAO markers, it goes to the update step with the location calculated in the prediction step and runs a regular EKF/ellipsoidal SLAM. All NAO markers are then mapped to global locations using EKF/ellipsoidal SLAM.
The integration of augmented reality with the EKF/ellipsoidal SLAM algorithms was implemented successfully on the NAO robot based on the NAO marker recognition functions, as seen previously. Generally, the fundamental structures of EKF SLAM and ellipsoidal SLAM remain, while the augmented reality processing takes place when a NAO marker is detected and additional information is retrieved regarding the detected marker. The main contribution of integrating augmented reality into both EKF SLAM and ellipsoidal SLAM is the use of this additional information to assist the robot in navigation tasks within a practical environment.
The integration of augmented reality (AR) with EKF/ellipsoidal SLAM is proposed and implemented on a humanoid robot in this paper. The goal has been to improve the performance of these SLAM algorithms by reducing the computational effort, simplifying the data association problem and improving the trajectory control of the algorithm.