Indoor Positioning Systems of Mobile Robots: A Review

: Recently, with the in-depth development of Industry 4.0 worldwide, mobile robots have become a research hotspot. Indoor localization has become a key component in many ﬁelds and the basis for all actions of mobile robots. This paper screened 147 papers in the ﬁeld of indoor positioning of mobile robots from 2019 to 2021. First, 12 mainstream indoor positioning methods and related positioning technologies for mobile robots are introduced and compared in detail. Then, the selected papers were summarized. The common attributes and laws were discovered. The development trend of indoor positioning of mobile robots is obtained.


Introduction
With the continuous innovation of sensor technology and information control technology and the impact of a new round of a world industrial revolution, indoor mobile robot technology has developed rapidly.Mobile robots have produced huge economic benefits in laboratories, industry, warehousing and logistics, transportation, shopping, entertainment, and other fields [1].They can replace manual tasks such as the transportation of goods, monitoring patrols, dangerous operations, and repetitive labor.In addition, mobile robots realize 24 h automation and unmanned operation of factories and laboratories [2].The question of how to achieve high-precision mobile robot positioning technology for these functions has always been an urgent problem to be solved.Positioning refers to the estimation of the position and the direction of a mobile robot in the motion area, which is a prerequisite for robot navigation [3].It is a difficulty in the field of mobile robots and is the focus of researchers.
The positioning problem in the outdoor environment has been solved by the ubiquitous global navigation satellite system (GNSS).However, due to the complex and diverse environmental conditions (factories, laboratories, hospitals, shopping malls, etc.), signal attenuation, multipath, and non-visual distance problems, the GNSS is no longer applicable in the indoor environment [4].Thus, many kinds of indoor positioning systems have been proposed by researchers.In actual work, mobile robots usually need to move, pick and place objects, patrol, pass through access control, and take elevators in the working environment [5].The high accuracy of indoor positioning is the premise to ensure that the mobile robot can complete the task and ensure the safety of the human-machine environment.The demand to achieve a centimeter-level positioning accuracy of mobile robots attracts researchers to continue to explore the positioning of indoor mobile robots.Currently, the mainstream indoor positioning technologies include Wi-Fi, Bluetooth, ZigBee, Radio Frequency Identification (RFID), Ultra-Wide Band (UWB), Inertial Measurement Unit (IMU), Visible Light Communication (VLC), Infrared (IR), Ultrasonic, Geomagnetic, Light Detection and Ranging (LiDAR), and Computer Vision [6].These technologies have been maturely applied to the positioning of mobile robots.

LiDAR and Computer Vision prefer Simultaneous Localization and Mapping (SLAM)
as the positioning technique.SLAM maps the surrounding environment in real-time through a radar point cloud or image depth information without prior information and uses probabilistic methods to estimate its position in real-time [7].SLAM is the main method to solve the problem of automatic navigation of mobile robots in unknown environments.It has been more than 30 years since the SLAM method was proposed.From the development of the algorithms and filters needed to solve the problem to the implementation of powerful mobile robots, SLAM technology has great improvement.Various SLAM methods have been proposed by researchers, such as LiDAR-based (EKF-SLAM, Core SLAM, Gmapping SLAM, Hector SLAM, Cartographer SLAM, etc. [8]) or based on Computer Vision (MonoSLAM, PTAM, ORB SLAM, LSD SLAM, SVO, DSO, etc. [9]).In addition, data such as wheel encoders or IMUs are usually fused with SLAM-ranging information.
Radio frequency (RF) technology based on indoor positioning systems is the most common, and Wi-Fi, Bluetooth, ZigBee, RFID, and UWB are all radio frequency technologies [10].They are mature communication technologies that can provide positioning functions with simple modifications.These methods are low-cost and practical.The disadvantage is that the accuracy is not high, and further optimization is required.Usually, these optimization methods are based on filters [11] and machine learning [12].Their ranging techniques are mainly based on geometric algorithms such as TOA, AOA, TDOA, or RSSI based on fingerprint algorithms [13].These ranging techniques convert signal characteristics into distances and angles and then infer the pose of the mobile robot carrying the transmitter based on the known receiver location.
Although radio frequency technology is common, it cannot be applied in areas where the radio is restricted, such as hospitals, airports, etc. [14].IR and VLC are more private and only propagate under Line of Sight (LOS).The infrared-based positioning technology is relatively simple, and it also uses the transmitter and the receiver to evaluate the pose.There are already mature commercial solutions such as StarGazer available [15].VLC is an emerging indoor positioning solution, with Light-Emitting Diode (LED) lights and photosensitive sensors/cameras as its hardware.It is a good motivation to use LED lights that are already used as lighting fixtures for positioning [16].This motivation is similar to Wi-Fi, and its location accuracy is better than Wi-Fi.
An ultrasonic is a mature technology that also uses radio frequency signals for ranging.Compared with radio frequency signals, ultrasonic waves have serious multipath effects and are also affected by temperature and humidity.It is more suitable for high-precision positioning at medium and short distances, so ultrasonic is usually used to assist positioning or obstacle avoidance [17].
The IMU is an essential sensor for mobile robots.The positioning technique used by the IMU is dead reckoning [18].By calculating the acceleration and angular velocity information collected by the accelerometer and gyroscope, the position of the robot is continuously updated.The calibration accuracy of the IMU directly affects the positioning accuracy of the mobile robot.The positioning accuracy of the IMU is not high enough, so it is usually used as an auxiliary positioning technology to perform data fusion with other positioning technologies.
Geomagnetic is based on the naturally occurring Earth's magnetic field.It utilizes the unique properties of each point in the Earth's magnetic field for location determination.Indoor geomagnetic positioning first needs to collect the geomagnetic feature information in the indoor range to establish a geomagnetic database.Then, during the operation of the mobile robot, it obtains its position by matching the geomagnetic feature information collected in real-time with the data in the database.The main feature is that it is free from interference and has no accumulated errors [19].
The angles of evaluating an indoor positioning system are positioning accuracy, cost, whether it needs facility modification and robustness, etc. [10].At present, there is no system as dominant as GNSS in the field of indoor positioning.Choosing a suitable indoor positioning system also requires a trade-off between multiple angles.Each positioning technology has advantages and disadvantages.For example, UWB technology that solves the problems of multipath and signal attenuation of RF signals is expensive.Ultrasonic technology has a centimeter-level positioning and is only suitable for short-distance positioning and detection of obstacles because ultrasonic waves cannot cross obstacles.Wi-Fi and Bluetooth are cheap and low energy, but their positioning accuracy is easily affected by the environment.Computer Vision SLAM with high accuracy without modifying the environment has high requirements on algorithms and computers.High-precision LiDAR SLAM depends on the number of threads of the radar, and the number of threads is directly related to the price.
Data fusion of positioning methods is a good method to improve the positioning effect, which can overcome the shortcomings of different positioning methods and enable mobile robots to adapt to complex and diverse working environments.With the improvement of computer hardware, intelligent algorithms [20], machine learning [12], deep learning [21], and cloud computing [22] can be used in the positioning of mobile robots to achieve higher positioning accuracy.
At present, there is much research on the indoor positioning of mobile robots, but there are very few reviews.This paper uses (mobile AND robot AND indoor AND positioning/localization) as the search term.The selected search scope is the paper title, abstract, and keywords.The time limit is 2019-2022.A total of 495 papers were automatically screened.To improve the quality of the selected papers, the papers without citations are removed, leaving 258 papers.Then, a secondary screening was conducted using the following criteria: 1.
The positioning object must be a mobile robot instead of a drone or an underwater drone.

2.
The method proposed in the paper should focus on indoor positioning, rather than navigation, mapping, path planning, human-computer interaction, and obstacle avoidance.

3.
Papers belonging to review and survey types will also be screened out.

4.
The paper should be written in English.
After the screening, there were 147 papers on the indoor positioning of mobile robots.
The second section of this paper focuses on 12 mainstream positioning methods and related positioning technologies.A comparative analysis is made based on positioning accuracy, cost, and advantages and disadvantages.The third section conducts a comprehensive analysis of the selected 147 pieces of literature, digs out common laws, and discovers future trends.The fourth section presents conclusions and future prospects for the indoor localization of mobile robots.

Overview of Indoor Positioning Technologies for Mobile Robots
This section details the basic principles of the 12 positioning methods, the localization techniques used, and case studies.The 12 positioning methods are divided into Non-Radio Frequency (IMU, VLC, IR, Ultrasonic, Geomagnetic, LiDAR, and Computer Vision) [20] and Radio Frequency (Wi-Fi, Bluetooth, ZigBee, RFID, and UWB).

Non-Radio Frequency Technologies
Although radio localization is the most commonly used indoor localization method, LiDAR and Computer Vision are more attractive to researchers in the field of mobile robotics.They are more accurate and require no modification to the environment.This section details the non-radio methods: IMU, VLC, IR, Ultrasonic, Geomagnetic, LiDAR, and Computer Vision.

Inertial Measurement Unit
The IMU is the most basic sensor in mobile robots, and all mobile robots are equipped with an IMU.The IMU is composed of an accelerometer and gyroscope, which can measure the acceleration, angular velocity, and angle increment of the mobile robot.Real-time positioning of the robot is achieved through integral calculation of the motion trajectory and pose according to the parameters of acceleration, angular velocity, and angle increment.
This method can only rely on the internal information of the robot for autonomous navigation [23].Therefore, the calibration accuracy of the IMU directly affects the positioning accuracy of the mobile robot.
The advantages of inertial navigation are strong anti-interference; inertial sensors will not reduce the accuracy due to the interference of external environment signals such as Wi-Fi, sound waves, etc.The working speed of the IMU is fast, and the inertial navigation of the IMU is more suitable for fast-moving objects than other positioning technologies.The main disadvantage is that the positioning error of inertial navigation is cumulative.Thus, the last positioning error will affect the next positioning, which will cause the error to expand.As the robot runs, the positioning error of inertial navigation will continue to increase if there is no correction process.Therefore, we usually use other navigation methods in combination with inertial navigation.In this way, the positioning method will retain the advantages of strong anti-interference and high speed of inertial navigation, and will limit the error to a certain range.
The positioning technique of the IMU is dead reckoning [12].The dead reckoning method knows the initial position and pose of the mobile robot and calculates the current position through the moving distance and rotation angle recorded by the IMU.The formula for deriving the current position Pn(Xn,Yn) is expressed as Equation ( 1), and the schematic diagram of the dead reckoning algorithm is shown in Figure 1: where P 0 (X 0 , Y 0 ) is the starting position of the mobile robot, d is the moving distance, and α is the rotation angle of the mobile robot.
ure the acceleration, angular velocity, and angle increment of the mobile robot.Real-tim positioning of the robot is achieved through integral calculation of the motion trajector and pose according to the parameters of acceleration, angular velocity, and angle incre ment.This method can only rely on the internal information of the robot for autonomou navigation [23].Therefore, the calibration accuracy of the IMU directly affects the posi tioning accuracy of the mobile robot.
The advantages of inertial navigation are strong anti-interference; inertial sensor will not reduce the accuracy due to the interference of external environment signals such as Wi-Fi, sound waves, etc.The working speed of the IMU is fast, and the inertial naviga tion of the IMU is more suitable for fast-moving objects than other positioning technolo gies.The main disadvantage is that the positioning error of inertial navigation is cumula tive.Thus, the last positioning error will affect the next positioning, which will cause th error to expand.As the robot runs, the positioning error of inertial navigation will con tinue to increase if there is no correction process.Therefore, we usually use other naviga tion methods in combination with inertial navigation.In this way, the positioning method will retain the advantages of strong anti-interference and high speed of inertial navigation and will limit the error to a certain range.
The positioning technique of the IMU is dead reckoning [12].The dead reckonin method knows the initial position and pose of the mobile robot and calculates the curren position through the moving distance and rotation angle recorded by the IMU.The for mula for deriving the current position Pn(Xn,Yn) is expressed as Equation ( 1), and th schematic diagram of the dead reckoning algorithm is shown in Figure 1: where P0 (X0, Y0) is the starting position of the mobile robot, d is the moving distance, and α is the rotation angle of the mobile robot.The IMU is usually used as an auxiliary navigation method to assist other positionin methods to estimate the position of the robot.The anti-interference feature of the IMU can The IMU is usually used as an auxiliary navigation method to assist other positioning methods to estimate the position of the robot.The anti-interference feature of the IMU can be used to correct the positioning information.In recent literature, IMU localization methods focus on how to perform data fusion.The fusion algorithms used are mainly based on filters, but machine learning is also used.Table 1 shows the specific information in the IMU-based data fusion papers.The key element of VLC communication is the LED light [16].The communication principle is to encode information and transmit the encoded information by adjusting the intensity of the LED, and the photosensitive sensor can receive and decode the high-frequency flickering signal.Using photo sensors to locate the position and orientation of LEDs, the RSSI, TOA, and AOA of radio methods are the main localization techniques [37].In contrast, a camera can also be used to receive images of LEDs [38].This approach is similar to Computer Vision.
LED is a kind of energy-saving lighting equipment, which has the characteristics of low energy consumption, long life, environmental protection, and anti-electromagnetic interference.With the popularity of LED lights now, the cost of environmental modification using VLC positioning is lower, which is similar to Wi-Fi.LED lights can be installed in large numbers indoors while considering the functions of lighting and communication.Due to its high-frequency nature, it transmits information without disturbing the illumination [39].A major disadvantage of VLC is that it can only communicate within the line-of-sight (LOS) range.Moreover, light does not interfere with radio frequency equipment and can be safely used in places where radio frequency signals are prohibited.
The direction of improvement of VLC positioning based on RSSI technology lies in the process of optimizing the signal strength mapping position.An improved Bayesian-based fingerprint algorithm [37], as well as the optimization of a baseline smoother based on an extended Kalman filter and a central difference Kalman, filter have been described [40].The VLC method can be fused with other localization methods, such as the IMU [16] and the encoder [41], and the corresponding fusion algorithms are the EKF and particle filter.Similar to fingerprint recognition, which requires building a signal strength library in advance, LEDs require a prior calibration stage.Amsters et al. used the Cartographer algorithm in LiDAR SLAM to build a VLC environment map, and identified the location of the LED in the map according to the LED frequency and pixel coordinate information obtained by the camera [42,43].CMOS cameras combined with vision processing algorithms can efficiently identify LED IDs [38,44].Under normal circumstances, VLC is one-way data transmission, and the LED terminal sends encoded information to the mobile robot.Louro et al. designed a VLC-based bidirectional data transmission, implemented LED-tovehicle, vehicle-to-LED, or vehicle-to-vehicle communication, and proposed parity bits to reduce the bit error rate [39,45].Wang et al. designed a novel position and orientation sensor based on LEDs [46].The sensor consists of four pairs of linear charge-coupled devices (CCDs) and cylindrical lenses.The 3D coordinate measurement system based on the intersection of four planes can be used to detect the pose of the mobile robot.Jeong et al. proposed a VLC localization method based on a fuzzy logic system capable of estimating the mobile robot's position by analyzing the chromaticity and frequency component ratios of LED lights installed under the ceiling [47].

Infrared Detection Technologies
Infrared is an electromagnetic wave that is invisible to the human eye and has a longer wavelength than visible light.The positioning of indoor mobile robots based on infrared methods relies on artificial landmarks with known positions, which can be divided into active landmarks and passive landmarks according to whether the landmarks require energy or not.The principle of active landmarks is that a mobile robot equipped with an infrared transmitter emits infrared rays and a receiver installed in the environment receives the infrared rays and then calculates the position of the robot, which can achieve sub-meter accuracy.The principle of passive landmarks is that the arrangement in the environment can reflect infrared landmarks and the mobile robot receives the reflected infrared information at the same time and obtains the landmark ID, position and angle, and other information [15].At present, infrared-based positioning technology is relatively mature, but its penetration is poor, and it can only perform line-of-sight measurement and control.It is susceptible to environmental influences, such as sunlight and lighting, so the application is limited.
The center for life science automation (CELISCA) has developed a complete automatic transportation system for multi-mobile robots across floors [15].The positioning method was selected by Hagisonic's StarGazer.It performs indoor positioning based on infrared passive landmarks and obtains the relative position of the robot and passive landmarks through a visual calculation to evaluate the robot's own position and orientation.The average positioning accuracy is about 2 cm, and the maximum error is within 5 cm.Bernardes et al. designed an infrared-based active localization sensor [48].Infrared LED lights are arranged on the ceiling, and the mobile robot is equipped with an infrared light receiver.Combined with EKF, the pose of the mobile robot is calculated using the emission angle of the LED.

Ultrasonic Detection Technologies
An ultrasonic is defined as a sound wave with a vibration frequency higher than 20 kHz, which is higher than the upper limit of human hearing and cannot be heard.At present, ultrasonic ranging has three methods: phase detection, acoustic amplitude detection, and time detection.In practical applications, most of them are time detection methods using the TOF principle.The time detection method obtains the distance by calculating the time when the ultrasonic transmitter transmits the ultrasonic wave and the time difference and sound speed of the ultrasonic wave received by the receiving end [17].The accuracy can theoretically reach the centimeter level [49].Among them, the phase detection method has the advantage of better detection accuracy.Its principle is to calculate the measurement distance based on the ultrasonic wavelength by comparing the phase difference when the ultrasonic wave is transmitted and the phase difference when the ultrasonic wave is received.It also has a defect.The measured phase difference is a multiple solution value with a period of 2nπ, so its measurement range needs to be within the wavelength.The acoustic wave amplitude detection method detects the amplitude of the returned acoustic wave, but it is easily affected by factors such as the medium, resulting in inaccurate measurement, but this method is easy to implement.The principle of the transit time detection method is to determine the distance of the obstacle according to the time difference between the ultrasonic transmitting end transmitting the acoustic wave signal and the receiving end receiving the acoustic signal reflected by the obstacle.Although the positioning accuracy of ultrasonic waves can reach the centimeter level in theory, the propagation speed of ultrasonic waves is different in different media.Moreover, it will be affected by changes in temperature and air pressure, and sometimes temperature compensation is required.
Tsay et al. developed a navigation tool for multiple agricultural robots based on a Spread Spectrum Sound-based Local Positioning System (SSSLPS).With static measurements using Time Division Multiple Access (TDMA), the positioning accuracy achieved was 1.2 cm.In contrast, under dynamic measurements employing Frequency Division Multiple Access (FDMA), the positioning accuracy reached 6.2 cm [49].Chen et al. proposed an indoor localization system based on discrete sonar sensors [50].The sonar receiver is arranged on the ceiling, and the robot carries the ultrasonic transmitter.Similar to the previous VLC case, the SLAM technique was chosen for calibration.Since the ultrasonic path loss and ultrasonic emission dead zone have a great influence on the positioning, Zhang et al. designed a multi-degree-of-freedom ultrasonic receiving device [51].It can automatically rotate and follow the ultrasonic transmitter on the mobile robot.Grami et al. used particle filters to fuse ultrasonic localization and odometry information [17].Gualda proposed a simultaneous calibration and navigation system that can simultaneously perform navigation and ultrasonic calibration of mobile robots [52].He also compared various Bayesian filters, including the Extended Kalman Filter (EKF), Unscented Kalman Filter (UKF), and H-∞ filter.Simulation results show that an average error of H-∞ is 0.33 m, and both the EKF and UKF are 0.35 m.Magrin et al. used an ANN (artificial neural network) to fuse a sonar octagon, a digital compass, and wireless network signal strength [53].The octagonal sonar can cover 360 • of the robot, and the recognition rate was 100% in grids measuring 120 × 120 and 60 × 60.

Geomagnetic Field Detection Technologies
The earth is wrapped in a natural geomagnetic field formed between the north and south poles, and the strength of the magnetic field varies with latitude and longitude.At present, geomagnetic fingerprints are mainly used for indoor positioning based on geomagnetism.Magnetic fingerprints have unique properties in space, with geomagnetic modulus values in a small contiguous region forming a fingerprint sequence [54].Geomagnetic fingerprint positioning includes two stages: offline training and online positioning.During the offline training stage, the geomagnetic sensor is used to collect the geomagnetic information about the sampling points, and the information is stored in the fingerprint information database.In the online positioning stage, the geomagnetic information about the current location obtained by the geomagnetic sensor is used to form the fingerprint information and match the information in the fingerprint database.The geomagnetic field is less affected by external factors, as there is no cumulative error and there is no NLOS problem, but the accuracy is at the meter level.Generally, geomagnetism is used as an auxiliary positioning tool to correct accumulated errors, or used with other sensors for information fusion.The formula for geomagnetic indoor positioning is: where P is the position of the device is located.B is the measured magnetic field strength at the device's location.B 0 is the magnetic field strength at a reference point.B 1 is the magnetic field strength at another reference point.Lv et al. developed a self-tuning Kalman filter that fused the geomagnetic sensor, the IMU, and encoder, and used the real-time update global position characteristics of the geomagnetic sensor to correct the error accumulation defect of dead reckoning [18].The positioning method of particle filter-based is presented by Isaku Nagai et al. [55] to fuse data from multiple magnetic sensors, an optical sensor, and a gyroscope.

LiDAR Detection Technologies
LiDAR systems consist of an optical transmitter, a reflected light detector, and a data processing system.First, the optical transmitter emits discrete laser pulses to the surrounding environment.The laser pulses are reflected after encountering obstacles.The reflected laser light is received by the reflected light detector, and then the data are sent to the data processing system to obtain a two-dimensional image or 3D information of the surrounding environment.LiDAR systems are active systems since they emit laser pulses and detect reflected light.This feature supports the mobile robot to work in a dark environment, does not require any modification of the environment, and is highly universal.In indoor situations, 2D LiDAR is mainstream, and 3D LiDAR is usually used in the field of unmanned driving [56].
The 2D LiDAR SLAM is the mainstream method for the indoor positioning of mobile robots.Without prior information, LiDAR SLAM uses LiDAR as a sensor to obtain surrounding environment data, evaluate its pose, and draw an environment map based on the pose information.Finally, synchronous positioning and map construction are realized.At present, there are two types of 2D LiDAR SLAM methods to solve the indoor positioning problem of mobile robots: filter-based and graph-based [57].The problem to be solved is the estimation of a posterior probability distribution over the robot's location given sensor measurements and control inputs.This can be stated as follows: where x t represents the robot's state (position and orientation) at time t; z {1:t} represents the lidar observation data from time 1 to t; u {1:t} represents the robot control input from time 1 to t; P(x t |z {1:t} , u {1:t} ) represents the posterior probability of the robot's state at time t, given the observation data and control input; P(z t |x t ) represents the likelihood probability of observation data z t , given the robot state x t ; P(x t |x t−1 , u t ) represents the transition probability of the current state x t , given the previous state x t−1 and the current control input u t ; and η is a normalization factor that ensures the sum of probabilities is 1.By iterating this formula, we can estimate the robot's position and orientation based on the LiDAR observation data and robot control input.
The filter-based SLAM method has been proposed since the 1990s and has been widely studied and applied [58].Its principle is simply to use recursive Bayesian estimation to iterate the posterior probability distribution of the robot pose to construct an incremental map and achieve localization.The most basic algorithms are EKF SLAM and particle filter SLAM.Gmapping SLAM and Hector SLAM are now commonly used methods.
EKF SLAM is linearized by first-order Taylor expansion to approximate the nonlinear robot motion model and observation model.Cadena et al. improved the EKF slam based on directional endpoint features [58].
The particle filter RBPF SLAM, also known as sequential Monte Carlo, is a recursive algorithm that implements Bayesian filtering through a non-parametric Monte Carlo method [59,60].Zhang [64].Based on the EKF and RBPF, FastSLAM combines the advantages of both.It uses a particle filter for path estimation and a Kalman filter for the maintenance of map state variables.FastSLAM has great improvements in computational efficiency and scalability.Yan and Wong integrated particle filter and FastSLAM algorithms to improve the localization operation efficiency [65].
In 2007, Gmapping SLAM was released as open source, which is an important achievement in the field of laser SLAM [66].It is a particle filter-based method.On this basis, many companies, and researchers have applied Gmapping in mobile robot products [67,68].Norzam et al. studied the parameters in the Gmapping algorithm [69].Shaw et al. combined Gmapping and particle swarm optimization [70].
Hector SLAM is also a classic open-source SLAM algorithm based on filtering.The main difference from Gmapping is that it does not require odometer data and can only construct maps based on laser information [71].Garrote et al. fused mobile Marvelmind beacons and Hector SLAM using PF [72].Teskeredzic et al. created a low-cost, easily scalable Unmanned Ground Vehicle (UGV) system based on Hector SLAM [73].
The difference between graph-based SLAM and particle filter-based iterative methods is that graph-based SLAM estimates the pose and trajectory of mobile robots entirely based on observational information.All collected data are recorded, and finally, computational mapping is performed [74].The pose graph represents the motion trajectory of the mobile robot.The pose of the robot is a node of the pose graph, and an edge is formed according to the relationship between the poses.This process of extracting feature points from LiDAR point cloud data is called the front end.The corresponding back end is to further optimize and adjust the edges connected by the node poses, and the optimization is based on the constraint relationship between the poses.
The methods of 2D LiDAR SLAM based on graph optimization include Karto SLAM [75], published by Konolige et al. in 2010, and the Cartographer algorithm, open-sourced by Google in 2016 [76].
Cartographer is currently the best 2D laser SLAM open-source algorithm for mapping performance.Cartographer SLAM uses odometer and IMU data for trajectory estimation.Using the estimated value of the robot's pose as the initial value, the LiDAR data are matched and the value of the pose estimator is updated.After a frame of radar data is subjected to motion filtering, it is superimposed to form a submap.Finally, all subgraphs are formed into a complete environment map through loop closure detection and back-end optimization [76].Deng et al. divided the environment into multiple subgraphs, reducing the computational cost of the Cartographer [77].Sun et al. improved Cartographer's boundary detection algorithm to reduce cost and error rate [57].Gao et al. proposed a new graph-optimized SLAM method for orientation endpoint features and polygonal graphs, which experimentally proved to perform better than Gmapping and Karto [78].
To solve the problem of a large amount of computation in SLAM, Sarker et al. introduced cloud servers into LiDAR SLAM [22].Li et al. used cloud computing to optimize a Monte Carlo localization algorithm [88].Recently, some researchers have also performed a lateral comparison between localization methods and algorithms.Rezende et al. compared attitude estimation algorithms based on wheel, vision, and LiDAR odometry, as well as ultra-wideband radio signals, all fused with the IMU [89].The results strongly suggested that LiDAR has the best accuracy.Shen et al. compared Gmapping, Hector SLAM, and Cartographer SLAM, and the results indicate that Cartographer performs best in complex working environments, Hector SLAM is suitable for working in corridor-type environments, and Gmapping SLAM is most suitable for simple environments [90].

Computer Vision Detection Technologies
Computer Vision uses cameras to obtain environmental image information.It is designed to recognize and understand content in images/videos to help mobile robots localize.Computer Vision localization methods are divided into beacon-based absolute localization and visual odometry-based SLAM methods.Among them, visual SLAM is a hot spot for the indoor positioning of mobile robots.
Absolute positioning based on beacons is the most direct positioning method.The most common method is the QR code [91][92][93][94][95][96][97].QR codes can provide location information directly in the video image.By arranging the distribution of QR codes reasonably, the mobile robot can be positioned in the whole working environment.Avgeris et al. designed a cylinder-shaped beacon [98].Typically, beacons are placed in the environment, and Song et al. innovatively put QR codes on robots and vision cameras mounted on the ceiling [93].The location of the mobile robot can be determined by tracking the QR code.Lv et al. utilized grids formed by tile joints to assist mobile robot localization [99].
Visual SLAM is similar to LiDAR SLAM.Multiple cameras or stereo cameras are used to collect feature points with depth information in the surrounding environment, and feature point matching is performed to obtain the pose of the mobile robot.Visual SLAM systems use geometric features such as points, lines, and planes as landmarks to build maps.The visual SLAM flowchart is shown in Figure 2 [100].
Computer Vision uses cameras to obtain environmental image information.It is designed to recognize and understand content in images/videos to help mobile robots localize.Computer Vision localization methods are divided into beacon-based absolute localization and visual odometry-based SLAM methods.Among them, visual SLAM is a hot spot for the indoor positioning of mobile robots.
Absolute positioning based on beacons is the most direct positioning method.The most common method is the QR code [91][92][93][94][95][96][97].QR codes can provide location information directly in the video image.By arranging the distribution of QR codes reasonably, the mobile robot can be positioned in the whole working environment.Avgeris et al. designed a cylinder-shaped beacon [98].Typically, beacons are placed in the environment, and Song et al. innovatively put QR codes on robots and vision cameras mounted on the ceiling [93].The location of the mobile robot can be determined by tracking the QR code.Lv et al. utilized grids formed by tile joints to assist mobile robot localization [99].
Visual SLAM is similar to LiDAR SLAM.Multiple cameras or stereo cameras are used to collect feature points with depth information in the surrounding environment, and feature point matching is performed to obtain the pose of the mobile robot.Visual SLAM systems use geometric features such as points, lines, and planes as landmarks to build maps.The visual SLAM flowchart is shown in Figure 2 [100].

Sensor Data Input
Front-End Visual Odometer

•
Sensor data input: Environmental data collected by the camera.Occasionally, the IMU is used as a secondary sensor.

•
Front-end visual odometer: Preliminary camera poses estimation based on image information of adjacent video frames.

•
Back-end nonlinear optimization: optimize the camera pose obtained by the front-end to reduce the global error.

•
Loop closure detection: According to the image to determine whether to reach the previous position.Form a closed loop.

•
Mapping: Build a map of the environment based on continuous pose estimates.
Currently, many researchers are working on assigning specific meanings to objects in visual SLAM, called semantic SLAM.Semantic SLAM builds maps with semantic entities, which not only contain the spatial structure information of traditional visual SLAM, but also the semantic information of objects in the workspace [123].It is beneficial to improve the speed of closed-loop detection, establish human-computer interaction, and perform complex tasks.
In semantic SLAM for dynamic scenes, Han and Xi utilized PSPNet to divide video frames into static, latent dynamic, and prior dynamic regions [109].Zhao et al. proposed to combine semantic segmentation and multi-view geometric constraints to exclude dynamic feature points and only use static feature points for state estimation [101].Yang et al. introduced a new dynamic feature detection method called semantic and geometric constraints to filter dynamic features [114].
Object detection and recognition based on deep learning is now mainstream in the field of Computer Vision. Lee et al. used YOLOv3 to remove dynamic objects from dynamic environments [124].Maolanon et al. demonstrated that the YOLOv3 network can be enhanced by using a CNN [21].Xu and Ma simplified the number of convolutional layers of YOLOv3's darknet53 to speed up recognition [125].Zhao et al. used a Mask-RCNN network to detect moving objects [123].
For the division of semantic objects, Rusli et al. used RoomSLAM, which is a method of constructing a room through wall recognition by using the room as a special semantic identifier for loop closure detection [126].Liang et al. identified house numbers and obtained room division information [84].
The image information obtained by visual SLAM has the largest amount of data among all positioning technologies.The inability of onboard computers to meet the computational demands of visual SLAM is one of the reasons that currently limits the development of visual SLAM, in particular, deep learning-based visual SLAM [105,107].Corotan et al. developed a mobile robot system based on Google's ARcore platform [127].Zheng et al. built a cloud-based visual SLAM framework [102].Keyframes are first extracted and submitted to the cloud server, saving bandwidth.

Radio Frequency Technologies
Radio frequency-based positioning is a common type of positioning system, covering many fields.It is very convenient to expand from the original communication function to the positioning function, which offers the advantages of low energy consumption and low cost, including Wi-Fi, Bluetooth, ZigBee, RFID, and UWB.

Radio Frequency Technology Positioning Algorithms
This section introduces several of the classic radio frequency technology positioning algorithms: RSSI, TOA, TDOA, and AOA, as shown in Figure 3.

RF positioning algorithms can be divided into two categories according to whether they are based on received signal strength indication (RSSI).
There are two main kinds of positioning techniques based on RSSI: triangulation and fingerprint positioning [128].The position of the AP (access point) is known by RSSI-based triangulation [129].The distance between the robot and each AP can be calculated according to the signal attenuation model, and then circles are drawn around each AP according to the obtained distance information.The intersection point is the mobile robot's position.

RF positioning algorithms can be divided into two categories according to whether they are based on received signal strength indication (RSSI).
There are two main kinds of positioning techniques based on RSSI: triangulation and fingerprint positioning [128].The position of the AP (access point) is known by RSSIbased triangulation [129].The distance between the robot and each AP can be calculated according to the signal attenuation model, and then circles are drawn around each AP according to the obtained distance information.The intersection point is the mobile robot's position.This method needs to measure the location of the AP in advance, so it is not suitable for situations where the environment is changing.
Fingerprint positioning is more accurate than triangulation.It is divided into two stages, an offline training stage and an online localization stage [130].In the training phase, the room is divided into small area blocks to establish a series of sampling points, and the radio frequency technology receiving equipment is used to sample these one by one, record the RSSI value and AP address obtained at this point, and establish a database.During online positioning, the mobile robot carries the radio frequency technology signal receiving device, obtains the current RSSI and AP address, and transmits this information to the server to match with the established database to obtain the estimated position.Collecting fingerprints in advance requires significant effort, and the fingerprints need to be re-collected after the signal-receiving device moves.
CSI (Channel State Information) is an upgraded version based on RSSI.The problem of multipath fading is the main problem of RSSI which is difficult to solve but will greatly reduce the positioning accuracy [131].CSI can receive and transmit channel amplitude and phase responses at different frequencies.In frequency, CSI has fine-grained characteristics compared to the coarse-grained RSSI and has the characteristics of resisting multipath fading [132].Compared to RSSI, CSI contains more than 10 times the information.Thus, CSI has better stability and positioning accuracy [4].However, the amount of data transmitted by CSI is huge, which requires more equipment and transmission time.
The positioning methods not based on RSSI mainly include Time of Arrival (TOA), Angle of Arrival (AOA), and Time Difference of Arrival (TDOA) algorithms.These methods are easy to calculate but are generally affected by problems, such as multipath and nonline-of-sight (NLOS) problems of the signal.
The most basic time-based RF positioning method is the Time of Arrival (TOA) [31].The distance between the two is assessed by recording the time it takes for a signal to travel from the target to the access point (AP).Conventional TOA positioning requires at least three APs with known positions to participate in the measurement.The positioning model can be established according to the triangulation method or geometric formula to solve the target position.The TOA algorithm completely relies on time to calculate the target position and requires exact time synchronization between the target and the AP.The formula for TOA can be described as: where d is the distance between the transmitter and the device.v is the speed of signals.
t r is the time the signal is received by the device.t s is the time the signal is transmitted by the transmitter.TDOA is a time difference positioning algorithm.It is an improved version based on TOA [133].The positioning is performed by recording the time difference between the target and different APs to estimate the distance difference between different APs.Like TOA, at least three APs with known locations are required to participate in the measurement.The method selects a base station as a reference base station and combines other base stations and reference base stations to establish multiple hyperbolas with the base station as the focus.The intersection of multiple hyperbolas is the position of the target.The focus of the hyperbolic equation is the base station, and the long axis is the distance between the signal arrival time differences between the base stations.Since there is no need to detect signal transmission time, TDOA does not require time synchronization between the target and the AP; it only requires time synchronization between APs.This feature reduces the need for time synchronization.The formula for TDOA can be described as: where d is the distance between the transmitter and the device.v is the speed of signals.
∆t is the time difference between signals received at two receivers.∆ f is the frequency difference between the two signals.
AOA estimates the target position by obtaining the relative angle of the target to the AP [134].AOA requires at least two APs and sets up array antennas or directional antennas to obtain the angle information of the target and AP.AOA has a simpler structure than the above two algorithms but is seriously affected by non-line-of-sight.The positioning accuracy depends on the accuracy of the antenna array and requires regular inspection and maintenance.The formula for AOA can be described as: where θ is the angle of arrival of the signal.(x, y) is the position of the device being located.(x s , y s ) is the position of the signal source.AOA is a direction-based localization technique that requires multiple antennas or an array of antennas to accurately estimate the angle of arrival.

Wi-Fi
Wi-Fi is the 802.11wireless local area network standard defined by the Institute of Electrical and Electronics Engineers (IEEE).The research on Wi-Fi-based indoor positioning started in 2000 [10].Wi-Fi devices are widely used in various modern indoor occasions, and Wi-Fi is mainly used in mobile robot communication in an indoor environment.It is currently the preferred method in the field of indoor positioning [135].Wi-Fi-based indoor positioning is relatively simple to deploy in an indoor renovation, and there is no requirement for additional hardware devices.The working frequency of Wi-Fi is in the 2.4 GHz and 5 GHz frequency bands, of which the 5 GHz frequency band has the advantages of less interference, lower noise, and faster speed.The main purpose of Wi-Fi now is communication.If Wi-Fi is to be used for robot positioning, further processing of Wi-Fi signals is required to improve positioning accuracy.
In the screened papers, all the papers based on Wi-Fi technology applied the RSSI technique.While techniques such as TOA and AOA are not uncommon, the RSSI technique has a higher positioning accuracy ceiling.RSSI is less affected by non-line-of-sight and is more suitable for the actual working environment of mobile robots.The problem with RSSI is multipath fading and frequent signal fluctuation.With the introduction of filters, neural networks, and deep learning, there are more and more schemes combined with RSSI.Wang et al. used a K-ELM (Kernel extreme learning machine) to improve the accuracy of a fingerprint recognition algorithm [12].Cui et al. proposed a robust principal component analysis-extreme learning machine (RPCA-ELM) algorithm based on robust principal component analysis to improve the localization accuracy of mobile robot RSSI fingerprints [136].Since the fluctuation of Wi-Fi signal strength has a great influence on fingerprint positioning accuracy, Thewan et al. proposed the Weight-Average of the Top-n Populated (WATP) filtering method.Compared with Kalman filtering, it reduces the calculation time while satisfying the positioning accuracy.The Kalman filtering Mean-Square-Error (MSE) is 13.65 dBm, and the proposed method MSE is 0.12 dBm [11].Sun et al. applied cellular automata to the indoor RSS positioning of AGVs to achieve low-cost filtering of noise caused by environmental factors and mechanical errors [137].Zhang et al. optimized Wi-Fi-based RSS localization with Deep Fuzzy Forest [138].The first step in using RSSI is to collect Wi-Fi fingerprints at various indoor locations and establish a database.This step is labor-intensive, and as the location of the device changes, the database needs to be rebuilt.Lee et al. [87] and Zou et al. [86] combined the LiDAR method with the Wi-Fi method and used SLAM technology to collect Wi-Fi fingerprints.CSI and RSSI are compared, and the results indicate that the CSI positioning accuracy error is about 2% smaller than the RSSI, which improves the problem of multipath effects in RSSI [132].

Bluetooth
Bluetooth uses the 802.15.1 standard developed by the IEEE.The worldwide universal Bluetooth 4.0 protocol was announced in 2010 [139].Bluetooth communicates using radio waves with frequencies between 2.402 GHz and 2.480 GHz.Among them, the Bluetooth low-energy (BLE) version has the advantages of high speed, low cost, and low power consumption, so it is widely used on the Internet of Things (IoT).Bluetooth positioning technology is mainly based on RSSI positioning.It uses the broadcast function of the Bluetooth beacon to measure the signal strength.A Bluetooth beacon installed in the working environment will continuously send broadcast messages.After the terminal mounted on the mobile robot receives the transmitted signal, it evaluates the pose of the mobile robot according to the positioning algorithm.Bluetooth is widely used in indoor positioning IoT because of its low cost and low operating expenses.Since it is susceptible to noise interference and has insufficient stability, and the positioning accuracy of unprocessed Bluetooth signals is at the meter level, it is rarely used in the field of indoor positioning of mobile robots.
The iBeacon developed by Apple in 2012 and the Eddystone released by Google in 2015 greatly promoted the development of BLE [140].Although the positioning accuracy of the existing Bluetooth 4.0 technology cannot meet the high-precision requirements of mobile robots, with the emergence of Bluetooth 5.0, this demand can be met.Compared with Bluetooth 4.0, Bluetooth 5.0 has a longer effective distance, a faster data transmission rate, and lower energy consumption [141].Most importantly, the localization accuracy is improved to the sub-meter level [142].Bluetooth 5.0 has not been put into formal application, and it is the indoor positioning technology of mobile robots available in the future.
The trend of Bluetooth-based indoor positioning is to combine optimization algorithms with positioning technology.Mankotia et al. implemented a Bluetooth localization of a mobile robot based on iterative trilateration and cuckoo search (CS) algorithm based on a Monte Carlo simulation [143].Comparing the cuckoo algorithm and particle filter in the simulation, the positioning accuracy of the CS algorithm (mean error 0.265 m) is higher than that of the particle filter (mean error 0.335 m), and the overall error is reduced by 21%.

ZigBee
ZigBee uses the IEEE 802.15.4 standard specification.Its positioning principle is similar to that of Wi-Fi and Bluetooth, and it is also mainly based on RSSI to estimate the distance between devices.ZigBee is characterized by low cost, wide signal range, high reliability, low data rate, and good topology capabilities.The disadvantage is that the stability is poor, it is susceptible to environmental interference, and the positioning accuracy is at the meter level.
Wang et al. designed a mobile robot positioning system based on the ZigBee positioning technology and combined the centroid method and the least square method to improve the positioning accuracy of the RSSI algorithm in complex indoor environments [144].The mobile robot is both the coordination node and the mobile node of the network, which increases the mobility and flexibility of the ZigBee network.Since RSSI-based positioning is easily affected by environmental changes, Luo et al. proposed a ZigBee-based adaptive wireless indoor localization system (ILS) in a dynamic environment [128].The system is divided into two steps.First, the mobile robot uses LiDAR SLAM to collect RSSI values in the working environment and update the fingerprint database in real-time.Secondly, the adaptive signal model fingerprinting (ASMF) algorithm is proposed.The signal attenuation model of the ASMF can reduce noise, evaluate the confidence of RSSI measurement values, and reduce the interference of abnormal RSSI values.

Radio Frequency Identification
RFID positioning technology uses radio frequency signals to transmit the location of the target.The hardware consists of three parts: a reader, an electronic tag, and an antenna.The reader is the core part of the RFID system, mainly composed of data processing modules.It communicates with the electronic tag through the antenna and reads the information such as ID, RSSI, and phase of the electronic tag [145].RFID tags are divided into active and passive tags according to whether they need power [146].Active tags can always transmit wireless signals to the environment, while passive tags are passively responding to the wireless signals emitted by readers.Active tags have wider coverage, but in indoor positioning, passive tags are used more due to their low-cost, easy deployment, and maintenance.RFID has different operating frequencies, mainly including low frequency (LF) (9-135 KHz), high frequency (HF) (13.553-15.567MHz), and ultra-high frequency (UHF) (860-930 MHz) [147].At present, UHF-RFID is mostly used in the indoor positioning of mobile robots.Compared with LF and HF, UHF-RFID has the advantages of a simple structure, low cost, and high data transmission rate.Most of the RFID-based papers screened selected UHF-RFID.
Similar to many RF schemes, applying filters to RFID signals is also a common method.DiGiampaolo et al. [148] and Wang et al. [146] both chose Rao-Blackwellized particle filters as the algorithm for RFID SLAM.The reflection and divergence problem of RFID tags is serious.Magnago proposed an Unscented Kalman Filter (UKF) to solve the phase ambiguity of the RF signal backscattered by UHF-RFID tags [149].Bernardini et al. equipped a mobile robot with dual reader antennas to reduce signal loss and tag backscatter, and used the PSO algorithm to optimize the specific absorption rate (SAR) process [150].Gareis et al. also used SAR RFID for localization, adding eight channels for monitoring [151].The positioning accuracy reached the centimeter level.Tzitzis et al. introduced a multi-antenna localization method based on the K-means algorithm [152].

Ultra-Wide Band
As a carrierless communication technology, a UWB uses non-sinusoidal narrowband pulses with very short periods to transmit data, which can achieve high speed, largebandwidth, and high time-resolution communication in proximity [153].Extremely short pulse modulation makes it possible to greatly reduce the effects of multipath problems.Low transmission power avoids interference with Wi-Fi, BLE, or similar devices.A UWB has stronger wall penetration than Wi-Fi and BLE.Recently, the UWB has become the focus in the field of indoor positioning due to its stable performance and centimeter-level positioning accuracy, making it one of the ideal choices for the positioning of mobile robots.High-precision positioning at the centimeter level requires numerous anchor points, higher carrier wave frequency, and sampling frequency, resulting in relatively high hardware and computational costs.
A UWB can achieve localization using RSS, TOA, AOA, and TDOA techniques [154][155][156].Traditional UWB localization systems are mostly based on mathematical methods and filters [157,158].Lim et al. proposed a three-layered bidirectional Long Short-term Memory (Bi-LSTM) neural network to optimize UWB localization [159].Sutera et al. used reinforcement learning to optimize noise in UWB localization and correct localization errors [160].Recently, data fusion is the trend of UWB positioning.A Constraint Robust Iterate Extended Kalman filter (CRIEKF) algorithm has been proposed by Li and Wang to fuse the UWB and IMU [33].A Sage-Husa Fuzzy Adaptive Filter (SHFAF) is used to fuse the UWB and IMU [31].Cano et al. proposed a Kalman filter-based anchor synchronization system to tune clock drift [154].The UWB has also been used as a hybrid localization method and SLAM method for data fusion using filter algorithms [32,36].Liu et al. studied how to arrange anchor points scientifically, and proposed a UWB-based multi-base station fusion positioning method [161].In the actual robot positioning process, the arrangement and rationality of the base station directly determine the positioning accuracy and benign space area.

SLAM
SLAM is the most used technology in mobile robot positioning (word frequency 39).Among the 147 papers, 68 papers use SLAM-based positioning technology.Whether it is LiDAR SLAM or visual SLAM, it offers the advantage of not needing to change the environment, has no special requirements for the environment, is universal, and has the ability to avoid obstacles.Today, there are very cheap options for cameras and LiDAR.In theory, if the algorithms and computing resources are good enough, it can become a "GPS" technology for indoor navigation.
Computer Vision is the positioning technology most concerned by researchers, and visual + vision has appeared 19 + 7 = 26 times.Vision has the richest environmental information of all methods, and is also the most computationally challenging.It also has the highest requirements for algorithm innovation, so many researchers focus on improving the performance of vision algorithms.Visual SLAM is the main method based on vision, followed by beacon-based.Semantic (word frequency eight; twelve papers use semantic SLAM) is a new trend in the direction of visual SLAM.It gives the space and objects in the environment the actual meaning.It improves the recognition of the environment, the process of visual SLAM, and the efficiency of loop closure detection.It is also beneficial for mobile robots to recognize the environment at a higher level and perform more complex tasks.Semantic maps are mostly based on the ORB-SLAM2 algorithm (7/12).The most used semantic SLAM target detection algorithm is the YOLO-V3 algorithm (7/12).

Data Fusion
Data fusion is also a key direction for researchers to innovate (word frequency 15, papers 29).Various sensors have different advantages, disadvantages, and applicable scenarios.The realization of sensor information fusion can complement each other's strengths and weaknesses, meet the needs of various working environments, and improve the positioning accuracy of mobile robots.Among the one hundred and forty-seven Vision has the richest environmental information of all methods, and is also the most computationally challenging.It also has the highest requirements for algorithm innovation, so many researchers focus on improving the performance of vision algorithms.Visual SLAM is the main method based on vision, followed by beacon-based.Semantic (word frequency eight; twelve papers use semantic SLAM) is a new trend in the direction of visual SLAM.It gives the space and objects in the environment the actual meaning.It improves the recognition of the environment, the process of visual SLAM, and the efficiency of loop closure detection.It is also beneficial for mobile robots to recognize the environment at a higher level and perform more complex tasks.Semantic maps are mostly based on the ORB-SLAM2 algorithm (7/12).The most used semantic SLAM target detection algorithm is the YOLO-V3 algorithm (7/12).

Data Fusion
Data fusion is also a key direction for researchers to innovate (word frequency 15, papers 29).Various sensors have different advantages, disadvantages, and applicable scenarios.The realization of sensor information fusion can complement each other's strengths and weaknesses, meet the needs of various working environments, and improve the positioning accuracy of mobile robots.Among the one hundred and forty-seven selected papers, thirty-one papers used multiple sensors for data fusion, and one paper used four localization methods for data fusion [118].The data fusion of the IMU positioning method and other positioning methods has appeared 16 times, including vision, LIDAR, UWB, visible light, and geomagnetic field.The advantage of the IMU is that the whole process of positioning is closed loop and is not disturbed by other factors.The disadvantage is also obvious, the positioning accuracy is not high, and there will be accumulated errors.Therefore, it is the best choice to fuse the IMU and the method that can evaluate the pose in real-time, and the two can complement each other's advantages.The whole closed-loop feature of the IMU is used to prevent the sudden positioning error of other methods, and the real-time performance of other methods is used to correct the accumulated error of the IMU.Almost all mobile robots will be equipped with the IMU.
The fusion algorithms used in the data fusion papers mainly include the EKF, particle filter, neural network, and improved algorithm based on the Kalman filter.Among them, the EKF accounted for 12/29 and the fusion method based on the improved version of the Kalman filter accounted for 7/29.The EKF is a nonlinear version of the Kalman filter.It is a nonlinear state estimation approximation method based on Taylor series expansion to overcome nonlinear system and measurement models.It has been used for a long time in nonlinear sequential data processing problems.In the past ten years, the EKF has also been widely used in the field of mobile robot positioning, which is the most classic algorithm to solve the problem of multi-sensor data fusion.Compared with the extended Kalman filter, the neural network has a stronger ability to deal with nonlinear problems, which also means that it can process more data and obtain higher accuracy.In other fields related to data processing, papers based on neural networks are growing rapidly, but in the localization problem of mobile robots that requires robustness, the instability of neural networks is a problem that needs to be solved.In the papers using other improved algorithms of Kalman filtering, the researchers take the improvement of the Kalman algorithm as the biggest innovation of the paper, which also indicates that the core of multi-sensor data fusion lies in the algorithm.Multiple sensors will cause a large increase in the amount of information, and the specifications of this information are inconsistent.How to filter useless information and noise quickly, process data, and comprehensively analyze it are the keys to the fusion algorithm, and they are also the focus of future research work.

Innovative Methods
Of the 147 papers, 31 papers used the ROS system.The main applications of ROS are concentrated in papers based on LiDAR and Computer Vision localization methods.In 2007, the ROS system was developed by the Stanford University Robotics Team.It is currently a very widely used open-source robot operation sub-system in the field of robotics.Similar to MATLAB, ROS encapsulates complex program implementations into different libraries, saving a lot of low-level programming time for program compilers.Each library can be compiled separately, which is convenient for program compilers to modify the functions of the underlying code, and it is also convenient for beginners to learn and use other people's open-source programs.It is very beneficial for research innovation in the algorithm.The completely open-source features and the improvement of countless scientific research and programming workers make the ROS system widely used in the field of robotics.
Four papers used cloud computing.At present, to ensure the flexibility and endurance of mobile robots, microcontrollers are the main choice for airborne processors.However, as sensor technology improves, so does the computing demand for mobile robots.Especially in the direction of sensor fusion and SLAM, insufficient computing power restricts the effectiveness of the algorithm.The emergence of cloud computing can become a breakthrough for this bottleneck.Transfer the huge computing tasks of mobile robots to cloud computers.Use the fast and efficient computing power of cloud computing to ensure the real-time performance of mobile robot positioning and mapping.In this way, the mobile robot acts as a data collector and becomes part of the cloud framework.In this way, a large amount of useful data collected by the sensor can be fully utilized to improve positioning accuracy.However, the effect of cloud computing is limited by the transmission efficiency of the network.
In the selected papers, there are three ways used to demonstrate the scientificity and validity of the proposed method: simulation, testing in a real environment, and running on public datasets.The most effective method of scientific research is simulation + actual measurement.Simulation can easily modify sensor types and parameters, test positioning algorithm codes, simulate various working environments, and demonstrate the theoretical feasibility of the proposed method.Testing in a real environment is the most convincing way to demonstrate, especially the design of a positioning system for a specific environment.The ultimate purpose of proposing a positioning method for a mobile robot is to run it in an actual working environment.Running public datasets is an important means to demonstrate the feasibility of SLAM methods.As SLAM becomes the preferred technology for the indoor positioning of mobile robots, how to uniformly evaluate the advantages and disadvantages of various algorithms has become the focus.Public datasets are conducive to the objective evaluation of algorithms and horizontal comparison of excellent algorithms in the field of SLAM.The most commonly used public dataset is the Technical University of Munich (TUM) dataset.

Types of Papers
The graph above summarizes the number of methods used for the selected papers.There are a total of 12 different methods.It can be clearly found that Computer Vision (55) and LiDAR (44) dominate the indoor localization of mobile robots, followed by the IMU ( 16), UWB (13), VLC (12), Wi-Fi (8), RFID (6), and Ultrasonic (5), which have also been studied more.Geomagnetic (2), ZigBee (2), IR (1), and Bluetooth (1) received little attention.
As can be seen in Figure 5, in the field of radio, the UWB is the most used method, even surpassing Wi-Fi.In fact, in the field of indoor positioning, Wi-Fi is the most popular method.The reason is that the indoor positioning of mobile robots requires higher accuracy than ordinary indoor positioning, and the UWB is a method in which the positioning accuracy of the radio method can stably reach the cm level.Thus, the UWB is more suitable for mobile robots.The limitation of the UWB is the high cost of meeting the high-precision requirements.Wi-Fi positioning is also a very mature method.Almost all modern facilities have ready-to-use equipment, the cost of transformation is low, and the existing research on Wi-Fi positioning is also very extensive.Applying Wi-Fi to a robot is technically simple, but it lacks precision.It usually needs to be combined with other optimization methods, such as neural networks, deep learning, filters, etc. RFID is connected behind Wi-Fi, and all RFID-based indoor positioning papers of mobile robots use UHF-RFID.UHF-RFID is an emerging technology, and the positioning accuracy of a few papers has reached the centimeter level, and most of them are at the decimeter level.Non-radio localization has more applications in robotic indoor localization, especially Computer Vision and LiDAR.With the continuous improvement of algorithms and the improvement of sensor equipment by researchers, they have the advantages of centimeter-level high precision, there is no need for environmental modification, and they have the advantage of adapting to complex environments.The only limitation is the high need for computational cost.If the computational problem can be solved, the accuracy and speed of positioning will be better.As necessary hardware for robots, the IMU is usually used as an auxiliary component for data fusion.VLC is an emerging positioning technology.It innovatively uses the common LED in daily life as the main component, which can meet the needs of lighting and positioning at the same time.An ultrasonic can achieve Non-radio localization has more applications in robotic indoor localization, especially Computer Vision and LiDAR.With the continuous improvement of algorithms and the improvement of sensor equipment by researchers, they have the advantages of centimeterlevel high precision, there is no need for environmental modification, and they have the advantage of adapting to complex environments.The only limitation is the high need for computational cost.If the computational problem can be solved, the accuracy and speed of positioning will be better.As necessary hardware for robots, the IMU is usually used as an auxiliary component for data fusion.VLC is an emerging positioning technology.It innovatively uses the common LED in daily life as the main component, which can meet the needs of lighting and positioning at the same time.An ultrasonic can achieve centimeter-level accuracy in short-range positioning and is mainly used for robot obstacle avoidance.Geomagnetic positioning is not used much in the field of robotics due to its low positioning accuracy.It uses the natural magnetic field of the earth and is also a positioning technology worth exploring.

Paper Citations
From the process of retrieving mobile robot indoor positioning papers and Table 3, it can be found that there are many related papers in the past three years.There are many positioning schemes proposed, but none of them have the same dominant positioning technology as GPS.Due to the complexity and diversity of work environments and different needs, the new methods proposed in the researchers' published papers are suitable for specific environments.Moreover, most of the proposed methods are based on the optimization of existing technologies, and there is no subversive original method.The highest citation count is forty-seven, and the citation count of most papers is below five.Papers in this field may have the following characteristics: (1) the practicality of the proposed scheme is not significant enough, (2) there is little content overlap between the papers, and the application scope of the proposed scheme is narrow, and (3) the paper is based on past knowledge points, (4) which proves that there are few reviews or surveys in the field of the indoor positioning of mobile robots lately.These 147 papers were published in 72 journals.The most published journals are IEEE Access-fifteen (papers), Sensors-thirteen, Robotics and Automation-six, Intelligent Robots and Systems-five, IEEE Internet of Things Journal-four, IEEE Transactions on Industrial Electronics-three, Proceedings of the ACM International Conference Proceeding Series-three, International Journal of Advanced Robotic Systems-three, Optical Engineering-three, Advanced Intelligent Mechatronics-three, and Transactions on Automation Science and Engineeringthree.The top conferences in the field of robotics, such as ICRA (0), IROS (1), RSS (0), ROVISP (0), etc., have hardly appeared, indicating that the current focus of robotics research has shifted from the direction of positioning.
The organization information of the first author was also collected, and it can be seen that researchers from different countries pay attention to the field of indoor positioning of mobile robots.The top countries were China-forty-five (papers), the United Statestwenty, Germany-eleven, Spain-eight, South Korea-seven, the United Kingdom-six, France-six, Italy-five, Japan-five, and Canada-four.

Conclusions
This paper conducts a systematic review of the literature in the field of indoor positioning of mobile robots in the past three years.First, the 12 mainstream positioning methods are classified and introduced, including the positioning technology applied in the positioning methods.The advantages and disadvantages of each positioning method in terms of positioning accuracy, cost, need for facility modification and robustness are analyzed.Each localization method is supported by the screened literature.The paper also deeply introduces SLAM based on LiDAR and Computer Vision, which is the focus of indoor positioning of mobile robots.Due to its high precision, high stability, no need to change the environment, and universality to the environment, it has the potential to become an indoor positioning 'GPS' method.Then, the 147 papers selected from the papers in the past three years are summarized and analyzed, and the latest status and future development trends of indoor positioning of mobile robots are obtained.
To verify the feasibility of the proposed positioning method, both simulation and experiment in the real environment are good choices.Simulation can easily simulate different working scenarios, and experiments in the real environment can fully demonstrate the application ability of the positioning method.The combination of both approaches is the most ideal verification method.
In addition, sensor fusion has great advantages.Each positioning method has advantages and disadvantages, which can be combined to make up for its shortcomings, adapt to more and more complex environments, and achieve higher accuracy.The EKF algorithm is the most popular algorithm for sensor data fusion.Emerging technologies such as deep learning, neural networks, and cloud computing also appeared in the selected papers, and they all combined with localization methods to receive solutions for realizing high-precision localization of mobile robots.They are of great help to the improvement of algorithm performance and are the future trend of mobile robot development.

Figure 1 .
Figure 1.Schematic diagram of the IMU dead reckoning algorithm.

Figure 1 .
Figure 1.Schematic diagram of the IMU dead reckoning algorithm.
et al. used the PSO algorithm to improve the particle filter [20].Nie et al. added loop detection and correction functions [61].Chen et al. proposed a heuristic Monte Carlo algorithm (HMCA) based on Monte Carlo localization and discrete Hough transform (DHT) [62].Garrote et al. added reinforcement learning to particle filter-based localization [63].Yilmaz and Temeltas employed self-adaptive Monte Carlo (SA-MCL) based on an ellipse-based energy model

Figure 2 .
Figure 2. Visual SLAM flowchart.•Sensordata input: Environmental data collected by the camera.Occasionally, the IMU is used as a secondary sensor.•Front-endvisual odometer: Preliminary camera poses estimation based on image information of adjacent video frames.•Back-endnonlinear optimization: optimize the camera pose obtained by the frontend to reduce the global error.•Loopclosure detection: According to the image to determine whether to reach the previous position.Form a closed loop.•Mapping:Build a map of the environment based on continuous pose estimates.

Figure 4 .
Figure 4. Word cloud of the 147 paper titles.

Figure 4 .
Figure 4. Word cloud of the 147 paperComputer Vision is the positioning technology most concerned by researchers, and visual + vision has appeared 19 + 7 = 26 times.Vision has the richest environmental information of all methods, and is also the most computationally challenging.It also has the highest requirements for algorithm innovation, so many researchers focus on improving the performance of vision algorithms.Visual SLAM is the main method based on vision, followed by beacon-based.Semantic (word frequency eight; twelve papers use semantic SLAM) is a new trend in the direction of visual SLAM.It gives the space and objects in the environment the actual meaning.It improves the recognition of the environment, the process of visual SLAM, and the efficiency of loop closure detection.It is also beneficial for mobile robots to recognize the environment at a higher level and perform more complex tasks.Semantic maps are mostly based on the ORB-SLAM2 algorithm (7/12).The most used semantic SLAM target detection algorithm is the YOLO-V3 algorithm (7/12).

Figure 5 .
Figure 5.The number of papers for the 12 localization methods.

Figure 5 .
Figure 5.The number of papers for the 12 localization methods.

Table 1 .
The IMU-based data fusion papers.