Review

A Survey of Machine Learning Algorithms for Autonomous Vehicles

by Agnieszka Lazarowska *, Monika Rybczak, Mirosław Łącki, Krystian Kozakiewicz, Józef Lisowski and Andrzej Stateczny
Department of Autonomous Systems, Faculty of Computer Science, Gdynia Maritime University, 81-225 Gdynia, Poland
* Author to whom correspondence should be addressed.
Electronics 2026, 15(10), 2073; https://doi.org/10.3390/electronics15102073
Submission received: 4 April 2026 / Revised: 4 May 2026 / Accepted: 6 May 2026 / Published: 13 May 2026

Abstract

This paper presents a comprehensive review of recent works (2020–2026) on machine learning (ML) algorithms applied to autonomous platforms such as unmanned underwater vehicles (UUVs), unmanned surface vehicles (USVs), unmanned aerial vehicles (UAVs), and ground-based mobile robots. The review focuses on the following functional areas: environment perception, simultaneous localization and mapping (SLAM), collision avoidance and path planning, and motion control. Different ML methods are covered, including supervised, semi-supervised, and unsupervised learning, as well as reinforcement learning and deep reinforcement learning. The reviewed methods are analyzed with respect to their performance, robustness, and suitability for different operational environments, including underwater, surface, air, and land domains. Finally, the authors identify key challenges and outline promising future directions aimed at improving the safety, autonomy, and reliability of autonomous vehicles.

1. Introduction

Autonomous vehicles are highly complex systems used in various industries, such as transportation, agriculture, defense, logistics, and warehousing. These vehicles can operate without human control by relying on advanced technologies, including sensors, artificial intelligence, and complex control algorithms. Achieving autonomous operation involves a set of subtasks, such as environment perception, simultaneous localization and mapping (SLAM), collision avoidance, path planning, and motion control. Autonomous vehicles operate in sea, air, and land environments and are therefore classified based on their operating domain as unmanned underwater vehicles (UUVs), unmanned surface vehicles (USVs), unmanned aerial vehicles (UAVs), and unmanned ground vehicles (UGVs).
The authors in [1] reported on the growth of publications related to different types of autonomous vehicles between 2015 and 2024. Other recent survey papers on autonomous vehicles include [2], where the authors compare vision-, LiDAR-, and wireless-sensor-based methods applied to collaborative positioning in swarms of UAVs, UGVs, USVs, and UUVs.
Table 1 presents a comparative analysis of recent review papers on autonomous vehicles. The surveys were compared with respect to the types of autonomous vehicles, methods, and tasks. Most recent surveys focus on a single type of autonomous vehicle, such as UAVs [3,4,5], UUVs [6,7,8,9], mobile robots [10,11], or self-driving cars [12]. Review papers covering multiple types of vehicles are much less common [1,2,13,14].
Another observation is that many surveys focus on a single task, such as path (trajectory) planning [4,9,13,14,17], sensor fusion [1], detection and classification [3], collaborative positioning [2], visual-based localization and mapping [8], or tracking control [7]. Surveys addressing multiple tasks are also less common [5,10,11,16].
Many surveys also consider different methodological approaches, such as traditional methods (e.g., graph-based and sampling-based), bio-inspired techniques, and machine learning-based approaches [4,6,10,11,13,14,16]. Surveys focusing exclusively on machine learning-based solutions are relatively rare [3,5,15,17].
This paper presents a survey of machine learning methods (2020–2026) applied to environment perception, SLAM, collision avoidance, path planning, and motion control in different types of autonomous vehicles, including UUVs, USVs, UAVs, and ground-based mobile robots.
The contributions of this paper to the literature on autonomous vehicles can be summarized as follows:
  • The surveyed literature covers four types of autonomous platforms: unmanned underwater vehicles, unmanned surface vehicles, unmanned aerial vehicles, and ground-based mobile robots.
  • The following tasks are considered in the literature review: environment perception, simultaneous localization and mapping, collision avoidance and path planning, and motion control.
  • The review focuses exclusively on machine learning-based methods.
  • The review covers studies published between 2020 and 2026.
  • The analysis also includes a tabular comparison, which provides information such as authors, year of publication, machine learning method, task, and verification method (simulation and/or real-world experiments).

2. Machine Learning for Unmanned Underwater Vehicles

Unmanned Underwater Vehicles can be divided into two types: Autonomous Underwater Vehicles (AUVs) and remotely operated vehicles (ROVs) [18]. AUVs operate without real-time human control, executing preplanned missions and potentially adapting their behavior using onboard sensors and algorithms. They typically have limited communication during operation.
ROVs are piloted by a human operator in real time, typically via a tether that provides communication (often high-bandwidth) and sometimes power. This makes them well suited for precise inspection and manipulation tasks, but it limits their range and mobility compared with fully autonomous systems. These two types of UUVs are shown in Figure 1 and Figure 2.
Figure 3 presents the number of scientific articles published between 2020 and 2026, retrieved from multiple publishers’ databases. The following search query was used: “unmanned underwater vehicles” OR “UUV” AND “machine learning”. The results of the bibliometric analysis related to UUVs and ML are shown in Figure 4. The analysis was performed using VOSviewer (1.6.20) [19], which enables the creation of cluster maps based on authors’ keywords extracted from papers published between 2020 and 2026. The data were obtained from bibliographic files exported from the Web of Science.
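As an illustration of how such keyword co-occurrence data can be prepared for VOSviewer, the following minimal Python sketch counts author-keyword pairs in a Web of Science tab-delimited export. The file name and the use of the “DE” author-keyword column are assumptions for illustration, not part of the reviewed workflow:

```python
# Count author-keyword co-occurrences in a Web of Science tab-delimited
# export; the "DE" column holds author keywords separated by semicolons.
# The file name and column layout are assumptions for illustration.
import csv
from collections import Counter
from itertools import combinations

pair_counts = Counter()
with open("wos_export_uuv_ml.txt", encoding="utf-8-sig") as f:
    for record in csv.DictReader(f, delimiter="\t"):
        keywords = sorted({k.strip().lower()
                           for k in (record.get("DE") or "").split(";")
                           if k.strip()})
        for a, b in combinations(keywords, 2):  # each unordered pair once per paper
            pair_counts[(a, b)] += 1

for (a, b), n in pair_counts.most_common(10):
    print(f"{n:4d}  {a} <-> {b}")
```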
Figure 1. Bluefin-21 AUV [20].
Figure 2. Defender ROV [21].
Figure 3. Number of publications on ML algorithms for UUVs (2020–2026).
Figure 4. Keyword co-occurrence clustering results in VOSviewer for “unmanned underwater vehicles” and “machine learning”.

2.1. Environment Perception for UUVs

In UUVs, environment perception is typically built as a pipeline: sense → preprocess → data fusion → interpret → use for navigation, avoidance, or manipulation, operating under strong environmental constraints such as currents, temperature, attenuation, turbidity, and multipath effects in water [22].
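As a concrete illustration of this processing chain, the following minimal Python skeleton mirrors the sense → preprocess → data fusion → interpret stages; every function body is a simplified placeholder rather than a component of any reviewed system:

```python
# Minimal skeleton of the sense -> preprocess -> data fusion -> interpret
# pipeline; every function body is a simplified placeholder.
import numpy as np

def sense() -> dict:
    """Acquire raw measurements (here: a fake sonar ping and an IMU sample)."""
    return {"sonar": np.random.rand(96, 512), "imu": np.random.rand(6)}

def preprocess(raw: dict) -> dict:
    """Denoise/normalize each channel (placeholder: min-max scaling)."""
    sonar = raw["sonar"]
    return {**raw, "sonar": (sonar - sonar.min()) / (np.ptp(sonar) + 1e-9)}

def fuse(data: dict) -> np.ndarray:
    """Combine modalities into a single feature vector (placeholder concat)."""
    return np.concatenate([data["sonar"].mean(axis=1), data["imu"]])

def interpret(features: np.ndarray) -> dict:
    """Derive navigation-relevant quantities (placeholder threshold test)."""
    return {"obstacle_ahead": bool(features[:10].mean() > 0.5)}

state = interpret(fuse(preprocess(sense())))
print(state)  # fed to navigation / avoidance / manipulation modules
```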
Limited onboard computational power and narrow communication channels also make efficient signal processing essential [18]. Recent environment perception methods for UUVs increasingly rely on forward-looking sonar, with learning-based detection and segmentation improving robustness under low-visibility conditions. For instance, Cao et al. [23] combined sonar image detection with DRL-based avoidance, while Gao et al. [24] proposed unsupervised obstacle segmentation to reduce annotation requirements. Other works leverage oscillatory sonar scanning for 3D reconstruction to support obstacle avoidance in complex environments [25].
Environment perception is tightly coupled with collision avoidance and path planning, and many studies address these tasks simultaneously, for example, by combining sonar-based detection with avoidance or planning frameworks [23,25,26,27].

2.2. Simultaneous Localization and Mapping for UUVs

This section reviews recent advances in deep learning (DL)-based underwater SLAM that address challenges encountered during underwater navigation. Conventional SLAM techniques often show limited performance in such conditions [28]. Recent surveys highlight key challenges in underwater SLAM, such as sensing degradation, illumination changes, and the need for multi-sensor integration [28]. For UUV-specific visual navigation and positioning, Qin et al. [6] provide a comprehensive survey discussing SLAM-related pipelines and sensing configurations for underwater environments.
The surveyed literature reveals no single dominant or widely used machine learning method for UUV SLAM; however, deep learning-based methods appear to be the most prominent ML family discussed, most often in combination with adaptive multi-sensor fusion and Kalman-filter-based localization frameworks [22]. Image recognition methods such as CNNs and YOLO are used in UUV SLAM primarily as front-end perception modules for feature extraction, object detection, and segmentation. However, in the current literature, they are more often applied to support specific SLAM components rather than serving as the basis of a complete end-to-end UUV SLAM framework.

2.3. Collision Avoidance and Path Planning for UUVs

Recent works increasingly apply ML, especially deep reinforcement learning (DRL), to local collision avoidance and real-time path planning for UUVs operating in cluttered and uncertain underwater environments. For example, Gao et al. [26] proposed a PPO–DWA planner that combines the Proximal Policy Optimization algorithm with a modified Dynamic Window Approach to satisfy kinematic constraints while using forward-looking sonar observations. A related approach based on the A3C algorithm was presented in [29], while in [30] the authors proposed Deep Deterministic Policy Gradient (DDPG) for collision avoidance tasks.
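To make the idea of combining a learned policy with a Dynamic Window Approach more tangible, the sketch below generates kinematically feasible velocity candidates and ranks them with a placeholder scoring function standing in for a trained PPO network; the geometry, limits, and scorer are illustrative assumptions, not the implementation of [26]:

```python
# DWA-style candidate generation with a learned scoring function; the
# scorer below is a stand-in for a trained PPO policy/critic network.
import numpy as np

def dynamic_window(v, w, dt=0.5, a_max=0.3, alpha_max=0.5):
    """Velocity pairs reachable within one step under acceleration limits."""
    vs = np.linspace(max(0.0, v - a_max * dt), v + a_max * dt, 5)
    ws = np.linspace(w - alpha_max * dt, w + alpha_max * dt, 7)
    return [(vv, ww) for vv in vs for ww in ws]

def collision_free(v, w, ranges, horizon=3.0, safety=1.0):
    """Crude feasibility check against the nearest forward sonar return."""
    return v * horizon + safety < ranges.min()

def policy_score(v, w, goal_bearing):
    """Placeholder for a trained network: prefer speed toward the goal."""
    return v - 0.8 * abs(w - 0.5 * goal_bearing)

ranges = np.full(32, 8.0)  # fake forward-looking sonar ranges [m]
candidates = [c for c in dynamic_window(v=0.2, w=0.0)
              if collision_free(*c, ranges)]
best = max(candidates, key=lambda c: policy_score(*c, goal_bearing=0.3))
print("commanded (surge, yaw rate):", best)
```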
Learning in unknown environments is also a challenging task but can be effectively addressed through the integration of the Dubins Improved Hybrid A* (DIHA*) algorithm and the Fuzzy Heading Avoidance (FHA) algorithm. ML has also been applied to AUV obstacle avoidance using event-triggered reinforcement learning [31] and the Elephant Clan Update Optimization algorithm [32].
Figure 5 and Table 2 summarize recent collision avoidance and path planning methods for UUVs/AUVs.

2.4. Motion Control of UUVs

Motion control of UUVs refers to the set of algorithms that enable the vehicle to follow desired motions, i.e., regulate its position, velocity, and attitude by commanding actuators (thrusters, control fins, and ballast systems), while accounting for nonlinear hydrodynamics and environmental disturbances.
In practice, it includes tasks such as heading, depth, and attitude control, station keeping, and trajectory/path tracking, typically covering full six-degree-of-freedom (6-DOF) motion: surge, sway, heave, roll, pitch, and yaw.
Motion control is closely linked to collision avoidance and path planning, which is why many of the reviewed studies are relevant to both Section 2.3 and this section.
Additional papers on machine learning-based motion control for UUVs are presented in Figure 6 and Table 3. They include solutions based on advanced algorithms such as Multi-Agent Proximal Policy Optimization (MAPPO) and Weighted Generative Adversarial Imitation Learning (WGAIL), which can mitigate reward-design challenges in DRL-based controllers, as well as reinforcement learning methods enhanced with large language models (LLMs).

3. Machine Learning for Unmanned Surface Vehicles

Unmanned surface vehicles (USVs) can be regarded as autonomous marine robots. They are used for various marine tasks, such as environmental monitoring, harbor patrol, coastal and seabed mapping, offshore platform inspection, oceanographic and meteorological data collection, and search and rescue operations.
Figure 7 summarizes the results of a search in multiple publishers’ databases for papers on machine learning applied to USVs published between 2020 and 2026. The following search query was used: “unmanned surface vehicle” OR “USV” AND “machine learning”. The results of a bibliometric analysis related to USVs and ML are shown in Figure 8. The analysis was performed using VOSviewer [19], which enables the creation of cluster maps based on authors’ keywords extracted from papers published between 2020 and 2026. The data were obtained from bibliographic files exported from the Web of Science.

3.1. Environment Perception for USVs

Environment perception in USVs typically involves processing data from sensors such as cameras, LiDAR, radar, sonar, Global Navigation Satellite System (GNSS), and Inertial Measurement Units (IMUs). The aim of this task is to identify obstacles such as vessels and shorelines, as well as environmental conditions such as waves and visibility. These data are crucial for enabling safe and autonomous navigation of USVs and serve as input to algorithms performing functions such as collision avoidance, path planning, and autonomous decision-making. An example of a USV, HydroDron, with an environment perception system is shown in Figure 9. The paper [39] presents a survey of recent solutions and technologies applied to USVs, with special emphasis on learning-based methodologies and data-driven systems for guidance, navigation, and control.
Figure 10 and Table 4 present recent ML-based environment perception methods for USVs.
In [41], a dual-camera system applied on a USV was proposed for target detection and localization. In this approach, a convolutional neural network (CNN), YOLOv3, was used for target detection and extraction of regions of interest (ROIs). Feature extraction and matching were then performed within the ROIs instead of across the whole image. Targets were localized by applying the triangulation principle using matched points and calibrated camera parameters.
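The triangulation step of such a dual-camera setup can be sketched as follows, assuming calibrated 3×4 projection matrices and a single matched point pair; the matrices and pixel coordinates below are illustrative, not taken from [41]:

```python
# Two-view triangulation of one matched point pair with calibrated
# cameras; projection matrices and pixel coordinates are illustrative.
import numpy as np
import cv2

# Assumed 3x4 projection matrices P = K [R | t] obtained from calibration.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# One matched image point per camera, shape (2, N) with N = 1.
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[300.0], [240.0]])

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4x1 homogeneous point
X = (X_h[:3] / X_h[3]).ravel()                   # Euclidean 3D position
print("estimated target position:", X)
```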
The authors in [42] introduced an environment perception system for a USV based on sensor fusion of camera and LiDAR data. The system consists of five modules: a surface image denoising module (SIDM), a water reflection removal module (WRRM), a sea-sky detection module (SSDM), a surface target detection module (STDM), and an obstacle avoidance module (OAM). In the SSDM module, an SVM classifier was used to determine a hyperplane separating the sea and sky regions. In the STDM module, the YOLOv3 CNN was used for target detection, similarly to [41]. The environment perception system was tested in simulations and real-world experiments using the USV platform WAM-V-USV. Field tests were carried out on the Songhua River in China and in Hawaii, USA.
In [43], the authors also presented an approach based on camera data for target detection. A CNN was used in this approach to improve obstacle edge detection accuracy. It was integrated into a superpixel segmentation method called Simple Linear Iterative Clustering (SLIC). The method was validated in simulations using three publicly available maritime image datasets.
The authors in [44] proposed a cooperative USV–UAV system based on camera data for object detection and classification, as well as semantic segmentation of sea and air regions. The system utilized CNNs, specifically the YOLOX and PIDNet models. The approach was evaluated in field tests carried out in the Huanghai Sea near Yancheng, China, using a cooperative USV–UAV platform consisting of an unmanned catamaran and a quadrotor.
In [45], the authors introduced a visual perception method utilizing a lightweight convolutional neural network (CNN) and the USV’s real-time heading angle. The model inputs include local environmental images and heading angle features.
In [46], the authors proposed a semantic segmentation method to classify image elements into three categories: navigable water surfaces, waterborne obstacles, and background regions. The authors developed a model called MarineSeg, composed of a CNN–Transformer encoder with a voting-based decoder.
The analysis of recent literature on ML-based environment perception for USVs indicates that most methods are based on camera data and the application of CNNs for target detection [41,42,43]. These deep learning techniques have significantly improved perception accuracy and robustness in complex and dynamic marine environments.
CNNs are commonly used for image-based tasks, including those involving USVs, due to their ability to effectively handle complex visual variability such as waves, reflections, and lighting conditions. Besides object detection, another task commonly addressed using ML-based visual perception is semantic segmentation of sea and sky regions [42,44,46]. These methods are typically evaluated in simulations, although in some cases they are also tested using real USV platforms [41,42,44].
Environment perception for USVs using machine learning methods remains an open research area. LiDAR- and radar-based perception using clustering algorithms, as well as point cloud processing using deep learning, are still under active development. Similarly, sensor fusion models that combine data from cameras, radar, and LiDAR using neural networks are also an ongoing area of research.

3.2. Simultaneous Localization and Mapping for USVs

SLAM for USVs is an important technology that enables marine vehicles to autonomously navigate and map unknown maritime environments. The goal of SLAM in USVs is to build a real-time map of the surroundings while simultaneously estimating the vehicle’s precise position within that map. Typical sensors used for this task include LiDAR, sonar, Global Positioning System (GPS), and cameras.
Machine learning methods can be integrated into SLAM algorithms for USVs to enhance perception, mapping accuracy, and adaptability in complex maritime environments. In particular, deep learning can improve feature extraction from sensor data, enabling more robust identification of landmarks and environmental structures.
However, an analysis of recent literature on SLAM for USVs reveals that the application of machine learning to this task remains limited. Conventional SLAM techniques, such as the Voxel Generalized Iterative Closest Point (VGICP) algorithm for LiDAR-based SLAM, are still actively developed and have been proposed in recent studies, e.g., in [47].
The paper [48] presents a survey of small object detection (SOD) methods developed for various types of autonomous vehicles, including USVs, and also considers ML-based approaches. SOD techniques can be integrated into SLAM systems to improve mapping and navigation accuracy.
A survey of collaborative SLAM methods for different types of unmanned vehicles, such as UAVs, UGVs, USVs, and UUVs, is presented in [2], considering various sensor modalities, including wireless, vision, and LiDAR. The study also addresses the use of deep learning techniques.
An overview of sensors used in autonomous vehicles, including cameras, LiDAR, radar, ultrasonic sensors, GPS/GNSS, IMU/Inertial Navigation System (INS), odometry sensors, and acoustic systems, is presented in [1].
In [49], the authors proposed a multi-sensor fusion approach for environmental perception during berthing navigation, which was tested on a USV in Lingshui Bay, Dalian, China. In [50], a LiDAR-based real-time moving object detection method for USVs was introduced. The approach was validated through experiments using a USV platform.
Many studies focus on individual components applicable to SLAM in USVs rather than providing complete end-to-end SLAM solutions. For instance, ref. [51] presents a review of datasets and deep learning techniques for vision systems in USVs. In [52], the authors describe a visual SLAM approach in which the deep learning model YOLOv8n-seg is used to detect dynamic objects.
SLAM methods for USVs must be robust to external environmental factors that can cause distortion in point clouds. LiDAR-based SLAM approaches are also affected by poor laser reflectivity from water surfaces, resulting in sparse or missing point clouds and a lack of reliable features [47].

3.3. Collision Avoidance and Path Planning for USVs

In USVs, similarly to robot path planning, approaches can be classified into global (offline) path planning and local (online) path planning, the latter also referred to as collision avoidance. Global path planning is typically understood as the process of generating an optimal route that avoids known hazards such as shorelines, shallow waters, restricted zones, and high-risk traffic areas, often using nautical charts and environmental forecasts. Local path planning algorithms usually generate paths during navigation to handle dynamic and uncertain situations, such as encounters with manned vessels, other USVs, or unexpected obstacles detected by onboard sensors. This group of methods is also referred to as collision avoidance, as they focus on short-term maneuvers (e.g., adjusting speed or heading) to maintain safe separation and comply with maritime regulations (International Regulations for Preventing Collisions at Sea—COLREGs).
Figure 11 and Table 5 present recent collision avoidance and path planning methods for USVs. The analysis of this comparison indicates that most methods are based on different types of deep reinforcement learning (DRL), such as Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO). The second most represented group consists of approaches based on reinforcement learning (RL), such as Inverse Reinforcement Learning (IRL) and Multi-Agent Reinforcement Learning (MARL).
COLREGs compliance was typically encoded in the reward function. In [54], an appropriately shaped and sized ship domain was applied to enforce COLREGs compliance of the solutions. All approaches were evaluated in simulations, and most algorithms were tested in scenarios with a limited number of target ships (TSs). The authors in [58,60] used encounter situations defined in the Imazu problem.
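Although the details differ between studies, the way COLREGs terms typically enter a DRL reward can be illustrated with the following sketch, in which all weights, thresholds, and the encounter classification are made-up assumptions:

```python
# Illustrative DRL reward combining goal progress, collision risk and a
# COLREGs term (penalizing a missing starboard turn in head-on encounters).
# All weights and thresholds are made up for the sketch.
def reward(d_goal_prev, d_goal, d_cpa, encounter, turned_starboard):
    r = 2.0 * (d_goal_prev - d_goal)      # progress toward the goal
    if d_cpa < 0.5:                       # closest point of approach [nm]
        r -= 10.0                         # collision-risk penalty
    if encounter == "head-on" and not turned_starboard:
        r -= 5.0                          # COLREGs Rule 14 violation
    return r

print(reward(d_goal_prev=5.2, d_goal=5.0, d_cpa=0.8,
             encounter="head-on", turned_starboard=True))
```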
The key challenges in the application of machine learning methods to collision avoidance and path planning for USVs can be summarized as follows:
  • ML-based models may not guarantee safety and should therefore be combined with explicit safety constraints;
  • ML-based models may not guarantee COLREGs-compliant solutions;
  • training datasets are limited;
  • multiple-vessel encounter scenarios are difficult to cover during training;
  • reward function design is nontrivial;
  • explainability requirements are difficult to satisfy.

3.4. Motion Control of USVs

Motion control methods for an unmanned surface vehicle are used to ensure that the vessel follows a predefined path or a sequence of waypoints. Various advanced control strategies, such as PID control, model predictive control, and adaptive control, are employed to accomplish this task while accounting for environmental disturbances such as waves, wind, and currents. Motion control tasks for USVs can be divided into several categories, including station keeping, trajectory tracking (TT), path following (PF), waypoint navigation, dynamic positioning, and formation control.
Figure 12 and Table 6 present recent machine learning-based motion control methods for USVs.
In [64], the authors proposed an approach based on Generative Adversarial Imitation Learning (GAIL) for steering a USV toward a goal position while avoiding obstacles. In this method, the reward function is derived from a set of demonstrations performed by a human expert.
A reinforcement learning-based control method for trajectory tracking of fully actuated surface vessels was introduced in [65]. The method’s efficiency was evaluated in simulations and sea trials using the ReVolt unmanned surface vehicle. Three different tracking tasks were considered: the four-corner DP test, straight-path tracking, and curved-path tracking.
The use of model-based deep reinforcement learning for motion control of an underactuated unmanned surface vehicle was presented in [66]. In this approach, a data-driven prediction model based on a deep neural network was developed using recorded input and output data. The stochastic gradient descent (SGD) method was used for training the neural network. Based on the learned prediction model, a model predictive control strategy was then applied for the USV’s path-following and trajectory-tracking tasks. The method was tested in simulations.
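The core of such a model-based approach, learning a one-step dynamics model from logged data with stochastic gradient descent, can be sketched in PyTorch as follows; the network size, state layout, and synthetic data are assumptions for illustration, not the model of [66]:

```python
# Learning a one-step USV dynamics model with SGD from logged
# (state, action) -> next-state pairs, as a building block for MPC.
import torch
import torch.nn as nn

state_dim, action_dim = 6, 2  # e.g., [x, y, psi, u, v, r] and [thrust, rudder]
model = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.Tanh(),
                      nn.Linear(64, state_dim))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# Stand-ins for recorded input/output data from vehicle logs.
X = torch.randn(1024, state_dim + action_dim)
Y = torch.randn(1024, state_dim)

for epoch in range(50):
    loss = nn.functional.mse_loss(model(X), Y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final training loss:", loss.item())
```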
In the work [67], the authors applied the Deep Q-Network (DQN) along with Neural Network–Model Predictive Control (NN-MPC) for waypoint tracking. In this approach, initial training was carried out on the Turtlebot UGV, and afterward, the policy developed in the ground domain was transferred to the water domain and further trained using a USV. The method was validated in both simulations and field tests.
In [68], the authors proposed a receding-horizon reinforcement learning-based (RHRL) control method for the trajectory tracking task of a USV. The method was tested in simulations and compared with Lyapunov-based MPC (LMPC) and sliding mode control (SMC). The authors reported the following advantages of their approach: computational efficiency, lower sample complexity, and higher learning efficiency.
The authors in [69] introduced an Improved Transferring Deep Deterministic Policy Gradient (ITDDPG)-based path-following controller for a USV. The method was tested in both simulations and experiments. The authors concluded that their approach is characterized by the following advantages: faster convergence speed, superior tracking performance, and stronger robustness and generalization ability compared with traditional DDPG methods.
In [70], the authors proposed a DNN-based controller for real-time optimal USV control. The method was validated in simulations.
A survey focused on motion control methods for USVs, including literature prior to 2020, can be found in [71]. It covers tasks such as target tracking, trajectory tracking, path following, and cooperative formation control of USVs, as well as methods based on neural networks, fuzzy logic, reinforcement learning, and Adaptive Dynamic Programming (ADP).
Most recent machine learning-based control methods for USVs focus on trajectory tracking and path-following tasks and apply reinforcement learning [65,68,69] or deep reinforcement learning [66] methods. In some cases, hybrid approaches are used, such as in [67], where the authors combined DQN with NN-MPC. Most methods are evaluated in simulations, but some also include real-world experiments using a USV platform.

4. Machine Learning for Unmanned Aerial Vehicles

The main classification of flying drones is based on their size. This scale includes categories such as unmanned aerial vehicles (UAVs), micro air vehicles (MAVs), nano air vehicles (NAVs), pico air vehicles (PAVs), and smart dust (SD) [72]. Most of these platforms cannot use machine learning due to their very small size and the lack of space for appropriate sensors and computing units required for data processing and analysis. The types of flying drones capable of using ML are primarily UAVs and MAVs; however, UAVs are the most commonly used category due to the widespread adoption of drones in both civil and military applications [73]. An example of a UAV, eBee X, is shown in Figure 13.
Because of their wide range of applications and easy access to the environment, UAVs can be used for environmental perception, SLAM, collision avoidance, path planning, and motion control. The bird’s-eye view enables exploration of large areas, and the relatively low density of obstacles at certain altitudes facilitates measurements. The main limitations are the weight budget of the drone, which restricts the sensors and computing equipment that can be carried, and the resulting need to optimize machine learning models for onboard deployment.
Figure 14 summarizes the results of a search in multiple publishers’ databases for papers on machine learning applied to UAVs published between 2020 and 2026. The following search query was used: “unmanned aerial vehicle” AND “machine learning”.
The results of a bibliometric analysis related to UAVs and machine learning are shown in Figure 15. The analysis was performed using VOSviewer [19], which enables the creation of cluster maps based on authors’ keywords extracted from papers published between 2020 and 2026. The data were obtained from bibliographic files exported from the Web of Science.

4.1. Environment Perception for UAVs

This section provides an overview of environmental perception, in which machine learning is used to process raw spatial data. Interpreting these data is crucial, as it enables their use in other aspects of UAV operations. Environmental perception relies on a variety of sensors, including LiDAR, RGB cameras, radar, and ultrasonic sensors. Because the resulting data vary widely in format and application, different machine learning algorithms and models are employed to process them.
Detecting and recognizing objects in real time in military applications using RGB camera images is a challenging task that requires appropriate solutions. The approach proposed by the authors in [75] involves the use of the YOLOv3 algorithm, which belongs to the class of convolutional neural networks. YOLO enables fast real-time image processing because each frame passes through the network only once. This single-pass design reduces processing time, allowing rapidly changing environmental data to be analyzed continuously. Thanks to this real-time capability, a smartphone could be used as the device for displaying the analysis results. A similar approach, this time applied to environmental perception for navigation, can be found in another study that also utilizes the YOLOv3 algorithm [76].
Another issue is the processing of point clouds obtained via LiDAR. Compared with 2D images from RGB cameras, LiDAR data form a 3D point cloud, which requires a different approach. In the paper [77], the authors addressed the problem of segmenting and classifying the obtained data. The ITCD (Individual Tree Crown Delineation) algorithm, which belongs to the supervised learning category, was used for data segmentation. In a subsequent step, CNNs were used for data classification.
The authors of the paper [78] addressed a similar problem. They demonstrated the use of the HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) algorithm, which is an unsupervised learning method. This algorithm groups clusters of varying density, enabling efficient segmentation of unknown data.
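A minimal example of HDBSCAN-based point cloud segmentation in this spirit is given below, using a synthetic cloud in place of real LiDAR returns; it assumes scikit-learn 1.3 or newer, which provides sklearn.cluster.HDBSCAN:

```python
# Unsupervised segmentation of a 3D point cloud with HDBSCAN; the
# synthetic cloud below stands in for real LiDAR returns.
import numpy as np
from sklearn.cluster import HDBSCAN  # scikit-learn >= 1.3

rng = np.random.default_rng(0)
ground = rng.uniform([-10, -10, 0.0], [10, 10, 0.2], size=(500, 3))
canopy = rng.normal([2.0, 3.0, 6.0], 0.4, size=(150, 3))  # dense "tree crown"
cloud = np.vstack([ground, canopy])

labels = HDBSCAN(min_cluster_size=30).fit_predict(cloud)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters found:", n_clusters, "| noise points:", int((labels == -1).sum()))
```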
The authors of the study [79] described their work on fused data from an RGB camera and LiDAR. Two CNN-based architectures (U-Net and DeepLabv3) were used for data classification. The aim was to compare the effectiveness of these algorithms with RF (Random Forest) and SVM (Support Vector Machine) methods. The study demonstrated that advanced neural networks are more effective when dealing with complex data.
The authors of another article [80] also addressed the problem of classifying fused LiDAR and RGB camera data. They reduced the dataset by considering only data points located above a certain height. As a result, they used an SVM algorithm and showed that it performs more effectively in data classification than a decision tree (DT), which is faster but less suitable for complex data.
Fused data from LiDAR and hyperspectral imaging require a complex processing approach. The authors of [81] conducted a series of simulations to identify a suitable algorithm for this problem. The Gradient Boosting Machine (GBM) proved to be the best of the tested methods, enabling effective feature extraction from complex data. The study demonstrated that supervised learning performs well with fused data.
Figure 16 and Table 7 present the latest machine learning-based environment perception methods for UAVs. It can be observed that all analyzed studies are based on real-world experiments, which reflects the complexity of the environments in which UAVs operate. Due to the large number of obstacles in three-dimensional space, simulation alone is often insufficient.
The main methods used for environmental perception are deep learning and supervised learning. These approaches are commonly employed because environmental perception is closely related to object detection and classification tasks, for which they are particularly effective. For the detection and classification of visual data, such as camera images, deep learning with CNN models is most commonly used [75,76,79]. In the case of classifying data not derived from cameras but from other sensors, supervised learning is more commonly used [77,80,81]. This is due to the need to define human-labeled classes in order to effectively analyze the environment.

4.2. Simultaneous Localization and Mapping for UAVs

This section discusses the use of machine learning in localization and mapping systems. For UAVs, localization and mapping present significant challenges due to the three-dimensional nature of the environment. Linear algorithms are insufficient; therefore, machine learning is used to enhance the navigation capabilities of UAVs.
Machine learning-based navigation often relies on different data than traditional navigation methods. The authors of [82] used camera-based (vision-based) data. Based on this type of data, reinforcement learning was applied using the Deep Q-Network (DQN) algorithm. This approach enables autonomous navigation in dynamic environments. The results of the algorithm can be used in later stages for collision avoidance and motion control.
The authors of [83] also used camera data for UAV navigation. The difference lies in the algorithm used: a CNN (specifically, the ResNet-18 model implemented in Python) was applied for navigation and terrain mapping. Visual and geometric features were extracted using convolutional layers, while the map can be dynamically updated during flight.
Another challenge is navigating enclosed spaces such as buildings and underground corridors. In such cases, machine learning can be used to assist with terrain mapping. The authors of [84] used a CNN (the ENet model) to extract human silhouettes from the data. This ensures that not all detected elements are treated as part of the terrain, which simplifies the mapping process.
Mapping a selected area using additional parameters posed a challenge for the authors of [85]. The goal was to map groundwater levels and soil moisture. Both tasks were performed using a Random Forest (RF) model. This approach enables the mapping of large areas without the need for field measurements. Thanks to the RF model, it is possible to predict groundwater levels and soil moisture, which supports environmental monitoring and mapping.
Detecting specific objects during terrain mapping enables faster responses from UAVs. The authors of [86] described such a detection method. Machine learning was used to detect obstacles and determine their positions. A CNN-based algorithm (YOLO model) was used for this purpose. Detecting objects as obstacles enabled the subsequent application of collision-avoidance algorithms. This indicates that mapping and localization, when supported by machine learning, are crucial for UAV operations.
Another approach to mapless navigation is to use LiDAR data without real-time mapping. Instead, positioning is performed dynamically. This approach was explored by the authors of [87]. Instead of generating a terrain map, the obtained data are used to immediately react to obstacles using two reinforcement learning models: TD3 (Twin Delayed Deep Deterministic Policy Gradient) and SAC (Soft Actor–Critic). The task of these models is to detect objects and determine their distance and angle relative to the UAV. With this information, the models can be used for collision avoidance by outputting flight speed and direction.
The next example of navigation involves using an RGB camera to locate objects in front of a UAV. The authors of [88] investigate the use of a CNN (the Faster R-CNN model) to detect objects and determine whether an obstacle is too close (a critical obstacle). A forest was used for testing, where trees served as obstacles. These tests provided realistic results regarding how accurately the CNN can recognize and localize obstacles.
Images can also be used to detect moving objects such as people, cars, and animals, which is essential for navigation in real-world conditions. The authors of [89] investigated a solution to this problem in dynamic environments. They used a Deep Q-Network, a type of reinforcement learning model. The distance to the target was used as the reward, where shorter distances yielded higher rewards. Conversely, the distance to detected objects was used as a penalty, where shorter distances resulted in higher penalties.
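The distance-based reward-and-penalty scheme described above can be expressed in a few lines; the scaling constants and 3D coordinates below are illustrative assumptions, not values from [89]:

```python
# Distance-shaped reward: closer to the target gives a higher reward,
# closer to detected objects gives a higher penalty. Constants are toy.
import numpy as np

def shaped_reward(pos, target, obstacles, k_goal=1.0, k_obs=2.0):
    r = -k_goal * np.linalg.norm(target - pos)   # shorter distance, higher reward
    for obs in obstacles:
        d = np.linalg.norm(obs - pos)
        r -= k_obs / (d + 1e-3)                  # shorter distance, higher penalty
    return r

pos = np.array([0.0, 0.0, 2.0])
print(shaped_reward(pos,
                    target=np.array([5.0, 0.0, 2.0]),
                    obstacles=[np.array([1.0, 0.5, 2.0])]))
```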
Figure 17 and Table 8 present the latest machine learning-based SLAM methods for UAVs. It is important to note that deep learning using CNNs dominates the field of UAV localization [83,84,86,88]. Due to the complexity of the environment, the use of cameras as a source of visual information is essential for SLAM systems to function properly. Camera images enable effective object localization through CNN-based detection. The mapping problem has also been addressed with supervised learning, for example, the Random Forest model used to predict groundwater levels and soil moisture in [85].

4.3. Collision Avoidance and Path Planning for UAVs

The previous sections summarized research on environmental perception and navigation. Many of the findings from this research have contributed to collision avoidance through object detection and localization. This section discusses research focused on collision avoidance and path planning for UAVs.
One machine learning approach is reinforcement learning, as proposed by the authors of [90]. Three algorithms were compared: Proximal Policy Optimization, Advantage Actor–Critic (A2C), and Soft Actor–Critic. The study addressed collision avoidance for a flying UAV using LiDAR data and was divided into two main stages with varying levels of environmental complexity. The first stage included three obstacles: one dynamic and two static. The second stage additionally accounted for the presence of other UAVs performing the same task. The entire study was conducted in a simulated environment, and the results demonstrate the effectiveness of the aforementioned algorithms for collision avoidance. PPO exhibited greater stability and better performance, while A2C demonstrated significantly higher processing speed. The SAC algorithm proved to be the least effective among those tested for this specific problem.
The authors of [91] explored a different approach to collision avoidance. Instead of using LiDAR data, they utilized camera images processed by artificial neural networks. A combination of two neural networks—CNN (Convolutional Neural Network) and RNN (Recurrent Neural Network)—was used to analyze video frames. This hybrid neural network enabled object detection and the subsequent determination of an evasive trajectory. The experiment was conducted under real-world conditions; however, the model was only able to avoid a single recognizable object.
An example of a collision avoidance study that included both simulation and real-world experiments is described in [92]. The authors employed deep reinforcement learning using the Deep Deterministic Policy Gradient algorithm. The algorithm processed incoming data, adjusted the UAV’s flight speed, and adapted its trajectory to avoid obstacles.
For the path planning task, the authors of [93] proposed a reinforcement learning approach based on the Q-learning algorithm. The main objective of the study was to plan flight trajectories while accounting for both static and dynamic obstacles. The authors compared the Q-learning algorithm with other methods and demonstrated that, despite requiring longer training time, Q-learning produces the shortest paths among the evaluated algorithms.
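For readers unfamiliar with the update rule underlying these studies, the following toy grid-world sketch shows tabular Q-learning with a step cost that favors short paths; the environment and hyperparameters are illustrative, not those of [93]:

```python
# Tabular Q-learning on a toy 5x5 grid; the step cost favors short paths.
import numpy as np

n, n_actions = 5, 4          # actions: 0=up, 1=down, 2=left, 3=right
Q = np.zeros((n * n, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.2
goal = n * n - 1             # bottom-right cell
rng = np.random.default_rng(1)

def step(s, a):
    """Deterministic grid transition with wall clamping."""
    r, c = divmod(s, n)
    if a == 0:   r = max(r - 1, 0)
    elif a == 1: r = min(r + 1, n - 1)
    elif a == 2: c = max(c - 1, 0)
    else:        c = min(c + 1, n - 1)
    return r * n + c

for episode in range(500):
    s = 0                    # start in the top-left cell
    while s != goal:
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s2 = step(s, a)
        reward = 10.0 if s2 == goal else -1.0
        Q[s, a] += alpha * (reward + gamma * Q[s2].max() - Q[s, a])
        s = s2

print("greedy first move from start:", int(Q[0].argmax()))  # expect down or right
```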
The authors of [94] investigated cooperative trajectory planning for multiple UAVs. The main aspects considered include coordination among multiple vehicles, collision avoidance, and synchronization of arrival times at the destination. Reinforcement learning was used for this purpose, employing the Proximal Policy Optimization algorithm, which enabled the simultaneous coordination of multiple UAVs.
A key limitation of earlier approaches was the poor scalability of centralized swarm control and the risk of collisions in complex scenarios. This limitation was addressed by using reinforcement learning instead of classical optimization methods.
The authors of the paper [95] investigated UAV path planning in a large and dynamic environment. To address the research problem—characterized by a vast space, incomplete knowledge of the environment, and dynamic obstacles—they employed deep reinforcement learning using the DQN algorithm. The algorithm demonstrated that deep reinforcement learning performs well in collision avoidance and path planning within a vast environment containing moving objects.
Path planning is often based not only on collision avoidance but also on other factors. The authors of [96] conducted a study on a multi-criteria problem that additionally considered fuel consumption and safety. Reinforcement learning using the Q-learning algorithm was employed to solve this problem. This algorithm, combined with metaheuristics, enabled the selection of an optimal UAV trajectory under complex environmental conditions and specific constraints.
The combination of the Q-learning algorithm with metaheuristics also made it possible to address more complex challenges, such as path planning and collision avoidance in environments with multiple constraints, as well as swarm control. The authors of [97] investigated this problem. Reinforcement learning using the Q-learning algorithm was employed for this task. As previously described, this algorithm is also applicable to such complex problems, and it enabled the coordination of multiple UAVs.
Figure 18 and Table 9 present the latest machine learning-based collision avoidance and path planning methods for UAVs. It can be observed that reinforcement learning and deep reinforcement learning are the most commonly used machine learning approaches. Among all algorithms, Q-learning was the most frequently used [93,96,97]. This algorithm is particularly well suited for path planning based on three-dimensional environmental data. This is because Q-learning assigns values to possible actions and, based on appropriately selected parameters, selects the action with the highest value. All studies on collision avoidance and path planning required a simulation component due to the risk of UAV damage resulting from failure to avoid obstacles.

4.4. Motion Control of UAVs

Controlling UAV motion requires considering many factors and operating in three-dimensional space. Previous sections have discussed aspects such as path planning and mapping. Based on this information, the UAV must be controlled appropriately to ensure optimal performance. Today, machine learning is often used for autonomous UAV control.
An important task in motion control is position control. Due to numerous disturbances and nonlinear dynamics, UAV control is challenging. To address this, the authors of [98] replaced the classical PID controller and, using supervised learning, trained a neural network on data generated by the PID controller. This approach improved the accuracy and stability of UAV positioning.
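The idea of imitating a PID controller with supervised learning can be sketched as follows, using a toy one-dimensional loop and synthetic tracking errors; the gains, features, and network are illustrative assumptions, not the setup of [98]:

```python
# Imitating a PID controller with supervised learning on a toy 1-D loop:
# log PID commands for synthetic tracking errors, then fit a regressor.
import numpy as np
from sklearn.neural_network import MLPRegressor

kp, ki, kd = 2.0, 0.1, 0.5
X, y = [], []
err_prev = integ = 0.0
for i, err in enumerate(np.random.uniform(-1, 1, 2000)):
    if i % 50 == 0:                          # start a new short "episode"
        err_prev = integ = 0.0
    integ += err
    deriv = err - err_prev
    u = kp * err + ki * integ + kd * deriv   # PID command to imitate
    X.append([err, integ, deriv])
    y.append(u)
    err_prev = err

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000).fit(X, y)
print("NN output:", net.predict([[0.3, 0.0, 0.0]])[0], "| P-term alone:", kp * 0.3)
```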
The authors of [99] also investigated replacing the classical PID controller but instead used reinforcement learning with the DDPG algorithm. In this case, rather than learning from PID-generated data, the model was trained using a reward-and-penalty mechanism based on executed actions, which enabled the development of optimal control policies independent of classical PID behavior.
The promising results of the DDPG algorithm led the researchers to further develop it in [100]. They proposed the Robust DDPG algorithm, an improved version of the original method. The enhanced version offers more stable learning, incorporates noise into the state space, and improves exploration of control strategies.
The authors of [101] focused on comparing UAV motion control capabilities using reinforcement learning and deep reinforcement learning. They used DQN as the algorithm under study, and the results clearly demonstrate the effectiveness of this type of machine learning. This highlights the complexity of UAV control, where many factors influence vehicle performance.
Many UAVs are based on quadrotor configurations. Controlling these systems—which are uncommon in other types of unmanned vehicles—presents an additional challenge in UAV motion control. The authors of [102] addressed this issue. As a solution, the researchers proposed deep reinforcement learning using the PPO algorithm. This enabled optimal motor control and achieved better results than a classical PID controller.
Swarm control is a highly challenging task, which the authors of [103] investigated using deep reinforcement learning. They employed the SAC algorithm to control motion as effectively as possible while avoiding collisions. The study showed that the DDPG and PPO algorithms perform worse in controlling multiple UAVs simultaneously.
Figure 19 and Table 10 present the latest machine learning-based motion control methods for UAVs. As with collision avoidance and path planning, most of the research is based on simulations. This is mainly due to the risk of damaging UAVs in the event of a failure in controlling their movement. The most commonly used approach is reinforcement learning, employing various algorithms [99,100,101,102,103].
Based on the observed research results, it can be concluded that controlling a robot in three-dimensional space with a large number of obstacles is highly challenging, and there is still a need to transfer such research into experimental validation in real-world environments. Another important aspect is swarm control, in which a group of UAVs is required to synchronize their operations. This field requires further development and research.

5. Machine Learning for Mobile Robots

This review covers the application of machine learning to ground-based mobile robots, which move using wheels, tracks, or legged locomotion. This field involves the integration of sensor data processing, distance estimation between objects, and actuator control, such as wheel motors or track drives. During the literature review, the application of machine learning in mobile robots was examined across several scientific databases, as illustrated in Figure 20.
The results of a bibliometric analysis related to ground mobile robots and machine learning are shown in Figure 21. The analysis was performed using VOSviewer [19], which enables the creation of cluster maps based on authors’ keywords extracted from papers published between 2020 and 2026. The data were obtained from bibliographic files exported from the Web of Science.
Subsequent sections of this part of the paper cover the application of machine learning methods for environment perception, SLAM, path planning, obstacle avoidance, and motion control. An example of a UGV, Husky A300 from Clearpath Robotics (Kitchener, ON, Canada), is shown in Figure 22.

5.1. Environment Perception for Mobile Robots

In mobile robotics, processing sensor data to improve environmental perception is a primary area of interest for researchers, as it directly supports subsequent stages such as navigation and speed control, which are essential in autonomous systems. The overview of environment perception is organized according to the following sensors: LiDAR (Light Detection and Ranging), GPS, radar, RGB cameras, RFID, and thermal sensors. All publications described in this section are shown in Figure 23 and Table 11.
The problem of environment perception can be considered in two areas. The first concerns environmental detection in navigation using laser-based solutions. Another area addressed by the authors of [105] involves unstructured environments such as construction sites. In this example, the authors propose the use of a neural network that integrates stereovision data for robust semantic segmentation of obstacles, improving perception in diverse environments. This approach enables real-time processing of sensor data in changing scenarios in autonomous mobile systems.
As the authors of [106] note, the results may help in selecting effective design approaches for controlling robotic devices, applying machine learning methods for pattern recognition and classification, and using computer technologies for designing control systems and simulating robotic devices.
In [107], the authors demonstrate the use of an improved neural network, specifically an evolving self-organizing incremental neural network (ESOINN), for environmental perception in both internal and external environments of a walking robot.
The authors of [108] also combined PIR and ultrasonic sensors on a mobile social robot whose task is to locate people in the environment. The data from the sensors are analyzed using two machine learning algorithms: a supervised method (decision tree) and an unsupervised method (K-means). The results show that the accuracy of detecting a person in the tested area was 70%.
A combination of sensors for capturing RGB images and depth information was presented in [109]. The study evaluated two CNN-based networks that use these data to determine the angular velocity of the robot. The application of supervised learning using a support vector machine was also explored for vehicles operating in a networked, dynamic environment; the method proved effective under challenging environmental conditions [110]. Finally, by combining multiple sensors supported by neural networks on a mobile robot (TurtleBot3 Waffle, ROBOTIS Co., Ltd., Seoul, Republic of Korea), the authors of [111] created a set of scenarios to train the robot to detect glass as part of the environment map using sensor fusion.
Table 11. Recent ML-based environment perception methods for mobile robots.

| Method | Authors | Year | Sensors/Platform | Sim./Exp. |
| --- | --- | --- | --- | --- |
| SL, NN | Baretto-Cubero et al. [111] | 2022 | LiDAR LDS-01, ultrasonic sensors HC-SR04, RealSense 435i camera, TurtleBot3 Waffle | Real exp. |
| SL, NN | Kondratenko et al. [106] | 2022 | Software, mobile robots | Sim. |
| SL, E-SOINN NN | Xu et al. [107] | 2023 | Depth cameras (D435i and T265), hexapod robot | Real exp. |
| USL and SL, K-means and DT | Cuiffreda et al. [108] | 2023 | SRF10 ultrasonic and HC-SR501 PIR sensors, social mobile robot | Real exp. |
| SL, NN | Ziegler et al. [105] | 2025 | RGB-D, autonomous machine systems | Sim. and Real exp. |
| SL, CNN | Zain et al. [109] | 2025 | Intel RealSense D415, LiDAR, mobile robot inspired by TurtleBot2 | Real exp. |
| SL, SVM | Cui et al. [110] | 2025 | Connected network, mobile robots | Sim. |
Figure 23 and Table 11 present the latest environment perception methods for mobile robots. The analysis of this comparison reveals that most methods are based on supervised learning (SL) using neural networks. As shown in the table, neural networks are the most commonly used approach. Mobile robots are equipped with numerous sensors, including cameras that generate large amounts of image data, which require efficient processing—often in real time. Neural networks meet these requirements because they enable automatic feature learning and effective processing of large datasets. Furthermore, they are characterized by an end-to-end approach in which input sensor data are directly converted into output decisions. This property is particularly important when integrating a wide variety of sensors, which is a key aspect of environmental perception in mobile robots. Consequently, neural networks are often the most effective choice for such tasks.

5.2. Simultaneous Localization and Mapping for Mobile Robots

SLAM for mobile robots is used to gradually create a coherent map of an unknown environment while simultaneously localizing the robot within that map. All publications described in this section are shown in Figure 24 and Table 12.
Interesting review papers include [112,113], in which the authors claim that DRL policies in mobile robot navigation do not always ensure optimal performance and that the efficiency of DRL-based navigation is subject to fluctuations.
The paper [10] presents an overview of research on mobile robots. The review covers nearly 80 articles on classical methods and machine learning. It was demonstrated that the applied techniques improve the navigation of mobile robots in complex indoor environments. However, DRL still appears to be the most promising approach for achieving breakthroughs in navigation capabilities, as demonstrated in several application scenarios, such as indoor navigation, local obstacle avoidance, multi-robot navigation, and social navigation.
An SLAM-related approach was presented in [114], which discusses the application of different deep learning techniques. Five DL methods were investigated in this paper: convolutional neural networks for feature extraction and semantic understanding, recurrent neural networks for modeling temporal relationships, deep reinforcement learning for developing exploration strategies, graph neural networks (GNNs) for modeling spatial relationships, and attention mechanisms (AM) for selectively processing information.
A review of recent literature shows that SLAM and visual SLAM (VSLAM) capabilities can be extended through integration with learning-based algorithms for mobile robots. It is shown that combining deep learning with SLAM and VSLAM represents an innovative approach for autonomous robotic systems.
The authors of [115,116] presented a solution for mapless navigation control based on deep learning techniques. The proposed method utilizes data from LiDAR (Light Detection and Ranging) sensors and is based on imitation learning. The experimental results show that the mobile robot navigates safely in four unknown environments with an average success rate of 75%. The study was conducted for three simulation scenarios and one real-world scenario using reinforcement learning.
As the authors of [117] mention, both the robot’s trajectory and the map were estimated online in their approach, without the need for prior knowledge of the environment. The study focused on incorporating two types of deep learning agents: Deep Q-Networks and Double Deep Q-Networks. The mobile robot was trained to avoid collisions and navigate in an unknown environment.
In [118], the authors proposed a navigation method for a mobile robot to improve its adaptability to the environment. In this approach, a reinforcement learning method was applied to control the robot. The authors used the DDPG algorithm to train the neural network and verified that this method performs better than approaches utilizing vision and laser sensors, improving the navigation success rate by 10%.
In [119], the authors proposed a random forest method for location recognition and navigation. The focus was on location recognition based on 3D point clouds, and the random forest model was trained using feature vectors. Numerous experiments were conducted on the KITTI dataset and in the outdoor campus environment of Southeast University, located in Nanjing, Jiangsu Province, referred to as SEU.
The authors of [120] also combined the RTAB-Map method with deep CNN algorithms, improving target detection for the SROBO mobile robot.
In [121], the authors proposed an autonomous obstacle avoidance method combined with SLAM and based on deep reinforcement learning for a wheeled snake robot using multi-sensor data. The authors of [116] presented a deep learning-based corridor area classifier that utilizes 2D LiDAR data. The study shows that the maximum error is 77% for the modified algorithm, and that the robot’s speed increased by more than 5.6%. The route completion rate increased by 36.25% when deep learning was included, compared with SLAM-only localization.
Table 12. Recent ML-based SLAM methods for mobile robots.
Method | Authors | Year | Task | Sim./Exp.
DRL, GA3C | Surmann et al. [122] | 2020 | Navigation control, TurtleBot, NVIDIA Jetson TX1 | Sim. and Real exp.
DRL | Zhu et al. [112] | 2021 | Navigation control, mobile robot | Sim.
DRL | Nguyen et al. [116] | 2021 | Navigation control | Sim. and Real exp.
DRL, DQN, and DDQN | Lee et al. [117] | 2022 | Navigation (Gazebo) | Sim. and Real exp.
SL, RF | Zhou et al. [119] | 2022 | Mobile robot (experimental platform), LiDAR, IMU, GPS | Sim. and Real exp.
SL, deep CNN | Sadeghi et al. [120] | 2022 | Mobile robot (SROBO), LiDAR, IMU, GPS, NVIDIA | Sim. and Real exp.
RL and NN, DDPG | Yan et al. [118] | 2023 | Mobile robot, Gazebo | Sim.
DRL, DNN, and PPO | Wong et al. [123] | 2024 | Mobile robot, LiDAR, camera, ORB-SLAM2 | Sim.
DRL, DQN | Liu et al. [121] | 2024 | Mobile snake robot (Gazebo, RViz), NVIDIA Jetson, LiDAR, IMU | Sim. and Real exp.
DRL | Xu et al. [124] | 2025 | Mobile robot with Intel RealSense D435i camera | Sim.
Figure 24 and Table 12 show the latest SLAM methods for mobile robots. The analysis of this comparison reveals that most methods are based on DRL, including DQN, DDPG, deep neural networks (DNNs), and PPO. The DRL approach performs well in decision-making in dynamic and uncertain environments.
In classical SLAM, the robot estimates its position and builds a map, whereas in a DRL-based approach, it additionally learns strategies for exploring its surroundings. This means that the robot not only “sees” but also actively decides where to move in order to build a map more quickly and accurately.
A second advantage is the ability to learn an action policy based on experience. DRL enables the robot to optimize its movements through interaction with the environment, without the need to manually design rules. As a result, it can cope better with complex, previously unknown environments.
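As a minimal illustration of this idea (our own sketch, not taken from any specific cited work), an active-SLAM reward can credit newly mapped occupancy-grid cells while penalizing collisions and elapsed time; the weights and grid encoding below are illustrative assumptions.

```python
# Sketch of a reward signal an active-SLAM DRL agent could optimize:
# reward newly observed occupancy-grid cells, penalize collisions and time.
import numpy as np

def exploration_reward(prev_known: np.ndarray, curr_known: np.ndarray,
                       collided: bool, w_new=1.0, w_coll=-100.0, w_step=-0.05):
    """prev_known/curr_known: boolean grids marking cells already observed."""
    newly_mapped = np.logical_and(curr_known, ~prev_known).sum()
    return w_new * newly_mapped + (w_coll if collided else 0.0) + w_step
```

Maximizing such a signal pushes the policy toward viewpoints that enlarge the map quickly, which is exactly the exploration behavior that classical SLAM leaves to hand-designed heuristics.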

5.3. Collision Avoidance and Path Planning for Mobile Robots

In the case of mobile robots, key research topics include path planning and obstacle avoidance. Research often focuses on avoiding static obstacles, but the most interesting findings concern the avoidance of dynamic obstacles. All publications referred to in this section are shown in Figure 25 and Table 13.
The authors of [125] proposed a reinforcement learning approach based on the SAC algorithm to train a multi-agent system in which each agent makes decisions independently, without communicating with the other robots.
Similarly, the authors of [126] showed that deep reinforcement learning improves the performance of the STORM robot in avoiding obstacles in high-traffic environments and challenging scenarios.
In [127], the authors used the Rapidly-exploring Random Tree (RRT) method to address obstacle avoidance and path planning. GoogLeNet was used to classify obstacles. The RRT method generates a path from the starting point to the destination. In the experiments, the RRT approach was compared with a genetic algorithm (GA) and particle swarm optimization (PSO) in a static environment, as well as with an artificial potential field-based approach in a dynamic environment.
Path planning for mobile robots using machine learning has been further enhanced in [128], where the authors proposed a new algorithm by integrating the Intrinsic Curiosity Module (ICM) and Long Short-Term Memory (LSTM) into the PPO framework. The results show that the ICM provides intrinsic rewards in addition to external rewards, accelerating initial convergence. The actor–critic network was further improved by incorporating an LSTM-based architecture to better handle dynamic obstacles.
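The intrinsic-reward mechanism can be sketched as follows; this is a generic ICM formulation rather than the exact network of [128], and all layer sizes are illustrative assumptions.

```python
# Sketch of the Intrinsic Curiosity Module idea: the intrinsic reward is the
# prediction error of a learned forward model in feature space. Sizes assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    def __init__(self, state_dim=20, action_dim=2, feat_dim=32):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(state_dim, feat_dim), nn.ReLU())
        self.forward_model = nn.Sequential(
            nn.Linear(feat_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim),
        )

    def intrinsic_reward(self, s, a, s_next, eta=0.1):
        phi, phi_next = self.encode(s), self.encode(s_next)
        phi_pred = self.forward_model(torch.cat([phi, a], dim=-1))
        # Larger forward-model error => more novel transition => more reward.
        return eta * F.mse_loss(phi_pred, phi_next.detach(), reduction="none").mean(-1)

# Total reward fed to PPO: r_total = r_extrinsic + icm.intrinsic_reward(s, a, s_next)
```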
Table 13. Recent ML-based collision avoidance and path planning methods for mobile robots.
Method | Authors | Year | Task | Sim./Exp.
DL, RRT | Zhang et al. [127] | 2020 | Tracked trolley mobile robot | Sim. 2D
SL, CNN | Ran et al. [129] | 2021 | Mobile platform robot, NVIDIA and STM32, dynamic obstacles | Real exp.
RL, SAC, A3C, DWA, TEB | Choi et al. [125] | 2021 | Mobile platform SR7, GeForce GTX | Sim. and Real exp.
RL, DRL | Feng et al. [126] | 2021 | Mobile platform (STORM), Gazebo | Sim. and Real exp.
ML, ASGLDR (SGD, LR) | Das et al. [130] | 2022 | Mobile robot | Sim. and Real exp.
DRL, DQN | Wang et al. [131] | 2022 | Pioneer 3-AT robot, NVIDIA GeForce, Gazebo | Sim. and Real exp.
RL, DDPG-DG | Deshpande et al. [132] | 2024 | Mobile robot (Gazebo) | Sim. and Real exp.
RL, QL | Xiao et al. [133] | 2024 | Mobile robot | Sim.
DRL, PPO, ICM, LSTM | Zhang et al. [128] | 2024 | Mobile robot, TurtleBot3 Burger | Sim. and Real exp.
ML, ASGD-LARS | Thakur et al. [134] | 2025 | Autonomous mobile robot (AMR), MATLAB | Sim. and Real exp.
In [130], the authors proposed a machine learning algorithm called Adaptive Stochastic Gradient Descent Linear Regression (ASGLDR), which enables static obstacle avoidance by modeling mobile robot motion as left or right turns.
A similar approach using the Adaptive Stochastic Gradient Descent with Least Angle Regression (ASGD-LARS) algorithm was introduced in [134]. This model was designed to classify the motion of an autonomous mobile robot (AMR) into three categories: no turn, left turn, and right turn. The proposed approach was implemented on a NodeMCU ESP8266 controller to enable real-time static obstacle avoidance.
The authors introduced a computation method based on the Euclidean distance from static obstacles and the speeds of individual robot wheels, allowing faster information processing on the NodeMCU controller and enabling rapid decision-making. Additionally, to ensure fast convergence, the weights were updated using the adaptive stochastic gradient descent optimization technique.
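A simplified sketch of this decision scheme is given below; the feature set follows the text (obstacle distance and the two wheel speeds), while the update rule shown is a generic adaptive SGD step rather than the exact ASGD-LARS formulation of [134].

```python
# Simplified sketch of the three-class turn decision: a linear model over
# [obstacle distance, left wheel speed, right wheel speed] trained with an
# adaptive SGD step. Generic variant; not the exact ASGD-LARS of [134].
import numpy as np

CLASSES = ("no_turn", "left_turn", "right_turn")
W = np.zeros((3, 4))  # 3 classes x (3 features + bias)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(x, y, lr0=0.1, t=1):
    """One adaptive-SGD update; the learning rate decays with iteration t."""
    x = np.append(x, 1.0)                  # bias term
    p = softmax(W @ x)
    grad = np.outer(p - np.eye(3)[y], x)   # cross-entropy gradient
    W[:] -= (lr0 / np.sqrt(t)) * grad

# x = [euclidean_distance_to_obstacle, v_left_wheel, v_right_wheel]
step(np.array([0.4, 0.2, 0.2]), y=CLASSES.index("left_turn"), t=1)
print(CLASSES[int(np.argmax(W @ np.array([0.4, 0.2, 0.2, 1.0])))])
```

The appeal of such a lightweight model on a microcontroller like the NodeMCU is that inference reduces to a single matrix–vector product per scan.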
The work [132] presents a solution for planning and optimizing the route of a mobile robot in open terrain. The study applies a combination of DDPG and Differential Gaming (DG) methods for motion planning involving two mobile robots. The parameters required for decision-making were obtained from LiDAR, cameras, and proximity sensors.
Regarding algorithm improvement, the authors of [131] applied an enhanced DQN method, using collected data as training samples and combining environmental features and target point information as network inputs.
A trajectory tracking method based on optimized Q-learning (QL) was proposed in [133] for dynamic local environments. The authors applied the Observational Space Data Processing Method (OSDPM), in which raw LiDAR data are transformed into a reduced-dimensionality feature vector containing information about the robot’s surroundings.
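The core of such preprocessing can be illustrated as follows; compressing the scan to the nearest-obstacle distance per angular sector is one common reduction, and the sector count here is our assumption, not the configuration used in [133].

```python
# Illustrative sketch of OSDPM-style dimensionality reduction: compress a raw
# LiDAR scan into a small feature vector of per-sector minimum ranges.
import numpy as np

def reduce_scan(ranges: np.ndarray, n_sectors: int = 12) -> np.ndarray:
    """ranges: raw 360-beam scan -> n_sectors nearest-obstacle distances."""
    sectors = np.array_split(ranges, n_sectors)
    return np.array([s.min() for s in sectors])

scan = np.random.uniform(0.2, 5.0, size=360)  # placeholder scan
print(reduce_scan(scan))  # 12-dimensional state for the Q-learning agent
```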
Figure 25 and Table 13 show the latest collision avoidance and path planning methods for mobile robots. The analysis of this comparison reveals that most methods are based on various types of DRL, while the second most common group comprises approaches based on RL.
Reinforcement learning is particularly well suited to the collision avoidance problem, as it enables decision-making in dynamic and uncertain environments. RL/DRL methods learn an optimal action strategy from interactions with the environment through a reward function. This enables the robot to adaptively plan its motion trajectory while minimizing the risk of collision. Furthermore, this approach performs well in sequential decision-making problems and supports the integration of perception and control in real time.

5.4. Motion Control of Mobile Robots

In previous studies, robot motion control typically involved the use of PID controllers. Nowadays, researchers increasingly apply machine learning to address this task. All publications referred to in this section are shown in Figure 26 and Table 14.
In [135], motion control was implemented using RL instead of a traditional control algorithm. The study involved computer simulations to learn angular velocity based on the error angle. The aim was to maintain a constant linear velocity at its maximum value during the robot’s trajectory outside the docking area. The authors used a Q-learning algorithm.
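A minimal tabular Q-learning loop of this kind might look as follows; the discretization, action set, and parameters are illustrative assumptions rather than the settings of [135].

```python
# Tabular Q-learning sketch in the spirit of [135]: discretize the heading
# error and learn which angular-velocity command to apply while the linear
# velocity is held at its maximum. All parameters are illustrative.
import numpy as np

N_ERR_BINS = 21
OMEGAS = np.linspace(-1.0, 1.0, 7)   # candidate angular velocities, rad/s
Q = np.zeros((N_ERR_BINS, len(OMEGAS)))
alpha, gamma, eps = 0.1, 0.95, 0.1

def bin_error(err, lim=np.pi):
    """Map a heading error in [-lim, lim] to a discrete state index."""
    return int(np.clip((err + lim) / (2 * lim) * (N_ERR_BINS - 1), 0, N_ERR_BINS - 1))

def select_action(s):
    """Epsilon-greedy action selection over the Q-table."""
    return np.random.randint(len(OMEGAS)) if np.random.rand() < eps else int(Q[s].argmax())

def update(s, a, r, s_next):
    """Standard one-step Q-learning update."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# A plausible reward shaping: small heading error is good, e.g. r = -abs(error)
```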
In some studies, authors combine classical methods with machine learning. The papers [136,137] present an innovative hybrid control strategy that combines a neural network-based kinematic controller with adaptive model reference control, in which controller parameters are determined online using neural networks.
The research described in [138] demonstrates the integration of a PID controller with machine learning based on the SAC algorithm. This approach enables system control in environments that change in real time. A hierarchical structure was developed, consisting of an upper-level controller based on the SAC algorithm—one of the most competitive continuous control methods—and a lower-level controller based on an incremental PID controller. The effectiveness of the SAC–PID control method was verified using several trajectories of varying difficulty in both the Gazebo simulation environment and on a real Mecanum mobile robot.
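The hierarchy can be sketched as follows; the incremental PID update is the standard discrete formula, while the SAC actor is represented by a stub returning assumed gain values rather than the trained policy of [138].

```python
# Sketch of the hierarchical SAC-PID idea: an upper-level RL policy outputs
# PID gains online; a lower-level incremental PID turns the tracking error
# into a control increment. The policy here is a stub; gains are assumed.
import numpy as np

class IncrementalPID:
    """u_k = u_{k-1} + Kp*(e_k - e_{k-1}) + Ki*e_k + Kd*(e_k - 2e_{k-1} + e_{k-2})"""
    def __init__(self):
        self.u = 0.0
        self.e = [0.0, 0.0]  # stores e_{k-1}, e_{k-2}

    def step(self, e_k, kp, ki, kd):
        e1, e2 = self.e
        self.u += kp * (e_k - e1) + ki * e_k + kd * (e_k - 2 * e1 + e2)
        self.e = [e_k, e1]
        return self.u

def sac_policy(observation):
    # Placeholder for the trained SAC actor: maps tracking state to PID gains.
    return 1.2, 0.05, 0.3  # (kp, ki, kd), hypothetical gain values

pid = IncrementalPID()
for error in [0.5, 0.4, 0.25, 0.1]:          # hypothetical tracking errors
    kp, ki, kd = sac_policy(observation=error)
    print(pid.step(error, kp, ki, kd))
```

The division of labor is the key design choice: the RL layer adapts slowly varying gains to the environment, while the PID layer keeps the fast inner loop stable and interpretable.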
Similarly, the authors of [139] proposed a hybrid control strategy combining model-based control with a DL method based on an actor–critic (AC) framework. After training the model using the DDPG method, the action referred to as ACI (“acquired control input”) compensates for errors, resulting in improved control performance under conditions of speed and acceleration limitations, as well as system uncertainty.
Another approach to position control is the use of RL agents to control robot motion. These agents learn from experience and can improve decision-making compared with traditional control algorithms [140,141]. Importantly, the results show that DDPG and DQN agents achieve the shortest time to reach the destination.
Table 14. Recent ML-based motion control methods for mobile robots.
Method | Authors | Year | Task | Sim./Exp.
RL, QL | Farias et al. [135] | 2020 | Position control, mobile robot | Sim.
DRL, DDPG | Gao et al. [139] | 2021 | Tracking control, mobile robot | Sim.
NN | Hassan et al. [136] | 2022 | Tracking control, wheeled mobile robot, STM32, MATLAB | Sim.
DRL, DQN, DDPG | Quiroga et al. [140] | 2022 | Position control, Khepera IV mobile robot | Sim. and Real exp.
RL, SAC | Yu et al. [138] | 2022 | Automatic control, Mecanum mobile robot, Gazebo | Sim. and Real exp.
DRL, SAC | Cao et al. [141] | 2024 | Path following, mobile robot, Raspberry Pi 4 | Sim.
NN | Ha et al. [137] | 2025 | Tracking control, wheeled mobile robot, STM32 | Sim. and Real exp.
Figure 26 and Table 14 show the latest motion control methods for mobile robots. The analysis of this comparison reveals that most methods are based on various types of DRL, including DDPG and DQN. The second most common group comprises approaches based on RL, such as Q-learning and SAC.
There is no definitive answer as to which method is best suited for motion control. However, DRL-based approaches—particularly algorithms such as DDPG and SAC—can be recommended. The motion control problem is continuous in nature, requiring the generation of smooth control signals. Unlike classical RL methods, DRL algorithms are designed to operate in continuous spaces and enable end-to-end learning of control strategies.
Furthermore, DRL demonstrates a high capacity to adapt to nonlinear robot dynamics and variable environmental conditions. In particular, the SAC algorithm is characterized by greater learning stability and improved exploration of the state space.
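The following fragment illustrates why such actors suit continuous motion control: a deterministic policy head maps the state directly to bounded, smooth velocity commands. This is a generic sketch; the state dimension and command bounds are illustrative assumptions.

```python
# Minimal actor sketch for continuous motion control (DDPG/SAC-style):
# the network maps the state to bounded (v, omega) commands via tanh squashing.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim=16, v_max=0.5, w_max=1.0):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 2), nn.Tanh())
        self.scale = torch.tensor([v_max, w_max])

    def forward(self, s):
        return self.body(s) * self.scale  # smooth, bounded (v, omega) commands
```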

6. Discussion

For UUVs, future research should prioritize tightly integrated perception–planning–control pipelines that operate reliably under acoustic noise, low visibility, and limited communication bandwidth. Recent studies indicate the need to combine forward-looking sonar perception with real-time avoidance and planning strategies [23,24,26,27]. A promising direction is to move from loosely coupled modules toward end-to-end or hierarchically coordinated architectures that explicitly model uncertainty and safety constraints.
Another key direction is robust sim-to-real transfer for UUV autonomy, including SLAM under domain shift, data-efficient learning, and long-duration adaptation to changing underwater conditions. In the UUV SLAM-related literature, the most commonly used machine learning approach is deep learning, often combined with Kalman filtering. However, no single architecture—such as YOLO or a specific CNN variant—emerges as dominant; such models are more frequently applied in environmental perception.
Survey papers emphasize multi-sensor integration and deployment challenges [18,28], while recent control studies suggest that hybrid DRL frameworks and sim-to-real tuning can improve robustness in field operations [36,38]. In this context, future work should also emphasize standardized benchmarks, shared datasets, and reproducible sea-trial protocols to enable fair comparisons across methods.
Most ML-based methods for environment perception in USVs rely on cameras and CNNs [41,43,44,45], whereas many recent studies employ conventional SLAM techniques instead of ML-based approaches [47]. SLAM for USVs remains a challenging task due to external environmental factors, which can lead to sparse or missing point clouds. For motion control, most recent approaches are based on RL or DRL [66,67,69,70,71].
The key challenges associated with the application of machine learning (ML) methods to collision avoidance and path planning for USVs include the lack of inherent safety guarantees (thus requiring the incorporation of safety constraints), possible non-compliance with COLREGs, limited availability of training data, the complexity of multi-vessel interactions, challenges in reward function design, and the need for explainability.
For UAVs, a major challenge is the three-dimensional environment, which has been divided into several key problems in scientific research. These include obstacle avoidance, terrain mapping based on various data sources, multidimensional and multi-criteria control, and swarm control. It is worth noting that different tasks are addressed using various machine learning methods. Deep reinforcement learning dominates motion control, while SLAM employs a more diverse set of methods depending on the specific task or data being processed.
Convolutional neural networks are most frequently applied in mobile robot environmental perception, whereas for robot motion control, neural networks and deep learning techniques are used, as well as reinforcement learning supported by DQN and DDPG algorithms.
It has been observed that the majority of approaches applying machine learning methods to mobile robots are tested in simulation environments, while real-world demonstrations are less common.
It is worth noting that, when conducting research related to mobile robots, the following software environments are commonly used: MATLAB, ROS, virtual machines, Gazebo, and RViz. Raspberry Pi is a popular control platform, alongside NVIDIA computing platforms.
Table 15 presents a summary of ML-based approaches for different autonomous vehicles and the four tasks covered in this review paper. As can be seen in this table, CNNs are the most commonly used approach for environment perception in different vehicles, such as UUVs, USVs, UAVs, and mobile robots. CNNs are widely used for environment perception in autonomous vehicles due to their ability to efficiently extract spatial features from images (edges, corners, textures). They are particularly effective for tasks such as object detection and semantic segmentation. However, vision transformers (ViTs) are regarded as a promising and competitive approach for computer vision tasks. They model relationships within an image using self-attention mechanisms, which makes them well suited for challenging environments with clutter or varying lighting conditions.
Simultaneous Localization and Mapping, which involves building a map of an environment while simultaneously estimating the vehicle’s position within that map, is a complex task that is relatively less developed in terms of the application of machine learning methods. Many studies apply ML only to certain components of SLAM solutions rather than providing a complete ML-based SLAM system.
As can also be seen in Table 15, environment perception and SLAM approaches using ML are more developed for UAVs and ground-based mobile robots. In contrast, solutions for UUVs and USVs are less mature, which may result from the demanding conditions of the marine environment. Reflections, waves, and changing lighting conditions degrade sensor data quality, while underwater environments further complicate perception due to light absorption and scattering.
Collision avoidance and path planning for all four types of autonomous vehicles are mainly based on RL or DRL approaches. For UAVs and mobile robots, these methods are tested in both simulations and real-world experiments, while for UUVs and USVs, validation is typically limited to simulations. Motion control, similarly to collision avoidance and path planning tasks, is commonly based on RL or DRL across all types of vehicles. This may result from the fact that the vehicle can learn a control policy directly from interactions with the environment, and an exact mathematical model of the vehicle’s dynamics is not required. This is an advantage compared with traditional methods such as PID or MPC.
Motion control methods for different vehicles are more often tested in both simulations and real-world experiments, but papers including real-world tests are less common for UAVs. Generally, as can be seen in Table 15, similar ML methods are applied across different operational domains; however, it also appears that the technology is more developed for UAVs and ground-based mobile robots compared with UUVs and USVs.
The survey indicates that the use of machine learning-based approaches in autonomous vehicles is growing rapidly. However, several challenges remain, leaving considerable scope for further improvement.

7. Conclusions

Autonomous vehicles are high-tech systems capable of functioning independently of human intervention by utilizing complex technologies such as sensors, artificial intelligence, and advanced control algorithms. This paper provides a comprehensive review of recent studies (2020–2026) on machine learning algorithms applied to autonomous platforms, including unmanned aerial vehicles, unmanned surface vehicles, unmanned underwater vehicles, and ground-based mobile robots.
The review addresses key tasks such as environment perception, simultaneous localization and mapping, collision avoidance, path planning, and motion control. Various machine learning approaches are examined, including supervised, semi-supervised, and unsupervised learning, as well as reinforcement learning and deep reinforcement learning.
The analyzed methods are evaluated in terms of performance, robustness, and suitability across different operational domains. Finally, the paper highlights major challenges and outlines promising future research directions aimed at improving the safety, autonomy, and reliability of autonomous vehicles.

Author Contributions

Conceptualization, A.L., M.R., M.Ł., K.K., A.S. and J.L.; methodology, A.L., M.R., M.Ł., K.K., A.S. and J.L.; formal analysis, A.S. and J.L.; resources, A.L., M.R., M.Ł., K.K., A.S. and J.L.; data curation, A.L., M.R., M.Ł. and K.K.; writing—original draft preparation, A.L., M.R., M.Ł. and K.K.; writing—review and editing, A.L., A.S. and J.L.; visualization, A.L., M.R., M.Ł. and K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the research project of the Faculty of Computer Science, Gdynia Maritime University, Poland, No. WI/2026/PZ/02: “Development of methods and algorithms for environmental perception, navigation and control of autonomous vehicles”.

Data Availability Statement

All data supporting the findings of this study are available within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AAV | Autonomous aerial vehicle
AC | Actor–critic
ADP | Adaptive dynamic programming
AGV | Autonomous ground vehicle
AM | Attention mechanisms
AMR | Autonomous mobile robot
ASGD-LARS | Adaptive stochastic gradient descent with least angle regression
ASGLDR | Adaptive stochastic gradient descent linear regression
AUV | Autonomous underwater vehicle
A2C | Advantage actor–critic
A3C | Asynchronous advantage actor–critic
CNN | Convolutional neural network
COLAV | Collision avoidance
COLREGs | International Regulations for Preventing Collisions at Sea
DDPG | Deep deterministic policy gradient
DDPG-DG | Deep deterministic policy gradient with differential gaming
DDQN | Double deep Q-network
DE | Differential evolution
DG | Differential gaming
DIHA* | Dubins improved hybrid A*
DL | Deep learning
DNN | Deep neural network
DRL | Deep reinforcement learning
DT | Decision tree
DWA | Dynamic window approach
DQN | Deep Q-network
EKF | Extended Kalman filter
ESOINN | Evolving self-organizing incremental neural network
FHA | Fuzzy heading avoidance
GA | Genetic algorithm
GAIL | Generative adversarial imitation learning
GA3C | GPU-based asynchronous advantage actor–critic
GBM | Gradient boosting machine
GNN | Graph neural network
GNSS | Global navigation satellite system
GPS | Global positioning system
HDBSCAN | Hierarchical density-based spatial clustering of applications with noise
ICM | Intrinsic curiosity module
IL | Imitation learning
IMU | Inertial measurement unit
INS | Inertial navigation system
IRL | Inverse reinforcement learning
ITCD | Individual tree crown delineation
ITDDPG | Improved transferring deep deterministic policy gradient
LiDAR | Light detection and ranging
LLM | Large language model
LMPC | Lyapunov-based MPC
LSTM | Long short-term memory
MAPPO | Multi-agent proximal policy optimization
MARL | Multi-agent reinforcement learning
MAV | Micro aerial vehicle
MDS | Multidimensional scaling
ML | Machine learning
MPC | Model predictive control
NAV | Nano aerial vehicle
NN | Neural network
OAM | Obstacle avoidance module
OSDPM | Observational space data processing method
PAV | Pico aerial vehicle
PF | Path following
PP | Path planning
PPO | Proximal policy optimization
PSO | Particle swarm optimization
QL | Q-learning
RF | Random forest
RHRL | Receding-horizon reinforcement learning
RL | Reinforcement learning
RNN | Recurrent neural network
ROI | Region of interest
ROV | Remotely operated vehicle
RRT | Rapidly exploring random trees
SAC | Soft actor–critic
SD | Smart dust
SGD | Stochastic gradient descent
SIDM | Surface image denoising module
SL | Supervised learning
SLAM | Simultaneous localization and mapping
SLIC | Simple linear iterative clustering
SMC | Sliding mode control
SOD | Small object detection
SSDM | Sea-sky detection module
STDM | Surface target detection module
SVM | Support vector machine
TD3 | Twin delayed deep deterministic policy gradient
TT | Trajectory tracking
UAV | Unmanned aerial vehicle
USV | Unmanned surface vehicle
UUV | Unmanned underwater vehicle
VGICP | Voxel generalized iterative closest point
VSLAM | Visual SLAM
WGAIL | Weighted generative adversarial imitation learning
WRRM | Water reflection removal module
6-DOF | Six-degree-of-freedom

References

  1. Kumar, M.; Rattan, N.; Mondal, S. Sensor systems for autonomous vehicles: Functionality and reliability challenges in adverse environmental conditions. Measurement 2026, 258, 119215. [Google Scholar] [CrossRef]
  2. Li, Z.; Jiang, C.; Gu, X.; Xu, Y.; Zhou, F.; Cui, J. Collaborative positioning for swarms: A brief survey of vision, LiDAR and wireless sensors based methods. Def. Technol. 2024, 33, 475–493. [Google Scholar] [CrossRef]
  3. Rahman, M.H.; Sejan, M.A.S.; Aziz, M.A.; Tabassum, R.; Baik, J.-I.; Song, H.-K. A Comprehensive Survey of Unmanned Aerial Vehicles Detection and Classification Using Machine Learning Approach: Challenges, Solutions, and Future Directions. Remote Sens. 2024, 16, 879. [Google Scholar] [CrossRef]
  4. Sharma, G.; Jain, S.; Sharma, R.S. Path Planning for Fully Autonomous UAVs—A Taxonomic Review and Future Perspectives. IEEE Access 2025, 13, 13356–13379. [Google Scholar] [CrossRef]
  5. Alqudsi, Y.; Makaraci, M. UAV swarms: Research, challenges, and future directions. J. Eng. Appl. Sci. 2025, 72, 12. [Google Scholar] [CrossRef]
  6. Qin, J.; Li, M.; Li, D.; Zhong, J.; Yang, K. A Survey on Visual Navigation and Positioning for Autonomous UUVs. Remote Sens. 2022, 14, 3794. [Google Scholar] [CrossRef]
  7. Tijjani, A.S.; Chemori, A.; Creuze, V. A survey on tracking control of unmanned underwater vehicles: Experiments-based approach. Annu. Rev. Control 2022, 54, 125–147. [Google Scholar] [CrossRef]
  8. Ding, S.; Zhang, T.; Lei, M.; Chai, H.; Jia, F. Robust visual-based localization and mapping for underwater vehicles: A survey. Ocean Eng. 2024, 312, 119274. [Google Scholar] [CrossRef]
  9. Almuzaini, T.S.; Savkin, A.V. Trajectory Planning for Autonomous Underwater Vehicles in Uneven Environments: A Survey of Coverage and Sensor Data Collection Methods. Future Internet 2026, 18, 79. [Google Scholar] [CrossRef]
  10. Damjanović, D.; Biočić, P.; Prakljačić, S.; Činčurak, D.; Balen, J. A comprehensive survey on SLAM and machine learning approaches for indoor autonomous navigation of mobile robots. Mach. Vis. Appl. 2025, 36, 55. [Google Scholar] [CrossRef]
  11. Waga, A.; Benhlima, S.; Bekri, A.; Abdouni, J.; Saber, F.Z. A survey on autonomous navigation for mobile robots: From traditional techniques to deep learning and large language models. J. King Saud Univ. Comput. Inf. Sci. 2025, 37, 198. [Google Scholar] [CrossRef]
  12. Hamidaoui, M.; Talhaoui, M.Z.; Li, M.; Midoun, M.A.; Haouassi, S.; Mekkaoui, D.E.; Smaili, A.; Cherraf, A.; Benyoub, F.Z. Survey of Autonomous Vehicles’ Collision Avoidance Algorithms. Sensors 2025, 25, 395. [Google Scholar] [CrossRef] [PubMed]
  13. Alexander, A.; Venkatesan, K.; Mounsef, J.; Ramanujam, K. A Comprehensive Survey of Path Planning Algorithms for Autonomous Systems and Mobile Robots: Traditional and Modern Approaches. IEEE Access 2025, 13, 176287–176326. [Google Scholar] [CrossRef]
  14. Jafarpourdavatgar, H.; Saeedinia, S.A.; Mohaghegh, M. Geometrical Optimal Navigation and Path Planning—Bridging Theory, Algorithms, and Applications. Sensors 2025, 25, 6874. [Google Scholar] [CrossRef] [PubMed]
  15. Sarhadi, P.; Naeem, W.; Athanasopoulos, N. A Survey of Recent Machine Learning Solutions for Ship Collision Avoidance and Mission Planning. IFAC-PapersOnLine 2022, 55, 257–268. [Google Scholar] [CrossRef]
  16. Bae, I.; Hong, J. Survey on the Developments of Unmanned Marine Vehicles: Intelligence and Cooperation. Sensors 2023, 23, 4643. [Google Scholar] [CrossRef] [PubMed]
  17. Xu, L.; Zhang, W. Survey on Path Planning Based on Deep Reinforcement Learning. In Proceedings of the 2025 2nd International Conference on Machine Learning and Intelligent Computing, Zhengzhou, China, 25–27 April 2025; pp. 685–695. [Google Scholar]
  18. Wibisono, A.; Piran, M.J.; Song, H.-K.; Lee, B.M. A Survey on Unmanned Underwater Vehicles: Challenges, Enabling Technologies, and Future Research Directions. Sensors 2023, 23, 7321. [Google Scholar] [CrossRef]
  19. Van Eck, N.J.; Waltman, L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010, 84, 523–538. [Google Scholar] [CrossRef]
  20. Autonomous Underwater Vehicle Bluefin-21. Available online: https://gdmissionsystems.com/products/underwater-vehicles/bluefin-21-autonomous-underwater-vehicle (accessed on 30 March 2026).
  21. Remotely Operated Vehicle Defender. Available online: https://www.unmannedsystemstechnology.com/company/videoray/mission-specialist-defender/ (accessed on 30 March 2026).
  22. Shaukat, N.; Moinuddin, M.; Otero, P. Underwater Vehicle Positioning by Correntropy-Based Fuzzy Multi-Sensor Fusion. Sensors 2021, 21, 6165. [Google Scholar] [CrossRef]
  23. Cao, X.; Ren, L.; Sun, C. Research on Obstacle Detection and Avoidance of Autonomous Underwater Vehicle Based on Forward-Looking Sonar. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 9198–9208. [Google Scholar] [CrossRef]
  24. Gao, S.; Guo, W.; Xu, G.; Liu, B. An Unsupervised Obstacle Segmentation Method for Forward-Looking Sonar Based on Teacher–Student Transfer Learning. J. Mar. Sci. Eng. 2025, 13, 2134. [Google Scholar] [CrossRef]
  25. Zhi, H.; Zhou, Z.; Wu, H.; Chen, Z.; Tian, S.; Zhang, Y.; Ruan, Y. Oscillatory Forward-Looking Sonar Based 3D Reconstruction Method for Autonomous Underwater Vehicle Obstacle Avoidance. J. Mar. Sci. Eng. 2025, 13, 943. [Google Scholar] [CrossRef]
  26. Gao, Z.; Ren, X.; Yu, J. Research on Method of Collision Avoidance Planning for UUV Based on Deep Reinforcement Learning. J. Mar. Sci. Eng. 2023, 11, 2245. [Google Scholar] [CrossRef]
  27. Li, X.; Yu, S.; Gao, X.-Z.; Yan, Y.; Zhao, Y. Path planning and obstacle avoidance control of UUV based on an enhanced A* algorithm and MPC in dynamic environment. Ocean Eng. 2024, 302, 117584. [Google Scholar] [CrossRef]
  28. Heshmat, M.; Saoud, L.S.; Abujabal, M.; Sultan, A.; Elmezain, M.; Seneviratne, L.; Hussain, I. Underwater SLAM Meets Deep Learning: Challenges, Multi-Sensor Integration, and Future Directions. Sensors 2025, 25, 3258. [Google Scholar] [CrossRef]
  29. Wang, H.; Gao, W.; Wang, Z.; Zhang, K.; Ren, J.; Deng, L.; He, S. Research on Obstacle Avoidance Planning for UUV Based on A3C Algorithm. J. Mar. Sci. Eng. 2023, 12, 63. [Google Scholar] [CrossRef]
  30. Yuan, J.; Han, M.; Wang, H.; Zhong, B.; Gao, W.; Yu, D. AUV Collision Avoidance Planning Method Based on Deep Deterministic Policy Gradient. J. Mar. Sci. Eng. 2023, 11, 2258. [Google Scholar] [CrossRef]
  31. Liu, S.; Ma, C.; Juan, R. AUV Obstacle Avoidance Framework Based on Event-Triggered Reinforcement Learning. Electronics 2024, 13, 2030. [Google Scholar] [CrossRef]
  32. Barik, P.; Parhi, D.R. AUV path planning and obstacle avoidances in marine environment based on enhanced ECUO technique. Expert Syst. Appl. 2025, 290, 128388. [Google Scholar] [CrossRef]
  33. Dai, N.; Qin, P.; Xu, X.; Zhang, Y.; Shen, Y.; He, B. An AUV collision avoidance algorithm in unknown environment with multiple constraints. Ocean Eng. 2024, 294, 116846. [Google Scholar] [CrossRef]
  34. Chen, T.; Zhang, Z.; Fang, Z.; Jiang, D.; Li, G. Imitation learning from imperfect demonstrations for AUV path tracking and obstacle avoidance. Ocean Eng. 2024, 298, 117287. [Google Scholar] [CrossRef]
  35. Huang, F.; Xu, J.; Wu, D.; Cui, Y.; Yan, Z.; Xing, W.; Zhang, X. A general motion controller based on deep reinforcement learning for an autonomous underwater vehicle with unknown disturbances. Eng. Appl. Artif. Intell. 2023, 117, 105589. [Google Scholar] [CrossRef]
  36. Xu, J.; Xie, G.; Liu, W.; Tang, J.; Yang, Z.; Xing, T.; Yang, Y.; Zhang, S.; Li, X. Ocean Diviner: A Diffusion-Augmented Reinforcement Learning Framework for AUV Robust Control in Underwater Tasks. arXiv 2025, arXiv:2507.11283. [Google Scholar]
  37. Cai, L.; Chang, K.; Girdhar, Y. Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles. arXiv 2025, arXiv:2410.00120. [Google Scholar]
  38. Xie, G.; Xu, J.; Tang, J.; Huang, Y.; Wang, Z.; Zhang, S.; Ma, D.; Qu, J.; Li, X. EasyUUV: An LLM-Enhanced Universal and Lightweight Sim-to-Real Reinforcement Learning Framework for UUV Attitude Control. arXiv 2026, arXiv:2510.22126. [Google Scholar]
  39. Zheng, H.; Liu, C. An overview of Unmanned Surface Vehicles: Methods, practices, and applications. Control Eng. Pract. 2025, 164, 106479. [Google Scholar] [CrossRef]
  40. Autonomous Surface Vehicle HydroDron. Available online: https://marinetechnology.pl/en/hydrodron-2/ (accessed on 19 February 2026).
  41. Wang, Y.; Peng, M.; Liu, Z.; Wan, W.; Di, K.; Hu, C.; Liu, L.; Lv, T.; Yang, C. Binocular visual environment perception technology for unmanned surface vehicle. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, XLIII-B2-2020, 1297–1302. [Google Scholar] [CrossRef]
  42. Zhang, W.; Jiang, F.; Yang, C.-F.; Wang, Z.-P.; Zhao, T.-J. Research on Unmanned Surface Vehicles Environment Perception Based on the Fusion of Vision and Lidar. IEEE Access 2021, 9, 63107–63121. [Google Scholar] [CrossRef]
  43. Xue, H.; Chen, X.; Zhang, R.; Wu, P.; Li, X.; Liu, Y. Deep Learning-Based Maritime Environment Segmentation for Unmanned Surface Vehicles Using Superpixel Algorithms. J. Mar. Sci. Eng. 2021, 9, 1329. [Google Scholar] [CrossRef]
  44. Cheng, C.; Liu, D.; Du, J.-H.; Li, Y.-Z. Research on Visual Perception for Coordinated Air–Sea through a Cooperative USV-UAV System. J. Mar. Sci. Eng. 2023, 11, 1978. [Google Scholar] [CrossRef]
  45. Li, T.; Zhang, X.; Huang, Y.; Yang, C. Lightweight CNN-Based Visual Perception Method for Assessing Local Environment Complexity of Unmanned Surface Vehicle. Sensors 2025, 25, 980. [Google Scholar] [CrossRef]
  46. Gu, Q.; Deng, B.; He, Y.; Zhang, Y.; Cheng, L.; Wang, Y. MarineSeg: A CNN–transformer hybrid architecture with feature voting decoder for robust semantic segmentation in USV-captured images. Neurocomputing 2026, 671, 132597. [Google Scholar] [CrossRef]
  47. Chen, X.; Lin, Y.; Yang, X.; Zhao, S. The tight-coupled SLAM system based on LiDAR and improved VGICP method for waterfront environments. Ocean Eng. 2025, 326, 120934. [Google Scholar] [CrossRef]
  48. Nikouei, M.; Baroutian, B.; Nabavi, S.; Taraghi, F.; Aghaei, A.; Sajedi, A.; Moghaddam, M.E. Small object detection: A comprehensive survey on challenges, techniques and real-world applications. Intell. Syst. Appl. 2025, 27, 200561. [Google Scholar] [CrossRef]
  49. Lu, H.; Zhang, Y.; Zhang, C.; Niu, Y.; Wang, Z.; Zhang, H. A multi-sensor fusion approach for maritime autonomous surface ships berthing navigation perception. Ocean Eng. 2025, 316, 119965. [Google Scholar] [CrossRef]
  50. Yao, R.; Wang, H.; Guo, Y.; Xie, Z. Robust real-time moving object detection on water surface: A LiDAR feature matching approach for maritime reliability enhancement. Ocean Eng. 2026, 346, 123860. [Google Scholar] [CrossRef]
  51. Trinh, L.; Mercelis, S.; Anwar, A. A comprehensive review of datasets and deep learning techniques for vision in unmanned surface vehicles. Ocean Eng. 2025, 334, 121501. [Google Scholar] [CrossRef]
  52. Zhang, Y.; Zhang, L.; Yu, Q.; Xing, B. Research on the Visual SLAM Algorithm for Unmanned Surface Vehicles in Nearshore Dynamic Scenarios. J. Mar. Sci. Eng. 2025, 13, 679. [Google Scholar] [CrossRef]
  53. Zheng, M.; Xie, S.; Chu, X.; Zhu, T.; Tian, G. Research on autonomous collision avoidance of merchant ship based on inverse reinforcement learning. Int. J. Adv. Robot. Syst. 2020, 17, 1–15. [Google Scholar] [CrossRef]
  54. Chun, D.H.; Roh, M.I.; Lee, H.W.; Ha, J.; Yu, D. Deep reinforcement learning-based collision avoidance for an autonomous ship. Ocean Eng. 2021, 234, 109216. [Google Scholar] [CrossRef]
  55. Wei, G.; Kuo, W. COLREGs-Compliant Multi-Ship Collision Avoidance Based on Multi-Agent Reinforcement Learning Technique. J. Mar. Sci. Eng. 2022, 10, 1431. [Google Scholar] [CrossRef]
  56. Zheng, K.; Zhang, X.; Wang, C.; Li, Y.; Cui, J.; Jiang, L. Adaptive collision avoidance decisions in autonomous ship encounter scenarios through rule-guided vision supervised learning. Ocean Eng. 2024, 297, 117096. [Google Scholar] [CrossRef]
  57. Xu, X.; Cao, Y.; Cai, P.; Zhang, W.; Chen, H. Research on real-time collision avoidance and path planning of USVs in multi-obstacle ships environment. Ocean Eng. 2024, 295, 116890. [Google Scholar] [CrossRef]
  58. Xie, W.; Gang, L.; Zhang, M.; Liu, T.; Lan, Z. Optimizing Multi-Vessel Collision Avoidance Decision Making for Autonomous Surface Vessels: A COLREGs-Compliant Deep Reinforcement Learning Approach. J. Mar. Sci. Eng. 2024, 12, 372. [Google Scholar] [CrossRef]
  59. Wang, C.; Wang, N.; Gao, H.; Wang, L.; Zhao, Y.; Fang, M. Knowledge transfer enabled reinforcement learning for efficient and safe autonomous ship collision avoidance. Int. J. Mach. Learn. Cybern. 2024, 15, 3715–3731. [Google Scholar] [CrossRef]
  60. Chun, D.H.; Roh, M.I.; Lee, H.W.; Yu, D. Method for collision avoidance based on deep reinforcement learning with path-speed control for an autonomous ship. Int. J. Nav. Archit. Ocean Eng. 2024, 16, 100579. [Google Scholar] [CrossRef]
  61. Song, W.; Chen, Z.; Sun, M.; Wang, Y.; Sun, Q. A COLREGs-based path-planning method for collision avoidance considering path cost through reinforcement learning. Ocean Eng. 2025, 325, 120746. [Google Scholar] [CrossRef]
  62. Jia, X.; Gao, S.; He, W. Meta-reinforcement learning-based collision avoidance for autonomous ship. Ocean Eng. 2025, 339, 122064. [Google Scholar] [CrossRef]
  63. Fan, Y.; Sun, Z.; Wang, G. Progressive deep reinforcement learning for intelligent collision avoidance in unmanned surface vehicles. Ocean Eng. 2025, 332, 121438. [Google Scholar] [CrossRef]
  64. Vedeler, A.; Warakagoda, N. Generative Adversarial Imitation Learning for Steering an Unmanned Surface Vehicle. In Proceedings of the Northern Lights Deep Learning Workshop; Septentrio Academic Publishing: Tromsø, Norway, 2020; Volume 1, p. 6. [Google Scholar]
  65. Martinsen, A.B.; Lekkas, A.M.; Gros, S.; Glomsrud, J.A.; Pedersen, T.A. A Reinforcement Learning-Based Tracking Control of USVs in Varying Operational Conditions. Front. Robot. AI 2020, 7, 32. [Google Scholar] [CrossRef]
  66. Peng, Z.; Liu, E.; Pan, C.; Wang, H.; Wang, D.; Liu, L. Model-based deep reinforcement learning for data-driven motion control of an under-actuated unmanned surface vehicle: Path following and trajectory tracking. J. Frankl. Inst. 2023, 360, 4399–4426. [Google Scholar] [CrossRef]
  67. Li, J.; Chavez-Galaviz, J.; Azizzadenesheli, K.; Mahmoudian, N. Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller. Sensors 2023, 23, 3572. [Google Scholar] [CrossRef]
  68. Wen, Y.; Chen, Y.; Guo, X. USV Trajectory Tracking Control Based on Receding Horizon Reinforcement Learning. Sensors 2024, 24, 2771. [Google Scholar] [CrossRef]
  69. Liu, J.; Xiao, C.; Yuan, H.; Li, C.; Li, Q. Unmanned surface vessels path following control using improved transferring reinforcement learning: Simulation and experiment. Ocean Eng. 2025, 320, 120346. [Google Scholar] [CrossRef]
  70. Lai, J.; Ren, Z.; Wu, Z.; Tan, Q.; Xie, S. Learning-based real-time optimal control of unmanned surface vessels in dynamic environment with obstacles. Ocean Eng. 2025, 335, 121505. [Google Scholar] [CrossRef]
  71. Er, M.J.; Ma, C.; Liu, T.; Gong, H. Intelligent motion control of unmanned surface vehicles: A critical review. Ocean Eng. 2023, 280, 114562. [Google Scholar] [CrossRef]
  72. Hassanalian, M.; Abdelkefi, A. Classifications, applications, and design challenges of drones: A review. Prog. Aerosp. Sci. 2017, 91, 99–131. [Google Scholar] [CrossRef]
  73. Mohsan, S.A.H.; Khan, M.A.; Noor, F.; Ullah, I.; Alsharif, M.H. Towards the Unmanned Aerial Vehicles (UAVs): A Comprehensive Review. Drones 2022, 6, 147. [Google Scholar] [CrossRef]
  74. Unmanned Aerial Vehicle EAGLENXT eBee X. Available online: https://eaglenxt.com/products/drones/ebee-series/ebee-x/ (accessed on 30 March 2026).
  75. Zhang, T.; Hu, X.; Xiao, J.; Zhang, G. A Machine Learning Method for Vision-Based Unmanned Aerial Vehicle Systems to Understand Unknown Environments. Sensors 2020, 20, 3245. [Google Scholar] [CrossRef] [PubMed]
  76. Wang, D.; Li, W.; Liu, X.; Li, N.; Zhang, C. UAV environmental perception and autonomous obstacle avoidance: A deep learning and depth camera combined solution. Comput. Electron. Agric. 2020, 175, 105523. [Google Scholar] [CrossRef]
  77. Ma, Y.; Zhao, Y.; Im, J.; Zhao, Y.; Zhen, Z. A deep-learning-based tree species classification for natural secondary forests using unmanned aerial vehicle hyperspectral images and LiDAR. Ecol. Indic. 2024, 159, 111608. [Google Scholar] [CrossRef]
  78. Neuville, R.; Bates, J.S.; Jonard, F. Estimating Forest Structure from UAV-Mounted LiDAR Point Cloud Using Machine Learning. Remote Sens. 2021, 13, 352. [Google Scholar] [CrossRef]
  79. Gonzalez-Perez, A.; Abd-Elrahman, A.; Wilkinson, B.; Johnson, D.J.; Carthy, R.R. Deep and Machine Learning Image Classification of Coastal Wetlands Using Unpiloted Aircraft System Multispectral Images and LiDAR Datasets. Remote Sens. 2022, 14, 3937. [Google Scholar] [CrossRef]
  80. Tsai, M.-D.; Tseng, K.-W.; Lai, C.-C.; Wei, C.-T.; Cheng, K.-F. Exploring Airborne LiDAR and Aerial Photographs Using Machine Learning for Land Cover Classification. Remote Sens. 2023, 15, 2280. [Google Scholar] [CrossRef]
  81. Dilmurat, K.; Sagan, V.; Moose, S. AI-Driven Maize Yield Forecasting Using Unmanned Aerial Vehicle-Based Hyperspectral and LiDAR Data Fusion. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2022, V-3-2022, 193–199. [Google Scholar] [CrossRef]
  82. Fu, C.; Xu, X.; Zhang, Y.; Lyu, Y.; Xia, Y.; Zhou, Z.; Wu, W. Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment. Neural Comput. Appl. 2022, 34, 14599–14607. [Google Scholar] [CrossRef]
  83. Liu, Y.; Xie, K.; Huang, H. VGF-Net: Visual-Geometric fusion learning for simultaneous drone navigation and height mapping. Graph. Model 2021, 116, 101108. [Google Scholar] [CrossRef]
  84. Li, D.; Yang, W.; Shi, X.; Guo, D.; Long, Q.; Qiao, F.; Wei, Q. A Visual-Inertial Localization Method for Unmanned Aerial Vehicle in Underground Tunnel Dynamic Environments. IEEE Access 2020, 8, 76809–76822. [Google Scholar] [CrossRef]
  85. Lendzioch, T.; Langhammer, J.; Vlček, L.; Minařík, R. Mapping the Groundwater Level and Soil Moisture of a Montane Peat Bog Using UAV Monitoring and Machine Learning. Remote Sens. 2021, 13, 907. [Google Scholar] [CrossRef]
  86. Tullu, A.; Endale, B.; Wondosen, A.; Hwang, H.-Y. Machine Learning Approach to Real-Time 3D Path Planning for Autonomous Navigation of Unmanned Aerial Vehicle. Appl. Sci. 2021, 11, 4706. [Google Scholar] [CrossRef]
  87. Grando, R.B.; de Jesus, J.; Kich, V.A.; Kolling, A.H.; Drews, P.L.J., Jr. Double Critic Deep Reinforcement Learning for Mapless 3D Navigation of Unmanned Aerial Vehicles. J. Intell. Robot. Syst. 2022, 104, 29. [Google Scholar] [CrossRef]
  88. Lee, H.Y.; Ho, H.W.; Zhou, Y. Deep Learning-based Monocular Obstacle Avoidance for Unmanned Aerial Vehicle Navigation in Tree Plantations. J. Intell. Robot. Syst. 2021, 101, 5. [Google Scholar] [CrossRef]
  89. Samma, H.; El-Ferik, S. Autonomous UAV Visual Navigation Using an Improved Deep Reinforcement Learning. IEEE Access 2024, 12, 79967–79977. [Google Scholar] [CrossRef]
  90. Kuo, P.H.; Chen, K.L.; Lin, Y.S.; Chiu, Y.C.; Peng, C.C. Deep reinforcement learning–based collision avoidance strategy for multiple unmanned aerial vehicles. Eng. Appl. Artif. Intell. 2025, 160, 111862. [Google Scholar] [CrossRef]
  91. Pedro, D.; Matos-Carvalho, J.P.; Fonseca, J.M.; Mora, A. Collision Avoidance on Unmanned Aerial Vehicles Using Neural Network Pipelines and Flow Clustering Techniques. Remote Sens. 2021, 13, 2643. [Google Scholar] [CrossRef]
  92. Azzam, R.; Chehadeh, M.; Hay, O.A.; Humais, M.A.; Boiko, I.; Zweiri, Y. Learning-Based Navigation and Collision Avoidance Through Reinforcement for UAVs. IEEE Trans. Aerosp. Electron. Syst. 2024, 60, 2614–2628. [Google Scholar] [CrossRef]
  93. Sonny, A.; Yeduri, S.R.; Cenkeramaddi, L.R. Q-learning-based unmanned aerial vehicle path planning with dynamic obstacle avoidance. Appl. Soft Comput. 2023, 147, 110773. [Google Scholar] [CrossRef]
  94. Yuksek, B.; Demirezen, M.U.; Inalhan, G.; Tsourdos, A. Cooperative Planning for an Unmanned Combat Aerial Vehicle Fleet Using Reinforcement Learning. J. Aerosp. Inf. Syst. 2021, 18, 739–750. [Google Scholar] [CrossRef]
  95. Xie, R.; Meng, Z.; Wang, L.; Li, H.; Wang, K.; Wu, Z. Unmanned Aerial Vehicle Path Planning Algorithm Based on Deep Reinforcement Learning in Large-Scale and Dynamic Environments. IEEE Access 2021, 9, 24884–24900. [Google Scholar] [CrossRef]
  96. Qu, C.; Gai, W.; Zhong, M.; Zhang, J. A novel reinforcement learning based grey wolf optimizer algorithm for unmanned aerial vehicles (UAVs) path planning. Appl. Soft Comput. 2020, 89, 106099. [Google Scholar] [CrossRef]
  97. Zhang, X.; Xia, S.; Li, X.; Zhang, T. Multi-objective particle swarm optimization with multi-mode collaboration based on reinforcement learning for path planning of unmanned air vehicles. Knowl.-Based Syst. 2022, 250, 109075. [Google Scholar] [CrossRef]
  98. Cardenas, J.A.; Carrero, U.E.; Camacho, E.C.; Calderon, J.M. Intelligent Position Controller for Unmanned Aerial Vehicles (UAV) Based on Supervised Deep Learning. Machines 2023, 11, 606. [Google Scholar] [CrossRef]
  99. Khan, F.S.; Mohd, M.N.H.; Zulkifli, S.A.; Abro, G.E.M.; Kazi, S.; Soomro, D.M. Deep Reinforcement Learning Based Unmanned Aerial Vehicle (UAV) Control Using 3D Hand Gestures. Comput. Mater. Contin. 2022, 72, 5741–5759. [Google Scholar] [CrossRef]
  100. Wan, K.; Gao, X.; Hu, Z.; Wu, G. Robust Motion Control for UAV in Dynamic Uncertain Environments Using Deep Reinforcement Learning. Remote Sens. 2020, 12, 640. [Google Scholar] [CrossRef]
  101. Jembre, Y.Z.; Nugroho, Y.W.; Khan, M.T.R.; Attique, M.; Paul, R.; Shah, S.H.A.; Kim, B. Evaluation of Reinforcement and Deep Learning Algorithms in Controlling Unmanned Aerial Vehicles. Appl. Sci. 2021, 11, 7240. [Google Scholar] [CrossRef]
  102. Jiang, Z.; Lynch, A.F. Quadrotor motion control using deep reinforcement learning. J. Unmanned Veh. Syst. 2021, 9, 234–251. [Google Scholar] [CrossRef]
  103. Zhao, Z.; Wan, Y.; Chen, Y. Deep Reinforcement Learning-Driven Collaborative Rounding-Up for Multiple Unmanned Aerial Vehicles in Obstacle Environments. Drones 2024, 8, 464. [Google Scholar] [CrossRef]
  104. Unmanned Ground Vehicle Clearpath Robotics Husky A300. Available online: https://clearpathrobotics.com/husky-a300-unmanned-ground-vehicle-robot/ (accessed on 30 March 2026).
  105. Ziegler, P.; Franke, J.; Reitelshöfer, S. Mobile Robot Environment Perception: Adaptable Real-Time Neural Network Architecture for RGB-D Ground Segmentation. In Proceedings of the 2025 European Conference on Mobile Robots (ECMR), Padova, Italy, 2–5 September 2025; pp. 1–7. [Google Scholar] [CrossRef]
  106. Kondratenko, Y.; Atamanyuk, I.; Sidenko, I.; Kondratenko, G.; Sichevskyi, S. Machine Learning Techniques for Increasing Efficiency of the Robot’s Sensor and Control Information Processing. Sensors 2022, 22, 1062. [Google Scholar] [CrossRef] [PubMed]
  107. Xu, P.; Ding, L.; Li, Z.; Yang, H.; Wang, Z.; Gao, H.; Zhou, R.; Su, Y.; Deng, Z.; Huang, Y. Learning physical characteristics like animals for legged robots. Natl. Sci. Rev. 2023, 10, nwad045. [Google Scholar] [CrossRef]
  108. Ciuffreda, I.; Casaccia, S.; Revel, G.M. A Multi-Sensor Fusion Approach Based on PIR and Ultrasonic Sensors Installed on a Robot to Localise People in Indoor Environments. Sensors 2023, 23, 6963. [Google Scholar] [CrossRef] [PubMed]
  109. Zain, L.H.; Ammar, H.H.; Shalaby, R.E. Imitation Learning for Obstacle Avoidance Using End-to-End CNN-Based Sensor Fusion. In 2025 International Telecommunications Conference (ITC-Egypt); IEEE: Piscataway, NJ, USA, 2025; pp. 546–552. [Google Scholar]
  110. Cui, J. Research on Machine Learning Models for Sensor Fusion of Intelligent Connected Vehicles. In 2025 5th Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS); IEEE: Piscataway, NJ, USA, 2025; pp. 627–631. [Google Scholar] [CrossRef]
  111. Barreto-Cubero, A.J.; Gómez-Espinosa, A.; Escobedo Cabello, J.A.; Cuan-Urquizo, E.; Cruz-Ramírez, S.R. Sensor Data Fusion for a Mobile Robot Using Neural Networks. Sensors 2022, 22, 305. [Google Scholar] [CrossRef]
  112. Zhu, K.; Zhang, T. Deep reinforcement learning based mobile robot navigation: A review. Tsinghua Sci. Technol. 2021, 26, 674–691. [Google Scholar] [CrossRef]
  113. Le, H.; Saeedi, S.; Hsu, C.C. A Comprehensive Review of Mobile Robot Navigation Using Deep Reinforcement Learning Algorithms in Crowded Environments. J. Intell. Robot. Syst. 2024, 110, 158. [Google Scholar] [CrossRef]
  114. Hoang, M.L. Unlocking robotic perception: Comparison of deep learning methods for simultaneous localization and mapping and visual simultaneous localization and mapping in robot. Int. J. Intell. Robot. Appl. 2025, 9, 1011–1043. [Google Scholar] [CrossRef]
  115. Tsai, C.Y.; Nisar, H.; Hu, Y.C. Mapless lidar navigation control of wheeled mobile robots based on deep imitation learning. IEEE Access 2021, 9, 117527–117541. [Google Scholar] [CrossRef]
  116. Nguyen, P.T.-T.; Yan, S.-W.; Liao, J.-F.; Kuo, C.-H. Autonomous Mobile Robot Navigation in Sparse LiDAR Feature Environments. Appl. Sci. 2021, 11, 5963. [Google Scholar] [CrossRef]
  117. Lee, M.-F.R.; Yusuf, S.H. Mobile Robot Navigation Using Deep Reinforcement Learning. Processes 2022, 10, 2748. [Google Scholar] [CrossRef]
  118. Yan, K.; Gao, J.; Li, Y. Deep Reinforcement Learning Based Mobile Robot Navigation Using Sensor Fusion. In Proceedings of the 42nd Chinese Control Conference (CCC), Tianjin, China, 24–26 July 2023; pp. 4125–4130. [Google Scholar] [CrossRef]
  119. Zhou, B.; He, Y.; Huang, W.; Yu, X.; Fang, F.; Li, X. Place recognition and navigation of outdoor mobile robots based on random Forest learning with a 3D LiDAR. J. Intell. Robot. Syst. 2022, 104, 72. [Google Scholar] [CrossRef]
  120. Sadeghi Esfahlani, S.; Sanaei, A.; Ghorabian, M.; Shirvani, H. The Deep Convolutional Neural Network Role in the Autonomous Navigation of Mobile Robots (SROBO). Remote Sens. 2022, 14, 3324. [Google Scholar] [CrossRef]
  121. Liu, X.; Wen, S.; Hu, Y.; Han, F.; Zhang, H.; Karimi, H.R. An active SLAM with multi-sensor fusion for snake robots based on deep reinforcement learning. Mechatronics 2024, 103, 103248. [Google Scholar] [CrossRef]
  122. Surmann, H.; Jestel, C.; Marchel, R.; Musberg, F.; Elhadj, H.; Ardani, M. Deep reinforcement learning for real autonomous mobile robot navigation in indoor environments. arXiv 2020, arXiv:2005.13857. [Google Scholar] [CrossRef]
  123. Wong, C.-C.; Feng, H.-M.; Kuo, K.-L. Multi-Sensor Fusion Simultaneous Localization Mapping Based on Deep Reinforcement Learning and Multi-Model Adaptive Estimation. Sensors 2023, 24, 48. [Google Scholar] [CrossRef]
  124. Xu, Z.; Song, Y.; Pang, B.; Xu, Q.; Yuan, X. Deep learning-based visual slam for indoor dynamic scenes. Appl. Intell. 2025, 55, 434. [Google Scholar] [CrossRef]
  125. Choi, J.; Lee, G.; Lee, C. Reinforcement learning-based dynamic obstacle avoidance and integration of path planning. Intell. Serv. Robot. 2021, 14, 663–677. [Google Scholar] [CrossRef]
  126. Feng, S.; Sebastian, B.; Ben-Tzvi, P. A collision avoidance method based on deep reinforcement learning. Robotics 2021, 10, 73. [Google Scholar] [CrossRef]
  127. Zhang, L.; Zhang, Y.; Li, Y. Path planning for indoor mobile robot based on deep learning. Optik 2020, 219, 165096. [Google Scholar] [CrossRef]
  128. Zhang, Q.; Ma, W.; Zheng, Q.; Zhai, X.; Zhang, W.; Zhang, T.; Wang, S. Path planning of mobile robot in dynamic obstacle avoidance environment based on deep reinforcement learning. IEEE Access 2024, 12, 189136–189152. [Google Scholar] [CrossRef]
  129. Ran, T.; Yuan, L.; Zhang, J.B. Scene perception based visual navigation of mobile robot in indoor environment. ISA Trans. 2021, 109, 389–400. [Google Scholar] [CrossRef] [PubMed]
  130. Das, S.; Mishra, S.K. A Machine Learning approach for collision avoidance and path planning of mobile robot under dense and cluttered environments. Comput. Electr. Eng. 2022, 103, 108376. [Google Scholar] [CrossRef]
  131. Wang, W.; Wu, Z.; Luo, H.; Zhang, B. Path planning method of mobile robot using improved deep reinforcement learning. J. Electr. Comput. Eng. 2022, 2022, 5433988. [Google Scholar] [CrossRef]
  132. Deshpande, S.V.; Harikrishnan, R.; Ibrahim, B.S.K.K.; Ponnuru, M.D.S. Mobile robot path planning using deep deterministic policy gradient with differential gaming (DDPG-DG) exploration. Cogn. Robot. 2024, 4, 156–173. [Google Scholar] [CrossRef]
  133. Xiao, H.; Chen, C.; Zhang, G.; Chen, C.P. Reinforcement learning-driven dynamic obstacle avoidance for mobile robot trajectory tracking. Knowl.-Based Syst. 2024, 297, 111974. [Google Scholar] [CrossRef]
  134. Thakur, A.; Das, S.; Mishra, S.K.; Swain, S.K. Adaptive stochastic gradient descent with least angle regression enhanced navigation: Intelligent path planning in cluttered environments for autonomous robots. Trans. Energy Syst. Eng. Appl. 2025, 6, 1–26. [Google Scholar] [CrossRef]
  135. Farias, G.; Garcia, G.; Montenegro, G.; Fabregas, E.; Dormido-Canto, S.; Dormido, S. Position control of a mobile robot using reinforcement learning. IFAC-PapersOnLine 2020, 53, 17393–17398. [Google Scholar] [CrossRef]
  136. Hassan, N.; Saleem, A. Neural network-based adaptive controller for trajectory tracking of wheeled mobile robots. IEEE Access 2022, 10, 13582–13597. [Google Scholar] [CrossRef]
  137. Ha, V.T.; Thuong, T.T. Neural-backstepping adaptive control for nonlinear motion of sliding mobile robots. Int. J. Mech. Eng. Robot. Res. 2025, 14, 323–339. [Google Scholar] [CrossRef]
  138. Yu, X.; Fan, Y.; Xu, S.; Ou, L. A self-adaptive SAC-PID control approach based on reinforcement learning for mobile robots. Int. J. Robust Nonlinear Control 2022, 32, 9625–9643. [Google Scholar] [CrossRef]
  139. Gao, X.; Gao, R.; Liang, P.; Zhang, Q.; Deng, R.; Zhu, W. A hybrid tracking control strategy for nonholonomic wheeled mobile robot incorporating deep reinforcement learning approach. IEEE Access 2021, 9, 15592–15602. [Google Scholar] [CrossRef]
  140. Quiroga, F.; Hermosilla, G.; Farias, G.; Fabregas, E.; Montenegro, G. Position Control of a Mobile Robot through Deep Reinforcement Learning. Appl. Sci. 2022, 12, 7194. [Google Scholar] [CrossRef]
  141. Cao, Y.; Ni, K.; Kawaguchi, T.; Hashimoto, S. Path Following for Autonomous Mobile Robots with Deep Reinforcement Learning. Sensors 2024, 24, 561. [Google Scholar] [CrossRef]
Figure 5. Recent ML-based collision avoidance and path planning methods for UUVs.
Figure 6. ML-based motion control methods for UUVs/AUVs.
Figure 7. Number of publications on ML algorithms for USVs (2020–2026).
Figure 8. Keyword co-occurrence clustering results in VOSviewer for “unmanned surface vehicles” and “machine learning”.
Figure 9. The HydroDron USV platform [40].
Figure 10. Recent ML-based environment perception methods for USVs.
Figure 11. Recent ML-based collision avoidance and path planning methods for USVs.
Figure 12. Recent ML-based motion control methods for USVs.
Figure 13. The EAGLENXT eBee X UAV [74].
Figure 14. Number of papers on machine learning applied to UAVs from different publishers’ databases (2020–2026).
Figure 15. Keyword co-occurrence clustering results in VOSviewer for “unmanned aerial vehicles” and “machine learning”.
Figure 16. Recent ML-based environment perception methods for UAVs.
Figure 17. Recent ML-based SLAM methods for UAVs.
Figure 18. Recent ML-based collision avoidance and path planning methods for UAVs.
Figure 19. Recent ML-based motion control methods for UAVs.
Figure 20. Number of papers on machine learning applied to mobile robots in different publishers’ databases (2020–2026).
Figure 21. Keyword co-occurrence clustering results in VOSviewer for “mobile robots” and “machine learning”.
Figure 22. The Clearpath Robotics Husky A300 UGV [104].
Figure 23. Recent ML-based environment perception methods for mobile robots.
Figure 24. Recent ML-based SLAM methods for mobile robots.
Figure 25. Recent ML-based collision avoidance and path planning methods for mobile robots.
Figure 26. Recent ML-based motion control methods for mobile robots.
Table 1. A comparative analysis of recent review papers on autonomous vehicles.

| Authors | Year | Type of Vehicles | Methods | Tasks |
|---|---|---|---|---|
| Sarhadi et al. [15] | 2022 | ships, USVs | ML | collision avoidance, mission planning |
| Qin et al. [6] | 2022 | UUVs | geometry-based, deep learning-based | visual navigation and positioning |
| Tijjani et al. [7] | 2022 | UUVs | PD/PID, SMC, adaptive, observation-based, MPC, combined | tracking control |
| Bae & Hong [16] | 2023 | USVs, UUVs | bio-inspired, graph-based, ML, hybrid with ML | sensors, obstacle avoidance, route planning, cooperation |
| Rahman et al. [3] | 2024 | UAVs | ML | detection and classification |
| Li et al. [2] | 2024 | UAVs, UGVs, USVs, UUVs | feature matching, point cloud registration, scan registration, ML, DE, graph theory, EKF, MDS | collaborative positioning for swarms based on camera, LiDAR, wireless sensor |
| Ding et al. [8] | 2024 | UUVs | filter-based and nonlinear optimization methods for multi-sensor fusion SLAM | visual-based localization and mapping |
| Alexander et al. [13] | 2025 | AAVs, AGVs, mobile robots | grid-based, graph-based, sampling-based, reactive, predictive, optimization-based, bio-inspired, ML | path planning |
| Damjanović et al. [10] | 2025 | mobile robots | traditional and ML | indoor autonomous navigation (SLAM, path planning and obstacle avoidance) |
| Waga et al. [11] | 2025 | mobile robots | graph-based, sampling-based, gradient-based, bionic, learning-based (DL, RL, and LLM) | autonomous navigation (obstacle avoidance and path planning) |
| Xu & Zhang [17] | 2025 | mobile robots, UAVs | DRL | path planning |
| Jafarpourdavatgar et al. [14] | 2025 | mobile robots, UAVs, robotic manipulators | greedy algorithms, DP, EA, sampling-based, hybrid with ML | path planning |
| Hamidaoui et al. [12] | 2025 | self-driving cars | sensor-based, path planning, decision-making, ML | collision avoidance, path planning |
| Sharma et al. [4] | 2025 | UAVs | sampling-based, graph-based, bio-inspired, heuristics, ML, hybrid | path planning |
| Alqudsi & Makaraci [5] | 2025 | UAV swarms | integration of AI and ML in UAV swarms | coordinated path planning, task assignment, formation control |
| Kumar et al. [1] | 2026 | AGVs, UAVs, ASVs, AUVs | sensor fusion: high-level fusion (HLF), low-level fusion (LLF), and mid-level fusion (MLF) | sensors: camera, radar, LiDAR, ultrasonic sensors, GNSS/GPS, INS/IMU, DR/DVL/acoustic sensors, odometry sensors |
| Almuzaini & Savkin [9] | 2026 | AUVs | planar coverage strategies, terrain-aware, occlusion-aware, multi-AUV, online planning, energy-driven, channel-aware, information-based | trajectory planning |
Table 2. Recent collision avoidance and path planning methods for UUVs/AUVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| DRL (PPO–DWA) | Gao et al. [26] | 2023 | COLAV | Sim. |
| DRL (A3C) | Wang et al. [29] | 2023 | Obst. Av. | Sim. |
| DRL (DDPG) | Yuan et al. [30] | 2023 | COLAV | Sim. |
| Enhanced ML + MPC | Li et al. [27] | 2024 | Dyn. PP + Obst. Av. | Sim. |
| DIHA* + FHA | Dai et al. [33] | 2024 | COLAV | Sim. |
| Soft Actor–Critic | Liu et al. [31] | 2024 | PP + Obst. Av. | Sim. |
| Imitation learning (WGAIL) | Chen et al. [34] | 2024 | Path tracking + Obst. Av. | Sim. |
| Enhanced ECUO | Barik and Parhi [32] | 2025 | PP + Obst. Av. | Sim. |
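Most of the DRL planners in Table 2 share the same learning core: a neural value (or policy) network trained on simulated transitions against a temporal-difference target. The sketch below shows that core as a minimal DQN-style update in PyTorch. It is an illustration only, not the architecture of any cited paper: the eight-ray range state, the three discrete heading actions (port, keep course, starboard), and the layer sizes are assumptions made for the example.

```python
import torch
import torch.nn as nn

STATE_DIM = 8    # e.g., normalized sonar ranges (assumption)
N_ACTIONS = 3    # port turn, keep course, starboard turn (assumption)
GAMMA = 0.99

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def td_update(s, a, r, s2, done):
    """One temporal-difference step on a replay batch of transitions."""
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)        # Q(s, a)
    with torch.no_grad():
        q_next = target_net(s2).max(dim=1).values            # max_a' Q_target(s', a')
        target = r + GAMMA * q_next * (1.0 - done)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch of 32 random transitions, just to show the expected shapes.
td_update(torch.randn(32, STATE_DIM),
          torch.randint(0, N_ACTIONS, (32,)),
          torch.randn(32),
          torch.randn(32, STATE_DIM),
          torch.zeros(32))
```

Actor–critic variants such as DDPG [30] and Soft Actor–Critic [31] replace the discrete argmax with a learned continuous-action policy but keep the same target-network bootstrapping.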
Table 3. Recent ML-based motion control methods for UUVs/AUVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| DRL (MAPPO) + ESO | Huang et al. [35] | 2023 | General motion control under disturbances | Sim. |
| Imitation learning (WGAIL) | Chen et al. [34] | 2024 | Path tracking + obstacle avoidance | Sim. |
| Diffusion-augmented RL | Xu et al. [36] | 2025 | Robust control in underwater tasks | Sim. |
| DRL (6-DOF direct thruster control) | Cai et al. [37] | 2025 | Full 6-DOF motion control | Sim./Exp. |
| Sim2Real RL + adaptive controller (+LLM tuning) | Xie et al. [38] | 2026 | Attitude control | Sim./Exp. |
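Huang et al. [35] combine a DRL policy with an extended state observer (ESO) that estimates the lumped disturbance acting on the vehicle. Below is a minimal discrete-time linear ESO for a first-order velocity model v' = f + b·u with unknown disturbance f; the gains, time step, and toy plant are illustrative assumptions, not values from [35].

```python
def eso_step(v_hat, f_hat, u, v_meas, dt=0.01, b=1.0, l1=20.0, l2=100.0):
    """One Euler step of a linear ESO for v' = f + b*u with unknown f."""
    e = v_meas - v_hat                        # innovation
    v_hat = v_hat + dt * (f_hat + b * u + l1 * e)   # velocity estimate
    f_hat = f_hat + dt * (l2 * e)             # disturbance estimate integrates the error
    return v_hat, f_hat

# Toy plant: zero control input, constant unknown disturbance f = 0.5.
v_true, v_hat, f_hat, dt = 0.0, 0.0, 0.0, 0.01
for _ in range(2000):
    v_true += dt * 0.5
    v_hat, f_hat = eso_step(v_hat, f_hat, u=0.0, v_meas=v_true, dt=dt)
print(round(f_hat, 2))                        # ~0.5: the disturbance is recovered
```

The estimated f_hat can then be fed to the policy as an extra observation or subtracted from the commanded thrust, which is the usual motivation for pairing an observer with RL.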
Table 4. Recent ML-based environment perception methods for USVs.

| Method | Authors | Year | Sensors | Sim./Exp. |
|---|---|---|---|---|
| CNN | Wang et al. [41] | 2020 | Camera | Exp. |
| SVM, CNN | Zhang et al. [42] | 2021 | Camera & LiDAR | Sim./Exp. |
| CNN | Xue et al. [43] | 2021 | Camera | Sim. |
| CNN | Cheng et al. [44] | 2023 | Camera | Exp. |
| CNN | Li et al. [45] | 2025 | Images | Sim. |
| CNN-Transformer | Gu et al. [46] | 2026 | Images | Sim. |
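Since almost every method in Table 4 is CNN-based, the following PyTorch sketch of a small image classifier may help fix ideas. The layer sizes and the two-class head (e.g., obstacle vs. open water) are illustrative assumptions, not the architecture of any cited work.

```python
import torch
import torch.nn as nn

class TinyMaritimeCNN(nn.Module):
    """Minimal convolutional classifier for camera frames (illustrative only)."""
    def __init__(self, n_classes: int = 2):   # hypothetical {obstacle, open water}
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

model = TinyMaritimeCNN()
logits = model(torch.randn(4, 3, 128, 128))   # batch of 4 RGB frames
assert logits.shape == (4, 2)
```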
Table 5. Recent ML-based collision avoidance and path planning methods for USVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| IRL | Zheng et al. [53] | 2020 | COLAV | Sim. |
| DRL (PPO) | Chun et al. [54] | 2021 | COLAV | Sim. |
| MARL | Wei & Kuo [55] | 2022 | COLAV | Sim. |
| RGVSL | Zheng et al. [56] | 2024 | COLAV/PP | Sim. |
| DRL (DDPG) | Xu et al. [57] | 2024 | COLAV/PP | Sim. |
| DRL (PPO) | Xie et al. [58] | 2024 | COLAV | Sim. |
| KT RL | Wang et al. [59] | 2024 | COLAV | Sim. |
| DRL (MLP-CNN) | Chun et al. [60] | 2024 | COLAV | Sim. |
| RL | Song et al. [61] | 2025 | COLAV/PP | Sim. |
| MRL | Jia et al. [62] | 2025 | COLAV/PP | Sim. |
| DRL (PNN) | Fan et al. [63] | 2025 | COLAV | Sim. |
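A recurring design decision behind the RL methods in Table 5 is reward shaping: rewarding progress toward the goal while penalizing proximity to obstacles and COLREGs-violating maneuvers. The function below is a hypothetical example of such shaping; every weight and threshold is an assumption made for illustration, as each surveyed paper defines its own terms.

```python
def colav_reward(dist_to_goal, prev_dist_to_goal,
                 dist_to_nearest_obstacle, crossed_ahead_of_giveway_target):
    """Hypothetical shaped reward for USV collision avoidance (illustrative)."""
    reward = 1.0 * (prev_dist_to_goal - dist_to_goal)    # progress toward goal
    if dist_to_nearest_obstacle < 50.0:                  # safety zone in meters
        reward -= (50.0 - dist_to_nearest_obstacle) * 0.1
    if crossed_ahead_of_giveway_target:                  # COLREGs-style penalty
        reward -= 10.0
    if dist_to_goal < 5.0:                               # terminal bonus
        reward += 100.0
    return reward

# Small progress step but inside the safety zone: net reward is negative.
print(colav_reward(120.0, 121.0, 30.0, False))           # -1.0
```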
Table 6. Recent ML-based motion control methods for USVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| GAIL | Vedeler & Warakagoda [64] | 2020 | HC | Sim. |
| RL | Martinsen et al. [65] | 2020 | TT | Sim. & Exp. |
| DRL | Peng et al. [66] | 2023 | PF & TT | Sim. |
| DQN+NN-MPC | Li et al. [67] | 2023 | PF | Sim. & Exp. |
| RHRL | Wen et al. [68] | 2024 | TT | Sim. |
| ITDDPG | Liu et al. [69] | 2025 | PF | Sim. & Exp. |
| DNN | Lai et al. [70] | 2025 | PP+TT | Sim. |
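For the path-following (PF) entries in Table 6, the learned controllers typically consume a cross-track error and a desired heading. The sketch below computes both using the standard line-of-sight (LOS) guidance law; the lookahead distance is an illustrative assumption, and individual cited works may use other guidance schemes.

```python
import math

def los_heading(wp_a, wp_b, pos, lookahead=20.0):
    """Return (cross-track error, desired heading) for the segment wp_a -> wp_b."""
    path_angle = math.atan2(wp_b[1] - wp_a[1], wp_b[0] - wp_a[0])
    dx, dy = pos[0] - wp_a[0], pos[1] - wp_a[1]
    # Rotate the position error into the path frame; the lateral part is cross-track.
    e_ct = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    psi_d = path_angle + math.atan2(-e_ct, lookahead)    # steer back toward the path
    return e_ct, psi_d

e, psi = los_heading((0.0, 0.0), (100.0, 0.0), (10.0, 5.0))
print(round(e, 2), round(math.degrees(psi), 1))   # 5.0 m off track, ~ -14 deg heading
```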
Table 7. Recent ML-based environment perception methods for UAVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| DL CNN | Zhang et al. [75] | 2020 | Object detection | Sim. & Exp. |
| DL CNN | Dashuai et al. [76] | 2020 | Object detection | Sim. & Exp. |
| UL HDBSCAN | Neuville et al. [78] | 2021 | Data segmentation | Exp. |
| DL CNN | Gonzalez-Perez et al. [79] | 2022 | Image classification | Exp. |
| SL GBM | Dilmurat et al. [81] | 2022 | Data classification | Sim. & Exp. |
| SL SVM | Tsai et al. [80] | 2023 | Data classification | Exp. |
| SL ITCD | Ye et al. [77] | 2024 | Data classification | Exp. |
Table 8. Recent ML-based SLAM methods for UAVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| DL CNN | Li et al. [84] | 2020 | Localization | Sim. & Exp. |
| DL CNN | Liu et al. [83] | 2021 | Localization + Mapping | Sim. |
| SL RF | Lendzioch et al. [85] | 2021 | Mapping | Exp. |
| DL CNN | Tullu et al. [86] | 2021 | Localization | Sim. & Exp. |
| DL CNN | Lee et al. [88] | 2021 | Localization | Sim. & Exp. |
| DRL DQN | Fu et al. [82] | 2022 | Localization | Sim. |
| RL TD3+SAC | Grando et al. [87] | 2022 | Localization | Sim. & Exp. |
| DRL DQN | Samma et al. [89] | 2024 | Localization | Sim. |
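For context on what the learned components in Table 8 augment or replace: classical localization is built around a Kalman-filter predict/update cycle. A minimal 1-D version is shown below; the constant-velocity motion model and the noise variances q and r are illustrative assumptions.

```python
def kalman_step(x, p, u, z, q=0.01, r=0.5):
    """x: position estimate, p: its variance, u: commanded displacement,
    z: noisy position measurement, q/r: process/measurement noise variances."""
    x_pred = x + u                       # predict with the motion model
    p_pred = p + q
    k = p_pred / (p_pred + r)            # Kalman gain
    x_new = x_pred + k * (z - x_pred)    # correct with the measurement
    p_new = (1.0 - k) * p_pred
    return x_new, p_new

x, p = 0.0, 1.0
for z in (0.9, 2.1, 2.9):                # noisy readings of true positions 1, 2, 3
    x, p = kalman_step(x, p, u=1.0, z=z)
print(round(x, 2), round(p, 3))           # estimate near 3.0 with shrinking variance
```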
Table 9. Recent ML-based collision avoidance and path planning methods for UAVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| RL Q-learning | Qu et al. [96] | 2020 | PP | Sim. |
| DL CNN | Pedro et al. [91] | 2021 | COLAV | Sim. & Exp. |
| RL PPO | Yuksek et al. [94] | 2021 | PP | Sim. |
| DRL DQN | Xie et al. [95] | 2021 | PP | Sim. |
| RL Q-learning | Zhang et al. [97] | 2022 | PP | Sim. |
| RL Q-learning | Sonny et al. [93] | 2023 | PP | Sim. & Exp. |
| DRL DDPG | Azzam et al. [92] | 2024 | COLAV | Sim. & Exp. |
| RL PPO+A2C+SAC | Kuo et al. [90] | 2025 | COLAV | Sim. |
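Several of the planners in Table 9 [93,96,97] use tabular Q-learning. The snippet below demonstrates the underlying update rule on a toy one-dimensional corridor; the environment, reward values, and hyperparameters are made-up examples, not a benchmark from the cited papers.

```python
import random

N_CELLS, GOAL = 6, 5
ACTIONS = (-1, +1)                       # move left / move right
Q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(500):                     # episodes
    s = 0
    while s != GOAL:
        if random.random() < eps:
            a = random.choice(ACTIONS)                     # explore
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])  # exploit
        s2 = min(max(s + a, 0), N_CELLS - 1)
        r = 1.0 if s2 == GOAL else -0.01                   # step cost favors short paths
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_CELLS - 1)]
print(policy)   # converges to [1, 1, 1, 1, 1]: always move toward the goal
```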
Table 10. Recent ML-based motion control methods for UAVs.

| Method | Authors | Year | Task | Sim./Exp. |
|---|---|---|---|---|
| DRL DDPG | Wan et al. [100] | 2020 | Motion control | Sim. |
| DRL DQN | Jembre et al. [101] | 2021 | Motion control | Sim. |
| DRL PPO | Jiang et al. [102] | 2021 | Motion control | Sim. |
| DRL DDPG | Khan et al. [99] | 2022 | Motion control | Sim. & Exp. |
| SL DNN | Cardenas et al. [98] | 2023 | Position control | Sim. |
| DRL SAC | Zhao et al. [103] | 2024 | Multi-motion control | Sim. |
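The off-policy methods in Table 10 (DDPG, SAC) stabilize their bootstrapped targets with slowly tracking target networks. The Polyak (soft) update they share is sketched below; the value of tau is an illustrative assumption.

```python
import torch
import torch.nn as nn

def soft_update(target: nn.Module, online: nn.Module, tau: float = 0.005):
    """target <- tau * online + (1 - tau) * target, parameter-wise (Polyak averaging)."""
    with torch.no_grad():
        for t_param, o_param in zip(target.parameters(), online.parameters()):
            t_param.mul_(1.0 - tau).add_(tau * o_param)

# Toy usage: keep a target copy drifting slowly toward the online network.
online = nn.Linear(4, 2)
target = nn.Linear(4, 2)
target.load_state_dict(online.state_dict())
soft_update(target, online)
```

A small tau keeps the target almost frozen between updates, which damps the feedback loop between the critic and its own targets.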
Table 15. Summary of machine learning applications for various tasks in different autonomous vehicles.

| Task/Vehicles | UUVs | USVs | UAVs | Mobile Robots |
|---|---|---|---|---|
| environment perception | various CNN algorithms | CNNs, sim./exp. | various: DL (CNNs), SL (SVM, GBM), UL (DBSCAN) | neural networks, sim./exp. |
| SLAM | ML with Kalman filter | DL-based visual SLAM | various ML methods, sim./exp. | DRL, sim./exp. |
| COLAV, PP | DRL, sim. | RL/DRL, sim. | RL/DRL, sim./exp. | RL/DRL, sim./exp. |
| motion control | RL/DRL, sim./exp. | RL/DRL/hybrid, sim./exp. | DRL, sim. | RL/DRL, sim./exp. |