Review

Autonomous Mobile Robot Path Planning Techniques—A Review: Metaheuristic and Cognitive Techniques

by Mubarak Badamasi Aremu 1,2, Gamil Ahmed 2, Sami Elferik 1,2,* and Abdul-Wahid A. Saif 1,2
1 Control and Instrumentation Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
2 Interdisciplinary Research Center (IRC) for Smart Mobility and Logistics, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
* Author to whom correspondence should be addressed.
Robotics 2026, 15(1), 23; https://doi.org/10.3390/robotics15010023
Submission received: 23 November 2025 / Revised: 9 January 2026 / Accepted: 9 January 2026 / Published: 14 January 2026
(This article belongs to the Section Sensors and Control in Robotics)

Abstract

Autonomous mobile robots (AMRs) require robust, efficient path planning to operate safely in complex, often dynamic environments (e.g., logistics, transportation, and healthcare). This systematic review focuses on advanced metaheuristic and learning- and reasoning-based (cognitive) techniques for AMR path planning. Drawing on approximately 230 articles published between 2018 and 2025, we organize the literature into two prominent families, metaheuristic optimization and AI-based navigation, and introduce and apply a unified taxonomy (planning scope, output type, and constraint awareness) to guide the comparative analysis and practitioner-oriented synthesis. We synthesize representative approaches, including swarm- and evolutionary-based planners (e.g., PSO, GA, ACO, GWO), fuzzy and neuro-fuzzy systems, neural methods, and RL/DRL-based navigation, highlighting their operating principles, recent enhancements, strengths and limitations, and typical deployment roles within hierarchical navigation stacks. Comparative tables and a compact trade-off synthesis summarize capabilities across static/dynamic settings, real-world validation, and hybridization trends. Persistent gaps remain in parameter tuning, safety and interpretability of learning-enabled navigation, sim-to-real transfer, and scalability under real-time compute limits, and physical experimentation remains limited. Finally, we outline research opportunities and open research questions, covering benchmarking and reproducibility, resource-aware planning, multi-robot coordination, 3D navigation, and emerging foundation models (LLMs/VLMs) for high-level semantic navigation. Collectively, this review provides a consolidated reference and practical guidance for future AMR path-planning research.

1. Introduction

Autonomous Mobile Robots (AMRs) are increasingly deployed across sectors such as transportation, logistics, manufacturing, agriculture, and healthcare, where safe and efficient navigation is crucial to mission success. A core requirement in these applications is path planning, the ability to determine a collision-free, efficient route from an initial position to a target destination while considering environmental constraints, kinematic limitations, and operational objectives. Path planning becomes particularly challenging in dynamic and uncertain environments, where obstacles may move unpredictably, environmental maps may be incomplete, and mission objectives can evolve in real time [1].
Over the past two decades, metaheuristic and artificial intelligence (AI)-based approaches have emerged as powerful alternatives to classical and heuristic methods for AMR path planning. Metaheuristic algorithms, such as Particle Swarm Optimization (PSO), Genetic Algorithms (GA), Ant Colony Optimization (ACO), Firefly Algorithm (FA), and Grey Wolf Optimization (GWO), provide versatile optimization frameworks capable of exploring large and complex search spaces [2,3]. These methods are often robust to problem nonlinearity, adaptable to multiple objective criteria, and applicable in both static and dynamic environments. In parallel, reasoning- and learning-based techniques, including Fuzzy Logic (FL), Artificial Neural Networks (ANN), Neuro-Fuzzy (NF) systems, Reinforcement Learning (RL), and Deep Reinforcement Learning (DRL), leverage computational intelligence to enable adaptive decision-making, perception-driven navigation, and autonomous policy learning in challenging scenarios.
These techniques offer significant advantages, including adaptability to 3D navigation, practical obstacle avoidance, and robustness in dynamic conditions. However, despite their potential, these approaches face inherent challenges, including sensitivity to parameter tuning, high computational demands for real-time deployment, susceptibility to premature convergence, and concerns regarding interpretability. Addressing these limitations requires hybridization strategies, adaptive parameter control, and robust learning frameworks that balance global exploration with local exploitation while ensuring safety and efficiency in real-world operation [4].
In our previous work [5] (Part I of this two-part review series), we presented a comprehensive analysis of commonly used classical and heuristic-based approaches to AMR path planning. Building on that foundation, this study reviews metaheuristic and AI-based (cognitive) approaches, expands the scope by highlighting the advantages and limitations of these methods, and provides a tabular comparative analysis of the algorithms across different environmental scenarios. Specifically, we:
  • Provide a structured taxonomy of metaheuristic and cognitive (AI-based) algorithms for AMR path planning, highlighting their principles, strengths, and limitations.
  • Summarize recent advancements, hybridization strategies, and application domains, with emphasis on emerging trends that integrate global optimization and learning-based adaptability.
  • Present a comprehensive comparative analysis of algorithmic capabilities across static and dynamic scenarios, including 2D vs. 3D navigation, multi-robot coordination, and available real-world experimental validations.
  • Identify persistent research challenges, such as parameter sensitivity, computational cost, sim-to-real transfer, and interpretability, while outlining promising future directions in multi-robot collaboration, real-time adaptability, and scalable 3D navigation.
  • Introduce a dedicated discussion of open research questions in AMR path planning, providing a roadmap for addressing safety guarantees, explainability, benchmarking, and lifelong adaptation in complex real-world environments.
The remainder of this paper is organized as follows: Section 2 describes the review methodology. Section 3 provides a detailed review of metaheuristic and AI-based AMR path-planning techniques. Section 4 and Section 5 present comparative discussions, research trends, and identified opportunities. The open questions in AMR path planning are discussed in Section 6. Section 7 summarizes the key findings and concludes the paper.

2. Methodology

This paper presents a systematic review of existing approaches to AMR path planning. The overall workflow adopted in this study is illustrated in Figure 1. After defining the motivations and objectives of the review, an overview of key concepts is provided to familiarize readers with the topic, followed by a survey of algorithms. Based on insights from the selected papers, we analyze the literature to highlight significant findings, discuss the limitations of existing work, and identify opportunities for future research. Finally, the paper concludes by summarizing the findings and outlining potential research directions.
The search for recent works in AMR path planning was performed using Scopus, Google Scholar, and Web of Science. Several trial queries were tested with different string combinations, converging to the following research equation:
(“AMR” OR “Autonomous mobile robot”) AND (“Path planning” OR “Navigation” OR “Route planning”) AND (“Techniques” OR “Approaches” OR “Algorithms”)
The review intentionally focuses on publications from 2018 to 2025. The rationale for this time frame is to capture the most recent contributions and promising research directions that align with emerging trends and future feasibility. Earlier works, many of which have been thoroughly covered in previous surveys, are acknowledged; however, our goal is to avoid redundancy and instead emphasize advances that reflect the current state of the art and the challenges that remain to be addressed.
The selection process of papers is illustrated in Figure 2. Beyond the time frame, each article was required to explicitly address several core research questions to ensure relevance and rigor. These included the type of algorithm employed, the nature of the robot or agent under study, and the characteristics of the environment, static or dynamic, structured or unstructured. Additional criteria included the domain of operation, distinguishing between 2D and 3D settings, and the time domain, assessing whether the algorithm was applied in online or offline contexts. Finally, the inclusion of comparative results or benchmarking against existing methods was deemed essential for evaluating performance. Collectively, these criteria ensured that only relevant, recent, sufficiently detailed, and methodologically sound studies were included, thereby enabling a meaningful comparative evaluation of metaheuristic approaches and AI-based reasoning/learning techniques for AMR path planning.

3. Autonomous Mobile Robots Path Planning Techniques

As shown in Figure 3, AMR path planning algorithms can be broadly categorized into four groups: conventional methods, heuristic methods, metaheuristic methods, and machine learning or AI-based techniques [5,6]. This classification is based on the underlying principles and computational strategies used for path generation and optimization. This section focuses exclusively on the two advanced categories, metaheuristics and learning- and reasoning-based (AI) techniques, which have shown considerable promise in addressing the limitations of earlier methods.
Metaheuristic algorithms, inspired by natural processes or stochastic search strategies, are particularly suited to high-dimensional, nonlinear, and multi-objective optimization problems. They offer flexible frameworks that can adapt to varying environmental complexities, making them valuable for AMR navigation in both static and dynamic settings [7,8,9]. AI-based approaches, encompassing rule-based reasoning, supervised and unsupervised learning, and reinforcement learning paradigms, introduce adaptivity and decision-making capabilities that are essential for robust operation in uncertain and evolving environments [10]. In the subsections that follow, we review representative algorithms from each of these two categories. For each method, we describe its core principles, key mathematical formulation or procedural flow, recent advancements, notable applications, and identified strengths and limitations. Special attention is given to hybridization strategies, parameter-tuning innovations, and performance validation in real-world or realistic simulation environments. This structured evaluation provides the basis for the comparative analysis and research opportunities discussed in Section 4.

3.1. Terminology and Taxonomy (Path Planning vs. Motion Planning vs. Navigation)

To avoid ambiguity, we distinguish the commonly conflated notions of path planning, motion planning, and navigation in AMRs. Path planning refers to generating a collision-free geometric route in the configuration/workspace from start to goal, often represented as waypoints or a piecewise-linear curve. Motion planning extends this by producing a time-parameterized trajectory that satisfies robot feasibility constraints (e.g., nonholonomic kinematics, bounds on velocity/acceleration, and, when modeled, dynamics). Navigation denotes the closed-loop autonomy stack that integrates perception and localization with global planning, local planning/obstacle avoidance, and trajectory tracking/control. Exploration/coverage differs from planning in that the objective is information gain (map building/coverage) rather than reaching a known goal along an optimized route [11,12,13].
Accordingly, the methods surveyed in this paper are classified along the following widely used criteria:
  • Planning scope: global (map-based route generation), local (reactive obstacle avoidance in the vicinity), or exploration/coverage (information-driven navigation in unknown environments).
  • Output type: waypoints/path, trajectory, or direct control commands (e.g., (v, ω)).
  • Constraint level: geometric-only vs. feasibility-aware planning that respects mobile robot motion constraints, including nonholonomic kinematics (e.g., unicycle/differential-drive, Ackermann steering) and, when available, dynamic limits.
  • Environmental assumptions: static vs. dynamic obstacles/targets, known vs. unknown maps, and 2D vs. 3D operation.
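For concreteness, the classification axes above can be expressed as a small data structure; the type names and the example entry below are illustrative, not part of any surveyed implementation.

```python
from dataclasses import dataclass
from enum import Enum

class Scope(Enum):
    GLOBAL = "global"            # map-based route generation
    LOCAL = "local"              # reactive obstacle avoidance
    EXPLORATION = "exploration"  # information-driven coverage

class Output(Enum):
    WAYPOINTS = "waypoints"      # geometric path
    TRAJECTORY = "trajectory"    # time-parameterized trajectory
    CONTROL = "control"          # direct commands, e.g., (v, omega)

@dataclass
class PlannerProfile:
    """Illustrative record classifying a planner along the survey's axes."""
    name: str
    scope: Scope
    output: Output
    feasibility_aware: bool      # respects kinematic/dynamic constraints?
    dynamic_obstacles: bool      # handles moving obstacles/targets?
    three_d: bool                # 3D operation?

# Example: an RL/DRL collision-avoidance module deployed as a local planner
drl_local = PlannerProfile("DRL collision avoidance", Scope.LOCAL,
                           Output.CONTROL, feasibility_aware=False,
                           dynamic_obstacles=True, three_d=False)
```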
In the AI-based family, it is essential to note that several RL/DRL approaches are most commonly deployed as local modules for collision avoidance or exploration and are frequently combined with classical/global planners in hierarchical navigation pipelines rather than serving as standalone global path planners. In the following, we review metaheuristic and AI-based methods using these criteria to highlight their scope, constraints, and typical roles in the navigation stack.

3.2. Metaheuristic Approach

Metaheuristics are high-level optimization strategies designed to explore and exploit the solution space more effectively than problem-specific heuristics, particularly for complex, nonlinear, and multi-objective problems [14]. In the context of AMR path planning, metaheuristic algorithms operate iteratively, starting with an initial set of candidate solutions and refining them through evolutionary or swarm-based processes. At each iteration, a fitness function evaluates the quality of candidate paths, guiding the search toward near-optimal or optimal solutions (Figure 4). Their generality and flexibility make them applicable across diverse AMR navigation scenarios, from static grid-based maps to dynamic, cluttered environments [15,16].
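The iterate-evaluate-refine loop described above can be sketched generically; the initialization, fitness, and variation operators below are placeholders that a concrete metaheuristic (PSO, GA, ACO, etc.) would specialize.

```python
import random

def metaheuristic_search(init_population, fitness, variation, iterations=100):
    """Generic refine-and-evaluate loop: a fitness function scores each
    candidate, and a variation operator produces the next population."""
    population = init_population()
    best = min(population, key=fitness)
    for _ in range(iterations):
        population = variation(population, best)
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate            # keep the best-so-far solution
    return best

# Toy usage: drive a scalar "path cost" toward its minimum at zero
random.seed(0)
best = metaheuristic_search(
    init_population=lambda: [random.uniform(-10, 10) for _ in range(20)],
    fitness=abs,
    variation=lambda pop, b: [x + 0.5 * (b - x) + random.gauss(0, 0.3)
                              for x in pop],
)
```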
Recent literature confirms that metaheuristics are among the most widely used methods in mobile robot navigation [2,17]. Popular algorithms such as Particle Swarm Optimization (PSO), Genetic Algorithms (GA), and Ant Colony Optimization (ACO) continue to dominate due to their flexibility and robustness across environments [17,18]. Hybrid metaheuristic strategies, particularly those that combine the strengths of multiple algorithms, such as PSO with Simulated Annealing or ACO with GA, have shown promise in improving convergence speed, avoiding local optima, and supporting dynamic adaptation in real-world environments [19,20].

3.2.1. Particle Swarm Optimization

The Particle Swarm Optimization (PSO) algorithm is a widely used, biologically inspired method for AMR path planning. It is a stochastic optimization technique inspired by the collective behavior of creatures such as swarming fish and flocking birds in search of food [21,22]. PSO offers a straightforward way to search the configuration space (CS) for high-quality solutions. Unlike many other optimization techniques, PSO is gradient-free, requiring only a fitness function and remaining insensitive to the function's differentiability or slope [23]. Its dynamics balance local and global search effectively, with particle velocities controlling exploration and convergence [24,25,26]. A graphical representation of PSO evolving is shown in Figure 5.
The velocity and position updates follow the standard PSO equations:
v_i^{t+1} = w · v_i^t + c_1 r_1 (x_lb − x_i^t) + c_2 r_2 (x_gb − x_i^t)
x_i^{t+1} = x_i^t + v_i^{t+1}
where v_i^t is the velocity of particle i at iteration t; w, c_1, and c_2 are the inertia and acceleration weights; x_lb, x_gb, and x_i^t denote the local (personal) best, global best, and current positions, respectively; and r_1 and r_2 are random scalars in [0, 1]. A flowchart of the PSO process is illustrated in Figure 6.
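As a minimal illustration, the two update equations map directly onto code. The sketch below optimizes a toy obstacle-free 2D objective; the hyperparameter values (w = 0.7, c1 = c2 = 1.5) are common textbook choices, not those of any specific surveyed work.

```python
import random

def pso(fitness, dim=2, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, bounds=(-10.0, 10.0)):
    """Minimal PSO implementing the standard velocity/position updates."""
    lo, hi = bounds
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]               # x_lb: per-particle best
    gbest = min(pbest, key=fitness)[:]      # x_gb: swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            if fitness(x[i]) < fitness(pbest[i]):
                pbest[i] = x[i][:]
                if fitness(pbest[i]) < fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

random.seed(1)
goal = (3.0, 4.0)  # toy objective: squared distance to a goal point
best = pso(lambda p: sum((a - g) ** 2 for a, g in zip(p, goal)))
```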
PSO has been applied to AMR path planning in static and dynamic settings, often serving as a global planner for generating short routes [27] and handling constrained navigation in harsh static terrain through diversity-enhancing update mechanisms, such as crowding-radius strategies, outperforming baselines like NSGA-II [28]. In environments with static convex obstacles, PSO-based planning can construct candidate waypoint sequences and evaluate them using fitness terms that trade off distance, clearance, and smoothness [29]; Figure 7 illustrates the PSO waypoint-based planning principle in a static map: feasible waypoint sequences guide the swarm from start to goal, while infeasible waypoints (e.g., wp4) are penalized. For dynamic scenarios, PSO has also been combined with sensing and scheduling mechanisms, such as IR-based detection and time allocation, to reduce robot–robot and robot–obstacle collisions [30].
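A fitness function of the kind just described, trading off path length, obstacle clearance, and smoothness over a candidate waypoint sequence, might be sketched as follows; the circular-obstacle model and weight values are illustrative assumptions.

```python
import math

def path_fitness(waypoints, obstacles, w_len=1.0, w_clear=5.0, w_smooth=0.5):
    """Cost of a waypoint path: length + clearance penalty + turning penalty.
    obstacles: list of (cx, cy, radius) circles; lower cost is better."""
    length = sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
    # Penalize waypoints that intrude into an obstacle (infeasible waypoints)
    clearance_pen = 0.0
    for (px, py) in waypoints:
        for (cx, cy, r) in obstacles:
            d = math.hypot(px - cx, py - cy)
            if d < r:
                clearance_pen += (r - d)
    # Penalize sharp turns between consecutive segments (smoothness term)
    smooth_pen = 0.0
    for a, b, c in zip(waypoints, waypoints[1:], waypoints[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        ang = math.atan2(v1[0] * v2[1] - v1[1] * v2[0],
                         v1[0] * v2[0] + v1[1] * v2[1])
        smooth_pen += abs(ang)
    return w_len * length + w_clear * clearance_pen + w_smooth * smooth_pen

obs = [(1.5, 0.0, 0.5)]                                  # one circular obstacle
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]              # skirts the obstacle
through_obstacle = [(0, 0), (1.5, 0), (2, 0), (3, 0)]    # waypoint inside it
```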
To address premature convergence, local minima, and path smoothness, many hybrid and adaptive PSO variants have been proposed. Examples include adaptive PSO combined with BAS/BSA-style modifications and parameter control (including chaotic/trigonometric tuning) to strengthen global search [31], PSO-based autonomous navigation systems integrating localization and laser scanning [32], and hybrids that combine PSO with GWO (often with chaos/adaptive inertia) to improve convergence behavior [33,34]. Other hybrids integrate PSO with BA and obstacle detection/avoidance modules for operation in static and dynamic scenes [35], or combine PSO with SA and mutation to improve exploration in cluttered environments and avoid premature convergence [36]. For wheeled robots, adaptive fitness formulations that incorporate distance and smoothing terms have also demonstrated effectiveness in simulation and experiments [37]. Meanwhile, improved PSO–GWO designs explicitly balance exploitation and exploration to enhance safety and solution quality [38]. Multi-objective hybrids further optimize length, smoothness, and safety in unknown dynamic environments with moving targets [39].
PSO is a simple and effective optimizer for AMR path planning; however, it can prematurely converge to local minima, motivating many hybrid and adaptive variants that improve exploration, smoothness, and real-time performance [38,39]. Recent advances broadly fall into three categories: (i) prior-informed PSO variants that inject domain knowledge to guide search, (ii) self-evolving or adaptive parameter strategies to enhance convergence and robustness, and (iii) explicit obstacle-kinematics modeling to improve safety in dynamic environments. Representative examples include PKPSO, which injects start–goal prior information and applies modified velocity updates with quintic smoothing to produce shorter, smoother paths [40]; SEPSO, which employs a self-evolving framework for automatic hyperparameter tuning to enhance efficiency in dynamic settings [41]; and “OkayPlan,” which explicitly models obstacle motion through an Obstacle Kinematics Augmented Optimization Problem (OKAOP) and achieves 125 Hz real-time planning with improved convergence and safety [42]. Beyond PSO-only refinements, comparative studies evaluate improved PSO alongside GWO and ABC on indoor static navigation using ROS 2 and onboard sensing [43], while spline-based formulations optimize B-spline control points with modified PSO (plus adaptive random fluctuations) to generate smooth, collision-free, kinematically feasible trajectories [44].
Hybridization with complementary methods has also enhanced PSO’s robustness. A Smooth PSO-Improved Potential Field planner (SPSO-IPF) was introduced in [45], which combines PSO’s global search with improved potential fields and kinematic constraints to mitigate jerky trajectories and local trapping, thereby yielding smoother, collision-free paths in both static and dynamic environments. The authors in [46] integrated PSO into the RRT* framework (PSO-RRT*), adding a minimum rotation angle constraint and B-spline smoothing. Their approach reduced planning time by up to 37%, shortened paths by 10%, and lowered turning angles by over 20% compared to RRT*, Informed RRT*, and Q-RRT*, with further validation on a Raspberry Pi-based mobile robot. Collectively, these works illustrate a clear trend toward domain-aware, self-adaptive, and hybrid PSO variants that enhance smoothness, efficiency, and real-time applicability in AMR navigation. The surveyed works in Table 1 highlight a progressive shift toward domain-aware, self-tuning, and hybrid PSO strategies that enhance the efficiency and reliability of AMR path planning in both static and dynamic environments.

3.2.2. Genetic Algorithm

Genetic Algorithms (GAs) are evolutionary optimization techniques inspired by natural selection, operating through iterative phases of selection, crossover, and mutation [47]. They draw on Darwin’s theory of evolution and genetics: a population competes for survival, and only the fittest individuals are selected for reproduction. In a GA, fitness is measured by a fitness function that represents an individual’s ability to solve the problem at hand [47,48]. The five elemental phases of a GA are shown in the flowchart in Figure 8. A GA starts by initializing a population of candidate paths/solutions and iteratively evolves them using selection, crossover, and mutation. Owing to their simplicity and global perspective, GAs have been successfully applied to a wide range of problems; their population-based search enables robust exploration of multimodal solution spaces, making them effective for AMR path planning. However, standard GAs suffer from slow convergence, inefficient initial populations, and difficulty handling highly dynamic environments. These limitations have motivated a variety of improved and hybrid GA variants in recent years.
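To make the selection-crossover-mutation loop concrete, the sketch below evolves paths encoded as fixed-length waypoint lists; the encoding, operator choices, and rates are illustrative rather than drawn from any specific surveyed variant.

```python
import random

def evolve_paths(fitness, start, goal, n_mid=4, pop_size=40,
                 generations=150, p_mut=0.3, bounds=(0.0, 10.0)):
    """Minimal GA for waypoint paths [start, mid1..midN, goal]:
    rank selection, one-point crossover, and waypoint mutation;
    lower fitness is better."""
    lo, hi = bounds
    rand_pt = lambda: (random.uniform(lo, hi), random.uniform(lo, hi))
    pop = [[start] + [rand_pt() for _ in range(n_mid)] + [goal]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                 # rank by fitness
        elite = pop[:pop_size // 2]           # survival of the fittest
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randint(1, n_mid)    # one-point crossover (interior)
            child = a[:cut] + b[cut:]
            if random.random() < p_mut:       # mutate one interior waypoint
                child[random.randint(1, n_mid)] = rand_pt()
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness)

random.seed(2)
path_len = lambda p: sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
                         for (x1, y1), (x2, y2) in zip(p, p[1:]))
best = evolve_paths(path_len, start=(0.0, 0.0), goal=(10.0, 10.0))
```

With path length as the fitness, the evolved waypoints approach the straight start-goal line; a practical planner would add obstacle-penalty and smoothness terms to the fitness, as in the literature surveyed above.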
Genetic algorithms were traditionally applied mainly to path planning in static environments, but recent advances have largely removed this restriction, enabling their use in dynamic environments. The research in [49] proposes a dynamic path planner that uses a reward-based model to estimate the probability of dynamic obstacles along the path. The reward function is applied to the GA, accounting for factors such as reward value, path security, and length, which improves the effectiveness of the GA-based planner in dynamic environments. Many crossover operators examined in the literature produce impractical routes because they do not adequately account for the path encoding. In [50], an enhanced crossover operator was proposed for path planning in a static environment that accommodates chromosomes of varying lengths. A genetic adaptive navigation control strategy was integrated with the enhanced GA to improve obstacle avoidance, as described in [51]. The method successfully navigates around obstacles in the robot’s workspace, demonstrating its robustness and effectiveness.
The research presented in [52] proposed a hybrid approach to AMR path planning in a uniform-grid environment by combining an improved GA with a unique APF method. The proposed approach first utilizes the APF to generate initial paths, which are then further optimized by the GA to achieve better optimality in terms of length, quality, and reliability. A collision-mitigation operator is also introduced into the GA to prevent potential collisions in advance. The proposed approach demonstrates its effectiveness in producing high-quality paths. Another enhanced GA technique was proposed in [53], which addresses the local optima and sluggish convergence issues in AMR path planning in a static environment. In [54], an enhanced GA was paired with the A* algorithm to give mobile robots practical path-planning abilities on complex maps. The GA method uses the A* algorithm’s evaluation function to enhance its heuristic. The suggested approach ensures quick convergence, shorter trajectory lengths, and stable autonomous operation. Because traditional discovery and optimization approaches are relatively slow for real-time use, Ref. [55] proposed a sensor-based adaptive GA that optimizes an AMR’s path during navigation in an obstacle-filled grid environment. The revised approach yields shorter pathways and faster convergence.
The quality of the initial population strongly affects GA performance in AMR path planning [56]. Several works improve GA-based global planning by using heuristics to seed viable initial paths and multi-objective fitness functions [57], or by targeting fast planning in crowded industrial settings [58]. Hybrid approaches further enhance diversity and path quality, including RRT with an upgraded GA (e.g., elite selection plus angle/obstacle terms) [59] and ACO–GA variants that adapt ACO evaporation and GA crossover/mutation to yield shorter, smoother paths with fewer turns. However, dynamic handling remains limited [60]. In dynamic, obstacle-rich environments, modified selection and fitness designs can generate higher-quality chromosomes than classical GAs and improve practical applicability [61]. More recently, ref. [62] proposed an improved GA that uses CBPRM-based initialization to reduce the number of infeasible paths and tailored operators to accelerate convergence in static 2D global planning, but without real-robot validation. A comparative warehouse-routing study found ACO outperformed GA in minimizing travel distance for the tested cases [63].
Recent work further refines GA for AMR path planning through improved initialization and operator design. For example, directional seeding combined with novel crossover, mutation, and simplification operators improves success rates and shortens grid-based paths [64], while a random domain inversion mechanism strengthens local search, accelerating convergence and improving optimality over conventional GA [65]. Zhang et al. combine multi-step population seeding, adaptive operators, and Bezier smoothing to enhance path smoothness, length, and runtime [66]. In dynamic settings, MHRTSN integrates APF with population-based metaheuristics for safety-aware real-time navigation; simulations show MHRTSN-PSO converges faster than MHRTSN-GA, especially with small populations [67]. Table 2 presents a comprehensive overview of significant GA-based contributions to AMR path planning, highlighting key trends and outcomes across static and dynamic environments.

3.2.3. Ant Colony Optimization

Ant Colony Optimization (ACO), introduced by Dorigo in the 1990s [68], is a bio-inspired technique that models the pheromone-based foraging behavior of ants, which communicate paths between their colony and food sources through pheromone trails [69]. Ants collectively explore, avoid hazards, and reinforce viable routes by depositing pheromones, whose intensity guides subsequent search behavior. Figure 9 provides an intuitive illustration of ACO’s foraging-inspired mechanism: candidate paths emerge through repeated agent exploration, while pheromone reinforcement biases subsequent ants toward shorter and safer routes around obstacles. The synergy of population-based global search and positive feedback makes ACO suitable for path discovery, even with incomplete environmental information [70]. ACO has been widely applied in AMR path planning for selecting efficient, collision-free paths in both static and dynamic environments [71]. As elsewhere in this review, we focus on recent works; earlier studies are covered in previous surveys.
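The pheromone mechanism can be illustrated on a toy route-choice problem: ants select a route with probability proportional to τ^α · η^β (pheromone times visibility), pheromone evaporates at rate ρ, and each ant deposits Q/L on its chosen route. This follows the textbook ACO scheme with illustrative parameters.

```python
import random

def aco_two_routes(lengths, n_ants=20, iters=50,
                   alpha=1.0, beta=2.0, rho=0.5, Q=1.0):
    """Toy ACO over alternative routes of given lengths: ants pick a route
    with probability proportional to tau^alpha * (1/len)^beta; pheromone
    evaporates at rate rho and is reinforced by Q/len per choosing ant."""
    tau = [1.0] * len(lengths)
    for _ in range(iters):
        eta = [1.0 / L for L in lengths]                     # visibility
        weights = [t ** alpha * e ** beta for t, e in zip(tau, eta)]
        total = sum(weights)
        deposits = [0.0] * len(lengths)
        for _ in range(n_ants):
            r, acc = random.random() * total, 0.0
            choice = len(lengths) - 1                        # roulette wheel
            for i, w in enumerate(weights):
                acc += w
                if r <= acc:
                    choice = i
                    break
            deposits[choice] += Q / lengths[choice]          # reinforcement
        tau = [(1 - rho) * t + d for t, d in zip(tau, deposits)]
    return tau

random.seed(3)
tau = aco_two_routes([5.0, 12.0])   # short route vs. long detour
```

After a few dozen iterations, pheromone concentrates on the shorter route, reproducing the positive-feedback effect described above.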
Despite its popularity, ACO suffers from premature convergence, sluggish computation, and sensitivity to parameter settings. To address these issues, various modifications and hybrid approaches have been proposed. A modified ACO algorithm was introduced [72] that employs new heuristic information based on a limitless step length, expanding the visual field and enhancing visibility accuracy, together with a stimulating probability to help an ant choose the next grid cell. The proposed algorithm demonstrated fast convergence and rapid expansion of the search area. To achieve fast convergence, obstacle-free operation, and efficient real-time performance on grid maps, an improved APF-ACO technique was presented [73], in which the ant colony moves in a direction determined by the directional forces of the APF. The hybrid strategy accelerated convergence by increasing exposure to heuristic data through the unbounded step-length concept and by merging global and local update strategies into the ACO pheromone update mechanism. An enhanced APF was combined with GA operators in a dynamic setting to avoid local optima [74]; the algorithm exhibits reliability and quick convergence. A real-time collision-avoidance path planner for both static and dynamic settings was proposed [75]: the hybrid algorithm achieves reduced path length, rapid convergence, and fewer optimal loops by combining the benefits of an enhanced GA-ACO. Research in [76] proposed an improved ACO with uneven pheromone initialization, adaptive transition rules, and deadlock punishment, achieving faster convergence and fewer lost ants compared to the standard ACO.
The enhanced ACO has been extended to 3-dimensional (3D) space for dynamic path planning [77]. The suggested technique employs an effective gain-function-based pheromone-enhancing mechanism to minimize overall energy consumption throughout the path-planning process. For indoor MR path planning, an improved adaptive ACO method was proposed in [78]. To arrive at a meaningful global optimal path, several adaptive adjustment variables were applied, and the path planning problem was treated as a multi-objective optimization problem. The researchers in [79] proposed an enhanced ACO that incorporates a communication system. The communication technique was inspired by the interactions of an ant’s tentacles in reality, which aggregate past pathways to form a superior composite way. An enhanced ACO incorporating adaptive parameter adjustment and advanced dynamic collision avoidance techniques was proposed in [80]. Experimental results demonstrate that the approach is both reliable and efficient under challenging conditions. To enhance AMR path planning in demanding, dynamic environments, ref. [81] proposes an improved ACO for optimal global path planning, integrating it with the DWA strategy for local collision avoidance. The algorithm demonstrated outstanding navigation and obstacle-avoidance performance. The distributed ACO frameworks have been tested for multi-robot path planning in congested environments [82]. The authors in [83] proposed an improved ACO-based path planning method for robots in complex grid environments. The enhanced ACO effectively generates optimal routes, while the obstacle avoidance scheme prevents ants from getting stuck. ACO remains competitive in comparative studies: An ACOGA hybrid outperformed standalone ACO and GA, producing smoother and shorter trajectories with faster convergence [84].
Several very recent studies have further enhanced ACO’s efficiency and applicability. An Improved Trimming ACO (ITACO) that integrates dynamic weighting in state transitions was proposed, utilizing an artificial potential field as the heuristic, path-length-dependent pheromone updates, and triangular pruning to remove redundant nodes. This approach shortens path lengths by over 60% and converges faster than classical ACO [85]. Research in [86] introduced a Parallel ACO (PACO) framework with rank-based pheromone updates and a “continue-or-kill” strategy to resolve deadlocks, showing nearly half the average planning time across different grid sizes without degrading path quality. An Intelligently Enhanced ACO (IEACO) featured six innovation layers, including non-uniform pheromone initialization, ε-greedy state transitions, adaptive parameter tuning, and multi-objective heuristics, and achieved superior path quality in both simulations and physical robot experiments [87]. Likewise, AR-ACO (2025) combined repulsive potential fields with ACO and introduced six iterative strategies for faster convergence and improved robustness in complex maps [88]. Complementing these algorithmic advances, ACO was integrated with edge cloud computing, where edge-level processing reduced latency and optimized pheromone updates, enabling faster and safer planning in dynamic environments [89]. An enhanced island-based ACO (EACI) mitigates stagnation and deadlocks through auxiliary map pre-processing, irregular pheromone initialization, and adaptive pheromone evaporation, achieving over 90% fewer iterations and 95% fewer lost ants in Mini-ROS vehicle tests compared to baseline ACO variants [90]. Collectively, these developments highlight a shift toward more adaptive, distributed, and real-time ACO variants that overcome the classical drawbacks of premature convergence, slow execution, and limited scalability.
Table 3 presents a comprehensive overview of significant ACO-based contributions to AMR path planning, highlighting key trends and outcomes across static and dynamic environments.

3.2.4. Firefly Algorithm

The Firefly Algorithm (FA), introduced by Xin-She Yang in 2008 [91], is a metaheuristic optimization method inspired by the attraction behavior of tropical fireflies [92]. Like other swarm intelligence algorithms such as PSO and ACO, FA exploits collective search behavior, where the brightness of a firefly represents solution quality and guides attraction-based movements [93]. As summarized in Figure 10, the FA iteratively updates candidate solutions based on fitness-driven attractiveness: solutions are ranked, the current best solution guides movement, and positions are updated until a goal/termination condition is met. Owing to its simplicity and population-based global search, FA has been widely adopted as an optimization technique across many areas of engineering, including AMR path planning. The approach has been used to address global and local optimization problems, and several FA variants have been developed to handle these effectively [94,95]. Detailed analyses suggest that FA is particularly effective in multi-objective optimization [95]. This review concentrates on recent advancements.
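For reference, the core FA update that the planners discussed below adapt can be sketched as follows: brightness corresponds to (inverse) cost, and each dimmer firefly moves toward brighter ones with an attraction term beta0*exp(-gamma*r^2) plus a decaying random step. The test objective, bounds, and parameter values are illustrative, not drawn from [91] or the cited planners.

```python
import math
import random

def firefly_minimize(f, dim, bounds, n=15, iters=60,
                     beta0=1.0, gamma=1.0, alpha0=0.2, seed=1):
    """Minimal Firefly Algorithm: lower cost = brighter firefly; dimmer
    fireflies are attracted toward brighter ones, with attraction decaying
    with squared distance and a shrinking random perturbation."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    for t in range(iters):
        alpha_t = alpha0 * 0.97 ** t          # decaying random step size
        cost = [f(x) for x in X]
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:         # j is brighter: i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    X[i] = [min(hi, max(lo, xi + beta * (xj - xi)
                                        + alpha_t * (rng.random() - 0.5)))
                            for xi, xj in zip(X[i], X[j])]
                    cost[i] = f(X[i])
        # the brightest firefly never moves, so the best solution is kept
    best = min(X, key=f)
    return best, f(best)
```

In the path-planning applications above, `f` would score a candidate path encoded as waypoint coordinates (e.g., length, smoothness, and obstacle clearance), rather than a benchmark function.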
Early applications of the FA in AMR path planning highlighted its capability to minimize path length while ensuring smooth trajectories [96]. Moreover, FA-based approaches have been extended to dynamic environments, demonstrating effective navigation in the presence of moving goals and obstacles [97]. These studies indicate that FA can generate competitive results compared to conventional techniques. However, despite these advantages, standalone FA often exhibits premature convergence and limited adaptability in highly complex or high-dimensional search spaces, motivating subsequent research on hybrid and improved variants.
Since standalone optimization techniques rarely guarantee optimal paths across all configurations, many researchers have explored hybrid strategies by integrating FA with complementary algorithms to improve the efficiency and robustness of AMR navigation. For instance, in [98], FA was combined with fuzzy logic, with the Sobol sequence and a dynamic displacement factor used to initialize the population and enhance search capabilities. Similarly, ref. [99] proposed a distinctive hybrid approach in which the D* algorithm was first utilized to compute the shortest path, and FA, enhanced with a quadratic factor, was subsequently applied for systematic obstacle detection and avoidance. These hybrid methods demonstrated improved path optimality and safety compared to standalone approaches. In [100], a hybrid path-planning method combining GA and FA was proposed to mitigate the FA’s tendency to converge prematurely to local optima. The GAFA hybrid demonstrated improved responsiveness and computational efficiency in both 2D and 3D navigation tasks. Similarly, ref. [101] presented a GAFA approach validated in a 2D static environment using a Khepera-II robot. In this method, the robot actively perceives its surroundings and uses the collected information to efficiently navigate cluttered environments, ensuring optimality, safety, and reduced routing time. A comprehensive survey [93] provides a broader discussion of past and recent FA-based approaches for AMR path planning.
Recent developments have further refined the Firefly Algorithm (FA) for AMR applications. An Enhanced FA (EFA) was proposed in [102], where the randomness parameter (α) was linearly reduced, yielding paths up to 10% shorter with lower variance across multiple map scenarios. A Functional FA (FFA) introduced in [103] integrated a choice-based function and Cartesian constraints, enabling efficient, collision-free navigation in both 2D and 3D dynamic environments. In [104], FA was combined with sliding mode control (SMC) for wheeled robot navigation, achieving improvements in both path planning and real-time trajectory tracking and outperforming PSO and TLBO baselines. A hybrid approach, as described in [105], embeds FA within an RRT framework (ERRT-FA) to enhance exploration and produce shorter, more efficient paths in complex scenarios. Similarly, ref. [106] presented a hybrid FA–Cuckoo Search algorithm that leveraged complementary exploration strategies, demonstrating improved obstacle avoidance and path optimality in simulation studies. Finally, a hybrid Whale Firefly Optimization Algorithm (FWOA) was proposed in [107] for complex static environments, integrating firefly-inspired attraction with whale-inspired encircling mechanisms and opposition-based learning. The method demonstrated stronger exploration, faster convergence, and greater solution diversity in multi-population scenarios. Collectively, these advances highlight a shift toward hybrid and adaptive FA variants, emphasizing multi-objective performance, dynamic adaptability, and scalability to 3D environments in AMR path planning. Table 4 provides a comprehensive overview of major FA-based contributions to AMR path planning, summarizing key trends and outcomes in both static and dynamic environments.

3.2.5. Grey Wolf Optimization

Grey wolves operate in highly structured packs, with α, β, δ, and ω wolves forming a rigid social hierarchy. Inspired by this cooperative hunting behavior and leadership structure, the Grey Wolf Optimizer (GWO) was introduced in [108] as a swarm intelligence metaheuristic. The conventional GWO process is shown in Figure 11. Owing to its simple structure and effective balance between exploration and exploitation, GWO has been widely applied to optimization tasks, including AMR path planning. Early improvements focused on enhancing convergence speed and solution quality, such as the use of evolution and elimination strategies [109] and hybridization with Symbiotic Organism Search (SOS) to achieve faster convergence and improve detection capability in cluttered environments [110]. An improved GWO was later proposed in [111] to address path-planning challenges in complex 3D UAV environments, introducing a variable weighting factor to mitigate waypoint dispersion. Real-time applicability was further supported through a parallelized implementation of GWO.
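The canonical GWO update that these variants build on can be sketched compactly: the three fittest wolves (alpha, beta, delta) each propose an encircling estimate, every wolf moves to the average of the three estimates, and the coefficient `a` decays from 2 to 0 so the search shifts from exploration to exploitation. The test function, bounds, and parameters below are illustrative choices, not taken from [108].

```python
import random

def gwo_minimize(f, dim, bounds, n_wolves=12, iters=80, seed=2):
    """Minimal Grey Wolf Optimizer: alpha, beta, and delta wolves guide
    the pack; coefficient `a` decays linearly from 2 to 0 over iterations."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best, best_val = None, float("inf")
    for t in range(iters):
        a = 2.0 * (1 - t / iters)
        ranked = sorted(X, key=f)
        if f(ranked[0]) < best_val:           # track the best-so-far wolf
            best, best_val = list(ranked[0]), f(ranked[0])
        alpha_w, beta_w, delta_w = ranked[:3]
        for i in range(n_wolves):
            new_pos = []
            for d in range(dim):
                estimates = []
                for leader in (alpha_w, beta_w, delta_w):
                    A = 2 * a * rng.random() - a       # in [-a, a]
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - X[i][d])   # encircling distance
                    estimates.append(leader[d] - A * D)
                new_pos.append(min(hi, max(lo, sum(estimates) / 3)))
            X[i] = new_pos
    return best, best_val
```

The improvements surveyed in this subsection mostly alter how `a` decays (nonlinear convergence factors), how the population is initialized (chaotic mapping), or how wolves escape local optima (Lévy flight, oppositional learning).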
Adaptive variants of the GWO algorithm have also been proposed to enhance robustness in constrained environments. In [112], two types of adaptive GWO, expanded and incremental, were proposed, differing in how wolves update their positions relative to one another to balance exploration and exploitation. These approaches were applied to UAV path planning in farm environments. Similarly, ref. [113] utilized adaptive GWO in scenarios with multiple obstacles, successfully identifying accident-free, time-efficient, and cost-effective paths for autonomous robots. A hybrid approach was introduced in [114], combining an upgraded GWO with a Whale Optimization Algorithm (WOA) in a leader–follower configuration and integrating it with a dynamic window approach (DWA) to address dynamic barriers, making the technique applicable to both global and local path planning. Further, ref. [43] proposed an improved GWO for static, structured indoor environments, experimentally validating the method on a two-wheeled AMR equipped with a Raspberry Pi and LiDAR. Although limited to 2D environments with static obstacles, this study represents a significant step toward bridging GWO-based algorithms with real-world AMR platforms.
Recent research has further advanced GWO variants for AMR path planning. An Improved GWO with Weighting Functions (IGWO-WFs) was introduced in [115], where multi-modal adaptive, sigmoid, and auto-regressive functions were incorporated to stabilize convergence and enhance trajectory smoothness in UAV path planning. A Hybrid Improved GWO (HI-GWO) was proposed in [116], integrating a Gauss chaotic mapping, a nonlinear convergence factor, and Lévy flight with golden-sine strategies, and demonstrated superior accuracy and robustness in both benchmark tests and mobile robot navigation tasks. Building on these efforts, the PAGWO-IDWA method presented in [117] integrated piecewise adaptive GWO with the DWA for dynamic obstacle avoidance, reducing the number of turns by up to 33% and runtime by 30%. The proposed approach enables effective dynamic path planning. More recently, an improved GWO variant inspired by cooperative predation strategies and lens-based oppositional learning was proposed in [118], achieving improved exploration, reduced local trapping, and shorter UAV trajectories than PSO and WOA. Collectively, these developments indicate a clear trend toward adaptive, hybrid, and real-time GWO frameworks tailored to robust AMR navigation. Table 5 summarizes notable GWO-based studies on AMR path planning, outlining key patterns and results across both static and dynamic environments.

3.2.6. Other Metaheuristics Algorithms

Beyond the widely adopted swarm-based optimizers, several less-utilized metaheuristics have also been explored for AMR path planning. A time-varying augmented Bat Swarm Optimization (BSO) was proposed in [119] to address path-planning challenges in dynamic environments. The Bat Algorithm (BA) was enhanced through a modified frequency setting and combined with Local Search (LS) and Obstacle Detection and Avoidance (ODA), enabling effective navigation in the presence of dynamic obstacles. An example of AMR navigation with dynamic obstacle avoidance is shown in Figure 12. A novel metaheuristic approach to AMR path planning, inspired by the foraging and orienting behaviors of African vultures, was presented in [120]. The African Vulture Optimization Algorithm (AVOA) demonstrated strong performance in scenarios with moving goals and obstacles. In [121], a dynamic planning technique was introduced to generate optimal robot paths that avoid collisions, maximize clearance from obstacles, and minimize distance to the target. Furthermore, ref. [122] proposed an updated COOT algorithm to address challenges in environments with both static and dynamic obstacles, thereby mitigating the instability issues of the conventional approach. This method formulated path planning as a multi-objective optimization problem that balances path length and smoothness.
Recent advances have extended the application of relatively underexplored metaheuristics for AMR path planning. A hybrid HHO–AVOA was proposed in [123], integrating Harris Hawks Optimization with the African Vulture Optimization Algorithm (AVOA) to handle kinematic constraints in wheeled robots. The method demonstrated robust obstacle avoidance, faster convergence, and reduced path length in both static and dynamic environments. A Modified AVOA (MAVOA) was introduced in [124] for legged robot navigation, where the dynamic tuning of vulture-inspired strategies was validated through simulations and Webots-based experiments, showing a deviation of less than 5% between simulated and real-world performance. In the UAV domain, a Sparrow-Enhanced AVOA (SAVOA) was proposed in [125], incorporating Sobol sequence initialization and a starvation-based adjustment strategy, thereby improving trajectory smoothness and obstacle avoidance in both urban and mountainous 3D environments. Collectively, these works highlight a growing trend toward hybrid, adaptive, and domain-specific metaheuristics, expanding the toolkit for AMR navigation across wheeled, legged, and aerial robotic platforms.

3.3. Artificial Intelligence Techniques

Artificial intelligence (AI) has become an indispensable component of modern AMR research, with applications ranging from industrial logistics to aerial and underwater robotics. While metaheuristic algorithms have attracted significant attention for AMR path planning in the past decades, recent years have witnessed a pronounced shift toward AI-powered approaches, a trend expected to intensify with advances in machine learning and computational resources. Unlike classical optimization, AI techniques enable robots to learn from data, adapt to dynamic and uncertain environments, and make real-time decisions by leveraging reasoning, perception, and learning capabilities [126,127,128].
In the context of AMR navigation, AI-based methods span a broad spectrum of paradigms: rule-based reasoning systems such as fuzzy logic (FL), data-driven models such as neural networks (NNs) and deep learning, reinforcement learning (RL) and its deep variants (DRL) for policy optimization, as well as hybrid approaches that integrate reasoning and learning (e.g., neuro-fuzzy systems or AI–metaheuristic hybrids) [128]. These techniques have shown promise in handling complex, high-dimensional search spaces, dynamic obstacles, and multi-robot coordination. However, challenges remain, including computational demands, the need for large-scale training data, generalization to unseen scenarios, and interpretability of learned models. The following subsections provide a detailed review of these cognitive approaches, with an emphasis on both foundational methods and recent state-of-the-art contributions.

3.3.1. Fuzzy Logic

Fuzzy Logic (FL) provides a reasoning framework inspired by human cognition, allowing systems to handle uncertainty and imprecision through linguistic rules [129]. Due to its simplicity, interpretability, and robustness, FL has been widely applied to AMR navigation tasks, particularly for obstacle avoidance and decision-making under uncertainty. Early studies showed that FL controllers could effectively guide wheeled mobile robots (WMRs) in cluttered environments by incorporating safe-zone concepts within the VFH technique [130], employing Mamdani-type fuzzy inference systems for efficient path planning [131], and combining fuzzy controllers with artificial potential fields (APF) to address local minima problems and improve trajectory smoothness [132]. These works demonstrated that reliable navigation could be achieved with relatively few rules and modest computational requirements.
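As a concrete illustration of the rule-based reasoning described above, the sketch below implements a tiny Mamdani-style steering controller: triangular membership functions, min for rule AND-firing, max aggregation, and centroid defuzzification over a sampled output universe. The rule base, membership ranges, and numeric values are invented for illustration and are far simpler than the cited controllers.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_steer(dist, bearing):
    """Mamdani-style steering: dist = obstacle distance (m), bearing =
    obstacle angle relative to heading (deg, negative = left). Returns a
    steering command in deg (positive = turn right). If no rule fires
    (obstacle very far), the controller returns 0 (keep heading)."""
    # fuzzify the inputs
    near = tri(dist, -0.5, 0.0, 2.0)
    far = tri(dist, 1.0, 3.0, 5.5)
    left = tri(bearing, -90, -45, 0)
    right = tri(bearing, 0, 45, 90)
    # rule firing strengths (min implements fuzzy AND)
    r_steer_right = min(near, left)    # obstacle near-left  -> steer right
    r_steer_left = min(near, right)    # obstacle near-right -> steer left
    r_straight = far                   # obstacle far        -> keep heading
    # centroid defuzzification over a sampled universe [-40, 40] deg
    num = den = 0.0
    for s in range(-40, 41):
        mu = max(min(r_steer_right, tri(s, 10, 30, 50)),
                 min(r_steer_left, tri(s, -50, -30, -10)),
                 min(r_straight, tri(s, -15, 0, 15)))
        num += mu * s
        den += mu
    return num / den if den else 0.0
```

For example, a near obstacle at bearing −30° (front-left) fires the steer-right rule and yields a positive steering command, while a distant obstacle leaves the heading essentially unchanged; production controllers such as those in [130,131,132] use richer rule bases and more inputs, but the same inference pipeline.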
The effectiveness of fuzzy logic (FL) in WMR path planning was further illustrated in [133], where sensor-equipped robots were able to recognize obstacles and avoid collisions. Similarly, ref. [131] proposed an intelligent Mamdani-type inference system capable of identifying near-optimal paths while minimizing time and energy consumption. To efficiently plan paths and further enhance obstacle avoidance in complex settings, ref. [132] integrated fuzzy controllers with an improved APF, addressing limitations such as local minima by using heading and stepping FL controllers.
Recent developments have increasingly focused on hybridization and multi-objective planning. For instance, in [134], an FL-based AHP method for multi-objective decision-making in MR path planning was proposed; goals were prioritized by significance to shorten the distance to the target, ensure quick rotation, and guarantee path safety. The FL-based AHP technique was combined with the A* algorithm to efficiently plan the route of an omnidirectional WMR in a dynamic environment, and a bio-inspired BLS was utilized as the robot’s motion controller to achieve resilience and rapid control [135]. In [136], FL was employed to adapt the heuristic parameters of the improved ACO, specifically the pheromone volatility ratio, resulting in rapid convergence. A multi-layer trajectory planning technique was proposed in [137], based on an improved APF for the global path and an enhanced DWA with fuzzy logic control for the local path. The DWA avoidance rate against dynamic obstacles was improved by using a fuzzy logic technique to assess the degree of hazard posed by moving obstacles, based on an accident risk index and relative position; path-planning simulations validated the adaptive behavior. In this hybrid framework, PSO tunes the APF global-planning parameters to generate intermediate subgoals, while the fuzzy-DWA local planner adapts its evaluation weights (time-varying coefficients) online based on local obstacle-risk indicators and outputs real-time motion commands (v, ω) in the presence of moving obstacles. For 3D navigation, ref. [138] introduced a novel technique for a car-like mobile robot that combines FL systems with a multi-objective PSO to design a smooth, short, collision-free, and energy-efficient path, among other qualities.
Recent contributions since 2023 have significantly advanced the integration of fuzzy logic into local and hybrid path-planning frameworks. In [139], the Dynamic Window Approach (DWA) was improved with a fuzzy controller that dynamically adjusts cost functions, enabling higher heading accuracy and enhanced responsiveness to dynamic obstacles. Building on this idea, a Fuzzy DWA (FDWA) was proposed in [140], where adaptive tuning of evaluation coefficients and sub-targets from a global planner was employed to prevent local optima and improve robustness in dynamic scenarios. In the context of indoor robotics, Kumar et al. [141] applied Takagi–Sugeno fuzzy inference with Kinect-based perception on TurtleBot 2 platforms, combining a Tracking FLC and an Obstacle Avoidance FLC to achieve reliable real-time navigation with reduced computational overhead compared to model predictive control. In [142], a Deterministic Constructive Algorithm (DCA) was coupled with an Efficient FL Controller (EFLC), forming a two-layer navigation system that outperformed GA, RRT, ACO, and Dijkstra in terms of path length, runtime, and safety, validated in V-REP simulations with Pioneer robots under both static and dynamic conditions.
More recent hybridizations further demonstrate the versatility of fuzzy logic. In [143], Quad_D*–Fuzzy integrated quadtree decomposition, D* Lite planning, and fuzzy reasoning to guide robots in dense, room-like, and trap-like environments, achieving a 100% success rate and reducing planning time by up to 80% in dynamic settings. Similarly, ref. [144] presented a hybrid fuzzy A*-quantum multi-stage Q-learning APF framework, in which fuzzy A* provides global adaptability, APF supports local obstacle avoidance, and quantum-enhanced Q-learning accelerates convergence and resolves local minima. Experimental evaluations demonstrated over 80% faster learning convergence and consistently smoother, collision-free paths compared to standalone A*, APF, or Q-learning. Collectively, these advances underscore FL’s growing role as a reasoning layer, which, when combined with optimization or learning methods, yields robust, adaptive, and computationally efficient solutions for AMR path planning in complex static and dynamic environments. Overall, pure FL methods are efficient and robust local planners, while recent hybridizations significantly improve path optimality at a moderate additional computational cost. Table 6 summarizes notable FL-based studies on AMR path planning, outlining key patterns and results across both static and dynamic environments.

3.3.2. Artificial Neural Networks

Artificial Neural Networks (ANNs) are collections of interconnected processing elements (neurons) that approximate relationships in data through weighted connections, mimicking the way the human brain processes information [145]. Figure 13 illustrates a feedforward ANN architecture commonly used for WMR path planning, consisting of an input layer, two hidden layers, and an output layer. Owing to their adaptability and ability to approximate nonlinear mappings, ANNs are particularly well-suited for navigation tasks in dynamic and uncertain environments [146].
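The feedforward architecture of Figure 13 can be expressed as a short forward-pass sketch: each layer computes a weighted sum plus bias, the two hidden layers apply a tanh activation, and the output layer is linear. The layer sizes and the (random, untrained) weights here are illustrative; a deployed planner would train the weights, e.g., by backpropagation as discussed later in this subsection.

```python
import math
import random

def make_mlp(sizes, seed=3):
    """Build a fully connected feedforward net with random weights.
    sizes = [n_inputs, hidden1, hidden2, ..., n_outputs]."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((W, b))
    return layers

def forward(layers, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for li, (W, b) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + bj
             for row, bj in zip(W, b)]
        x = z if li == len(layers) - 1 else [math.tanh(v) for v in z]
    return x

# 3 sensor inputs -> two hidden layers -> 2 outputs (e.g., v and omega),
# mirroring the input/hidden/output structure of Figure 13
net = make_mlp([3, 8, 8, 2])
v, omega = forward(net, [0.9, 0.2, 0.5])  # untrained: outputs are arbitrary
```

The nonlinear tanh layers are what give the network its ability to approximate the nonlinear sensor-to-command mappings that the cited navigation controllers learn from data.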
Early ANN-based navigation schemes focused on direct motion control and collision avoidance. For example, ref. [147] proposed an ANN control framework for non-holonomic WMRs in 3D environments, while [148] used an ANN to guide wall-following robots within the MobotSim simulation environment. In [149], a GRU recurrent neural network was introduced for AMR dynamic path planning in uncharted environments, improving smoothness and real-time responsiveness, though limited by training data availability and accuracy. Multi-layer feed-forward ANNs are well suited to controlling systems under uncertainty. In [150], a deep learning technique based on hybrid positions and virtual forces, built on multi-layer feed-forward neural networks, implemented a potential field control method for regulating the proximity between the autonomous vehicle and obstacles; the approach accounts for uncertainties when addressing trajectory tracking and collision-avoidance control for AMR systems. To achieve optimal real-time navigation and obstacle avoidance for an autonomous MR, a Kalman filter (KF) in conjunction with a recurrent fuzzy neural network approach was presented [151]. The extended KF is designed to mitigate the inaccuracy of the RFNN’s weighting function while accounting for the robot’s kinematic constraints and constraints related to goals and obstacles. The RFNN then achieved resilient control of the robot’s movements, enabling the discovery of a collision-free path from the initial point to the endpoint.
A behavior-based ANN was proposed for dynamic path planning in complex environments; a behavior-based control strategy divides a difficult navigation task into several simple, easy-to-design behaviors [129]. An effective goal-search and goal-reaching plan for the mobile robot was implemented in [152], where the controller outputs the angular velocity and steering angle, with obstacle location and goal position serving as inputs. The backpropagation (BP) methodology was employed to minimize errors in the weighting function, enabling the network to adapt appropriately, and with the resulting BNN algorithm the robot reacted more quickly than with other developed methods. In [153], a unique recursive ANN predictor method for AMR path planning was developed. The proposed multi-objective strategy addresses speed and accuracy issues in autonomous navigation by optimizing motor PWM signals through appropriate training, and the results demonstrate the controller’s usefulness and versatility. In the context of enhancing classical algorithms, ANNs have been effectively integrated to improve convergence speed and solution quality. The Neural RRT* (NRRT*) algorithm employs a convolutional NN (CNN) to learn nonuniform sampling distributions from prior optimal paths generated via the A* algorithm, thereby accelerating the exploration process and reducing computational overhead in high-dimensional environments [154]. The MPNet framework [155] employs deep NNs to predict connectable path segments from raw depth sensors and initial/goal states and supports hybridization with RRT* to ensure asymptotic optimality. In evaluations across motion planning domains ranging from 2D to 7D, MPNet achieved sub-second planning times while maintaining high path quality.
The emergence of deep learning expanded the applications of ANNs into more complex environments. A hybrid scheme combining deep neural networks (DNNs) with high-fidelity direct optimization was proposed in [156] for AGV navigation, where the motion planning problem was formulated as a nonlinear optimal control task. In this hybrid scheme, the DNN approximates a state–action mapping from the robot state to control commands. The network was trained offline to approximate this state–action relationship, enabling rapid, robust real-time execution. An ANN-based dynamic obstacle avoidance system was also introduced in [157], updating data online at every step, enabling rapid maintenance of collision-free paths. A long short-term memory (LSTM) neural network achieves optimal trajectory planning and navigation for mobile robots, further extending trajectory prediction capabilities. Research in [158] demonstrates optimal end-to-end navigation under dynamic obstacle conditions using LiDAR-assisted sensing. Recent studies emphasize the importance of real-time deployment in unstructured environments. In [159], a neural decision-making strategy categorized motion behaviors based on environmental inputs, enabling robots to adapt in unknown, cluttered spaces. Adaptive control frameworks combining ANN with FL have also been explored. For instance, ref. [160] proposed an ANN–FL hybrid with a monitoring layer to enhance localization, trajectory tracking, and real-time safety, thereby improving the mobility, stability, and robustness of AMR navigation.
Recent ANN-based contributions have focused on enhancing efficiency and navigation driven by perception. A lightweight neural planner tailored for resource-constrained platforms was introduced in [161], where a dual-input CNN generator with a hybrid sampling strategy reduced model size by nearly an order of magnitude and computation time fivefold compared to classical networks, while maintaining competitive accuracy in TurtleBot experiments. Perception-integrated navigation was advanced in [162] by deploying ResNet 18 and YOLOv3 models within a Jetson Nano-based AMR framework, achieving 98.5% recognition accuracy in dynamic, obstacle-rich environments. This approach was validated in both simulations and real-world trials. ANN designs have also been extended into reinforcement learning [163]. This approach embeds convolutional layers into a deep Q-network (CNN-DQN) with exponential decay and B-spline smoothing, outperforming standard DQN in convergence, path smoothness, and computational efficiency across diverse maps. Collectively, these works demonstrate a shift toward ANN frameworks that are not only more efficient but also tightly integrated with perception and reinforcement learning, ensuring real-time adaptability in uncertain environments.
A second line of research has emphasized hybrid and bio-inspired ANN models for multi-robot coordination and challenging road conditions. A dual-layer architecture integrating an improved Glasius bio-inspired NN with a reward-augmented DWA and dynamic priority rules was developed in [164], reducing path length by nearly 20% and cutting the number of turns by over 90% compared to GA, ACO, and BINN. ANN has also been combined with Invasive Weed Optimization (Neuro-IWO), producing optimized routes with less than 5% deviation while yielding shorter, smoother paths than standalone NN-based planners [165]. A Hybrid Symmetric Bio-inspired Neural Network (HSBNN) was proposed in [166], linking neuron activity to road conditions such as flatness, adhesion, and slope, and integrating an improved GA to eliminate redundant segments. Corridor experiments conducted on heterogeneous terrain demonstrated reductions of 11.4% in path length and ten fewer turns compared to GA, AGA, and BINN-GA. These studies illustrate the growing role of ANN as both a perception-rich and hybrid reasoning tool, increasingly fused with optimization and bio-inspired methods to deliver scalable, adaptive, and multi-robot navigation solutions for AMRs in complex static and dynamic environments. Table 7 presents a comprehensive overview of significant ANN-based contributions to AMR path planning.

3.3.3. Neuro-Fuzzy

Neuro-fuzzy algorithms are knowledge-based systems that combine the learning capability of neural networks (NNs) with the interpretability of fuzzy logic (FL) controllers, thereby leveraging the strengths of both paradigms [167]. In this survey, the integration of fuzzy logic and neural networks is considered a distinct entity, as this hybridization mitigates the limitations of fuzzy systems in precisely defining membership functions and those of neural networks in generalization [168,169]. Neuro-fuzzy controllers have been extensively applied to AMR path planning, consistently demonstrating improved adaptability, robustness, and real-time decision-making compared to standalone approaches.
Early contributions highlighted the use of neuro-fuzzy systems for autonomous navigation in cluttered spaces. For example, ref. [170] proposed a Mamdani-type fuzzy controller combined with reinforcement learning (RL)–based NNs to navigate a car-like mobile robot without collisions. A multiple adaptive neuro-fuzzy inference system was developed in [171], using infrared sensors to regulate velocity and achieve real-time navigation in the presence of static obstacles, dynamic barriers, and moving targets, thereby reducing path length and improving robustness. Similarly, adaptive neuro-fuzzy inference systems have been combined with GPS for path control [172], and tested in static obstacle environments, where they provided smoother paths than standalone fuzzy controllers [173]. The approach was later extended with ultrasonic sensing to dynamically adjust steering when obstacles were detected [174].
More recent works emphasize hybridization and complex navigation scenarios. An adaptive neuro-fuzzy system integrated with GPS and heading sensors was proposed in [175] for crowded, unknown environments, where ANFIS provided local control, while external sensors provided global guidance. The benefits of balancing deliberative and reactive navigation strategies through neuro-fuzzy reasoning were further explored in [176], highlighting its potential for real-time corrective decision-making. Multi-robot applications have also been investigated. A hybrid neuro-fuzzy system optimized with PSO was applied in [177] to improve navigation accuracy in terms of path length and travel duration for multiple robots in crowded settings. More recently, ref. [178] proposed an adaptive Neuro-Fuzzy Inference System (ANFIS) for dynamic environments, demonstrating reduced computational complexity and improved path smoothness compared to conventional controllers.
Recent neuro-fuzzy (NF) contributions have aimed to enhance interpretability while improving adaptability for real-time AMR navigation. In [179], an FCM-MANFIS framework was introduced by integrating Fuzzy C-Means clustering with ANFIS to reduce rule complexity and dimensionality. The method optimized wheel velocities, resulting in shorter paths and faster navigation in both V-REP simulations and Khepera-IV experiments, with a deviation of less than 9% between simulated and real-world results. To further streamline NF control, ref. [180] applied subtractive clustering and fuzzy set merging before ANFIS training, reducing the rule base to only five rules and achieving an RMSE of 0.0442, outperforming GPS-ANFIS, CS-ANFIS, and IWO-ANFIS benchmarks. Similarly, ref. [181] advanced NF controller design by implementing type-1, interval type-2, and type-3 ANFIS models trained with Teaching–Learning-Based Optimization (TLBO). Experiments in Gazebo with a Turtlebot demonstrated that type-3 fuzzy ANFIS produced superior convergence, smoother trajectories, and safer navigation compared to type-1, type-2, and existing baselines. Collectively, these studies underscore NF’s growing role as a compact, learning-enhanced alternative to conventional fuzzy logic in uncertain, cluttered indoor settings.
Hybrid NF frameworks have also gained momentum for complex outdoor environments. A hybrid A*–DWA–ANFIS PID approach was proposed in [182], where an improved A* reduced grid size and computation time, DWA handled local dynamic obstacle avoidance, and ANFIS-tuned PID achieved faster convergence (0.038 s vs. 0.052–0.075 s) under disturbances. In [166], the Hybrid Symmetric Bio-inspired NN (HSBNN) combined BINN with an improved GA (IGA), linking neuron activity to road conditions such as surface flatness, adhesion, and slope. Experiments in heterogeneous indoor corridors showed reductions of 11.4% in path length and 10 fewer turns compared to GA, AGA, and BINN-GA. In parallel, ref. [183] designed a compact ANFIS controller that regulated wheel velocities using infrared and ultrasonic sensors, requiring only eight rules instead of the conventional 48–54. MATLAB simulations across six cluttered environments yielded collision-free paths that were up to 20% shorter than baseline fuzzy approaches. These hybrid contributions mark NF’s evolution from stand-alone inference engines to optimization-augmented controllers that combine interpretability with adaptability, ensuring robust, efficient, and scalable AMR navigation in both structured and unstructured settings. Overall, neuro-fuzzy methods consistently offer high robustness and improved path quality over pure fuzzy controllers, while recent optimization-augmented designs further enhance optimality at the cost of moderate additional computation. A comprehensive overview of NF-based contributions to AMR path planning, summarizing key trends and outcomes in both static and dynamic environments, is presented in Table 8.

3.3.4. Reinforcement Learning

Reinforcement learning (RL) is a subset of machine learning in which an agent learns an optimal policy by interacting with its environment. The agent maximizes cumulative rewards, receiving feedback in the form of rewards for desired actions and penalties for undesired actions [184]. This feedback reinforces the decisions made at each step, enabling the agent to learn behavior that leads it toward its goal. RL has gained popularity over traditional heuristic methods due to its adaptability, optimality, and robustness. In AMR navigation, RL methods include model-free, model-based, Q-value function-based, policy-based, and actor–critic-based approaches [185].
Formally, RL can be modeled as a Markov Decision Process (MDP), in which the agent interacts with the environment through a set of observed states S, takes actions from a set A that yield rewards R, and transitions probabilistically to new states. The objective is to learn a policy that maximizes the cumulative discounted reward [186]. As illustrated in Figure 14, RL-based navigation is formulated as an agent–environment interaction loop. The robot (agent) observes the environment (e.g., sensor/goal features), selects an action (e.g., linear velocity v and angular velocity ω), and receives a reward that encodes navigation objectives such as collision avoidance, progress toward the goal, and motion smoothness. Learning aims to optimize the policy that maps observations to actions so as to maximize long-term cumulative reward.
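The reward signal described above can be made concrete with a small sketch that combines the three objectives named in the text (collision avoidance, goal progress, motion smoothness). The weights, thresholds, and terminal bonuses below are illustrative assumptions, not values taken from any cited study:

```python
def navigation_reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist,
                      turn_rate, collided, reached,
                      w_progress=1.0, w_smooth=0.1, safe_dist=0.5):
    """Illustrative shaped reward combining common AMR navigation objectives:
    goal progress, motion smoothness, and obstacle clearance."""
    if collided:
        return -100.0                                   # terminal collision penalty
    if reached:
        return 100.0                                    # terminal goal bonus
    reward = w_progress * (prev_dist_to_goal - dist_to_goal)  # progress toward goal
    reward -= w_smooth * abs(turn_rate)                 # penalize jerky turning
    if min_obstacle_dist < safe_dist:                   # soft clearance penalty
        reward -= safe_dist - min_obstacle_dist
    return reward
```

In practice, reward terms like these are tuned per platform; the balance between progress and smoothness weights strongly shapes the learned driving style.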
Several modifications to classical RL have been proposed to enhance performance in AMR path planning. Improvements to Q-learning are widespread. For instance, ref. [187] introduced an enhanced exploration strategy that balances exploration and exploitation, improving execution time, path length, and cost. Similarly, ref. [188] initialized Q-values with prior knowledge and dynamically adjusted the greedy factor, accelerating convergence while improving success rates. In [189], a reinforcement learning path generation method combined a deep Markov model with a motion fine-tuning module, enabling predictive path generation and motion refinement. To mitigate slow convergence in classical Q-learning, artificial potential fields (APF) were integrated into the framework, improving both learning speed and trajectory quality [190]. Beyond Q-learning, other RL paradigms have also been extended. To make AMR path planning more effective, ref. [191] augmented the Deep Deterministic Policy Gradient (DDPG) algorithm with Long Short-Term Memory (LSTM) units, reward function optimization, normalization techniques, and mixed noise modeling. These enhancements improved convergence speed, generalization, and path efficiency. Hybrid RL methods have also emerged, such as [192], which coupled RL with a fuzzy inference system to enable AMRs to balance navigation with energy management, resulting in shorter training times and improved energy efficiency.
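In the spirit of these Q-learning enhancements, the following sketch pairs a standard tabular Q-update with a dynamically decayed greedy factor. The decay schedule, action encoding, and dictionary-based Q-table are illustrative assumptions, not the exact mechanisms of [187,188]:

```python
import random

def q_learning_step(Q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Tabular update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

def select_action(Q, state, episode, eps_start=0.9, eps_end=0.05, decay=0.99):
    """Epsilon-greedy selection with a dynamically decayed greedy factor,
    shifting from exploration in early episodes to exploitation later."""
    eps = max(eps_end, eps_start * decay ** episode)
    if random.random() < eps:
        return random.randrange(len(Q[state]))                      # explore
    return max(range(len(Q[state])), key=lambda a: Q[state][a])     # exploit
```

Initializing `Q` with informative (rather than zero) values, as in [188], would slot into the same loop without further changes.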
Research has further extended RL frameworks for dynamic and multi-robot scenarios. Globally guided RL was applied to navigation amidst other robots and dynamic obstacles [193], while extensions of Proximal Policy Optimization (PPO) demonstrated strong performance in multi-robot coordination by balancing stability and efficiency. For example, ref. [194] combined PPO with transfer learning, and ref. [195] hybridized PPO with deep Q-learning and CNN-based perception. In dynamic environments, ref. [196] applied a real-time MDP-based planning algorithm, while ref. [197] enhanced Dyna-Q with hippocampus-inspired forward prediction to improve exploration–exploitation trade-offs. Collectively, these contributions demonstrate RL’s capacity to deliver adaptive, real-time, and scalable solutions for AMR navigation in increasingly complex environments.
Recent contributions to reinforcement learning (RL) for AMR navigation have primarily focused on overcoming the inefficiency and instability of classical Q-learning. The study in [198] proposed an optimized Q-learning (O-QL) algorithm that initializes Q-tables based on Euclidean distance, combines ϵ-greedy and Boltzmann policies for balanced exploration, and employs Gaussian-based reward shaping with an RMSprop-driven learning-rate adjustment. The method achieved faster convergence and higher cumulative rewards across randomized obstacle maps compared with GA-QL and EnDQN. Extending this direction, ref. [144] introduced a hybrid fuzzy A*–quantum multi-stage Q-learning–APF framework, where fuzzy A* provides adaptive heuristics, APF handles local avoidance, and quantum multi-stage Q-learning accelerates convergence, reducing learning time by about 80% and avoiding trap-induced deadlocks while producing smoother, shorter paths relative to A*, APF, and standard Q-learning. Complementarily, ref. [199] integrated A*, DQN, and DWA in a three-layer navigation system with Bezier smoothing to enforce non-holonomic constraints, achieving a 99.2% success rate in ROS-based experiments and improved robustness to dynamic obstacles over standalone A*, DWA, or DQN. To enhance reliability under perceptual uncertainty, ref. [200] combined Lightweight Learned Image Denoising with Instance Adaptation (LIDIA) and Quantile Regression DQN (QR-DQN), reducing path length by 13% and yielding smoother trajectories in Gazebo compared with A*, Dijkstra, and Hybrid-A*. Finally, ref. [201] presented ARE-QL, which integrates ant-colony pheromone guidance, an adaptive ϵ, and continuous distance-based rewards, thereby shortening path lengths by up to 64% and reducing convergence time by more than 80% compared to classical Q-learning, IQ-FPA, and DRQN.
Collectively, these studies indicate a shift from tabular Q-learning toward hybridized, distributional, and perception-enhanced RL models that improve convergence, adaptability, and real-time performance in dynamic environments. A comprehensive overview of RL-based contributions to AMR path planning is presented in Table 9.
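Two recurring ideas above, distance-informed Q-table initialization (as in O-QL [198]) and Gaussian-based reward shaping, can be sketched as follows. The grid layout, scaling constant, and σ are illustrative assumptions:

```python
import math

def init_q_table(width, height, goal, n_actions=4, scale=1.0):
    """Distance-informed initialization: cells nearer the goal start with
    larger Q-values, biasing early exploration toward the goal region."""
    Q = {}
    for x in range(width):
        for y in range(height):
            d = math.hypot(x - goal[0], y - goal[1])   # Euclidean distance to goal
            Q[(x, y)] = [scale / (1.0 + d)] * n_actions
    return Q

def gaussian_shaped_reward(dist_to_goal, sigma=2.0, peak=1.0):
    """Dense Gaussian-shaped reward: maximal at the goal, decaying smoothly
    with distance, which mitigates the sparse-reward problem."""
    return peak * math.exp(-dist_to_goal ** 2 / (2.0 * sigma ** 2))
```

Both tricks address the same bottleneck: a zero-initialized table with sparse rewards forces the agent to wander until it stumbles on the goal, whereas shaped values give the early policy a useful gradient.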

Generic Formulation and Scope (Planning vs. Avoidance/Exploration)

Most AI-based navigation methods can be written as a learned mapping from observations to decisions. In supervised or imitation learning, a model with parameters θ learns either an action-level policy u_t = f_θ(o_t, g) or a waypoint/trajectory generator W = f_θ(o_{0:t}, g), where o_t denotes the robot’s observation (e.g., local map, lidar/image features, state), and g denotes the goal. In reinforcement learning, navigation is commonly modeled as an MDP (S, A, P, r, γ), and the objective is to learn a policy π_θ(a|s) that maximizes the expected discounted return J(π) = E[Σ_{t=0}^{∞} γ^t r_t]. We emphasize that, in the AMR literature, several RL/DRL works operate primarily as local modules for collision avoidance or exploration in unknown environments and are often combined with classical/global planners in a hierarchical navigation stack, rather than serving as standalone global path planners [11,202]. Deep reinforcement learning extends this framework by using deep neural networks to approximate policies and value functions, enabling scalability to high-dimensional observations [203].
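The discounted-return objective J(π) can be illustrated with a direct computation over one episode’s reward sequence:

```python
def discounted_return(rewards, gamma=0.9):
    """Accumulate G = sum_t gamma**t * r_t over one episode's rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))
```

For example, three consecutive unit rewards under γ = 0.9 yield 1 + 0.9 + 0.81 = 2.71, showing how the discount factor weights near-term outcomes more heavily than distant ones.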

3.3.5. Deep Reinforcement Learning

Deep reinforcement learning (DRL) combines the representation power of deep learning (DL) with the decision-making ability of reinforcement learning (RL) and has emerged as a dominant approach to handle perception and control in autonomous navigation [204,205]. By integrating high-dimensional sensory inputs with sequential decision-making, DRL methods have shown strong potential for addressing trajectory planning in uncertain and dynamic environments [206]. Figure 15 illustrates the general DRL architecture, where perception, policy learning, and reward feedback are tightly integrated to achieve autonomous decision-making. Recent studies have highlighted how DRL methods enhance adaptability, trajectory efficiency, and safety compared to classical approaches. However, challenges remain in terms of training efficiency, reward design, and convergence stability [207].
Early DRL applications demonstrated feasibility in unstructured settings. A DRL-based path planning solution for urban search-and-rescue mobile robots in unfamiliar, obstacle-filled environments was proposed in [208]. The approach integrates frontier-based exploration with DRL, enabling the robot to autonomously navigate uncharted areas even in the presence of obstacles of varying sizes. Subsequent works extended DRL to congested and complex environments. For instance, ref. [209] proposed a novel incremental training strategy whose framework incorporates observation states, a reward function, environmental factors, network characteristics, and other elements. The approach achieved excellent results, overcoming limitations of traditional DRL methods such as sluggish convergence and low efficiency. In [210], improved value-function learning for robust obstacle avoidance was proposed; the learning capability of the DRL agent was enhanced by examining the behavior of AMRs in relation to obstacles, achieving a high level of adaptability. Moreover, a novel DRL-based algorithm was developed for dynamic obstacle avoidance in mobile robots: to help the robot learn from experience and behave more efficiently, a deep deterministic policy gradient network with a separate experience replay mechanism was developed [211]. In turn, ref. [212] presented an end-to-end map-based path-planning technique to enhance AMR decision-making performance in complex environments; notably, it uses a strategy learned with the dueling DQN algorithm to map local probability grids and goal points into speed and steering commands for the agent. The DRL approach has also been extended to safety-critical scenarios: an end-to-end DRL-based navigation technique for lunar rovers with security constraints was introduced to enable secure and effective unmanned exploration [213]. It incorporates state and action spaces, a safety reward function, and other elements to enhance the safety and reliability of navigation.
DRL is a self-directed learning paradigm that enables mobile robots to adapt their strategies through reward-driven interactions with the environment [214]. Still, challenges remain in terms of training stability, convergence, and reward design. To address these issues, recent studies have emphasized the importance of improved reward shaping, hierarchical control, and experience replay mechanisms. For instance, an enhanced DQN approach was proposed to accelerate path search in unknown environments by mitigating sparse-reward issues and enhancing exploration via optimized state-action mapping [215]. Building on this, a hierarchical DRL controller was developed for WMRs navigating congested settings, where vision-based inputs were decomposed into sub-actions and trained via a DQN-based selector, enabling practical obstacle avoidance, though with reduced efficiency in dense-gap scenarios [216]. More advanced frameworks integrate multiple control layers, such as a two-layer DL–DRL controller combined with gated recurrent DNNs for motion planning and an online deadlock-control DRL module for path tracking, further enhanced by noise-prioritized experience replay to accelerate learning [217]. Similarly, a two-stage DRL–dynamic programming hybrid was introduced, employing deep neural networks to capture environmental dynamics and leveraging prioritized replay and target network updates to improve efficiency and data integrity during training [218]. Beyond single-agent navigation, DRL has also been extended to human–robot interaction scenarios. For example, a novel crowd-aware algorithm incorporates pedestrian danger estimation and a virtual robot reference path, along with a priority-based safety mechanism, to generate smoother, safer trajectories in dynamic pedestrian-rich environments [219].
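Experience replay, which several of the works above refine through prioritization, can be sketched in its basic uniform form. The class name, capacity, and transition tuple layout are illustrative choices, not the design of any cited study:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal uniform experience replay (DQN-style); prioritized variants
    additionally weight sampling by TD error instead of drawing uniformly."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are evicted

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(list(self.buffer), batch_size)
        return list(zip(*batch))               # column-wise: states, actions, ...

    def __len__(self):
        return len(self.buffer)
```

Replaying past transitions in shuffled mini-batches breaks the temporal correlation of sequential experience, which is one of the main stabilizers of value-based DRL training.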
Applications of DRL in multi-robot systems have also gained momentum. Decentralized strategies such as the Asynchronous Multi-Critic Twin Delayed Deep Deterministic Policy Gradient (AMC-TD3) algorithm [220] and DRL with improved clustering for cooperative coverage path planning [221] have shown promise in large-scale coordination tasks. Several DRL methods have been validated on physical AMRs, including improved DDQN frameworks [222] and hybrid A*–DRL systems for real-time deployment [223]. More recent contributions explore DRL for heterogeneous and dynamic environments, combining static and dynamic goals with immune-inspired and hybrid learning strategies [224,225,226,227]. Overall, DRL has evolved from proof-of-concept exploration systems toward hybrid, hierarchical, and multi-agent frameworks capable of handling uncertainty, scalability, and real-world deployment challenges.
The most recent contributions (2025) demonstrate how DRL has evolved into more perception-aware, hybridized, and robust frameworks for AMR navigation. A Gated Attention Prioritized Experience Replay Soft Actor-Critic (GAP-SAC) algorithm was proposed that expands the state space using perceptual metrics, integrates a dynamic heuristic reward function, employs prioritized experience replay to improve sample efficiency, and uses a gated attention mechanism to selectively emphasize critical features. Experimental results demonstrated superior robustness, generalization, and faster convergence compared to SAC, TD3, and other SAC variants [128]. Similarly, a dueling double deep Q-network (D3QN) approach was developed for dynamic, unknown environments, and three network variants trained on depth images and orientation cues were evaluated. The best-performing variant achieved 85–90% navigation success in both simulation and real-world crowded settings, underscoring its capacity to generalize beyond laboratory conditions [228]. A hybrid DRL–adaptive neuro-fuzzy inference system (ANFIS) framework was also introduced, combining DRL for sparse-reward challenges, fuzzy inference for local decision-making, and a Tent-based Artificial Hummingbird Algorithm (TAHA) to optimize fuzzy rules. This system reduced path length by 15% and computation time by 25%, and lowered energy consumption relative to DRL, neuro-fuzzy, and SLAM baselines, demonstrating efficiency gains in unstructured environments [229].
Further advances emphasize resilience, multiobjective optimization, and real-world deployment. A dual-layer framework combining Multiobjective Sheep Flock Migration Optimization (MOSFMO) with a DRL-based local planner was proposed, where MOSFMO generates global candidate paths, and DRL refines local decisions under a composite reward structure. A time-oriented deadlock-detection mechanism enables seamless path switching in congested environments. Simulation outcomes reported success rates of 92–95% and on-time arrival rates of up to 99% in dense pedestrian scenarios, outperforming DRL-VO and ARENA [230]. Collectively, these studies highlight a new research trajectory in which DRL is combined with attention mechanisms, perception enhancement, fuzzy reasoning, and multi-objective metaheuristics. The shift points toward DRL systems that are data-efficient, adaptable, and validated in real-world conditions, thereby narrowing the gap between simulation-driven development and practical AMR deployment in dynamic, uncertain, and human-populated environments. Table 10 summarizes notable DRL-based studies on AMR path planning, outlining key patterns and results across both static and dynamic environments.

3.3.6. Foundation Models and LLM/VLM-Enabled High-Level Navigation

A recent trend in robotics is the use of foundation models, such as LLMs and VLMs, as high-level cognitive layers that provide semantic reasoning and grounding for embodied agents: rather than directly mapping observations to actions as classical learning-based planners do, they generate goals, waypoints, or constraints that classical or learning-based path planners then execute. In mobile navigation, these models are typically integrated in a hierarchical pipeline: (i) an LLM interprets a free-form instruction into symbolic subgoals (e.g., landmarks) or intermediate waypoints; (ii) a VLM grounds these concepts in sensory observations or semantic maps; and (iii) a low-level controller (classical planner or DRL policy) executes feasible motions under robot dynamics and safety constraints. LM-Nav is an example that combines a pre-trained LLM (landmark extraction), a VLM (grounding), and a visual navigation model (execution) for instruction-following navigation [231].
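The three-stage pipeline can be sketched as a skeleton in which every function is a hypothetical stub: `llm_extract_landmarks`, `vlm_ground`, and `low_level_plan` stand in for an LLM call, a VLM grounding step, and a low-level planner, respectively, and none of them correspond to a real API:

```python
def plan_with_language(instruction, semantic_map):
    """Hypothetical three-stage pipeline: (i) LLM subgoal extraction,
    (ii) VLM grounding, (iii) low-level planning. All stubs are toy
    stand-ins, not real model or library calls."""
    subgoals = llm_extract_landmarks(instruction)                  # stage (i)
    waypoints = [vlm_ground(lm, semantic_map) for lm in subgoals]  # stage (ii)
    return [low_level_plan(wp) for wp in waypoints]                # stage (iii)

def llm_extract_landmarks(instruction):
    # Toy "LLM": treat capitalized words as landmark subgoals.
    return [w.strip(".,") for w in instruction.split() if w.istitle()]

def vlm_ground(landmark, semantic_map):
    # Toy "VLM": look the landmark up in a pre-built semantic map.
    return semantic_map.get(landmark, (0.0, 0.0))

def low_level_plan(waypoint):
    # Toy planner: a real system would invoke A*, DWA, or a DRL policy here.
    return {"goal": waypoint, "status": "planned"}
```

The value of the decomposition is that each stage can be swapped independently: a stronger LLM, a different grounding model, or a different local planner, without retraining the others.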
Beyond waypoint generation, LLMs have been used for skill sequencing and policy synthesis. SayCan grounds language in robot affordances to select a sequence of executable skills, improving task-level success by coupling language priors with feasibility estimates [232]. Code-as-Policies shows that code-generating LLMs can output reactive policy code that orchestrates perception and control primitives from natural-language commands [233]. Multimodal embodied models such as PaLM-E extend LLMs with visual and state inputs for embodied reasoning [234], while vision–language–action models (e.g., RT-2) directly map vision–language inputs to action tokens, improving generalization through large-scale pretraining [235]. More recently, Gemini Robotics 1.5 has been introduced as an embodied reasoning framework that extends multimodal reasoning to physical interaction and motion planning [236].
While these approaches are still emerging for AMR path planning, they motivate new capabilities such as:
1. Natural-language goal specification;
2. Semantic cost-map construction and constraint generation;
3. Long-horizon task decomposition into navigation subgoals;
4. Improved human–robot interaction.
Key open challenges include grounding reliability, safety verification, computational cost, and robust deployment under distribution shift. Accordingly, foundation-model-based approaches are best viewed as complementary high-level reasoning modules that augment, rather than replace, classical or learning-based path planners within hierarchical AMR navigation stacks.

4. Analysis and Discussion

Based on the research reviewed, AMR path planning strategies can be categorized into four main types: classical, heuristic, metaheuristic, and artificial intelligence (AI)–based approaches. Each group exhibits distinct advantages and limitations. Factors including problem complexity, computational resources, and operational context influence the suitability of each approach. This review synthesizes findings from approximately 230 articles published between 2018 and 2025 and extracts system-level trends, maturity-level insights, and evidence-based gaps that shape current AMR deployment.

4.1. Algorithm Discussion and Analysis

Conventional path-planning techniques, including cell decomposition (CD), roadmap (RD), and artificial potential fields (APF), have been widely implemented in robotics and are effective in structured environments. Their performance, however, frequently deteriorates in intricate or dynamic environments featuring several moving obstacles, and they are typically constrained in scalability for real-time applications [5]. To address these deficiencies, novel families of approaches have been developed. Sampling-based methods (such as PRM and RRT) have been widely used and are often effective in high-dimensional, rapidly changing environments [237], while optimization-based methods can incorporate additional smoothness, feasibility, and cost-aware constraints. Heuristic graph-search algorithms, such as Dijkstra’s and A*, provide faster and more effective results than classical methods, although speed-oriented variants and coarse discretizations may sacrifice strict optimality. Metaheuristic algorithms (e.g., GWO, GA, PSO, ACO) can escape local minima and often produce superior paths in complex search spaces, albeit at increased computational cost. The selection of a path-planning algorithm should strike a balance between computational efficiency, solution optimality, and the complexity of the operational environment.
Metaheuristic algorithms demonstrate greater adaptability and effectiveness than traditional path-planning techniques, particularly when addressing nonlinearities, high-dimensional spaces, and dynamic environments. However, they also present several limitations. A key drawback is that metaheuristics are designed to approximate reasonable solutions efficiently rather than guarantee global optimality, often converging to near-optimal rather than exact solutions. This trade-off between optimality and computational efficiency can be problematic in safety-critical applications that require globally optimal paths. Furthermore, their performance is susceptible to parameter tuning and initial conditions; inappropriate parameter settings can lead to premature convergence, suboptimal paths, or excessive computation time. Another challenge is their computational cost, which increases significantly in large-scale or cluttered environments, limiting their suitability for real-time deployment. Despite these challenges, metaheuristics remain widely adopted due to their adaptability and ability to escape local minima. Current research trends focus on hybridization with deterministic or learning-based methods, adaptive parameter control, and parallelization strategies to enhance efficiency and reliability in complex, dynamic, and real-world scenarios.
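Adaptive parameter control, one of the hybridization trends noted above, can be illustrated with a common PSO scheme: a linearly decreasing inertia weight that shifts the search from global exploration to local exploitation over the run. The schedule endpoints and acceleration coefficients below are illustrative assumptions, not values from any cited study:

```python
import random

def inertia_weight(iteration, max_iters, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight: high values favor global
    exploration early; low values favor local exploitation late."""
    frac = iteration / max(1, max_iters - 1)
    return w_start + (w_end - w_start) * frac

def pso_velocity_update(v, x, pbest, gbest, w, c1=2.0, c2=2.0):
    """Classic PSO velocity update using the (adaptive) inertia weight w."""
    r1, r2 = random.random(), random.random()
    return [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
            for vi, xi, pb, gb in zip(v, x, pbest, gbest)]
```

Because the inertia term scales the particle’s momentum, decaying it over iterations is a simple, widely used answer to the premature-convergence and tuning-sensitivity issues discussed above.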
AI-based techniques, including Fuzzy Logic (FL), Artificial Neural Networks (ANN), Neuro-Fuzzy (NF) systems, Reinforcement Learning (RL), and Deep Reinforcement Learning (DRL), have revolutionized AMR navigation by enabling robots to learn, adapt, and generalize in dynamic and uncertain environments. These approaches leverage various AI techniques, including fuzzy logic control and machine learning, to overcome the limitations of traditional path-planning algorithms and generate optimal or near-optimal solutions in real time. Fuzzy systems excel in interpretability and handling sensor noise but scale poorly with complexity. Neural networks (ANNs) offer strong function approximation capabilities, but they require large, high-quality datasets. Neuro-fuzzy systems combine interpretability with learning capacity, offering a balance but often at the cost of high model complexity. RL and DRL approaches represent the most significant trend in recent years, demonstrating robust adaptability to dynamic obstacles, moving targets, and partially observable settings. While AI-based path-planning approaches offer significant advantages over conventional approaches, they also come with their challenges and limitations. The quality of the solution generated by these approaches can be highly dependent on several factors, including the quality and quantity of available data, the accuracy of the models used, the extensive training requirement, and the selection of appropriate algorithmic parameters. Additionally, these approaches can be computationally intensive, requiring substantial processing power and storage capacity. The transfer of solutions from simulation to real-world environments remains a significant obstacle. Nevertheless, AI-based path-planning methods have achieved notable success in applications such as autonomous driving and warehouse automation. 
Continued research is expected to further enhance the efficiency, accuracy, and robustness of these approaches, thereby expanding their applicability to a broader range of real-world scenarios.
This review provides a comprehensive assessment of metaheuristic and artificial intelligence (AI) reasoning/learning techniques for autonomous mobile robot (AMR) path planning. Table 11 and Table 12 provide a performance-oriented comparison of the surveyed approaches, emphasizing qualitative indicators of optimality, robustness, and efficiency, as well as deployment-relevant capabilities such as handling dynamic obstacles/targets, hybridization, multi-robot applicability, and real-world validation. The distribution of evaluation settings (simulation vs. experiments, static vs. dynamic obstacles, and 2D vs. 3D) is summarized in Figure 16, Figure 17, Figure 18 and Figure 19. Overall, the analysis indicates that metaheuristic and AI-based techniques are more applicable in dynamic, unpredictable contexts where flexibility and robustness are essential.
Figure 16. Ratio of surveyed methods validated via simulation versus physical experiments.
Figure 17. Ratio of surveyed methods validated via static versus dynamic obstacles.
Table 11. Comparative analysis of metaheuristic-based AMR path-planning techniques across identified criteria, including optimality, robustness, and efficiency.
Table 11 compares the surveyed metaheuristic-based planners along the following columns: method family, article, year, kinematic model, dynamic obstacles, dynamic targets, hybrid approach, real experiments, multi-robot operation, 3D search space, and qualitative ratings of optimality, robustness, and efficiency. Entries (2017–2025) are grouped by method family: PSO [27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46]; GA [49,50,51,52,53,54,55,57,59,60,61,62,63,64,65,66,67]; ACO [71,72,73,74,75,76,77,78,79,80,81,83,84,85,86,87,88,89,90]; FA [96,97,98,99,100,101,102,103,104,105,106,107], with hybrids including SMC [104], RRT [105], and CS [106]; GWO [43,110,111,113,114,115,116,117,118], with a DWA hybrid [117]; and other algorithms [119,120,121,122,123,124,125], including HHO+AVOA [123] and SSA+AVOA [125]. Note: Optimality, robustness, and efficiency are qualitative indicators derived from reported performance and design characteristics in the cited studies (High: •••, Medium: ••, Low: •).
Figure 18. Ratio of surveyed methods validated via 2D versus 3D space.
Figure 19. Statistical overview of validation strategies adopted in path planning literature. Simulation-only and combined simulation-experiment studies dominate the literature, while experiment-only validation accounts for 0% of the surveyed works and is therefore not visually discernible in the chart.
Table 12. Comparative assessment of AI-based AMR navigation methods across deployment scope and performance-oriented criteria (optimality, robustness, and efficiency).
Table 12 assesses the surveyed AI-based navigation methods along the same columns as Table 11 (method family, paper, year, kinematic model, dynamic obstacles, dynamic targets, hybrid approach, real experiments, multi-robot operation, 3D search space, optimality, robustness, and efficiency). Entries (2017–2025) are grouped by method family: Fuzzy Logic [130,131,132,133,134,135,136,137,138,139,140,141,142,143,144], with hybrids including DWA [139,140], DCA [142], D*L/Quad3 [143], and A*/APF/Q-learning [144]; Artificial Neural Networks [147,148,149,150,151,152,153,156,157,158,159,160,161,162,163,164,165,166], with hybrids including RL [163], DWA [164], IWO [165], and IGA [166]; Neuro-Fuzzy [166,170,171,172,173,174,175,176,177,178,179,180,181,182,183], with hybrids including GPS [172], fuzzy logic [173], FCM [179], TLBO [181], A*/DWA [182], and BINN/GA [166]; RL [144,189,190,192,193,194,195,196,197,198,199,200,201,238], with hybrids including fuzzy A*/APF [144], A*/DWA [199], and ACO with adaptive ϵ [201]; and Deep RL [128,208,209,210,211,212,213,215,216,217,218,219,220,224,225,227,228,229,230], with hybrids including SAC/PER [128], ANFIS/TAHA [229], and MOSFMO [230]. Note: Optimality, robustness, and efficiency are qualitative indicators derived from reported performance and design characteristics in the cited studies (High: •••, and Medium: ••).
Table 13 complements the algorithm-type taxonomy by emphasizing that planner design and reported performance are strongly scenario-dependent. For example, static global planning primarily rewards path quality (cost, smoothness), whereas dynamic environments are dominated by replanning latency and safety under uncertainty. Likewise, kinematically constrained platforms require trajectory-level feasibility metrics that path-only comparisons may overlook. This problem-dimension view clarifies why no single algorithm family dominates across all regimes: methods should be compared primarily within the same operational regime and requirement set rather than across fundamentally different problem settings.
A total of approximately 230 articles published within the last eight years were systematically analyzed in this study, with the distribution of algorithm categories for dynamic environments illustrated in Figure 20. The findings reveal that both metaheuristics and AI approaches are widely used, though AI-based methods are better suited to dynamic navigation tasks. Despite the validation of significant theoretical work through simulations and offline processing, a critical gap persists between theoretical proposals and practical deployment. As highlighted in Figure 19, the majority of algorithms have been validated solely in simulation environments, while only about 21 % of the surveyed works report real-world experimental validation (see Figure 16). This reliance on simulated benchmarks limits the transferability of results to real-world AMR applications, where uncertainties and dynamic interactions are far more complex. Furthermore, this gap between theory and practice establishes the context for subsequent evaluation challenges discussed below. Addressing this gap by increasing experimental validation could significantly advance the field, leading to more reliable and practical solutions for the deployment of real-world autonomous mobile robots.
Furthermore, the scope of evaluation across the surveyed works remains uneven. Only a limited subset of studies has addressed navigation toward mobile targets (Figure 21), and even fewer have extended their methods to multi-robot formation scenarios, as indicated in Table 11 and Table 12. This highlights a broader research challenge: while algorithmic sophistication continues to advance, the field still lacks sufficiently large-scale, experimentally validated solutions that address mobile targets, cooperative behaviors, and multi-agent coordination in realistic dynamic settings. To drive the field forward, the community must prioritize and develop rigorous, real-world evaluations addressing these gaps. Doing so will unlock novel applications, improve system robustness, and accelerate the translation of research innovations into impactful real-world technology.
To complement the detailed lookup tables (Table 11, Table 12 and Table 14) and reduce reliance on purely tabular presentation, we provide a compact qualitative synthesis in Figure 22 that summarizes the dominant trade-offs across major families. Metaheuristics (GA/PSO) remain attractive for global search and flexible objective design; however, their online adaptability is limited unless they are used in receding-horizon or hybrid settings, and their performance is sensitive to parameter tuning. Fuzzy logic is lightweight and interpretable with fast online decision-making, yet rule design can be labor-intensive, and generalization to diverse environments may be limited without adaptation mechanisms. Once trained, DRL demonstrates strong real-time adaptability in dynamic scenes; however, it requires additional safety constraints for deployment, depends on data/simulation fidelity, and incurs high offline training costs. By combining complementary strengths (local responsiveness and global exploration), hybrid approaches often increase robustness at the expense of greater system complexity and integration effort.

4.2. Key Synthesized Findings and Maturity-Level Insights

Based on the evidence summarized in Table 11, Table 12, and Table 14 and in Figure 16, Figure 17, Figure 18, Figure 19, Figure 20, Figure 21, and Figure 23, we extract the following synthesized findings, which reflect both algorithmic trends and deployment maturity in AMR navigation.
  • F1 (Adoption trend in dynamic settings). AI-based methods appear more frequently in dynamic-environment studies than metaheuristics, reflecting a shift toward perception-driven and reactive navigation policies.
  • F2 (Deployment maturity gap). Despite substantial algorithmic progress, real-world validation remains limited (only about 21% of surveyed works report physical experiments), indicating that the field is still deployment-immature under real sensing noise, latency, and safety constraints.
  • F3 (Evaluation scope remains uneven). The literature strongly emphasizes static obstacles and 2D settings (see Figure 17 and Figure 18), whereas comparatively fewer studies address mobile-target tracking, multi-robot coordination, and 3D navigation. These scenarios expose additional requirements (prediction, communication, and dynamic feasibility) that are not consistently evaluated across benchmarks.
  • F4 (Dominance of hierarchical navigation stacks). Many contemporary systems are best viewed as stacks rather than single planners: a global planner provides route structure, while a local module handles reactive obstacle avoidance and short-horizon feasibility. This is particularly evident in RL/DRL studies, which frequently act as local avoidance/exploration modules rather than standalone global path planners.
  • F5 (Metaheuristics as flexible optimizers, not guaranteed solvers). Metaheuristics remain attractive for flexible objective design (e.g., smoothness, energy, risk) and global search; however, they are typically near-optimal, sensitive to tuning/initialization, and computationally heavy for strict real-time settings unless combined with receding-horizon updates, parallelization, or hybridization.
  • F6 (Learning methods trade training burden for online adaptability). Once trained, learning-based policies (especially DRL) can react quickly in dynamic scenes and under partial observability. Still, performance depends on data/simulation fidelity and usually requires additional safety constraints (e.g., shielding) for reliable deployment.
  • F7 (Hybridization as a maturity bridge). Hybrid pipelines (metaheuristic + deterministic or global planner + learned local policy) increasingly represent a practical compromise: they improve robustness and adaptability by combining complementary strengths, at the cost of greater integration complexity.
These findings suggest that reported algorithmic performance should be interpreted in light of evaluation realism and the system-integration role. We therefore provide a pragmatic selection guide (Table 15) that maps common deployment regimes to suitable planning families.

4.3. Practical Decision Guidance

To translate the above findings into actionable guidance, Table 15 provides a compact selection heuristic linking deployment constraints to suitable planning families. The intent is not to prescribe a single “best” algorithm but to map standard AMR operating regimes to the dominant tool choices reported across the surveyed literature.

4.4. Computational Considerations: Time and Space Complexity

This subsection discusses the time and space complexities of AMR path-planning approaches.

4.4.1. Metaheuristic Approaches

Let P be the population size, N the number of iterations, D the number of waypoints per path, and K the number of obstacles. Let C_eval(D, K) denote the cost of evaluating one candidate path (objective plus constraints). With per-waypoint collision checks against every obstacle,
C_eval(D, K) = O(D·K).
With a precomputed distance transform, this often drops to O(D).
Per iteration, each of the P candidates is (i) evaluated and (ii) updated:
Evaluation: O(P·C_eval(D, K)), Update: O(P·D).
Over N iterations, the total time is
T = O(N·[P·C_eval(D, K) + P·D]) = O(P·N·[C_eval(D, K) + D]).
Under per-waypoint collision checking, C_eval(D, K) = O(D·K), so
T = O(P·N·(D·K + D)) = O(P·N·D·K) (when K ≥ 1).
The memory stores the population and a small per-candidate state:
M = O(P·D).
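The per-iteration cost model above can be made concrete with a minimal, illustrative Python sketch of a generic population-based planner loop. It is not taken from any surveyed work: the cost function, obstacle model, and random perturbation update are placeholder assumptions standing in for a concrete metaheuristic's velocity/crossover rule, chosen only to expose the O(P·N·D·K) evaluation cost and O(P·D) population memory.

```python
import random

def evaluate_path(path, obstacles):
    """Toy cost: path length plus a penalty for every waypoint that
    intersects an obstacle. The nested loop over D waypoints and K
    obstacles is the O(D*K) per-candidate evaluation in the text."""
    cost = 0.0
    for (x, y) in path:                      # D waypoints
        for (ox, oy, r) in obstacles:        # K obstacles
            if (x - ox) ** 2 + (y - oy) ** 2 < r ** 2:
                cost += 100.0                # collision penalty
    return cost + len(path)

def population_planner(P=20, N=50, D=30, K=10, seed=0):
    """Generic population-based planner skeleton (PSO/GA-style).
    Each iteration evaluates P candidates (O(P*D*K)) and updates
    them (O(P*D)), so total time is O(P*N*D*K); the population
    itself needs O(P*D) memory."""
    rng = random.Random(seed)
    obstacles = [(rng.random(), rng.random(), 0.05) for _ in range(K)]
    pop = [[(rng.random(), rng.random()) for _ in range(D)]
           for _ in range(P)]
    best, best_cost = None, float("inf")
    for _ in range(N):
        for i, path in enumerate(pop):
            c = evaluate_path(path, obstacles)       # O(D*K)
            if c < best_cost:
                best, best_cost = path, c
            # O(D) update: a random perturbation stands in for the
            # velocity/crossover operator of a concrete metaheuristic.
            pop[i] = [(x + rng.uniform(-0.01, 0.01),
                       y + rng.uniform(-0.01, 0.01)) for (x, y) in path]
    return best, best_cost
```

Replacing the nested collision loop with a precomputed distance-transform lookup reduces the per-candidate evaluation to O(D), as noted above.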
For ACO, with P ants, V nodes, and E edges, each iteration requires:
Path construction: O(P·D), Pheromone update: O(E), Candidate evaluation: O(P·C_eval(D, K)).
Over N iterations, the time complexity is
T = O(N·[P·(C_eval(D, K) + D) + E]).
With per-waypoint collision checks, this becomes
T = O(N·[P·(D·K + D) + E]).
The dominant memory term is the pheromone structure:
M = O(E) for a roadmap/grid, or O(V²) for a dense graph.

4.4.2. Heuristic-Guided RL and Deep RL

In this part, we report online time and space complexity under the following assumptions: (i) fixed observation dimensionality; and (ii) no on-robot learning or replay buffer.
Let A be the number of discrete actions, |S| the number of discrete states (tabular case), and P_n the total number of learned parameters in the deployed function approximator.
Heuristic-guided RL. Time O(P_n) per control step (a single forward pass), independent of map size for fixed-size observations; space O(P_n).
Deep RL. Time O(P_n) per control step (constant with respect to environment size for fixed observation dimensionality); space O(P_n).
Tabular RL. Time O(A) per step (action selection via argmax_a Q(s, a)); space O(|S|·A) for the Q-table (or O(|S|) if a deterministic policy table is stored). Table 16 summarizes the online time and deployment space complexity of the three methods: both heuristic-guided and deep RL scale linearly with the number of parameters P_n, whereas tabular RL incurs O(A) per-step time and O(|S|·A) space.
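The tabular case can be sketched in a few lines of Python. This is a minimal illustration of the O(A) per-step argmax selection and the O(|S|·A) Q-table (stored here as a sparse dict); the learning-rate and discount values are illustrative defaults, not recommendations from the surveyed works.

```python
def greedy_action(Q, state, num_actions):
    """O(A) per-step action selection: argmax_a Q(s, a).
    Q is a dict keyed by (state, action) pairs, defaulting to 0,
    so memory grows toward O(|S|*A) as state-action pairs are seen."""
    return max(range(num_actions), key=lambda a: Q.get((state, a), 0.0))

def q_learning_step(Q, s, a, r, s_next, num_actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    target = r + gamma * max(Q.get((s_next, b), 0.0)
                             for b in range(num_actions))
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
```

Swapping the table for a neural network turns the per-step cost into the O(P_n) forward pass discussed for heuristic-guided and deep RL.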

5. Opportunities in AMR Path Planning

The rapid evolution of metaheuristic and AI-based approaches opens several promising research directions for advancing AMR path planning:

5.1. Parameter Tuning of Existing Metaheuristic Approaches

Because of their flexibility and strong global search capability, metaheuristic algorithms such as PSO, genetic algorithms, and ACO are widely used for AMR path planning. However, their effectiveness is highly sensitive to parameter settings, which can significantly affect robustness, convergence speed, and solution quality.
A significant limitation of applying these methods in practice is the need for manual parameter tuning, which is often time-consuming and highly problem-dependent. Selecting parameter values, such as the pheromone evaporation rate in ACO, the mutation rate in GA, or the inertia weight in PSO, typically requires extensive trial-and-error. Moreover, these settings rarely transfer well across environments: parameters that perform well in one map or task may degrade performance in another. Poor parameter choices can also lead to undesirable behavior, such as excessive exploration or premature convergence to local optima, reducing both planning accuracy and efficiency. To mitigate these issues, several approaches can reduce or eliminate reliance on manual tuning, including:
  • Utilizing automated methods, including grid search, random search, Bayesian optimization, and AI-based approaches to find the best parameter combinations in a variety of settings.
  • Algorithm performance in a dynamic environment can be improved by creating self-adaptive metaheuristics that adjust parameters in real time, such as mutation rates in GAs or adaptive inertia weights in PSOs.
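As a concrete instance of the self-adaptive idea above, a widely used heuristic is a linearly decaying PSO inertia weight. The sketch below is illustrative only; the bounds w_max and w_min are common defaults in the PSO literature, not values prescribed by any particular surveyed study.

```python
def adaptive_inertia(iteration, max_iterations, w_max=0.9, w_min=0.4):
    """Linearly decaying PSO inertia weight: a large weight early
    favors global exploration, while a small weight late favors
    local exploitation. w_max/w_min are illustrative defaults."""
    return w_max - (w_max - w_min) * iteration / max_iterations
```

The same pattern generalizes to other parameters (e.g., annealing a GA mutation rate), and the schedule itself can be replaced by feedback-driven adaptation based on swarm diversity or stagnation detection.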

5.2. Hybrid Algorithm Development

Hybrid approaches, especially those combining metaheuristics (e.g., PSO, GA, ACO) with AI techniques (e.g., neural networks, fuzzy logic, reinforcement learning), offer a major opportunity to advance AMR path planning. Metaheuristics provide strong global exploration and can reduce the risk of poor local optima, while learning-based components can inject adaptability, context awareness, and improved generalization across environments. When designed systematically, hybrids offer a practical route to reconcile competing requirements such as path quality, real-time feasibility, robustness to uncertainty, and safety.

5.2.1. Principles for Designing Hybrid Planners

Hybridization should be guided by explicit principles rather than ad hoc algorithm mixing. Effective designs typically (i) decompose the planning problem into complementary submodules, (ii) enforce consistent objectives and constraints across modules, (iii) define clear information flow and replanning triggers, and (iv) explicitly manage added complexity and failure modes.

5.2.2. Common Hybridization Patterns

  • Hierarchical decomposition (global–local stack): A global planner produces a coarse route or waypoint sequence, while a local planner/controller performs reactive collision avoidance and enforces short-horizon feasibility. This pattern is particularly effective in dynamic or partially observed environments where frequent replanning is required.
  • Embedded local refinement: A global search stage (e.g., sampling-based or metaheuristic optimization) generates candidate paths, followed by deterministic refinement (e.g., smoothing, constraint projection, or local optimization) to satisfy kinematic/dynamic limits and improve clearance and smoothness.
  • Safety shielding/constraint enforcement: A learning-based or heuristic module proposes actions or waypoints, while an explicit safety layer (e.g., rule-based constraints, constrained optimization, control barrier functions, or a classical fallback planner) filters unsafe outputs and guarantees collision avoidance.
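The safety-shielding pattern in the last bullet reduces to a small wrapper around the action pipeline. The sketch below is a generic illustration under stated assumptions: `is_safe` is a user-supplied predicate (e.g., a forward collision check or a control-barrier condition) and `safe_fallback` is a conservative classical policy; neither is an API from any specific framework.

```python
def shielded_action(proposed, state, safe_fallback, is_safe):
    """Safety-shielding pattern: a learned or heuristic module proposes
    an action; an explicit safety layer checks it and substitutes a
    conservative fallback action whenever the proposal is unsafe."""
    if is_safe(state, proposed):
        return proposed
    return safe_fallback(state)
```

In a deployed stack, the shield sits between the policy and the actuators, so the learned component can be retrained or swapped without weakening the collision-avoidance guarantee.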

5.2.3. Methodology for Hybrid Algorithm Design

A principled hybrid design can be structured as follows:
  • Match the decomposition to the task regime: Assign each module a well-scoped role (global routing, reactive avoidance, exploration, or trajectory feasibility) based on the operating scenario (e.g., static vs. dynamic, known vs. unknown, kinematically constrained vs. unconstrained).
  • Define shared objectives and constraints: Ensure modules optimize compatible criteria (e.g., time/length, clearance, smoothness, energy) and encode hard constraints (collision-free, kinematic/dynamic feasibility) consistently to avoid conflicting behaviors.
  • Specify coupling and information flow: Define what is exchanged (e.g., waypoints, local costmaps, obstacle predictions, feasibility feedback) and how often replanning occurs (periodic vs. event-triggered), including how to resolve disagreements between modules.
  • Manage complexity and failure modes: Hybrids improve capability but increase system complexity; evaluation should include ablations (removing modules), sensitivity to hyperparameters, and systematic failure-case analysis under sensing noise and map errors.
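The coupling step above (periodic vs. event-triggered replanning) can be made concrete with a minimal event-trigger condition for a global-local stack. This is a hypothetical sketch: the deviation threshold and the two trigger conditions are illustrative assumptions, not a design from any surveyed system.

```python
def should_replan(robot_pos, next_waypoint, new_obstacles_on_path,
                  deviation_threshold=0.5):
    """Event-triggered replanning for a global-local stack: request a
    new global route when (i) a newly observed obstacle blocks the
    remaining path, or (ii) local avoidance has pushed the robot too
    far from the planned waypoint. Threshold is illustrative."""
    deviation = ((robot_pos[0] - next_waypoint[0]) ** 2 +
                 (robot_pos[1] - next_waypoint[1]) ** 2) ** 0.5
    return new_obstacles_on_path or deviation > deviation_threshold
```

Event-triggered schemes like this avoid the wasted computation of fixed-period replanning while still bounding how far execution can drift from the global plan.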

5.2.4. Future Opportunities

Promising directions for hybrid AMR planners include:
  • Learning-guided metaheuristics: combining DRL with bio-inspired optimization to bias search toward high-value regions and improve planning in unknown or rapidly changing environments.
  • Layered hybrid architectures: using a metaheuristic (or graph/sampling planner) for global routing, coupled with a learned local policy (e.g., DRL or fuzzy logic) for reactive avoidance and short-horizon feasibility.
  • AutoML-style adaptation: using metaheuristics to tune or evolve hyperparameters of learning-based planners online (or across deployment sites), improving robustness across different layouts and traffic patterns.

5.3. Dynamic and Complex Environments

Many current solutions still struggle with highly dynamic, uncertain, or crowded environments, such as disaster response zones with unpredictable hazards. Promising directions include:
  • Developing real-time adaptive planners that continuously learn from environmental changes using reinforcement learning.
  • Introducing predictive models that estimate the movement of dynamic obstacles (e.g., pedestrians, vehicles) and adjust navigation plans accordingly.

5.4. Multi-Robot Systems and Coordination

Multi-robot systems allow teams of robots to pursue shared goals through cooperative behavior. Despite growing interest, few methods fully address such coordination in real-world settings. Key opportunities include:
  • Decentralized learning-based approaches (e.g., multi-agent reinforcement learning) enable robots to learn cooperative strategies without relying on centralized control.
  • Communication-aware planning, where robots share partial knowledge of the environment or goals to improve collective decision-making.
  • Dynamic role assignment and task sharing, where robots adapt roles (leader, follower, scout) based on changing mission needs or failures.

5.5. 3D Path Planning and Aerial/Underwater Robots

Most surveyed methods focus primarily on 2D environments. However, real-world deployment increasingly demands 3D navigation, particularly for aerial drones, underwater vehicles, and climbing robots. Designing effective 3D collision-avoidance systems remains a significant challenge, as it requires accurate obstacle mapping, real-time environmental perception, and dynamic trajectory planning in complex, often unpredictable conditions, such as air or water. These environments introduce additional variables, like wind currents, fluid dynamics, and varying terrain elevation, that further complicate path planning. Consequently, there is a growing need for advanced algorithms that can make robust, adaptive decisions in three-dimensional, dynamic spaces.

5.6. Real-World Deployment and Validation

While simulation has proven vital, real-world testing remains limited: only about 21% of the surveyed algorithms have been experimentally validated. Bridging this gap opens several opportunities, including:
  • Combining heuristic methods with DRL to create systems that are both interpretable (via heuristics) and adaptive (via learning), enabling safer real-world deployment.
  • Field testing in application-specific settings, such as agriculture, mining, or logistics, to assess algorithm robustness and adaptability under real-world constraints, including limited sensing, noise, and latency.

5.7. Agentic AI for Autonomous Mobile Robots

Agentic AI denotes systems that autonomously set sub-goals, invoke external planners and tools, act, and reflect using memory, closing a loop of plan → act → evaluate → refine under real-world constraints. Contemporary AMR stacks integrate perception, localization, and motion planning. Agentic AI adds a decision layer that (i) decomposes tasks into semantic sub-goals; (ii) selects and configures the appropriate planner/skill at the right time; (iii) critiques outcomes to inform subsequent steps; and (iv) adapts to user intent and evolving environments.
Recent advances in foundation models, particularly large language models (LLMs) and vision–language models (VLMs), provide a key enabling technology for agentic AI in AMR systems. When integrated as high-level cognitive layers, these models support semantic reasoning, natural-language goal interpretation, and long-horizon task decomposition, complementing classical and learning-based path planners. In this context, agentic AI frameworks often leverage foundation models for intent understanding and planning orchestration, while delegating geometric path planning and motion execution to established planners or learned controllers.

Key Research Directions and Opportunities

  • Agentic orchestration of planning tools. Develop task-level agents that decompose goals into sub-goals, select/configure planners (e.g., D*, RRT*), and iteratively refine plans through self-critique. Integrate formal verifiers (collision, kinematics, and aisle/traffic rules) to provide structured feedback for the next iteration in case of failures.
  • Uncertainty- and risk-aware routing. Couple agentic controllers with probabilistic forecasts of dynamic actors and sensing quality; optimize risk-sensitive objectives (e.g., chance constraints, Conditional Value at Risk) rather than focusing solely on time/distance. Explore anytime variants that trade computation for reduced tail risk.
  • Decentralized multi-robot coordination. Use agentic negotiation to allocate right-of-way, form platoons, and de-conflict routes with minimal communication under partial observability. Compare emergent conventions with centralized schedulers across density regimes.
  • Energy-, compute-, and latency-aware scheduling. Co-design agent policies with embedded constraints (CPU/GPU power, thermal limits), prioritizing planner/tool calls that meet real-time deadlines while minimizing energy for long-duration autonomy.
  • Human-in-the-loop intent and semantics. Map high-level human directives (semantic goals, no-go zones, and task priorities) into sub-goals and constraints, enabling interpretable re-planning and operator override when mission objectives change.

6. Open Questions in AMR Path Planning

While research in AMR path planning has made considerable progress, several key questions remain open. These questions highlight areas of uncertainty or debate where further investigation is needed to achieve breakthroughs:
How can we guarantee safety and reliability in unpredictable environments?
Ensuring that an AMR can operate safely in environments with stochastic elements (moving humans, dynamic obstacles, sensor noise, and map incompleteness) remains an open problem. Current approaches often incorporate safety margins or rely on worst-case assumptions, leading to overly conservative behavior. The question is: can we develop planning algorithms that provide formal safety guarantees (perhaps probabilistic) while remaining efficient enough for real-time deployment? This may involve new theories in stochastic trajectory optimization or chance-constrained planning, and it is not yet clear how to strike a balance between performance and absolute safety.
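To illustrate the chance-constrained idea mentioned above, the sketch below checks a single waypoint against a probabilistic safety requirement. It is a deliberately simplified 1-D surrogate under an assumed Gaussian model of robot-obstacle distance; real chance-constrained planners reason over full trajectories and joint uncertainty, and the risk budget epsilon is an illustrative choice.

```python
import math

def collision_chance(mean_dist, sigma, safety_radius):
    """Probability that the true robot-obstacle distance (modeled as
    Gaussian with the given mean and standard deviation) falls below
    the safety radius, computed via the normal CDF."""
    z = (safety_radius - mean_dist) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def satisfies_chance_constraint(mean_dist, sigma, safety_radius,
                                epsilon=0.05):
    """Accept a waypoint only if its collision probability is at most
    the risk budget epsilon (the chance constraint)."""
    return collision_chance(mean_dist, sigma, safety_radius) <= epsilon
```

Framing safety as a risk budget rather than a worst-case margin is what lets such planners avoid the overly conservative behavior noted above while still bounding collision probability.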
How can we close the simulation-to-reality gap?
As shown in Figure 19, only about 21% of the surveyed algorithms have been validated experimentally, with the majority tested only in simulation. The question is how to reliably transfer learned or optimized policies to physical robots operating in uncertain, noisy real-world environments. Techniques such as domain randomization, transfer learning, and standardized benchmarking platforms may help, but no consensus framework has yet been established.
Can reinforcement learning-based planners be made explainable and guaranteed?
RL and DRL methods offer adaptability in dynamic settings, but their decision-making remains opaque and lacks hard safety guarantees. Open questions include: how can we verify that a learned policy will always avoid collisions? How can constraints such as safety zones or non-holonomic dynamics be enforced within an RL framework? Concepts like safe RL or shielded RL (where a classical planner acts as a safety net) are promising. Still, it is unresolved how to scale them to complex, cluttered environments while maintaining efficiency.
How can AMRs handle true 3D environments?
Although most surveyed techniques operate in 2D, real-world deployments increasingly demand navigation in 3D (e.g., drones in warehouses, climbing robots, underwater vehicles). Figure 18 highlights the limited number of algorithms evaluated in 3D search spaces. Open questions include: How can dynamic constraints, such as slopes, fluid dynamics, or variable friction, be accounted for while maintaining real-time performance? Can hybrid DRL-metaheuristic planners naturally extend to these domains?
Despite tremendous advancements, these open questions highlight that AMR path planning remains an interesting topic with many layers of complexity that need further exploration. Addressing these issues necessitates interdisciplinary collaboration among robotics, optimization, machine learning, safety engineering, and ethics. Each inquiry presents a prospective pathway for substantial research contributions, and together they will influence the forthcoming decade of progress in autonomous mobile robotics.

7. Conclusions and Future Prospects

Autonomous mobile robot (AMR) path planning remains a central challenge in robotics, directly influencing navigation efficiency, safety, and adaptability in real-world applications. This review provides a structured and systematic overview of AMR planning approaches spanning classical, heuristic, metaheuristic, and artificial intelligence-based families, with a particular emphasis on recent developments in metaheuristics and cognitive/learning-based methods (fuzzy systems, neural approaches, and RL/DRL). By analyzing their core mechanisms, advantages, limitations, and application scenarios, we highlight the evolution from purely optimization-based planning toward hybrid, perception-driven, and learning-enabled systems; moreover, beyond cataloging the literature, we synthesized cross-cutting insights on deployment maturity, evaluation practices, and practical trade-offs and provided decision-oriented guidance linking standard operating regimes to suitable algorithmic choices.
Our synthesis indicates that metaheuristic algorithms (e.g., PSO, GA, ACO, and GWO) remain valuable for global search and flexible objective design; however, they are sensitive to parameter tuning, can be computationally heavy in cluttered spaces, and typically provide near-optimal solutions rather than guaranteed optimality. Conversely, AI-based methods, especially DRL, offer strong adaptability in dynamic, partially observable environments once trained, but face key barriers, including high offline training costs, nontrivial reward and safety design, limitations in sim-to-real transfer, and the need for reliable safety constraints for deployment. Importantly, many learning-based approaches are most effective when viewed as local modules for obstacle avoidance or exploration within hierarchical navigation stacks, rather than as standalone global planners. Accordingly, hybrid pipelines that combine global structure (graph/search/metaheuristic planning) with responsive local policies (DRL, fuzzy, or adaptive modules) are increasingly emerging as a pragmatic pathway toward robust real-world navigation, albeit with greater system integration complexity.
Several research challenges and open questions remain unresolved. These include (i) reliable sim-to-real transfer and reproducible benchmarking to close the deployment maturity gap; (ii) safety guarantees, interpretability, and verification for learning-enabled navigation; (iii) scalable multi-robot coordination under partial observability and communication limits; (iv) addressing energy efficiency and real-time computation on embedded platforms; and (v) robust 3D navigation for aerial, underwater, and heterogeneous robots. Moreover, the field must also address broader questions of benchmarking, reproducibility, lifelong adaptation, and human-aware navigation in crowded and uncertain spaces. Addressing these questions will require integrating insights from optimization, machine learning, control theory, and human–robot interaction. In addition, emerging trends in foundation models (LLMs/VLMs) suggest new opportunities for high-level semantic navigation, such as natural-language goal specification and waypoint/constraint generation, while classical or learning-based planners handle low-level feasibility and safety.
Overall, this review consolidates the current landscape of AMR path planning, highlights evidence-grounded research gaps, highlights emerging opportunities, and offers practical guidance for method selection under common deployment constraints. Continued progress will likely be driven by hybrid, verifiable, and resource-aware designs that integrate optimization, perception, and learning to deliver safe, efficient, and resilient autonomy in complex real-world environments.

Author Contributions

Conceptualization, M.B.A. and S.E.; methodology, M.B.A. and G.A.; validation, M.B.A. and G.A.; formal analysis, M.B.A. and G.A.; investigation, M.B.A.; resources, S.E. and A.-W.A.S.; data curation, M.B.A.; writing—original draft preparation, M.B.A.; writing—review and editing, M.B.A., G.A., S.E., and A.-W.A.S.; visualization, M.B.A. and G.A.; supervision, G.A., S.E., and A.-W.A.S.; project administration, S.E. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by King Fahd University of Petroleum and Minerals (KFUPM) through the Interdisciplinary Research Center for Smart Mobility and Logistics (IRC-SML), under the project number INML2505.

Data Availability Statement

All data supporting the findings of this study are included within the article or cited in the referenced materials.

Acknowledgments

The authors gratefully acknowledge the support of King Fahd University of Petroleum and Minerals (KFUPM) and the Interdisciplinary Research Center for Smart Mobility and Logistics (IRC-SML).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Robotic Platforms
AMR: Autonomous Mobile Robot
WMR: Wheeled Mobile Robot
AGV: Automated Guided Vehicle
Heuristic and Bio-Inspired Path Planning Methods
PSO: Particle Swarm Optimization
GA: Genetic Algorithm
ACO: Ant Colony Optimization
GWO: Grey Wolf Optimizer
BSO: Bat Swarm Optimization
ABC: Artificial Bee Colony
FA: Firefly Algorithm
WOA: Whale Optimization Algorithm
AVOA: African Vulture Optimization Algorithm
LS: Local Search
Classical Path Planning Methods
APF: Artificial Potential Field
ODA: Obstacle Detection and Avoidance
DWA: Dynamic Window Approach
Fuzzy, Analytic, and Neuro-Fuzzy Techniques
FL: Fuzzy Logic
AHP: Analytic Hierarchy Process
ANN: Artificial Neural Network
BNN: Behavioral Neural Network
GRU: Gated Recurrent Unit
ANFIS: Adaptive Neuro-Fuzzy Inference System
Reinforcement Learning and Deep RL Methods
RL: Reinforcement Learning
DRL: Deep Reinforcement Learning
Q-L: Q-Learning
DQN: Deep Q-Network
DDQN: Double Deep Q-Network
DDPG: Deep Deterministic Policy Gradient
PPO: Proximal Policy Optimization
PER: Prioritized Experience Replay

References

  1. Teja, G.K.; Mohanty, P.K.; Das, S. Review on path planning methods for mobile robot. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2025, 239, 5547–5580. [Google Scholar] [CrossRef]
  2. Loganathan, A.; Ahmad, N.S. A systematic review on recent advances in autonomous mobile robot navigation. Eng. Sci. Technol. Int. J. 2023, 40, 101343. [Google Scholar] [CrossRef]
  3. Sheltami, T.; Ahmed, G.; Ghaleb, M.; Mahmoud, A. UAV Path Planning and Trajectory Optimization: A Comprehensive Survey. Arab. J. Sci. Eng. 2025, 1–41. [Google Scholar] [CrossRef]
  4. Sheltami, T.; Ahmed, G.; Yasar, A.U.H. An Optimization Approach of IoD Deployment for Optimal Coverage Based on Radio Frequency Model. Comput. Model. Eng. Sci. 2024, 139, 2627–2647. [Google Scholar] [CrossRef]
  5. Badamasi Aremu, M.; Kabir, I.K.; Ahmed, G.; El-Ferik, S. Autonomous Mobile Robot Path Planning Techniques—A Review: Classical and Heuristic Techniques. IEEE Access 2025, 13, 117999–118022. [Google Scholar] [CrossRef]
  6. Ugwoke, K.C.; Nnanna, N.A.; Abdullahi, S.E.Y. Simulation-based review of classical, heuristic, and metaheuristic path planning algorithms. Sci. Rep. 2025, 15, 12643. [Google Scholar] [CrossRef]
  7. Liu, L.; Wang, X.; Yang, X.; Liu, H.; Li, J.; Wang, P. Path planning techniques for mobile robots: Review and prospect. Expert Syst. Appl. 2023, 227, 120254. [Google Scholar] [CrossRef]
  8. Ibraheem, I.K.; Ajeil, F.H. Path Planning of an Autonomous Mobile Robot in a Dynamic Environment using Modified Bat Swarm Optimization. arXiv 2018, arXiv:1807.05352. [Google Scholar]
  9. Wahab, M.N.A.; Nefti-Meziani, S.; Atyabi, A. A comparative review on mobile robot path planning: Classical or meta-heuristic methods? Annu. Rev. Control 2020, 50, 233–252. [Google Scholar] [CrossRef]
  10. Shahid, S. Artificial Intelligence in Path Planning for Autonomous Robots: A Review. Metaheuristic Optim. Rev. 2024, 2, 37–47. [Google Scholar] [CrossRef]
  11. Thrun, S. Probabilistic algorithms in robotics. Ai Mag. 2000, 21, 93. [Google Scholar]
  12. LaValle, S.M. Planning Algorithms; Cambridge University Press: Cambridge, UK, 2006. [Google Scholar]
  13. Siegwart, R.; Nourbakhsh, I.R.; Scaramuzza, D. Introduction to Autonomous Mobile Robots; MIT Press: Cambridge, MA, USA, 2011. [Google Scholar]
  14. Dokeroglu, T.; Canturk, D.; Kucukyilmaz, T. A survey on pioneering metaheuristic algorithms between 2019 and 2024. arXiv 2024, arXiv:2501.14769. [Google Scholar]
  15. Lin, S.; Liu, A.; Wang, J.; Kong, X. An intelligence-based hybrid PSO-SA for mobile robot path planning in warehouse. J. Comput. Sci. 2023, 67, 101938. [Google Scholar] [CrossRef]
  16. Lou, T.S.; Yue, Z.P.; Jiao, Y.; He, Z. A hybrid strategy-based GJO algorithm for robot path planning. Expert Syst. Appl. 2023, 238, 121975. [Google Scholar] [CrossRef]
  17. Sehuveret Hernández, D.; García-Muñoz, J.A.; Barranco Gutiérrez, A.I. Evaluation of metaheuristic optimization algorithms applied to path planning. Int. J. Adv. Robot. Syst. 2024, 21, 17298806241285302. [Google Scholar] [CrossRef]
  18. Qin, H.; Shao, S.; Wang, T.; Yu, X.; Jiang, Y.; Cao, Z. Review of Autonomous Path Planning Algorithms for Mobile Robots. Drones 2023, 7, 211. [Google Scholar] [CrossRef]
  19. Xu, S.; Ho, E.S.; Shum, H.P. A hybrid metaheuristic navigation algorithm for robot path rolling planning in an unknown environment. Mechatron. Syst. Control 2019, 47, 216–224. [Google Scholar] [CrossRef]
  20. Garip, Z.; Karayel, D.; Erhan Çimen, M. A study on path planning optimization of mobile robots based on hybrid algorithm. Concurr. Comput. Pract. Exp. 2022, 34, e6721. [Google Scholar] [CrossRef]
  21. Sood, M.; Panchal, V.K. Meta-heuristic techniques for path planning: Recent trends and advancements. Int. J. Intell. Syst. Technol. Appl. 2020, 19, 36–77. [Google Scholar] [CrossRef]
  22. Ahmed, G.; Sheltami, T.; Mahmoud, A.; Yasar, A. IoD swarms collision avoidance via improved particle swarm optimization. Transp. Res. Part A Policy Pract. 2020, 142, 260–278. [Google Scholar] [CrossRef]
  23. Ahmed, G.; Eltayeb, A.; Alyazidi, N.M.; Imran, I.H.; Sheltami, T.; El-Ferik, S. Improved particle swarm optimization for fractional order PID control design in robotic Manipulator system: A performance analysis. Results Eng. 2024, 24, 103089. [Google Scholar] [CrossRef]
  24. Mobarez, E.; Sarhan, A.; Ashry, M. Obstacle avoidance for multi-UAV path planning based on particle swarm optimization. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2021; Volume 1172, p. 012039. [Google Scholar]
25. Ahmed, G.; Sheltami, T.R. A safety system for maximizing operated UAVs capacity under regulation constraints. IEEE Access 2023, 11, 139069–139081. [Google Scholar] [CrossRef]
  26. Aremu, M.B.; Abdel-Nasser, M.; Alyazidi, N.M.; El-Ferik, S. Disturbance Observer-Based Bio-Inspired LQR Optimization for DC Motor Speed Control. IEEE Access 2024, 12, 152418–152429. [Google Scholar] [CrossRef]
  27. Adamu, P.I.; Jegede, J.T.; Okagbue, H.I.; Oguntunde, P.E. Shortest path planning algorithm—A Particle Swarm Optimization (PSO) approach. In Proceedings of the World Congress on Engineering, London, UK, 4–6 July 2018; Volume 1, pp. 4–6. [Google Scholar]
  28. Wang, B.; Li, S.; Guo, J.; Chen, Q. Car-like mobile robot path planning in rough terrain using multi-objective particle swarm optimization algorithm. Neurocomputing 2018, 282, 42–51. [Google Scholar] [CrossRef]
  29. Alam, M.S.; Rafique, M.U.; Khan, M.U. Mobile robot path planning in static environments using particle swarm optimization. arXiv 2020, arXiv:2008.10000. [Google Scholar] [CrossRef]
  30. Meerza, S.I.A.; Islam, M.; Uzzal, M.M. Optimal Path Planning Algorithm for Swarm of Robots Using Particle Swarm Optimization Technique. In Proceedings of the 2018 3rd International Conference on Information Technology, Information System and Electrical Engineering (ICITISEE), Yogyakarta, Indonesia, 13–14 November 2018; IEEE: New York, NY, USA, 2018; pp. 330–334. [Google Scholar]
  31. Lian, J.; Yu, W.; Liu, W. A Chaotic Adaptive Particle Swarm Optimization for Robot Path Planning. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 4751–4756. [Google Scholar] [CrossRef]
  32. Krell, E.; Sheta, A.; Balasubramanian, A.; King, S. Collision-Free Autonomous Robot Navigation in Unknown Environments Utilizing PSO for Path Planning. J. Artif. Intell. Soft Comput. Res. 2019, 9, 267–282. [Google Scholar] [CrossRef]
  33. Cheng, X.; Li, J.; Zheng, C.; Zhang, J.; Zhao, M. An Improved PSO-GWO Algorithm with Chaos and Adaptive Inertial Weight for Robot Path Planning. Front. Neurorobot. 2021, 15, 770361. [Google Scholar] [CrossRef] [PubMed]
  34. Gul, F.; Rahiman, W.; Alhady, S.; Ali, A.; Mir, I.; Jalil, A. Meta-heuristic approach for solving multi-objective path planning for autonomous guided robot using PSO–GWO optimization algorithm with evolutionary programming. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 7873–7890. [Google Scholar] [CrossRef]
  35. Ajeil, F.H.; Ibraheem, I.K.; Azar, A.T.; Humaidi, A.J. Autonomous navigation and obstacle avoidance of an omnidirectional mobile robot using swarm optimization and sensors deployment. Int. J. Adv. Robot. Syst. 2020, 17, 1729881420929498. [Google Scholar] [CrossRef]
  36. Lu, J.; Zhang, Z. An Improved Simulated Annealing Particle Swarm Optimization Algorithm for Path Planning of Mobile Robots Using Mutation Particles. Wirel. Commun. Mob. Comput. 2021, 2021, 2374712. [Google Scholar] [CrossRef]
  37. Mohanty, P.K.; Dewang, H.S. A smart path planner for wheeled mobile robots using adaptive particle swarm optimization. J. Braz. Soc. Mech. Sci. Eng. 2021, 43, 101. [Google Scholar] [CrossRef]
  38. Abaas, T.F.; Shabeeb, A.H. Obstacle Avoidance and Path Planning of a Wheeled Mobile Robot Using Hybrid Algorithm. Eng. Technol. J. 2022, 40, 1659–1670. [Google Scholar] [CrossRef]
  39. Abed, B.M.; Jasim, W.M. Multi-objective optimization path planning with moving target. IAES Int. J. Artif. Intell. 2022, 11, 1184–1196. [Google Scholar] [CrossRef]
  40. Jia, L.; Li, J.; Ni, H.; Zhang, D. Autonomous mobile robot global path planning: A prior information-based particle swarm optimization approach. Control Theory Technol. 2023, 21, 173–189. [Google Scholar] [CrossRef]
  41. Xin, J.; Li, Z.; Zhang, Y.; Li, N. Efficient real-time path planning with self-evolving particle swarm optimization in dynamic scenarios. Unmanned Syst. 2024, 12, 215–226. [Google Scholar] [CrossRef]
  42. Xin, J.; Kim, J.; Chu, S.; Li, N. OkayPlan: Obstacle Kinematics Augmented Dynamic real-time path Planning via particle swarm optimization. Ocean Eng. 2024, 303, 117841. [Google Scholar] [CrossRef]
  43. Promkaew, N.; Thammawiset, S.; Srisan, P.; Sanitchon, P.; Tummawai, T.; Sukpancharoen, S. Development of metaheuristic algorithms for efficient path planning of autonomous mobile robots in indoor environments. Results Eng. 2024, 22, 102280. [Google Scholar] [CrossRef]
  44. Menebröker, F.; Stadtler, J.; Mohamed, M. Mobile Robot Path Planning Under Kinematic Constraints by Metaheuristic B-Spline Optimization. In Proceedings of the 2025 11th International Conference on Automation, Robotics, and Applications (ICARA), Zagreb, Croatia, 12–14 February 2025; IEEE: New York, NY, USA, 2025; pp. 224–229. [Google Scholar]
  45. Mohaghegh, M.; Jafarpourdavatgar, H.; Saeedinia, S.A. New design of smooth PSO-IPF navigator with kinematic constraints. IEEE Access 2024, 12, 175108–175121. [Google Scholar] [CrossRef]
46. Zhuang, H.; Jiang, C. RRT* path planning method for mobile robot based on particle swarm optimization and rotation angle constraint. In Proceedings of the International Conference on Optoelectronic Information and Optical Engineering (OIOE2024), Kunming, China, 8–10 March 2024; SPIE: Philadelphia, PA, USA, 2025; Volume 13513, pp. 593–602. [Google Scholar]
  47. Patle, B.; Pandey, A.; Parhi, D.; Jagadeesh, A. A review: On path planning strategies for navigation of mobile robot. Def. Technol. 2019, 15, 582–606. [Google Scholar] [CrossRef]
  48. Ahmed, G.; Sheltami, T.; Ghaleb, M.; Hamdan, M.; Mahmoud, A.; Yasar, A. Energy-efficient internet of drones path-planning study using meta-heuristic algorithms. Appl. Sci. 2024, 14, 2418. [Google Scholar] [CrossRef]
  49. Wang, Y.; Zhou, H.; Wang, Y. Mobile robot dynamic path planning based on improved genetic algorithm. AIP Conf. Proc. 2017, 1864, 020046. [Google Scholar] [CrossRef]
  50. Lamini, C.; Benhlima, S.; Elbekri, A. Genetic algorithm based approach for autonomous mobile robot path planning. Procedia Comput. Sci. 2018, 127, 180–189. [Google Scholar] [CrossRef]
  51. Mao, L.; Ji, X.; Qin, F. A robot obstacle avoidance method based on improved genetic algorithm. In Proceedings of the 2018 11th International Conference on Intelligent Computation Technology and Automation (ICICTA), Changsha, China, 22–23 September 2018; IEEE: New York, NY, USA, 2018; pp. 327–331. [Google Scholar]
  52. Nazarahari, M.; Khanmirza, E.; Doostie, S. Multi-objective multi-robot path planning in continuous environment using an enhanced genetic algorithm. Expert Syst. Appl. 2019, 115, 106–120. [Google Scholar] [CrossRef]
  53. Li, Y.; Huang, Z.; Xie, Y. Path planning of mobile robot based on improved genetic algorithm. In Proceedings of the 2020 3rd International Conference on Electron Device and Mechanical Engineering (ICEDME), Suzhou, China, 1–3 May 2020; IEEE: New York, NY, USA, 2020; pp. 691–695. [Google Scholar]
  54. Li, Y.; Dong, D.; Guo, X. Mobile robot path planning based on improved genetic algorithm with A-star heuristic method. In Proceedings of the 2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Online, 11–13 December 2020; IEEE: New York, NY, USA, 2020; Volume 9, pp. 1306–1311. [Google Scholar]
  55. Wang, M. Real-time path optimization of mobile robots based on improved genetic algorithm. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 2021, 235, 646–651. [Google Scholar] [CrossRef]
  56. Zhang, Z.; Lu, R.; Zhao, M.; Luan, S.; Bu, M. Robot path planning based on genetic algorithm with hybrid initialization method. J. Intell. Fuzzy Syst. 2021, 42, 2041–2056. [Google Scholar] [CrossRef]
  57. Li, K.; Hu, Q.; Liu, J. Path planning of mobile robot based on improved multiobjective genetic algorithm. Wirel. Commun. Mob. Comput. 2021, 2021, 8836615. [Google Scholar] [CrossRef]
  58. Suresh, K.; Venkatesan, R.; Venugopal, S. Mobile robot path planning using multi-objective genetic algorithm in industrial automation. Soft Comput. 2022, 26, 7387–7400. [Google Scholar] [CrossRef]
  59. Huang, F.; Fu, H.; Chen, J.; Wang, X. Mobile robot path planning based on improved genetic algorithm. In Proceedings of the 2021 4th World Conference on Mechanical Engineering and Intelligent Manufacturing (WCMEIM), Wuhan, China, 29 November–1 December 2021; pp. 378–383. [Google Scholar] [CrossRef]
  60. Shi, K.; Huang, L.; Jiang, D.; Sun, Y.; Tong, X.; Xie, Y.; Fang, Z. Path planning optimization of intelligent vehicle based on improved genetic and ant colony hybrid algorithm. Front. Bioeng. Biotechnol. 2022, 10, 905983. [Google Scholar] [CrossRef]
  61. Rahmaniar, W.; Rakhmania, A.E. Mobile Robot Path Planning in a Trajectory with Multiple Obstacles Using Genetic Algorithms. J. Robot. Control 2022, 3, 1–7. [Google Scholar] [CrossRef]
  62. Ab Wahab, M.N.; Nazir, A.; Khalil, A.; Ho, W.J.; Akbar, M.F.; Noor, M.H.M.; Mohamed, A.S.A. Improved genetic algorithm for mobile robot path planning in static environments. Expert Syst. Appl. 2024, 249, 123762. [Google Scholar] [CrossRef]
  63. Rajendran, P.A.; Othman, M. A Comparative Study on Ant-Colony Algorithm and Genetic Algorithm for Mobile Robot Planning. In Proceedings of the International Conference on Soft Computing and Data Mining, Putrajaya, Malaysia, 21–22 August 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 286–295. [Google Scholar]
  64. Zhu, J.; Pan, D. Improved Genetic Algorithm for Solving Robot Path Planning Based on Grid Maps. Mathematics 2024, 12, 4017. [Google Scholar] [CrossRef]
  65. Mankudiyil, R.; Dornberger, R.; Hanne, T. Improved Genetic Algorithm in a Static Environment for the Robotic Path Planning Problem. In Proceedings of the International Conference on Data Science and Applications (ICDSA 2023), Jaipur, India, 14–15 July 2023; Springer: Berlin/Heidelberg, Germany, 2024; Volume 819, pp. 217–230. [Google Scholar]
  66. Zhang, Z.; Yang, H.; Bai, X.; Zhang, S.; Xu, C. The Path Planning of Mobile Robots Based on an Improved Genetic Algorithm. Appl. Sci. 2025, 15, 3700. [Google Scholar] [CrossRef]
  67. Balza, M.; Goldbarg, M.A.; Silva, S.N.; Silva, L.M.; Fernandes, M.A. A Real-Time Safe Navigation Proposal for Mobile Robots in Unknown Environments Using Meta-Heuristics. IEEE Access 2025, 13, 23987–24013. [Google Scholar] [CrossRef]
  68. Dorigo, M.; Gambardella, L.M. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Trans. Evol. Comput. 1997, 1, 53–66. [Google Scholar] [CrossRef]
  69. Yang, Y.; Xiong, X.; Yan, Y. UAV Formation Trajectory Planning Algorithms: A Review. Drones 2023, 7, 62. [Google Scholar] [CrossRef]
  70. Zhang, H.Y.; Lin, W.M.; Chen, A.X. Path planning for the mobile robot: A review. Symmetry 2018, 10, 450. [Google Scholar] [CrossRef]
  71. Ajeil, F.H.; Ibraheem, I.K.; Azar, A.T.; Humaidi, A.J. Grid-based mobile robot path planning using aging-based ant colony optimization algorithm in static and dynamic environments. Sensors 2020, 20, 1880. [Google Scholar] [CrossRef]
  72. Akka, K.; Khaber, F. Mobile robot path planning using an improved ant colony optimization. Int. J. Adv. Robot. Syst. 2018, 15, 1729881418774673. [Google Scholar] [CrossRef]
  73. Chen, G.; Liu, J. Mobile robot path planning using ant colony algorithm and improved potential field method. Comput. Intell. Neurosci. 2019, 2019, 1932812. [Google Scholar] [CrossRef]
  74. Liu, Y.; Ma, J.; Zang, S.; Min, Y. Dynamic path planning of mobile robot based on improved ant colony optimization algorithm. In Proceedings of the 2019 8th International Conference on Networks, Communication and Computing, Luoyang, China, 13–15 December 2019; pp. 248–252. [Google Scholar]
  75. Song, Q.; Li, S.; Yang, J.; Bai, Q.; Hu, J.; Zhang, X.; Zhang, A. Intelligent Optimization Algorithm-Based Path Planning for a Mobile Robot. Comput. Intell. Neurosci. 2021, 2021, 8025730. [Google Scholar] [CrossRef]
  76. Luo, Q.; Wang, H.; Zheng, Y.; He, J. Research on path planning of mobile robot based on improved ant colony algorithm. Neural Comput. Appl. 2020, 32, 1555–1566. [Google Scholar] [CrossRef]
  77. Sangeetha, V.; Krishankumar, R.; Ravichandran, K.; Kar, S. Energy-efficient green ant colony optimization for path planning in dynamic 3D environments. Soft Comput. 2021, 25, 4749–4769. [Google Scholar] [CrossRef]
  78. Miao, C.; Chen, G.; Yan, C.; Wu, Y. Path planning optimization of indoor mobile robot based on adaptive ant colony algorithm. Comput. Ind. Eng. 2021, 156, 107230. [Google Scholar] [CrossRef]
  79. Hou, W.; Xiong, Z.; Wang, C.; Chen, H. Enhanced ant colony algorithm with communication mechanism for mobile robot path planning. Robot. Auton. Syst. 2022, 148, 103949. [Google Scholar] [CrossRef]
  80. Gong, C.; Yang, Y.; Yuan, L.; Wang, J. An improved ant colony algorithm for integrating global path planning and local obstacle avoidance for mobile robot in dynamic environment. Math. Biosci. Eng. 2022, 19, 12405–12426. [Google Scholar] [CrossRef] [PubMed]
  81. Yang, L.; Fu, L.; Li, P.; Mao, J.; Guo, N. An Effective Dynamic Path Planning Approach for Mobile Robots Based on Ant Colony Fusion Dynamic Windows. Machines 2022, 10, 50. [Google Scholar] [CrossRef]
  82. Wang, Q.; Li, J.; Yang, L.; Yang, Z.; Li, P.; Xia, G. Distributed Multi-Mobile Robot Path Planning and Obstacle Avoidance Based on ACO–DWA in Unknown Complex Terrain. Electronics 2022, 11, 2144. [Google Scholar] [CrossRef]
  83. Huang, H.; Tan, G.; Jiang, L. Robot Path Planning Using Improved Ant Colony Algorithm in the Environment of Internet of Things. J. Robot. 2022, 2022, 1739884. [Google Scholar] [CrossRef]
  84. Gajendra, K.; Thivagar, K.; Karna, V.V.R. Path Planning and Trajectory Control of Autonomous Robot Using Metaheuristic Algorithms. In Proceedings of the International Conference on Algorithms and Computational Theory for Engineering Applications, Online, 2–3 February 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 285–292. [Google Scholar]
  85. Ma, J.; Liu, Q.; Yang, Z.; Wang, B. Improved Trimming Ant Colony Optimization Algorithm for Mobile Robot Path Planning. Algorithms 2025, 18, 240. [Google Scholar] [CrossRef]
  86. Si, J.; Bao, X. A novel parallel ant colony optimization algorithm for mobile robot path planning. Math. Biosci. Eng. 2024, 21, 2568–2586. [Google Scholar] [CrossRef]
  87. Li, P.; Wei, L.; Wu, D. An Intelligently Enhanced Ant Colony Optimization Algorithm for Global Path Planning of Mobile Robots in Engineering Applications. Sensors 2025, 25, 1326. [Google Scholar] [CrossRef] [PubMed]
  88. Liu, J.; Qian, Y.; Zhang, W.; Ji, M.; Xv, Q.; Song, H. High-safety path optimization for mobile robots using an improved ant colony algorithm with integrated repulsive field rules. Robot. Auton. Syst. 2025, 190, 104998. [Google Scholar] [CrossRef]
  89. Nor Azmi, S.N.L.K.; Rafique, M.; Anwar Apandi, N.I.; Md Noar, N.A.Z. Investigation of Autonomous Mobile Robot Path Planning with Edge Cloud Based on Ant Colony Optimization. In Proceedings of the Smart and Sustainable Industrial Ecosystem Conference, Kuala Lumpur, Malaysia, 5–6 August 2025; Springer Nature: Singapore, 2025; pp. 51–56. [Google Scholar]
  90. Li, Q.; Li, Q.; Cui, B. Enhanced Ant Colony Algorithm Based on Islands for Mobile Robot Path Planning. Appl. Sci. 2025, 15, 7023. [Google Scholar] [CrossRef]
  91. Yang, X.S.; He, X. Firefly algorithm: Recent advances and applications. arXiv 2013, arXiv:1308.3898. [Google Scholar] [CrossRef]
  92. Banerjee, A.; Singh, D.; Sahana, S.; Nath, I. Chapter 3—Impacts of metaheuristic and swarm intelligence approach in optimization. In Cognitive Big Data Intelligence with a Metaheuristic Approach; Mishra, S., Tripathy, H.K., Mallick, P.K., Sangaiah, A.K., Chae, G.S., Eds.; Cognitive Data Science in Sustainable Computing; Academic Press: Cambridge, MA, USA, 2022; pp. 71–99. [Google Scholar] [CrossRef]
  93. Li, J.; Wei, X.; Li, B.; Zeng, Z. A survey on firefly algorithms. Neurocomputing 2022, 500, 662–678. [Google Scholar] [CrossRef]
  94. Nayak, J.; Naik, B.; Dinesh, P.; Vakula, K.; Dash, P.B. Firefly algorithm in biomedical and health care: Advances, issues and challenges. SN Comput. Sci. 2020, 1, 311. [Google Scholar] [CrossRef]
  95. Karur, K.; Sharma, N.; Dharmatti, C.; Siegel, J.E. A survey of path planning algorithms for mobile robots. Vehicles 2021, 3, 448–468. [Google Scholar] [CrossRef]
  96. Duan, P.; Li, J.; Sang, H.; Han, Y.; Sun, Q. A Developed Firefly Algorithm for Multi-Objective Path Planning Optimization Problem. In Proceedings of the 2018 IEEE 8th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Tianjin, China, 19–23 July 2018; IEEE: New York, NY, USA, 2018; pp. 1393–1397. [Google Scholar]
  97. Patle, B.; Pandey, A.; Jagadeesh, A.; Parhi, D.R. Path planning in uncertain environment by using firefly algorithm. Def. Technol. 2018, 14, 691–701. [Google Scholar] [CrossRef]
  98. Fu, H.; Liu, X. A Path Planning Method for Mobile Robots Based on Fuzzy Firefly Algorithms. Recent Adv. Comput. Sci. Commun. 2021, 14, 3040–3045. [Google Scholar] [CrossRef]
  99. Abbas, N.A.F. Mobile Robot Path Planning Optimization Based on Integration of Firefly Algorithm and Quadratic Polynomial Equation. In Proceedings of the International Conference on Intelligent Systems & Networks, Hanoi, Vietnam, 9 March 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 538–547. [Google Scholar]
  100. Zhang, T.W.; Xu, G.H.; Zhan, X.S.; Han, T. A new hybrid algorithm for path planning of mobile robot. J. Supercomput. 2022, 78, 4158–4181. [Google Scholar] [CrossRef]
  101. Patle, B.; Pagar, N.; Parhi, D.; Sanap, S. Hybrid FA-GA Controller for Path Planning of Mobile Robot. In Proceedings of the 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP), Hyderabad, India, 21–23 July 2022; IEEE: New York, NY, USA, 2022; pp. 1–6. [Google Scholar]
  102. Ab Wahab, M.N.; Nazir, A.; Khalil, A.; Bhatt, B.; Noor, M.H.M.; Akbar, M.F.; Mohamed, A.S.A. Optimised path planning using Enhanced Firefly Algorithm for a mobile robot. PLoS ONE 2024, 19, e0308264. [Google Scholar] [CrossRef] [PubMed]
  103. Patle, B.; Patel, A.J.; Kashyap, S.K. Self-Directed Mobile Robot Navigation Based on Functional Firefly Algorithm and Choice Function. Eng 2023, 4, 2656–2681. [Google Scholar] [CrossRef]
  104. Achouri, M.; Zennir, Y. Path planning and tracking of wheeled mobile robot: Using firefly algorithm and kinematic controller combined with sliding mode control. J. Braz. Soc. Mech. Sci. Eng. 2024, 46, 228. [Google Scholar] [CrossRef]
  105. Muhsen, D.K.; Raheem, F.A.; Yusof, Y.; Sadiq, A.T.; Al Alawy, F. Improved Rapidly-Exploring Random Tree using Firefly Algorithm for Robot Path Planning. J. Soft Comput. Comput. Appl. 2024, 1, 1. [Google Scholar] [CrossRef]
  106. Sood, M.; Panchal, V.K. Optimal path planning with hybrid firefly algorithm and cuckoo search optimisation. Int. J. Adv. Intell. Paradig. 2024, 27, 223–248. [Google Scholar] [CrossRef]
  107. Tian, T.; Liang, Z.; Wei, Y.; Luo, Q.; Zhou, Y. Hybrid whale optimization with a firefly algorithm for function optimization and mobile robot path planning. Biomimetics 2024, 9, 39. [Google Scholar] [CrossRef]
  108. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]
  109. Wang, J.S.; Li, S.X. An improved grey wolf optimizer based on differential evolution and elimination mechanism. Sci. Rep. 2019, 9, 7181. [Google Scholar] [CrossRef]
  110. Qu, C.; Gai, W.; Zhang, J.; Zhong, M. A novel hybrid grey wolf optimizer algorithm for unmanned aerial vehicle (UAV) path planning. Knowl.-Based Syst. 2020, 194, 105530. [Google Scholar] [CrossRef]
  111. Jamshidi, V.; Nekoukar, V.; Refan, M.H. Real time UAV path planning by parallel grey wolf optimization with align coefficient on CAN bus. Clust. Comput. 2021, 24, 2495–2509. [Google Scholar] [CrossRef]
  112. Seyyedabbasi, A.; Kiani, F. I-GWO and Ex-GWO: Improved algorithms of the Grey Wolf Optimizer to solve global optimization problems. Eng. Comput. 2021, 37, 509–532. [Google Scholar] [CrossRef]
  113. Kiani, F.; Seyyedabbasi, A.; Nematzadeh, S.; Candan, F.; Çevik, T.; Anka, F.A.; Randazzo, G.; Lanza, S.; Muzirafuti, A. Adaptive metaheuristic-based methods for autonomous robot path planning: Sustainable agricultural applications. Appl. Sci. 2022, 12, 943. [Google Scholar] [CrossRef]
  114. Zhou, M.; Wang, Z.; Wang, J.; Dong, Z. A Hybrid Path Planning and Formation Control Strategy of Multi-Robots in a Dynamic Environment. J. Adv. Comput. Intell. Intell. Inform. 2022, 26, 342–354. [Google Scholar] [CrossRef]
  115. Li, H.; Lv, T.; Shui, Y.; Zhang, J.; Zhang, H.; Zhao, H.; Ma, S. An improved grey wolf optimizer with weighting functions and its application to unmanned aerial vehicles path planning. Comput. Electr. Eng. 2023, 111, 108893. [Google Scholar] [CrossRef]
  116. Liu, L.; Li, L.; Nian, H.; Lu, Y.; Zhao, H.; Chen, Y. Enhanced grey wolf optimization algorithm for mobile robot path planning. Electronics 2023, 12, 4026. [Google Scholar] [CrossRef]
  117. Chen, W.; Liu, L.; Zhang, L.; Lin, Z.; Chen, J.; He, D. Path Planning of Mobile Robots with an Improved Grey Wolf Optimizer and Dynamic Window Approach. Appl. Sci. 2025, 15, 3999. [Google Scholar] [CrossRef]
  118. Teng, Z.; Dong, Q.; Zhang, Z.; Huang, S.; Zhang, W.; Wang, J.; Li, J.; Chen, X. An Improved Grey Wolf Optimizer Inspired by Advanced Cooperative Predation for UAV Shortest Path Planning. arXiv 2025, arXiv:2506.03663. [Google Scholar] [CrossRef]
  119. Ajeil, F.H.; Ibraheem, I.K.; Humaidi, A.J.; Khan, Z.H. A novel path planning algorithm for mobile robot in dynamic environments using modified bat swarm optimization. J. Eng. 2021, 2021, 37–48. [Google Scholar] [CrossRef]
  120. Abed, M.S.; Lutfy, O.F.; Al-Doori, Q.F. Online Path Planning of Mobile Robots Based on African Vultures Optimization Algorithm in Unknown Environments. J. Eur. Syst. Autom. 2022, 55, 405–412. [Google Scholar] [CrossRef]
  121. Reguii, I.; Hassani, I.; Rekik, C. Mobile Robot Navigation Using Planning Algorithm and Sliding Mode Control in a Cluttered Environment. J. Robot. Control 2022, 3, 166–175. [Google Scholar] [CrossRef]
  122. Abdulsaheb, J.A.; Kadhim, D.J. Robot Path Planning in Unknown Environments with Multi-Objectives Using an Improved COOT Optimization Algorithm. Int. J. Intell. Eng. Syst. 2022, 15, 548–565. [Google Scholar] [CrossRef]
  123. Loganathan, A.; Ahmad, N.S. A hybrid HHO-AVOA for path planning of a differential wheeled mobile robot in static and dynamic environments. IEEE Access 2024, 12, 25967–25979. [Google Scholar] [CrossRef]
  124. Das, P.; Parhi, D.R.; Mahapatro, A.; Dash, H.S.; Prakash, V. Navigational Analysis of Legged Robot Using the Modified African Vulture Optimization Algorithm. Res. Sq. 2025, preprint. [Google Scholar] [CrossRef]
  125. Zhu, W.; Kuang, X.; Jiang, H. Unmanned Aerial Vehicle Path Planning Based on Sparrow-Enhanced African Vulture Optimization Algorithm. Appl. Sci. 2025, 15, 8461. [Google Scholar] [CrossRef]
  126. Deshpande, S.; Kashyap, A.K.; Patle, B.K. A review on path planning AI techniques for mobile robots. Robot. Syst. Appl. 2023, 3, 27–46. [Google Scholar] [CrossRef]
  127. Sharma, U.; Rani, S.; Anand, A. Path Planning and Navigation in Autonomous Mobile Robots (AMRs): A Review. In Proceedings of the International Conference on Artificial-Business Analytics, Quantum and Machine Learning, Faridabad, India, 14–15 July 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 517–529. [Google Scholar]
  128. Zhang, Z.; Fu, H.; Yang, J.; Lin, Y. Deep reinforcement learning for path planning of autonomous mobile robots in complicated environments. Complex Intell. Syst. 2025, 11, 277. [Google Scholar] [CrossRef]
  129. Wang, Y.; Li, X.; Zhang, J.; Li, S.; Xu, Z.; Zhou, X. Review of wheeled mobile robot collision avoidance under unknown environment. Sci. Prog. 2021, 104, 00368504211037771. [Google Scholar] [CrossRef] [PubMed]
130. Boujelben, M.S.; Rekik, C.; Derbel, N. A reactive approach for mobile robot navigation in static and dynamic environment using fuzzy logic control. Int. J. Model. Identif. Control 2017, 27, 293–302. [Google Scholar] [CrossRef]
  131. Singh, N.H.; Thongam, K. Mobile robot navigation using fuzzy logic in static environments. Procedia Comput. Sci. 2018, 125, 11–17. [Google Scholar] [CrossRef]
  132. Wang, C.; Piao, Z.; Li, L.; Sun, W.; Wang, J.; Li, C. Path planning for mobile robot using fuzzy controllers with artificial potential field. In Proceedings of the 2019 IEEE International Conference on Unmanned Systems (ICUS), Beijing, China, 17–19 October 2019; IEEE: New York, NY, USA, 2019; pp. 391–396. [Google Scholar]
  133. Dirik, M. Collision-free mobile robot navigation using fuzzy logic approach. Int. J. Comput. Appl. 2018, 179, 33–39. [Google Scholar] [CrossRef]
  134. Kim, C.; Kim, Y.; Yi, H. Fuzzy analytic hierarchy process-based mobile robot path planning. Electronics 2020, 9, 290. [Google Scholar] [CrossRef]
  135. Kim, C.; Suh, J.; Han, J.H. Development of a hybrid path planning algorithm and a bio-inspired control for an omni-wheel mobile robot. Sensors 2020, 20, 4258. [Google Scholar] [CrossRef]
  136. Tao, Y.; Gao, H.; Ren, F.; Chen, C.; Wang, T.; Xiong, H.; Jiang, S. A mobile service robot global path planning method based on ant colony optimization and fuzzy control. Appl. Sci. 2021, 11, 3605. [Google Scholar] [CrossRef]
  137. Lin, Z.; Yue, M.; Chen, G.; Sun, J. Path planning of mobile robot with PSO-based APF and fuzzy-based DWA subject to moving obstacles. Trans. Inst. Meas. Control 2022, 44, 121–132. [Google Scholar] [CrossRef]
  138. Sathiya, V.; Chinnadurai, M.; Ramabalan, S. Mobile robot path planning using fuzzy enhanced improved Multi-Objective particle swarm optimization (FIMOPSO). Expert Syst. Appl. 2022, 198, 116875. [Google Scholar] [CrossRef]
  139. Zhu, Y.; Lu, T. A Fuzzy-Based Improved Dynamic Window Approach for Path Planning of Mobile Robot. In Proceedings of the International Conference on Intelligent Robotics and Applications, Hangzhou, China, 5–7 July 2023; Springer: Berlin/Heidelberg, Germany, 2023; Volume 14270, pp. 586–597. [Google Scholar]
  140. Sun, Y.; Wang, W.; Xu, M.; Huang, L.; Shi, K.; Zou, C.; Chen, B. Local path planning for mobile robots based on fuzzy dynamic window algorithm. Sensors 2023, 23, 8260. [Google Scholar] [CrossRef] [PubMed]
  141. Kumar, A.; Sahasrabudhe, A.; Nirgude, S. Fuzzy Logic Control for Indoor Navigation of Mobile Robots. arXiv 2024, arXiv:2409.02437. [Google Scholar] [CrossRef]
  142. Hentout, A.; Maoudj, A.; Kouider, A. Shortest path planning and efficient fuzzy logic control of mobile robots in indoor static and dynamic environments. Sci. Technol. 2024, 27, 21–36. [Google Scholar] [CrossRef]
  143. Nguyen Minh, H.; Trinh An, H.; Tran Thi Cam, G.; Dinh Thi Ha, L.; Do Quoc, H. Fuzzy Logic and Quadtree-Based Control for Mobile Robots in Dynamic Environments. In Proceedings of the International Conference on Intelligent Robotics and Applications, Okayama, Japan, 6–9 August 2025; Springer: Berlin/Heidelberg, Germany, 2025; Volume 15205, pp. 429–445. [Google Scholar]
  144. Hu, L.; Wei, C.; Yin, L. Fuzzy A* quantum multi-stage Q-learning artificial potential field for path planning of mobile robots. Eng. Appl. Artif. Intell. 2025, 141, 109866. [Google Scholar] [CrossRef]
  145. Katona, K.; Neamah, H.A.; Korondi, P. Obstacle Avoidance and Path Planning Methods for Autonomous Navigation of Mobile Robot. Sensors 2024, 24, 3573. [Google Scholar] [CrossRef]
  146. Peng, G.; Yang, C.; He, W.; Chen, C.P. Force sensorless admittance control with neural learning for robots with actuator saturation. IEEE Trans. Ind. Electron. 2019, 67, 3138–3148. [Google Scholar] [CrossRef]
  147. Khnissi, K.; Seddik, C.; Seddik, H. Smart navigation of mobile robot using neural network controller. In Proceedings of the 2018 International Conference on Smart Communications in Network Technologies (SaCoNeT), El Oued, Algeria, 27–31 October 2018; IEEE: New York, NY, USA, 2018; pp. 205–210. [Google Scholar]
  148. Dewi, T.; Risma, P.; Oktarina, Y.; Nawawi, M. Neural network simulation for obstacle avoidance and wall follower robot as a helping tool for teaching-learning process in classroom. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2018; Volume 403, p. 012043. [Google Scholar]
  149. Yuan, J.; Wang, H.; Lin, C.; Liu, D.; Yu, D. A novel GRU-RNN network model for dynamic path planning of mobile robot. IEEE Access 2019, 7, 15140–15151. [Google Scholar] [CrossRef]
  150. Zheng, W.; Wang, H.B.; Zhang, Z.M.; Li, N.; Yin, P.H. Multi-layer feed-forward neural network deep learning control with hybrid position and virtual-force algorithm for mobile robot obstacle avoidance. Int. J. Control. Autom. Syst. 2019, 17, 1007–1018. [Google Scholar] [CrossRef]
  151. Zhu, Q.; Han, Y.; Liu, P.; Xiao, Y.; Lu, P.; Cai, C. Motion planning of autonomous mobile robot using recurrent fuzzy neural network trained by extended Kalman filter. Comput. Intell. Neurosci. 2019, 2019, 1934575. [Google Scholar] [CrossRef] [PubMed]
  152. Pandey, K.K.; Parhi, D.R. Trajectory planning and the target search by the mobile robot in an environment using a behavior-based neural network approach. Robotica 2020, 38, 1627–1641. [Google Scholar] [CrossRef]
  153. Khnissi, K.; Jabeur, C.B.; Seddik, H. A smart mobile robot commands predictor using recursive neural network. Robot. Auton. Syst. 2020, 131, 103593. [Google Scholar] [CrossRef]
  154. Wang, J.; Chi, W.; Li, C.; Wang, C.; Meng, M.Q.H. Neural RRT*: Learning-Based Optimal Path Planning. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1748–1758. [Google Scholar] [CrossRef]
  155. Qureshi, A.H.; Miao, Y.; Simeonov, A.; Yip, M.C. Motion Planning Networks: Bridging the Gap Between Learning-Based and Classical Motion Planners. IEEE Trans. Robot. 2021, 37, 48–66. [Google Scholar] [CrossRef]
  156. Ren, Z.; Lai, J.; Wu, Z.; Xie, S. Deep neural networks-based real-time optimal navigation for an automatic guided vehicle with static and dynamic obstacles. Neurocomputing 2021, 443, 329–344. [Google Scholar] [CrossRef]
  157. Farag, K.K.A.; Shehata, H.H.; El-Batsh, H.M. Mobile robot obstacle avoidance based on neural network with a standardization technique. J. Robot. 2021, 2021, 1129872. [Google Scholar] [CrossRef]
  158. Molina-Leal, A.; Gómez-Espinosa, A.; Escobedo Cabello, J.A.; Cuan-Urquizo, E.; Cruz-Ramírez, S.R. Trajectory Planning for a Mobile Robot in a Dynamic Environment Using an LSTM Neural Network. Appl. Sci. 2021, 11, 10689. [Google Scholar] [CrossRef]
  159. Chen, Y.; Cheng, C.; Zhang, Y.; Li, X.; Sun, L. A neural network-based navigation approach for autonomous mobile robot systems. Appl. Sci. 2022, 12, 7796. [Google Scholar] [CrossRef]
  160. Guan, L.; Lu, Y.; He, Z.; Chen, X. Intelligent Obstacle Avoidance Algorithm for Mobile Robots in Uncertain Environment. J. Robot. 2022, 2022, 8954060. [Google Scholar] [CrossRef]
  161. Li, J.; Wang, S.; Chen, Z.; Kan, Z.; Yu, J. Lightweight neural path planning. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; IEEE: New York, NY, USA, 2023; pp. 6713–6718. [Google Scholar]
  162. Galarza-Falfan, J.; García-Guerrero, E.E.; Aguirre-Castro, O.A.; López-Bonilla, O.R.; Tamayo-Pérez, U.J.; Cárdenas-Valdez, J.R.; Hernández-Mejía, C.; Borrego-Dominguez, S.; Inzunza-Gonzalez, E. Path Planning for Autonomous Mobile Robot Using Intelligent Algorithms. Technologies 2024, 12, 82. [Google Scholar] [CrossRef]
  163. Zhang, J.; Chen, H.; Sun, H.; Xu, H.; Yan, T. Convolutional neural network-based deep Q-network (CNN-DQN) path planning method for mobile robots. Intell. Serv. Robot. 2025, 18, 929–950. [Google Scholar] [CrossRef]
  164. Teng, Y.; Feng, T.; Li, J.; Chen, S.; Tang, X. A Dual-Layer Symmetric Multi-Robot Path Planning System Based on an Improved Neural Network-DWA Algorithm. Symmetry 2025, 17, 85. [Google Scholar] [CrossRef]
  165. Sahoo, B.; Das, D.; Pujhari, K.C.; Vikas. Optimization of route planning for the mobile robot using a hybrid Neuro-IWO technique. Int. J. Inf. Technol. 2025, 17, 1431–1439. [Google Scholar] [CrossRef]
  166. Chen, S.; Feng, T.; Li, J.; Yang, S.X. Research on Intelligent Path Planning of Mobile Robot Based on Hybrid Symmetric Bio-Inspired Neural Network Algorithm in Complex Road Environments. Symmetry 2025, 17, 836. [Google Scholar] [CrossRef]
  167. Huang, H.; Yang, C.; Chen, C.P. Optimal robot–environment interaction under broad fuzzy neural adaptive control. IEEE Trans. Cybern. 2020, 51, 3824–3835. [Google Scholar] [CrossRef]
  168. Le, M. Robust Deep Neural Networks Inspired by Fuzzy Logic. arXiv 2019, arXiv:1911.08635. [Google Scholar]
  169. Zhang, S.; Sakulyeva, T.; Pitukhin, E.; Doguchaeva, S. Neuro-fuzzy and soft computing-A computational approach to learning and artificial intelligence. Int. Rev. Autom. Control 2020, 13, 191–199. [Google Scholar] [CrossRef]
  170. Brahimi, S.; Azouaoui, O.; Loudini, M. Intelligent mobile robot navigation using a neuro-fuzzy approach. Int. J. Comput. Aided Eng. Technol. 2019, 11, 710–726. [Google Scholar] [CrossRef]
  171. Pandey, A.; Kashyap, A.K.; Parhi, D.R.; Patle, B. Autonomous mobile robot navigation between static and dynamic obstacles using multiple ANFIS architecture. World J. Eng. 2019, 16, 275–286. [Google Scholar] [CrossRef]
  172. Gharajeh, M.S.; Jond, H.B. Hybrid global positioning system-adaptive neuro-fuzzy inference system based autonomous mobile robot navigation. Robot. Auton. Syst. 2020, 134, 103669. [Google Scholar] [CrossRef]
  173. Batti, H.; Ben Jabeur, C.; Seddik, H. Autonomous smart robot for path predicting and finding in maze based on fuzzy and neuro-Fuzzy approaches. Asian J. Control 2021, 23, 3–12. [Google Scholar] [CrossRef]
  174. Gharajeh, M.S.; Jond, H.B. An intelligent approach for autonomous mobile robots path planning based on adaptive neuro-fuzzy inference system. Ain Shams Eng. J. 2022, 13, 101491. [Google Scholar] [CrossRef]
  175. Haider, M.H.; Wang, Z.; Khan, A.A.; Ali, H.; Zheng, H.; Usman, S.; Kumar, R.; Bhutta, M.U.M.; Zhi, P. Robust mobile robot navigation in cluttered environments based on hybrid adaptive neuro-fuzzy inference and sensor fusion. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 9060–9070. [Google Scholar] [CrossRef]
  176. Mishra, D.K.; Thomas, A.; Kuruvilla, J.; Kalyanasundaram, P.; Prasad, K.R.; Haldorai, A. Design of mobile robot navigation controller using neuro-fuzzy logic system. Comput. Electr. Eng. 2022, 101, 108044. [Google Scholar] [CrossRef]
  177. Ayub, S.; Singh, N.; Hussain, M.Z.; Ashraf, M.; Singh, D.K.; Haldorai, A. Hybrid approach to implement multi-robotic navigation system using neural network, fuzzy logic, and bio-inspired optimization methodologies. Comput. Intell. 2023, 39, 592–606. [Google Scholar] [CrossRef]
  178. Stavrinidis, S.; Zacharia, P. An ANFIS-Based Strategy for Autonomous Robot Collision-Free Navigation in Dynamic Environments. Robotics 2024, 13, 124. [Google Scholar] [CrossRef]
  179. Mohanty, P. Path Planning of Mobile Robots Under Uncertain Navigation Environments Using FCM Clustering ANFIS. Wirel. Pers. Commun. 2024, 137, 1251–1276. [Google Scholar] [CrossRef]
  180. Hilali, B.; Ramdani, M.; Naji, A. An efficient strategy for optimizing a neuro-fuzzy controller for mobile robot navigation. Int. J. Electr. Comput. Eng. 2025, 15, 1065–1078. [Google Scholar] [CrossRef]
  181. Mostafanasab, A.; Menhaj, M.B.; Shamshirsaz, M.; Fesharakifard, R. A novel mobile robot path planning method based on neuro-fuzzy controller. AUT J. Math. Comput. 2025, 6, 41–53. [Google Scholar]
  182. Machavaram, R. Intelligent path planning for autonomous ground vehicles in dynamic environments utilizing adaptive Neuro-Fuzzy control. Eng. Appl. Artif. Intell. 2025, 144, 110119. [Google Scholar] [CrossRef]
  183. Saleh, M.S.; Al Mashhadany, Y.I.; Alshaibi, M.; Ameen, F.M.; Algburi, S. Optimal Mobile Robot Navigation for Obstacle Avoidance Based on ANFIS Controller. J. Robot. Control 2025, 6, 484–492. [Google Scholar] [CrossRef]
  184. Albrecht, S.V.; Christianos, F.; Schäfer, L. Multi-Agent Reinforcement Learning: Foundations and Modern Approaches; MIT Press: Cambridge, MA, USA, 2024. [Google Scholar]
  185. Singh, R.; Ren, J.; Lin, X. A review of deep reinforcement learning algorithms for mobile robot path planning. Vehicles 2023, 5, 1423–1451. [Google Scholar] [CrossRef]
186. Mitchell, T.M. Reinforcement Learning. In Machine Learning; McGraw-Hill: New York, NY, USA, 1997; pp. 367–390. [Google Scholar]
  187. Khlif, N.; Nahla, K.; Safya, B. Reinforcement learning with modified exploration strategy for mobile robot path planning. Robotica 2023, 41, 2688–2702. [Google Scholar] [CrossRef]
  188. Shi, Z.; Wang, K.; Zhang, J. Improved reinforcement learning path planning algorithm integrating prior knowledge. PLoS ONE 2023, 18, e0284942. [Google Scholar] [CrossRef] [PubMed]
  189. Zhang, L.; Hou, Z.; Wang, J.; Liu, Z.; Li, W. Robot navigation with reinforcement learned path generation and fine-tuned motion control. IEEE Robot. Autom. Lett. 2023, 8, 4489–4496. [Google Scholar] [CrossRef]
  190. Orozco-Rosas, U.; Picos, K.; Pantrigo, J.J.; Montemayor, A.S.; Cuesta-Infante, A. Mobile robot path planning using a QAPF learning algorithm for known and unknown environments. IEEE Access 2022, 10, 84648–84663. [Google Scholar] [CrossRef]
  191. Gong, H.; Wang, P.; Ni, C.; Cheng, N. Efficient path planning for mobile robot based on deep deterministic policy gradient. Sensors 2022, 22, 3579. [Google Scholar] [CrossRef] [PubMed]
  192. López-Lozada, E.; Rubio-Espino, E.; Sossa-Azuela, J.H.; Ponce-Ponce, V.H. Reactive navigation under a fuzzy rules-based scheme and reinforcement learning for mobile robots. PeerJ Comput. Sci. 2021, 7, e556. [Google Scholar] [CrossRef] [PubMed]
  193. Wang, B.; Liu, Z.; Li, Q.; Prorok, A. Mobile robot path planning in dynamic environments through globally guided reinforcement learning. IEEE Robot. Autom. Lett. 2020, 5, 6932–6939. [Google Scholar] [CrossRef]
  194. Wen, S.; Wen, Z.; Zhang, D.; Zhang, H.; Wang, T. A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning. Appl. Soft Comput. 2021, 110, 107605. [Google Scholar] [CrossRef]
  195. Bae, H.; Kim, G.; Kim, J.; Qian, D.; Lee, S. Multi-robot path planning method using reinforcement learning. Appl. Sci. 2019, 9, 3057. [Google Scholar] [CrossRef]
  196. Chen, Y.J.; Jhong, B.G.; Chen, M.Y. A Real-Time Path Planning Algorithm Based on the Markov Decision Process in a Dynamic Environment for Wheeled Mobile Robots. Actuators 2023, 12, 166. [Google Scholar] [CrossRef]
  197. Huang, J.; Zhang, Z.; Ruan, X. An Improved Dyna-Q Algorithm Inspired by the Forward Prediction Mechanism in the Rat Brain for Mobile Robot Path Planning. Biomimetics 2024, 9, 315. [Google Scholar] [CrossRef]
  198. Zhou, Q.; Lian, Y.; Wu, J.; Zhu, M.; Wang, H.; Cao, J. An optimized Q-Learning algorithm for mobile robot local path planning. Knowl.-Based Syst. 2024, 286, 111400. [Google Scholar] [CrossRef]
  199. Zhang, Y.; Cui, C.; Zhao, Q. Path Planning of Mobile Robot Based on A Star Algorithm Combining DQN and DWA in Complex Environment. Appl. Sci. 2025, 15, 4367. [Google Scholar] [CrossRef]
  200. Nguyen, A.T.; Pham, D.D.; Le, V.N.; Luu, V.H. Design a path–planning strategy for mobile robot in multi-structured environment based on distributional reinforcement learning. MethodsX 2025, 15, 103554. [Google Scholar] [CrossRef]
  201. Zhang, Y.; Liu, Y.; Chen, Y.; Yang, Z. ARE-QL: An enhanced Q-learning algorithm with optimized search for mobile robot path planning. Phys. Scr. 2025, 100, 036015. [Google Scholar] [CrossRef]
  202. Kober, J.; Bagnell, J.A.; Peters, J. Reinforcement learning in robotics: A survey. Int. J. Robot. Res. 2013, 32, 1238–1274. [Google Scholar] [CrossRef]
  203. Tang, Y.; Zhao, C.; Wang, J.; Zhang, C.; Sun, Q.; Zheng, W.X.; Du, W.; Qian, F.; Kurths, J. Perception and navigation in autonomous systems in the era of learning: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 9604–9624. [Google Scholar] [CrossRef] [PubMed]
  204. Zhao, Y.; Zhang, Y.; Wang, S. A Review of Mobile Robot Path Planning Based on Deep Reinforcement Learning Algorithm. J. Phys. Conf. Ser. 2021, 2138, 012011. [Google Scholar] [CrossRef]
  205. Hoseinnezhad, R. A comprehensive review of deep learning techniques in mobile robot path planning: Categorization and analysis. Appl. Sci. 2025, 15, 2179. [Google Scholar] [CrossRef]
  206. Sun, H.; Zhang, W.; Yu, R.; Zhang, Y. Motion planning for mobile robots—Focusing on deep reinforcement learning: A systematic review. IEEE Access 2021, 9, 69061–69081. [Google Scholar] [CrossRef]
  207. Nguyen, M.; Dubay, R. Application of Deep Learning in Autonomous Mobile Robot Control: An Overview. In Proceedings of the 2025 IEEE International Systems Conference (SysCon), Montreal, QC, Canada, 7–10 April 2025; IEEE: New York, NY, USA, 2025; pp. 1–8. [Google Scholar]
  208. Niroui, F.; Zhang, K.; Kashino, Z.; Nejat, G. Deep reinforcement learning robot for search and rescue applications: Exploration in unknown cluttered environments. IEEE Robot. Autom. Lett. 2019, 4, 610–617. [Google Scholar] [CrossRef]
  209. Gao, J.; Ye, W.; Guo, J.; Li, Z. Deep reinforcement learning for indoor mobile robot path planning. Sensors 2020, 20, 5493. [Google Scholar] [CrossRef]
  210. Xiaoxian, S.; Chenpeng, Y.; Haoran, Z.; Chengju, L.; Qijun, C. Obstacle Avoidance Algorithm for Mobile Robot Based on Deep Reinforcement Learning in Dynamic Environments. In Proceedings of the 2020 IEEE 16th International Conference on Control & Automation (ICCA), Singapore, 9–11 October 2020; IEEE: New York, NY, USA, 2020; pp. 366–372. [Google Scholar]
  211. Gao, X.; Yan, L.; Wang, G.; Wang, T.; Du, N.; Gerada, C. Toward Obstacle Avoidance for Mobile Robots Using Deep Reinforcement Learning Algorithm. In Proceedings of the 2021 IEEE 16th Conference on Industrial Electronics and Applications (ICIEA), Chengdu, China, 1–4 August 2021; IEEE: New York, NY, USA, 2021; pp. 2136–2139. [Google Scholar]
  212. Chen, G.; Pan, L.; Chen, Y.; Xu, P.; Wang, Z.; Wu, P.; Ji, J.; Chen, X. Deep Reinforcement Learning of Map-Based Obstacle Avoidance for Mobile Robot Navigation. SN Comput. Sci. 2021, 2, 417. [Google Scholar] [CrossRef]
  213. Yu, X.; Wang, P.; Zhang, Z. Learning-based end-to-end path planning for lunar rovers with safety constraints. Sensors 2021, 21, 796. [Google Scholar] [CrossRef]
  214. Samma, H.; Abubaker, A.; Aremu, M.B.; Abdel-Nasser, M.; El-Ferik, S. Fusion of Visual Attention and Scene Descriptions with Deep Reinforcement Learning for UAV Indoor Autonomous Navigation. IEEE Access 2025, 13, 81298–81311. [Google Scholar] [CrossRef]
  215. Wang, W.; Wu, Z.; Luo, H.; Zhang, B. Path planning method of mobile robot using improved deep reinforcement learning. J. Electr. Comput. Eng. 2022, 2022, 5433988. [Google Scholar] [CrossRef]
  216. Tang, Y.; Chen, Q.; Wei, Y. Robot Obstacle Avoidance Controller Based on Deep Reinforcement Learning. J. Sens. 2022, 2022, 4194747. [Google Scholar] [CrossRef]
  217. Chai, R.; Niu, H.; Carrasco, J.; Arvin, F.; Yin, H.; Lennox, B. Design and experimental validation of deep reinforcement learning-based fast trajectory planning and control for mobile robot in unknown environment. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 5778–5792. [Google Scholar] [CrossRef]
  218. Ren, J.; Huang, X.; Huang, R.N. Efficient Deep Reinforcement Learning for Optimal Path Planning. Electronics 2022, 11, 3628. [Google Scholar] [CrossRef]
  219. Fu, H.; Wang, Q.; He, H. Path-Following Navigation in Crowds with Deep Reinforcement Learning. IEEE Internet Things J. 2024, 11, 20236–20245. [Google Scholar] [CrossRef]
  220. Yin, H.; Wang, C.; Yan, C.; Xiang, X.; Cai, B.; Wei, C. Deep Reinforcement Learning with Multi-Critic TD3 for Decentralized Multi-Robot Path Planning. IEEE Trans. Cogn. Dev. Syst. 2024, 16, 1233–1247. [Google Scholar] [CrossRef]
  221. Ni, J.; Gu, Y.; Tang, G.; Ke, C.; Gu, Y. Cooperative Coverage Path Planning for Multi-Mobile Robots Based on Improved K-Means Clustering and Deep Reinforcement Learning. Electronics 2024, 13, 944. [Google Scholar] [CrossRef]
  222. Han, H.; Wang, J.; Kuang, L.; Han, X.; Xue, H. Improved Robot Path Planning Method Based on Deep Reinforcement Learning. Sensors 2023, 23, 5622. [Google Scholar] [CrossRef] [PubMed]
  223. Liu, H.; Shen, Y.; Yu, S.; Gao, Z.; Wu, T. Deep reinforcement learning for mobile robot path planning. arXiv 2024, arXiv:2404.06974. [Google Scholar] [CrossRef]
  224. Yan, C.; Chen, G.; Li, Y.; Sun, F.; Wu, Y. Immune deep reinforcement learning-based path planning for mobile robot in unknown environment. Appl. Soft Comput. 2023, 145, 110601. [Google Scholar] [CrossRef]
  225. Zhang, J.; Zhao, H. Mobile Robot Path Planning Based on Improved Deep Reinforcement Learning Algorithm. In Proceedings of the 2024 4th International Conference on Neural Networks, Information and Communication (NNICE), Guangzhou, China, 19–21 January 2024; IEEE: New York, NY, USA, 2024; pp. 1758–1761. [Google Scholar]
  226. Tabakis, I.M.; Dasygenis, M. Deep Reinforcement Learning-Based Path Planning for Dynamic and Heterogeneous Environments. In Proceedings of the 2024 Panhellenic Conference on Electronics & Telecommunications (PACET), Thessaloniki, Greece, 28–29 March 2024; IEEE: New York, NY, USA, 2024; pp. 1–4. [Google Scholar]
  227. Vashisth, A.; Ruckin, J.; Magistri, F.; Stachniss, C.; Popovic, M. Deep Reinforcement Learning with Dynamic Graphs for Adaptive Informative Path Planning. IEEE Robot. Autom. Lett. 2024, 9, 7747–7754. [Google Scholar] [CrossRef]
  228. Ozdemir, K.; Tuncer, A. Navigation of autonomous mobile robots in dynamic unknown environments based on dueling double deep Q networks. Eng. Appl. Artif. Intell. 2025, 139, 109498. [Google Scholar] [CrossRef]
  229. Chakraborty, S.; Raghuvanshi, A.S. Adaptive Deep Reinforcement Learning Hybrid Neuro-Fuzzy Inference System Based Path Planning Algorithm for Mobile Robot. J. Field Robot. 2025, 42, 3425–3439. [Google Scholar] [CrossRef]
  230. Hai, X.; Zhu, Z.; Liu, Y.; Khong, A.W.; Wen, C. Resilient real-time decision-making for autonomous mobile robot path planning in complex dynamic environments. IEEE Trans. Ind. Electron. 2025, 72, 11551–11562. [Google Scholar] [CrossRef]
231. Shah, D.; Osiński, B.; Ichter, B.; Levine, S. LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action. In Proceedings of the 6th Conference on Robot Learning (CoRL 2022), Auckland, New Zealand, 14–18 December 2022; Proceedings of Machine Learning Research (PMLR): Cambridge, MA, USA, 2023; Volume 205, pp. 492–504. [Google Scholar] [CrossRef]
  232. Ahn, M.; Brohan, A.; Brown, N.; Chebotar, Y.; Cortes, O.; David, B.; Zeng, A. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances. In Proceedings of the Conference on Robot Learning (CoRL), Auckland, New Zealand, 14–18 December 2022. [Google Scholar] [CrossRef]
  233. Liang, J.; Huang, W.; Xia, F.; Xu, P.; Hausman, K.; Ichter, B.; Florence, P.; Zeng, A. Code as Policies: Language Model Programs for Embodied Control. arXiv 2022, arXiv:2209.07753. [Google Scholar] [CrossRef]
  234. Driess, D.; Xia, F.; Sajjadi, M.S.M.; Lynch, C.; Chowdhery, A.; Ichter, B.; Wahid, A.; Tompson, J.; Vuong, Q.; Yu, T.; et al. PaLM-E: An Embodied Multimodal Language Model. In Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Honolulu, HI, USA, 23–29 July 2023; PMLR: Cambridge, MA, USA, 2023; Volume 202, pp. 8469–8488. [Google Scholar]
  235. Zitkovich, B.; Yu, T.; Xu, S.; Xu, P.; Xiao, T.; Xia, F.; Wu, J.; Wohlhart, P.; Welker, S.; Wahid, A.; et al. RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. In Proceedings of the Conference on Robot Learning (CoRL 2023), Atlanta, GA, USA, 6–9 November 2023; PMLR: Cambridge, MA, USA, 2023; pp. 2165–2183. [Google Scholar]
  236. Gemini Robotics Team. Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer. arXiv 2025, arXiv:2510.03342. [Google Scholar] [CrossRef]
  237. Zhang, L.; Cai, K.; Sun, Z.; Bing, Z.; Wang, C.; Figueredo, L.; Haddadin, S.; Knoll, A. Motion planning for robotics: A review for sampling-based planners. Biomim. Intell. Robot. 2025, 5, 100207. [Google Scholar] [CrossRef]
  238. Xiao, H.; Chen, C.; Zhang, G.; Chen, C.P. Reinforcement learning-driven dynamic obstacle avoidance for mobile robot trajectory tracking. Knowl.-Based Syst. 2024, 297, 111974. [Google Scholar] [CrossRef]
Figure 1. Study workflow for the systematic review of AMR path planning across metaheuristic and AI-based (learning/reasoning) approaches.
Figure 2. Selection pipeline (2018–2025) for AMR path-planning studies.
Figure 3. Classification of autonomous mobile robot path planning techniques.
Figure 4. General model of the metaheuristic process for generating an AMR route.
Figure 5. Graphical evolution of particles in a Particle Swarm Optimization (PSO) environment, showing exploration and convergence behavior.
Figure 6. Simplified flowchart of the PSO algorithm demonstrating position updates based on global and local best values.
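As a concrete illustration of the position-update rule summarized in Figure 6, the following minimal Python sketch runs a canonical 1-D PSO loop on a toy objective. The objective function, swarm size, and coefficient values (w = 0.7, c1 = c2 = 1.5) are illustrative assumptions, not parameters taken from any surveyed planner.

```python
import random

random.seed(0)

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One PSO iteration: inertia plus cognitive (pbest) and social (gbest) pulls."""
    new_pos, new_vel = [], []
    for x, v, p in zip(positions, velocities, pbest):
        r1, r2 = random.random(), random.random()
        v_new = w * v + c1 * r1 * (p - x) + c2 * r2 * (gbest - x)
        new_vel.append(v_new)
        new_pos.append(x + v_new)
    return new_pos, new_vel

def pso_optimize(fitness, n_particles=20, iters=100):
    positions = [random.uniform(-10.0, 10.0) for _ in range(n_particles)]
    velocities = [0.0] * n_particles
    pbest = positions[:]
    gbest = min(pbest, key=fitness)
    for _ in range(iters):
        positions, velocities = pso_step(positions, velocities, pbest, gbest)
        for i, x in enumerate(positions):
            if fitness(x) < fitness(pbest[i]):
                pbest[i] = x
        gbest = min(pbest, key=fitness)
    return gbest

# Toy stand-in for a path cost: squared distance of a 1-D waypoint from 3.0.
best = pso_optimize(lambda x: (x - 3.0) ** 2)
```

In a real planner the decision variables would be waypoint coordinates and the fitness would combine path length, clearance, and smoothness terms, as in the PSO variants surveyed in Table 1.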
Figure 7. Illustration of PSO-based path planning in a static environment. The black shaded zones denote static obstacles (non-traversable/forbidden areas). The figure illustrates a feasible waypoint progression (wp1 → wp3) and an infeasible waypoint (wp4) that yields a discontinuous segment.
Figure 8. Flowchart of fundamental GA operations for AMR path planning. The initialization step corresponds to generating an initial population of candidate solutions (chromosomes), followed by iterative fitness evaluation and genetic operators until the termination condition is met.
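The GA loop in Figure 8 (population initialization, fitness evaluation, selection, crossover, mutation) can be sketched as below for a simple waypoint-encoding path problem. The chromosome encoding (y-offsets of equally spaced waypoints between a fixed start and goal), the truncation selection scheme, and all parameter values are illustrative assumptions only.

```python
import math
import random

random.seed(1)

START, GOAL, N_WP = (0.0, 0.0), (5.0, 0.0), 4

def path_length(ys):
    """Fitness: total Euclidean length of the polyline start -> waypoints -> goal."""
    pts = [START] + [(i + 1.0, y) for i, y in enumerate(ys)] + [GOAL]
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def evolve(pop_size=30, gens=200, mut_rate=0.2):
    pop = [[random.uniform(-3, 3) for _ in range(N_WP)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=path_length)
        parents = pop[: pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_WP)      # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mut_rate:       # Gaussian mutation of one gene
                j = random.randrange(N_WP)
                child[j] += random.gauss(0, 0.5)
            children.append(child)
        pop = parents + children
    return min(pop, key=path_length)

best = evolve()
```

With no obstacles, the optimum is the straight line of length 5.0; obstacle penalties would simply be added to `path_length` in an obstacle-aware variant.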
Figure 9. Illustration of the natural foraging behavior in Ant Colony Optimization (ACO) for solution discovery. The “?” symbol indicates the uncertainty/exploration phase in ACO, where ants have not yet converged to a preferred route around the obstacle.
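The pheromone-reinforcement mechanism behind Figure 9 can be shown with a deliberately tiny example: two candidate routes around an obstacle, with evaporation plus length-inverse deposits. The route lengths, evaporation rate, and deposit constant are assumptions for illustration; a full ACO planner would apply the same update per graph edge with a heuristic visibility term.

```python
import random

random.seed(2)

# Two candidate routes around an obstacle (cf. Figure 9): lengths in arbitrary units.
ROUTES = {"short": 4.0, "long": 7.0}

def run_aco(n_ants=20, iters=30, evaporation=0.5, q=10.0):
    pheromone = {r: 1.0 for r in ROUTES}
    for _ in range(iters):
        # Each ant picks a route with probability proportional to its pheromone.
        total = sum(pheromone.values())
        counts = {r: 0 for r in ROUTES}
        for _ in range(n_ants):
            x, acc = random.uniform(0, total), 0.0
            for r, tau in pheromone.items():
                acc += tau
                if x <= acc:
                    counts[r] += 1
                    break
        # Evaporate, then deposit pheromone inversely proportional to route length.
        for r in pheromone:
            pheromone[r] = (1 - evaporation) * pheromone[r] + counts[r] * q / ROUTES[r]
    return pheromone

tau = run_aco()
```

The positive feedback concentrates pheromone on the shorter route, which is exactly the convergence-out-of-uncertainty behavior the "?" phase in Figure 9 depicts.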
Figure 10. Flowchart of the Firefly Algorithm (FA) depicting attractiveness-based movement behavior. Positions are updated iteratively until the goal/termination condition is met.
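The attractiveness-based movement in Figure 10 can be sketched as follows: each firefly moves toward any brighter (lower-cost) firefly, with attractiveness decaying exponentially in squared distance plus a small random perturbation. The cost function, absorption coefficient gamma, and all other constants are illustrative assumptions.

```python
import math
import random

random.seed(5)

def firefly_minimize(fitness, dim=2, n=15, iters=80, beta0=1.0, gamma=0.01, alpha=0.2):
    """Fireflies move toward brighter (lower-cost) ones; attractiveness decays with distance."""
    flies = [[random.uniform(-3, 3) for _ in range(dim)] for _ in range(n)]
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if fitness(flies[j]) < fitness(flies[i]):   # firefly j is brighter
                    r2 = sum((a - b) ** 2 for a, b in zip(flies[i], flies[j]))
                    beta = beta0 * math.exp(-gamma * r2)    # attractiveness
                    flies[i] = [
                        x + beta * (y - x) + alpha * (random.random() - 0.5)
                        for x, y in zip(flies[i], flies[j])
                    ]
    return min(flies, key=fitness)

# Toy cost: squared distance of a 2-D waypoint from the goal (1, 1).
best = firefly_minimize(lambda p: (p[0] - 1) ** 2 + (p[1] - 1) ** 2)
```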
Figure 11. Schematic of the Grey Wolf Optimizer (GWO) process based on social hierarchy and adaptive hunting. Here, α , β , and δ represent the best three solutions guiding the search; A and C are coefficient vectors used in the position-update step to balance exploration and exploitation.
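The role of the alpha, beta, and delta leaders and the A and C coefficients in Figure 11 can be made concrete with a compact sketch. This is an elitist simplification (the three leaders are held fixed within each iteration while the remaining wolves are repositioned); the objective and population settings are assumptions for illustration.

```python
import random

random.seed(3)

def gwo_minimize(fitness, dim=2, n_wolves=15, iters=100):
    wolves = [[random.uniform(-10, 10) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=fitness)
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2.0 * (1 - t / iters)              # linearly decreasing exploration factor
        for i in range(3, n_wolves):
            new = []
            for d in range(dim):
                pulls = []
                for leader in (alpha, beta, delta):
                    r1, r2 = random.random(), random.random()
                    A = 2 * a * r1 - a          # coefficient A (exploration/exploitation)
                    C = 2 * r2                  # coefficient C
                    D = abs(C * leader[d] - wolves[i][d])
                    pulls.append(leader[d] - A * D)
                new.append(sum(pulls) / 3.0)    # average of the three leader pulls
            wolves[i] = new
    return min(wolves, key=fitness)

# Toy objective: squared distance of a 2-D waypoint from the goal (3, -2).
best = gwo_minimize(lambda p: (p[0] - 3) ** 2 + (p[1] + 2) ** 2)
```

As `a` decays from 2 toward 0, |A| shrinks and the pack contracts around the three best solutions, which is the exploration-to-exploitation transition the schematic depicts.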
Figure 12. Example of AMR navigation amidst moving obstacles, demonstrating dynamic path adaptation.
Figure 13. Feedforward ANN architecture for WMR path planning. A state/feature vector (e.g., goal position and obstacle/location features, or robot state features) is mapped through hidden layers to produce control outputs (e.g., steering/angular velocity and/or speed commands) used for motion control.
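The state-to-command mapping in Figure 13 amounts to a few dense layers. The sketch below shows an untrained forward pass only; the feature vector layout (goal bearing/distance, nearest-obstacle bearing/distance), layer widths, and random weights are illustrative assumptions, and in practice the weights would be learned by supervised or reinforcement training.

```python
import math
import random

random.seed(6)

def dense(inputs, weights, biases):
    """One fully connected layer with tanh activation (outputs bounded in [-1, 1])."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Untrained random weights: 4 input features -> 6 hidden units -> 2 outputs.
w1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
b1 = [0.0] * 6
w2 = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(2)]
b2 = [0.0] * 2

# Assumed feature vector: goal bearing, goal distance, obstacle bearing, obstacle distance.
state = [0.3, 0.8, -0.5, 0.4]
hidden = dense(state, w1, b1)
steer_cmd, speed_cmd = dense(hidden, w2, b2)
```

The bounded tanh outputs map naturally onto normalized angular-velocity and speed commands for a wheeled mobile robot.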
Figure 14. Basic RL interaction loop for mobile navigation: the robot controller (agent) receives observations from the environment, outputs an action (e.g., motion commands), and the environment returns a reward signal that guides learning toward desired navigation behavior.
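The agent-environment loop in Figure 14 is easiest to see in tabular Q-learning on a toy navigation task. The corridor world, reward values, and learning hyperparameters below are illustrative assumptions; DRL methods replace the Q-table with a neural network but keep the same interaction loop.

```python
import random

random.seed(4)

# Tiny 1-D corridor: states 0..5, goal at state 5; actions: 0 = left, 1 = right.
N_STATES, GOAL = 6, 5

def q_learning(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2):
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy action selection (explore vs. exploit).
            if random.random() < epsilon:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda act: q[s][act])
            s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
            r = 1.0 if s2 == GOAL else -0.01    # goal reward plus a small step cost
            # Temporal-difference update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
# After training, the greedy policy should move right toward the goal everywhere.
greedy_path_right = all(q[s][1] > q[s][0] for s in range(GOAL))
```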
Figure 15. DRL architecture integrating perception, decision-making, and reward-based learning for AMR trajectory planning.
Figure 20. Percentage distribution of path planning algorithms (Metaheuristic vs. AI-based) used in dynamic environments based on a 7-year literature review.
Figure 21. Proportion of reviewed algorithms capable of tracking a dynamic or moving target.
Figure 22. High-level qualitative trade-off summary of major planning families in AMR navigation (High: •••, Medium: ••, Low: •). Ratings are qualitative and reflect typical characteristics reported in the surveyed literature under common AMR deployment constraints.
Figure 23. Percentage distribution of path planning algorithms (Metaheuristic vs. AI-based), based on a 7-year literature survey.
Table 1. Survey of PSO-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[27] | 2018 | PSO for global path optimization | Mobile robot (simulation) | Achieved shortest obstacle-free route
[30] | 2018 | PSO with IR-based dynamic obstacle avoidance | Multiple mobile robots | Collision-free navigation for moving obstacles
[28] | 2018 | Crowding-radius adaptive PSO | Constrained mobile robot | Enhanced diversity and global optimum convergence
[31] | 2019 | Adaptive PSO + BSA hybrid with chaotic control | Mobile robot (simulation) | Improved global search and efficiency
[32] | 2019 | PSO-based autonomous navigation using sensors and LIDAR | TurtleBot (3D Gazebo) | Robust guidance and obstacle avoidance in complex terrain
[29] | 2020 | PSO on grid maps with convex obstacles | Mobile robot (simulation) | Generated shortest collision-free paths
[35] | 2020 | Hybrid frequency BA–PSO with obstacle detection | Omnidirectional mobile robot | Effective static and dynamic obstacle avoidance
[33] | 2021 | Hybrid chaotic PSO–GWO with adaptive inertia | Autonomous mobile robot | Enhanced convergence and avoidance of local minima
[34] | 2021 | PSO–GWO–LS hybrid planner | Autonomous robot (simulation) | Smoothed paths and improved feasibility
[37] | 2021 | Adaptive PSO with route-smoothing fitness | Wheeled mobile robot | Shortest smooth path, validated in simulation and experiment
[36] | 2021 | Enhanced PSO–SA with mutation | AMR in polygonal obstacle environment | Optimal paths with premature convergence avoidance
[38] | 2022 | Improved PSO–GWO hybrid | Wheeled mobile robot | Balanced global and local search, secure navigation
[39] | 2022 | Multi-objective PSO–BA–LS–ODA | AMR in dynamic environment | Optimized length, smoothness, and safety
[40] | 2023 | Prior-knowledge PSO (PKPSO) with quintic smoothing | Mobile robot | Shorter, smoother paths with adaptive velocity update
[41] | 2023 | Self-Evolving PSO (SEPSO) with auto hyperparameter tuning | AMR (dynamic settings) | Real-time performance, improved path efficiency
[42] | 2024 | Obstacle Kinematics Augmented Optimization (OkayPlan) | TurtleBot in VRX simulator | Real-time 125 Hz planning, safer trajectories
[43] | 2024 | Improved PSO, GWO, and ABC comparison study | Two-wheeled AMR (ROS 2 + LiDAR) | Experimental validation; PSO achieved superior static path efficiency
[45] | 2024 | Smooth PSO-IPF (SPSO-IPF) with kinematic constraints | Mobile robot (static/dynamic) | Reduced jerks, smoother paths, improved convergence
[44] | 2025 | Adaptive Random Fluctuation PSO + B-spline optimization | AMR (simulation) | Smooth curvature-constrained trajectories with local minima avoidance
[46] | 2025 | PSO-RRT* with B-spline smoothing | Raspberry Pi mobile robot | 37% faster planning, 20% fewer turns, shorter paths
Table 2. Survey of GA-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[49] | 2017 | Reward-based dynamic GA path planner using probabilistic obstacle modeling | Dynamic mobile robot | Enhanced efficiency and obstacle avoidance in dynamic settings
[50] | 2018 | GA with enhanced crossover considering chromosome length variation | Mobile robot (simulation) | Reduced infeasible paths and improved convergence
[51] | 2018 | Genetic adaptive navigation control strategy integrated with GA | Wheeled mobile robot | Robust obstacle avoidance and reliable motion generation
[52] | 2019 | Hybrid Artificial Potential Field (APF)–GA with collision mitigation operator | Mobile robot (grid environment) | High-quality, collision-free paths and better optimality
[53] | 2020 | Improved GA with adaptive operators for local optima avoidance | Mobile robot (static) | Faster convergence and shorter, smoother paths
[54] | 2020 | Hybrid A*–GA for complex map navigation | Autonomous mobile robot | Efficient and stable trajectory generation
[55] | 2021 | Sensor-based adaptive GA for real-time path planning | AMR with onboard sensors | Reduced path length and faster convergence
[57] | 2021 | Multi-objective GA with heuristic median initialization | Mobile robot (simulation) | Balanced global paths with multi-objective optimization
[58] | 2022 | Industrial GA-based planner for crowded environments | Industrial AMR | Optimal routes with reduced computation time
[59] | 2022 | Hybrid 2-way RRT with enhanced GA operators | Mobile robot (grid maps) | Smoother trajectories, improved population diversity
[60] | 2022 | Hybrid ACO–GA with adaptive parameters | Wheeled mobile robot | Shorter, smoother, and fewer-turn paths
[61] | 2022 | GA with path-level fitness and dynamic mutation | Dynamic mobile robot | Real-time feasible path planning under moving obstacles
[62] | 2024 | Improved GA with Clearance-Based Probabilistic Roadmap (CBPRM) | Mobile robot (simulation) | Safer paths and fewer infeasible routes in static 2D maps
[63] | 2024 | Comparative GA–ACO routing optimization study | AMR in warehouse environment | ACO achieved shorter distances and better scheduling
[64] | 2024 | Direction-guided GA with adaptive crossover/mutation | Grid-based AMR | Higher success rate and faster convergence
[65] | 2024 | Double-Domain Inversion (DDI)-based GA for improved local search | Mobile robot (simulation) | Faster convergence and superior local exploration
[66] | 2025 | Dichotomy-based GA with adaptive operators and Bézier smoothing | Mobile robot (simulation) | Improved smoothness, reduced length, and faster computation
[67] | 2025 | MHRTSN (GA + APF metaheuristic) hybrid for real-time navigation | Dynamic AMR (3D environment) | Safe, adaptive navigation; PSO variant achieved faster processing
Table 3. Survey of ACO-based approaches for AMR path planning.
Ref.YearMethodRobot/Platform/EnvironmentKey Outcome
[72]2018Modified ACO with limitless step length and stimulus probability for decision-makingMobile robot (grid map)Faster convergence and wider search area coverage
[73]2019Hybrid APF–ACO with directional forces from potential fieldsMobile robot (static/dynamic)Accelerated convergence and obstacle-free motion
[74]2019APF–ACO with GA operators and unbounded step length heuristicsMobile robot (dynamic)Reliable navigation and fast convergence
[75]2021Hybrid GA–ACO for real-time collision avoidanceStatic/dynamic robot environmentShorter path length, reduced loops, faster convergence
[76]2020Improved ACO with uneven pheromone initialization and deadlock penaltyAutonomous mobile robotReduced stagnation and faster convergence
[77]2021Energy-efficient 3D ACO with gain-based pheromone enhancement3D AMR platformMinimized energy use and smoother global trajectories
[78]2021Adaptive multi-objective ACO for indoor navigationIndoor mobile robotBalanced optimization of distance, safety, and smoothness
[79]2022Communication-inspired ACO using tentacle interaction modelingMulti-robot (simulation)Improved cooperation and composite route formation
[80]2022ACO with adaptive parameter adjustment and dynamic avoidanceAMR (simulation)Robust performance in cluttered and changing environments
[81]2022Improved ACO integrated with DWA for global–local path planningMobile robot (dynamic)Efficient obstacle avoidance and real-time adaptation
[82]2022Distributed ACO for multi-robot path coordinationMulti-AMR systemReduced collision rate and computation load
[83]2022Enhanced ACO for complex grid environmentsWheeled robotAvoided stagnation, generated optimal routes
[84]2024Hybrid ACO–GA framework for smoother and faster pathsMobile robot (simulation)Outperformed standalone ACO and GA in trajectory quality
[86]2024Parallel ACO (PACO) with rank pheromone updates and “continue-or-kill” strategyMobile robot (multi-grid tests)Halved planning time and avoided deadlocks
[85]2025Improved Trimming ACO (ITACO) with triangular pruning and APF heuristicsMobile robot (simulation)60% shorter paths and faster convergence than classical ACO
[87]2025Intelligently Enhanced ACO (IEACO) with multi-layer innovation (adaptive tuning, ε -greedy)Physical and simulated AMRsSuperior path quality and faster convergence
[88]2025AR-ACO combining repulsive potential field and ACOMobile robot (complex map)Faster convergence and improved robustness
[89]2025Edge-assisted ACO with cloud pheromone update optimizationDynamic AMR systemLow-latency, safe, and real-time path planning
[90]2025Enhanced island-based ACO (EACI) with irregular pheromone initialization and auxiliary map pre-processingMini-ROS vehicle testbed95% fewer lost ants and 90% fewer iterations
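The mechanics shared by most of the ACO planners above, pheromone-biased edge selection, evaporation, and quality-proportional deposit, fit in a short sketch. The toy grid planner below is our own minimal illustration; the grid encoding, parameter values, and deposit rule are expository assumptions, not taken from any of the cited works.

```python
import random

def aco_grid_path(grid, start, goal, n_ants=20, n_iters=30,
                  alpha=1.0, beta=2.0, rho=0.5, q=1.0, seed=0):
    """Toy ACO on a 4-connected grid (0 = free cell, 1 = obstacle)."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])

    def neighbors(c):
        r, cc = c
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, cc + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                yield (nr, nc)

    tau = {}                                  # pheromone per directed edge
    def pher(a, b): return tau.get((a, b), 1.0)
    def heur(c):                              # inverse Manhattan distance to goal
        return 1.0 / (abs(c[0] - goal[0]) + abs(c[1] - goal[1]) + 1)

    best = None
    for _ in range(n_iters):
        paths = []
        for _ in range(n_ants):
            node, path, seen = start, [start], {start}
            while node != goal and len(path) < rows * cols:
                cand = [n for n in neighbors(node) if n not in seen]
                if not cand:
                    break                     # ant is trapped; discard its walk
                w = [pher(node, n) ** alpha * heur(n) ** beta for n in cand]
                node = rng.choices(cand, weights=w)[0]
                path.append(node); seen.add(node)
            if node == goal:
                paths.append(path)
                if best is None or len(path) < len(best):
                    best = path
        for e in list(tau):                   # evaporation
            tau[e] *= (1 - rho)
        for p in paths:                       # deposit proportional to quality
            for a, b in zip(p, p[1:]):
                tau[(a, b)] = tau.get((a, b), 1.0) + q / len(p)
    return best
```

A practical planner would add on top of this skeleton the refinements catalogued in Table 3 (rank-based updates, pruning, adaptive parameters, or parallel colonies).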
Table 4. Survey of Firefly Algorithm (FA)-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[96] | 2018 | Firefly Algorithm for shortest and smooth trajectory optimization | Mobile robot (simulation) | Reduced path length with smooth and feasible trajectories
[97] | 2018 | Dynamic FA-based planner for moving obstacles and goals | Mobile robot (dynamic environment) | Effective path generation and obstacle avoidance
[98] | 2021 | Fuzzy-FA hybrid using Sobol initialization and dynamic displacement | Mobile robot (simulation) | Improved convergence speed and adaptability
[99] | 2021 | D–FA hybrid with quadratic factor for obstacle avoidance | Wheeled robot (simulation) | Safer paths and faster obstacle detection
[100] | 2022 | Hybrid GA–FA planner for 2D/3D navigation | Mobile robot (multi-domain) | Mitigated local optima and enhanced computational efficiency
[101] | 2022 | GAFA hybrid validated on Khepera-II platform | Khepera-II robot | Reliable navigation and reduced routing time in cluttered settings
[93] | 2022 | Comprehensive survey of FA variants and hybrids | Various AMR systems | Overview of FA’s evolution and hybridization trends
[102] | 2024 | Enhanced FA (EFA) with linear α reduction | Mobile robot (multiple maps) | 10% shorter paths and reduced variance in path length
[103] | 2023 | Functional FA (FFA) with choice-based functions and constraints | 2D/3D dynamic environments | Efficient and collision-free navigation
[104] | 2024 | FA-SMC hybrid for real-time control and path optimization | Wheeled mobile robot | Improved trajectory tracking and reduced error vs. PSO/TLBO
[105] | 2024 | ERRT–FA hybrid combining FA with Rapidly Exploring Random Tree | AMR (complex static/dynamic) | Faster exploration and shorter paths
[106] | 2024 | Hybrid FA–Cuckoo Search (HAC) planner | Mobile robot (simulation) | Enhanced obstacle avoidance and global path efficiency
[107] | 2024 | Whale–Firefly Optimization Algorithm (FWOA) with opposition-based learning | AMR (static complex environment) | Improved exploration, convergence, and multi-population diversity
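Every FA variant in Table 4 modifies some part of the same core update: a dimmer firefly moves toward a brighter one with attractiveness β₀·exp(−γr²) plus a randomization step. The sketch below is an illustrative minimizer over a generic cost function; the population size, iteration budget, and step-decay schedule are our own assumptions, not drawn from any cited paper.

```python
import math
import random

def firefly_minimize(f, dim, n=15, iters=60, beta0=1.0, gamma=1.0,
                     alpha=0.2, bounds=(-5.0, 5.0), seed=0):
    """Toy Firefly Algorithm: brighter (lower-cost) fireflies attract dimmer ones."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    cost = [f(x) for x in X]
    for t in range(iters):
        step = alpha * (0.97 ** t)            # shrinking random perturbation
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:         # j is brighter: move i toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    X[i] = [min(hi, max(lo, xi + beta * (xj - xi)
                                            + step * (rng.random() - 0.5)))
                            for xi, xj in zip(X[i], X[j])]
                    cost[i] = f(X[i])
    k = min(range(n), key=lambda i: cost[i])
    return X[k], cost[k]
```

In a path-planning setting, `f` would typically score a waypoint vector by path length plus obstacle and smoothness penalties, as in the hybrids surveyed above.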
Table 5. Survey of Grey Wolf Optimization (GWO)-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[109] | 2019 | Improved GWO with evolution and elimination strategies | Mobile robot (simulation) | Faster convergence and improved global search performance
[110] | 2020 | Hybrid GWO–Symbiotic Organism Search (SOS) | AMR in cluttered environments | Enhanced detection capability and convergence speed
[111] | 2021 | Improved GWO with variable weighting factor for 3D UAV path planning | UAV (3D environment) | Mitigated waypoint dispersion and improved trajectory stability
[112] | 2021 | Expanded and Incremental Adaptive GWO variants | UAV (farm environment) | Balanced exploration and exploitation; improved path feasibility
[113] | 2022 | Adaptive GWO for multi-obstacle navigation | Mobile robot (simulation) | Generated accident-free, time-efficient, and cost-effective paths
[114] | 2022 | Hybrid GWO–WOA with Dynamic Window Approach (DWA) | AMR (dynamic environment) | Combined global and local planning; improved obstacle avoidance
[43] | 2024 | Improved GWO for static indoor navigation | Two-wheeled AMR (LiDAR, Raspberry Pi) | Real-world validation; efficient path generation in structured maps
[115] | 2023 | Improved GWO with adaptive weighting functions (IGWO-WFs) | UAV (simulation) | Smoother trajectories and stable convergence
[116] | 2023 | Hybrid Improved GWO (HI-GWO) with chaotic mapping, Lévy flight, and golden sine | AMR (simulation) | Enhanced accuracy, robustness, and convergence rate
[117] | 2025 | Piecewise Adaptive GWO with DWA (PAGWO-IDWA) | Mobile robot (dynamic environment) | Reduced turns by 33% and runtime by 30% vs. standard GWO
[118] | 2025 | Improved GWO with cooperative predation and Lens Opposition-Based Learning | UAV (dynamic navigation) | Improved exploration, reduced local trapping, and shorter trajectories
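The position update common to these GWO variants moves each wolf toward an average of the three current best solutions (alpha, beta, delta) under a control parameter `a` that decays from 2 to 0, shifting the pack from exploration to exploitation. A hedged sketch follows; population size, iteration count, and bounds are illustrative assumptions.

```python
import random

def gwo_minimize(f, dim, n=12, iters=80, bounds=(-5.0, 5.0), seed=0):
    """Toy Grey Wolf Optimizer: the pack follows the alpha, beta, and delta wolves."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    for t in range(iters):
        ranked = sorted(wolves, key=f)
        leaders = ranked[:3]                  # alpha, beta, delta
        a = 2.0 * (1 - t / iters)             # control parameter decays 2 -> 0
        for i, w in enumerate(wolves):
            new = []
            for d in range(dim):
                xs = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a   # encircling coefficient
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])  # distance to the leader
                    xs.append(leader[d] - A * D)
                new.append(min(hi, max(lo, sum(xs) / 3.0)))
            wolves[i] = new
    best = min(wolves, key=f)
    return best, f(best)
```

As with the FA sketch, a path-planning cost function over waypoint coordinates slots directly into `f`.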
Table 6. Survey of Fuzzy Logic (FL)-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[130] | 2017 | Fuzzy controller integrated with VFH | Wheeled mobile robot (WMR) | Reactive obstacle avoidance using safe-zone concepts
[133] | 2018 | FL-based obstacle recognition and avoidance | Sensor-equipped WMR | Reliable collision avoidance in cluttered environments
[131] | 2018 | Mamdani-type fuzzy inference system | Mobile robot | Near-optimal path with reduced time and energy
[132] | 2019 | Improved APF + dual fuzzy controllers (heading/stepping) | WMR in complex environment | Smooth trajectories and reduced local minima effects
[134] | 2020 | FL-based AHP for multi-objective planning | Omnidirectional WMR | Balanced trade-offs between path length, rotation, and safety
[135] | 2020 | A* + FL-AHP + bio-inspired BLS controller | Dynamic environment robot | Resilient, rapid control with adaptive decision-making
[136] | 2021 | Fuzzy-tuned heuristic parameters in improved ACO | Mobile robot (simulation) | Faster convergence and stable path generation
[137] | 2022 | Hybrid global–local planner (APF + fuzzy DWA) | AMR in dynamic environments | Improved local obstacle avoidance via fuzzy hazard evaluation
[138] | 2022 | FL + multi-objective PSO framework for 3D navigation | Car-like mobile robot | Energy-efficient, smooth, and collision-free 3D path
[139] | 2023 | Fuzzy-enhanced Dynamic Window Approach (FDWA) | WMR in dynamic obstacles | Improved heading accuracy and responsiveness
[140] | 2023 | Adaptive Fuzzy DWA with sub-target generation | AMR (simulation) | Robust navigation avoiding local optima
[141] | 2024 | Takagi–Sugeno fuzzy inference + Kinect-based perception | TurtleBot 2 (indoor) | Reliable real-time navigation with reduced computational cost
[142] | 2024 | DCA + Efficient FL Controller (EFLC) | Pioneer robot in V-REP | Shorter path and runtime than GA, RRT, and ACO
[143] | 2025 | Quad_D*–Fuzzy hybrid (quadtree + D* Lite + FL) | Mobile robot in dense/trap-like environments | 100% success rate and 80% reduction in planning time
[144] | 2025 | Fuzzy A*–quantum multi-stage Q-learning–APF hybrid | AMR in static/dynamic environments | 80% faster learning convergence and smoother collision-free paths
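At the local layer, most fuzzy planners above reduce to rules of the form "IF the obstacle is near AND on the left THEN steer right", defuzzified into a crisp command. The following Sugeno-style sketch illustrates this; the membership shapes, input ranges, and consequent values are our own illustrative choices, not taken from any cited controller.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_steer(dist, bearing):
    """Toy Sugeno-style fuzzy avoidance controller.
    dist: obstacle distance in meters (0..5).
    bearing: obstacle bearing in degrees (-90 = left, +90 = right).
    Returns a steering command in degrees (positive = turn left)."""
    near  = tri(dist, -1.0, 0.0, 2.0)          # obstacle is close
    far   = tri(dist, 1.0, 5.0, 9.0)           # obstacle is far
    left  = tri(bearing, -135.0, -90.0, 0.0)   # obstacle on the left
    right = tri(bearing, 0.0, 90.0, 135.0)     # obstacle on the right
    # Rules: near & left -> turn right; near & right -> turn left; far -> hold course.
    rules = [(min(near, left), -45.0),
             (min(near, right), +45.0),
             (far, 0.0)]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0           # weighted-average defuzzification
```

A Mamdani controller as in [131] would replace the crisp consequents with output fuzzy sets and centroid defuzzification, but the rule structure is the same.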
Table 7. Survey of ANN-based approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[147] | 2018 | ANN control scheme for 3D WMR | Non-holonomic WMR | Effective obstacle avoidance in 3D
[148] | 2018 | Teacher–student ANN for wall-following robot | Simulated Mobotism platform | Steering and path optimization
[149] | 2019 | GRU RNN for path planning in uncharted spaces | Autonomous robot in uncharted environment | Smooth, real-time path planning
[150] | 2019 | Feedforward ANN + potential field hybrid | Autonomous vehicle | Trajectory tracking with uncertainty handling
[151] | 2019 | Recurrent fuzzy NN + Extended Kalman Filter | Autonomous MR | Resilient collision-free navigation
[152] | 2020 | Behavior-based ANN with BP training | Mobile robot | Fast goal-reaching with adaptive control
[153] | 2020 | Recursive ANN predictor for PWM control | Autonomous MR | Improved accuracy and speed
[154] | 2020 | Neural RRT*: CNN-assisted RRT* | Mobile robot (simulation) | Improved exploration efficiency and path optimality
[155] | 2019 | MPNet: deep neural motion planning network with continual learning + RRT* | Various robotic platforms | Near-optimal, collision-free paths in <1 s with generalization to unseen environments
[156] | 2021 | DNN + high-fidelity direct optimization | AGV (industrial) | Robust real-time planning via DNN
[157] | 2021 | ANN with online data update for obstacle avoidance | Mobile robot | Dynamic obstacle avoidance in real time
[158] | 2021 | LSTM for LiDAR-based trajectory prediction | Mobile robot (LiDAR-equipped) | Optimal navigation under dynamic obstacles
[159] | 2022 | NN decision-making for unknown environments | Mobile robot | Robust motion behavior adaptation
[160] | 2022 | Hybrid ANN + FL adaptive control with monitoring layer | AMR (general framework) | Improved localization, stability, and safety
[161] | 2023 | Lightweight dual-input CNN planner with hybrid sampling | TurtleBot, simulated and real | 10× smaller model, 5× faster computation, and competitive accuracy
[162] | 2024 | CNN-based ResNet18 + YOLOv3 for perception-driven navigation | Jetson Nano-based AMR | 98.5% recognition accuracy in dynamic obstacle-rich scenarios
[163] | 2025 | CNN-DQN with exponential decay + B-spline smoothing | Mobile robot (simulation) | Faster convergence, smoother and more efficient paths than DQN
[164] | 2025 | Dual-layer BINN + reward-augmented DWA | Multi-robot (static/dynamic/mixed) | 20% shorter paths, 90% fewer turns than GA, ACO, BINN
[165] | 2025 | Neuro-IWO (NN + Invasive Weed Optimization) | Mobile robot (simulation) | <5% deviation, smoother and shorter paths vs. NN planners
[166] | 2025 | Hybrid Symmetric BINN + Improved GA | AMR on heterogeneous terrain | 11.4% shorter paths, 10 fewer turns in corridor experiments
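At deployment, the ANN planners above are forward passes mapping sensor readings to motion commands. The deliberately tiny sketch below uses hand-set weights (untrained and purely illustrative, not from any cited network) to show the structure: range readings in, bounded steering command out.

```python
import math

def mlp_steer(ranges, W1, b1, W2, b2):
    """Forward pass of a tiny fully connected net mapping range readings
    (left, front, right) to a steering command in [-1, 1]."""
    # Hidden layer with tanh activation.
    h = [math.tanh(sum(w * x for w, x in zip(row, ranges)) + b)
         for row, b in zip(W1, b1)]
    # Single linear output squashed into [-1, 1].
    out = sum(w * x for w, x in zip(W2, h)) + b2
    return math.tanh(out)

# Hand-set weights (illustrative, not trained): steer toward the freer side.
W1 = [[1.0, 0.0, -1.0],    # unit 1: left-minus-right clearance
      [0.0, -1.0, 0.0]]    # unit 2: front blockage
b1 = [0.0, 1.0]
W2 = [1.5, 0.0]
b2 = 0.0
```

A trained planner such as those in Table 7 learns `W1`, `b1`, `W2`, `b2` from data (or via RL), and typically uses far larger, convolutional or recurrent architectures over raw LiDAR or images.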
Table 8. Representative Neuro-Fuzzy (NF) approaches for AMR navigation (2019–2025).
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[170] | 2019 | Mamdani-type fuzzy controller + RL-based NN | Car-like mobile robot | Collision-free navigation in cluttered environments
[171] | 2019 | Multiple adaptive neuro-fuzzy inference system using IR sensors | Mobile robot (simulation) | Real-time navigation, reduced path length, robust against collisions
[172] | 2020 | Adaptive neuro-fuzzy inference system + GPS guidance | AMR with GPS | Collision-free path following toward fixed or moving targets
[173] | 2021 | Adaptive neuro-fuzzy controller | Mobile robot in static obstacle environment | Smoother and shorter paths compared to standalone FL
[174] | 2022 | ANFIS with ultrasonic sensors for obstacle avoidance | AMR with ultrasonic sensors | Dynamic steering control when obstacles detected
[175] | 2022 | ANFIS + GPS + heading sensor hybrid motion planner | Wheeled mobile robot | Robust navigation in crowded/unknown environments, improved global–local control
[176] | 2022 | Neuro-fuzzy system for deliberative and reactive strategies | Mobile robot | Improved corrective decision-making in real time
[179] | 2022 | FCM-MANFIS (clustering + ANFIS) | Khepera-IV, V-REP simulation | Reduced rule complexity; <9% deviation between simulation and real-world; shorter/faster paths
[177] | 2023 | Neuro-fuzzy optimized with PSO for multi-robot navigation | Multiple mobile robots (crowded setting) | Increased navigation accuracy, reduced path length and duration
[178] | 2024 | Adaptive Neuro-Fuzzy Inference System (ANFIS) | AMR in dynamic environments | Reduced computational complexity, smoother trajectories
[180] | 2025 | Subtractive clustering + fuzzy set merging + ANFIS | AMR (simulation) | RMSE 0.0442 with only 5 rules; outperformed GPS-ANFIS, CS-ANFIS, IWO-ANFIS
[181] | 2025 | Type-1, Type-2, Type-3 ANFIS with TLBO training | Turtlebot (Gazebo) | Type-3 ANFIS achieved smoother trajectories, faster convergence, safer navigation
[182] | 2025 | Hybrid A*–DWA–ANFIS PID controller | Ground vehicle | Faster convergence (0.038 s vs. 0.052–0.075 s); efficient dynamic obstacle handling
[166] | 2025 | Hybrid Symmetric BINN + IGA + NF reasoning | Corridor experiments | Reduced path length by 11.4% and turns by 10; robust navigation in heterogeneous terrain
[183] | 2025 | Compact ANFIS (8 rules) with IR + ultrasonic sensing | MATLAB simulations | Collision-free paths up to 20% shorter vs. baseline fuzzy controllers
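The ANFIS forward pass underlying most rows above, Gaussian memberships, rule firing strengths, and a strength-weighted average of consequents, collapses into a few lines for a single input. The rule centers, widths, and consequents below are hypothetical values chosen for exposition only.

```python
import math

def gauss(x, c, s):
    """Gaussian membership function with center c and width s."""
    return math.exp(-((x - c) ** 2) / (2 * s * s))

def anfis_forward(x, rules):
    """Forward pass of a zero-order Sugeno/ANFIS sketch for one input.
    Each rule is (center, sigma, consequent); the output is the
    firing-strength-weighted average of consequents (ANFIS layers 1-5
    collapsed into one expression)."""
    w = [gauss(x, c, s) for c, s, _ in rules]          # firing strengths
    den = sum(w) or 1.0                                # normalization
    return sum(wi * out for wi, (_, _, out) in zip(w, rules)) / den

# Two hypothetical rules mapping obstacle distance (m) to a speed command:
speed_rules = [(0.0, 1.0, 0.1),   # "near" -> slow
               (5.0, 1.0, 1.0)]   # "far"  -> fast
```

Training (the "adaptive" part of ANFIS) fits the centers, widths, and consequents by gradient descent or hybrid least squares; the clustering-based variants above ([179,180]) instead derive the rule base from data to keep it compact.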
Table 9. Survey of Reinforcement Learning approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[195] | 2019 | PPO + DQN + CNN | Multi-robot AMR navigation | Improved perception, robust trajectory generation, faster convergence
[193] | 2020 | Globally guided RL for navigation with dynamic obstacles and robots | Multi-robot setting | Effective navigation and collision avoidance in dynamic multi-agent scenarios
[192] | 2021 | Hybrid RL + fuzzy inference system | Mobile robot (navigation + charging decision) | Reduced training time, improved energy efficiency
[194] | 2021 | PPO + transfer learning for multi-robot path planning | Multi-robot simulation | Stable learning, efficient coordination, reduced training overhead
[190] | 2022 | Q-learning + Artificial Potential Field (APF) | AMR in cluttered environment | Faster learning, improved performance, reduced training time
[191] | 2022 | Augmented DDPG with LSTM, reward shaping, normalization, and mixed noise | AMR (simulation) | Accelerated convergence, better generalization, more efficient paths
[187] | 2023 | Q-learning with improved exploration–exploitation balance | AMR in dynamic environment | Reduced execution time, shorter paths, and improved cost performance
[188] | 2023 | Q-learning with prior knowledge initialization and adaptive greedy factor | Mobile robot (simulation) | Faster convergence and higher success rate toward targets
[189] | 2023 | RL Path Generation with deep Markov model + motion fine-tuning | Mobile robot | Predictive path generation and refined trajectory control
[196] | 2023 | Real-time MDP-based RL planning | AMR in dynamic environment | Robust real-time navigation under uncertainty
[197] | 2024 | Improved Dyna-Q with hippocampus-inspired forward prediction | AMR (simulation) | Better exploration–exploitation balance, enhanced navigation efficiency
[198] | 2024 | Optimized Q-learning (O-QL) with distance-based initialization, hybrid exploration, Gaussian reward shaping | AMR (randomized maps) | Faster convergence, higher cumulative rewards than GA-QL and EnDQN
[144] | 2025 | Hybrid fuzzy A* + quantum multi-stage Q-learning + APF | AMR (simulations with static/dynamic obstacles) | 80% faster learning, smoother and shorter paths, avoided trap deadlocks
[199] | 2025 | A* + DQN + DWA with Bezier smoothing | ROS-based wheeled robots | 99.2% success rate, robust against dynamic obstacles compared to standalone A*, DWA, or DQN
[200] | 2025 | Distributional RL with LIDIA + QR-DQN | AMR in Gazebo | Reduced path length by 13%, smoother trajectories, improved reliability in cluttered maps
[201] | 2025 | ARE-QL (Q-learning + ant colony pheromone guidance + adaptive ε + continuous rewards) | AMR in grid environments | Path length reduced up to 64%, convergence time reduced by >80% compared to Q-learning, IQ-FPA, DRQN
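The tabular Q-learning baseline that methods such as O-QL and ARE-QL improve upon can be sketched on a small grid world. The grid size, reward values, and hyperparameters below are illustrative assumptions; the surveyed variants differ mainly in how they initialize Q, shape rewards, and schedule exploration.

```python
import random

def q_learn_grid(rows=4, cols=4, goal=(3, 3), episodes=400,
                 alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning on an obstacle-free grid: -1 per step, +10 at the goal."""
    rng = random.Random(seed)
    acts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    Q = {}
    def q(s, a):
        return Q.get((s, a), 0.0)
    def step(s, a):
        r, c = s[0] + a[0], s[1] + a[1]
        s2 = (min(rows - 1, max(0, r)), min(cols - 1, max(0, c)))  # walls clip moves
        return s2, (10.0 if s2 == goal else -1.0)
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):
            if rng.random() < eps:                      # epsilon-greedy exploration
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda act: q(s, act))
            s2, r = step(s, a)
            target = r if s2 == goal else r + gamma * max(q(s2, b) for b in acts)
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))  # TD update
            s = s2
            if s == goal:
                break
    return Q
```

After training, the greedy policy `argmax_a Q(s, a)` traces a shortest route to the goal; the table-based improvements in Table 9 target exactly the slow early exploration visible in this baseline.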
Table 10. Survey of Deep Reinforcement Learning (DRL) approaches for AMR path planning.
Ref. | Year | Method | Robot/Platform/Environment | Key Outcome
[208] | 2019 | Frontier-based exploration + DRL | Urban search and rescue robot | Autonomous navigation in uncharted, cluttered terrains
[209] | 2020 | Incremental training strategy for DRL | AMR (simulation) | Improved convergence speed and learning efficiency
[210] | 2020 | Value-function enhanced DRL path planner | AMR in dynamic environments | Adaptive obstacle avoidance with high success rate
[211] | 2021 | DDPG with separate experience replay | Mobile robot | Improved obstacle avoidance and learning from experience
[212] | 2021 | Dueling DQN with end-to-end grid map input | AMR in complex maps | Enhanced decision-making, robust steering commands
[213] | 2021 | DRL-based secure navigation framework | Lunar rover | Safety-constrained planning for space exploration
[215] | 2022 | Improved DQN with enhanced reward and exploration | AMR in unknown environments | Faster learning, improved exploration, optimized state–action mapping
[216] | 2022 | Hierarchical DRL controller (DQN-based) | Wheeled robot in congested environments | Effective obstacle avoidance via sub-action decomposition
[217] | 2022 | Two-layer controller (motion planning + DRL path-tracking) | AMR in unknown terrains | Real-time navigation, deadlock resolution, faster training
[218] | 2022 | DRL + dynamic programming with prioritized replay | Autonomous agent (simulation) | Efficient learning, improved robustness in complex environments
[222] | 2023 | Improved DDQN framework | Physical AMR | Real-world validation, enhanced convergence
[219] | 2024 | DRL with pedestrian danger modeling + virtual robot | AMR in crowd navigation | Smoother and safer trajectories in dense pedestrian environments
[220] | 2024 | AMC-TD3 decentralized DRL | Multi-robot systems | Coordinated multi-robot navigation with robust performance
[221] | 2024 | DRL + improved K-means clustering | Multi-robot coverage tasks | Efficient cooperative coverage path planning
[223] | 2024 | Hybrid A* + DRL framework | Real AMR platform | Real-time deployment, robust navigation in dynamic maps
[128] | 2025 | GAP_SAC with gated attention, prioritized replay, dynamic reward | TurtleBot 3 in ROS–Gazebo | Improved robustness, 90%+ success rate, smoother/shorter paths
[228] | 2025 | Dueling Double DQN (D3QN) variants with depth images and orientation cues | AMR in dynamic unknown environments | 85–90% success in real and simulated crowded scenarios
[229] | 2025 | Hybrid DRL–ANFIS with Tent-based Artificial Hummingbird Algorithm (TAHA) | AMR in unstructured environments | Reduced path length by 15%, computation time by 25%, and energy use
[230] | 2025 | Dual-layer MOSFMO + DRL local planner with composite rewards | AgileX Ranger-Mini 2.0, Gazebo | 92–95% success, 99% on-time arrivals in dense pedestrian traffic
Table 13. Cross-analysis of AMR path planning from a problem-dimension perspective.
Problem Dimension | Key Requirements | Commonly Effective Design Pattern | How to Evaluate
Static, known environment (global routing) | Near-optimal cost; smoothness; clearance; moderate compute | Graph/sampling/metaheuristic global planning + smoothing | Path cost/length; clearance; smoothness; runtime; success rate
Dynamic obstacles (time-varying scene) | Fast replanning; safety under uncertainty; stability | Hierarchical stack: global route + reactive local avoidance; prediction-aware local module | Replanning latency; replanning rate; collision rate; time-to-goal; robustness to sensing delay/noise
Unknown/partially known (online navigation/exploration) | Safety with partial observability; exploration efficiency; robustness | Online mapping + local policy; frontier/information gain; safety shielding | Coverage/time-to-map; safety violations; compute; generalization across layouts
Kinematically constrained (nonholonomic/dynamic feasibility) | Dynamic feasibility; tracking stability; smoothness | Trajectory optimization or constrained refinement (e.g., spline-based) on top of a route/path | Feasibility violation rate; curvature/jerk; tracking error; runtime; safety margin
Narrow passages/cluttered (high constraint density) | Escape local minima; maintain clearance; avoid dead-ends | Diversity-preserving global search + feasibility projection; hybrid refinement | Success rate; clearance distribution; failure cases; sensitivity to initialization
Multi-objective/task-driven (time–energy–risk tradeoffs) | Pareto tradeoffs; constraint satisfaction; interpretability | Multi-objective optimization (e.g., weighted or Pareto) + scenario-specific objectives | Pareto front quality; constraint violations; stability under weight changes; task success rate
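The "How to Evaluate" column recurs across the surveyed papers, and its three most common geometric metrics (path length, cumulative heading change, and minimum obstacle clearance) are straightforward to compute. The sketch below treats paths as 2D waypoint lists and obstacles as points; both encodings are simplifying assumptions for illustration.

```python
import math

def path_metrics(path, obstacles):
    """Three common path-quality metrics for a 2D waypoint path:
    total Euclidean length, cumulative heading change in radians
    (a smoothness proxy), and minimum clearance to point obstacles."""
    # Total path length: sum of segment lengths.
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    # Cumulative absolute heading change between consecutive segments.
    turn = 0.0
    for p, q, r in zip(path, path[1:], path[2:]):
        h1 = math.atan2(q[1] - p[1], q[0] - p[0])
        h2 = math.atan2(r[1] - q[1], r[0] - q[0])
        d = abs(h2 - h1)
        turn += min(d, 2 * math.pi - d)       # wrap-around to [0, pi]
    # Minimum distance from any waypoint to any obstacle point.
    clearance = min((math.dist(p, o) for p in path for o in obstacles),
                    default=float("inf"))
    return length, turn, clearance
```

Real evaluations would also sample along segments (not just at waypoints) for clearance and add the runtime, success-rate, and constraint-violation counts listed in the table.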
Table 14. Overview of strengths and limitations of metaheuristic and AI-based approaches for AMR path planning.
Method | Strengths and Advantages | Limitations and Challenges
Metaheuristic Approaches
PSO | Simple implementation; applicable to both continuous and discrete optimization problems; efficient global search behavior | Convergence speed is sensitive to parameter settings; may suffer from premature convergence in complex environments
GA | Handles complex, multimodal optimization; maintains population diversity for broad exploration | Slow convergence in large search spaces; computationally demanding for real-time applications
ACO | Suitable for graph-based path planning; effective in dynamic and partially known environments | Slow convergence; high memory usage; parameter tuning is nontrivial
FA | Balances exploration and exploitation; useful in multimodal search problems | Performance varies significantly with parameter tuning; effectiveness may degrade in noisy environments
GWO | Fast convergence; effective for continuous path optimization; easy to implement | Limited exploration in complex search spaces; sensitive to population size and parameter scaling
AI-Based Approaches
FL | Robust to sensor noise and uncertainty; interpretable decision-making using linguistic rules | Rule design is heuristic and problem-specific; scalability and performance degrade with complex scenarios
NN | Learns nonlinear mappings for obstacle avoidance and path prediction; generalizes across environments | Requires large training datasets; lacks transparency and interpretability; high computational demand
Neuro-Fuzzy | Combines the adaptability of NNs with the transparency of FL; handles both crisp and fuzzy input | High model complexity; training is sensitive and may lead to overfitting or instability
RL | Learns policies through trial-and-error interaction with the environment; adaptable to dynamic tasks without explicit models | Convergence can be slow; performance is sensitive to reward design; limited generalization beyond trained scenarios
DRL | Learns optimal navigation strategies in dynamic, uncertain environments; enables autonomous adaptation | Training requires extensive data and compute; interpretability of policies remains a significant challenge
Table 15. Practical decision guidance for selecting AMR planning families under common deployment regimes. Recommendations are qualitative and summarize typical choices reported across the surveyed literature.
Operating Regime/Constraint | Typical Suitable Families | Why This Choice Is Common | Key Caution/Failure Mode
Known map; mostly static obstacles; need global route | Graph search/roadmap/sampling (e.g., A*, D*, PRM/RRT*) | High reliability and interpretability; mature toolchain; repeatable performance | May degrade in dense dynamics; requires a good map; may ignore nonholonomic/dynamic limits unless kinodynamic variants are used
Known map but frequent local changes (temporary obstacles, congestion) | Global planner + local reactive layer (e.g., A*/D* + DWA/APF/FL) | Global optimality/structure + fast local reaction; practical for warehouses and indoor navigation | Local minima/oscillation risks; tuning needed; may violate dynamic feasibility without constraints
Unknown/partially known environment; exploration/coverage required | Frontier/exploration methods + local avoidance; RL/DRL for exploration policies (often hybridized) | Handles incomplete maps; learns exploration behaviors; supports information-gain objectives | Sparse rewards and brittleness; safety constraints needed; sim-to-real transfer can fail
Highly dynamic scenes (crowds, moving obstacles); strict real-time response | DRL local avoidance (policy inference) + global planner; or adaptive FL local layer | Fast online inference and responsiveness; handles partial observability when trained appropriately | Training cost and data/simulation fidelity; safety verification/shielding often required
Multi-objective planning (short, smooth, safe, energy-aware) with complex cost terms | Metaheuristics (GA/PSO/ACO/GWO) and hybrid metaheuristic + deterministic | Flexible objective design; can escape local minima; effective in cluttered search spaces | Parameter sensitivity; higher compute; real-time use often requires receding-horizon or parallelization
Tight nonholonomic constraints (Ackermann, differential drive) and/or dynamic feasibility required | Kinodynamic planners/optimization-based methods; constrained sampling; hybrid with feasibility checks | Produces feasible trajectories respecting kinematics/dynamics; better for high-speed or constrained platforms | Computational burden; requires accurate models; may need replanning under disturbances
Safety-critical deployment; need guarantees (hard constraints, certification) | Deterministic planners + constrained optimization; safe RL with shields/CBFs as augmentation | Easier to reason about constraints; can integrate formal checks (collision, speed/accel limits) | Conservatism; may underperform in complex dynamic scenes without learned prediction
Multi-robot coordination (formation, deconfliction, shared goals) | Centralized/distributed planners + MARL (hybrid) depending on comms | Supports coordination objectives and negotiation; handles partial observability in learned settings | Communication limits; non-stationarity in learning; scalability and safety remain open issues
Table 16. Per-decision-step (deployment/inference) online time and space complexity under fixed-size observations with no on-robot learning. Here P_n denotes the policy/value network parameter count, and |S| and A the state- and action-space sizes.
Method | Time Complexity | Space Complexity
Heuristic-guided RL | O(P_n) | O(P_n)
Deep RL | O(P_n) | O(P_n)
Tabular RL | O(A) | O(|S|A)

Share and Cite

MDPI and ACS Style

Aremu, M.B.; Ahmed, G.; Elferik, S.; Saif, A.-W.A. Autonomous Mobile Robot Path Planning Techniques—A Review: Metaheuristic and Cognitive Techniques. Robotics 2026, 15, 23. https://doi.org/10.3390/robotics15010023


