Article

Energy-Aware Spatio-Temporal Multi-Agent Route Planning for AGVs

by
Olena Pavliuk
1,2,* and
Myroslav Mishchuk
1,2
1
Department of Distributed Systems and Informatic Devices, Silesian University of Technology, 44-100 Gliwice, Poland
2
Department of Automated Control Systems, Lviv Polytechnic National University, 79000 Lviv, Ukraine
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(6), 3060; https://doi.org/10.3390/app16063060
Submission received: 18 February 2026 / Revised: 16 March 2026 / Accepted: 20 March 2026 / Published: 22 March 2026
(This article belongs to the Special Issue Autonomous Vehicles and Robotics—2nd Edition)

Abstract

This article addresses the problem of finding the shortest route for Automated Guided Vehicles (AGVs) in a production environment with constrained battery state-of-charge (SoC) and time-dependent operating conditions. The route map is divided into a uniform grid containing stationary obstacles and two types of dynamic obstacles: human, for which AGV transportation is prohibited, and inanimate (moving objects), which impose a penalty function. A key contribution of the proposed methodology is the introduction of a battery residual charge matrix, which embeds cell-level energy feasibility directly into the grid-based environment representation by determining minimum admissible SoC constraints and accounting for transition-dependent energy costs. This matrix restricts the set of traversable cells under low-energy conditions, enabling energy-aware route feasibility evaluation during both initial planning and adaptive replanning. The proposed approach is based on the A* and D* Lite algorithms, providing shortest-path construction that explicitly integrates battery SoC into the spatio-temporal cost function. To avoid collisions in a multi-agent environment during routing, a simplified hybrid scheme with M* elements performs local coordination and adaptive trajectory replanning. The effectiveness of the proposed methodology was assessed using travel time, temporal complexity, and spatial complexity metrics. Simulation results on a  10 × 10  grid showed that agents with sufficient battery completed routes of 8 and 11 cells with travel times of 7.2 to 10.7 conventional units. A critically low-energy agent was initially unable to move, but after adjusting the minimum SoC constraint, all agents completed their routes with travel times up to 11.4 conventional units, demonstrating the direct impact of energy constraints on system performance. 
Additional experiments with varying agent counts and SoC thresholds confirmed reliable balancing of route feasibility and energy constraints across configurations.

1. Introduction

Automated Guided Vehicles (AGVs) are essential components of modern industrial logistics, enabling flexible and efficient material transport in manufacturing and warehouse environments. Effective AGV route planning must address multiple challenges simultaneously: managing energy constraints, avoiding static infrastructure, responding to dynamic obstacles (including personnel), and coordinating multiple agents. While numerous path-planning methods exist, few approaches account for all these factors, particularly the integration of real-time battery state into routing decisions. This gap motivates the development of a comprehensive methodology that balances route optimality, safety, and energy efficiency.
This research proposes a route-planning methodology for a multi-agent AGV system subject to energy and time constraints, accounting for static and dynamic obstacles of various types. The methodology aims to ensure that each agent reliably reaches the target while minimizing route length and travel time and enhancing system robustness by incorporating residual battery charge as a planning constraint through adaptive, cell-level energy thresholds (i.e., a battery charge matrix as an additional component of the environment state). It combines A*, D* Lite, and M* planning components, enabling local optimization and adaptive real-time trajectory replanning. The scientific novelty lies in introducing a dynamic spatio-temporal travel-cost function in which the current battery state-of-charge (SoC) explicitly constrains feasible routes and drives adaptive replanning in multi-agent settings. The effectiveness of the methodology is evaluated based on route execution time, collision count, and total energy expenditure.
The remainder of the paper is structured as follows. Section 2 reviews related work on AGV path planning. Section 3 introduces the mathematical foundation of the proposed hybrid A*/D* Lite/M* methodology for multi-agent AGV route optimization under battery-charge constraints, static and dynamic obstacles, and time limitations. Section 4 describes the experimental setup, including grid configuration, obstacle matrices, and simulation parameters. Section 5 presents numerical results based on simulated AGV trajectories in a 10 × 10 grid environment with personnel and movable-object interference. Section 6 compares the effectiveness of the proposed approach with baseline A*, D*, and M*-based approaches in terms of route optimality, computation time, and adaptability to dynamic environmental changes. Our conclusions are presented in Section 7, where the obtained results are summarized and proposals for future research are outlined.

2. Related Works

A large body of research has been devoted to AGV route planning and navigation in dynamic environments. Existing studies can be broadly divided into several groups depending on the applied planning paradigm, including classical graph-search methods, hybrid planning strategies, AI-based approaches, and solutions that explicitly consider battery constraints. Publications [1,2,3,4] are review articles. In [1], a review on route planning approaches in multi-agent systems is presented, including the classification of A*-based methods, heuristic approaches, biologically inspired techniques, and artificial intelligence (AI)-based algorithms. It is shown that many existing approaches do not account for battery constraints or time-varying moving obstacles. Article [2] focuses on the route planning and control of mobile robots, including AGVs and Autonomous Mobile Robots (AMRs), in logistics environments (warehouses and intralogistics) based on a control model combined with layered planning (task scheduling and motion planning) while considering constraints in complex intralogistics settings. In [3], a review of modern methods of path planning in dynamic environments for various robotic systems is conducted. Sensor modalities, obstacle types, planning strategies (avoid, plan, and replan), and key challenges (adaptation time, inter-agent communication) are analyzed. A classification of planning approaches is provided, and weaknesses are identified, particularly in multi-step planning for dynamic agents. In [4], the authors present a comprehensive survey on path planning methods for dynamic environments. They categorize existing approaches into constraint-based, dynamic, and multi-agent and highlight open research gaps. Particular attention is drawn to the necessity of integrating battery limitations, dynamic obstacles, and inter-agent coordination.
A large group of studies focuses on improving classical heuristic search algorithms for AGV navigation. For example, several works enhance the A* algorithm or combine it with local obstacle-avoidance techniques. In [5], an improved A* algorithm is integrated with the Dynamic Window Approach (DWA) to enable adaptive navigation in environments with static and dynamic obstacles. Similar hybrid approaches combining A* with local motion planning are presented in [6,7,8,9]. An algorithm for planning a route for an AGV operating in a dynamic environment, based on A* and DWA, is proposed in [6]. A feature of this study is the inclusion of route adaptation to dynamic obstacles, accounting for speed and safety constraints. Experimental results show that the proposed hybrid approach exhibits better adaptability and shorter paths than the standalone A* and DWA algorithms. The authors of [7] developed a hybrid algorithm that utilizes kinematic-constraint A* with DWA for an AGV in a dynamic environment. The proposed algorithm reduces the path length by 3.6% and the local planning time by 50% compared with traditional methods. In [8], A*, under AGV kinematic constraints, was combined with local DWA planning in dynamic environments. As a result, the path generated by the global plan became smoother in terms of the number of turns and time compared to that of classical A*. In [9], the authors improved A* with adaptive heuristics and a combination with DWA analysis in environments with different types of obstacles. That is, adaptive heuristics were applied with a search approach to increase efficiency and reduce redundant nodes. The results demonstrate an increase in planning efficiency, achieved by reducing the number of nodes and search time compared to basic A*. Other optimization-based approaches include the use of Ant Colony Optimization (ACO) combined with local planning methods [10]. 
In multi-agent contexts, improved coordination strategies based on A* are presented in [11], while ref. [12] demonstrates dynamic path planning for AGVs tracking moving targets using integrated sensory information. Beyond classical A*-based planning, several extensions have been proposed. Theta* [13] enables any-angle path planning on grid maps, producing shorter and smoother trajectories than grid-constrained planners. Lifelong Planning A* (LPA*) [14] supports incremental replanning when the environment changes, making it well-suited for dynamic settings. Sampling-based methods such as RRT and RRT* [15] efficiently explore large configuration spaces and are widely used in robotics for feasible trajectory generation. Another important research direction involves dynamic replanning algorithms based on the D* family. In particular, D* Lite has been widely applied in environments with changing conditions. In [16], D* Lite is combined with conflict-based search for multi-task routing. The algorithm is further extended in [17], where an improved version of D* Lite is applied to autonomous navigation in unknown dynamic environments. Hybrid frameworks integrating D* Lite with learning-based approaches have also been proposed. For example, ref. [18] combines D* Lite with multi-agent reinforcement learning. Additional studies demonstrate the applicability of D* Lite for multi-agent coordination and trajectory replanning under environmental changes [19,20]. Alternative approaches include improved potential field methods for multi-agent navigation [21]. In addition to classical planning algorithms, recent research has focused on multi-agent pathfinding (MAPF) and coordination strategies for fleets of AGVs operating in shared environments. Modern MAPF algorithms and lifelong multi-agent planning strategies enable efficient conflict resolution and coordinated movement of multiple robots in dense environments [22]. 
Recent studies also apply artificial intelligence and machine-learning techniques to AGV path planning. Publications [23,24,25,26,27,28] explore machine learning, evolutionary optimization, and reinforcement learning approaches. In [23], an optimized shortest-path model for AGVs based on a support vector machine (SVM) was developed, resulting in increased planning efficiency under complex production conditions. In [29], a review of path-planning approaches for multiple mobile robots was conducted, covering classical, heuristic, and AI-based methods. The authors emphasized the trend toward decentralized planning in multi-robot systems and identified key challenges, including real-time operation, resource constraints, and inter-robot coordination. In [24], a dynamic multi-criteria path-planning model based on multi-agent deep reinforcement learning (MADRL) was proposed. It has been shown that agents can reach their targets while simultaneously optimizing multiple criteria, such as time, distance, and the number of turns, in complex environments. In [25], a hybrid AGV path-optimization algorithm was introduced, integrating the evolutionary Grey Wolf Optimizer (GWO) algorithm with local partially matched crossover (PMX) mutation to improve route smoothness and length. Experiments confirmed that the evolved routes are smoother and shorter than those of classical heuristic approaches. The novelty of the approach in [26] lies in combining a learning-based method (a K-L network enhanced with a genetic algorithm) with global planning to achieve smoother navigation paths. The results demonstrate effective avoidance of dynamic obstacles and improved coordination among multiple AGVs. In [27], deep reinforcement learning methods—specifically Deep Deterministic Policy Gradient (DDPG) and Multi-Agent Deep Deterministic Policy Gradient (MADDPG)—were applied for cooperative AGV navigation in dynamic environments with obstacle avoidance. 
The novelty of the method lies in centralized training with decentralized execution and optimized experience buffers. The study shows that agents successfully coordinate and avoid collisions, thereby reducing travel time and conflicts. Paper [28] proposes an intelligent motion-control system for a wheeled mobile robotic platform that integrates navigation, fuzzy logic, and advanced sensor-data processing to ensure adaptive behavior in dynamic environments. The novelty lies in the development of a rule-based fuzzy controller and a remote software implementation, with simulation results demonstrating effective platform motion under varying sensor conditions. Machine learning techniques are also increasingly used for predictive maintenance and energy modeling in AGV systems. For example, neural-network-based models for predicting battery-cell voltage drops are presented in [30,31]. Time-series forecasting models for sensor data analysis in robotic systems are investigated in [32], while transfer learning approaches for intelligent data processing in robotic platforms are studied in [33]. These studies demonstrate the growing role of AI methods in improving operational efficiency and decision-making in autonomous robotic systems. Another important line of research considers the influence of the battery state-of-charge (SoC) on AGV operation. Publications [34,35,36,37] address scheduling and routing problems under battery constraints. These studies demonstrate that incorporating energy constraints into routing decisions can improve operational efficiency and reduce system downtime. In [34], a priority-based scheduling approach for AGVs using the ACO algorithm for collision avoidance is proposed, with the AGVs’ battery level determining the collision avoidance priority. Experimental results confirm that battery-based prioritization reduces conflicts and improves AGV traffic efficiency in a factory. 
The authors of [35] consider the problem of real-time dispatching of an AGV fleet under battery constraints. The proposed solution based on deep reinforcement learning demonstrated improvements in both battery-utilization efficiency and overall system performance. The authors of [36] developed a task-scheduling method for AGVs under battery constraints that combines task redistribution, charging management via a Gurobi/LP module, and optimization using a battery-threshold policy. The results demonstrate industry-applicable solutions that minimize delays and traffic-related costs. In [37], optimization of AGV routing and loading schedules in a container terminal, accounting for load-flow dynamics, is considered. The novelty of the approach lies in a dual-threshold charging strategy adapted to load-flow conditions, integrated with routing and scheduling. The results are verified through simulations, and it is validated that the flexible charging strategy increases AGV availability and reduces downtime. Despite the extensive research in AGV route planning, most existing studies consider only a subset of operational constraints. In particular, the simultaneous integration of energy management, dynamic obstacle avoidance, and stable route execution in multi-agent systems remains insufficiently explored. In many studies, battery considerations are limited to simple threshold policies or charging-point routing, while the spatial distribution of battery resources and their influence on route feasibility are rarely incorporated directly into the planning process.
Based on the conducted literature review, most approaches for AGV route planning can be grouped into the following categories:
1.
Classical heuristic and graph-search methods, including A*, Dijkstra, D*, D* Lite, Theta*, Lifelong Planning A*, etc. They guarantee the existence of an optimal or approximate route. However, these methods suffer from high computational complexity when scaled to large or dynamic environments and generally do not account for the remaining AGV battery charge.
2.
Local optimization methods, including the DWA, Artificial Potential Field (APF), Velocity Obstacle (VO), and their improved variants. They may implicitly account for energy consumption through constraints on maximum speed or acceleration.
3.
Evolutionary and swarm-intelligence methods, such as Genetic Algorithms (GA), ACO, Particle Swarm Optimization (PSO), and GWO. These approaches perform well for multi-criteria tasks but often require many iterations and are unstable in dynamic environments; consequently, they are poorly suited to real-time tasks, prone to local minima, and unable to guarantee a global optimum. They can incorporate energy as an optimization criterion, but only in a simplified manner, without dynamic discharge modeling or constraints on returning to the charging station.
4.
Machine Learning and Reinforcement Learning methods, such as Deep Q-Learning, DDPG, Proximal Policy Optimization (PPO), MADDPG, and Graph Neural Networks (GNNs), for multi-agent planning. These methods enable AGVs to adapt their behavior to environmental changes but require large amounts of training data and substantial computational resources. They can incorporate SoC into the state space. However, literature analysis shows that these methods typically model only a minimum SoC constraint after which the AGV must return to charging and do not address battery degradation or inter-cell charge variation.
5.
Hybrid methods integrate global planners (e.g., A* and D*) with local obstacle-avoidance techniques (e.g., DWA and APF) or combine heuristic methods with reinforcement learning or evolutionary algorithms. They strive to balance speed, adaptability, and accuracy, yet most studies do not jointly consider energy constraints, time efficiency, and the presence of live dynamic obstacles. In such methods, battery considerations are typically limited to pre-planned charging-point routing or are completely ignored in agent-coordination scenarios.
Thus, key research gaps include the inadequate integration of energy management, adaptive dynamic obstacle avoidance, and stable route execution in multi-agent environments.

3. Materials and Methods

In this paper, we propose a hybrid methodology for determining the shortest routes between destinations for AGVs that accounts for limited resources, including:
  • the number of grid cells into which the map of the premises, along with the route that the AGV must travel, is divided;
  • stationary obstacles where AGV transportation is prohibited;
  • dynamic obstacles, which are categorized into human (personnel) and inanimate, such as boxes and pallets, etc., that accidentally appear on the AGV’s path and can be rearranged;
  • the remaining charge of the AGV’s battery;
  • the time spent traveling through the cells.
In the proposed methodology, the generalized dynamic travel-cost function is defined as (1):
$C_{\mathrm{dyn}}(u, v, t) = \alpha E_{u,v} + \beta T_{u,v} + \gamma\, O_v^{\mathrm{static}} + \delta\, O_t^{\mathrm{dyn\_inh}}(v) + M \cdot O_t^{\mathrm{dyn\_human}}(v)$,
where
  • $C_{\mathrm{dyn}}(u, v, t)$ — dynamic transit cost of moving from node u to v at time step t;
  • $E_{u,v}$ — estimated energy consumption for the transition from u to v;
  • $T_{u,v}$ — traversal time for the transition from u to v, including base time, turns, and delays due to dynamic obstacles;
  • $O_v^{\mathrm{static}}$ — static obstacle indicator at target cell v;
  • $O_t^{\mathrm{dyn\_inh}}(v)$ — inanimate dynamic obstacle indicator at cell v at time t;
  • $O_t^{\mathrm{dyn\_human}}(v)$ — human dynamic obstacle indicator at cell v at time t;
  • $\alpha, \beta, \gamma, \delta$ — weighting factors;
  • $M$ — sufficiently large penalty constant (e.g., $10^6$).
The coefficient $\gamma$ determines the penalty associated with static obstacles. At very large values, the cell becomes inadmissible (analogous to a hard constraint, as used in this work), and at moderate values, cells near obstacles can be used, accounting for risk or additional costs. The relative value of $\gamma$ compared to $\alpha$ and $\beta$ determines the system priorities. For example, if $\gamma \gg \alpha, \beta$, then safety with respect to static objects dominates, and if $\gamma \approx \alpha \approx \beta$, then obstacles are accounted for as an additional cost factor alongside energy consumption and time.
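As an illustration, the weighted sum in (1) can be sketched directly; the weight values and the helper name `c_dyn` below are illustrative choices, not the parameters used in the experiments:

```python
# Sketch of the generalized dynamic travel-cost function (1).
# alpha, beta, gamma, delta, and M_PEN are illustrative values.
M_PEN = 1e6  # large penalty: cells occupied by personnel become inadmissible

def c_dyn(e_uv, t_uv, static_obs, inh_obs, human_obs,
          alpha=1.0, beta=1.0, gamma=1e6, delta=5.0):
    """Dynamic transit cost of moving from node u into cell v at time t.

    e_uv, t_uv: energy and time components; the *_obs arguments are the
    0/1 indicators for static, inanimate-dynamic, and human obstacles.
    """
    return (alpha * e_uv + beta * t_uv + gamma * static_obs
            + delta * inh_obs + M_PEN * human_obs)
```

With these example weights, a free cell costs the sum of its energy and time terms, an inanimate obstacle adds a finite penalty, and a human-occupied cell is dominated by the large constant, so the planner avoids it.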
Within this framework, A* constructs an initial minimum-cost route, D* Lite updates it online as the environment/battery SoC changes, and M* resolves local multi-agent conflicts when paths interact. Therefore, we introduce these routing algorithms next and then show how they are combined in the hybrid scheme.
Although the core route-planning algorithms are deterministic, the battery residual charge matrix is generated using the deep neural network methodology developed and validated in [30,31]. This predictive module provides cell-level SoC estimates integrated into the environment representation (Section 3.2.3), enabling the planner to evaluate energy feasibility before route execution and during adaptive replanning.

3.1. Algorithms for Finding the Shortest AGV Route

3.1.1. A* Algorithm

A* is the baseline algorithm for finding the shortest AGV route on a graph with edge weights. Its main advantage lies in combining uniform-cost search with greedy best-first search. The basic principle of A* is to compute the cost function (2) for each vertex u:
$f(u) = g(u) + h(u)$,
where
  • $f(u)$ — estimated total path cost through node u;
  • $g(u)$ — accumulated path cost from the start node to node u;
  • $h(u)$ — heuristic estimate of the cost from node u to the target node.
The following steps summarize the standard A* search loop used to expand candidate nodes and update path costs:
1.
Place the starting vertex in the open list; while the open list is not empty, select the vertex u with the smallest $f(u)$. If u is the target vertex, terminate the search; otherwise, move u to the closed list.
2.
For each neighbor v of node u, we calculate the accumulated cost to reach the neighbor node v via node u as (3):
$g(v) = g(u) + C_{\mathrm{dyn}}(u, v)$.
When the time index is clear from context or the environment state is fixed (e.g., during initial A* planning on a static snapshot), we write $C_{\mathrm{dyn}}(u, v)$ for brevity.
3.
Update the priority of affected nodes in the open list and continue by expanding the node with the smallest evaluation function (2).
The A* algorithm guarantees finding the optimal path if its heuristic function is admissible. The time complexity depends on the accuracy of the heuristic $h(u)$.
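The search loop above can be sketched for a 4-connected grid with unit step costs and an admissible Manhattan-distance heuristic; this is a minimal illustration, not the full planner with the dynamic cost (1):

```python
import heapq

def a_star(grid, start, goal):
    """Minimal A* on a 4-connected grid; grid[i][j] == 1 marks a static
    obstacle. Uses Manhattan distance as the admissible heuristic h(u)."""
    n, m = len(grid), len(grid[0])
    h = lambda u: abs(u[0] - goal[0]) + abs(u[1] - goal[1])
    open_list = [(h(start), 0, start, None)]   # entries: (f, g, node, parent)
    parents, g_cost = {}, {start: 0}
    while open_list:
        f, g, u, parent = heapq.heappop(open_list)
        if u in parents:
            continue                            # already expanded (closed list)
        parents[u] = parent
        if u == goal:                           # reconstruct path start -> goal
            path = []
            while u is not None:
                path.append(u)
                u = parents[u]
            return path[::-1]
        i, j = u
        for v in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= v[0] < n and 0 <= v[1] < m and grid[v[0]][v[1]] == 0:
                g_v = g + 1                     # unit traversal cost per free cell
                if g_v < g_cost.get(v, float('inf')):
                    g_cost[v] = g_v
                    heapq.heappush(open_list, (g_v + h(v), g_v, v, u))
    return None                                 # no feasible route
```

A lazy-deletion priority queue stands in for the open list, and the `parents` dictionary doubles as the closed list.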

3.1.2. D* and D* Lite Algorithms

The D* algorithm enables dynamic updates to edge weights or node accessibility during AGV motion. The accumulated cost for each node follows the same relaxation rule as in (3). If the graph changes, only affected nodes are updated. D* Lite works backward from the target to the start. It uses the g-value and rhs-value (a one-step lookahead value of node u used to determine node consistency): if $g(u) = \mathrm{rhs}(u)$, the node is consistent; otherwise, it is inconsistent and requires processing. For D* Lite, the rhs-value is defined as (4):
$\mathrm{rhs}(u) = \min_{v \in \mathrm{Adj}(u)} \big( g(v) + C_{\mathrm{dyn}}(u, v) \big)$.
If a node v has not yet been considered, i.e., is not on the closed list, then a better path has been found; add v to the open list. The key-function used for the node u in the D* Lite priority queue is defined in detail in Section 3.3, Equations (16)–(18). The main steps are:
1.
Initialization: $g(u) = M$ for all nodes, $\mathrm{rhs}(\mathrm{goal}) = 0$. Add the goal to the priority queue.
2.
While the start node is inconsistent or the minimum key in the queue is smaller than the key of the start node, remove the node with the minimum key; update $g(u)$ and its neighbors, maintaining consistency.
3.
During agent movement: if new obstacles appear or weights change, update  rhs  only for affected nodes; recalculate path only for inconsistent nodes.
D* Lite performs fewer computations than D* because it does not need to recalculate the entire graph. It enables optimal, adaptive replanning in dynamic environments for AGVs moving across a map with unpredictable obstacles.
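The consistency test at the core of this replanning scheme can be illustrated as follows; the helper names and the dictionary-based graph encoding are illustrative assumptions, and the full priority-queue machinery (Section 3.3) is omitted:

```python
# Sketch of the D* Lite one-step lookahead (4) and consistency check.
INF = float('inf')

def rhs_value(u, g, cost, goal, neighbors):
    """One-step lookahead value of node u toward the goal: the minimum of
    g(v) + C_dyn(u, v) over the neighbors v of u; 0 at the goal itself."""
    if u == goal:
        return 0
    return min((g.get(v, INF) + cost(u, v) for v in neighbors(u)),
               default=INF)

def is_consistent(u, g, rhs):
    """A node is consistent when g(u) == rhs(u); inconsistent nodes are the
    only ones that require processing after an environment change."""
    return g.get(u, INF) == rhs.get(u, INF)
```

After an edge-cost change, only nodes whose rhs-value diverges from their g-value are reinserted into the queue, which is why D* Lite avoids recomputing the whole graph.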

3.1.3. M* Algorithm

The M* algorithm is an extension of A* for problems with multiple agents moving simultaneously in a shared environment by searching the joint state space only when conflicts occur. For K agents, a joint state is $s = (u_1, u_2, \ldots, u_K)$, where $u_i$ is the current cell of agent i. Conflicts include vertex collisions (same cell at the same time) and edge collisions (agents swapping cells).
The main steps are:
1.
Compute individual shortest paths using $C_{\mathrm{dyn}}$.
2.
Detect conflicts along these paths.
3.
If a conflict is found, temporarily couple only the conflicting agents and replan locally; otherwise, each agent executes the next step of its individual path.
4.
Repeat until all agents reach their goals.
M* performs joint replanning only where needed, reducing the cost of full K-dimensional search.
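Step 2 above, conflict detection, can be sketched as a pairwise scan over time-indexed paths; the function name and the wait-at-goal padding convention are illustrative assumptions:

```python
def find_first_conflict(paths):
    """Detect the first vertex or edge conflict between agent paths.
    Each path is a list of cells indexed by time step; shorter paths are
    padded by letting the agent wait at its goal cell."""
    horizon = max(len(p) for p in paths)
    at = lambda p, t: p[min(t, len(p) - 1)]     # wait at goal after arrival
    for t in range(horizon):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                if at(paths[i], t) == at(paths[j], t):
                    return ('vertex', t, i, j)  # same cell at the same time
                if t + 1 < horizon and \
                   at(paths[i], t) == at(paths[j], t + 1) and \
                   at(paths[i], t + 1) == at(paths[j], t):
                    return ('edge', t, i, j)    # agents swap cells
    return None
```

Only the agents named in the returned tuple need to be coupled for local joint replanning; conflict-free agents keep their individual routes.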

3.2. Setting Constraints to Find the Shortest AGV Route

3.2.1. Grid Discretization

Figure 1 shows the route-map window during AGV movement in the “Navitrol Monitor” environment. Brown indicates stationary obstacles included in the AGV route map (walls, cabinets, and fixed workstations). Green indicates the area the AGV detects during motion using its 2D LiDAR; since this AGV is equipped with only a single front-facing LiDAR, areas behind it remain temporarily invisible. The green and blue circles denote the points between which the AGV moves while performing its tasks; their combination forms a trajectory corresponding to the AGV’s predefined goals. The purple line represents the possible trajectory between waypoints. The yellow square indicates the AGV’s current position on the map, with green arrows indicating the front of the AGV.
The map of the room where the AGV and personnel operate is divided into a grid of uniformly sized squares, with dimensions $n \times m$ in the horizontal and vertical directions, respectively. The grid is represented as a graph $G = (V, E)$, where the vertices V correspond to the cells of the grid, and the edges $E = \{(u, v) \mid u, v \in V,\ v \in \mathrm{Adj}(u)\}$ represent the connections between the four adjacent neighboring cells.
Each node $u \in V$ corresponds to a unique grid cell at position $(i, j)$, where i and j denote the row and column indices, respectively. When defining environment matrices (e.g., $M[i, j]$), the indices $(i, j)$ refer to the grid coordinates of the corresponding node u.
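The grid-to-graph construction can be sketched as follows; the function name is an illustrative assumption, and edges are stored as directed pairs:

```python
def grid_graph(n, m):
    """Build G = (V, E) for an n x m grid: vertices are cell coordinates
    (i, j), and directed edges connect each cell to its 4-adjacent
    neighbors that lie inside the grid."""
    V = [(i, j) for i in range(n) for j in range(m)]
    E = [((i, j), (i + di, j + dj))
         for (i, j) in V
         for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
         if 0 <= i + di < n and 0 <= j + dj < m]
    return V, E
```

For a 2 × 2 grid this yields 4 vertices and 8 directed edges (each of the 4 undirected adjacencies counted in both directions).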

3.2.2. Stationary Obstacles

If, in the matrix of static obstacles, $M[i, j] = 1$, this indicates a stationary obstacle where AGV movement is prohibited (marked on the map with brown lines or dots occupying the corresponding cell). Such obstacles include walls, fixed tables, workstations, fenced areas around various robotic setups, and similar structures. Conversely, $M[i, j] = 0$ represents a free cell where AGV movement is allowed (no brown lines or dots are present). This matrix is static, uniform, and constant across all AGVs. Changes occur only if the room layout is modified or new stationary obstacles are introduced. Next, all constraints must be defined, including spatial limits, masking areas, and proximity restrictions.

3.2.3. The Remaining Charge of the AGV Battery

In the proposed approach, the AGV battery SoC directly affects the route planning process through two mechanisms: the availability of cells in the workspace and the energy component of the travel cost function.
The first mechanism is implemented via a cell-availability mask, $B[i, j]$, which characterizes the minimum fraction of charge required to pass through a particular cell. The constraint is the battery charge mask $B \in ([0, 1] \cup \{\mathrm{None}\})^{n \times m}$, given as an $n \times m$ matrix that determines whether the agent (AGV) can occupy a cell according to energy constraints. $B[i, j] = \mathrm{None}$ if the cell is inaccessible to the agent (mainly due to stationary obstacles). If $B[i, j] = 1$ (maximum battery charge) or, for the remaining SoC, a specific value $B[i, j] \in (0, 1)$, then the cell is accessible. At the same time, energy costs can be taken into account (5):
$A_B = \{(i, j) \mid B_{i,j} \neq \mathrm{None} \wedge B_{i,j} \geq B_{\min}\}$,
where
  • $A_B$ — set of indices $(i, j)$ for which the matrix B has defined values greater than or equal to $B_{\min}$;
  • $B_{\min}$ — minimum threshold value for including an entry in $A_B$ ($B_{\min} = 0.2$ is the minimum technologically permissible discharge threshold for AGV Formica 1, below which the cell is inaccessible).
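The set of energy-admissible cells in (5) can be computed directly from the matrix B; this minimal sketch assumes B is stored as a nested list with `None` marking inaccessible cells:

```python
def accessible_cells(B, B_min=0.2):
    """Set A_B of (5): indices (i, j) whose residual-charge entry is
    defined (not None) and at least the threshold B_min."""
    return {(i, j)
            for i, row in enumerate(B)
            for j, b in enumerate(row)
            if b is not None and b >= B_min}
```

Lowering `B_min` (as in the low-energy-agent experiment described in the abstract) enlarges this set and can restore route feasibility for nearly discharged agents.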
The second mechanism is based on the dynamic function $C_{\mathrm{dyn}}(u, v, t)$, in which the component $E_{u,v}$ represents the energy cost of transitioning between cells. This value accounts for key AGV motion characteristics, including acceleration, braking, required stops, turns, and potential local obstacles that induce additional energy losses. The weight of this component is modulated by the coefficient $\alpha$, which determines the relative importance of energy efficiency compared to other routing factors. A higher value of $\alpha$ encourages the algorithm to favor transitions with lower energy consumption. Consequently, even without an explicit dependence on the battery’s SoC, the energy cost indirectly reflects the AGV’s limited energy resources, leading the AGV to naturally select more energy-efficient trajectories at lower SoC levels.
The procedure for generating the matrix B from predicted battery cell voltages, including the neural network architecture and data preprocessing pipeline, is detailed in Section 4 and illustrated in Figure 2.

3.2.4. Dynamic Obstacles

Two types of dynamic obstacles are considered: human obstacles (personnel), which prohibit AGV passage through the occupied cell, and inanimate obstacles (e.g., pallets, boxes, and moving objects), which do not block the cell permanently but increase the local traversal cost. We introduce the time-dependent occupancy matrices $O_t^{\mathrm{dyn\_human}} \in \{0, 1\}^{n \times m}$ and $O_t^{\mathrm{dyn\_inh}} \in \{0, 1\}^{n \times m}$, where $O_t^{\mathrm{dyn\_human}}(u) = 1$ if a person occupies node u at time t, and $O_t^{\mathrm{dyn\_inh}}(u) = 1$ if an inanimate obstacle occupies node u at time t. The overall dynamic occupancy is (6):
$O_t(u) = O_t^{\mathrm{dyn\_human}}(u) \vee O_t^{\mathrm{dyn\_inh}}(u)$.
A node u is considered traversable at time t if it is not a static obstacle, not occupied by a person, and satisfies the battery-threshold constraint. The corresponding accessibility mask is defined as (7):
$X_t(u) = \mathbf{1}\{M[i, j] = 0\} \cdot \mathbf{1}\{O_t^{\mathrm{dyn\_human}}(u) = 0\} \cdot \mathbf{1}\{B[i, j] \neq \mathrm{None}\} \cdot \mathbf{1}\{B[i, j] \geq B_{\min}\}$,
where $\mathbf{1}\{\cdot\}$ is the indicator function.
In general, movement is allowed only between four horizontally/vertically adjacent cells (no diagonal moves). The 4-neighborhood of a cell  ( i , j ) , constrained to the grid, is defined as (8):
$N_4(i,j) = \{(i \pm 1, j),\ (i, j \pm 1)\} \cap \big([0,n) \times [0,m)\big).$
For example, in cell (2, 2), the 4-neighborhood includes the cells (1, 2), (3, 2), (2, 1), and (2, 3). For modeling purposes, we assume that a dynamic obstacle moves by one cell per time step in one of the four directions (9):
$x(t+1) = x(t) + \Delta, \quad \Delta \in \{(1,0),\ (-1,0),\ (0,1),\ (0,-1)\},$
where  x ( t ) —obstacle position (cell index) at time step t Δ —move to a neighboring cell in one of the four directions.
For a move into a free cell, we set the base traversal cost to 1. Inanimate obstacles impose an additive penalty, whereas human obstacles make the cell inaccessible. The inanimate-obstacle penalty component is represented by (10):
$c(u,v,t) = 1 + \kappa_{\mathrm{inh}} \cdot O_t^{\mathrm{dyn\_inh}}(v),$
where
  • c ( u , v , t ) —traversal cost of moving from cell u to cell v at time t;
  • κ inh —penalty coefficient for inanimate dynamic obstacles;
  • O t dyn _ inh ( v ) —occupancy indicator of inanimate obstacles at cell v at time t.
Equation (10) is a special case of the generalized dynamic cost function (1) under simplified scheduling conditions. It describes a local model of the cost of passing a cell, accounting only for the basic step and the penalty associated with a non-living dynamic obstacle. It is applicable to basic A*-search or to the analysis of dynamic coordination between agents in isolation.

3.2.5. The Time Spent Traveling Through Cells

Because the generalized cost function  C dyn  in (1) explicitly includes the time component  T u , v , it is important to define the traversal-time model used in simulations. The basic time to pass through a single free cell, with no obstacles and no penalties, defined as (11), is computed from the cell geometry and the average AGV speed.
$T_{\mathrm{base}} = \dfrac{L}{v_{\mathrm{AGV}}},$
where L—length of the cell side, and  v AGV —average linear speed of the AGV while moving in the free zone.
However, in real traffic conditions, there are additional time delays with which the total traversal time from u to v can be estimated by the expression (12):
$T_{u,v} = T_{\mathrm{base}} + \Delta T_{\mathrm{rot}} + \Delta T_{\mathrm{obs}},$
where  Δ T rot —additional time for turning (depends on the angle of change of direction of movement);  Δ T obs —additional delay due to the presence of dynamic obstacles.
The delay associated with a turn is described as (13):
$\Delta T_{\mathrm{rot}} = \begin{cases} 0, & \text{if the direction of movement does not change} \\ k_{\mathrm{rot}} \cdot \Delta\Theta, & \text{if a turn occurs}, \end{cases}$
where  k rot —coefficient that takes into account the kinematic features of the AGV;  Δ Θ —angle of change of direction (in radians).
In the presence of dynamic obstacles in neighboring cells, the AGV’s speed decreases, resulting in an increased transition duration (14):
$\Delta T_{\mathrm{obs}} = k_{\mathrm{obs}} \cdot \dfrac{1}{d_{\min}},$
where  k obs —coefficient representing the influence of obstacles on AGV movement;  d min —minimum distance to the nearest dynamic obstacle.
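Equations (11)–(14) can be combined into a single traversal-time routine. This is a hypothetical Python sketch in which the default coefficients k_rot = 0.2 and k_obs = 0.5 match the values used later in the simulations, and the absence of a nearby obstacle is encoded as d_min = None:

```python
def traversal_time(L, v_agv, d_theta, d_min, k_rot=0.2, k_obs=0.5):
    """Total traversal time T_{u,v} = T_base + dT_rot + dT_obs, Eqs. (11)-(14).

    L       -- cell side length
    v_agv   -- average AGV speed in a free zone
    d_theta -- change of heading in radians (0.0 if no turn)
    d_min   -- distance to the nearest dynamic obstacle (None if none nearby)
    """
    t_base = L / v_agv                       # Eq. (11)
    t_rot = k_rot * d_theta                  # Eq. (13): zero when d_theta == 0
    t_obs = k_obs / d_min if d_min else 0.0  # Eq. (14): slowdown near obstacles
    return t_base + t_rot + t_obs            # Eq. (12)
```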

3.3. Proposed Methodology for Finding the Shortest AGV Route

The search for the shortest path proceeds from the starting cell to the goal, accounting for movement costs and the A* heuristic. Let:
  • s—the starting node;
  • q—the goal node;
  • g ( u ) —accumulated cost to reach node u from the start node;
  • h ( u ) —heuristic estimate from node u to the goal q (e.g., the Manhattan distance $|u_x - q_x| + |u_y - q_y|$);
  • f ( u ) —the evaluation function $f(u) = g(u) + h(u)$ used to order the priority queue;
  • c ( u , v ) —cost of moving from node u to its neighbor v.
The relaxation update applied when moving from a node to a neighbor is: if $g(v) > g(u) + c(u,v)$, then $g(v) \leftarrow g(u) + c(u,v)$, written compactly as (15):
$g(v) = \min\!\big(g(v),\ g(u) + c(u,v)\big), \quad \forall (u,v) \in E,\ v \in \mathrm{Adj}(u),$
where E—set of graph edges; $(u,v) \in E$—edge from node u to node v in the graph.
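A self-contained Python sketch of A* search on a 4-connected grid, using the Manhattan heuristic and the relaxation rule (15), may clarify the procedure. The function and parameter names are illustrative, and passable plays the role of the accessibility mask:

```python
import heapq

def a_star(start, goal, passable, cost=lambda u, v: 1):
    """Grid A* with the Manhattan heuristic and the relaxation rule of Eq. (15).

    passable(v) -- predicate playing the role of the accessibility mask
    cost(u, v)  -- transition cost c(u, v); defaults to the unit base cost
    Returns the shortest path as a list of cells, or None if unreachable.
    """
    h = lambda u: abs(u[0] - goal[0]) + abs(u[1] - goal[1])
    g = {start: 0}
    parent = {}
    open_pq = [(h(start), start)]  # priority queue ordered by f = g + h
    closed = set()
    while open_pq:
        _, u = heapq.heappop(open_pq)
        if u == goal:              # reconstruct the path back to the start
            path = [u]
            while u in parent:
                u = parent[u]
                path.append(u)
            return path[::-1]
        if u in closed:
            continue
        closed.add(u)
        for v in ((u[0]+1, u[1]), (u[0]-1, u[1]), (u[0], u[1]+1), (u[0], u[1]-1)):
            if not passable(v):
                continue
            if v not in g or g[v] > g[u] + cost(u, v):  # relaxation, Eq. (15)
                g[v] = g[u] + cost(u, v)
                parent[v] = u
                heapq.heappush(open_pq, (g[v] + h(v), v))
    return None
```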
D* Lite is used for local adaptivity, i.e., it updates the current path when dynamic obstacles change the map (e.g., when people move). A naïve strategy would rerun A* from scratch after each change; however, this is computationally expensive. In its classical form, D* Lite relies on two functions: $\mathrm{rhs}(u)$, defined in (4), and $\mathrm{key}(u)$. The first component of the priority key of node u combines its current best-known cost, one-step lookahead value, heuristic distance from the start position, and map-change modifier, as in (16):
$k_1(u) = \min\!\big(g(u),\ \mathrm{rhs}(u)\big) + h(s_{\mathrm{start}}, u) + k_m,$
where  k m —incremental modifier used to update keys when the map changes;  h ( s start , u ) —heuristic distance from the start position to node u (note that D* Lite searches backward from the goal).
The second component of the priority key of node u is simply the minimum of its current best-known cost and the one-step lookahead estimate (17):
$k_2(u) = \min\!\big(g(u),\ \mathrm{rhs}(u)\big).$
This formula defines the priority key of node u in D* Lite as a tuple  ( k 1 ( u ) , k 2 ( u ) ) , which is used to order nodes in the priority queue based on their estimated cost and heuristic (18):
$\mathrm{key}(u) = \big(k_1(u),\ k_2(u)\big).$
The main recomputation loop continues while condition (19) holds:
$U.\mathrm{topKey}() < \mathrm{key}(s_{\mathrm{start}}) \quad \text{or} \quad \mathrm{rhs}(s_{\mathrm{start}}) \neq g(s_{\mathrm{start}}),$
where  U . topKey ( ) —the smallest key in the priority queue.
We take the node u with the minimum $\mathrm{key}(u)$. If $g(u) > \mathrm{rhs}(u)$, where $g(u)$ is the current best-known cost to reach node u and $\mathrm{rhs}(u)$ is the one-step lookahead value estimating that cost, we make the node consistent: $g(u) \leftarrow \mathrm{rhs}(u)$; otherwise, if $g(u) < \mathrm{rhs}(u)$, we set $g(u) \leftarrow +\infty$ and re-expand u together with its affected neighbors.
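The key computation (16)–(18) can be sketched as follows; the dictionary-based g and rhs stores and the function name calculate_key are illustrative. Since Python tuples compare lexicographically, the returned pair orders the priority queue exactly as the key definition requires:

```python
def calculate_key(u, g, rhs, h_start, k_m):
    """Priority key of node u in D* Lite, Eqs. (16)-(18).

    g, rhs  -- dicts mapping nodes to their current / lookahead cost estimates
    h_start -- heuristic distance from the start position to u
    k_m     -- key modifier accumulated as the map changes
    """
    INF = float("inf")
    k2 = min(g.get(u, INF), rhs.get(u, INF))  # Eq. (17)
    k1 = k2 + h_start + k_m                   # Eq. (16)
    return (k1, k2)                           # Eq. (18): compared lexicographically
```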

3.4. Multi-Agent Path Planning with Dynamic Obstacles

To ensure collision-free motion, we prohibit (i) vertex conflicts (two agents occupying the same cell at the same time) and (ii) edge conflicts (two agents swapping cells between consecutive time steps). In addition, movements must respect the static-obstacle mask, the human-obstacle constraint, and the battery-threshold constraint.
For a given agent at cell u and time step t, we restrict candidate moves to feasible neighbors $N_t(u) = \{\, v \in N_4(u) \mid X_t(v) = 1 \,\}$ (in algorithmic expressions, we write $N_4(u)$ to denote the neighborhood of node u located at position (i, j), i.e., $N_4(u) \equiv N_4(i,j)$). The local one-step update can then be written in the standard shortest-path form (20):
$g(v) = \min\!\big(g(v),\ g(u) + c(u,v,t)\big), \quad v \in N_t(u).$
The transition cost $c(u,v,t)$ follows the same interpretation as the dynamic-cost model (base cost plus an inanimate-obstacle penalty, and $+\infty$ for infeasible cells). A compact definition used in simulations is (21):
$c(u,v,t) = \begin{cases} 1, & X_t(v) = 1 \wedge O_t^{\mathrm{dyn\_inh}}(v) = 0 \\ 1 + \kappa_{\mathrm{inh}}, & X_t(v) = 1 \wedge O_t^{\mathrm{dyn\_inh}}(v) = 1 \\ +\infty, & \text{otherwise}, \end{cases}$
where
  • c ( u , v , t ) —transition cost from u to v at time t;
  • X t ( v ) —feasibility indicator of cell v, see (7);
  • O t dyn _ inh ( v ) —inanimate-obstacle indicator at cell v at time t;
  • κ inh —penalty coefficient for inanimate obstacles (in this research, we use  κ inh = 5 ).
Equation (21) generalizes (10) by retaining the same base value of 1, adding a validity check via the accessibility indicator (7), and introducing an infinite cost for infeasible cells. That is, if the cell is admissible ($X_t(v) = 1$), Equation (21) reduces to (10); if $X_t(v) = 0$, then $c(u,v,t) = +\infty$, a hard restriction on passing through the cell.
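A direct transcription of (21) into code, with the feasibility and obstacle indicators passed as Booleans (the names are illustrative):

```python
INF = float("inf")

def transition_cost(feasible, inh_occ, k_inh=5):
    """Transition cost c(u, v, t) of Eq. (21).

    feasible -- X_t(v) indicator from Eq. (7) (True/False)
    inh_occ  -- inanimate-obstacle indicator O_t^dyn_inh(v) (True/False)
    k_inh    -- penalty coefficient (kappa_inh = 5 in this study)
    """
    if not feasible:
        return INF  # hard restriction: static obstacle, person, or low SoC
    return 1 + (k_inh if inh_occ else 0)  # base cost, plus penalty if occupied
```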
The computational complexity of full M* with K agents is approximately $O(b^K)$, where K is the number of agents and b is the average number of possible moves (branching factor) per agent. The simplified version reduces complexity by using an A*-like evaluation function. The cost accumulation and node evaluation follow the same relaxation and priority rules defined in Equations (3) and (2), respectively. The next-step selection function is (22):
$\mathrm{next\_step}(u) = \arg\min_{v \in N_4(u)} f(v),$
where $\mathrm{next\_step}(u)$ is the best neighbor according to its f-value. The set of cells claimed by other agents for the next step is (23):
$\mathrm{occupied\_next}(t) = \bigcup_{i} \{\mathrm{next\_step}(i)\},$
where  occupied _ next ( t ) —set of cells temporarily occupied by other agents at time t. Remaining path length after partial path execution (24):
$\mathrm{remaining\_path\_len} = \mathrm{len}(\mathrm{path\_partial}) - 1,$
where  remaining _ path _ len —remaining steps in partially constructed path. SoC constraint update (25):
$\mathrm{battery\_threshold} = \mathrm{MIN\_BATTERY} + 0.01 \cdot \mathrm{remaining\_path\_len},$
where  battery _ threshold —minimal battery required for the remaining path;  MIN _ BATTERY —allowed minimal battery;  0.01 —coefficient per remaining step. The coefficient 0.01 corresponds to an estimated 1% battery consumption per cell transition, derived empirically from the AGV Formica 1 specifications. Conflict detection with neighboring agents (26):
$\mathrm{conflict\_agents} = \{\, a \mid a \neq i,\ |p_x - q_x| + |p_y - q_y| \leq 1 \,\},$
where
  • conflict _ agents —set of agents in vertex or edge conflict;
  • p x , p y —current agent position;
  • q x , q y —other agent position;
  • $|p_x - q_x| + |p_y - q_y| \leq 1$—adjacency condition.
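Expressions (25) and (26) can be sketched as two small helper functions; the argument names are illustrative, and positions is assumed to map agent identifiers to grid coordinates:

```python
def battery_threshold(min_battery, remaining_path_len, per_step=0.01):
    """Minimum SoC required for the remaining path, Eq. (25).
    per_step ~ 1% battery consumption per cell transition."""
    return min_battery + per_step * remaining_path_len

def conflict_agents(i, positions):
    """Agents in vertex or edge conflict with agent i, Eq. (26):
    any other agent within Manhattan distance 1 of agent i's position."""
    px, py = positions[i]
    return {a for a, (qx, qy) in positions.items()
            if a != i and abs(px - qx) + abs(py - qy) <= 1}
```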
Appending selected next step (27):
$\mathrm{execute\_path}_i = \mathrm{execute\_path}_i \cup \{\mathrm{next\_step}\},$
where  execute _ path i —executed path sequence for agent i.
After all agents execute their moves, the position vector is updated to the new positions, and the simulation clock advances by one discrete step. Boolean goal reached indicator (28):
$\mathrm{reached}_i = \begin{cases} \mathrm{True}, & \text{if } \mathrm{next\_step} = \mathrm{goal}_i \\ \mathrm{False}, & \text{otherwise}, \end{cases}$
Execution results summary (29):
$\mathrm{results}_i = \{\ \mathrm{agent}: i,\ \mathrm{path}: \mathrm{execute\_path}_i,\ \mathrm{visited}: \mathrm{vis\_set}_i,\ \mathrm{vis\_order}: \mathrm{vis\_order}_i,\ \mathrm{reached}: \mathrm{reached}_i,\ \mathrm{stopped\_reason}: \mathrm{stopped\_reason}_i\ \},$
where  results i —final record of agent i, including the path, visited cells, order, reached flag, and stop reason. The total number of synchronous simulation steps is defined as  max _ steps = max i ( len ( execute _ path i ) ) , with discrete time  t = 0 , 1 , 2 , , max _ steps 1 .
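A minimal sketch of the synchronous round-based execution described above, assuming each agent already holds a precomputed path. Vertex conflicts cause a one-step wait, and swaps are implicitly rejected because moves are applied sequentially against the live occupancy set; all names are illustrative, and replanning is omitted for brevity:

```python
def run_agents(paths, max_wait=3):
    """Round-based execution: each agent follows its precomputed path but
    waits one step if its next cell is currently occupied by another agent.
    Returns the executed path (including wait repetitions) of every agent."""
    pos = {a: p[0] for a, p in paths.items()}       # current positions
    idx = {a: 0 for a in paths}                      # progress along each path
    executed = {a: [p[0]] for a, p in paths.items()}
    for _ in range(sum(len(p) for p in paths.values()) + max_wait * len(paths)):
        if all(idx[a] == len(paths[a]) - 1 for a in paths):
            break                                    # every agent reached its goal
        occupied = set(pos.values())
        for a in paths:
            if idx[a] == len(paths[a]) - 1:
                continue                             # already at goal
            nxt = paths[a][idx[a] + 1]
            if nxt in occupied:                      # vertex conflict: wait one step
                executed[a].append(pos[a])
                continue
            occupied.discard(pos[a])
            occupied.add(nxt)
            pos[a] = nxt
            idx[a] += 1
            executed[a].append(nxt)
    return executed
```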
The formulas described above follow the classical structure and operational rules of the A*, D* Lite, and M* algorithms; the key novelty lies in the authors’ modifications: the integration of two types of dynamic obstacles (people and moving objects), the inclusion of each AGV’s remaining SoC and a minimum allowable energy level, and adaptive route replanning that accounts for these factors. These changes ensure that all agents reach their target points while optimally utilizing energy resources in a multifactor dynamic environment.

3.5. The Submodule Architecture

The main route planner receives the following: the AGV’s initial and target coordinates, matrices of static and dynamic obstacles, and a matrix of residual battery charges. The A* submodule is used to quickly build a route in small areas of the map with a relatively small number of variable obstacles. The D* Lite submodule is used to adapt the route in dynamic environments when moving objects appear or battery data are updated. The M* submodule is used to coordinate multiple AGVs when routes intersect or conflicts must be avoided.
Each submodule returns a route, predicted time, and charge parameters. The main planner compares alternatives and selects a path that minimizes the total cost function, ensures the person’s safety (with a complete stop of the AGV) and safety from other dynamic obstacles, and guarantees sufficient battery balance. At each step of the AGV’s motion, the main planner checks the battery state and dynamic obstacles. In case of a conflict, the route is reconstructed by calling D* Lite and M* again.

4. Experimental Setup

Let us set the constraints for the traffic map shown in Figure 1, namely the part of the 10 × 10 grid that covers points 23, 25, 26, and 30 (blue circles), over which the AGVs traveled. The stationary constraint matrix then has the form (30):
$M = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$
In the same way, matrices are created for other parts of the grid. The map is partitioned so that, if all AGVs are in a single part, the entire map need not be recalculated. In this case, it is advisable to recalculate the shortest route only within this part of the map.
A key contribution is the introduction of the battery residual charge matrix of each AGV, which encodes cell-level energy feasibility and defines minimum admissible SoC constraints for route planning. In this matrix, None values are placed at the locations of stationary obstacles (entries equal to 1 in the stationary-obstacle matrix), since computing a battery-reserve charge there is meaningless: AGV passage is prohibited. In the remaining cells, values are assigned using various methods, with neural networks and hybrid methods being the most effective; the authors proposed such short- and medium-term forecasting methods in previous publications [30,31,32]. The residual charge calculated for each cell is normalized to the absolute maximum value in mV.
Residual charge values are normalized to [0, 1] using the maximum capacity of each AGV battery, reflecting both battery characteristics and degradation. This normalization describes the discharge more accurately, accounting for the battery’s characteristics and condition as well as the operating modes of a particular AGV. Figure 2 presents the block diagram of the battery residual matrix generation pipeline, adapted from the neural network methodology proposed and validated in [30,31]. The DNN and RNN architectures shown in the diagram, including their hyperparameters and evaluation metrics, were developed in the authors’ prior work. The present study applies these trained models to generate cell-level SoC predictions for route planning; for details on model training and validation, the reader is referred to [30,31].
For each agent, a normalized matrix of the remaining battery charge is calculated, representing the predicted SoC at each grid cell. These predictions are obtained using the trained neural network models from [30,31] and are updated when checking movement feasibility and during route replanning in each simulation round, as illustrated in Figure 2. Additionally, the matrix is recalculated for agents waiting in a conflict situation until another agent vacates the cell. If the remaining charge is not enough to cover the route or reach the goal, the agent stops without starting a new route. This allows the algorithm to take into account the actual SOC to make a decision on movement or stopping.
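As a sketch, the per-agent matrix can be assembled as follows, with the trained predictor of [30,31] abstracted as a callable predict_soc(i, j) returning a raw (e.g., mV-scale) value; the function name and the normalization by max_capacity are illustrative assumptions:

```python
def build_soc_matrix(M, predict_soc, max_capacity):
    """Assemble the residual-charge matrix B: None at static obstacles
    (M[i][j] == 1), otherwise the predicted value normalized to [0, 1] by
    the battery's maximum capacity. `predict_soc` abstracts the trained
    neural-network predictor of the cited prior work."""
    n, m = len(M), len(M[0])
    return [[None if M[i][j] == 1
             else min(1.0, predict_soc(i, j) / max_capacity)
             for j in range(m)] for i in range(n)]
```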
The total energy consumption of an AGV system is primarily determined by higher-level mission strategies such as task distribution, scheduling, and charging policies. The proposed method complements these mechanisms by implementing energy-aware decision-making at the path-planning level: by including the SoC matrix in the environment representation, the planner avoids routes that would discharge the battery below a minimum threshold, thereby improving mission reliability.
An important feature of the proposed method is that the battery state is not treated only as a global constraint but is incorporated directly into the spatial representation of the environment through a battery charge matrix. This allows the planner to evaluate route feasibility at the cell level before execution and prevents agents from initiating paths that cannot be completed with the available energy reserve. As a result, the planning process simultaneously considers geometric feasibility and energy reachability, extending conventional multi-agent path planning approaches.
For example, three AGVs run within the allowed cells. The first, second, and third agents start from cells (0, 4), (0, 6), and (0, 3), respectively, and travel to cells (4, 1), (9, 7), and (9, 1). For the first AGV, the discharge matrix of the battery, assuming a full charge, has the form (31):
$B_1 = \begin{bmatrix}
0.5 & 0.46 & 0.47 & 0.48 & 0.5 & \text{None} & 0.46 & 0.49 & \text{None} & 0.49 \\
0.49 & 0.43 & 0.46 & 0.48 & 0.46 & \text{None} & 0.39 & 0.46 & \text{None} & \text{None} \\
0.47 & 0.43 & 0.44 & 0.47 & 0.46 & \text{None} & 0.39 & 0.43 & \text{None} & 0.41 \\
0.46 & 0.42 & 0.41 & 0.41 & 0.45 & \text{None} & 0.32 & 0.41 & \text{None} & 0.39 \\
0.46 & 0.40 & 0.35 & 0.35 & 0.42 & \text{None} & 0.31 & 0.38 & \text{None} & 0.32 \\
0.45 & 0.36 & \text{None} & \text{None} & \text{None} & \text{None} & 0.31 & 0.36 & \text{None} & 0.29 \\
0.39 & 0.35 & \text{None} & 0.29 & 0.40 & 0.46 & 0.29 & 0.34 & 0.48 & \text{None} \\
0.38 & 0.32 & \text{None} & 0.28 & 0.35 & 0.31 & 0.27 & 0.22 & 0.37 & \text{None} \\
0.34 & 0.32 & \text{None} & 0.27 & 0.34 & 0.23 & 0.22 & 0.21 & 0.33 & \text{None} \\
0.30 & 0.24 & \text{None} & 0.25 & 0.25 & 0.21 & 0.20 & 0.21 & 0.26 & \text{None}
\end{bmatrix}$
For the second AGV, the discharge matrix of the accumulator battery, assuming a full charge, will have the form (32):
$B_2 = \begin{bmatrix}
0.60 & 0.59 & 0.58 & 0.57 & 0.56 & \text{None} & 0.55 & 0.54 & \text{None} & 0.53 \\
0.59 & 0.58 & 0.57 & 0.56 & 0.55 & \text{None} & 0.54 & 0.53 & \text{None} & \text{None} \\
0.58 & 0.57 & 0.56 & 0.55 & 0.54 & \text{None} & 0.53 & 0.52 & \text{None} & 0.51 \\
0.57 & 0.56 & 0.55 & 0.54 & 0.53 & \text{None} & 0.52 & 0.51 & \text{None} & 0.50 \\
0.56 & 0.55 & 0.54 & 0.53 & 0.52 & \text{None} & 0.51 & 0.50 & \text{None} & 0.50 \\
0.55 & 0.54 & \text{None} & \text{None} & \text{None} & \text{None} & 0.51 & 0.50 & \text{None} & 0.50 \\
0.54 & 0.53 & \text{None} & 0.52 & 0.51 & 0.50 & 0.50 & 0.50 & 0.50 & \text{None} \\
0.53 & 0.52 & \text{None} & 0.51 & 0.50 & 0.50 & 0.50 & 0.50 & 0.50 & \text{None} \\
0.52 & 0.51 & \text{None} & 0.50 & 0.50 & 0.50 & 0.50 & 0.50 & 0.50 & \text{None} \\
0.51 & 0.50 & \text{None} & 0.50 & 0.50 & 0.50 & 0.50 & 0.50 & 0.50 & \text{None}
\end{bmatrix}$
For the third AGV, the battery discharge matrix, taking into account the full charge, will have the form (33):
$B_3 = \begin{bmatrix}
0.90 & 0.89 & 0.88 & 0.87 & 0.86 & \text{None} & 0.85 & 0.84 & \text{None} & 0.83 \\
0.82 & 0.81 & 0.80 & 0.79 & 0.78 & \text{None} & 0.77 & 0.76 & \text{None} & \text{None} \\
0.75 & 0.74 & 0.73 & 0.72 & 0.71 & \text{None} & 0.70 & 0.69 & \text{None} & 0.68 \\
0.67 & 0.66 & 0.65 & 0.64 & 0.63 & \text{None} & 0.62 & 0.61 & \text{None} & 0.60 \\
0.59 & 0.58 & 0.57 & 0.56 & 0.55 & \text{None} & 0.54 & 0.53 & \text{None} & 0.52 \\
0.51 & 0.50 & \text{None} & \text{None} & \text{None} & \text{None} & 0.49 & 0.48 & \text{None} & 0.47 \\
0.46 & 0.45 & \text{None} & 0.44 & 0.43 & 0.42 & 0.41 & 0.40 & 0.39 & \text{None} \\
0.38 & 0.37 & \text{None} & 0.36 & 0.35 & 0.34 & 0.33 & 0.32 & 0.31 & \text{None} \\
0.30 & 0.29 & \text{None} & 0.28 & 0.27 & 0.26 & 0.25 & 0.24 & 0.23 & \text{None} \\
0.28 & 0.29 & \text{None} & 0.20 & 0.19 & 0.18 & 0.17 & 0.16 & 0.15 & \text{None}
\end{bmatrix}$
Dynamic obstacles are categorized as human (personnel) and inanimate (moving objects), with human obstacles prohibiting AGV passage and inanimate obstacles imposing a penalty function. Let two employees move cyclically along a route. The first starts at cell (3, 3), where cell numbering begins at zero, and proceeds along the sequence of cells (3, 3), (3, 4), (4, 4), (4, 3), (4, 2), (4, 1), (3, 1), and (3, 2). The second employee starts in cell (6, 6) and moves cyclically through the cells (6, 6), (7, 6), (8, 6), (9, 6), (8, 6), and (7, 6). Two inanimate dynamic obstacles move along the routes (7, 5)–(7, 4) and (1, 7)–(1, 6), respectively. To determine a person’s position in space within the map grid, the methods developed by the authors [33] can be used.
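The cyclic obstacle motion used in the experiments reduces to indexing a closed route modulo its length; a minimal sketch:

```python
def obstacle_position(route, t):
    """Position of a cyclically moving dynamic obstacle at time step t:
    the obstacle advances one cell per step along its closed route."""
    return route[t % len(route)]
```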
  • Empirically determined parameters:
  • the coefficient of influence of the turn, which takes into account the delay of the AGV when changing direction;
  • the coefficient of influence of dynamic obstacles, which estimates the additional time due to the proximity of moving objects;
  • the basic time of passage of a free cell, determined based on the average speed of the AGV.
  • Conditionally accepted parameters:
  • cell side length, chosen for the unification of calculations;
  • the minimum allowable battery charge level, taken as a safety threshold;
  • penalties for passing through cells with moving objects, determined as conditional values for modeling the difficulty of passage.
Each AGV locally evaluates the dynamic cell cost, including residual SoC, static obstacles, and penalties from dynamic obstacles, enabling adaptive local route planning. The deep neural network-based battery remaining prediction module is executed locally on each agent, and the resulting neural network weights can be transmitted to a central server for aggregation. Each AGV can then update its transit–time matrix, assess the risks posed by dynamic obstacles, and construct optimal routes based on updated energy consumption and environmental forecasts. This approach reduces traffic between the AGV and the server and enhances data confidentiality. It also enables dynamic, real-time route adaptation that accounts for energy constraints and environmental changes. This architecture demonstrates how AI-enhanced predictive modules can complement classical path-planning algorithms. By forecasting battery behavior and environmental dynamics, such models provide additional information that improves the robustness and adaptability of route planning without replacing the deterministic search framework.

5. Results

The simulation was executed using the parameters defined in the experimental setup: cell side length of 1.0 (conventional length unit), average AGV speed of 1.0 (conventional length unit per time unit), turning-angle influence coefficient of 0.2, and dynamic-obstacle influence coefficient of 0.5. The methodology’s effectiveness is evaluated using three metrics: travel time, spatial complexity, and temporal complexity. The resulting paths are depicted in Figure A1.
In Figure A1, black indicates stationary obstacles where AGV passage is prohibited. These cells are assigned a value of 1 in the static-obstacle matrix and are marked as None in the predicted residual battery-charge matrices. Purple, blue, and turquoise represent three AGVs, while green denotes their target points. Yellow and red indicate dynamic obstacles: red denotes people; if an AGV and a person would occupy the same cell, passage is prohibited, forcing the AGV to wait until the cell is released. Yellow represents inanimate dynamic obstacles that impose a penalty for passing through the cell. If the penalty value is high (close to 5), the path is planned to circumvent the obstacle; if it is low, the AGV may traverse the cell, waiting if it is occupied upon approach.
The methodology prevents AGVs from occupying the same cell simultaneously, applying local conflict resolution as in M* to replan trajectories when conflicts occur. If an AGV encounters another AGV or a dynamic obstacle, it waits one step for the occupied cell to be released. If the cell remains occupied, a new “detour route” is constructed. As observed, Agent 3 did not move because, after attempting to build a route to the target cell, it became evident that the remaining SoC would be insufficient (Table 1).
Additional simulations with varying numbers of AGVs confirmed the effectiveness of the proposed approach in maintaining the shortest paths while respecting energy and obstacle constraints. A comparison of the results obtained with the proposed methodology against other methods for finding the shortest route under constraints is presented in Table 2.
Analyzing the results presented in Table 2, it is evident that different route-planning methods differ significantly in efficiency and reliability in achieving their goals. Given that the primary requirement is for each agent to follow the shortest route while ensuring sufficient remaining SoC and accounting for obstacles, the following conclusions can be drawn.
The A* and Cooperative A* methods guarantee reliable goal achievement for all agents, albeit at the cost of evaluating a significantly larger number of cells during route planning. The D* Lite method demonstrates a balance between efficiency and computational cost, enabling agents to reach their goals while considering a moderate number of cells and adapting routes in dynamic environments. However, the resulting routes are not highly optimized. The M* method proved to be the most economical in terms of the number of evaluated cells, but it exhibits instability; in particular, Agent 2 was unable to reach its goal, making this method risky for guaranteed route planning. The proposed methodology ensures goal achievement along the shortest routes, although the number of evaluated cells is slightly higher than that of the most economical methods. Nevertheless, it is the only approach that guarantees the shortest route for all agents, accounts for both dynamic and static obstacles, and respects the limit on the remaining SoC. Furthermore, the maximum number of evaluated cells is moderate compared with that of all methods. Therefore, the proposed methodology represents a compromise solution between efficiency, reliability, and computational cost.
The metric “Number of Cells Examined” reported in Table 2 does not directly correspond to computational complexity or planning time. It reflects the number of grid cells evaluated during the search process, characterizing the size of the explored search space. This metric serves as an indicator of algorithmic workload rather than an exact measure of runtime, since actual planning time also depends on implementation factors such as data structures and hardware. The proposed method achieves a moderate number of examined cells, balancing shortest-path guarantees with reasonable computational effort.
To assess the effect of the number of agents on the operation of the AGV system, a series of experiments were conducted. The grid and static obstacles were fixed. Dynamic obstacles cyclically changed their positions along the same route in each experiment. To analyze the scalability of the system, experiments were performed with 1, 2, …, N agents ( N = 4 ). For each number of agents, the corresponding starting and target positions were set. Each experiment was repeated 5 times for different starting and target points and different numbers of agents. This allowed us to reduce the random influence of the order of moves and conflicts between agents. In addition, to systematically assess the effect of energy constraints, simulations were conducted for different values of the minimum state of charge (SoC) threshold. For each number of agents, the metrics of all runs were aggregated by averaging.
Figure 3 presents the dependence of the number of examined cells on the minimum SoC threshold. This metric reflects the size of the explored search space during the path planning process and provides an indirect indication of the computational workload of the planner.
Figure 4 illustrates the relationship between the minimum SoC threshold and three planning performance metrics: average path length, number of cells examined during planning, and planning time. The results show that increasing the minimum SoC threshold generally reduces the likelihood of reaching the goal: agents with insufficient remaining battery charge may refuse to start a route when the predicted energy consumption does not allow them to reach the destination safely, because the residual charge must also be sufficient to reach a charging station.
At moderate threshold values, the number of examined cells may increase because the planner must search for routes that satisfy additional energy constraints. However, when the threshold becomes too high, fewer agents are able to complete their routes. As a result, paths become shorter or incomplete, which may reduce the total number of examined cells and the corresponding planning effort.
As shown in Figure 4, a distinct threshold effect occurs around a minimum SoC of 0.21. Below this threshold, all agent configurations achieve high success rates (100% for 2 agents, 75–100% for 3–4 agents), with stable average path lengths and expected total energy consumption. Above this critical value, the success ratio declines sharply, indicating that a fraction of agents fail to reach their goals (e.g., a success ratio of 0.75 corresponds to 3 out of 4 agents completing their paths). Accordingly, the average path length decreases, since some agents do not start the journey at all: their low residual charge must be reserved for reaching the charging station. The total energy also decreases, reflecting incomplete paths rather than increased efficiency. The analysis further shows that larger teams are more sensitive to higher minimum SoC thresholds, as some agents wait for others to release cells, draining their batteries in the meantime. This leads to a greater decrease in success and path completion. Overall, these results highlight the trade-off between maintaining sufficient battery reserves and ensuring reliable multi-agent task completion, suggesting that overly high minimum SoC thresholds can substantially compromise performance.

6. Discussion

The results demonstrate the effectiveness of the proposed multi-agent AGV planning methodology, which simultaneously considers residual SoC, static and dynamic obstacles, and the time cost of traversing the route. For Agent 1, a stable route was built; it completed the task without stopping, and the remaining SoC after route completion was 0.400 (or 0.35 once waiting steps are included). This indicates the efficient use of energy resources for a short route and the absence of critical situations. A key strength of the proposed approach is its ability to integrate predictive models into the planning framework: AI-based prediction of battery state and environmental changes provides additional information that improves route feasibility assessment and supports adaptive replanning in dynamic industrial environments.
For Agent 2, the longest route (11 cells) was found with a passage time of 10.70 conventional units and a remaining charge of 0.500 (or 0.45 once waiting steps are included). The agent completed the task in full, without reaching a critical energy level, confirming that even with a longer trajectory the algorithm supports stable operation of the energy management system. For Agent 3, the initial simulation recorded a stop due to reaching the minimum allowable battery charge (battery_low, remaining 0.17 < 0.20 for the constructed route). As a result, the route consisted of a single cell, indicating that the energy-limitation criterion operated correctly. After adapting the residual charge matrix (setting the minimum threshold to 0.20), the agent passed 12 cells in 11.40 conventional units, completing the task without stopping. The remaining SoC was 0.24, i.e., above the minimum allowable level. This confirms that adjusting the energy parameters in the residual charge matrix directly affects the reachability of the target coordinates and the total task execution time.
A comparison of the two simulation series showed that even a slight change in the minimum allowable charge level (from  0.17  to  0.20 ) significantly improves the functional reliability of the AGV system, allowing Agent 3 to complete the route without an emergency stop.
Limitations of the current experimental scale: The reported results were obtained under simplifying conditions that constrain the scope of the conclusions. The simulations assumed a fixed fleet size (i.e., a predefined number of AGVs on the map) and a prescribed minimum battery-charge threshold that determines whether motion is permitted. The AGV base speed was treated as constant, so effects such as load-dependent dynamics, acceleration limits, and speed-control policies were not explicitly modeled. The behavior of dynamic obstacles was represented through selected motion characteristics (e.g., speed and cyclic trajectories), which may not fully capture irregular human behavior or non-periodic disturbances typical of real production floors. Finally, the motion model was restricted to 4-connected grid transitions (up/down/left/right), excluding diagonal moves, which may overestimate path length relative to continuous-space navigation. While the experimental scale is limited, the proposed methodology is designed to scale to more complex routes, accommodating a greater number of agents as well as both stationary and dynamic obstacles on higher-dimensional grids. The use of relatively coarse grids in the experiments may lead to sharp turns and somewhat higher energy consumption compared to smooth continuous-space trajectories. However, the proposed approach scales to finer grids and is fully compatible with continuous-space motion, where local trajectory smoothing or motion primitives can reduce sharp maneuvers. The methodology therefore remains applicable to realistic AGV operations and more complex environments.
Future work will focus on extending the framework toward more realistic industrial conditions by (i) incorporating variable-speed and load-dependent energy models; (ii) supporting 8-connected and continuous-space motion primitives; (iii) improving prediction and uncertainty handling for human motion and non-cyclic dynamic obstacles; (iv) validating the approach on larger maps and in real AGV deployments with online sensing and communication constraints; and (v) integrating a local path-smoothing module that reduces sharp turns while accounting for dynamic obstacles and battery limitations. Further studies involving larger AGV fleets, more complex workspaces, and battery degradation data collected under real-world conditions are therefore needed to fully validate the methodology for industrial applications.

7. Conclusions

The results demonstrate the effectiveness of the proposed methodology for route planning in a multi-agent AGV system under energy and time constraints, accounting for static and two types of dynamic obstacles. In light of prior studies, the results confirm that incorporating dynamic battery charge estimation and adaptive thresholds significantly enhances route stability and system robustness. Earlier research on AGV coordination often focused on geometric optimization or obstacle avoidance. However, fewer studies have examined the energy-reliability trade-off in decentralized multi-agent control frameworks. The systematic analysis of the minimum SoC threshold revealed a critical value of approximately 0.21, below which high success rates are maintained across all agent configurations. Above this threshold, success declines sharply, particularly for larger teams where inter-agent waiting increases battery drain. From the perspective of the initial working hypothesis, the results validate that incorporating a dynamic energy constraint into the routing algorithm directly improves both the robustness and functional reliability of AGV fleets. The agents successfully balanced travel time and residual energy, indicating that the proposed model can be effectively applied to real-world industrial scenarios where energy efficiency and operational continuity are critical. Future research should focus on extending the proposed approach to: real-time adaptive control using predictive battery models (e.g., LSTM-based voltage forecasting); federated coordination among heterogeneous AGVs with varying energy consumption profiles; and integration of anomaly detection mechanisms to preemptively identify energy inefficiencies or sensor faults.

Author Contributions

Conceptualization, O.P. and M.M.; methodology, O.P.; software, O.P. and M.M.; validation, O.P. and M.M.; formal analysis, O.P.; investigation, O.P. and M.M.; resources, O.P. and M.M.; data curation, O.P. and M.M.; writing—original draft preparation, O.P. and M.M.; writing—review and editing, O.P. and M.M.; visualization, O.P. and M.M.; supervision, O.P.; project administration, O.P.; funding acquisition, O.P. All authors have read and agreed to the published version of the manuscript.

Funding

Co-funded by the European Union HORIZON TMA MSCA Doctoral Networks/HORIZON-MSCA-2023-DN-01/project TUAI/grant agreement N° 101168344. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or European Research Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. Co-funded by the Statutory Research funds of the Department of Distributed Systems and Informatic Devices, Silesian University of Technology, Gliwice, Poland (Grant BK-244/RAu8/2025 02/110/BK_25/1036).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All of the data presented in this study are available in publicly accessible repositories: neural network prediction of battery discharge, https://github.com/opavliuk1980/AGV_WP3 (accessed on 3 November 2025); human activity recognition, https://github.com/opavliuk1980/HAR (accessed on 19 March 2026); and the methodology for finding the shortest AGV route under limited resources, https://github.com/opavliuk1980/methodology_MDPI (accessed on 19 March 2026).

Acknowledgments

During the preparation of this manuscript, the authors used Claude Opus 4.6 (https://claude.ai/) for the purposes of proofreading and Google NotebookLM (https://notebooklm.google.com/) for generating some visual elements for the graphical abstract. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. The result of finding the shortest AGV path according to the proposed methodology.

Figure 1. Route map view in the “Navitrol Monitor” environment during AGV operation (static map and currently detected area).
Figure 2. Block diagram of the battery residual matrix generation pipeline.
Figure 3. Dependence of the number of examined cells on the minimum SoC threshold.
Figure 4. Impact of the minimum SoC threshold on path planning performance metrics: average path length, number of examined cells, and planning time.
Table 1. The results of constructing the shortest route.
| Agent | Found Path | Stop Reason | Number of Visited Cells | Remaining SoC | Travel Time (Conventional Units) |
|---|---|---|---|---|---|
| 1 | (0, 4), (1, 4), (2, 4), (3, 4), (4, 4), (4, 3), (4, 2), (4, 1) | None | 8 | 0.400 | 7.20 |
| 2 | (0, 6), (0, 7), (1, 7), (2, 7), (3, 7), (4, 7), (5, 7), (6, 7), (7, 7), (8, 7), (9, 7) | None | 11 | 0.500 | 10.70 |
| 3 | (0, 3) | battery_low | 1 | 0.870 | 0 |
Table 2. Comparison of different methods for finding shortest routes with constraints.
| Method | Agent | Number of Cells Passed | Number of Cells Considered in Order |
|---|---|---|---|
| Proposed | 1 | 8 | 61 |
| Proposed | 2 | 11 | 84 |
| Proposed | 3 | 12 | 97 |
| A* | 1 | 8 | 118 |
| A* | 2 | 11 | 74 |
| A* | 3 | 14 | 194 |
| Cooperative A* | 1 | 8 | 126 |
| Cooperative A* | 2 | 11 | 74 |
| Cooperative A* | 3 | 14 | 194 |
| M* | 1 | 8 | 25 |
| M* | 2 | 2 | None |
| M* | 3 | 12 | 34 |
| D* Lite | 1 | 10 | 67 |
| D* Lite | 2 | 11 | 19 |
| D* Lite | 3 | 14 | 33 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Pavliuk, O.; Mishchuk, M. Energy-Aware Spatio-Temporal Multi-Agent Route Planning for AGVs. Appl. Sci. 2026, 16, 3060. https://doi.org/10.3390/app16063060

