Advanced Drone Routing and Scheduling for Emergency Medical Supply Chains in Essex

Sadeghi Esfahlani, Shabnam; Simanjuntak, Sarinova; Sanaei, Alireza; Fraess-Ehrfeld, Alex

doi:10.3390/drones9090664


by Shabnam Sadeghi Esfahlani ¹,*, Sarinova Simanjuntak ¹, Alireza Sanaei ¹ and Alex Fraess-Ehrfeld ²

¹ Faculty of Science and Engineering, School of Engineering and the Built Environment, Anglia Ruskin University, Bishop Hall Lane, Chelmsford CM1 1SQ, UK

² Airborne Robotics, Worting House, Worting Road, Basingstoke RG23 8PY, UK

* Author to whom correspondence should be addressed.

Drones 2025, 9(9), 664; https://doi.org/10.3390/drones9090664

Submission received: 7 August 2025 / Revised: 18 September 2025 / Accepted: 18 September 2025 / Published: 22 September 2025

(This article belongs to the Special Issue Advances in Cartography, Mission Planning, Path Search, and Path Following for Drones)



Highlights

What are the main findings?

RRT*-based hybrids (RRT*-SA and RRT*-ALNS) achieve the shortest mean paths; RRT*-SA attains a co-lowest runtime.
The TWA-MILP reaches proven optimality in 0.11 s; seven UAVs satisfy all 20–30 min windows in a single wave, while a rolling 15 min demand is sustained with three UAVs.

What are the implications of the main findings?

Sub-40-min BVLOS missions are feasible for nearby sites with fewer than ten drones; the farthest destinations are reliably served in 1–2 h, including the return.
Digital-twin validation and CAP 722-compliant trajectories lower certification risk and support scalable rollout.

Abstract

Rapid access to defibrillators, blood products, and time-critical medicines can improve survival, yet urban congestion and fragmented infrastructure delay deliveries. We present and evaluate an end-to-end framework for beyond-visual-line-of-sight (BVLOS) UAV logistics in Essex (UK), integrating (I) strategic depot placement, (II) a hybrid obstacle-aware route planner, and (III) a time-window-aware (TWA) Mixed-Integer Linear Programming (MILP) scheduler coupled to a battery/temperature feasibility model. Four global planners, namely Ant Colony Optimisation (ACO), Genetic Algorithm (GA), Particle Swarm Optimisation (PSO), and Rapidly Exploring Random Tree* (RRT*), are paired with lightweight local refiners, Simulated Annealing (SA) and Adaptive Large-Neighbourhood Search (ALNS). Benchmarks over 12 destinations used real Civil Aviation Authority no-fly zones and energy constraints. RRT*-based hybrids delivered the shortest mean paths: RRT* + SA and RRT* + ALNS tied for the best average length, while RRT* + SA also achieved the co-lowest runtime at $v = 60$ km h⁻¹. The TWA-MILP reached proven optimality in 0.11 s, showing that a minimum of seven UAVs are required to satisfy all 20–30 min delivery windows in a single wave; a rolling demand of one request every 15 min can be sustained with three UAVs if each sortie (including service/recharge) completes within 45 min. To validate against a state-of-the-art operations-research baseline, we also implemented a Vehicle Routing Problem with Time Windows (VRPTW) in Google OR-Tools, confirming that our hybrid planners generate competitive or shorter NFZ-aware routes in complex corridors. Digital-twin validation in AirborneSIM confirmed CAP 722-compliant, flyable trajectories under wind and sensor noise. By hybridising a fast, probabilistically complete sampler (RRT*) with a sub-second refiner (SA/ALNS) and embedding energy-aware scheduling, the framework offers an actionable blueprint for emergency medical UAV networks.

Keywords:

drone; Unmanned Aerial Vehicles (UAVs); supply chain; routing and scheduling; Beyond Visual Line of Sight (BVLOS)

1. Introduction

Unmanned Aerial Vehicles (UAVs) or drones have rapidly evolved from military platforms into versatile tools for civil and commercial applications [1,2,3,4,5,6]. Their exceptional reliability, speed, and agility make them ideally suited to navigate congested urban environments where ground-based infrastructure often proves limiting [7,8,9,10,11]. Fuelled by the explosive growth of e-commerce, UAVs are transforming logistics, challenging traditional delivery paradigms, and opening new frontiers in last-mile distribution [12].

UAVs with efficient logistics management not only accelerate delivery times but also deliver critical benefits, such as reduced traffic congestion, lower carbon emissions, and, most significantly, enhanced access to life-saving aids [13]. UAVs for medical logistics applications have attracted increasing attention in recent years. Market projections estimate the global medical UAV market will grow from USD 255 million in 2021 to USD 1.4 billion by 2028 [14]. Empirical deployments, e.g., UAV delivery over 3.2 km in under 16 min, have demonstrated clear advantages over traditional ground transport in time-critical scenarios such as out-of-hospital cardiac arrest, organ transplant, and emergency blood resupply [15,16,17,18].

In time-critical healthcare situations, every minute can mean the difference between life and death. Yet urban healthcare systems often lack the infrastructure to ensure the rapid dispatch and precise routing of these essential items. Automating the process can make a significant difference, while human supervision and intervention ensure safety. UAVs must be capable of determining optimal paths from their current location to the mission target, while maintaining stability under uncertain disturbances (e.g., wind). They must also adhere to geofencing and Civil Aviation Authority (CAA) regulations, including congested airspace restrictions and designated no-fly zones (NFZs), and avoid physical obstacles such as buildings and infrastructure. Further challenges include limited battery life and payload capacity, the need to efficiently cover certain delivery locations, ensuring smooth trajectories, coordinating with other UAVs or objects in shared airspace, and optimising energy usage given payload constraints [19].

This paper addresses this gap by presenting a novel UAV-based routing and scheduling framework, developed under the UKRI Small Business Research Initiative (UKRI-SBRI-432304) as part of the Wireless Mesh Network for Multi-drone Operations project. It is specifically designed for the rapid delivery of time-sensitive medical emergency supplies (automated external defibrillators, medications, blood products, etc.). By integrating strategic hub placement with advanced hybrid route-planning algorithms and explicitly incorporating delivery time-window constraints, our approach ensures both operational feasibility and maximal clinical impact. To further validate the pipeline, we benchmark fleet-level schedules using Vehicle Routing Problem with Time Windows (VRPTW), built on an NFZ-aware visibility graph, as an independent operations-research baseline that confirms temporal feasibility and cross-checks geometric costs derived from our hybrids. The result is an autonomous UAV network that complements existing emergency response infrastructures to deliver efficient, reliable, and cost-effective dispatches, thereby substantially improving patient outcomes in urban settings.

This study contributes an end-to-end UAV logistics framework that integrates hybrid obstacle-aware route planning with energy and time-window-aware scheduling, validated through digital-twin simulation and benchmarked against state-of-the-art VRPTW solvers under real UK airspace constraints. As such, we (i) introduce a hybrid planner that couples fast global samplers (RRT*, ACO, GA, PSO) with lightweight local refiners (SA, ALNS), enabling both probabilistic completeness and sub-second path smoothing; (ii) embed a Time-Window-Aware Mixed-Integer Linear Programming (TWA-MILP) scheduler to ensure heterogeneous UAV fleets meet strict medical delivery deadlines while respecting payload, battery, and regulatory limits; and (iii) validate the full framework through digital-twin simulations in AirborneSIM and cross-validation with VRPTW, demonstrating county-wide, CAP 722-compliant missions with sub-40-min ETAs and consistent gains in NFZ-constrained cases.

The remainder of this manuscript is organised as follows. Related work is reviewed in Section 2. The problem formulation, mathematical model, and customised algorithms for routing and scheduling optimisation are presented in Section 3. Experimental results are reported and analysed in Section 4, followed by further discussion in Section 5. Finally, conclusions are drawn in Section 6.

2. Related Work

The logistics and supply chain industry has witnessed unprecedented transformations, driven by the promise of ultra-fast, low-cost delivery catalysed by technological advancements, evolving consumer demands, and urbanisation trends [20]. Several studies have proposed a warehouse-to-consumer model of drone delivery [21]. However, due to limitations such as finite battery capacity, restricted communication range, and the need to minimise the number of warehouses for economies of scale, many researchers have instead considered hybrid truck-and-drone models. These approaches mitigate the restricted range of drones by coupling them with ground vehicles. For example, in [22], drones are used to resupply trucks during delivery routes.

Collaborative routing with trucks and drones has therefore emerged as a promising solution to improve delivery efficiency. This problem is typically formulated as an extension of the Travelling Salesman Problem (TSP), with two well-studied variants: the Flying Sidekick TSP (FSTSP) and the Parallel Drone Scheduling TSP (PDSTSP) [23,24]. In both formulations, a set of customers must be served from a depot where a truck and a set of drones are based. In the FSTSP, the truck and a single drone coordinate: the drone is launched from the truck at one customer, delivers to another, and then returns to the truck at a different location. During the drone’s flight, the truck continues its own deliveries, provided the drone has sufficient battery to hover or wait for the truck. This mode requires tight synchronisation between the two vehicles. By contrast, in the PDSTSP, multiple drones operate independently: each drone can fly directly between the depot and its assigned customers, completing several back-and-forth trips within the time horizon. Both variants seek to minimise the total completion time required to serve all customers and return both trucks and drones to the depot [25].

Beyond these formulations, a wide range of algorithms have been proposed for UAV path planning. These methods can be broadly categorised into classical graph-search techniques, heuristic or bio-inspired metaheuristics, and learning-based approaches [26,27,28]. The success of UAV missions in dynamic and complex environments depends critically on the development of robust, scalable path-planning algorithms, particularly for multi-UAV platforms.

2.1. Classical and Sampling-Based Path Planning

Graph-search methods (A*, Dijkstra) guarantee optimality on known maps but scale poorly with environment size [29]. Sampling-based planners such as Probabilistic Roadmaps (PRM) and Rapidly Exploring Random Trees (RRT*) effectively handle high-dimensional and partially known spaces, trading strict optimality for computational tractability [30]. Reactive techniques, Dynamic Window Approach (DWA) and Artificial Potential Fields (APF), offer low-latency obstacle avoidance but can become trapped in local minima or produce oscillatory trajectories [31].

2.2. Heuristic and Bio-Inspired Metaheuristics

To overcome the limitations of classical planners, the literature proposes numerous bio-inspired metaheuristic approaches, e.g., Genetic Algorithms (GAs) [32], Particle Swarm Optimisation (PSO) [33], Ant Colony Optimisation (ACO) [34], and the Grey Wolf Optimiser (GWO) [35] and its hybrids [36]. These approaches are more flexible and offer multi-objective search capabilities, but they require careful parameter tuning and can incur significant computational overhead [37].

2.3. Learning-Based and Hybrid Approaches

Deep reinforcement learning (DRL) methods, including Deep Q-Networks (DQN), Deep Deterministic Policy Gradient (DDPG), Soft Actor–Critic (SAC), Proximal Policy Optimisation (PPO), and multi-agent reinforcement learning (MARL), enable real-time, adaptive decision-making in dynamic environments, but typically require extensive training and are highly sensitive to reward design [38,39,40,41,42]. Zheng et al. [43] extend the classical Lin–Kernighan–Helsgaun (LKH) algorithm for the Travelling Salesman Problem (TSP) by proposing a Variable Strategy Reinforced LKH (VSR-LKH) method, which integrates Q-learning, Sarsa, and Monte Carlo strategies. Their work illustrates how reinforcement learning can augment established heuristics to enhance large-scale combinatorial optimisation. Similarly, Johansson [44] demonstrated that LKH produces better routes than nearest neighbour (NN) heuristics within limited time budgets, provided the waypoint count is moderate, and that incorporating recharging trips yields more energy-efficient UAV routes than plain LKH.

In parallel, hybrid approaches seek to fuse classical, heuristic, and learning-based components to achieve a balance of optimality, safety, and computational efficiency. For example, Zhou et al. [32] combined A* initialisation with RRT* refinement, Chen et al. [37] integrated Grey Wolf Optimisation (GWO) with Artificial Potential Fields (APF), and Ghaffar et al. [45] proposed Artificial Bee Colony (ABC) with Simulated Annealing (SA) for clustering and route refinement. Such hybrids exploit the global search capacity of metaheuristics while leveraging the deterministic guarantees and smooth trajectories provided by classical planners, thereby addressing UAV path-planning challenges in dynamic and resource-constrained environments.

3. Methodology

3.1. Drone Specification

Airborne Robotics’ AIR8 Medium Lifter Octocopter (https://www.air6systems.com/, accessed on 1 April 2024) Airborne Robotics, Basingstoke, UK (Figure 1) was selected for this study. The AIR8 is a robust UAV platform configured with eight coaxial motors and rotors, offering reliable lift and control for medium-payload operations. The drone has a maximum take-off weight (MTOW) of 25 kg and an operational weight of 16 kg (including batteries), enabling it to carry payloads of up to 10 kg. It features a rotor span of approximately 200 cm when fully unfolded and can reach speeds of up to 90 km/h. Four 15 Ah 6S Li-ion batteries power the UAV, arranged to provide an overall capacity of 30 Ah at 12S, ensuring sustained flight duration. For guidance and control, the system is equipped with a Pixhawk Cube Orange autopilot (ArduPilot, New York, NY, USA). It supports multiple redundant communication links, including satellite communication (satcom) and 4G/5G connectivity, crucial for beyond-visual-line-of-sight (BVLOS) operations. Furthermore, the drone meets IP54 weather resistance standards, making it suitable for dust and light rain conditions commonly encountered in field deployments.

3.2. Problem Formulation

3.3. Case Study

In this study, Essex County in the UK was selected as the case study area. The depot is located at Broomfield Hospital, with coordinates (51.77452°, 0.46619°, 55 m). Geospatial data for the region, including no-fly zones (NFZs) and delivery targets, was collected from Google Earth in KML format. The KML file defines NFZs, including prohibited, restricted, and hazardous areas. Twelve delivery destinations were selected across Essex to ensure regional coverage (see Table 1).

Each location is defined by its geographic coordinates (latitude and longitude), along with its straight-line (Euclidean) distance and estimated car travel distance from the depot (Broomfield Hospital). The car distances reflect real-world travel times and were recorded on a Sunday at 5:30 PM to capture moderate traffic conditions. The straight-line distances were obtained from the Drone Safety Map, providing a useful baseline for evaluating aerial delivery efficiency.

The delivery locations were selected to ensure broad geographical coverage of the region, supporting a comprehensive assessment of routing performance, coverage limitations, and navigation constraints in proximity to designated NFZs. Notably, Southend University Hospital was excluded due to its location within an NFZ.

To accommodate high-stakes use cases, such as emergency medical deliveries, our UAV routing and scheduling strategy incorporates time-window constraints into the problem formulation. We also restrict the flight altitude based on terrain elevation: UAVs fly within legal bounds fixed at $min_{alt} = 90$ m for the minimum altitude and $max_{alt} = 120$ m for the maximum altitude, measured relative to the terrain elevation (above mean sea level) at the given latitude and longitude. These constraints are essential for ensuring the feasibility, reliability, and clinical impact of UAV-based delivery services. In such scenarios, UAVs demonstrate a high degree of autonomy, enabling the efficient and cost-effective transportation of critical medical supplies.

3.4. Routing and Scheduling Optimisation Methods

Our problem builds on the Parallel Drone Scheduling TSP (PDSTSP) but excludes the truck from the system. Instead, a depot or charging station (CS) is the location from which the drones depart and to which they return. Parallel machine scheduling (PMS) was integrated with hybrid path finding. We formulated the problem under several constraints, including strict time windows; regulatory, urban, and NFZ obstacles; and payload and battery limits. Our study fills this gap by integrating strategic hub placement and time-window-aware hybrid route planners to deliver critical medical supplies with guaranteed timeliness and robustness. Figure 2 illustrates the system architecture of the drone medical logistics framework, with inputs from the data layer (geospatial KML, UAV specifications, demands) fed into the optimisation and scheduling layers.

3.4.1. Path Length and Cost Function

For any collision-free polyline path $P$, the total great-circle length is defined as

$$L(P) = \sum_{(u, v) \in P} d(u, v),$$

where $d(u, v)$ denotes the haversine distance between consecutive nodes.

Across all metaheuristics, the fitness or cost function is consistently

$$f(P) = \sum_{i = 1}^{|P| - 1} d(P_{i}, P_{i + 1}), \tag{1}$$

measuring the total path length. This formulation (Equation (1)) is used in GA, PSO, ACO, and refinement algorithms.
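As an illustration of Equation (1), the following minimal sketch sums haversine distances over consecutive nodes. It assumes nodes are (latitude, longitude) pairs in degrees and a mean Earth radius of 6371 km; the function names are hypothetical and not taken from the authors' implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def path_length_km(path):
    """Equation (1): sum of haversine distances over consecutive nodes."""
    return sum(haversine_km(u, v) for u, v in zip(path, path[1:]))
```

For example, `path_length_km([depot, waypoint, target])` returns the length in km of a three-node polyline, and a single-node path has length zero.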

Estimated Time of Arrival (ETA)

The mission ETA for a path of length $L(P)$ at cruise speed $v$ with service time $t_{svc}$ is

$$ETA(P) = \frac{60}{v} L(P) + t_{svc}. \tag{2}$$

This expression is later extended in the battery model (Equations (32)–(34)) to explicitly account for payload mass m and temperature-dependent battery performance.
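A short worked sketch of Equation (2) follows. It assumes the path length is in km, the cruise speed in km h⁻¹, and times in minutes; the defaults (60 km h⁻¹ cruise, 5 min service) are the values quoted elsewhere in this paper, and the function name is hypothetical.

```python
def eta_minutes(length_km, speed_kmh=60.0, service_min=5.0):
    """Equation (2): flight time in minutes plus on-site service time."""
    if speed_kmh <= 0:
        raise ValueError("cruise speed must be positive")
    return 60.0 / speed_kmh * length_km + service_min


# A 30 km path at 60 km/h with 5 min of service gives 30 + 5 = 35 min.
print(eta_minutes(30.0))
```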

3.4.2. Rapidly Exploring Random Tree Star (RRT*)

The RRT* path-planning algorithm [46] iteratively grows a tree of collision-free paths by random sampling in the configuration space. RRT* adds two operations, Parent and Rewire, to the RRT, which complement each other. Parent reduces the path cost of newly generated nodes, and Rewire reduces redundant paths for the expanded random tree. The algorithm exhibits asymptotic optimality, meaning that as the number of iteration cycles tends to infinity, the optimal path will be found [47]. A random sample $q_{rand} \in \mathbb{R}^{2}$ is drawn from the configuration space with goal-bias probability $p_{goal}$:

$$q_{rand} = \begin{cases} q_{goal}, & \text{with probability } p_{goal} \\ \mathrm{Uniform}(x_{min}, x_{max}) \times \mathrm{Uniform}(y_{min}, y_{max}), & \text{otherwise} \end{cases} \tag{3}$$

The closest existing tree node $q_{near}$ is selected by minimising the Euclidean (haversine-based) distance:

$$q_{near} = \arg\min_{q \in T} \lVert q - q_{rand} \rVert \tag{4}$$

A new node $q_{new}$ is generated by moving from $q_{near}$ toward $q_{rand}$ with a maximum step size $\delta$:

$$q_{new} = q_{near} + \delta \cdot \frac{q_{rand} - q_{near}}{\lVert q_{rand} - q_{near} \rVert} \tag{5}$$

The edge length $d(\cdot, \cdot)$ is the haversine distance, and the cumulative cost $c(q)$ is the length of the path from the root (depot) to the node $q$. The cumulative path cost at node $q_{new}$ is computed as

$$c(q_{new}) = c(q_{near}) + d(q_{near}, q_{new}) \tag{6}$$

This distance is later mapped to time via $v$ (cruise speed) and to energy via the battery model.

For each nearby node $q_{j}$ within a radius neighbourhood $r$, the algorithm checks if the path through $q_{new}$ provides a shorter cost. If the route to $q_{j}$ via $q_{new}$ is shorter than its current best route, we rewire $q_{j}$ to use $q_{new}$ as its parent:

$$\text{If } c(q_{new}) + d(q_{new}, q_{j}) < c(q_{j}), \text{ then } \mathrm{parent}(q_{j}) \leftarrow q_{new} \tag{7}$$

where $c(\cdot)$ is the cost accumulated from the depot to the node (km), $d(\cdot, \cdot)$ is the edge length (km), and $q_{j}$ ranges over the near set. This produces a shorter seed path $P$, so the ETA is reduced and the energy test $E_{km}(m) L \leq E_{use}(T)$ is more likely to be passed.

If $q_{new}$ is sufficiently close to the goal (within $\delta$) and the straight segment to the goal is collision-free, we connect it:

$$d(q_{new}, q_{goal}) < \delta \quad \text{and} \quad \text{collision-free}(q_{new}, q_{goal}),$$

where $\delta$ is the steering step, and collision-free means that the great-circle segment does not intersect any KML no-fly polygon and remains within the altitude corridor (90–120 m AGL). As soon as a legal connection exists, the current best root-to-target path can be extracted, the ETA can be checked against the battery model, and it can be passed for smoothing, even while the tree continues to improve. We use an angular steering step $step = \delta = 0.05^{\circ}$ in latitude/longitude. At Essex latitudes ($\phi \approx 51.8^{\circ}$), this corresponds to $\approx 5.55$ km in latitude ($1^{\circ} \approx 111$ km) and $\approx 3.44$ km in longitude ($1^{\circ} \approx 111 \cos\phi \approx 68.7$ km).

3.4.3. Ant Colony Optimisation (ACO)

Ant Colony Optimisation (ACO) is a probabilistic metaheuristic algorithm inspired by the foraging behaviour of real ants. It probabilistically constructs and iteratively improves a population of candidate solutions using a shared memory structure known as pheromone trails [48]. During each iteration, individual artificial ants generate a solution by selecting components (basic elements of a complete solution) based on the accumulated pheromone information, which reflects the historical quality of the components in previous iterations [34,48].

At iteration zero, the pheromone values on all edges are initialised to a small constant:

$$\tau_{ij}^{(0)} = \tau_{0} \quad \forall (i, j) \in E \tag{8}$$

For each edge $(i, j)$, a static heuristic $\eta_{ij}$ is defined as the inverse of the edge cost:

$$\eta_{ij} = \frac{1}{d_{ij}} \tag{9}$$

At each decision point, an ant at node $i$ chooses the next node $j \in N_{i}$ using a probabilistic rule:

$$P_{ij}^{(k)} = \frac{[\tau_{ij}]^{\alpha} \cdot [\eta_{ij}]^{\beta}}{\sum_{l \in N_{i} \setminus \text{visited}} [\tau_{il}]^{\alpha} \cdot [\eta_{il}]^{\beta}} \tag{10}$$

where

$α$ controls the influence of the pheromones;
$β$ controls the influence of the heuristic;
$N_{i}$ is the set of feasible neighbours of node i.

After all ants have built a path, pheromone values are reduced globally to simulate evaporation:

$$\tau_{ij} \leftarrow (1 - \rho) \cdot \tau_{ij} \tag{11}$$

where $\rho \in (0, 1)$ is the evaporation rate.

For each ant $k$ that completes a valid path $P_{k}$ of length $J(P_{k})$, the pheromone is deposited on each edge $(i, j)$ in its path:

$$\tau_{ij} \leftarrow \tau_{ij} + \sum_{k = 1}^{m} \frac{Q}{J(P_{k})} \cdot 1_{(i, j) \in P_{k}} \tag{12}$$

where

Q is a constant deposit factor;
$J (P_{k})$ is the total path cost for ant k;
$1_{(i, j) \in P_{k}}$ is 1 if edge $(i, j)$ is part of path $P_{k}$ , and otherwise 0.

At the end of each iteration, the shortest path among all constructed paths is retained as the best-so-far solution:

$$P^{*} = \arg\min_{P_{k}} J(P_{k}) \tag{13}$$

The algorithm repeats for a fixed number of iterations T until convergence.
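The transition rule (Equation (10)) and the pheromone update (Equations (11) and (12)) can be sketched as below. This is a minimal illustration, assuming pheromone and distance values are stored in dictionaries keyed by directed edge `(i, j)`; the parameter values for `alpha`, `beta`, `rho`, and `Q` are illustrative assumptions, not the settings used in this study.

```python
def transition_probabilities(i, candidates, tau, dist, alpha=1.0, beta=2.0):
    """Equation (10): probability of moving from i to each unvisited neighbour."""
    weights = {j: tau[(i, j)] ** alpha * (1.0 / dist[(i, j)]) ** beta
               for j in candidates}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(tau, ant_paths, rho=0.5, Q=1.0):
    """Equations (11)-(12): evaporate everywhere, then deposit along each ant path.

    ant_paths is a list of (node_sequence, path_length) pairs.
    """
    for edge in tau:
        tau[edge] *= (1.0 - rho)
    for path, length in ant_paths:
        for edge in zip(path, path[1:]):
            tau[edge] += Q / length
    return tau
```

With uniform pheromone, `alpha = 1`, and `beta = 2`, an edge of length 1 is four times as likely to be chosen as an edge of length 2, which gives probabilities of 0.8 and 0.2 for a node with those two options.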

3.4.4. Genetic Algorithm (GA)

The GA [32] evolves a population of candidate paths through a graph from a start node $s$ to a goal node $g$. Each path is encoded as a chromosome representing a sequence of node indices. The fitness function $f(P)$ computes the total path cost as the sum of distances between consecutive nodes:

$$f(P) = \sum_{k = 1}^{|P| - 1} d(P_{k}, P_{k + 1}) \tag{14}$$

Given two parent chromosomes $P_{1}$ and $P_{2}$, a child is created by randomly selecting a splice point:

$$\text{Child} = P_{1}[:a] \cup P_{2}[b:] \tag{15}$$

$$\text{where } a \in [1, |P_{1}| - 1], \quad b \in [1, |P_{2}| - 1] \tag{16}$$

Duplicate nodes are removed from the resulting sequence to maintain validity.

With probability $p_{mut}$, a random sub-path between two indices $i$ and $j$ is replaced with a new valid sub-path:

$$\mathrm{Mutate}(P) = P[:i] \cup \mathrm{RandomWalk}(P_{i}, P_{j}) \cup P[j:] \tag{17}$$

Let $\mathrm{elite\_frac} \in (0, 1)$ denote the fraction of top-performing chromosomes preserved each generation:

$$\mathrm{Elite}_{k} = \arg\min_{P \in \mathrm{Population}} f(P), \quad k = \lfloor \mathrm{elite\_frac} \cdot \mathrm{pop\_size} \rfloor \tag{18}$$

At each generation $t$, the population is updated as

$$\mathrm{Population}^{(t + 1)} = \mathrm{Elite}_{k} \cup \mathrm{Children} \tag{19}$$

where children are generated via crossover and mutation from elite parents.

After $T$ generations, the chromosome with the lowest fitness is returned:

$$P^{*} = \arg\min_{P \in \mathrm{Population}^{(T)}} f(P) \tag{20}$$
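The splice crossover of Equations (15) and (16) and the elitist selection of Equation (18) can be sketched as follows. The sketch assumes chromosomes are Python lists of hashable node labels, interprets the arg min of Equation (18) as keeping the $k$ lowest-cost chromosomes, and takes the splice points as arguments so the behaviour is easy to check; the function names are hypothetical.

```python
import math


def crossover(p1, p2, a, b):
    """Equations (15)-(16): prefix of p1 joined to suffix of p2, duplicates dropped."""
    if not (1 <= a <= len(p1) - 1 and 1 <= b <= len(p2) - 1):
        raise ValueError("splice points out of range")
    child, seen = [], set()
    for node in p1[:a] + p2[b:]:
        if node not in seen:  # keep only the first occurrence of each node
            seen.add(node)
            child.append(node)
    return child


def select_elite(population, fitness, elite_frac):
    """Equation (18): keep the k = floor(elite_frac * pop_size) fittest paths."""
    k = math.floor(elite_frac * len(population))
    return sorted(population, key=fitness)[:k]
```

Because $a \leq |P_{1}| - 1$ and $b \geq 1$, the child always starts at the start node of $P_{1}$ and ends at the goal node of $P_{2}$. Removing duplicates can leave consecutive nodes that are not adjacent in the graph, so a separate validity or collision check on the child would still be needed; how the source handles that case is not stated here.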

3.4.5. Particle Swarm Optimisation (PSO)

Let the swarm consist of N particles, where each particle represents a candidate path through the graph from start node s to goal node g [33]. The PSO algorithm evolves the swarm over T iterations using the following components:

$P_{k}^{(t)}$ : Path of particle k at iteration t.
$f (P)$ : Fitness function measuring total path length.
$P_{k}^{*}$ : Best path found by particle k so far (personal best).
$G^{*}$ : Best path found by any particle (global best).
w: Inertia weight.
$c_{1}, c_{2}$ : Cognitive and social influence coefficients.

Each particle updates its personal best if the current path improves upon it:

$$\text{If } f(P_{k}^{(t)}) < f(P_{k}^{*}), \text{ then } P_{k}^{*} \leftarrow P_{k}^{(t)} \tag{21}$$

The best-performing particle in the swarm updates the global best:

$$G^{*} = \arg\min_{k} f(P_{k}^{*}) \tag{22}$$

Each particle probabilistically modifies its path according to the following logic:

With probability w, retain current path $P_{k}^{(t)}$ .
With probability $c_{1}$ , splice with personal best $P_{k}^{*}$ .
With probability $c_{2}$ , splice with global best $G^{*}$ .

The cost of each path is computed as the total length (e.g., haversine distance) of the ordered node sequence:

$$f(P) = \sum_{i = 1}^{|P| - 1} d(P_{i}, P_{i + 1}) \tag{23}$$
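The best-path bookkeeping of Equations (21) and (22) and the probabilistic choice of move can be sketched as below. The text does not state whether the three moves are mutually exclusive; this sketch assumes a single draw per particle with weights proportional to $w$, $c_{1}$, and $c_{2}$. The splice operation itself is omitted, and all names are hypothetical.

```python
import random


def update_bests(paths, personal_best, fitness):
    """Equations (21)-(22): refresh personal bests, then return the global best."""
    for k, path in enumerate(paths):
        if fitness(path) < fitness(personal_best[k]):
            personal_best[k] = path
    return min(personal_best, key=fitness)


def choose_move(w, c1, c2, rng):
    """Pick one move with weights proportional to (w, c1, c2); an assumed reading."""
    moves = ["keep", "splice_personal", "splice_global"]
    return rng.choices(moves, weights=[w, c1, c2], k=1)[0]
```

In a full loop, each particle would call `choose_move`, apply the chosen move to its path, and then `update_bests` would be run once per iteration over the whole swarm.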

3.4.6. Simulated Annealing (SA)

SA produces a locally optimal refinement of the seed route by randomly perturbing it and probabilistically accepting uphill moves. This metaheuristic is inspired by the physical annealing process, where materials are slowly cooled to reach a low-energy crystalline state. In the context of route optimisation, SA seeks to minimise the objective function (e.g., total path length or a combination of distance and smoothness) by probabilistically accepting not only improvements but also occasional degradations, allowing the algorithm to escape local minima [49].

At each iteration, a candidate solution $x_{new}$ is generated from the current solution $x_{cur}$ by applying a local mutation (e.g., a segment reversal). The change in objective is computed as

$$\Delta E = f(x_{new}) - f(x_{cur}), \tag{24}$$

where $f(\cdot)$ is the objective function (e.g., weighted path length and turning cost). The acceptance probability $P(\Delta E, T)$ is defined as

$$P(\Delta E, T) = \begin{cases} 1, & \text{if } \Delta E \leq 0, \\ \exp\left(-\frac{\Delta E}{T}\right), & \text{if } \Delta E > 0, \end{cases} \tag{25}$$

where $T$ is the current temperature. This allows worse solutions to be accepted with decreasing probability as the system cools.

The temperature $T$ is updated at each step using an exponential decay schedule:

$$T_{k + 1} = \max(\alpha \cdot T_{k}, T_{min}) \tag{26}$$

where $\alpha \in (0, 1)$ is the cooling rate (e.g., $\alpha = 0.90$), and $T_{min}$ is a predefined minimum temperature threshold below which the system is considered “frozen.”

This process is repeated for a fixed number of steps or until convergence, after which the best solution encountered is returned. In this work, we used SA with the parameters $T_{0} = 4.0$, $\alpha = 0.90$, $T_{min} = 0.05$, and a maximum of 1200 iterations.
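To make the acceptance rule (Equation (25)) and cooling schedule (Equation (26)) concrete, the following sketch evaluates them with the stated parameters. The function names are hypothetical, and the sketch assumes one temperature update per iteration, which the text does not state explicitly.

```python
import math


def acceptance_probability(delta_e, temperature):
    """Equation (25): always accept improvements, sometimes accept degradations."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def cool(temperature, alpha=0.90, t_min=0.05):
    """Equation (26): geometric cooling with a floor at t_min."""
    return max(alpha * temperature, t_min)


def steps_to_floor(t0=4.0, alpha=0.90, t_min=0.05):
    """Number of cooling updates until the temperature first reaches t_min."""
    t, k = t0, 0
    while t > t_min:
        t = cool(t, alpha, t_min)
        k += 1
    return k


print(steps_to_floor())  # 42
```

Under that assumption, $4.0 \times 0.9^{41} \approx 0.053$ is still above $T_{min}$ while $4.0 \times 0.9^{42} \approx 0.048$ is below it, so the temperature reaches its floor after 42 updates and any remaining iterations up to the 1200 limit run at $T = 0.05$, where uphill moves are accepted only when $\Delta E$ is very small.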

3.4.7. Adaptive Large Neighbourhood Search (ALNS)

ALNS iteratively improves a solution by applying a set of neighbourhood operators. At each iteration, a neighbourhood operator $o \in N$ is selected probabilistically based on an adaptive score, and a candidate solution is generated and evaluated [50].

Let $s_{o}^{(t)}$ be the score for operator $o$ at iteration $t$. The probability of selecting operator $o$ is given by

$$P(o) = \frac{s_{o}^{(t)}}{\sum_{o' \in N} s_{o'}^{(t)}} \tag{27}$$

Let $P^{(t)}$ be the current best path; applying operator $o$ yields a candidate path $P^{cand}$. If the candidate is collision-free and improves the objective (total path length), then it is accepted:

$$P^{(t + 1)} = \begin{cases} P^{cand}, & \text{if } \mathrm{Cost}(P^{cand}) < \mathrm{Cost}(P^{(t)}) \text{ and collision-free} \\ P^{(t)}, & \text{otherwise} \end{cases} \tag{28}$$

After each iteration, the operator’s score is updated based on its performance:

$$s_{o}^{(t + 1)} = \begin{cases} s_{o}^{(t)} \cdot \gamma_{improve}, & \text{if } P^{cand} \text{ improves the best solution} \\ s_{o}^{(t)} \cdot \gamma_{worse}, & \text{otherwise} \end{cases} \tag{29}$$

where typically $\gamma_{improve} > 1$ and $\gamma_{worse} < 1$ (e.g., 1.1 and 0.9, respectively). A larger penalty (0.8) is applied for infeasible solutions.

The process runs for a fixed number of iterations $T$ and returns the best solution:

$$P^{*} = \arg\min_{P^{(t)}} \mathrm{Cost}(P^{(t)}), \quad t = 1, \dots, T \tag{30}$$
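The selection, acceptance, and score-update loop of Equations (27)–(30) can be sketched as below. It is an illustration under assumptions, not the authors' code: operators are passed as a dictionary of hypothetical callables `op(path, rng)`, all scores start at 1.0, and the multipliers 1.1, 0.9, and 0.8 are the example values quoted above.

```python
import random


def select_operator(scores, rng):
    """Equation (27): roulette-wheel choice proportional to operator scores."""
    names = list(scores)
    return rng.choices(names, weights=[scores[n] for n in names], k=1)[0]


def update_score(score, improved, feasible=True,
                 g_improve=1.1, g_worse=0.9, g_infeasible=0.8):
    """Equation (29), with the larger penalty for infeasible candidates."""
    if not feasible:
        return score * g_infeasible
    return score * (g_improve if improved else g_worse)


def alns(path, operators, cost, is_feasible, iters=200, seed=0):
    """Equations (28) and (30): keep a candidate only if feasible and shorter."""
    rng = random.Random(seed)
    scores = {name: 1.0 for name in operators}
    best = list(path)
    for _ in range(iters):
        name = select_operator(scores, rng)
        cand = operators[name](best, rng)
        feasible = is_feasible(cand)
        improved = feasible and cost(cand) < cost(best)
        if improved:
            best = cand
        scores[name] = update_score(scores[name], improved, feasible)
    return best, scores
```

Here `is_feasible` stands in for the collision-free test against no-fly zones, and `cost` for the total path length of Equation (1). Because acceptance is strictly improving, the returned path is never longer than the seed path.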

3.4.8. Battery Model and Integration

Pack Energy and Temperature Derating

For the AIR8 pack with $N_{s} = 12$ series Li-ion cells, nominal cell voltage $V_{nom} = 3.7$ V, and capacity $C = 30$ Ah, the nominal energy is calculated as

$$E_{nom} = N_{s} V_{nom} C = 12 \times 3.7\ \mathrm{V} \times 30\ \mathrm{Ah} = 1332\ \mathrm{Wh}. \tag{31}$$

We reserve a depth of discharge fraction $\rho_{DoD}$ (here $0.8$ usable), and apply a temperature derating factor $f_{T}$ that accounts for reduced capacity and higher internal resistance away from 25 °C:

$$E_{use}(T) = \rho_{DoD} E_{nom} f_{T}(T), \tag{32}$$

$$f_{T}(T) = \mathrm{clip}(1 - k_{T} (25 - T), f_{min}, f_{max}) \tag{33}$$

where $k_{T} = 0.004\,/^{\circ}\mathrm{C}$, $f_{min} = 0.70$, and $f_{max} = 1.02$. The linear slope $k_{T}$ is a conservative aggregate of low-temp capacity loss and IR heating. The clipping prevents unrealistic extrapolation at extreme temperatures.

For medium multirotors flying at fixed low altitude and moderate, near-constant speed, the per-distance energy is approximated as affine in payload mass $m$:

$$E_{km}(m) = E_{0} + k_{m} m, \quad E_{0} = 20.0\ \mathrm{Wh/km}, \quad k_{m} = 2.0\ \mathrm{Wh/(km \cdot kg)}. \tag{34}$$

$E_{0}$ captures propulsion, avionics and frame losses at zero payload; $k_{m}$ reflects the (near) linear increase in induced power with weight at these disk loadings.

At cruise speed $v$ (60 km h⁻¹), the one-way range and cruise endurance are

$$R(T, m) = \frac{E_{use}(T)}{E_{km}(m)} \ [\mathrm{km}], \tag{35}$$

$$t_{end}(T, m) = \frac{60 R(T, m)}{v} = \frac{60 E_{use}(T)}{v E_{km}(m)} \ [\mathrm{min}]. \tag{36}$$

Given a planned round-trip air path of length L (km) from the hybrid planner, the mission ETA is calculated.

ETA (L, T, m) = \frac{60 L}{v} + t_{svc},

(37)

where $t_{svc}$ is the on-site service time (5 min in our scenarios).

A route is energy-feasible if the total energy (cruise plus service) fits within the usable pack energy:

E_{km} (m) L + E_{svc} \leq E_{use} (T) .

(38)

With $v = 60$ km h⁻¹ and no explicit service term ($E_{svc} = 0$), (38) reduces to $L \leq R (T, m)$.
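As a minimal sketch of Equations (31)–(38), the battery and energy model can be evaluated in a few lines of Python. The numeric defaults are the values quoted in this section; the class and function names (`BatteryModel`, `eta_min`) are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatteryModel:
    """Illustrative container; defaults are the values quoted in the text."""
    n_series: int = 12         # N_s, cells in series
    v_nom: float = 3.7         # V_nom, volts per cell
    capacity_ah: float = 30.0  # C, amp-hours
    rho_dod: float = 0.8       # usable depth-of-discharge fraction
    k_t: float = 0.004         # derating slope per degree Celsius
    f_min: float = 0.70
    f_max: float = 1.02
    e0: float = 20.0           # Wh/km at zero payload
    k_m: float = 2.0           # Wh/(km kg)

    def e_nom(self) -> float:
        return self.n_series * self.v_nom * self.capacity_ah       # Eq. (31)

    def f_t(self, temp_c: float) -> float:
        raw = 1.0 - self.k_t * (25.0 - temp_c)                      # Eq. (33)
        return min(max(raw, self.f_min), self.f_max)                # clip

    def e_use(self, temp_c: float) -> float:
        return self.rho_dod * self.e_nom() * self.f_t(temp_c)       # Eq. (32)

    def e_km(self, payload_kg: float) -> float:
        return self.e0 + self.k_m * payload_kg                      # Eq. (34)

    def range_km(self, temp_c: float, payload_kg: float) -> float:
        return self.e_use(temp_c) / self.e_km(payload_kg)           # Eq. (35)

    def endurance_min(self, temp_c, payload_kg, speed_kmh=60.0):
        return 60.0 * self.range_km(temp_c, payload_kg) / speed_kmh  # Eq. (36)

    def is_feasible(self, length_km, temp_c, payload_kg, e_svc_wh=0.0):
        # Eq. (38): cruise energy plus service energy within the usable budget
        return self.e_km(payload_kg) * length_km + e_svc_wh <= self.e_use(temp_c)


def eta_min(length_km, speed_kmh=60.0, t_svc_min=5.0):
    return 60.0 * length_km / speed_kmh + t_svc_min                 # Eq. (37)


model = BatteryModel()
print(round(model.e_nom(), 1))              # nominal energy in Wh
print(round(model.e_use(20.0), 1))          # usable energy at 20 °C in Wh
print(round(model.range_km(20.0, 2.0), 2))  # range in km with a 2 kg payload
print(round(eta_min(10.2), 1))              # ETA in min for a 10.2 km round trip
```

Under these parameters, at 20 °C with a 2 kg payload, the sketch gives about 1044 Wh of usable energy and a range of roughly 43.5 km; a 10.2 km round trip yields an ETA of 15.2 min.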

3.4.9. Time-Window-Aware Mixed-Integer Linear Programming Model (TWA-MILP)

Scheduling is solved jointly with routing using a Time-Window-Aware Mixed-Integer Linear Programming (TWA-MILP) model [23], an optimisation framework designed to handle scheduling and routing decisions under strict time constraints [23,51]. It is defined on a graph of nodes and arcs. Let $G = (N, E)$ be a graph whose arcs are flight legs between pairs of nodes, where $N = \{0\} \cup C \cup \{0^{'}\}$ comprises the depot node 0, the customer nodes C, and the return-depot copy $0^{'}$, and $E \subseteq N \times N$ is the set of feasible aerial links. The TWA-MILP model is defined in this study using the following parameters:

n = Number of UAVs.
Q = Payload capacity of each UAV.
R = Maximum range of each UAV per sortie.
v = Cruise speed.
$τ$ = Fixed recharge (or service-swap) time at each stop.
$d_{i j}$ = Euclidean (air-path) distance from i to j.
$p_{j}$ = Payload demand at customer $j \in C$ .

$[a_{j}, b_{j}]$ = Time window for customer j, chosen by urgency class:

[a_{j}, b_{j}] = \begin{cases} [a_{j}^{urg}, b_{j}^{urg}], & j \in urgent, \\ [a_{j}^{rut}, b_{j}^{rut}], & j \in routine, \\ [a_{j}^{sch}, b_{j}^{sch}], & j \in scheduled . \end{cases}

(39)

M is the big-M constant used in the time-continuity constraint on each link, chosen such that $M \geq \max_{i, j} (d_{i j} / v + τ)$.

The TWA-MILP model ensures that each delivery task is completed within the time window assigned to its demand point, while accounting for resource constraints. It comprises:

Binary decision variables to model the assignment of drones to delivery routes.
Continuous variables to represent time, distance, and resource consumption (e.g., energy, payload).
Time windows $[a_{j}, b_{j}]$ specifying the earliest ( $a_{j}$ ) and latest ( $b_{j}$ ) allowable arrival time at node j.
Service constraints to enforce timely and uninterrupted emergency deliveries.

Decision Variables:

$x_{i j k} \in \{0, 1\}$ = Binary variable indicating whether drone k travels from node i to node j.
$w_{i k} \geq 0$ = Payload on UAV k when visiting node i.
$r_{i k} \geq 0$ = Remaining range of UAV k after visiting node i.
$u_{i}$ = Miller–Tucker–Zemlin (MTZ) ordering variable for node i.
$t_{i k} \geq 0$ = Departure (service) time of UAV k at node i.
$T_{max} \geq 0$ = Overall mission makespan.

Objective Choice

Our TWA-MILP enforces all customer time windows as hard constraints and minimises total flown distance:

min \sum_{i \in N} \sum_{j \in N} \sum_{k = 1}^{n} c_{i j} x_{i j k} .

Thus, the solved model is a distance-minimising Vehicle Routing Problem with Time Windows (VRPTW) variant (TWA-VRPTW [52]). In the objective function:

N is the set of all locations (including depot and delivery points);
n is the total number of drones;
$c_{i j}$ is the cost or distance from node i to node j;
$x_{i j k} \in {0, 1}$ is a binary decision variable equal to 1 if drone k travels from i to j, and 0 otherwise.

Minimising total distance is equivalent to minimising total cruise time at fixed v; ETA adds a constant $t_{svc}$ per stop.

MILP Embedding (Range/Energy Constraints)

Let $x_{i j k} \in \{0, 1\}$ indicate whether UAV k flies arc $(i, j)$, and let $d_{i j}$ be the arc length (km). Using the affine energy model, a conservative linear budget for a sortie assigned to UAV k is

\begin{matrix} \sum_{(i, j) \in E} (E_{0} + k_{m} m_{k}) d_{i j} x_{i j k} + E_{svc, k} \leq E_{use, k} (T), E_{use, k} (T) = ρ_{DoD} E_{nom} f_{T} (T), \end{matrix}

(40)

where $m_{k}$ is the carried payload for that sortie, and $E_{svc, k}$ aggregates hover/handling energy at visited nodes. These constraints are added to the TWA-MILP, ensuring every scheduled route is energy-feasible under ambient temperature and payload.
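To illustrate how the budget in Equation (40) is evaluated for a fixed arc selection, the sketch below computes its left-hand side for one sortie and compares it with the usable energy. The helper names and the dictionary encoding of arcs are assumptions made for illustration; the 4.79 km leg length is the Chelmsford value from Table 3, and the service energy is set to zero.

```python
E0_WH_PER_KM = 20.0     # base consumption (from the text)
KM_WH_PER_KM_KG = 2.0   # payload slope (from the text)


def sortie_energy_wh(dist_km, x, payload_kg, e_svc_wh=0.0):
    """Left-hand side of the budget: sum of (E0 + k_m m) d_ij x_ij, plus E_svc."""
    per_km = E0_WH_PER_KM + KM_WH_PER_KM_KG * payload_kg
    return sum(per_km * d * x.get(arc, 0) for arc, d in dist_km.items()) + e_svc_wh


def usable_energy_wh(temp_c, e_nom_wh=1332.0, rho_dod=0.8,
                     k_t=0.004, f_min=0.70, f_max=1.02):
    """Right-hand side: rho_DoD * E_nom * f_T(T) with the clipped linear derating."""
    f_t = min(max(1.0 - k_t * (25.0 - temp_c), f_min), f_max)
    return rho_dod * e_nom_wh * f_t


# Hypothetical encoding: depot "0" -> customer "C" -> return-depot copy "0'".
dist_km = {("0", "C"): 4.79, ("C", "0'"): 4.79, ("0", "0'"): 0.0}
x = {("0", "C"): 1, ("C", "0'"): 1}   # arcs flown by this UAV (x_ijk = 1)

lhs = sortie_energy_wh(dist_km, x, payload_kg=2.0)
rhs = usable_energy_wh(20.0)
print(round(lhs, 2), round(rhs, 2), lhs <= rhs)
```

Because the payload is treated as constant over the sortie, the expression stays linear in the binary arc variables, which is what allows it to be added to the MILP as an ordinary constraint.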

4. Results

A fleet of heterogeneous UAVs is launched from a central depot (Broomfield Hospital). Coordinates captured in DMS/aviation notation were converted to decimal degrees. Distances are computed with the haversine formula, and times assume a constant cruise speed of $v = 60$ km h⁻¹ (at Essex latitudes, $1°$ of longitude $\approx 111 \cos 52° \approx 68.4$ km). Table 2 summarises the parameters used for configuring the hybrid path-finding algorithms. These parameter values were selected through extensive empirical tuning and iterative testing, adjusting each algorithm’s internal parameters to balance convergence speed and path optimality across diverse environments. Multiple runs (28,800) were conducted to evaluate consistency, and the final configurations represent the best trade-off observed between computational efficiency and solution quality.
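The coordinate conversion and distance calculation described above can be sketched as follows. The function names and the mean Earth radius of 6371 km are assumptions for illustration; the paper does not state which radius or implementation was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cruise_minutes(distance_km, speed_kmh=60.0):
    """Flight time in minutes at constant cruise speed."""
    return 60.0 * distance_km / speed_kmh


# One degree of longitude at 52° N, for comparison with the rule of thumb in the text.
one_degree_km = haversine_km(52.0, 0.0, 52.0, 1.0)
print(round(one_degree_km, 1), round(cruise_minutes(one_degree_km), 1))
```

At 60 km h⁻¹ the distance in kilometres and the cruise time in minutes coincide numerically, which is why path lengths and cruise times can be read interchangeably in the results.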

Figure 3a visualises the designated NFZ as defined by the UK Civil Aviation Authority (CAA). These zones include critical infrastructure and high-risk urban regions where UAV operations are restricted, indicated in red. The 3D red polygons also depict vertical airspace restrictions near coastal or sensitive air traffic corridors. Figure 3b shows the output from the hybrid path-planning algorithms applied to the environment. The UAV routes, shown as coloured lines, originate from the central depot at Broomfield Hospital and extend to twelve different destinations with multiple iterations. Blue-shaded regions indicate projected avoidance zones, derived from KML-defined boundaries, which were dynamically avoided during trajectory generation. The map serves to validate the routing performance of the algorithms, demonstrating their ability to maintain safe distances from restricted areas while optimising route length and efficiency under regulatory constraints.

Each delivery was scheduled within a 20–30 min window, expressed in seconds for temporal precision. In the first scenario, all UAV missions begin simultaneously at $t = 0$ min, with a 5-min stop at each customer location and 12 s spacing between successive delivery requests. We minimise the total round-trip path length (km) from the depot to each hospital/town and back, subject to operational constraints (no-fly zones, time windows, payload limits, and energy).

The raw path lengths of the individual metaheuristic algorithms (ACO, GA, PSO, and RRT*) were compared against their hybridised versions with ALNS or PSO refinement layers (see Table 3). As a standalone method, PSO produced the longest paths for several destinations, with extreme values such as $106.43$ km for South Benfleet and $87.73$ km for Harwich. RRT* (raw) was more stable, with shorter path lengths across most locations, although it suffered from outliers such as Chigwell ($36.23$ km) and Walton-on-the-Naze ($56.99$ km). Comparing raw planners to their ALNS-refined hybrids (Table 3) reveals four consistent effects:

ACO-ALNS consistently matched or improved upon raw ACO values, shortening the longer routes (e.g., Colchester from $34.65$ km to $33.90$ km and Harwich from $70.34$ km to $57.92$ km).
GA-ALNS closely mirrored ACO-ALNS’s efficiency, achieving improvements over GA (raw), for example for Halstead (from $24.48$ km to $23.49$ km) and Colchester (from $40.43$ km to $33.90$ km).
PSO-ALNS drastically reduced PSO’s highly inflated path lengths, e.g., South Benfleet (from $106.43$ km to $34.84$ km) and Harwich (from $87.73$ km to $57.92$ km).
RRT*-ALNS outperformed all other methods for consistency and stability. For nearly all destinations, it produced the shortest or near-shortest paths. For example, Walton-on-the-Naze dropped from $56.99$ km (RRT*) to $53.53$ km (RRT*-ALNS).

The RRT*-PSO hybrid yielded erratic results. While it offered gains in isolated cases (e.g., St Margaret Hospital improved from $26.01$ km to $24.29$ km), it introduced significant inflation in others, such as Walton-on-the-Naze ($472.93$ km), Chigwell ($741.45$ km), and Linton ($297.86$ km). This inconsistency suggests that PSO may amplify outliers if not well-tuned within the hybrid framework.

Table 3. Route lengths (in km) for each destination using different hybrid planning algorithms.

Hospital/Town	ACO (Raw)	ACO-ALNS	GA (Raw)	GA-ALNS	PSO (Raw)	PSO-ALNS	RRT* (Raw)	RRT*-ALNS	RRT*-PSO
Halstead	23.49	23.49	24.48	23.49	23.49	23.49	25.29	23.49	50.21
Colchester	34.65	33.90	40.43	33.90	52.40	33.90	36.62	33.90	38.40
Chelmsford & Essex Hospital	4.79	4.79	10.54	4.79	10.54	4.79	4.93	4.79	5.43
Oaks Hospital	32.95	32.95	35.04	32.95	32.95	32.95	35.22	32.95	37.35
Basildon	23.86	23.86	23.89	23.86	36.69	23.86	24.75	23.86	26.92
Princess Alexandra	26.30	26.30	27.03	26.30	26.30	26.30	28.37	26.30	29.98
St Margaret Hospital	24.29	24.29	24.44	24.29	24.29	24.29	26.01	24.29	24.29
South Benfleet	26.97	26.97	38.25	38.25	106.43	34.84	27.55	26.97	61.57
Walton-on-the-Naze	62.91	54.57	61.92	61.92	57.69	55.88	56.99	53.53	472.93
Chigwell	35.27	35.21	32.61	32.61	77.81	77.81	36.23	33.07	741.45
Linton	39.23	39.23	38.56	38.56	38.56	38.56	39.52	37.29	297.86
Harwich	70.34	57.92	63.18	57.92	87.73	57.92	60.77	57.92	65.14
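As a small worked example of reading Table 3, the sketch below computes the per-destination saving of RRT*-ALNS over raw RRT* and the column means. The values are copied from the two RRT* columns of Table 3; the variable names are illustrative. These single-table means are not the 100-replication averages reported in Table 4, so the two should not be expected to match.

```python
# One-way path lengths in km, copied from Table 3.
RRT_RAW = {
    "Halstead": 25.29, "Colchester": 36.62, "Chelmsford & Essex Hospital": 4.93,
    "Oaks Hospital": 35.22, "Basildon": 24.75, "Princess Alexandra": 28.37,
    "St Margaret Hospital": 26.01, "South Benfleet": 27.55,
    "Walton-on-the-Naze": 56.99, "Chigwell": 36.23, "Linton": 39.52, "Harwich": 60.77,
}
RRT_ALNS = {
    "Halstead": 23.49, "Colchester": 33.90, "Chelmsford & Essex Hospital": 4.79,
    "Oaks Hospital": 32.95, "Basildon": 23.86, "Princess Alexandra": 26.30,
    "St Margaret Hospital": 24.29, "South Benfleet": 26.97,
    "Walton-on-the-Naze": 53.53, "Chigwell": 33.07, "Linton": 37.29, "Harwich": 57.92,
}

# Saving (km) achieved by the ALNS refinement at each destination.
savings = {site: RRT_RAW[site] - RRT_ALNS[site] for site in RRT_RAW}

mean_raw = sum(RRT_RAW.values()) / len(RRT_RAW)
mean_alns = sum(RRT_ALNS.values()) / len(RRT_ALNS)
largest_saving_site = max(savings, key=savings.get)

print(round(mean_raw, 2), round(mean_alns, 2), largest_saving_site)
```

In this table the refinement never lengthens an RRT* path, and the largest absolute saving occurs on the Walton-on-the-Naze route.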

The TWA-MILP model was implemented in PuLP and solved with the CBC 2.10.3 solver within a custom Python 3.12.4 framework, which also integrates real-time components for dynamic UAV simulation and performance analysis.

The problem comprised 623 constraints, 56 decision variables, and 2429 nonzero coefficients. The solver reached optimality in just 0.11 s of CPU time, and the solution satisfied every customer time-window constraint. Serving all customers within their time windows requires a minimum fleet of seven UAVs. The optimal solution for this seven-UAV case yielded an objective value of 918 after 134 solver iterations. In a second scenario, emergency response requests arrive every 15 min. In this case, three UAVs can sustain the demand, provided each completes a full cycle of “customer → delivery → return” in under 45 min, so that the fleet as a whole handles four missions per hour. The three-UAV configuration is formulated in Equation (41), where the vehicle’s payload capacity is 7 kg. Each customer’s request is served exactly once by a single UAV. After completing a mission, each UAV returns to the depot before embarking on its next delivery. UAVs may be reused for multiple missions, but not simultaneously.
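The three-UAV figure for the rolling-demand scenario follows from simple cycle arithmetic, sketched below. The function name is an assumption, and the calculation ignores queueing effects and variability in cycle times.

```python
import math


def min_fleet_for_rolling_demand(cycle_min, interarrival_min):
    """Smallest fleet so that a free UAV exists whenever a new request arrives.

    cycle_min        : time for one UAV to fly out, deliver, and return (minutes)
    interarrival_min : time between consecutive requests (minutes)
    """
    return math.ceil(cycle_min / interarrival_min)


requests_per_hour = 60.0 / 15.0                       # fleet-level throughput
fleet = min_fleet_for_rolling_demand(45.0, 15.0)
print(fleet, requests_per_hour)
```

If a cycle slips past 45 min, for example to 46 min, the same rule calls for a fourth aircraft, which shows how sensitive the fleet size is to the cycle-time bound.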

\sum_{k = 1}^{3} \sum_{i \in N} x_{i j k} = 1, \forall j \in C .

(41)

where $x_{i j k}$ is a binary variable indicating whether UAV k travels from node i to customer j, N is the set of all nodes, and $C \subset N$ is the set of customers. To prevent subtours and ensure route continuity, the Miller–Tucker–Zemlin (MTZ) subtour elimination constraints are applied as follows:

\begin{matrix} u_{i} - u_{j} + | N | x_{i j k} \leq | N | - 1, \forall i \neq j, k = 1, 2, 3 . \end{matrix}

(42)

$u_{i}$ and $u_{j}$ are continuous auxiliary variables used to enforce valid visiting sequences in the UAV routes. UAVs have a limited flight range R, and their remaining range $r_{i k}$ is depleted by the distance $d_{i j}$ travelled on each arc flown, ensuring that UAVs do not exceed their operational travel range.

\begin{matrix} r_{j k} = r_{i k} - d_{i j}, 0 \leq r_{i k} \leq R, \forall i, j \in N, k . \end{matrix}

(43)

The arrival time at node j accounts for the time at node i, the travel time $d_{i j} / v$, and the service duration $τ$, with a large constant M deactivating the constraint when arc $(i, j)$ is not used by UAV k ($x_{i j k} = 0$):

\begin{matrix} t_{j k} \geq t_{i k} + \frac{d_{i j}}{v} + τ - M (1 - x_{i j k}), \forall i, j \in N, k = 1, 2, 3 . \end{matrix}

(44)

The total mission completion time for UAV k is tied to the return to the depot (node $0^{'}$), modelled as

\begin{matrix} t_{0^{'} k} = \sum_{i \in N} (t_{i k} + \frac{d_{i 0}}{v} + τ) x_{i 0 k}, \forall k = 1, 2, 3 . \end{matrix}

(45)

Each customer j must also be served within a predefined time window $[a_{j}, b_{j}]$:

\begin{matrix} a_{j} \leq t_{j k} \leq b_{j}, \forall j \in C, k = 1, 2, 3 . \end{matrix}

(46)
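The constraints in Equations (42)–(46) can be read as a checklist that any single-UAV route must pass. The sketch below walks a candidate route and tests the MTZ ordering, range depletion, time propagation, and time windows on the arcs actually flown. It is a plain feasibility check written for illustration, not the MILP itself; the function name, the dictionary inputs, and the 40 km range used in the example are assumptions.

```python
import math


def check_route(route, dist_km, windows, range_km, v_km_per_min=1.0, tau_min=5.0):
    """Return (feasible, times) for one UAV route such as ["0", "A", "0'"].

    dist_km : {(i, j): d_ij} in km
    windows : {j: (a_j, b_j)} in minutes; nodes without an entry are unconstrained
    """
    n_nodes = len(set(route))
    u = {node: pos for pos, node in enumerate(route)}   # MTZ order labels
    t, r = 0.0, range_km
    times = {route[0]: t}
    for i, j in zip(route, route[1:]):
        # MTZ (42) on a used arc (x = 1): u_i - u_j + |N| <= |N| - 1
        if u[i] - u[j] + n_nodes > n_nodes - 1:
            return False, times
        d = dist_km[(i, j)]
        r -= d                                          # range depletion (43)
        if r < 0:
            return False, times
        a, b = windows.get(j, (0.0, math.inf))
        # Earliest time permitted by (44), waiting until the window opens if needed
        t = max(a, t + d / v_km_per_min + tau_min)
        if t > b:                                       # window upper bound (46)
            return False, times
        times[j] = t
    return True, times


legs = {("0", "Chelmsford"): 4.79, ("Chelmsford", "0'"): 4.79}
ok, times = check_route(["0", "Chelmsford", "0'"], legs,
                        {"Chelmsford": (0.0, 30.0)}, range_km=40.0)
print(ok, {k: round(v, 2) for k, v in times.items()})
```

In the MILP these conditions are imposed simultaneously on all candidate arcs through the binary variables and the big-M term; the check above only evaluates them for one fixed route.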

All results aggregate $R = 100$ independent replications per destination, with the outer-loop seed initialised as $1000 + rep$ for reproducibility. Table 4 presents a comparative benchmark of hybrid path-finding algorithms that combine global planners (ACO, GA, PSO, RRT*) with post-optimisation techniques (ALNS and SA). This benchmark evaluates the effectiveness and efficiency of these hybrid approaches in solving obstacle-rich path-planning problems. Each algorithm was tested under identical conditions, using the same input map, hospital/town destinations, and start location. Because the benchmark wrapper invokes every planner with the same per-replication seed, all stochastic operations, such as node selection, random sampling, or mutation choices, follow the same sequence when a run is repeated, which isolates algorithmic performance from randomness.

The benchmark reports three key metrics: the mean path length (in km), the mean ETA (assuming a uniform UAV cruising speed of 60 km/h), and the average computation time ($t_{calc}$ in seconds). Each value is averaged over 12 destinations, with the standard deviation (Std) included to quantify variability across the routes. RRT*-PSO is excluded from the aggregate benchmark in Table 4 due to pathological outliers; we retain its per-destination values in Table 3 to illustrate failure modes.

Simulated Annealing (SA) generally produces competitive path solutions with reduced computational overhead compared to ALNS. Conversely, ACO-based hybrids yield longer paths with higher variability. The ACO-ALNS variant is the most computationally expensive (12.3 s), reflecting the stochastic nature of ACO exploration and the comparatively limited refinement achieved by ALNS in this setting. The results indicate that hybrids pairing RRT* with either ALNS or SA consistently achieve the shortest path lengths. These outcomes are expected given RRT*’s inherent goal bias and ability to sample feasible connections in cluttered environments. When further refined by SA or ALNS, RRT* paths gain additional smoothness and detour efficiency.
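The seeding scheme described above can be sketched with Python's standard `random` module. The wrapper name, the stand-in `toy_planner`, and the choice to count replications from zero are assumptions; the actual planners and benchmark wrapper are not shown in the paper.

```python
import random
import statistics


def run_replications(planner, n_reps=100, base_seed=1000):
    """Call planner(rng) once per replication with seed base_seed + rep."""
    results = []
    for rep in range(n_reps):          # assumed to start at rep = 0
        rng = random.Random(base_seed + rep)
        results.append(planner(rng))
    return results


def toy_planner(rng):
    """Hypothetical stand-in for a stochastic planner: a noisy path length in km."""
    return 30.0 + rng.random()


first = run_replications(toy_planner)
second = run_replications(toy_planner)
print(first == second, round(statistics.mean(first), 2), round(statistics.stdev(first), 2))
```

Passing a dedicated `random.Random` instance to each run, rather than reseeding the global generator, keeps replications independent of one another while making every individual run repeatable.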

Figure 4a shows the distribution of route lengths across the 12 destinations. RRT*-based hybrids consistently produce the shortest paths with minimal spread, highlighting their reliability and robustness. In contrast, ACO-based hybrids display wider boxes and longer whiskers, reflecting higher variability and a tendency toward longer or less efficient paths. The outliers further confirm that ACO occasionally yields extreme cases. Among all methods, RRT*-SA stands out with the lowest median path length and narrowest interquartile range, making it the most stable performer.

Figure 4b explores the trade-off between ETA (minutes) and computation time (seconds). SA-enhanced variants, particularly RRT*-SA, combine short ETAs with very low runtimes, making them attractive for time-critical operations. GA-SA and PSO-SA also offer favourable trade-offs, maintaining efficient path lengths with runtimes consistently under one second. By contrast, ACO-based hybrids (especially ACO-ALNS) incur significantly higher runtimes (≈12 s on average) without proportional improvements in ETA, underscoring the higher computational burden of ant colony search and its sensitivity to problem scale.

In all cases, the planner computes obstacle-avoiding outbound and return legs, measures their great-circle distances, and converts distance to time using the commanded cruise speed v = 60 km/h. Table 5 summarises the round-trip distances, mission ETAs, and per-leg distances across six more challenging destinations (with long or cluttered corridors). Unlike Table 4, the mission ETAs in Table 5 include a fixed on-site service time of 5 min.

Chelmsford is the closest destination (10.2 km round-trip), yielding the smallest mission time (15.2 min). Harwich and Walton-on-the-Naze are the longest missions at 132.4 km and 121.8 km, respectively, corresponding to cruise times of approximately 132.4 min and 121.8 min at 60 km h⁻¹, to which the 5 min service time is added. Computation times for the hybrid planners are reported separately in Table 4 and remain sub-second for GA/PSO/SA and ≈12 s for ACO-based hybrids.

Table 6 summarises the technical assumptions used in the benchmark simulations. The first section lists the UAV battery and energy model parameters based on the AIR8 pack specification. The vehicle operates with 12 series lithium-ion cells, each with a nominal voltage of 3.7 V, and a pack capacity of 30 Ah, giving a nominal energy of 1332 Wh. Only 80% of this energy is considered usable due to depth-of-discharge limits. To account for temperature effects, a linear derating factor is applied with slope $k_{T} = 0.004$ per degree Celsius, bounded between 0.70 and 1.02. The per-distance energy cost is modelled as an affine function of payload, with a base consumption of 20 Wh/km and an additional 2.0 Wh/km per kilogram of payload. Each mission also assumes a fixed 5-min on-site service time.

The second section of the table specifies the flight and mission conditions assumed for benchmarking. UAV cruise speed is fixed at 60 km/h, the ambient temperature is set to 20 °C, and the payload mass is taken as 2.0 kg to represent a typical medical delivery.

Figure 5 illustrates the battery-aware mission range for AIR8 with four 15 Ah 6S Li-ion packs (effective 12S, 30 Ah), based on the energy model in Equations (32)–(37).

To benchmark our hybrid metaheuristics against a state-of-the-art operations-research solver, we implemented a VRPTW using Google OR-Tools [53]. The depot was fixed at Broomfield Hospital, with 12 demand sites defined in Table 1 and the constraints in Table 6. The visibility graph constructed for 13 sites (including the depot) yielded node degrees ranging from 2 to 10, confirming a well-connected but obstacle-constrained network. Time windows were set with flexible bounds (e.g., Halstead: $[18, 48]$ min; Colchester: $[28, 58]$ min; Chelmsford and Essex: $[0, 30]$ min), ensuring realistic scheduling constraints. Each UAV was assigned to a single customer and required to depart from and return to the depot (Broomfield Hospital). ETAs were derived assuming a constant cruise speed of 60 km/h (1 km/min; see Table 6).

These results are reported in Table 7, which indicates that Chelmsford requires approximately 15 min for a UAV delivery, whereas longer coastal routes such as Walton-on-the-Naze or Harwich take close to 1 h. These findings underline the need for energy-aware planning and NFZ-compliant routing to ensure the feasibility of BVLOS operations.

Comparing the results in Table 3 with the OR-Tools VRPTW solutions shows that, for destinations without NFZs such as Chelmsford and Essex, the best paths coincide (4.8 km in OR-Tools vs. 4.8–10.5 km across the planners in Table 3). However, destinations such as Walton-on-the-Naze, Chigwell, Linton, and Harwich, which are heavily constrained by NFZs, show the greatest divergence; in these cases, the RRT*-ALNS hybrid planner yields geometrically more efficient trajectories. This demonstrates the ability of metaheuristic hybrids to produce shorter solutions under complex airspace constraints.

We tested the RRT*-SA and RRT*-ALNS hybrid algorithms in Airborne Robotics’ AirborneSIM environment, using ArduPilot’s MAVProxy (https://ardupilot.org/mavproxy/, accessed on 1 February 2025), to validate their performance under physical conditions, including velocity, range, and delivery dynamics. AirborneSIM, a physics-based UAV simulation interface, enables real-time monitoring of flight performance and system behaviour. The results demonstrated that the hybrid algorithms effectively identified optimised flight paths in real time, ensuring both feasibility and operational efficiency.

Figure 6 illustrates the integrated digital-twin environment used to verify our blue-light UAV courier concept during simulated missions over Chelmsford. In the central 3D pane, an AIR8 multirotor proceeds along the optimised trajectory at 50 m AGL, allowing visual confirmation that the route clears rooftops and other urban obstacles. The telemetry panel at left streams live flight data: distance to depot (729 m), distance to target (1.29 km), mission time (1 min 15 s), ground speed (≈15 m s⁻¹), altitude, battery voltage (47.8 V, 83% state of charge (SOC)), and GNSS fix, demonstrating compliance with CAA low-altitude (<120 m) and battery-reserve requirements. In the upper right corner, a smartphone interface displays the emergency consignment (AED) and the destination address (75 West Avenue, CM1 2DD), providing first responders with real-time ETA and payload tracking. The lower-right QGroundControl/ArduPilot heads-up display shows waypoint geometry and live heading, allowing operators to monitor the flight continuously.

5. Discussion

This study examined whether combining well-established global planners with lightweight local refiners can yield consistent, regulation-compliant flight plans for time-critical medical logistics in congested UK airspace. Across twelve destinations and three metrics (path length, ETA, and compute time), RRT*-based hybrids performed best overall. In particular, RRT*-SA and RRT*-ALNS tied on mean path length (31.36 km) while RRT*-SA also achieved a co-lowest runtime (∼0.09 s), delivering the most favourable trade-off. Intuitively, RRT* aggressively explores the configuration space and rewires subtrees; a 60% goal bias keeps the search focused even in tight NFZ corridors.

SA’s 2-opt-style segment swaps remove residual zig-zags in sub-second time and, by probabilistically accepting small uphill moves, escape shallow local minima left after RRT* rewiring. Unlike PSO and GA, which need careful tuning of inertia or mutation rates, RRT*-SA remains robust across a wide range of cooling schedules, supporting rapid deployment by non-expert operators.

These findings extend recent reports that RRT* hybrids offer strong anytime behaviour in urban navigation [54,55]. To our knowledge, this is the first county-scale evaluation under UK CAA CAP 722 altitude and NFZ constraints (https://www.caa.co.uk/our-work/publications/documents/content/cap-722/, accessed on 1 March 2025). Coupled with our TWA-MILP results (a minimum of seven UAVs to satisfy all windows in a single wave; three UAVs to sustain one request every 15 min), the stack enables sub-40-min missions for nearby sites and predictable 1–2 h missions (including the return) for the farthest destinations, closing gaps in AED access, mitigating peak-hour road delays, and offering a lower-carbon alternative to ad hoc blue-light couriers. Operationally, the hybrid planner can sit within a UTM/BVLOS stack to generate pre-tactical corridors and update trajectories tactically when ADS-B/C2 (Command and Control) indicates deteriorating connectivity.

Most truck-and-drone studies optimise makespan in rural settings and soften airspace rules; few address pure-drone, BVLOS, urban missions with strict windows. Our results diverge from [23], where GA variants dominated, by showing that GA-centric hybrids exhibit higher variance once real NFZs are introduced. The greater stability of RRT* aligns with [32]; adding SA further reduces mean path length without runtime penalties.

We also solved a distance-minimising VRPTW using Google OR-Tools 9.5.2237 on the same NFZ-aware visibility graph. Two patterns emerged:

NFZ-light corridors: Near-identical distances/ETAs (e.g., Chelmsford and Essex ≈ 4.8 km one-way; 7–15 min missions), indicating that both discrete OR models and continuous-space hybrids are near-optimal when obstacles are mild.
NFZ-heavy or long coastal corridors: The hybrids, especially RRT*-ALNS, often produced shorter paths than the OR-Tools solution built from a precomputed site-to-site matrix. This reflects the hybrids’ capacity to refine geometry in continuous space (shortcutting/rewiring and post hoc smoothing), occasionally revealing slightly shorter NFZ-compliant polylines.

In practice, a complementary workflow is natural: use a geometric hybrid (RRT*-SA/ALNS) to generate high-fidelity, NFZ-aware inter-site costs and pass those to OR-Tools for fleet-level, time-window scheduling, preserving OR-Tools’ temporal feasibility strengths while exploiting the hybrids for geometric optimality.

Four caveats remain:

Wind, NFZ dynamics, and cooperative traffic were emulated in the digital twin; no online re-planning was triggered mid-flight.
The battery model assumed linear discharge; richer electro-thermal models should capture nonlinear chemistries and temperature effects.
Essex has potential rooftop hubs; scaling will require multi-depot MILP variants and hub-and-spoke designs.
Connectivity metrics were not fed back to the optimiser in real time, although this feedback is important for regulatory evidence in nationwide BVLOS corridors.

Wind materially affects stability, energy, and ETA. Headwinds increase energy per kilometre; tailwinds can reduce ETA but add uncertainty; crosswinds raise control effort, especially for multirotors near buildings. Prior work suggests average winds above 8–10 m s⁻¹ can reduce endurance by up to 20% and increase position error [56,57]. Our framework mitigates risk by (i) allowing the RRT*-SA planner to avoid wind-exposed corridors and (ii) letting the TWA-MILP scheduler reassign tasks under adverse conditions. A practical extension is to ingest online weather and telemetry, enabling adaptive re-planning when wind profiles exceed thresholds.

We assume cruise within the legal altitude corridor (90–120 m AGL) and model package drop-off at 90 m with a fixed 5 min service delay; explicit descent/ascent profiles and site constraints (e.g., rooftop vs. ground) would slightly increase time/energy and merit future inclusion. We also plan to extend benchmarking to DRL approaches (DQN, PPO, SAC) [38,39] and advanced operations-research (OR) variants (e.g., branch-and-price, column generation) for drone routing [23,25].

By coupling a goal-biased, sampling-based explorer with a sub-second local optimiser and embedding both in a rigorous scheduler, medical drones can reliably achieve sub-40-min urban missions and 1–2 h coastal missions with fewer than ten aircraft under CAP 722 constraints. The approach is immediately actionable for health authorities seeking to bridge last-mile logistics gaps where every minute matters.

6. Conclusions

We presented an end-to-end framework for time-critical medical logistics in Essex that combines (i) strategic depot placement, (ii) hybrid, obstacle-aware route planning, and (iii) a time-window-aware MILP scheduler under real NFZ, altitude, and battery constraints. Across twelve destinations, RRT*-based hybrids were consistently strongest: RRT*-SA and RRT*-ALNS tied for the shortest average path (31.36 km), and RRT*-SA achieved a co-lowest runtime (∼0.09 s). The TWA-MILP reached proven optimality in 0.11 s and indicated that meeting all time windows in a single wave requires at least seven UAVs; sustaining one request every 15 min is feasible with three UAVs when each sortie (including service/recharge) completes within 45 min.

A head-to-head comparison with a Google OR-Tools VRPTW baseline (same NFZ-aware site matrix, hard windows, one customer per UAV) showed close agreement on NFZ-light/nearby routes and clear hybrid advantages on long or NFZ-constrained corridors (Walton-on-the-Naze, Chigwell, Linton, Harwich), where continuous-space refinement produced shorter feasible polylines than pairwise site costs alone. This supports a hybrid OR co-design: use RRT*-SA/ALNS to generate high-fidelity inter-site costs/geometry, then schedule with OR-Tools to guarantee temporal feasibility and fleet utilisation.

Digital-twin trials in AirborneSIM with the AIR8 confirmed that planned trajectories remain flyable under BVLOS constraints, wind disturbances, and positioning noise, indicating readiness for operational transfer. Together, the components enable sub-40-min missions for nearby sites and reliable 1–2 h missions for the farthest destinations, compliant with CAP 722.

The framework is directly applicable to NHS supply chains for rapid delivery of blood products, AEDs, and time-critical medicines during peak traffic, adverse weather, or disruptions; it also generalises to disaster relief for isolated communities. Embedding the planner within UTM systems provides a pathway to scalable, certifiable UAV corridors for emergency response.

We evaluated four metaheuristics with lightweight refiners; DRL and advanced OR approaches may further improve adaptive replanning or optimality. The battery model assumed linear discharge and constant cruise speed; richer electro-thermal models and stochastic energy use should be incorporated. Scaling beyond a single depot will require multi-depot and hub-and-spoke variants. Finally, live meteorological feeds and in-flight replanning are key for operational deployment.

Next steps. (i) Fuse online wind/telemetry into cost updates to enable tactical rescheduling in OR-Tools; (ii) extend to multi-depot county-wide networks; and (iii) benchmark against DRL and column-generation/branch-and-price OR variants. With these additions, the stack offers a practical blueprint for resilient, low-carbon emergency medical UAV networks where every minute counts.

Author Contributions

Conceptualisation, S.S.E., S.S., A.S. and A.F.-E.; methodology, S.S.E., S.S. and A.S.; software, S.S.E.; validation, S.S.E. and A.F.-E.; formal analysis, S.S.E.; investigation, S.S.E. and A.S.; resources, S.S.E. and S.S.; data curation, S.S.E. and A.F.-E.; writing—original draft preparation, S.S.E.; writing—review and editing, A.F.-E., A.S. and S.S.; visualisation, S.S.E.; supervision, S.S.E. and S.S.; project administration, S.S. and A.F.-E.; funding acquisition, S.S.E., S.S., A.S. and A.F.-E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by UK Research and Innovation (UKRI) Small Business Research Initiative (SBRI), Enhancing Medical Supply Chain Resilience with Drones (Project No. 432304).

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Acknowledgments

The authors would like to acknowledge the use of OpenAI’s ChatGPT GPT-4 model for assisting in improving the syntax and grammar of several paragraphs in this manuscript.

Conflicts of Interest

Author Alex Fraess-Ehrfeld was employed by the company Airborne Robotics Worting House. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

ACO	Ant Colony Optimisation
AED	Automated External Defibrillator
ALNS	Adaptive Large Neighbourhood Search
ETA	Estimated Time of Arrival
GA	Genetic Algorithm
MILP	Mixed-Integer Linear Programming
NFZ	No-Fly Zone
RRT*	Rapidly Exploring Random Tree Star
SA	Simulated Annealing
TWA	Time-Window-Aware
UAV	Unmanned Aerial Vehicle

Figure 1. AIR 8 Medium Lifter UAV, a coaxial octocopter with eight motors and rotors.

Figure 2. System architecture of the UAV medical logistics framework.

Figure 3. The map of the Essex area and corresponding planned UAV paths.

Figure 4. Performance evaluation of the hybrid UAV path-planning algorithms. (a) Distribution of path lengths showing central tendency and variability. RRT*-based hybrids achieve the most consistent and shortest routes, while ACO-based hybrids exhibit greater variability and occasional inefficiency. (b) SA-enhanced hybrids, particularly RRT*-SA and PSO-SA, combine short ETAs with low runtimes, making them well-suited for time-critical operations. In contrast, ACO-ALNS incurs the highest runtimes without proportional ETA improvements, reflecting the computational overhead of ACO search.

Figure 5. Usable range vs. payload for representative ambient temperatures.

Figure 6. AirborneSIM digital-twin mission view with a multirotor AIR 8 UAV (centre) heading north-east above suburban Chelmsford on an automated medical delivery run, while ground stations and mobile interfaces stream live telemetry and logistics data.

Table 1. Identified locations in the Essex area used for the feasibility study.

Customer ID	Locations in Essex	Coordinates (Lat, Long)	Straight Line Distance * (km)	Car Distance (Sunday, 5:30 pm)
1	Halstead Hospital	(51.9570576, 0.6384148)	22.64	27.50
2	Colchester Hospital	(51.9250868, 0.8952196)	33.23	41.84
3	Chelmsford & Essex Hospital	(51.7315422, 0.4711854)	5.02	6.43
4	Oaks Hospital	(51.9075676, 0.8947500)	32.82	41.52
5	Basildon University Hospital	(51.5601640, 0.4522810)	24.19	38.46
6	Princess Alexandra Hospital	(51.7745343, 0.0838406)	26.19	36.37
7	St Margaret Hospital	(51.7207444, 0.1241908)	24.74	31.22
8	South Benfleet	(51.5408692, 0.5655935)	25.04	30.42
9	Walton-on-the-Naze	(51.8439234, 1.2330130)	55.71	71.45
10	Chigwell	(51.6279068, 0.0725821)	31.50	38.62
11	Linton	(52.0859200, 0.2730825)	38.76	49.89
12	Harwich	(51.9371536, 1.2673452)	57.36	70.81

* Distances are taken from https://dronesafetymap.com/?hsCtaTracking=b6fcfc7d-7407-4dc1-a704-87336e60a656%7C553a0e6e-33e7-4a4b-9c3f-7949e384667d#loc=51.7330297,0.4254688,14.385602016438682, accessed on 20 September 2024.

Table 2. Default parameter settings used in the hybrid benchmark.

Algorithm	Parameter (Value) with Description
Ant Colony Optimisation (ACO)	n_ants = 300: ants per iteration;
	n_iter = 500: outer iterations;
	cand_k = 20: size of the candidate list;
	$α = 1.0$: pheromone exponent;
	$β = 2.0$: heuristic exponent $(1 / distance)$;
	$ρ = 0.050$: global evaporation rate;
	elitist = 5: extra deposits of the best-so-far tour.
Genetic Algorithm (GA)	pop_size = 120: initial population;
	patience = 400: stop if no improvement for this many outer iterations;
	elite_frac = 0.10: fraction of top-performing individuals carried over unchanged to the next generation;
	mut_prob = 0.25: probability of applying a mutation (random variation) to an individual.
Particle Swarm Optimisation (PSO)	patience = 60: early-stop stall counter;
	swarm_size = 80: number of particles (candidate paths) in the swarm;
	n_iter = 300: total number of iterations (generations) for the swarm to evolve;
	w = 0.4: inertia weight, i.e., how much of the previous velocity is retained in the new iteration, balancing exploration and exploitation;
	c1 = 0.6: weight of the particle’s own best-known position (personal experience);
	c2 = 0.9: weight of the global best-known position found by the swarm (collective knowledge).
Rapidly Exploring Random Tree* (RRT*)	max_iter = 800: tree-expansion iterations (after direct-edge test);
	step = 0.05°: steering increment in lon/lat degrees;
	$p_{goal} = 0.60$: goal-biased sampling probability;
	radius = $0.15°$;
	smoothing: 100 random shortcuts + 150 repair shortcuts.
Adaptive Large Neighbourhood Search (ALNS)	iters = 600: destroy/repair cycles;
	destroy% = (0.2, 0.4, 0.6): random-percentage removal levels;
	$λ = 0.85$: distance weight in the objective ($1 - λ$ for heading variance);
	w_init = 5.0: initial operator weight;
	sa_steps = 400: SA-repair steps when selected.
Simulated Annealing (SA)	$T_{0} = 4.0$: initial temperature;
	$α = 0.90$: multiplicative cooling factor;
	$T_{min} = 0.05$: freeze-out temperature;
	max_steps = 1200: optimisation steps per call.
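
As a rough illustration of how the SA settings in Table 2 interact, the sketch below counts the temperature levels of a geometric cooling schedule and evaluates the standard Metropolis acceptance probability. It assumes the temperature is multiplied by $α$ once per level until it falls to $T_{min}$; how the 1200 steps per call are distributed over those levels is not stated in the table, and the function names are hypothetical rather than taken from the benchmark code.

```python
import math


def cooling_levels(t0: float, alpha: float, t_min: float) -> int:
    """Count the temperature levels visited while T stays above t_min.

    Assumes geometric cooling T <- alpha * T once per level (an assumption,
    not a detail confirmed by Table 2).
    """
    levels = 0
    temp = t0
    while temp > t_min:
        levels += 1
        temp *= alpha  # multiplicative cooling factor
    return levels


def acceptance_probability(delta_cost: float, temp: float) -> float:
    """Metropolis rule: accept improvements, else accept with exp(-delta / T)."""
    if delta_cost <= 0:
        return 1.0
    return math.exp(-delta_cost / temp)


if __name__ == "__main__":
    n = cooling_levels(t0=4.0, alpha=0.90, t_min=0.05)
    print(f"temperature levels: {n}")
    # The same cost increase is accepted far more readily at T0 than near freeze-out.
    print(acceptance_probability(1.0, 4.0), acceptance_probability(1.0, 0.05))
```

With the Table 2 values the schedule passes through 42 temperature levels before freeze-out, so worsening moves are tolerated early in a call and almost never near its end.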

Table 4. Benchmark results of hybrid planners (mean and standard deviation). Path length $L$ in km, ETA in minutes at $v = 60$ km h$^{-1}$, and computation time $t_{calc}$ in seconds.

Algorithm	L (km)		ETA (min)		$t_{calc}$ (s)
	Mean	Std	Mean	Std	Mean	Std
ACO-ALNS	33.27	15.70	33.27	15.70	12.30	4.11
ACO-SA	33.27	15.70	33.27	15.70	12.19	4.15
GA-ALNS	33.40	15.92	33.40	15.92	0.14	0.03
GA-SA	33.40	15.92	33.40	15.92	0.09	0.03
PSO-ALNS	33.40	15.92	33.40	15.92	0.21	0.06
PSO-SA	33.40	15.92	33.40	15.92	0.15	0.04
RRT*-ALNS	31.36	13.97	31.36	13.97	0.13	0.05
RRT*-SA	31.36	13.97	31.36	13.97	0.09	0.05
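
The identical $L$ and ETA columns in Table 4 are consistent with a simple unit conversion at the stated cruise speed. As a worked check, assuming the ETA is pure flight time at constant speed $v$ (no service, climb, or hover time):

$$\mathrm{ETA}\,[\mathrm{min}] = \frac{60\,L\,[\mathrm{km}]}{v\,[\mathrm{km\,h^{-1}}]}, \qquad v = 60 \;\Rightarrow\; \mathrm{ETA} = L, \qquad \text{e.g. } L = 31.36\ \mathrm{km} \;\Rightarrow\; \mathrm{ETA} = 31.36\ \mathrm{min}.$$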

Table 5. Round-trip distance and mission ETA; Out and Back are the outbound and return leg distances.

Destination	Dist. (km)	ETA (min)	Out (km)	Back (km)
Chelmsford & Essex Hospital	10.2	15.2	4.93	5.32
South Benfleet	56.7	61.7	27.94	28.75
Walton-on-the-Naze	116.8	121.8	56.99	59.79
Chigwell	73.9	78.9	36.23	37.67
Linton	78.8	83.8	40.54	38.27
Harwich	127.4	132.4	63.50	63.94
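
The ETA column in Table 5 appears to equal the round-trip flight time at the 60 km/h cruise speed plus the 5 min service time listed in Table 6. That reading is inferred from the numbers rather than stated in the caption, so the sketch below is only a consistency check; `mission_eta_min` is a hypothetical helper name.

```python
def mission_eta_min(out_km: float, back_km: float,
                    speed_kmh: float = 60.0, service_min: float = 5.0) -> float:
    """Round-trip flight time at constant cruise speed plus a fixed service time.

    The additive service time is an assumption inferred from Tables 5 and 6.
    """
    flight_min = (out_km + back_km) / speed_kmh * 60.0
    return flight_min + service_min


if __name__ == "__main__":
    # (destination, outbound km, return km, ETA reported in Table 5)
    rows = [
        ("South Benfleet", 27.94, 28.75, 61.7),
        ("Walton-on-the-Naze", 56.99, 59.79, 121.8),
        ("Harwich", 63.50, 63.94, 132.4),
    ]
    for name, out_km, back_km, reported in rows:
        eta = mission_eta_min(out_km, back_km)
        print(f"{name}: computed {eta:.1f} min, reported {reported} min")
```

Keeping the two legs as separate arguments matters here because the outbound and return paths around no-fly zones differ in length.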

Table 6. Battery, energy model, and mission parameters.

Battery and Energy Model	Value/Description
Number of series cells ( $N_{s}$ )	12
Nominal cell voltage ( $V_{nom}$ )	3.7 V
Capacity ( $C_{Ah}$ )	30.0 Ah
Nominal energy ( $E_{nom}$ )	1332 Wh
Usable depth of discharge ( $ρ_{DoD}$ )	0.8 (80%)
Temperature slope ( $k_{T}$ )	0.004/°C
Temperature factor bounds	$f_{min} = 0.70, f_{max} = 1.02$
Base energy cost ( $E_{0}$ )	20.0 Wh/km
Payload coefficient ( $k_{m}$ )	2.0 Wh/km/kg
Service time ( $t_{svc}$ )	5 min
Flight/Mission	Value/Description
Cruise speed (v)	60 km/h
Ambient temperature (T)	20 °C
Payload mass (m)	2.0 kg
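
The parameters in Table 6 can be combined into a small range estimate. The sketch below is a minimal reading of the table, not the authors' implementation: it assumes the per-kilometre energy is linear in payload ($E_{0} + k_{m} m$, as the units suggest), that the temperature factor is linear in the offset from a reference temperature and clipped to $[f_{min}, f_{max}]$, and that the reference temperature is 25 °C, a placeholder that this section does not specify. All function names are hypothetical.

```python
def nominal_energy_wh(n_series: int = 12, v_nom: float = 3.7,
                      cap_ah: float = 30.0) -> float:
    """Pack energy: series cells x nominal cell voltage x capacity."""
    return n_series * v_nom * cap_ah


def temperature_factor(temp_c: float, ref_c: float = 25.0, k_t: float = 0.004,
                       f_min: float = 0.70, f_max: float = 1.02) -> float:
    """Linear derating around an ASSUMED reference temperature, clipped to bounds."""
    return max(f_min, min(f_max, 1.0 + k_t * (temp_c - ref_c)))


def energy_per_km(payload_kg: float, e0: float = 20.0, k_m: float = 2.0) -> float:
    """Base energy cost plus a payload-proportional term, in Wh/km (assumed linear)."""
    return e0 + k_m * payload_kg


def usable_range_km(payload_kg: float, temp_c: float, dod: float = 0.8) -> float:
    """Usable energy (nominal x depth of discharge x temperature factor) / Wh per km."""
    usable_wh = nominal_energy_wh() * dod * temperature_factor(temp_c)
    return usable_wh / energy_per_km(payload_kg)


if __name__ == "__main__":
    print(f"nominal energy: {nominal_energy_wh():.0f} Wh")
    for temp_c in (0.0, 20.0):
        for payload_kg in (0.0, 2.0, 4.0):
            rng = usable_range_km(payload_kg, temp_c)
            print(f"T = {temp_c:4.1f} C, m = {payload_kg:.1f} kg -> {rng:.1f} km")
```

The nominal energy reproduces the 1332 Wh entry in Table 6; the range figures depend on the assumed reference temperature and should be treated as indicative only.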

Table 7. NFZ-aware UAV assignments and ETAs for the VRPTW, solved with Google OR-Tools.

UAV ID	Destination	Round-Trip Distance (km)	ETA (min)
0	Halstead	23.5	26
1	Colchester	33.9	36
2	Chelmsford & Essex	4.8	7
3	Oaks Hospital	33.0	36
4	Basildon	23.9	27
5	Princess Alexandra	26.3	29
6	St Margaret	24.3	27
7	South Benfleet	27.0	30
8	Walton-on-the-Naze	57.3	60
9	Chigwell	35.2	38
10	Linton	52.3	55
11	Harwich	57.9	61

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
