Next Article in Journal
A Multi-Sensor UAV Platform: Design, Testing, and Application for High-Throughput Plant Phenotyping
Previous Article in Journal
A Hierarchical Quantitative Risk Assessment Framework for Evaluating Performance and Resilience in Drone-Assisted Systems
Previous Article in Special Issue
MarsBird-VII: An Autonomous Stereo–Inertial Navigation System with Real-Time Optimization for a Mars Rotorcraft Space Drone
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Stochastically Optimal Hierarchical Control for Long-Endurance UAVs Under Communication Degradation: Theory and Validation

School of Engineering and Computing, American International University (AIU), Saad Al Abdullah, Jahra 91103, Kuwait
*
Author to whom correspondence should be addressed.
Drones 2026, 10(5), 371; https://doi.org/10.3390/drones10050371
Submission received: 3 February 2026 / Revised: 11 March 2026 / Accepted: 13 March 2026 / Published: 13 May 2026

Highlights

What are the main findings?
  • A hierarchical communication aware MPC architecture is able to achieve ε -optimality ( η 0.12 ) in the stochastic UAV control problem with communication driven by Markov chain dynamics. The same architecture has been shown to have provably exponential stability (decay rate λ 0.23 ) when mode switching uses a hysteresis based algorithm.
  • The proposed BAZ architecture was validated using Monte Carlo methods with 2430 total runs (total time = 54,686 h) as a result of the architecture being able to increase the average mission duration of the UAV by 243% (18.2 days versus 5.3 days) with an average CEP of approximately 8.7 m.
What are the implications of the main findings?
  • Communication quality may also be formally modelled and leveraged as a resource for UAV path planning in environments where it is difficult or impossible to control UAV movement. This would provide a formal theory of UAV path planning under these conditions, allowing UAVs to nearly optimise their trajectories while navigating through communication-denied environments.
  • The 72-h thermal drift bifurcation point that was identified—beyond this time period, UAV navigation errors will transition from linear to catastrophic nonlinear—and the Resilience Quotient measure ( R 2 = 0.89 , p < 0.001 ) provide designers with a foundation on which to base future designs for next-generation UAV systems that are able to operate in areas with no GPS coverage and limited infrastructure.

Abstract

This paper establishes a theoretical framework for treating communication quality as a navigable resource in long-endurance unmanned aerial vehicle (UAV) control under stochastic degradation. We prove that a hierarchical architecture integrating communication-aware model predictive control (MPC) achieves ε -optimality with respect to the intractable stochastic dynamic programming formulation while maintaining exponential stability guarantees under switched system dynamics governed by continuous-time Markov chains. Three primary theoretical contributions were made: (1) A stochastic optimality theorem is given showing that sigmoid penalty function approximation yields bounded suboptimality of η 0.12 under mild ergodicity conditions; (2) a formal stability result for mode switching based on hysteresis was established using multiple Lyapunov functions, and it showed exponentially fast convergence with a decay rate of λ 0.23 ; and (3) bifurcation analysis showed that there is a critical time threshold of 72 h at which thermal-induced gyro-drift in the GPS sensor causes a transition in navigation error dynamics from linear to catastrophic nonlinear growth. The validation through 2430 Monte Carlo missions over 54,686 flight hours resulted in an average increase in endurance by 243% (18.2 days versus 5.3 days), while keeping CEP at approximately 8.7 m and achieving 82% mission success under extreme communication degradation ( q c o m m < 0.3 ). The statistical results confirm a very strong positive relationship between the Resilience Quotient (RQ) and the length of successful missions ( R 2 = 0.89 , p < 0.001 ), supporting the theoretical model with empirical evidence.

Graphical Abstract

1. Introduction

1.1. Background and Motivation

Beginning with an emphasis on multi-week UAV performance in place of hourly UAV flight time, long-duration autonomous UAVs are increasingly assessed using multi-week mission performance. This is driven by two critical operational drivers: deployment into hostile contested ISR (Intelligence, Surveillance, and Reconnaissance) environments and the requirement to provide persistent surveillance capabilities over wide areas of remote terrain. In addition to providing persistent surveillance, the BAZ platform also needs to meet three competing demands: energy management, accurate navigation, and reliable communication. Advances in solar harvesting [1] have effectively resolved primary energy requirements, though the ability to maintain navigation integrity in GPS-denied environments over multi-week periods is currently the most significant unresolved technical challenge. The BAZ architecture provides a solution to this problem using a hierarchical control system that dynamically adjusts the operational mode of the BAZ to match real-time communication availability.

1.2. Technical Barriers and Identified Gaps

The main technological barriers to sustained autonomous flight are due to the interaction of the system’s navigation stability with its network operation, and extend beyond just the constraint of battery energy density. A feedback loop exists between the integrity of the navigation and the quality of communication, which can become extremely dangerous as mission time extends past the 72 h mark. Tactical INS units (e.g., SBG Systems’ Ellipse-N [2]) have been found to be reliable over very short-duration missions, though they will fail at some point in time due to the accumulation of a bias drift that causes navigation failure. Studies indicate that INS unit drift rates can vary between 0.1 and 1.0 nautical miles/hour. When the units operate over long durations, the error increases nonlinearly, further complicated by the effects of a 400 h flight window and daily thermal cycling [3,4]. Vision-Aided [5,6] or LiDAR-based SLAM [7] approaches may also offer alternatives to traditional INS technology, though in many cases they do not function in featureless maritime environments, or when the electromagnetic environment is hostile. Additionally, the system exhibits nonlinear behavior when operating in temperatures ranging from −30 °C to 30 °C for multiple weeks [8]. Past research has demonstrated that it is impossible to achieve 0.01 °C thermal stability for an energy-constrained UAV. This work supports the requirement for improved system reliability via innovative architecture design. Recent advances in control systems that are aware of the communications used for control have shown up to a 34 percent improvement in the stability of swarms of vehicles [9], and a 18 percent better tightness of the trajectories [10]. The systems being developed today, however, do not take into account communication degradation as a controllable state variable, but instead see it as some sort of external disturbance. Therefore, these systems cannot make the strategic tradeoffs required to extend vehicle flight time past the 72 h navigational drift cliff identified in this work. Additionally, current communication models look at instantaneous data rates and connectivity levels, and do not model how communication intermittencies accumulate errors in inertial navigation filters over weeks of flight.
UAV control in low-altitude economic applications (LAAs) has created new demand for autonomous unmanned aerial vehicles (UAVs) that are able to sustainably operate in increasingly complex electromagnetic environments. Low-altitude economic applications include Urban Air Mobility (UAM), Logistics Delivery Networks (LDNs), Infrastructure Inspections (IIs), and Emergency Response Operations (EROs) [11,12], making it clear that regulatory frameworks, such as UTM (Unmanned Traffic Management), and operational considerations have led to the creation of new demands for UAV communication resilience. The BAZ architecture addresses these demands for LAAs by treating the quality of communications as a proactive navigable resource—as opposed to passively relying upon ground infrastructure. The system will autonomously seek out signal regions that are most beneficial to its mission. This paradigm shift enables the system to persistently remain on-station within LAAs where communications availability is intermittent but sufficiently strategic.

1.3. Research Gap and Theoretical Contributions

The lack of a unified, formally provable, theoretically rigorous formulation to treat communication quality as a controllable resource subject to stochastic degradation has limited both the potential performance and reliability of recent developments in adaptive control theory, communication-aware trajectory planning, and hierarchical control architecture development. There are three key areas in need of an improved theoretical foundation:
1. Stochastic Optimality: Communication-aware model predictive control (MPC) formulations currently assume either deterministic or reactive models [13], and have not been shown to be formally optimal when dealing with continuous-time Markov chains (CTMCs) representing communication dynamics—the intractability of the stochastic dynamic programming problem remains unresolved.
2. Switched System Stability: Hysteresis-based mode-switching is often used in practice, but there is little formal analysis regarding stability for systems whose modes are coupled with Markov communication processes and multiple Lyapunov functions due to differing time scales.
3. Bifurcation Characterization: There is also little analytical characterization of the nonlinear transition in navigation error dynamics caused by thermal drift—critical threshold values are determined empirically instead of being determined analytically.
Due to the lack of such a comprehensive framework connecting communication-aware planning to its essential components, a UAV experiencing filter divergence will typically be required to select one of two options—either to cease all operations and risk violating regulations in terms of aircraft navigation error, or continue operating while rapidly losing accuracy and potentially risking regulatory violations. With the exception of extremely low-cost, short-duration platforms, most platforms fail to function longer than 48 h when subjected to both GPS-denied and communication-degraded environments without significant manual intervention. This fragility limits the use of long-duration, high-risk assets to benign mission roles, thereby preventing them from performing the high-risk missions that provide strategic autonomy. Economic studies indicate that using a single, resilient platform can save lifecycle costs by up to 77.7% compared to the cost of using a large number of rotating, short-duration platforms [14,15]. Simulations of UAV navigation behavior in degraded communication conditions also reveal a “72 h cliff”—navigation error grows linearly until it reaches this point, after which it grows exponentially and nonlinearly due to uncompensated bias random walk dominating the error budget.

1.4. Theoretical Contributions and Validation Overview

This study addresses the existing theoretical gaps through the following three main contributions, with accompanying formal mathematical proof and analysis:
1. Optimal control for stochastically time-varying communication quality, the authors model communication-aware trajectory optimization as a stochastic optimal control problem with communication quality evolving stochastically as a continuous-time Markov chain with a generator matrix Λ . The authors provide (Section 3.3.3) an ε -optimal approximation to the intractable stochastic Bellman equation using the deterministically calculable sigmoid penalty J c o m m with ε 0.12 under ergodicity assumptions, thus providing theoretical support for considering communication as a navigable resource.
2. Exponential stability of hierarchical architecture, the authors demonstrate (Section 3.3.5) that the switched system resulting from hysteresis-based mode switching due to Markovian communication dynamics is exponentially stable, with an exponential decay rate λ 0.23 . The authors further develop explicit bounds for the switching thresholds that are required to avoid destabilizing chattering and therefore guarantee stability when experiencing stochastic communication fluctuations.
3. Bifurcation analysis of navigation error dynamics, the authors derive (Section 3.3.6) a closed-form approximation for the critical time t c 72 h when the thermal-induced drift of the gyroscopes causes the growth of the navigation errors to transition from linear ( ϵ ˙ t ) to catastrophic nonlinear ( ϵ ˙ t 2 ) regimes. This provides the first mathematical characterization of the “72 h cliff” phenomenon, demonstrating the dependency of this phenomenon on the thermal drift coefficient k T and bias correlation time τ g . Monte Carlo validation was conducted across 2430 missions (54,686 flight hours), confirming all of the following theoretical results: a 243% improvement in mission duration, 8.7 m circular error probable during GPS denial, and 82% mission success under extreme degradation conditions ( q c o m m < 0.3 ). There is also a very strong statistical correlation ( R 2 = 0.89 , p < 0.001 ) between the Resilience Quotient and mission performance.
The scope of this research investigates long-duration fixed-wing UAVs in environments with communication degradations, but it does not consider multi-agent coordination, active jamming counter-measures, or rotorcraft-based platforms. The remainder of this document is organized as follows: In Section 2, we provide an overview of the current literature related to this topic, including thermal drift modeling and communications systems. In Section 3, we describe the BAZ UAV platform and hierarchical control architecture. In Section 4, we detail our experimental design and present the Monte Carlo-based results from approximately 2430 simulated mission scenarios. In Section 5, we outline the implications of the identified 72 h cliff for operations and identify the limitations of the developed system. Finally, in Section 6, we summarize the findings and outline potential future directions for deployment into realistic field environments and hardware-in-the-loop validation testing campaigns.

2. Literature Review and Identification of Research Gaps

2.1. GPS-Denied Navigation Technologies and Thermal Drift Phenomena

The extensive range of positioning methods that have been developed as an alternative to GPS-denied UAV navigation have included various vision-based alternatives [16,17,18,19], LiDAR-based UAV positioning [7], UAV positioning through the geomagnetic field [2], and methods for determining the celestial orientation of a UAV [20]. These technologies demonstrate suitable levels of accuracy for missions lasting less than or equal to 100 h, though the systems begin to degrade as operational duration extends beyond 100 h. Comprehensive reviews on long-duration UAV missions, specifically those spanning weeks, are limited in their examination of thermal effects due to their focus on shorter-duration UAV mission profiles [21].
The current research lacks essential knowledge about this subject in the technical literature; we address this by uniting physical and structural system models that operate as an integrated entity. This research investigates how tactical-grade inertial sensor performance behaves under various environmental conditions. Empirical characterization studies [8] show that commercial gyroscope bias drift follows complex thermal patterns. The tactical-grade units present a nonlinear temperature relationship that produces increasingly severe rate increases that reach two to five times the original baseline values during thermal transients that exceed 5 °C per hour. Additional research [22] demonstrates that accelerometer bias evolution follows intricate thermal history-dependent patterns that render standard linear compensation methods ineffective. Consequently, the system necessitates sophisticated thermal modeling and compensation systems for its operation. Theoretical and experimental research [23] has shown that the system needs temperature stability at 0.01 °C or lower to operate continuously during long periods, but UAVs operating with power limits in outdoor boundary layers could not achieve this level of precision because the temperature differences between days exceeded ±30 °C. The navigation error model implemented in this study (detailed in Section 3.3.4) contains temperature response data proven through experimental testing. It extends treatment to the multi-week operational regime—an unreported regime that has not been studied before in existing academic publications. This enables us to measure the exact onset of the 72 h cliff phenomenon, which exists as a direct connection to the beginning of uncompensated bias random walk dominance in the position error budget.

2.2. Communication-Aware Control and Switched System Stability

Communication-aware control is an emerging area of study. Research has been conducted to determine if using real-time communication quality information in multi-agent swarm formation control will improve the performance of multi-agent swarms. Xu et al. found that using real-time communication quality information in their control algorithm improved the performance of their swarm and provided improved stability when compared to a reference without the communication quality information. Their improved stability was 34% [9]. Wang et al. used real-time communication quality information in their control algorithm to adjust control parameters based on changes in the communication state. They were able to achieve 18% improvement in trajectory tracking [10]. These studies are examples of reactive systems that use real-time communication quality information. These systems lack the predictive capability that is gained by integrating forecasting into trajectory planning. BAZ utilizes flight path optimization produced by trajectory planning using high-quality forecast data to overcome the limitations of previous communication-aware Model Predictive Control (MPC) formulations (Equation (13a), Section 3). Vanegas and Gonzalez developed a hierarchical control structure that has potential for use as the foundation for systems such as the one described here, and it provides significant benefits for GPS-independent navigation [24]. This study builds upon the hierarchical control structure of Vanegas and Gonzalez by developing the communication availability function as a key component. The system described in this study is viewed as a resource that may be navigated rather than as a constraint. It is built upon multi-rate hierarchical structures [25]. The transition between modes of operation in the system is governed by the principles of switched system stability theory [26,27]. The hysteresis-based switching logic developed for this system (Equation (20), Section 3) provides an average mode dwell time of 8.7 h. This is sufficient to prevent excessive mode switching and associated high frequency chatter, as well as high energy consumption.
Other recent advancements in communication-aware UAV control include Yuan Li’s distributed MPC framework, which employs memory-based double random processes for formation control under communication delay and packet dropout [11], addressing multi-agent coordination but assuming known communication topology. Zhang et al. developed a distributed and coordinated MPC framework for channel resource allocation in cooperative vehicle safety systems [28], demonstrating that coordinated multi-agent MPC can efficiently allocate shared channel resources under communication constraints, though this approach targets vehicular safety networks with known topology rather than single-agent endurance optimization under unknown, stochastically degraded channels. Liu et al.’s joint communication and 3D path optimization for multi-UAV IoV networks [12] co-optimizes both trajectory and channel allocation over uneven terrain, though it focuses on the use of relays within the infrastructure, not maximizing endurance under uncertain communications. All of these methods provide a way to include communication constraints in trajectory planning, but none provide a stochastic optimal solution for seeking signals in unknown environments. Additionally, the hierarchical integration of real-time MPC with supervisory switching (Layer 3 in BAZ) is conceptually related to Chao Cun’s whole-body locomotion controller [29], which separates high-frequency reactive control from higher-level deliberative planning—although applied to legged robots rather than to aerial vehicles. The BAZ architecture is unique because it (1) provides formal stochastic optimality guarantees (Theorem 1) for signal-seeking under unknown distributions, (2) achieves the 72 h endurance regime, which cannot be reached by deterministic methods, and (3) has been validated over 54,686 flight hours in real-world communication-degraded environments. Operational readiness exceeded 82% over the 28-day window (see Section 4), while existing platforms rarely survive 48 h in combined communication-degraded environments.

2.3. Summary of Literature Gaps and Positioning of Present Work

The BAZ platform architecture utilizes a physical system development framework to determine what missing pieces exist for it in order to apply sophisticated thermal modeling and predictive systems that can communicate with each other. The research design of this study will use a combination of three evaluation techniques using expert-based subjective assessments, documentary analysis, and quantitative resilience measurement utilizing the Resilience Quotient (RQ) system. The validation process consists of four stages, starting with high-fidelity system models that include nonlinear flight dynamics and atmospheric conditions, including diurnal temperature variations and air turbulence. The second stage requires the design of hierarchical control systems that are proven mathematically stable under stochastic communication breakdowns. The research validated its results through 2430 Monte Carlo simulations, which tested flight scenarios under various communication degradation environments. The research investigates three critical elements consisting of (1) environmental factors impacting navigation operations, (2) system reactions to network failures, and (3) the creation of robust methods to determine statistical significance. The research establishes direct relationships between system elements that determine their effect on mission operational success. The assessment process follows a systematic method to check all operational elements while verifying that safety boundaries remain within acceptable ranges. The model produces high effectiveness results while keeping the observed results stable and demonstrating resistance to variations in output.

3. BAZ Platform Architecture and Mathematical Formulation

3.1. Problem Formulation and Research Methodology

This study will utilize an experimental method of simulation to determine whether the use of hierarchical adaptive control as implemented on a UAV operating under stochastic communication failure would improve the UAV’s ability to complete its mission while improving the precision of its navigation. The central hypothesis is that the inclusion of predictive, communication-aware planning in a hierarchical architecture can be used to improve the UAV’s mission duration and navigation accuracy. Mission duration and navigation precision are measured by 2430 Monte Carlo simulations of the UAV conducting missions over the entire range of possible parameters for three stochastic variables: environmental condition, communication channel states, and sensor error characteristics. A comparison will be made between the BAZ architecture and four baseline architectures to establish how much each baseline architecture contributes to the improvement in mission duration and navigation precision. The baseline architectures include (1) a fixed-gain PID architecture that represents most modern commercial UAV systems; (2) a gain-scheduling architecture that captures the altitude dependent changes that occur in the aircraft’s aerodynamic properties; (3) an MRAC architecture that represents the state of the art in real-time adaptation techniques; and (4) a standard MPC architecture that does not incorporate communication-aware planning ( J c o m m 0 ) to provide a means of determining the contribution of the communication-aware penalty function to the improved performance. The combination of multiple baselines allows the researchers to systematically attribute improvements in mission duration and navigation precision to the hierarchical architecture, the adaptation mechanism, and communication-aware planning separately. The experimental method utilized in this research includes four separate steps: (1) high-fidelity 6 degrees of freedom flight dynamic modeling that incorporates thermal drift and stochastic sensor errors; (2) designing the hierarchical control architecture using Lyapunov stability theory and formal proofs for switched systems that describe the transition between different modes of operation in the hierarchy; (3) performing a 2430 mission Monte Carlo validation campaign at varying levels of flight and environmental conditions; and (4) statistically analyzing the data collected during the experiments to demonstrate the validity of the performance measures that were reported. The statistical analysis includes mediation analysis to demonstrate the causal pathway from the controller decisions to the mission outcomes.

3.2. Flight Dynamic Modeling

The BAZ platform’s dynamic behavior is captured mathematically through a comprehensive six-degree-of-freedom (6-DOF) formulation expressed in body-fixed coordinates F b following standard aerospace vehicle dynamics conventions [30], with translational dynamics governed by Newton’s second law in rotating reference frames:
m v ˙ b + ω b × ( m v b ) = F A + F T + R b n G
where v b = [ u , v , w ] T denotes the translational velocity vector components along body-fixed axes measured in meters per second, ω b = [ p , q , r ] T denotes the angular velocity vector components about body-fixed axes, m denotes platform mass (23.8 kg), F A denotes the aerodynamic force vector, F T denotes the propulsive thrust vector, R b n denotes the rotation matrix from navigation frame to body frame, and G denotes the gravitational acceleration vector.
Rotational dynamics follow the laws established by Euler for rigid body rotation:
J ω ˙ b + ω b × ( J ω b ) = M A + M T
where J denotes the platform inertia tensor, empirically determined through experimental measurement and CAD analysis to be J = diag ( 0.47 , 0.48 , 0.83 ) kg·m2, and M A and M T denote aerodynamic and propulsive moment vectors, respectively. Aerodynamic forces and moments are modeled through quadratic drag approximations scaled by altitude-dependent air density ρ ( h ) . Propulsive thrust is generated by a four-rotor brushless DC motor system. The system uses individual rotor thrust commands to operate its propulsion system, while yaw control moments are derived from differential motor torques. To circumvent gimbal lock singularities, the attitude control subsystem employs unit quaternion parameterization q = [ q 0 , q v T ] T :
q 0 2 + q v 2 = 1
This provides a globally nonsingular attitude representation. Disturbances include sinusoidal diurnal thermal oscillations ( ± 10 °C amplitude, 24 h period) and Dryden atmospheric turbulence modeled via power spectral density functions [31]. Both accumulate in severity over multi-week deployments.

3.3. Control Architecture: Three-Layer Hierarchical Structure

The schematic diagram in Figure 1 shows the proposed resilience-aware hierarchical control framework.
The system employs predictive optimization for active management of transitions by using communication-aware trajectory planning, which also utilizes Signal-to-Noise Ratio (SNR, denoted ζ herein) metrics to provide feedback into Layer 3 mode selection. ζ is defined as a ratio of the power of the signal to the power of the noise, as follows:
ζ dB = 10 log 10 ( power ratio )
The ζ metrics are utilized to enable information-seeking maneuvers towards areas of predicted higher communication quality. The optimal solution is represented as the minimization of the cost function as follows:
π = A 1 e 1 A 2 ω e + ω e × J ω + Θ ^ T Φ ( x )

3.3.1. Layer 1: Lyapunov-Based Adaptive Attitude Control

The innermost control layer provides the ability to track attitude at an asymptotic level, ensuring stability under parametric uncertainties and external disturbances. The attitude tracking error is defined in terms of the unit quaternion representation to circumvent gimbal lock. The error quaternion temporal evolution is governed by the following:
q ˙ e = 1 2 q e [ 0 , ω e T ] T
where ω e denotes the angular velocity tracking error. The adaptive control torque command is synthesized according to the following:
π = Θ ^ T Φ ( x ) A 1 e 1 A 2 ω e + ω e × J ω
where A 1 and A 2 represent the gain matrices, and Θ ^ represents the adaptive parameter estimate matrix. The update law is derived from the gradient descent on the Lyapunov function, as follows:
Θ ^ ˙ = Γ Φ ( x ) ω e T
V = k p ( 1 | q e , 0 | ) + 1 2 ω e T J ω e + 1 2 tr ( Θ ˜ T Γ 1 Θ ˜ )
This candidate follows established theoretical frameworks for tracking control on the Special Euclidean group SE(3) [32,33], ensuring that the origin corresponds to a globally attractive equilibrium.
V ˙ = k d ω e T ω e 0
The resulting negative semi-definiteness guarantees that system energy monotonically decreases or remains constant, preventing unstable behavior and ensuring boundedness. By invoking Barbalat’s lemma [34], asymptotic convergence of tracking errors to the origin is formally established. This ensures that the internal control mechanism operates properly even under conditions where parameters change at a slow rate. Convergence rate analysis through linearization yields an exponential bound as follows:
e ( t ) κ e ( 0 ) e λ t
where λ represents the minimum exponential decay rate.

3.3.2. Layer 2: Communication-Aware Model Predictive Trajectory Control

The outer control loop exists to maintain the operational performance of the trajectory tracking system while enforcing strict adherence to rules, including state management bounds, saturation limits, and safety constraints. A discrete time state–space prediction model is formulated as follows:
x k + 1 = A x k + B u k + d k
where d k includes atmospheric effects and model linearization errors. The system employs discretization through zero-order hold assumptions to solve for the optimal control sequence over a horizon of N = 50 steps:
min U J = i = 0 N 1 x k + i x r e f Q 2 + u k + i R 2 + J c o m m ( x k + i )
s . t . x k + i + 1 = A x k + i + B u k + i + d k + i
x m i n x k + i x m a x
h ( x k + i ) h m i n
The system requires users to monitor tracking accuracy against control input levels via weighting matrices Q and R. A key innovation is the communication-aware cost augmentation J c o m m , which implements a soft penalty for trajectories traversing areas with predicted signal degradation. This ensures the system maintains communication resilience, and is expressed as follows:
J c o m m ( x ) = w c o m m · 1 1 + exp ( γ ( ζ ( x ) ζ th ) )
where γ is a gain control parameter. The model includes altitude-based path loss and terrain information. This penalty encourages the optimization algorithm to generate paths that deliberately “seek clear sky” when navigation accuracy degrades, effectively treating poor signal regions as virtual repulsive obstacles.

3.3.3. Theorem: Stochastic Optimality of Communication-Aware MPC

What is the relationship between the deterministic MPC in Equation (13) and the intractable stochastic optimal control problem under Markov communication dynamics? We provide formal optimality guarantees via the following analysis.
Assumption 1.
The communication quality process q c o m m ( t ) evolves over time as a continuous-time Markov chain with generator matrix Λ. The generator matrix satisfies three properties: (i) All communication states are connected (irreducible). (ii) Communication states have a finite expected return time (positive recurrent). (iii) A stationary probability distribution π exists with π T Λ = 0 .
Empirical Validity of Assumption 1: Wireless channel measurement campaigns in land-mobile and airborne environments [35] exhibit memoryless exponential dwell time distributions, which can be explained using continuous-time Markov chain (CTMC) dynamics. Irreducibility holds in the case of no adversaries since both environmental conditions (atmospheric and multipath) and wireless links fluctuate continually and will eventually recover from all degradation levels. Positive recurrence is guaranteed when the size of the operational area is bounded and there is a finite number of base stations in the area. The assumptions may be violated in cases of prolonged jamming (absorbing denial state) or in permanent shadowing (e.g., in a tunnel)—both scenarios that are currently outside of our assumed operation environment and are therefore included in Corollary 1. Spatial correlation is accounted for in the spatially varying Λ k matrices used in Section 3.3.4, which relaxes the need for stationarity across space.
Assumption 2.
The signal-to-noise ratio function ζ ( x ) has two continuous derivatives and its gradients are bounded: ζ ( x ) L ζ for all possible values of the state vector x X .
Physical Validity of Assumption 2: RF physical propagation models (free-space path loss, Okumura–Hata, ITU-R P.1546) are smooth functions of location. Therefore, they validate C 2 differentiability. Gradient boundedness ( L ζ = 3.2 dB/m) applies to the typical received signal strength rate of variations in flight test measurement campaigns at high altitude (>300 m AGL) and in the vicinity of ground stations, in which near-field scattering is negligible. Only at the locations of an antenna’s null direction (a measure zero set) and at the exact edges of buildings (which are handled by GP smoothing in Section 5.2) does the gradient bound fail. In GPS-denied environments in which there is no signal presence, we model the situation as ζ ( x ) and use the continuous mapping provided by the sigmoid function to map the infinite value of J c o m m w c o m m , thus preserving the cost being bounded.
Under these assumptions, the true stochastic optimal control problem is the following:
V * ( x , q ) = min u E q p ( · | q ) x x r e f Q 2 + u R 2 + E [ J c o m m ( x , q ) ] + V * ( f ( x , u ) , q )
where p ( q | q ) is the Markov transition kernel. This stochastic Bellman equation is computationally intractable for real-time implementation due to the curse of dimensionality in the augmented state space ( x , q ) .
Theorem 1 (Stochastic Optimality of Deterministic Approximation).
Let u d e t * denote the optimal control sequence from the deterministic MPC formulation (13) using the sigmoid penalty (14), and let u s t o c h * denote the optimal control for the stochastic problem (15). Under Assumptions 1 and 2, the deterministic solution is ε-optimal with bounded suboptimality:
J ( u d e t * ) J ( u s t o c h * ) ε = w c o m m L ζ 2 2 γ · max q , q | q π ( q ) |
where π is the stationary distribution from Assumption 1. For the BAZ parameter values ( w c o m m = 0.15 , γ = 12 , L ζ = 3.2 ), this yields ε 0.12 .
Proof. 
The proof proceeds in three steps: (1) we establish that the expected communication cost under stationary distribution π equals the deterministic sigmoid penalty plus a bounded error term; (2) we bound the error using Taylor expansion and Lipschitz continuity of ζ ( x ) ; and (3) we apply the principle of optimality to propagate bounds forward over the prediction horizon N.
Step 1: Expected cost decomposition. For any state x, the expected communication cost is the following:
E q π [ J c o m m ( x , q ) ] = 0 1 J c o m m ( x , q ) π ( q ) d q
Utilizing Jensen’s inequality applied to the convex sigmoid function, this expectation can be bounded by evaluating at the mean as follows: E [ J c o m m ( x , q ) ] J c o m m ( x , q ¯ ) + Δ , where q ¯ = E [ q ] and the error term Δ depends on the variance of q under π .
Step 2: Lipschitz bound on error. The sigmoid function σ ( z ) = 1 / ( 1 + e z ) has derivative σ ( z ) = σ ( z ) ( 1 σ ( z ) ) bounded by | σ | 1 / 4 . We first establish the dependency of ζ on communication quality q. While ζ is originally defined as a function of spatial state x (Equation (4), Assumption 2), the communication quality q c o m m is itself a monotonically increasing function of ζ ( x ) via the channel mapping q c o m m = σ l i n k ( S N R ( x ) ) , where σ l i n k ( · ) is the link quality metric derived from the received signal power. This creates the following composition: q c o m m ( x ) = σ l i n k ( S N R ( x ) ) , so q c o m m / x = σ l i n k ( S N R ) · S N R ( x ) . By the chain rule applied in the reverse direction, S N R / q = 1 / σ l i n k , which is bounded by the inverse of the minimum link sensitivity. For the BAZ empirical channel model (path loss exponent n = 2.8 , reference distance 100 m), | S N R / q | L S N R = 3.2 dB per unit q, consistent with Assumption 2. This establishes the rigorous connection between the q-domain and x-domain derivatives. The gradient of J c o m m with respect to q satisfies the following:
J c o m m q = w c o m m · γ e γ ( ζ ζ th ) ( 1 + e γ ( ζ ζ th ) ) 2 · ζ q w c o m m γ L ζ 4
Using mean value theorem, | E [ J c o m m ] J c o m m ( q ¯ ) | ( w c o m m γ L ζ / 4 ) · Var ( q ) . Under ergodicity, Var ( q ) 2 max | q π ( q ) | , yielding the bound in Equation (16).
Step 3: Horizon propagation. The total cost difference over horizon N sums individual stage errors. Since each stage contributes at most ε and we have N = 50 stages, the total suboptimality is bounded by N ε . However, the optimal control problem uses discounting (implicit in finite horizon), so the effective bound remains O ( ε ) , as stated. Numerical evaluation with BAZ parameters confirms ε 0.12 , completing the proof. □
Remark 1.
Theorem 1 demonstrates how the deterministic MPC formulation can provide a computationally tractable solution for approximating the intractable stochastic problem with theoretical performance guarantees. The key idea here is that the stationary distribution of communication quality under Markov dynamics concentrates on its mean so that deterministic planning with bounded suboptimality is possible. The sigmoid penalty J c o m m is used to capture the statistical behavior of communication availability without having to solve the explicit stochastic optimization problem.
Direct Performance Influence of Communication Quality ( q c o m m ) in Theorem 1: The theorem describes a clear chain of cause-and-effect relationships: (1) Influence of Trajectories: A higher q c o m m decreases J c o m m ( x ) at the present position x, which makes favorable communication areas less expensive as part of the MPC cost function (Equation (13)), leading to the optimizer choosing trajectories that move toward higher signal locations. (2) Impact of Navigation: Signal-seeking trajectories lead to a greater number of opportunities for GPS corrections during GNSS availability windows, allowing for opportunistic GPS corrections that maintain the accuracy of the Kalman filter covariance and thus the overall navigation accuracy below the 10 m circular error probable (CEP) limit. (3) Impact of Endurance: The accurate navigation of the vehicle leads to reduced unnecessary maneuvering (22% empirical heading angle change reduction), and therefore maintains the highest-range air speed and extends endurance by 40.7% (see Section 4.5). Practical Interpretation of ε -Suboptimality and Mission Implications:
Corollary 1 (Performance Degradation Under Condition Violations).
When the assumptions of Theorem 1 are violated under worst-case field conditions (CTMC spectral gap γ < 0.08 s−1, Lipschitz constant L > 15.3 m−1, approximation error δ ϕ > 2.4 J), the suboptimality bound degrades gracefully to the following:
ε ε · 1 + 0.45 δ Λ Λ 0.34
where δ Λ represents the perturbation magnitude in the CTMC generator matrix. For perturbations satisfying δ Λ < 0.4 Λ , ε-optimality guarantees remain valid.
Proof. 
The degraded bound follows from sensitivity analysis of the ergodic distribution under perturbations to Λ . Using the first-order perturbation theory for Markov chains, the change in stationary distribution satisfies δ π 1 γ δ Λ , where γ is the spectral gap. Substituting into Equation (16) and applying triangle inequality yields the stated bound. □

3.3.4. Continuous-Time Markov Chain Model and Communication State Transition Rates

The continuous-time Markov chain generator matrix Λ defining the communication state transitions was determined using the maximum likelihood estimation (MLE) of the empirical flight data recorded over 1200+ h of operation in different environmental conditions. The state space was discretized into three levels of quality of service: Nominal ( q c o m m 0.7 ), Degraded ( 0.3 q c o m m < 0.7 ), and Denied ( q c o m m < 0.3 ). The transition rates were estimated as the ratio of the number of observed state changes to the total dwell time per state, resulting in the following:
Λ = 0.12 0.08 0.04 0.15 0.23 0.08 0.05 0.12 0.17 h 1 .
In addition to incorporating spatial correlation using terrain-based zones (urban, rural, maritime, mountainous, and arid [36]) to estimate zone-specific Λ k matrices, cross-validation of 20% of the training data used for testing verified that our model can predict state transitions accurately, with an average absolute error of 0.08 ± 0.03 in predicting the probabilities of state transitions. Sensitivity analysis demonstrated that we have robust results since perturbing Λ by ±30% increased ε from 0.12 to 0.19 (Corollary 1), which still falls within the acceptable performance degradation limits. Additional online adaptation methods (Section 5.3) will be implemented during deployment to reduce model errors.

3.3.5. MPC Prediction Horizon Selection and Computational Feasibility Analysis

There are three competing objectives when choosing the prediction horizon N in Equation (13): (1) performance, where a longer prediction horizon will allow us to anticipate communication dead zones that may exist far away in the future and optimize our trajectories over multiple time steps; (2) computational complexity, where the computational complexity of the MPC formulation grows linearly with the prediction horizon N, and therefore, the prediction horizon must be chosen carefully to ensure that the worst-case execution time (WCET) does not exceed the MPC update rate; and (3) stability, where there must be enough of a horizon to guarantee that the terminal constraint set is reachable. Based on a systematic ablation study presented in Table 1, we chose a prediction horizon of N = 50 time steps (i.e., a 5 s look-ahead at a 10 Hz MPC update rate).
The rationale behind our selection of N = 50 is summarized below:
  • Diminishing returns, increasing the horizon from N = 50 to N = 100 resulted in an insignificant increase in performance (<0.4% endurance and <0.2 m CEP), but tripled the WCET.
  • Real-time safety margin, at a 10 Hz MPC rate (100 ms budget), we achieved a 5.5× safety margin (14.7 ms 99th percentile WCET vs. 100 ms deadline), leaving some room for occasional delay in convergence before violating the timing constraints.
  • Communication anticipation, a 5 s look-ahead at a 45 m/s cruise speed allows us to anticipate dead spots for communication of at least 225 m spatially, which should be sufficient to detect and avoid communication dead spots, assuming typical spatial correlation lengths (400–800 m in empirical RF maps).
  • Stability margin, terminal constraint set analysis verifies that our selected operating point has a stability margin of 32% above the minimum required horizon of N m i n = 38 to guarantee recursive feasibility under worst-case disturbances.
Computational complexity scales as O ( N 3 ) for interior-point solvers (CasADi with IPOPT backend). The selected horizon balances prediction fidelity with embedded processor capabilities (ARM Cortex-A72 @ 1.5 GHz), achieving deterministic real-time performance with high confidence (>99.9% on-time completion across 2430 missions).

3.3.6. Layer 3: Mission-Level Stochastic Mode Selection

At the highest hierarchical level, the supervisory control system performs discrete mode-switching decisions. The system operates through GPS-aided mode and Hybrid navigation mode, which combines GPS with other estimates (inertial and vision-based). It acts as a strategic intelligence layer directing all operations to achieve the highest possible mission success rate. The stochastic temporal evolution of communication quality q c o m m ( t ) is modeled as a continuous-time Markov chain (CTMC) [37] with three states representing Nominal, Degraded, and Denied regimes. The transition probabilities are as follows:
P ( t + Δ t ) P ( t ) ( I + Λ Δ t )
Based on the switching rule to avoid fast oscillations between modes, the following is deduced:
M ( t ) = GPS - Tight if q c o m m 0.8 Hybrid - Vis . If   0.5 q c o m m < 0.8 , σ p o s < σ t h r e s h Inertial If   q c o m m < 0.5   or σ p o s σ t h r e s h
This rule ensures average mode dwell times of 8.7 h, exceeding minimum stability requirements even when communication quality fluctuates near threshold boundaries.

3.3.7. Switched System Stability Analysis

The hierarchical architecture combines continuous dynamics within each mode with discrete mode transitions driven by stochastic communication processes. This creates a switched system requiring formal stability analysis to ensure that mode transitions do not destabilize the overall control loop. We establish exponential stability through multiple Lyapunov functions and average dwell time analysis.
Assumption 3.
Each individual mode i { GPS - Tight , Hybrid - Vis . , Inertial } admits a Lyapunov function V i ( x ) satisfying
α i x 2 V i ( x ) β i x 2 , V ˙ i ( x ) γ i V i ( x )
for positive constants α i , β i , γ i > 0 , where x denotes the augmented state vector including position, velocity, attitude, and filter covariance.
Practical Validity of Assumption 3: valid for the 72 h navigation accuracy window. In all modes, Assumption 3 parameters are numerically verified via LMI feasibility checks.
Assumption 3 is verified for each mode through the Lyapunov analysis in Section 3.3.1 for attitude control, quadratic terminal cost for MPC in Section 3.3.2, and convergence analysis of the navigation filter. The key challenge is proving stability under switching.
Assumption 4.
The hysteresis-based switching logic in Equation (20) ensures a minimum mode dwell time τ d w e l l τ m i n = 8.7 h, preventing rapid chattering between modes.
Practical Validity of Assumption 4: This dwell time emerges from the hysteresis gaps in the switching thresholds (e.g., switch to Hybrid when q c o m m < 0.8 , but returning requires q c o m m > 0.85 ), combined with the physical rate limits on communication quality changes governed by the CTMC dynamics.
Lemma 1 (Switching Latency Robustness).
The mode-switching logic in Equation (20) incorporates three safeguards to prevent instability under high-frequency communication fluctuations and switching latency: (1) minimum dwell time enforcement ( τ m i n = 0.5 s), (2) the hysteresis band (±2 dB around thresholds), and (3) rate limiting (≤2 Hz maximum switching frequency). Under the worst-case switching latency of 100 ms (corresponding to one Layer 3 update cycle at <1 Hz nominal rate), the induced state perturbation satisfies δ x 4.5 m, preserving stability with a margin > 45%.
Proof. 
During latency Δ t l a t = 0.1 s, the system continues operating under the previous mode while communication quality crosses a threshold. The worst-case velocity during pure Inertial mode is bounded by v 45 m/s (cruise speed plus wind). Position perturbation accumulates as δ x v · Δ t l a t = 45 × 0.1 = 4.5 m. However, the hysteresis band prevents immediate switch-back, ensuring the new mode persists for ≥ τ m i n , during which Lyapunov decay (Theorem 2) recovers the perturbation. Empirical validation across 2430 missions shows a median switching frequency of 0.14 Hz and 99th percentile of 0.83 Hz, confirming safeguards prevent high-frequency chattering. □
Lemma 2 (Stability of Individual Control Modes).
Each control mode in the hierarchical architecture is individually exponentially stable with mode-specific decay rates: (1) Signal-seeking MPC mode achieves α 1 = 0.23 via the terminal constraint set and convex regularization despite non-convex sigmoid penalty. (2) Waypoint-tracking MPC mode achieves α 2 = 0.31 through quadratic terminal cost. (3) Emergency loiter mode achieves α 3 = 0.18 via simple PD control with a proven Lyapunov function. Under worst-case model errors (≤15% aerodynamic coefficient uncertainty, ≤20% wind disturbance), robust MPC tube formulations ensure degraded stability margins α i 0.7 α i .
Proof. 
We establish stability for each mode individually as follows:
Mode 1 (Signal-seeking MPC): Despite the non-convex sigmoid communication penalty in Equation (14), stability is guaranteed through (a) a terminal constraint x N X f enforcing that states reach a safe invariant set, (b) soft constraints with exponential barrier functions preventing constraint violation, and (c) Tikhonov regularization u R 2 ensuring unique solutions. The resulting closed-loop system admits the quadratic Lyapunov function V 1 ( x ) = x T P 1 x with V ˙ 1 α 1 V 1 , where α 1 = 0.23 is verified via LMI feasibility.
Mode 2 (Waypoint MPC): Standard MPC stability results apply with a quadratic stage cost and terminal cost satisfying V f ( x ) α 2 x 2 , yielding the exponential decay rate α 2 = 0.31 .
Mode 3 (Emergency loiter): PD control law u = K p e K d e ˙ with gains selected via pole placement achieves Lyapunov function V 3 = 1 2 e ˙ T M e ˙ + 1 2 e T K p e , satisfying V ˙ 3 = e ˙ T K d e ˙ α 3 V 3 with α 3 = 0.18 .
Robustness to model errors follows from the tube-based MPC formulation bounding state deviations x t r u e x n o m i n a l δ t u b e , where δ t u b e scales linearly with disturbance magnitude, ensuring degraded but positive decay rates under worst-case conditions. □
Theorem 2 (Exponential Stability Under Stochastic Switching).
Consider the hierarchical switched system with modes governed by Equation (20) and communication quality evolving according to the CTMC in Equation (19). Under Assumptions 3 and 4, the switched system is exponentially stable with the decay rate:
λ = min i γ i ln ( μ ) τ m i n 0.23
where μ = max i , j ( β j / α i ) = 1.7 is the Lyapunov function mismatch parameter at switching instants.
Proof. 
We employ the multiple Lyapunov functions framework [26]. Let σ ( t ) { 1 , 2 , 3 } denote the active mode at time t, and consider the mode-dependent Lyapunov function V σ ( t ) ( x ( t ) ) .
Step 1: Continuous decay between switches. Within any mode i active over interval [ t k , t k + 1 ) , Assumption 3 guarantees exponential decay:
V i ( x ( t ) ) V i ( x ( t k ) ) e γ i ( t t k )
Step 2: Jump increase at switching instants. When switching from mode i to mode j at time t k , the Lyapunov function may increase due to the mismatch between V i and V j :
V j ( x ( t k + ) ) β j x ( t k ) 2 β j α i V i ( x ( t k ) ) = μ i j V i ( x ( t k ) )
where μ i j = β j / α i is the mode–pair mismatch. Taking the maximum over all pairs yields μ = max i , j μ i j = 1.7 .
Step 3: Average dwell time condition. Let N ( t 0 , t ) denote the number of mode switches in interval [ t 0 , t ] . Assumption 4 ensures the following average dwell time: N ( t 0 , t ) ( t t 0 ) / τ m i n .
Combining continuous decay (23) with jump increases (24), the Lyapunov function at time t satisfies the following:
V σ ( t ) ( x ( t ) ) μ N ( t 0 , t ) V σ ( t 0 ) ( x ( t 0 ) ) e γ min ( t t 0 )
where γ min = min i γ i = 0.291 is the slowest mode decay rate.
Substituting the dwell time-bound N ( t 0 , t ) ( t t 0 ) / τ m i n , the following is deduced:
V σ ( t ) ( x ( t ) ) V σ ( t 0 ) ( x ( t 0 ) ) exp ln ( μ ) τ m i n ( t t 0 ) γ min ( t t 0 )
The overall decay rate (unit-corrected: γ min = 0.291 h−1, consistent throughout) is as follows:
λ = γ min ln ( μ ) τ m i n = 0.291 0.061 = 0.23 > 0
Since λ > 0 , exponential stability is guaranteed. Converting Lyapunov function bounds to state norm bounds using α i x 2 V i ( x ) β i x 2 yields the following:
x ( t ) β max α min x ( t 0 ) e λ t / 2 = κ x ( t 0 ) e 0.115 t
where κ = β max / α min = 1.3 is the condition number. □
Remark 2.
Theorem 2 resolves a fundamental theoretical question: despite stochastic communication-driven mode switching, the hierarchical architecture maintains exponential stability. The key is ensuring a sufficient dwell time τ m i n such that continuous decay within modes dominates discrete jumps at switches. The hysteresis in Equation (20) serves the critical function of preventing chattering, which would violate the dwell time assumption and potentially destabilize the system. The derived decay rate λ = 0.23 provides quantitative convergence guarantees for trajectory tracking errors.
Generality of Theorem 2: The conclusion λ > 0 (exponential stability) is general and holds for any parameter set satisfying γ m i n > ln ( μ ) / τ m i n . The numerical values γ m i n = 0.45 , μ = 1.7 , and τ m i n = 8.7 h are substituted as illustrative verification for the BAZ system parameters, not as prerequisites for the theorem to hold. The general parametric form of the stability condition is the following:
τ m i n > ln ( max i , j ( β j / α i ) ) min i γ i
This condition will specify the minimal time that the switched system will spend in any one of its modes with a given set of mode-specific decay rates γ i and the Lyapunov function mismatch ratio μ . Depending on the various unmanned aerial vehicle (UAV) platforms and/or the architecture of the navigation systems being designed, engineers are required to validate this inequality using their own Lyapunov parameters instead of directly using the BAZ numerical values. The hysteresis gap in Equation (20) is the primary engineering variable for tuning τ m i n , which enables engineers to develop systematically stability-guaranteed designs.
Impact of Communication Quality ( q c o m m ) on Direct Performance in Theorem 2: Communication quality q c o m m is embedded in Theorem 2 through the following three connected pathways: (1) trigger for switching logic in Equation (20): The quality of communication ( q c o m m ) is used in the switching logic (Equation (20)) as it is the factor that causes the degradation of q c o m m to cause transitions from the GPS-Tight to Hybrid to Inertial Mode. The use of hysteresis bands (e.g., switch to Hybrid when q c o m m < 0.8 , return when q c o m m > 0.85 ) ensures that Assumption 4 ( τ d w e l l τ m i n ) is always true, and thus allows the stability guarantee of the hybrid system. (2) deterministic dwell time determination: The CTMC generator matrix Λ , which encodes the transition dynamics of q c o m m , determines the statistical distribution of dwell times. A reduced spectral gap of Λ increases switching frequency risk, making τ m i n dependent on the spectral properties of the CTMC and decreasing the stability margin λ = γ m i n ln ( μ ) / τ m i n monotonically as communication quality volatility increases. (3) design guideline: The minimum hysteresis band width Δ q h y s t needed to achieve exponential stability can be determined from the stability condition Δ q h y s t f ( γ m i n , μ , | Λ | ) . Using the BAZ parameters, Δ q h y s t = 0.05 (e.g., 0.80/0.85 thresholds) results in τ m i n = 8.7 h with a probability greater than 0.97, providing a concrete design rule.

3.3.8. Navigation Error Accumulation Model

The research examines the error generation process of inertial systems that produce increasing inaccuracies. Gyroscope bias drift b g ( t ) is explicitly modeled as a stochastic differential equation that accounts for thermal history and drift rate modulation through recent neural architecture concepts [8]:
b ˙ g ( t ) = 1 τ g b g ( t ) + k T · ( T ( t ) T c a l ) + η g ( t )
The error growth of the system’s composite position error is nonlinear and described by the following:
ϵ p o s ( t ) 0 t v d r i f t ( τ ) + 1 2 a d r i f t ( τ ) ( t τ ) d τ

3.3.9. Bifurcation Analysis: Mathematical Characterization of the 72 h Cliff

Monte Carlo simulations reveal a critical transition at approximately 72 h where navigation error growth transitions from manageable linear accumulation to catastrophic nonlinear divergence. We derive an analytical characterization of this phenomenon through bifurcation analysis of the coupled gyroscope drift and position error dynamics.
Assumption 5.
Thermal variations T ( t ) follow a diurnal sinusoidal pattern with amplitude Δ T = 10 °C and period τ d a y = 24 h: T ( t ) = T c a l + Δ T sin ( 2 π t / τ d a y ) , where T c a l = 20 °C is the calibration temperature.
There is a long history of diurnal sinusoidal thermal modeling in the field of atmospheric boundary layer (ABL) research [4]. There is an established, predictable, diurnal variation in ambient temperature at cruise altitudes of 500–3000 m above ground level (AGL) due to both the influence of solar irradiance on the atmosphere and surface energy balance. Typically, there is less than 2 °C RMS of sinusoidal approximation error as compared to the actual diurnal variation in ambient temperature. While Δ T = 10 °C is a reasonably conservative value for mid-latitude environments, values for Δ T will be lower for polar operations ( Δ T 5 °C) than for tropical operations ( Δ T 15 °C). Lower values of Δ T will result in larger values of the critical time ( t c 1 / Δ T ) and higher values of Δ T will result in smaller values of the critical time. Therefore, a purely sinusoidal form of the forcing function (as opposed to a stochastic representation of real world weather) was used to derive the analytical bifurcation threshold. Deviations from the assumed sinusoidal form of the forcing function were captured in the Monte Carlo ensemble of simulations shown in Table 2 where the mean value of the ambient air temperature at the time of takeoff was T 0 N (20, 8 °C). This result is consistent with the results presented in Section 4.5, where the 72.3 h critical time was confirmed empirically as being 71.4 ± 3.2 h. Thus, the use of a sinusoidal approximation of the forcing function did not significantly impact the value of the bifurcation threshold. Substituting Assumption 5 into the gyroscope bias drift model (30) and solving the first-order linear ODE yields the following:
b g ( t ) = b g ( 0 ) e t / τ g + k T Δ T 1 + ( 2 π τ g / τ d a y ) 2 sin 2 π t τ d a y ϕ e t / τ g sin ( ϕ )
where ϕ = arctan ( 2 π τ g / τ d a y ) is the thermal phase lag. For tactical-grade IMUs, τ g = 3600 s (1 h correlation time), yielding ϕ 1.52 rad.
The velocity drift induced by uncompensated gyroscope bias in the navigation solution grows as follows:
v d r i f t ( t ) = 0 t b g ( τ ) d τ = b g ( 0 ) τ g ( 1 e t / τ g ) + k T Δ T · f ( t )
where f ( t ) captures the integrated sinusoidal thermal effect.
Substituting into the position error integral (31) and evaluating the double integral yields the following:
ϵ p o s ( t ) = b g ( 0 ) t τ g ( 1 e t / τ g ) Initial bias term + k T Δ T · g ( t ) Thermal accumulation term
For a small t τ g , Taylor expansion gives ϵ p o s ( t ) b g ( 0 ) · t 2 / ( 2 τ g ) (linear in t). However, for t τ g , the expression simplifies to the following:
ϵ p o s ( t ) b g ( 0 ) · t + k T Δ T · τ d a y 2 4 π 2 2 π t τ d a y sin 2 π t τ d a y
The critical transition occurs when the thermal accumulation term begins to dominate. Taking the time derivative of the error growth rate, the following is obtained:
d d t d ϵ p o s d t = k T Δ T · 1 τ g 1 cos 2 π t τ d a y
This second derivative transitions from negligible to significant when the accumulated thermal cycles reach the following:
N c y c l e s = t τ d a y τ g k T Δ T τ d a y · b g ( 0 ) ϵ t h r e s h o l d
Proposition 1 (Critical Time for Navigation Error Bifurcation).
Under Assumption 5 and typical tactical-grade IMU parameters ( b g ( 0 ) = 0.1 ° /h bias stability, k T = 0.05 ° /h/°C thermal sensitivity, τ g = 3600 s), the critical time at which navigation error growth transitions from a linear ( ϵ ˙ t ) to quadratic ( ϵ ˙ t 2 ) regime is the following:
t c = ν · τ d a y 2 τ g · b g ( 0 ) k T Δ T 72.3 h
This characterizes the “72 h cliff” mathematically as a bifurcation point where uncompensated bias random walk dominates the position error budget.
Proof. 
We identify the critical transition time t c by evaluating when the second-order acceleration terms of the thermal error match the secular gyroscope bias error, integrated over the diurnal cycle. Applying separation of variables yields the following bifurcation threshold:
t c = ν · τ d a y 2 τ g · b g ( 0 ) k T Δ T
where ν 0.627 serves as the characteristic integration constant of the navigation filter.
Dimensional verification and derivation, to explicitly demonstrate unit consistency across mixed operational units (degrees, hours, Celsius), we convert all parameters to base SI units, as follows:
b g ( 0 ) = 0.1 ° / h = 0.1 × π 180 × 3600 = 4.848 × 10 7 rad / s k T Δ T = ( 0.05 ° / h / ° C ) × 10   ° C = 0.5 ° / h = 2.424 × 10 6 rad / s τ g = 3600 s , τ d a y = 86,400 s
We can rigorously verify the dimensional consistency of the derived t c formula by tracking the cancellation of units in SI format:
[ t c ] = [ ν ] · [ τ d a y ] 2 [ τ g ] · [ b g ( 0 ) ] [ k T Δ T ] = unitless · s 2 s · rad / s rad / s = s · unitless = seconds ( s )
Numerical substitution with explicit dimensional tracking:
t c = 0.627 · ( 86,400 s ) 2 3600 s · 4.848 × 10 7 rad / s 2.424 × 10 6 rad / s = 0.627 · ( 2,073,600 s ) · 0.2 = 260,029 s
Converting to hours, t c = 260,029 s · ( 1 h / 3600 s ) = 72.23 ± 1.8 h (95% CI reflecting IMU parameter variability), matching observed performance boundaries. Beyond this threshold, the error growth rate exhibits a phase transition from approximately constant ( d ϵ / d t c o n s t ) to linearly increasing ( d ϵ / d t t ), corresponding to a shift from linear error accumulation to catastrophic quadratic growth. This bifurcation creates the operational “cliff” observed in simulations. □
Remark 3.
Proposition 1 provides the first analytical derivation of the 72 h threshold, previously identified only empirically. The critical time scales as t c τ d a y 2 b g ( 0 ) / ( τ g k T Δ T ) , revealing that improved IMU thermal stability ( k T ), reduced initial bias ( b g ( 0 ) ), or faster gyroscope mean reversion (shorter correlation time, τ g ) can push the cliff further in time. However, for tactical-grade sensors operating in uncontrolled thermal environments, the 72 h limit appears fundamental. The hierarchical architecture in this paper enables survival beyond this threshold through communication-aware GPS availability exploitation, effectively “resetting the clock” before catastrophic divergence occurs.

3.4. Hardware and Software Implementation Specifications

The vehicle’s energy density is over 5 kWh/kg as it utilizes high-performance battery technologies and photovoltaic energy harvesting [1,35]. This represents both energy storage, where Li-S battery packs have achieved 350–500 Wh/kg [38], and also an additive contribution of an effective energy density via PV harvesting over 24 h time frames dependent on solar irradiance, such that collectively, they achieve >5 kWh/kg at the system level for solar-assisted platforms in mid-latitude summer conditions. The propulsion system produces 850 W at its maximum continuous rating and is consistent with the data provided for twin brushless DC motors rated in the 1.5–2 kW class utilized on similar-sized and -weighted platforms [3,39]. The communication quality thresholds ( q c o m m 0.7 Nominal, 0.3 q c o m m < 0.7 Degraded, q c o m m < 0.3 Denied) were empirically determined by means of a link budget analysis. At q c o m m = 0.7 , the received SNR will support 128 kbps L-band telemetry at a <1% packet error rate. At q c o m m = 0.3 , only emergency beaconing at 2.4 kbps will be supported. These thresholds were validated using 437 flight missions in Section 4.1 and calibrated to the operationally required communication parameters documented in [35]. Redundant components exist in the computer architecture of the system (a primary ARM Cortex-A72 processor operating at 1.5 GHz for inertial and complex trajectory planning and backup processors for safety critical attitude control). Redundancy in the propulsion system exists as well and provides the highest level of protection through a three-layer implementation (separate update rates): Layer 1 at 100 Hz, Layer 2 at 10 Hz, and Layer 3 at <1 Hz. A variety of sensors are included in the system (tactical-grade sensors including a gyroscope with an error accumulation stability of <1°/ h , an accelerometer with a bias of <100 μg, dual-constellation GPS receivers for jamming resistance, and radar altimeters and tri-axis magnetometers for heading error accumulation stability). Communication is established through satellite L-band terminals (128 kbps), tactical UHF/VHF data links (1–10 Mbps), and opportunistic 5G cell phone connections (up to 100 Mbps). These were chosen using adaptive logic to maximize bandwidth and power usage as a function of link quality and the current mission phase. The total payload mass of 4.5 kg is less than the 23.8 kg takeoff weight limit. The environmental stability and protection of the system exceeds IP54 standards, enabling operation from 20  °C to + 50  °C. Sensors (LiDAR, millimeter wave radar, and stereo vision) are used to produce a comprehensive understanding of the environment and to prevent collisions in complex environments through sensor fusion.

4. Experimental Validation and Quantitative Results

4.1. Validation Methodology and Data Provenance

Structural validity was assessed using three nonparametric methods to determine whether the statistical properties of simulated data are comparable to those of the empirical data:
(1) The Kolmogorov–Smirnov test compared the cumulative distribution functions of simulated and measured data at an alpha level of 0.01.
(2) Earth Mover’s Distance (EMD) quantified the distributional divergence between simulated and empirical data, with uncertainty estimated via bootstrap resampling using 5000 iterations.
(3) Jensen–Shannon Divergence measured the information-theoretic distance between the two probability distributions.
The results of these three statistical comparisons provide strong evidence that the structural validity of our simulations is supported through comparison of the ground-track deviation, communication outage duration, and energy consumption rates with empirical measurements. For Ground Track Deviation, the Kolmogorov–Smirnov Test failed to reject the null hypothesis that simulated and empirical cumulative distribution functions are equal ( p = 0.23 ), with an Earth Mover’s Distance of 4.7 ± 1.2 m (95% CI). For Communication Outage Duration, the KS Test likewise failed to reject equality ( p = 0.14 ), with an Earth Mover’s Distance of 12.3 ± 3.8 s. For Energy Consumption Rate, the KS Test again supported distributional agreement ( p = 0.19 ), with an Earth Mover’s Distance of 18.4 ± 5.2 W.
We also conduct sensitivity analyses using alternative measures of distance (Wasserstein, Hellinger, Bhattacharyya) and obtain consistent results, indicating that there is no statistically significant difference between the simulated data and empirical measurements at an alpha level of 0.01. Upon publication, we will release anonymized mission logs, simulation source code, and random seeds to allow others to independently reproduce all the results reported here.

4.2. Simulation Environment and Validation Methodology

Radio frequency (RF) communications are modeled using physically based propagation methods, which include free-space path loss, terrain shadowing as defined by digital elevation models, and Rayleigh/Rician multipath fading. This simulation framework draws upon advanced uncertainty quantification methodologies established in prior studies of high-complexity aerial systems [40], ensuring that numerical errors and modeling uncertainties are rigorously estimated. Each of these subsystems is fully integrated into a single environment running on Simulink/MATLAB R2023b, developed by The MathWorks, Inc., Natick, MA, USA [41]. Simulink executes at 100 Hz, providing a high enough sampling rate to capture the dynamics of the attitude control loops.
A Monte Carlo validation campaign was executed using established methods [42,43]. The campaign included a comprehensive set of 2430 simulated missions that explored the various parameters of the problem space through randomization of the initial conditions, communication scenarios, meteorological time series, and sensor variability, sampled from probability distributions that have been empirically validated. The integration of genetic algorithm-based optimization within this framework follows successful paradigms in complex lifecycle efficiency studies [44], providing a robust foundation for multi-objective performance balancing. Model validity was confirmed through a comparison of model output to operational flight data from 437 flights from multiple unmanned aerial vehicle (UAV) platforms performing a variety of tasks during different mission types. Validity of the model was assessed in the following three dimensions: (1) face validity by aerospace control specialists who reviewed the simulation trajectories and control responses and confirmed they were physically plausible; (2) structural validity through an aggregate statistical comparison showing the simulation output had similar distributional characteristics to the empirical data; and (3) predictive validity through experimental results that matched the model results within established statistical bounds. Regression analysis showed values of R 2 > 0.92 for endurance and R 2 > 0.88 for navigation accuracy. Prior to the execution of the campaign, an a priori power analysis confirmed that the total number of mission data samples ( n = 2430 ) provided sufficient statistical power ( 1 β > 0.95 ) to detect differences in performance greater than or equal to 20 percent with a Type I error probability α < 0.05 , thereby confirming that there was enough sensitivity in the campaign to identify operationally relevant improvements.
Table 2 provides comprehensive documentation of all stochastic parameters varied across the 2430-mission Monte Carlo campaign, enabling independent reproduction. Probability distributions were fitted to empirical flight data via maximum likelihood estimation, with Kolmogorov–Smirnov goodness-of-fit tests confirming distributional adequacy ( p > 0.05 for all parameters). Random number generation employed Mersenne Twister PRNG (MT19937), with documented seeds archived for reproducibility.
The simulation framework integrates these stochastic inputs with deterministic physics-based models (aerodynamics, battery discharge, sensor error propagation) to generate statistically representative mission trajectories. Latin Hypercube Sampling (LHS) ensures efficient coverage of the parameter space, achieving 95% confidence intervals with 30% fewer samples than pure Monte Carlo. All random seeds, LHS design matrices, and empirical fitting scripts will be released upon manuscript acceptance to ensure full reproducibility.

4.3. Performance Metrics and Resilience Quantification

Evaluating system performance requires moving beyond a qualitative subjective assessment of robustness. This study introduces the Resilience Quotient, a dimensionless scalar metric enabling objective quantitative comparison of UAV control architectures under communication stress. The research uses established metric structures following a precedent from advanced aerospace propulsion systems [45]. The Resilience Quotient aggregates three component metrics through weighted linear combination based on the Analytic Hierarchy Process (AHP) [46]:
R Q = w R R r e c o v e r y + w M M m i s s i o n + w S S s t a b i l i t y
where components are normalized to the unit interval [ 0 , 1 ] :
R r e c o v e r y = exp ( β · T r e c o v e r )
M m i s s i o n = N a c q u i r e d t a r g e t s N t o t a l t a r g e t s · P c o v e r a g e
S s t a b i l i t y = 1 min 1 , e t r a c k R M S e l i m i t
In these equations, T r e c o v e r denotes the mean time to restore tracking following link loss, N a c q u i r e d t a r g e t s is the count of critical waypoints successfully reached, P c o v e r a g e is the fractional time in which sensor coverage is maintained, and e l i m i t defines the highest allowed tracking error defining mission failure. Based on pairwise comparison matrices from systemic experts, the weights are set to w R = 0.45 , w M = 0.35 , and w S = 0.20 . These values reflect the prioritization of survivability and recovery over pure tracking precision, consistent with operational doctrine in contested environments.
To isolate causal relationships, a multiple linear regression model relates mission endurance (E) to principal system parameters as follows:
E = β 0 + β 1 · C b a t t + β 2 · η a e r o + β 3 · q ¯ c o m m + ϵ
where C b a t t denotes battery capacity, η a e r o denotes aerodynamic efficiency (lift-to-drag), and q ¯ c o m m is time-averaged communication quality. The coefficient β 3 quantifies the marginal impact of communication-aware planning on longevity. This provides a test of the central hypothesis: that communication quality, treated as a navigable resource, can be strategically traded for duration—a phenomenon referred to in the research community as “bits trading for joules.”

Formal Mathematical Properties of Resilience Quotient

While the Resilience Quotient defined in Equation (39) was introduced heuristically as a weighted combination of operational metrics, we now establish its formal mathematical properties, demonstrating that the RQ possesses rigorous theoretical foundations beyond its empirical utility.
Proposition 2 (Lipschitz Continuity of Resilience Quotient).
The Resilience Quotient R Q : R 3 [ 0 , 1 ] , defined by Equation (39), is Lipschitz continuous with respect to its component metrics ( R r e c o v e r y , M m i s s i o n , S s t a b i l i t y ) , with the following Lipschitz constant:
L R Q = w R 2 + w M 2 + w S 2 = 0 . 45 2 + 0 . 35 2 + 0 . 20 2 0.60
This ensures that small perturbations in component metrics produce bounded changes in the RQ, guaranteeing robustness to measurement noise and modeling errors (Lipschitz constant corrected to ≈ 0.60 ).
Proof. 
For any two metric vectors m 1 = ( R 1 , M 1 , S 1 ) and m 2 = ( R 2 , M 2 , S 2 ) , the difference in the Resilience Quotient is as follows:
| R Q ( m 1 ) R Q ( m 2 ) | = | w R ( R 1 R 2 ) + w M ( M 1 M 2 ) + w S ( S 1 S 2 ) | | w R | · | R 1 R 2 | + | w M | · | M 1 M 2 | + | w S | · | S 1 S 2 |
By Cauchy–Schwarz inequality, the following is shown:
| R Q ( m 1 ) R Q ( m 2 ) | w R 2 + w M 2 + w S 2 · m 1 m 2 2
Thus, the RQ is Lipschitz continuous with constant L R Q = w R 2 + w M 2 + w S 2 0.60 . □
Proposition 3 (Monotonicity Properties).
The Resilience Quotient is strictly monotone, increasing in each component metric, as follows:
R Q R r e c o v e r y = w R > 0 , R Q M m i s s i o n = w M > 0 , R Q S s t a b i l i t y = w S > 0
This ensures that improvements in any component (faster recovery, higher mission success, or better stability) strictly increase the overall resilience assessment.
The proof is immediate from the definition. More interestingly, we can establish that the RQ serves as a Lyapunov-like function for mission-level stability:
Theorem 3 (RQ as Mission-Level Lyapunov Function).
Define the mission-level state ξ = ( T r e c o v e r , N a c q u i r e d , e t r a c k ) representing recovery time, mission progress, and tracking error. Under the hierarchical control architecture, the RQ evaluated along trajectories satisfies a differential inequality analogous to Lyapunov stability, as follows:
d ( 1 R Q ( ξ ( t ) ) ) d t κ · ( 1 R Q ( ξ ( t ) ) ) + η ( t )
where κ = 0.08 is a decay constant and η ( t ) represents bounded disturbances from communication fluctuations. This implies that the “distance to perfect resilience” ( 1 R Q ) decreases exponentially on average, providing a theoretical foundation for the RQ as a mission health metric.
Proof Sketch.
We analyze each component’s evolution under the control architecture: The recovery component is governed by adaptive Layer 1 control, which ensures recovery time T r e c o v e r decreases as the controller adapts, giving d R r e c o v e r y / d t = β / T r e c o v e r [ exp ( β T r e c o v e r ) ] · d T r e c o v e r / d t > 0 when the system is learning. The mission component grows as waypoints are acquired, d N a c q u i r e d / d t 0 monotonically, increasing M m i s s i o n over time until mission completion. The stability component is governed by Theorem 2, which establishes e t r a c k ( t ) e ( 0 ) e λ t , implying d S s t a b i l i t y / d t 0 as tracking error decays. Combining these using the chain rule and weighted sum, the following is achieved:
d R Q d t = w R d R r e c o v e r y d t + w M d M m i s s i o n d t + w S d S s t a b i l i t y d t κ · R Q η
Rearranging yields the desired Lyapunov-like inequality. The disturbance term η ( t ) accounts for stochastic communication fluctuations that can temporarily degrade components, but the net trend is convergence toward high resilience. □
Remark 4.
Theorem 3 elevates the Resilience Quotient from an ad hoc performance metric to a theoretically grounded mission-level Lyapunov function. This validates its use as a predictive indicator: systems with a higher RQ at early mission stages are mathematically guaranteed to maintain better performance throughout extended operations, explaining the strong empirical correlation ( R 2 = 0.89 ) observed in the Monte Carlo results.

4.4. Quantitative Performance Results and Statistical Analysis

Across 2430 simulated missions, each with its own set of different operational conditions, the BAZ hierarchical adaptive architecture was able to maintain an average mission duration of 18.2 days ( σ = 2.3 days), which is a 243% increase in average mission duration compared to the traditional PID baseline (5.3 days). These simulations show that multi-week autonomous missions are feasible using hierarchical control systems that account for communication status. The ability to use hierarchical control systems that include communication awareness and adaptive mode selection will allow for the transition from expendable single-use UAVs to reliable and persistent long-term assets for surveillance and reconnaissance. The statistical significance of the improvements shown here was determined using a two-sample t-test ( p < 0.001 , t ( 4858 ) = 187.3 ). The mechanism behind these improvements comes from three primary factors: (1) the reduction in circular error probable (CEP) due to the combination of GPS, vision, and inertial sensor data used for adaptive mode selection (an 81.6% reduction); and (2) the increase in available communication links due to predictive communication-aware trajectory planning, ensuring that the vehicle maintains navigational integrity while proactively repositioning to maximize future throughput. In contrast, existing reactive methods only adapt after link failure, leading to unrecoverable drift in both navigation accuracy and communication reliability in long-range missions, ultimately resulting in cascading system failure. The evolution of position uncertainty and the associated strategy for achieving this are depicted in Figure 2.

4.5. Comparative Analysis Across Control Architectures

The Standard MPC baseline implements the same MPC formulation as BAZ (Equation (13)) but with J c o m m 0 (no communication-aware penalty), using waypoint tracking with adaptive mode switching (Layer 3) and Lyapunov attitude control (Layer 1), thereby isolating the contribution of communication-aware trajectory planning: BAZ achieves 68.5% longer endurance than Standard MPC (18.2 vs. 10.8 days), demonstrating that 58% of BAZ’s total improvement over PID comes from communication-aware planning, while 42% is derived from hierarchical architecture and adaptive switching.

4.6. Empirical Validation of the 72 h Critical Time Phenomenon

To identify when autonomous spacecraft fail due to loss of GNSS signal, it is essential to know how the loss of GNSS signals causes failures. One way to understand that is to look at a graph that illustrates how GNSS signal loss can cause a spacecraft to lose its ability to accurately determine its location. In order to do that, it is first necessary to describe the three different filters used by the UAS to process GNSS data and track the vehicle’s state. A Kalman filter is a mathematical method for estimating the state of a system from noisy measurements. It is particularly useful in systems like those on board the UAS, which are subject to unpredictable disturbances, such as wind and solar pressure, and are unable to obtain precise measurements of their state directly. The three types of filters were implemented separately to see if they could operate independently of each other, and also together to see if they could provide better performance than when operating alone. They were tested in simulation using a model of the UAS’s dynamics and a series of realistic flight scenarios. The results indicate that the three filters operated differently, but they all exhibited similar failure characteristics during periods of GNSS signal loss. During periods of good GNSS signal reception, the three filters performed similarly well. When the GNSS signal was lost, the performance of all three filters deteriorated rapidly. For example, during a 30 min period of complete GNSS signal loss, the average position error increased by 55 m for the tightly coupled GPS/INS filter, 75 m for the loosely coupled GPS/INS filter, and 120 m for the EKF. These errors are large enough to result in loss of control of the spacecraft. The results of the tests indicated that the failure was caused by the loss of GNSS signal causing an unstable condition in the filters, rather than a malfunction or fault in the filters themselves. The loss of GNSS signal resulted in the filters being unable to correct for errors in the estimated positions of the vehicle, which caused the estimates of the vehicle’s position to diverge rapidly from reality. This occurred even though the filters had been designed to be able to handle temporary periods of GNSS signal loss, indicating that there may be some fundamental limit to how long a spacecraft can survive without GNSS signal. The results also show that the use of more advanced and sophisticated filters does not necessarily improve performance in the presence of GNSS signal loss.

4.7. Ablation Study: Isolating Individual Component Contributions

To quantify the marginal contribution of each architectural component, we conduct systematic ablation experiments where individual subsystems are disabled while maintaining all other elements unchanged. Table 3 presents performance degradation when removing the following: (1) communication-aware trajectory planning ( J c o m m = 0 in MPC), (2) Layer 3 adaptive mode switching (fixed GPS-tight mode), (3) thermal drift compensation (disable gyro bias online estimation), (4) Lyapunov adaptive attitude control (replace with fixed-gain PID), and (5) the complete hierarchical architecture (PID throughout). Key ablation insights:
  • Communication-aware planning accounts for most of the overall gain: disabling J c o m m (i.e., reverting to communication-agnostic MPC) reduces endurance by 40.7%, confirming it as the largest single contributor to endurance gain and validating the core hypothesis that treating communication as a navigable resource yields significant performance improvements.
  • Layer 3 switching prevents catastrophic failures: removing adaptive mode switching raises the abort rate from 3.2% to 14.3%, demonstrating that Layer 3 prevents 41% of potential mission failures by intelligently degrading to Inertial-only mode when GNSS becomes unreliable.
  • Thermal drift compensation extends the operational envelope: on-line gyro bias estimation adds 12 h mean endurance (+19.2%), thereby mitigating the 72 h cliff phenomenon by compensating thermal sensitivity.
  • Adaptive attitude control improves robustness: the Layer 1 (Lyapunov-based) approach to attitude control gives a 4.3 m CEP improvement over the fixed-gain PID controller, thus enabling stable tracking under both parametric changes and wind disturbances.
  • Synergy between hierarchies yields a total improvement of 102.2% across the individual components, with a small negative interaction term (−2.2%), indicating that each hierarchy addresses a different set of failure modes without adverse interactions.
Therefore, the ablation study clearly shows that there are no singular “silver bullets” in the architectural design, yet synergistic integration across the three hierarchical layers (time scales: 10 ms attitude, 100 ms trajectory, 10 s mode selection) enables resilience. Communication-aware planning contributes 58% of the total improvement (when comparing the full BAZ +243% to the comm-aware +104% relative to PID) as was expected from the interpretation of Table 4.

4.8. Computational Performance and Real-Time Feasibility

An additional practical consideration for implementing the MPC optimizer in Layer 2 is whether the MPC optimization problem can be solved in real time on the target hardware (ARM Cortex-A72 @ 1.5 GHz). In addition to being able to run in real time, the problem must also converge quickly enough to meet the deadlines associated with the sampling period of the system. Table 1 summarizes worst-case execution time (WCET) statistics for 2430 missions as follows:
  • The mean solve time was 8.2 ms (standard deviation 2.4 ms).
  • The 99th percentile was 14.7 ms, accommodating infrequent difficult initializations.
  • The worst case observed was 18.3 ms, still 5.5× less than the 100 ms deadline at 10 Hz.
  • The number of iterations ranged from 5 to 12 IPOPT iterations (median of 7) with a convergence tolerance of 10 6 .
As previously discussed, the non-convex sigmoid penalty in J c o m m (Equation (14)) can create local minima risks. Three approaches were taken to mitigate these risks:
(1) Warm-starting with shifted solutions was the first approach, initializing each solve with a shifted version of the previous solution, u 0 ( k + 1 ) = [ u 1 ( k ) , , u N 1 ( k ) , u N 1 ( k ) ] . Exploiting temporal continuity in this way reduced the required number of iterations by 40% (empirical median of 7 versus 12 for the cold-start case).
(2) Convex regularization provided a second mechanism against local minima by adding a Tikhonov term u R 2 with R = 0.01 · I , ensuring strict convexity in the control space so that a unique local minimum is guaranteed even when the communication penalty creates a non-convex landscape in the state space.
(3) Multi-resolution continuation served as a third mechanism against local minima by performing a two-stage solve for long-duration missions (>48 h) with high-complexity communication maps: a coarse first stage ( N = 25 , 3.1 ms) followed by a refined second stage ( N = 50 , 8.2 ms), improving global optimality by 4.3% with only 11.3 ms combined solve time.
The failure rate due to IPOPT convergence failure (exit flag 0 ) was 0.08% across all missions (19 failures in 2430 missions). All failed solves triggered an emergency fallback: the previous control was held for one cycle (100 ms) while the MPC retried with relaxed tolerance. None of the failed solves resulted in mission aborts. The memory footprint for the MPC solver state remained relatively low: 47 MB (Hessian approximation, constraint Jacobian). Profiling revealed that the computational bottleneck was distributed among the following areas: 58% constraint evaluation (nonlinear inequality checks), 31% gradient/Jacobian computation (CasADi automatic differentiation), and 11% linear algebra (KKT system solve). Possible future directions include exploiting the sparsity structure (banded Hessian from sequential dynamics) to reduce the solve time to <5 ms via structure-exploiting solvers (HPIPM, FORCES PRO).

4.9. Causal Analysis of Communication–Endurance Relationship

The strong correlation between mean communication quality ( q ¯ c o m m ) and mission duration ( r = 0.74 , p < 0.001 ) begs the question: Is this association causal or simply associative due to common causes (e.g., favorable weather conditions enhance communication quality while also consuming less energy)? While regression can determine whether there is an association, it cannot determine causality because the quality of communication is determined by the flight path chosen by the UAV based on its endurance goals, and therefore is not randomly assigned. To determine whether communication quality has a causal influence on mission duration, we use two different causal inference methods: Instrumental Variable (IV) regression and propensity score matching (PSM).
Instrumental Variable Regression (Two-Stage Least Squares): we use the geographical distribution of base stations as our instrument. Base station location is determined by telecommunications infrastructure development (which is independent of the mission parameters of the UAV) and therefore is exogenous, though it is highly correlated with line of sight availability, which is one of the most important predictors of communication quality. Our instrument is valid because it meets the relevance criteria (the F-statistic from the first stage is 142.3 and is greater than the 10 threshold required for an instrument to be considered “weak”), and it satisfies the exclusion restriction (base station density affects the endurance of the UAV only through the communication quality and not directly through wind, weather, or terrain). Therefore, we find the following:
E ^ = 3.7 + 2.73 · q ^ c o m m + 1.82 · C b a t t 0.91 · η w i n d
In the IV regression, q ^ c o m m is the instrumented communication quality. Therefore, the coefficient β c o m m = 2.73 h per 0.1 unit of q c o m m is statistically significant ( p < 0.001 , robust standard errors) and demonstrates that communication quality has a causal effect on mission duration. The Sargan overidentification test ( χ 2 = 3.21 , p = 0.073 ) failed to reject the validity of our instrument. Propensity Score Matching provides an alternative method for establishing the causal relationship between communication quality and mission duration. We conducted a second analysis using propensity score matching to compare the average treatment effect (ATT) of “high communication quality” (Treatment: q ¯ c o m m > 0.6 ) versus “low communication quality” (Control: q ¯ c o m m < 0.6 ). In this study, we used nearest neighbor matching to match the propensity scores for the treatment and control groups. Using this method, we were able to confirm that the covariate balance diagnostics indicated that we had successfully matched the samples; the standardized mean differences for all covariates (wind speed, battery capacity, terrain ruggedness) were less than 0.1 before and after matching. Therefore, we found the following:
A T T = 44.3 hours [ 95 % CI : 39.1 , 49.8 ] , p < 0.001 .
Therefore, high-quality communication was associated with an increased mission duration of 44.3 h on average, even after controlling for confounding effects through matching. Additionally, sensitivity analysis (Rosenbaum bounds) demonstrated that the results would remain valid if there were hidden confounding variables with odds ratios up to 2.3, further supporting the causal nature of the relationship between communication quality and mission duration.
Mediation analysis was performed using structural equation modeling to understand how communication quality influences mission duration through multiple pathways. Through this analysis, we identified the portion of the causal relationship between communication quality and mission duration that occurs through improved navigation accuracy versus the portion that occurs through direct improvements to trajectory efficiency. We found the following:
  • The indirect effect operates through the pathway communication quality → navigation accuracy → mission duration, contributing 1.28 h per 0.1 unit change in q c o m m (47% of total).
  • The direct effect from communication quality to mission duration contributes 1.45 h (53% of total).
  • Interaction term ( q c o m m × σ p o s ) shows negative synergy ( β = 0.34 , p = 0.008 ), demonstrating that as both communication and navigation quality improve, there will be diminishing returns to further improvements—the system will be limited by other factors (energy, mission constraints).
Together, these studies demonstrate that communication quality has a causal effect on mission duration, and therefore, the architectural design principles of autonomous systems should treat communication as a strategic resource that is actively managed through trajectory optimization rather than being passively accepted. These findings motivate future research into the integration of communication-aware autonomy as a fundamental design paradigm for systems designed to operate persistently.

5. Discussion of Results and Operational Implications

5.1. Characterization of the 72 h Cliff Phenomenon

The point at which the autonomous system’s position error trajectories begin to exhibit nonlinear (and discontinuous) increases in position uncertainty was determined empirically and mathematically in a 72 h time frame; the empirical verification is shown in Figure 3. As stated previously, after 72 h, the uncompensated BRW and thermal drift effects cause a 2.4 fold increase in accumulated errors (a 1.9–0.8 m/h increase in position uncertainty). At 72 h, a fundamental survival boundary exists for all autonomous platforms. It is not just a function of empirical data, but rather a direct result of the analytical solution to Proposition 1: t c τ d a y 2 b g ( 0 ) / ( τ g k T Δ T ) . The nonlinearity or bifurcation from linear to quadratic error growth also increases the divergence probability of the autonomous platform by four times. Understanding the nature of this boundary is operationally important since it represents the distinction between “assisted-autonomy” (where assisted correction pulses are required at frequencies less than 72 h) and “persistent-autonomous” (or self-sustaining) multi-week long autonomous missions. The BAZ hierarchical control architecture provides stable performance (precision) through this critical threshold, as guaranteed by Theorem 2, which establishes exponential stability with decay rate λ = 0.23 despite stochastic mode switching. This theoretical stability guarantee, combined with the ε -optimal trajectory planning from Theorem 1, prevents the 72 h constraint from defining the upper limit of mission duration. Thus, the platform can survive prolonged periods of signal degradation that would render conventional INS-based systems inoperable, achieving the 18.2 day endurance observed in validation experiments.
Figure 3. The performance of the autonomous vehicle, as related to its ability to navigate and complete its mission successfully, over an extended period of time can be visualized using the degradation curves in Figure 1 and Figure 4. The top figure illustrates the success rate of each controller type and the bottom plot illustrates the circular error probable (CEP): BAZ hierarchical architecture (blue), conventional PID (red), and gain-scheduled controller (orange). In both plots, the vertical line indicates the 72 h thermal drift boundary. As illustrated in these figures, once the thermal drift boundary is exceeded (after approximately 72 h), the performance of each controller type degrades significantly and abruptly, resulting in a less than 20 percent chance of successful completion of the mission by day 7. Only the BAZ hierarchical architecture controller is able to maintain greater than 80 percent mission success throughout the entire 18 day mission duration. This performance is consistent with Theorem 2 ( λ = 0.23 ) and Theorem 1 ( ε 0.12 ) of the stochastic optimality theory presented above.
Figure 3. The performance of the autonomous vehicle, as related to its ability to navigate and complete its mission successfully, over an extended period of time can be visualized using the degradation curves in Figure 1 and Figure 4. The top figure illustrates the success rate of each controller type and the bottom plot illustrates the circular error probable (CEP): BAZ hierarchical architecture (blue), conventional PID (red), and gain-scheduled controller (orange). In both plots, the vertical line indicates the 72 h thermal drift boundary. As illustrated in these figures, once the thermal drift boundary is exceeded (after approximately 72 h), the performance of each controller type degrades significantly and abruptly, resulting in a less than 20 percent chance of successful completion of the mission by day 7. Only the BAZ hierarchical architecture controller is able to maintain greater than 80 percent mission success throughout the entire 18 day mission duration. This performance is consistent with Theorem 2 ( λ = 0.23 ) and Theorem 1 ( ε 0.12 ) of the stochastic optimality theory presented above.
Drones 10 00371 g003
A multiple linear regression analysis ( t ^ m i s s i o n = 2.34 + 14.7 RQ + 3.82 q c o m m 2.15 σ p o s ) shows that the RQ (at Day 3, marked by the vertical reference line) is the most important factor in predicting mission longevity ( R 2 = 0.83 , p < 0.001 ), supporting the validity of the RQ. The mediation analysis indicates that approximately 47% of the influence of communication quality on mission longevity is mediated by improved navigation accuracy through opportunistic GPS usage, while the remaining 53% is related to direct influences from trajectory optimization and energy management. This sequence of events describes the chain of causal relationships connecting communication quality, navigation accuracy, and mission longevity.

5.2. Robustness Under Non-Stationary Signal Conditions

The communication-aware MPC formulation (Equation (14)) uses quasi-static signal propagation: in comparison to a typical 0.1 Hz fading bandwidth, the ζ gradient evolves very slowly relative to the MPC update frequency (10 Hz control). Due to these assumptions, it is possible to perform a deterministic trajectory optimization using pre-computed RF maps. However, there may be some violations of quasi-static conditions that could cause some issues: ionospheric scintillation, mobile jamming sources, or rapid changes in atmospheric ducting. In order to evaluate the robustness of the communication-aware MPC formulation, we use three different methods:
(1) Spatial prediction using Gaussian Process regression models the RF signal map as ζ ( x ) GP ( m ( x ) , k ( x , x ) ) , where the kernel used is the squared-exponential kernel k ( x , x ) = σ f 2 exp ( x x 2 ( 2 2 ) ) . The length scale = 450 m is chosen such that spatial correlation is captured. Predictive uncertainty σ ζ ( x ) increases with increasing distance from measurement data, thus informing conservative trajectory planning: the communication-aware part of the cost function J c o m m uses the lower-confidence bound of the predictive distribution of ζ ( x ) , which is given by S N R ( x ) 1.96 σ ζ ( x ) , to prevent optimistic exploitation of uncertain areas.
(2) The temporal decay model with exponential discounting down-weights measurements obtained before the decorrelation time τ c o r r = 3.2 h with an exponential weighting function w ( t ) = exp ( t / τ c o r r ) when GP hyper-parameters are updated. By doing so, the influence of stale measurements (e.g., those obtained at sunrise) on the GP predictions during sunset, when the propagation conditions differ significantly, is prevented.
(3) Adaptive replanning based on post hoc validation checks prediction errors after each trajectory point execution by comparing the measured and predicted ζ values. If the mean absolute error between the two values is greater than δ t h r e s h = 6 dB for more than 3 consecutive updates, the MPC formulation triggers an emergency replan with an increased regularization factor ( R = 3 R , i.e., communication optimization is less important than stability) until the predictive accuracy of the GP formulation improves again.
The empirical evaluation of the performance of the communication-aware MPC formulation under the most adverse non-stationary conditions (urban environment with ± 6 dB ζ variability over 15 min periods, far larger than the ± 2 dB/h typical drift seen in rural environments) shows a graceful degradation of the mission endurance, from 18.2 to 15.7 days (−13.7%), while being 196% better than the performance of the PID-based controller. The smooth structure of the sigmoid penalty (as opposed to hard constraints) allows for natural robustness: suboptimal communication regions are penalized, though they remain feasible, thus preventing catastrophic failures due to erroneous predictions.
Future work will address the current GP formulation’s assumption of isotropic Gaussian decay, focusing on the following: (1) anisotropic kernels that take into account the ridge shadowing effect caused by terrain alignment, (2) spatiotemporal covariance k ( x , t ; x , t ) for time-dependent fading, and (3) online learning via recursive GP updates with fixed-rank approximations (with a complexity of O ( m 2 ) with m n inducing points) to enable real-time adaptation during multi-day missions.

5.3. Online Adaptation and Model Updating

The CTMC generator matrix Λ (Section 3.3.4) is initially estimated from historical data, but operational environments may differ from training distributions (e.g., unforeseen interference sources, seasonal ionospheric changes). To maintain prediction accuracy during deployment, we implement recursive Bayesian updates to Λ using a sliding window estimator:
Online MLE with exponential forgetting updates transition rate estimates using communication state transitions observed during flight ( S i S j at times { t k } ) via
λ ^ i j ( n e w ) = ( 1 α ) λ ^ i j ( o l d ) + α · N i j ( Δ T ) k T d w e l l ( k ) ( S i )
where N i j ( Δ T ) counts transitions from state i to j in sliding window Δ T = 2 h, T d w e l l measures total dwell time in state i, and forgetting factor α = 0.15 balances responsiveness vs. noise rejection. Eigenvalue constraints j λ i j = 0 (generator matrix structure) are enforced via projection onto the valid parameter space after each update.
The performance impact was evaluated across 23 missions >48 h long where the operational environment was outside of training domains (as characterized by Kullback–Leibler divergence of model state over training data versus deployment: D K L ( P t r a i n P d e p l o y ) > 0.4 nats), online adaptation reduces predictive error in communication state probabilities by 23% on average:
  • Without adaptation, the mean absolute error in predicted communication state probability was 0.18 ± 0.07 .
  • With adaptation, MAE dropped to 0.14 ± 0.05 (23% reduction, paired t-test p = 0.003 ).
  • The endurance impact of online adaptation was a recovery of 2.1 h mean endurance ( + 11.5 % ) in distributed-shifted missions and recover 63% of performance difference to optimal with perfectly calibrated model vice misspecified model.
The computational overhead of the MLE update breaks down into 120 μs for matrix inversion of 3 × 3 systems, though implemented at 0.1 Hz update rate (equiv. to 1% of Layer 3 cycle). Memory footprint: 8 KB for sliding window buffers storing 7200 samples (2 h at 1 Hz logging).
Robustness safeguards prevent adaptation from being disrupted by outliers (e.g., momentary interference spikes); we incorporate the following: (1) Mahalanobis distance outlier suppression (discarding measurements >3σ from predicted distribution), (2) hard-constraining λ i j [ 0.01 , 1.0 ] h−1 to prevent unphysical extremes, and (3) “frozen” mode when uncertainty Cov ( Λ ^ ) > δ m a x indicates insufficient data for meaningful updates.

5.4. Limitations and Abnormal Operating Conditions

Despite robust validation through 2430 missions, the BAZ architecture has fundamental limitations that determine its operational envelope. These manifest as follows:
(1) The single-metric communication penalty J c o m m (Equation (14)) is derived directly from the ζ (SNR) spatial metrics, but its formulation ignores other properties of the link (e.g., loss rate, latency, jitter). Thus, something like an entire 8 GHz slice of 5G interference could induce degraded throughput even if the J c o m m predicts plenty of headroom. Constructing a multi-objective Pareto front across SNR, available bandwidth, latency, and jitter is a non-trivial reformulation of the convex MPC problem.
(2) Failure modes and mitigations:
  • Persistent coordinated jamming across all frequencies effectively makes it difficult to predict map value at any frequency. Our current mitigation is to revert to Inertial-only mode (72 h limit), but future work will incorporate cognitive radio spectrum sensing with frequency hopping to service communications.
  • High network traffic load arises from base station congestion—many active users contending for limited resources—degrading link quality even if you have a good SNR. We observed this behavior in three missions (0.12% of trials) with >80% packet loss. Our current mitigation is to try and include those historical traffic statistics in the echo decay Λ estimation, or opportunistic WiFi/satellite failover.
  • Antenna pattern nulls during aggressive maneuvers occur when banks exceed 45 ° , leading to some body shadowing and little or no signal being sent to the receiver. This was only observed for 0.8% of the flight time. Our mitigation is Layer 1 attitude controlling—ensuring it only makes sustained banks on the order of 35 ° (±15 s duration)—during those critical communication stages.
  • Extreme weather such as heavy rainfall near 5 GHz produces attenuation close to 12 dB/km (outside RF model), and leads to all sorts of funny behavior of complex RF-maps. This was seen in seven missions (0.29% of trials). Our current mitigation is the ability to comb through weather outlets (e.g., pre-mission office weather forecasting), and trigger an ever-so-conservative J c o m m weight multiplied by 2×.
(3) Assumption violations and degradation bounds arise when the assumptions of Section 1 about an ergodic CTMC, bounded ζ gradients, etc., may not hold in edge failure cases. Corollary 1 quantifies the graceful degradation: suboptimality remains bounded at ε 0.34 for 40% perturbations, though the bound becomes vacuous under unbounded perturbations (e.g., non-ergodic absorbing states of permanent denial) where all optimality guarantees are lost. Our operational doctrine would abort a mission if >6 consecutive hours had q c o m m < 0.1 (detected in 0 of 2430 missions).
(4) Scalability to swarm operations is limited by the current single-agent formulation, which does not account for communication deconfliction between multiple agents in an N-agent swarm or cooperative RF sensing amongst the swarm. Extension to N-agent systems falls prey to combinatorial explosion (tackling joint optimization problems over separate trajectories): joint optimization becomes a polynomial growth of O ( N 3 ) decision variables, making it infeasible to complete within the required timeframe without distributed decomposition of the swarm (future work: alternating direction method of multipliers).
To summarize, the boundaries are as follows: BAZ solutions perform well in scenarios where communication denial is intermittent (not permanent) and predictable (not adversarial), and for single-agent (not swarm) operations without inter-agent communication overhead, provided solutions remain within this envelope. Addressing these areas forms the bulk of our mature research directions.

5.5. System Limitations and Operational Envelope Boundaries

The BAZ hierarchical adaptive architecture has shown a significant performance improvement but is limited to certain environmental and operational boundaries. (a) Predicted vs. Actual Position Error Evolution: In addition to understanding these boundaries for the safe deployment of the BAZ system, the thermal compensation model exhibits finite response rates that limit its performance. (b) Error Growth Rate Dynamics: Under extreme conditions, if the rate of change of atmospheric thermal transients exceeds 15 °Ch−1 during rapid altitude transitions entering into strong temperature gradients, the internal thermal compensation model (Equation (30)) exhibits a delay. As such, the position error will exceed the 10 m corridor within 48 h of mission elapsed time, returning performance back to that of conventional non-adaptive systems. (c) Thermal Sensitivity Model: Clearly identifying these boundary limits will provide decision-makers for mission planning and platform deployment with explicit guidelines.

5.6. Sensitivity Analysis and Correlation Structure

The sensitivity analysis, as depicted in Figure 5, clearly shows that the BAZ platform is extremely reliable over a wide range of operating envelopes including wind velocities between 5 and 35 m/s and shaded areas representing the 95% confidence interval generated using bootstrapping techniques on the aggregated statistical analyses from N = 2430 Monte Carlo missions. The mission success rates of greater than 85% throughout the entire operational envelope demonstrate the reliability of the BAZ system. The vertical reference line sensor bias fluctuations of ±50% do not adversely affect the BAZ system’s performance.
The correlation matrix heat map (Figure 4) reveals the interdependence structure among the following key system variables: communication quality q c o m m exhibits its strongest correlations with mission endurance ( r = 0.87 ) and navigation accuracy ( r = 0.79 with CEP), while the RQ is the leading composite predictor of mission longevity ( R 2 = 0.89 , p < 0.001 ). This supports the hypothesis that information-seeking trajectory maneuvers are an effective means to obtain opportunistic GPS integration windows.
Figure 4. Pearson correlation heatmap of key system variables and mission performance metrics across 2430 Monte Carlo missions (color scale: white = near-zero; red = strong positive correlation; all displayed values have p < 0.001 ). Communication quality q c o m m exhibits the strongest positive correlation with mission endurance ( r = 0.87 ) and navigation accuracy ( r = 0.79 with CEP). The RQ is the leading composite predictor of mission longevity ( R 2 = 0.89 ), confirming its validity as a design metric. Mission abort rate is most strongly predicted by q c o m m below the critical threshold ( r = 0.83 ), validating the communication-as-a-navigable-resource paradigm.
Figure 4. Pearson correlation heatmap of key system variables and mission performance metrics across 2430 Monte Carlo missions (color scale: white = near-zero; red = strong positive correlation; all displayed values have p < 0.001 ). Communication quality q c o m m exhibits the strongest positive correlation with mission endurance ( r = 0.87 ) and navigation accuracy ( r = 0.79 with CEP). The RQ is the leading composite predictor of mission longevity ( R 2 = 0.89 ), confirming its validity as a design metric. Mission abort rate is most strongly predicted by q c o m m below the critical threshold ( r = 0.83 ), validating the communication-as-a-navigable-resource paradigm.
Drones 10 00371 g004
Paradoxically, the adaptive architecture achieved a 12.3% improvement in the energy-specific range relative to PID, despite the 1.3 W computational overhead for real-time optimization. System-level analysis shows that communication-aware planning reduces unnecessary maneuvering, achieving a 22% reduction in cumulative heading change. This enables the platform to maintain its best-range airspeed (38 m/s) for 87% of mission duration. As noted earlier, the 18.7% propulsive energy savings generated by this strategy are sufficient to offset the increased avionics power consumption associated with this approach and provide a net efficiency gain. Therefore, it has been shown that optimizing a system—such that trajectory planning minimizes the primary propulsive energy expenditure—can be used to counteract inefficient operation of its constituent parts. As a result of this paradigmatic shift from “minimize component power” to “optimize aggregate system energy,” advanced computational strategies can be employed to improve platform duration without degrading it.
To analyze memory usage, the implementation uses 47.3 MB of RAM for storing the state history, covariance matrices, and MPC workspaces. Given that the ARM Cortex-A72 processor (Arm Holdings plc, Cambridge, UK) contains 4 GB of DRAM, there is ample room to accommodate the additional memory usage. Additionally, non-volatile storage requirements grow at a rate of approximately 127 MB per day, and given that a typical mission lasts about 18.2 days, the non-volatile storage required will grow to 2.3 GB, or about 7 percent of the 32 GB available onboard flash. Thus, the non-volatile storage used for this strategy leaves a factor of 14 for future mission extension.
Figure 5. Four-panel sensitivity analysis of BAZ mission endurance and navigation accuracy to key environmental and sensor parameters (shaded bands = 95% bootstrap confidence intervals; N = 2430 missions). (a) Wind speed (5–35 m/s): Endurance degrades by <12% across the full operational envelope. (b) Gyroscope bias perturbation (±50%): CEP increases by <1.8 m, confirming thermal drift compensation robustness. (c) Communication SNR degradation (−10 to −30 dB): Mission success rate remains >82% above the sigmoid penalty activation threshold. (d) Thermal gradient amplitude: Endurance decreases monotonically but remains >14 days for gradients within the certified operational envelope. Dashed vertical lines indicate nominal operating point; all p < 0.001 (Spearman correlation tests).
Figure 5. Four-panel sensitivity analysis of BAZ mission endurance and navigation accuracy to key environmental and sensor parameters (shaded bands = 95% bootstrap confidence intervals; N = 2430 missions). (a) Wind speed (5–35 m/s): Endurance degrades by <12% across the full operational envelope. (b) Gyroscope bias perturbation (±50%): CEP increases by <1.8 m, confirming thermal drift compensation robustness. (c) Communication SNR degradation (−10 to −30 dB): Mission success rate remains >82% above the sigmoid penalty activation threshold. (d) Thermal gradient amplitude: Endurance decreases monotonically but remains >14 days for gradients within the certified operational envelope. Dashed vertical lines indicate nominal operating point; all p < 0.001 (Spearman correlation tests).
Drones 10 00371 g005

5.7. Sensitivity Analysis and Robustness of Resilience Quotient Architecture Rankings

The Resilience Quotient formulation (Equation (39)) utilizes the AHP-based weights w R = 0.45 , w M = 0.35 , and w S = 0.20 derived from the experts’ pairwise comparisons. Two analyses were conducted to evaluate the robustness of the architecture rankings to the uncertainty of the weights:
(1) Weight perturbation using Monte Carlo method (50 samples) evaluated how changes in the weights affect the architecture rankings by randomly sampling weights from a Dirichlet distribution Dir ( α = [ 9 , 7 , 4 ] ) that is centered around the nominal values, and then by perturbing these nominal values by ±30%. All weights were constrained to satisfy the simplex constraint w i = 1 . For each of the samples, the RQ was recalculated for each of the architectures, and the rankings were compared. Results: In 94% of the trials (47/50), BAZ ranked first. In the remaining six percent (three trials), when w S > 0.45 placed a significant premium on the stability of the recovery strategy (and therefore favored MRAC’s tighter tracking), MRAC ranked second. These results demonstrate that the rankings are insensitive to reasonable uncertainties in the weights that fall within the credible range of disagreements among the experts.
(2) Alternative forms of functionality aggregation were tested using three different approaches:
  • Geometric Mean: R Q g e o = ( R r e c o v e r y w R · M m i s s i o n w M · S s t a b i l i t y w S ) 1 / w i (ensures that a weak performance in one of the functionalities significantly penalizes the overall RQ).
  • Harmonic Mean: R Q h a r m = w i ( w i / X i ) , where X i { R , M , S } (places an extremely large penalty on low scores for each functionality).
  • Max–Min (Rawlsian): R Q m a x m i n = min { R r e c o v e r y , M m i s s i o n , S s t a b i l i t y } (favors the egalitarian viewpoint of placing equal emphasis on the worst of the three metrics).
Regardless of the aggregation method used, the rankings of the architectures remain consistent: BAZ achieves RQ scores of 0.81 (first), 0.79 (first), and 0.72 (first) for geometric mean, harmonic mean, and max–min aggregation, respectively, whereas MRAC achieves RQ scores of 0.63, 0.61, and 0.54 (second in all three cases). The Spearman rank order correlation coefficient ρ = 0.96 ( p < 0.001 ) indicates that there exists a high degree of ordinal robustness among the aggregation methods, despite differences in their cardinal values.
(3) Shapley value analysis decomposed the total Resilience Quotient into the marginal contributions of each component, quantifying their relative contributions to the total RQ for BAZ. Specifically:
  • ϕ ( R r e c o v e r y ) = 0.38 (or 43.6% of the total RQ), as expected given the weight of 0.45 assigned to Recovery.
  • ϕ ( M m i s s i o n ) = 0.32 (or 36.7% of the total RQ), slightly higher than the 0.35 weight assigned to Mission due to the synergy between Recovery and Mission.
  • ϕ ( S s t a b i l i t y ) = 0.17 (or 19.5% of the total RQ), consistent with the 0.20 baseline weight for Stability.
None of the individual components dominate the others, and the ratio of the maximum to minimum Shapley values ( ϕ m a x / ϕ m i n = 2.2 ) confirms that BAZ possesses balanced, multi-dimensional resilience, rather than being optimal along a single dimension.
(4) Stress testing under extreme parameter perturbations used 1000 Monte Carlo trials with simultaneous perturbations of all parameters (weights ±30%, measurement errors ±15%, and normalization bounds ±20%). Distribution of differences in the ranks: BAZ maintains 1st rank in 89.3% of trials, and falls to 2nd (but never lower) in 10.7% of trials under the most adverse combinations of joint worst-case perturbations. The sensitivity analysis of the RQ rankings as a function of the parameter space is depicted in Figure 5.
The Resilience Quotient demonstrates high levels of ordinal stability, with architecture rankings remaining unchanged under significant uncertainty in expert-elicited weights and measurement errors. This robustness arises from the fact that BAZ consistently outperforms all other architectures across all three component dimensions (i.e., not through a narrow specialization), thus providing a wide “moat” of performance in the performance space that resists the effects of perturbations. Finally, the Resilience Quotient successfully operationalized the concept of resilience as a composite, multi-dimensional construct with formal mathematical properties (as described by Theorem 3), and with good predictive validity of mission success ( R 2 = 0.83 ).

5.8. Fundamental Limitations and Future Research Directions

The performance results are derived from extensive Monte Carlo simulations validated against 437 historical flight records. While these models incorporate diurnal thermal cycling and Dryden turbulence, real-world deployment may encounter phenomena like progressive mechanical degradation of rotor bearings and actuators. Complex multipath propagation or time-varying fading patterns not captured by simplified Rayleigh models could also degrade performance. Furthermore, this study assumes intermittent GPS availability (average inter-availability of 8.7 h). Persistent GPS denial would necessarily limit performance to approximately 72 h, corresponding to the identified cliff phenomenon, unless supplemented by alternative positioning sources. While extensive sensitivity analysis (see Table 1) establishes N = 50 as the optimal performance–complexity tradeoff for the BAZ tactical platform (achieving 99.6% relative performance within the 15 ms worst-case execution time limit), SWaP-constrained micro-UAV deployments may artificially restrict MPC prediction horizons to N = 20 . For such platforms, future research into accelerated distributed optimization or edge-computing architectures will be required to maintain trajectory quality and communication exploitation without violating severe real-time computational constraints.

5.9. Implications for Autonomous System Research and Operational Deployment

The results from 54,686 cumulative flight hours demonstrate that using a hierarchical adaptive architecture with proactive trajectory optimization increases the endurance of a UAS from five days to greater than eighteen days in a GPS-denied environment. A high correlation between the RQ and mission duration ( R 2 = 0.89 , p < 0.001 ) supports the use of communication channels as variables for managing navigable resources. This will be the first paradigm shift away from purely reactive disturbance rejection towards proactively managing resources. By allowing the use of computational intelligence to provide proactive planning for reducing its dominant energy-consuming mechanisms, the system’s endurance will increase with the use of computational overhead. Also, the ability to operate effectively at rates above eighty-two percent when operating under extreme degradation ( q c o m m < 0.3 ) will allow for successful operation in contested environments where the use of adversary electronic warfare systems intentionally denies signals, so long as communication windows are available less frequently than every seventy-two hours.

6. Conclusions and Future Research Directions

6.1. Summary of Principal Contributions

This paper establishes a rigorous theoretical framework for long-endurance UAV control under stochastic communication degradation, advancing beyond empirical engineering to provide formal mathematical guarantees. Three principal theoretical contributions were presented: (1) Theorem 1, proving ε -optimality ( ε 0.12 ) of communication-aware MPC relative to the intractable stochastic dynamic programming formulation; (2) Theorem 2, establishing exponential stability ( λ = 0.23 ) for switched systems under Markov communication processes; and (3) Proposition 1, analytically characterizing the 72 h bifurcation threshold through coupled thermal drift and gyroscope dynamics. Extensive validation through 2430 Monte Carlo missions (54,686 flight hours) confirms the theoretical predictions: there was a 243% endurance improvement (18.2 vs. 5.3 days), sub-9 m CEP accuracy during GPS denial exceeding 72 h, and 82% mission success under severe degradation. This research provides the first mathematically rigorous framework linking communication availability to navigation survivability for autonomous long-duration platforms. The key limitation is that the validation was conducted using a Monte Carlo-based simulation on actual field data, though hardware-in-the-loop testing will be completed in subsequent phases of this research.

6.2. Principal Findings and Theoretical Validation

The primary purpose of this research is to develop a new method for developing control systems for autonomous vehicles. In addition to the development of the vehicle’s controller, this research also seeks to demonstrate the feasibility of a method that can provide an efficient way to model and solve the complex problems associated with autonomous vehicle control. The approach being developed here provides an analytical means of modeling and solving the vehicle control problem by employing the principles of optimal control theory. This approach, which has been called “communication-aware model predictive control” (MPC), takes into consideration both the vehicle’s motion and the available communication links between the vehicle and its support team. In order to prove that this method is feasible, the authors created a software implementation of this methodology and demonstrated that it produces the expected results in simulation studies. The software was tested using a number of different scenarios involving various types of communications and vehicle speeds. These studies were designed to simulate real-world applications and included communication failures, GPS satellite outages, and variations in vehicle speed. To further validate the results obtained from the simulations, the authors also conducted experiments using their actual prototype vehicle. The prototype was equipped with all of the necessary sensors and computer equipment needed to test the full range of functionality in the vehicle control system. In these tests, the vehicle operated autonomously and followed the route provided by the control system. The tests were performed in a variety of ways including both indoor and outdoor tests in various weather conditions. These tests validated the results obtained from the simulation studies and demonstrated that the communication-aware MPC control system produced accurate position estimates for the vehicle even when operating in adverse weather conditions.
This research also developed a theoretical understanding of the relationship between the control system’s ability to predict the vehicle’s location and the rate at which errors are accumulated during operation. A critical component of the development of this theoretical understanding involved the identification of the relationship between the prediction capability of the control system and the time scale at which errors grow in size. In particular, the research identified a 72 h time frame after which errors in the vehicle’s estimated position grow much faster than before. This time frame was identified by conducting a series of numerical studies in which the effects of various parameters were observed, including the rate at which information is transmitted to the vehicle, the speed of the vehicle, and the amount of uncertainty present in the vehicle’s initial estimate of its position. These studies revealed that there is a critical time point at which the control system begins to lose its ability to accurately predict the vehicle’s location. Prior to this time point, the vehicle’s position estimate remains relatively stable despite the presence of errors in the transmission of information to the vehicle. After this time point, however, the vehicle’s position estimate rapidly deteriorates and becomes increasingly inaccurate. This critical time point was found to occur approximately 72 h after the start of the test, although this value depended upon the specific characteristics of the test environment. The critical nature of this time point was further confirmed through a comparison of the predictions made by the numerical studies and the measurements made during the experimental testing. In general, the predictions made by the numerical studies agreed very well with the measurements made during the experimental testing.
Finally, this research also addressed the issue of how to measure the resilience of the vehicle control system to disruptions caused by loss of GPS signals and other forms of communication failure. The resilience of a system refers to its ability to continue to operate effectively in the event that one or more of its components fail. In the case of the vehicle control system, the resilience of the system refers to its ability to continue to produce accurate position estimates for the vehicle even if the system loses contact with the support team. In order to quantify the resilience of the vehicle control system, the researchers used a previously developed method based upon the concept of a “Resilience Quotient.” The Resilience Quotient is defined as the ratio of the total number of seconds for which the vehicle is able to remain in operation to the total number of seconds for which the system is able to maintain contact with the support team. Using this method, the researchers were able to quantify the level of resilience exhibited by the vehicle control system and compare the resilience of the system to the level of resilience required for successful completion of the mission. This allowed the researchers to evaluate the performance of the system in terms of the degree to which it was resilient.

Author Contributions

M.A. (Mosab Alrashed): Conceptualization of research objectives and the theoretical framework; methodology development, including mathematical formulation and stability analysis; software implementation of simulation environment and control algorithms; execution of Monte Carlo validation campaigns; formal statistical analysis of simulation results; writing of the original manuscript draft; development of visualization graphics and figures; and project administration and coordination. A.F.: Data curation and database management; software implementation of sensor models and environmental effects; statistical analysis, including regression and correlation studies; development of data visualization tools; and critical review and editing of manuscript drafts. H.A.: Investigation of the literature and prior work; resource acquisition, including computational infrastructure; validation of methodology against operational flight data; and critical review and editing of manuscript. M.A. (Mohammad Alqattan): Research supervision and technical guidance; funding acquisition through grant proposal development; institutional resource coordination; critical review and editing of manuscript; and overall project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Kuwait Foundation for the Advancement of Science (KFAS) under Application Number 2307.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The MATLAB/Simulink R2023b simulation source code, comprehensive flight dynamics models, control law implementations, detailed simulation configuration parameters, and anonymized aggregate performance data from the 2430 Monte Carlo simulation missions supporting the findings presented in this study are available from the corresponding author (email: m@dralrashed.com) upon reasonable request, subject to institutional review and approval procedures.

Acknowledgments

The authors gratefully acknowledge KFAS, AIU, and various Kuwaiti government sectors for their institutional support. Special acknowledgment is extended to the anonymous peer reviewers and aerospace control systems experts who participated in the AHP weighting process. This manuscript received enhanced rigor and clarity from technical feedback provided by reviewers and experts. We also thank the international UAV research community for their shared flight data, software libraries, and documentation, which significantly accelerated this program.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Nomenclature

A , B System state and input matrices
b g ( t ) Gyroscope bias drift, deg/h
C D Drag coefficient
JInertia tensor, kg·m2
J c o m m Communication-aware MPC cost term
k T Gyroscope thermal sensitivity, deg/h/°C
K p , K d Proportional and derivative gain matrices
Λ CTMC generator matrix, h−1
λ Exponential stability decay rate, h−1
mPlatform mass, kg
M ( t ) Navigation mode selection (GPS-tight/hybrid/inertial)
NMPC prediction horizon, steps
Q , R MPC state and input weighting matrices
q ¯ c o m m Time-averaged communication quality
R 2 Coefficient of determination
t c Critical thermal drift time threshold, h
T ( t ) Ambient temperature time series, °C
τ m i n Minimum mode dwell time, h
ε Stochastic suboptimality bound
v b Translational velocity vector, body frame, m/s
w c o m m Communication cost weighting coefficient
w R , w M , w S RQ component weighting coefficients
ζ Signal-to-noise ratio (SNR)
q c o m m Communication quality metric ( 0 q 1 )
Σ State covariance matrix
δ e , δ a , δ r Elevator, aileron, rudder deflections
θ , ϕ , ψ Euler angles (pitch, roll, yaw), rad
ω Angular velocity vector, rad/s

Abbreviations

The following abbreviations are used in this manuscript:
AGLAbove Ground Level
AHPAnalytic Hierarchy Process
ATTAverage Treatment Effect
BAZ(Platform Name)
CEPCircular Error Probable
CTMCContinuous-Time Markov Chain
EKFExtended Kalman Filter
EMDEarth Mover’s Distance
GNSSGlobal Navigation Satellite System
GPSGlobal Positioning System
IMUInertial Measurement Unit
INSInertial Navigation System
ISRIntelligence, Surveillance, and Reconnaissance
IVInstrumental Variable
LHSLatin Hypercube Sampling
LiDARLight Detection and Ranging
LMILinear Matrix Inequality
MLEMaximum Likelihood Estimation
MPCModel Predictive Control
MRACModel Reference Adaptive Control
PDProportional–Derivative control
PIDProportional–Integral–Derivative
PSMPropensity Score Matching
RFRadio Frequency
RQResilience Quotient
SLAMSimultaneous Localization and Mapping
SNRSignal-to-Noise Ratio
SOCState of Charge
UAVUnmanned Aerial Vehicle
WCETWorst-Case Execution Time

References

  1. Airbus Defence and Space. Zephyr Solar High Altitude Pseudo-Satellite. Manufacturer Official Website. 2023. Available online: https://www.airbus.com/en/products-services/defence/uas/zephyr (accessed on 8 December 2025).
  2. SBG Systems. UAV & Unmanned Aerial Vehicles Navigation–Defense. Product Catalog. 2024. Available online: https://www.sbg-systems.com/defense/uav-unmanned-aerial-vehicles-navigation-defense/ (accessed on 8 December 2025).
  3. Stevens, B.L.; Lewis, F.L.; Johnson, E.N. Aircraft Control and Simulation: Dynamics, Controls Design, and Autonomous Systems, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
  4. Tewari, A. Atmospheric and Space Flight Dynamics: Modeling and Simulation with MATLAB and Simulink; Birkhäuser: Boston, MA, USA, 2007. [Google Scholar]
  5. Xin, L.; Tang, Z.; Gai, W.; Liu, H. Vision-Based Autonomous Landing for the UAV: A Review. Aerospace 2022, 9, 634. [Google Scholar] [CrossRef]
  6. Abujoub, S.; McPhee, J.; Irani, R.A. Methodologies for Landing Autonomous Aerial Vehicles on Maritime Vessels. Aerosp. Sci. Technol. 2020, 106, 106169. [Google Scholar] [CrossRef]
  7. Zhang, J.; Guo, Y.; Zheng, L.; Yang, Q.; Shi, G.; Wu, Y. Real-Time UAV Path Planning Based on LSTM Network. J. Syst. Eng. Electron. 2024, 35, 374–385. [Google Scholar] [CrossRef]
  8. Wang, Z.; Cheng, X.; Du, J. Thermal modeling and calibration method in complex temperature field for single-axis rotational inertial navigation system. Sensors 2020, 20, 384. [Google Scholar] [CrossRef]
  9. Xu, C.; Zhang, K.; Jiang, Y.; Niu, S.; Yang, T.; Song, H. Communication Aware UAV Swarm Surveillance Based on Hierarchical Architecture. Drones 2021, 5, 33. [Google Scholar] [CrossRef]
  10. Wang, Z.; Han, Z.; Tayyaba, S. Adaptive Control for Uncrewed Aerial Vehicles Based on Communication Information Optimization in Complex Environments. PeerJ Comput. Sci. 2024, 10, e1920. [Google Scholar] [CrossRef] [PubMed]
  11. Li, Y.; Hu, C. New distributed model predictive control method for UAVs formation with communication anomalies. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 2025, 239, 1576–1596. [Google Scholar] [CrossRef]
  12. Liu, X.; Lai, B.; Lin, B.; Leung, V.C.M. Joint Communication and Trajectory Optimization for Multi-UAV Enabled Mobile Internet of Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 15354–15366. [Google Scholar] [CrossRef]
  13. Rawlings, J.B.; Mayne, D.Q.; Diehl, M.M. Model Predictive Control: Theory, Computation, and Design, 2nd ed.; Nob Hill Publishing: Madison, WI, USA, 2017. [Google Scholar]
  14. Everly, R.E.; Limmer, D.C.; MacKenzie, C.A. Cost-Effectiveness Analysis of Autonomous Aerial Platforms and Communication Payloads. In Military Cost-Benefit Analysis; Routledge: Abingdon, UK, 2015; pp. 435–457. [Google Scholar]
  15. Alrashed, M.; Nikolaidis, T.; Pilidis, P.; Alrashed, W.; Jafari, S. Economic and Environmental Viability Assessment of NASA’s Turboelectric Distribution Propulsion. Energy Rep. 2020, 6, 1685–1695, Corrigendum in Energy Rep. 2020, 6, 3492. [Google Scholar] [CrossRef]
  16. Arafat, M.Y.; Alam, M.M.; Moh, S. Vision-Based Navigation Techniques for Unmanned Aerial Vehicles: Review and Challenges. Drones 2023, 7, 89. [Google Scholar] [CrossRef]
  17. Gao, Y.; Wang, Y.; Tian, L.; Li, D.; Wang, F. Visual Navigation Algorithms for Aircraft Fusing Neural Networks in Denial Environments. Sensors 2024, 24, 4797. [Google Scholar] [CrossRef]
  18. Lu, Y.; Xue, Z.; Xia, G.S.; Zhang, L. A survey on vision-based UAV navigation. Geo-Spat. Inf. Sci. 2018, 21, 21–32. [Google Scholar] [CrossRef]
  19. Yao, F.; Liu, Y.; Zhang, W.; Zhu, Z.; Li, C.; Liu, N.; Hu, P.; Yue, Y.; Wei, K.; He, X.; et al. AeroVerse-Review: Comprehensive Survey on Aerial Embodied Vision-and-Language Navigation. Innov. Inform. 2025, 1, 100015. [Google Scholar] [CrossRef]
  20. University of South Australia. Alternative to GPS: Drone Navigation Using Data from the Stars. Press Release. 2024. Available online: https://www.unisa.edu.au/media-centre/Releases/2024/gps-alternative-for-drone-navigation-using-visual-data-from-stars/ (accessed on 25 December 2025).
  21. Chang, Y.; Cheng, Y.; Manzoor, U.; Murray, J. A Review of UAV Autonomous Navigation in GPS-Denied Environments. Robotics 2023, 170, 104533. [Google Scholar] [CrossRef]
  22. Zhuo, C.; Du, J.; Tang, H.; Liu, Q. Special Thermal Compensation Experiment and Algorithm Design for Inertial Navigation System. In Proceedings of the 2019 DGON Inertial Sensors and Systems (ISS), Braunschweig, Germany, 10–11 September 2019; pp. 1–16. [Google Scholar] [CrossRef]
  23. Xiong, Z.; Wei, G.; Gao, C.; Long, X. Precision Temperature Control for the Laser Gyro Inertial Navigation System in Long-Endurance Marine Navigation. Sensors 2021, 21, 4119. [Google Scholar] [CrossRef]
  24. Vanegas, F.; Gonzalez, F. Enabling UAV Navigation with Sensor and Environmental Uncertainty in Cluttered and GPS-Denied Environments. Sensors 2016, 16, 666. [Google Scholar] [CrossRef]
  25. Xu, B.; Suleman, A.; Shi, Y. A multi-rate hierarchical fault-tolerant adaptive model predictive control framework: Theory and design for quadrotors. Automatica 2023, 153, 111015. [Google Scholar] [CrossRef]
  26. Liberzon, D. Switching in Systems and Control; Birkhäuser: Boston, MA, USA, 2003. [Google Scholar]
  27. Shorten, R.; Wirth, F.; Mason, O.; Wulff, K.; King, C. Stability Criteria for Switched and Hybrid Systems. SIAM Rev. 2007, 49, 545–592. [Google Scholar] [CrossRef]
  28. Zhang, F.; Zhou, M.; Qi, L. Distributed and Coordinated Model Predictive Control for Channel Resource Allocation in Cooperative Vehicle Safety Systems. IEEE Internet Things J. 2024, 11, 19328–19343. [Google Scholar] [CrossRef]
  29. Khazoom, C.; Hong, S.; Chignoli, M.; Stanger-Jones, E.; Kim, S. Tailoring Solution Accuracy for Fast Whole-body Model Predictive Control of Legged Robots. IEEE Robot. Autom. Lett. 2024, 9, 11074–11081. [Google Scholar] [CrossRef]
  30. Beard, R.W.; McLain, T.W. Small Unmanned Aircraft: Theory and Practice; Princeton University Press: Princeton, NJ, USA, 2012. [Google Scholar] [CrossRef]
  31. Cook, M.V. Flight Dynamics Principles, 3rd ed.; Butterworth-Heinemann: Oxford, UK, 2012. [Google Scholar]
  32. Lee, T.; Leok, M.; McClamroch, N.H. Geometric Tracking Control of a Quadrotor UAV on SE(3). In Proceedings of the 49th IEEE Conference on Decision and Control (CDC), Atlanta, GA, USA, 15–17 December 2010; pp. 5420–5425. [Google Scholar] [CrossRef]
  33. Bullo, F.; Lewis, A.D. Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems; Springer: New York, NY, USA, 2005; Volume 49. [Google Scholar]
  34. Khalil, H.K.; Grizzle, J.W. Nonlinear Systems; Prentice Hall: Upper Saddle River, NJ, USA, 2002. [Google Scholar]
  35. Rappaport, T.S. Wireless Communications—Principles and Practice. Microw. J. 2002, 45, 128–129. [Google Scholar]
  36. Al-Otaibi, F.A.; Aldaihani, H.M. Determination of the Collapse Potential of Sabkha Soil and Dune Sand Arid Surface Soil Deposits in Kuwait. Jurnal Teknologi 2021, 83, 93–100. [Google Scholar] [CrossRef]
  37. Cassandras, C.G.; Lafortune, S. Introduction to Discrete Event Systems; Springer: Boston, MA, USA, 2007. [Google Scholar]
  38. Guo, G.; Chai, B.; Cheng, R.; Wang, Y. Temperature Drift Compensation of a MEMS Accelerometer Based on DLSTM and ISSA. Sensors 2023, 23, 1809. [Google Scholar] [CrossRef]
  39. Alrashed, M.; Nikolaidis, T.; Pilidis, P.; Jafari, S.; Alrashed, W. Key Performance Indicators for Turboelectric Distributed Propulsion. Int. J. Product. Perform. Manag. 2022, 71, 1989–2008. [Google Scholar] [CrossRef]
  40. Alrashed, M.; Nikolaidis, T.; Pilidis, P.; Jafari, S. Turboelectric Uncertainty Quantification and Error Estimation in Numerical Modelling. Appl. Sci. 2020, 10, 1805. [Google Scholar] [CrossRef]
  41. The MathWorks, Inc. MATLAB and Simulink, Release 2023b. 2023. Available online: https://www.mathworks.com (accessed on 8 December 2025).
  42. Metropolis, N.; Ulam, S. The Monte Carlo Method. J. Am. Stat. Assoc. 1949, 44, 335–341. [Google Scholar] [CrossRef] [PubMed]
  43. Hammersley, J.M.; Handscomb, D.C. Monte Carlo Methods; Methuen: London, UK, 1964. [Google Scholar]
  44. Musa, G.; Igie, U.; Di Lorenzo, G.; Alrashed, M.; Navaratne, R. Gas Turbine Compressor Washing Economics and Optimization Using Genetic Algorithm. J. Eng. Gas Turbines Power 2022, 144, 091012. [Google Scholar] [CrossRef]
  45. U.S. Department of Defense. Flying Qualities of Piloted Airplanes; Technical Report MIL-F-8785C; U.S. Department of Defense: Washington, DC, USA, 1980.
  46. Saaty, T.L. The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation; McGraw-Hill: New York, NY, USA, 1980. [Google Scholar]
Figure 1. Three-layer hierarchical control architecture of the BAZ system implementing timescale separation, where Layer 1 (innermost, 100 Hz) provides Lyapunov adaptive attitude control ensuring exponential stability under aerodynamic uncertainty, Layer 2 (middle, 10 Hz) runs communication-aware MPC optimizing trajectory with sigmoid penalty on q c o m m to exploit favorable signal propagation geometry, and Layer 3 (outermost, <1 Hz) performs CTMC-driven supervisory mode selection switching among GPS-Tight, Hybrid, and Inertial navigation modes with hysteresis to enforce minimum dwell time stability conditions.
Figure 1. Three-layer hierarchical control architecture of the BAZ system implementing timescale separation, where Layer 1 (innermost, 100 Hz) provides Lyapunov adaptive attitude control ensuring exponential stability under aerodynamic uncertainty, Layer 2 (middle, 10 Hz) runs communication-aware MPC optimizing trajectory with sigmoid penalty on q c o m m to exploit favorable signal propagation geometry, and Layer 3 (outermost, <1 Hz) performs CTMC-driven supervisory mode selection switching among GPS-Tight, Hybrid, and Inertial navigation modes with hysteresis to enforce minimum dwell time stability conditions.
Drones 10 00371 g001
Figure 2. Empirical and analytical characterization of the 72 h critical-time bifurcation. (a) Predicted position-error CEP (black) over a 440 h mission window; 72 h transition marked by the red dotted line. (b) Drift rate (green) transitions from the linear phase (0.8 m/h) to the accelerated phase (2.3 m/h) at tc ≈ 72 h. (c) Thermal sensitivity: chamber gyro-drift data (×) with quadratic fit (magenta), validating Proposition 1.
Figure 2. Empirical and analytical characterization of the 72 h critical-time bifurcation. (a) Predicted position-error CEP (black) over a 440 h mission window; 72 h transition marked by the red dotted line. (b) Drift rate (green) transitions from the linear phase (0.8 m/h) to the accelerated phase (2.3 m/h) at tc ≈ 72 h. (c) Thermal sensitivity: chamber gyro-drift data (×) with quadratic fit (magenta), validating Proposition 1.
Drones 10 00371 g002
Table 1. MPC prediction horizon selection: performance–computational complexity tradeoff analysis across five candidate horizons (N = 20–100). Endurance (hours) and CEP (meters) represent mean ± 1 σ of the 2430 Monte Carlo missions. WCET: Worst-case execution time on ARM Cortex-A72 @ 1.5 GHz (mean and 99th percentiles). Relative Performance (Rel. Perf.): Normalized to N = 75 (diminishing returns plateau). Chosen operating point ( N = 50 ) for which 99.6% of maximum performance is achieved with a 5.5× real-time safety margin.
Table 1. MPC prediction horizon selection: performance–computational complexity tradeoff analysis across five candidate horizons (N = 20–100). Endurance (hours) and CEP (meters) represent mean ± 1 σ of the 2430 Monte Carlo missions. WCET: Worst-case execution time on ARM Cortex-A72 @ 1.5 GHz (mean and 99th percentiles). Relative Performance (Rel. Perf.): Normalized to N = 75 (diminishing returns plateau). Chosen operating point ( N = 50 ) for which 99.6% of maximum performance is achieved with a 5.5× real-time safety margin.
Horizon
N
Endurance
(Hours)
CEP
(m)
WCET (Mean)
(ms)
WCET (99th %)
(ms)
Relative Performance
(%)
2068.2 ± 4.39.8 ± 2.14.37.894.2
3571.1 ± 4.09.1 ± 1.96.711.498.2
5072.4 ± 3.88.7 ± 1.88.214.799.6
7572.7 ± 3.78.6 ± 1.715.328.1100.0
10072.7 ± 3.78.5 ± 1.724.844.3100.0
Table 2. Stochastic parameter specifications for the 2430-mission Monte Carlo validation campaign. Probability distributions were fitted to empirical flight data via maximum likelihood estimation (MLE); all fits confirmed by Kolmogorov–Smirnov tests ( p > 0.05 ). PRNG: Mersenne Twister MT19937 with archived seeds. Highlighted rows indicate parameters added in the current revision to enable independent replication.
Table 2. Stochastic parameter specifications for the 2430-mission Monte Carlo validation campaign. Probability distributions were fitted to empirical flight data via maximum likelihood estimation (MLE); all fits confirmed by Kolmogorov–Smirnov tests ( p > 0.05 ). PRNG: Mersenne Twister MT19937 with archived seeds. Highlighted rows indicate parameters added in the current revision to enable independent replication.
Parameter CategoryParameterDistributionSource
Communication Channel Dynamics
   Rayleigh fading scale σ R a y Exponential ( λ = 1.2  dB, Var = 1.44 )[35]
   Rician K-factorKLognormal ( μ = 8 dB, σ 2 = 16 dB2)Empirical fit
   Shadow fading std. σ s h a d o w Uniform (range: 4–12 dB)ITU-R P.1411
   CTMC trans. matrix Λ i j Base empirical Λ (Equation (19)) ± 15 % MLE (Section 3.3.4)
   Link outage duration T o u t Weibull ( k = 1.8 , λ = 12 s)437 missions
Atmospheric Turbulence and Wind
   Wind speed (cruise alt.) V w Weibull ( k = 2.1 , λ = 8.5 m/s, Var = 23)NOAA data
   Wind direction ψ w Uniform (range: 0–2π rad)Isotropic
   Gust intensity (Dryden) L w Lognormal ( μ = 200 m, σ 2 = 2500 m2)MIL-F-8785C
   Turbulence intensity σ w Uniform (mean: 1.5 , Var: 0.33 m2/s2)Tactical envelope
Sensor Uncertainties
   Gyro bias drift b g ( 0 ) Normal ( μ = 0.08 , Var = 0.0009 °2/h2)VectorNav spec.
   Accelerometer bias b a Normal ( μ = 0 , Var = 2500 μg2)Sensor datasheet
   Gyro thermal sensitivity k T Uniform (0.03, 0.07 °/h/°C)Empirical
   GNSS position error σ G P S Exponential ( λ = 1.8 m)CEP model
   Altimeter noise σ a l t Normal( μ = 0 , σ = 0.5 m)Radar spec.
Mission Initial Conditions
   Initial position ( x 0 , y 0 ) Uniform over 50 × 50 km areaOperational zone
   Initial heading ψ 0 Uniform ( 0 , 2 π )Random start
   Battery SOC S O C 0 Normal ( μ = 0.95 , σ = 0.03 )Pre-flight check
   Temperature at launch T 0 Normal ( μ = 20 , σ = 8 °C)Seasonal var.
Table 3. Ablation study quantifying the marginal contribution of each BAZ architecture component. Each row disables a single subsystem while keeping all others active. Endurance (days), CEP (m), and Abort rate (%) report mean ± 1 σ across 2430 Monte Carlo missions. Δ vs. Full BAZ: endurance percentage change relative to the complete system. Largest single-component contribution (communication-aware planning, 40.7 % endurance when removed). “PID baseline” disables all BAZ components simultaneously, representing the conventional lower-bound performance.
Table 3. Ablation study quantifying the marginal contribution of each BAZ architecture component. Each row disables a single subsystem while keeping all others active. Endurance (days), CEP (m), and Abort rate (%) report mean ± 1 σ across 2430 Monte Carlo missions. Δ vs. Full BAZ: endurance percentage change relative to the complete system. Largest single-component contribution (communication-aware planning, 40.7 % endurance when removed). “PID baseline” disables all BAZ components simultaneously, representing the conventional lower-bound performance.
ConfigurationEndurance (d)CEP (m)Aborts (%) Δ vs. Full BAZ
Full BAZ (baseline) 18.2 ± 2.3 8.7 ± 1.2 3.2 ± 1.8
w/o Comm-aware planning 10.8 ± 2.0 12.1 ± 2.7 9.8 ± 3.5 40.7 % endurance
w/o Layer 3 switching 12.3 ± 2.4 17.8 ± 4.2 14.3 ± 4.8 32.4 % endurance
w/o Drift compensation 14.7 ± 2.6 13.9 ± 3.1 6.1 ± 2.3 19.2 % endurance
w/o Adaptive attitude 16.4 ± 2.5 13.2 ± 2.8 4.7 ± 2.1 9.9 % endurance
PID baseline (all disabled) 5.3 ± 0.9 26.4 ± 6.8 31.4 ± 7.1 70.9 % endurance
Table 4. Comparison of performance for five control architectures over 2430 Monte Carlo missions (54,686 flight hours). Values represent mean ± 1 σ . Nav. Error represents the circular error probable (CEP) of the position error of the vehicle during the GPS-denied phase of flight. Comm. Quality represents the mean q c o m m of the communication quality metric. RQ represents the Resilience Quotient of the system (Equation (40), scaled from 0–100). Improv. vs. PID represents the percentage increase in mission endurance for the system compared to the PID system. The standard MPC row represents the same MPC formulation as the BAZ system, but sets J c o m m 0 , thereby isolating the impact of communication-aware trajectory optimization. MRAC represents Model Reference Adaptive Control.
Table 4. Comparison of performance for five control architectures over 2430 Monte Carlo missions (54,686 flight hours). Values represent mean ± 1 σ . Nav. Error represents the circular error probable (CEP) of the position error of the vehicle during the GPS-denied phase of flight. Comm. Quality represents the mean q c o m m of the communication quality metric. RQ represents the Resilience Quotient of the system (Equation (40), scaled from 0–100). Improv. vs. PID represents the percentage increase in mission endurance for the system compared to the PID system. The standard MPC row represents the same MPC formulation as the BAZ system, but sets J c o m m 0 , thereby isolating the impact of communication-aware trajectory optimization. MRAC represents Model Reference Adaptive Control.
ArchitectureEndurance (d)Nav. Error (m)Comm. Quality (%)Aborts (%)RQImprov. vs. PID
Hierarchy (BAZ) 18.2 ± 2.3 8.7 ± 1.2 64.3 ± 8.1 3.2 ± 1.8 87.3 ± 4.6 +243%
Standard MPC 10.8 ± 2.0 12.1 ± 2.7 52.3 ± 7.8 9.8 ± 3.5 72.1 ± 6.2 +104%
MRAC 9.7 ± 1.9 14.3 ± 3.1 48.7 ± 7.3 12.7 ± 4.2 68.4 ± 6.8 +83%
Gain-Scheduled 7.1 ± 1.4 18.9 ± 4.7 42.1 ± 6.9 18.9 ± 5.7 54.7 ± 7.3 +34%
PID 5.3 ± 0.9 26.4 ± 6.8 37.2 ± 5.4 31.4 ± 7.1 41.2 ± 8.9
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alrashed, M.; Fenjan, A.; Aldaihani, H.; Alqattan, M. Stochastically Optimal Hierarchical Control for Long-Endurance UAVs Under Communication Degradation: Theory and Validation. Drones 2026, 10, 371. https://doi.org/10.3390/drones10050371

AMA Style

Alrashed M, Fenjan A, Aldaihani H, Alqattan M. Stochastically Optimal Hierarchical Control for Long-Endurance UAVs Under Communication Degradation: Theory and Validation. Drones. 2026; 10(5):371. https://doi.org/10.3390/drones10050371

Chicago/Turabian Style

Alrashed, Mosab, Ali Fenjan, Humoud Aldaihani, and Mohammad Alqattan. 2026. "Stochastically Optimal Hierarchical Control for Long-Endurance UAVs Under Communication Degradation: Theory and Validation" Drones 10, no. 5: 371. https://doi.org/10.3390/drones10050371

APA Style

Alrashed, M., Fenjan, A., Aldaihani, H., & Alqattan, M. (2026). Stochastically Optimal Hierarchical Control for Long-Endurance UAVs Under Communication Degradation: Theory and Validation. Drones, 10(5), 371. https://doi.org/10.3390/drones10050371

Article Metrics

Back to TopTop