Uncertainty-Calibrated UAV Trajectory Prediction for Beam Management in UAV-Assisted ISAC Scenarios

Cheng, Qing; Wu, Wenwen; Zhao, Ziwei

doi:10.3390/drones10060434

by Qing Cheng ¹, Wenwen Wu ¹,* and Ziwei Zhao ²

¹ College of Air Traffic Management, Civil Aviation Flight University of China, Chengdu 641419, China

² TUM School of Computation, Information and Technology (CIT), Technical University of Munich, Arcisstraße 21, 80333 Munich, Germany

* Author to whom correspondence should be addressed.

Drones 2026, 10(6), 434; https://doi.org/10.3390/drones10060434

Submission received: 16 April 2026 / Revised: 31 May 2026 / Accepted: 1 June 2026 / Published: 3 June 2026

(This article belongs to the Section Drone Communications)

Highlights

What are the main findings?

A multimodal CA-Gated predictor improves trajectory prediction accuracy, reducing ADE and FDE by about 2–3% compared with the Gated Uncertainty variant and by about 5–7% compared with representative uncertainty-aware baselines.
Conformal calibration improves uncertainty reliability, increasing Coverage@95 by about 7%, and enabling uncertainty-aware beam management to increase coverage by about 24% while reducing outage and misalignment by over 38% in high-risk scenarios.

What are the implications of the main findings?

Predictive uncertainty can be used as an intermediate signal for communication control, linking trajectory prediction with beam management decisions.
Improving uncertainty reliability leads to more conservative but more stable decision behavior, highlighting a trade-off between communication efficiency and robustness in dynamic UAV scenarios.

Abstract

Reliable beam management in Unmanned Aerial Vehicle (UAV)-assisted Integrated Sensing and Communication (ISAC) systems needs accurate trajectory prediction and a clear sense of prediction risk. Most existing methods use deterministic future positions or raw uncalibrated uncertainty. Under high mobility and uncertainty, this leads to unreliable beam decisions. We design a control-oriented probabilistic trajectory prediction framework. It uses calibrated trajectory uncertainty as a risk signal for adaptive beam management. The framework first combines motion history and visual context to predict trajectory distributions. Split conformal calibration turns raw Gaussian uncertainty into statistically reliable risk bounds. A codebook-constrained beam management strategy adjusts beamwidth based on the calibrated spatial risk. This balances beamforming gain, coverage robustness, and switching stability. Tests on UAV data show better prediction accuracy than representative probabilistic baselines. The raw uncertainty remains under-calibrated, and conformal calibration is therefore applied to improve its reliability before beam-control decisions. Using the calibrated uncertainty for beam control improves communication coverage and cuts outages and severe misalignment in high-risk situations. Calibrated predictive uncertainty can serve as an actionable control variable for robust beam management in dynamic UAV-assisted ISAC environments.

Keywords:

UAV trajectory prediction; ISAC; uncertainty calibration; beam management

1. Introduction

In recent years, with the rapid development of the low-altitude economy and sixth-generation (6G) mobile communication systems, unmanned aerial vehicles (UAVs) have been increasingly deployed in applications such as environmental sensing, communication relaying, and cooperative computing. Integrated Sensing and Communication (ISAC), which enables the shared utilization of sensing and communication resources, provides a promising paradigm for constructing efficient, flexible, and intelligent low-altitude information systems [1,2,3,4,5]. However, in UAV-assisted ISAC scenarios, link reliability still faces significant challenges. On the one hand, millimeter-wave (mmWave) communication relies on high-gain narrow beams to improve spectral efficiency, yet such beams are highly sensitive to pointing errors [6,7,8]. On the other hand, UAV platforms are characterized by high mobility, agile maneuvering, and rapidly changing environments, resulting in strongly time-varying and uncertain link conditions [4,5,9].

To address these challenges, predictive beam management has recently attracted increasing attention. The key idea is to leverage future state information to proactively adjust beam directions [9]. Nevertheless, most existing approaches rely on point estimates of future positions, which are inherently insufficient for risk-sensitive resource control. For reliability-oriented beam management, it is not only necessary to know where the target is likely to appear, but also how reliable such predictions are. Therefore, beyond improving trajectory prediction accuracy, a more critical problem is how to develop predictive models that can provide reliable uncertainty estimates, and further transform such uncertainty into actionable information for physical-layer control [10,11,12].

Meanwhile, relying solely on historical motion states, such as position, velocity, or acceleration, is insufficient to fully characterize motion constraints in complex low-altitude environments. The future trajectory of a UAV is influenced not only by its historical dynamics, but also by environmental factors such as local scene structure, obstacle distribution, and traversable regions. Incorporating external perceptual information, such as visual context, can provide complementary cues to motion states, thereby enhancing the model’s ability to capture complex maneuvers and scene constraints. However, most existing multimodal trajectory prediction studies primarily focus on improving point prediction accuracy, while the impact of multimodal interactions on the quality, reliability, and downstream usability of uncertainty remains underexplored. In particular, in ISAC scenarios, there is still a lack of systematic investigation into how calibrated trajectory uncertainty can be directly utilized for beam management.

Motivated by the above observations, this paper proposes an uncertainty-aware multimodal trajectory prediction and adaptive beam management framework for UAV-assisted ISAC systems. The proposed approach jointly exploits historical motion states and visual context as inputs, and develops a multimodal prediction model based on Long Short-Term Memory (LSTM), cross-attention, and gated fusion mechanisms. The model outputs both future trajectories and their corresponding uncertainty estimates. Unlike conventional approaches that treat uncertainty as an auxiliary output, this work further emphasizes its calibration and practical usability.

Building upon calibrated uncertainty, we design an adaptive beam management strategy that explicitly maps prediction uncertainty to beamwidth control, enabling a dynamic trade-off between beamforming gain and communication robustness in highly dynamic environments. The main contributions of this paper are summarized as follows:

A control-oriented probabilistic trajectory prediction formulation is developed for UAV-assisted ISAC beam management. The proposed formulation outputs both future position means and uncertainty estimates, so that the prediction results can be directly interpreted by downstream beam-control modules. The prediction module serves as an information interface between UAV motion perception and communication control.
Calibrated uncertainty is introduced as a risk quantification mechanism for communication-oriented decision-making. Split conformal calibration is applied to obtain more reliable uncertainty bounds. The calibrated uncertainty is then interpreted as a spatial risk measure, providing a more dependable basis for risk-sensitive beam management.
A codebook-constrained uncertainty-aware beam management strategy is designed to translate prediction risk into beam-control actions. The proposed strategy maps calibrated uncertainty to beamwidth selection under a finite beam codebook, enabling a practical trade-off among beamforming gain, link coverage, outage reduction, and switching stability. This establishes a closed prediction–calibration–control chain for UAV-assisted ISAC scenarios.

2. Related Work

2.1. UAV Trajectory Prediction

UAV trajectory prediction is a fundamental problem in intelligent air–ground systems. Early studies primarily relied on kinematic models or filtering-based approaches to model target motion, such as state propagation models based on constant velocity (CV) or constant acceleration (CA) assumptions, as well as methods like the Extended Kalman Filter (EKF) and Unscented Kalman Filter (UKF) [13]. These approaches are effective in low-dynamic or regular motion scenarios; however, due to their reliance on predefined motion assumptions, they struggle to adapt to nonlinear maneuvers and abrupt motion changes in complex environments.

With the advancement of deep learning, data-driven trajectory prediction methods have gradually become the mainstream. In particular, Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks, have been widely adopted to model temporal dependencies in sequential data [14]. Alahi et al. [15] proposed the Social-LSTM model, which incorporates neighborhood interaction modeling within the LSTM framework to enable joint prediction of multiple trajectories. Gupta et al. [16] introduced Social-GAN, which employs generative adversarial networks to model the multimodal distribution of trajectories, thereby improving prediction diversity and plausibility. Lee et al. [17] proposed the DESIRE framework, which combines generative modeling with inverse reinforcement learning to generate and rank multiple trajectory hypotheses. Ivanovic and Pavone developed Trajectron++, which models multi-agent interactions using graph structures and performs probabilistic trajectory prediction via variational inference, demonstrating strong generalization in complex scenarios [18]. While these methods significantly improve prediction accuracy and diversity, their primary focus remains on trajectory distribution modeling itself, with relatively limited attention to the reliability of predictions and their usability in downstream decision-making.

Due to the inherently three-dimensional and highly dynamic nature of UAV flight environments, relying solely on historical motion states is often insufficient to fully characterize future behaviors. Consequently, recent studies have begun to incorporate environmental context into trajectory prediction. Chen et al. [19] utilize convolutional neural networks to encode scene semantics and fuse them with trajectory features to better capture environmental constraints. Mangalam et al. [20] proposed the Y-Net model, which leverages goal-conditioned prediction and multimodal trajectory generation to capture diverse future behaviors. Similarly, Sadeghian et al. [21] introduced the SoPhie model, which leverages attention mechanisms to model both social interactions and physical environments, leading to improved prediction performance. These studies demonstrate that incorporating visual or semantic information can significantly enhance trajectory prediction, particularly in scenarios with obstacles or structural constraints.

Despite the performance gains brought by multimodal information, most existing methods adopt simple concatenation or pooling strategies for feature fusion, which limits their ability to capture dynamic cross-modal interactions. Although attention-based approaches, such as those based on Transformers, can model long-range temporal dependencies and cross-modal relationships, they primarily focus on improving prediction accuracy, with limited systematic analysis of uncertainty structure and model robustness in complex scenarios. Furthermore, most existing methods optimize for accuracy metrics such as Average Displacement Error (ADE) and Final Displacement Error (FDE), while paying insufficient attention to uncertainty modeling and reliability evaluation. The predicted uncertainty is often not properly calibrated, making it difficult to apply in risk-sensitive downstream decision-making tasks. This limitation is particularly critical in UAV-assisted ISAC scenarios, where tasks such as beam alignment depend not only on predicted positions but also on the confidence of the predicted distributions. Therefore, relying solely on point prediction accuracy is insufficient to support reliable joint optimization of communication and sensing.
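For reference, the two displacement metrics named above can be written down in a few lines. The following is a minimal sketch, assuming predictions and ground truth are stored as arrays of shape (N, T_p, 3); the function name `ade_fde` and the toy data are illustrative and not taken from the paper.

```python
import numpy as np

def ade_fde(pred, truth):
    """Average and Final Displacement Error for trajectories of shape (N, T_p, 3)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    dist = np.linalg.norm(pred - truth, axis=-1)  # Euclidean error per step, shape (N, T_p)
    ade = dist.mean()          # mean error over all samples and all horizon steps
    fde = dist[:, -1].mean()   # mean error at the last predicted step only
    return float(ade), float(fde)

# Toy example: the error grows linearly with the horizon (1 m, 2 m, 3 m).
truth = np.zeros((2, 3, 3))
pred = np.zeros((2, 3, 3))
pred[:, :, 0] = [1.0, 2.0, 3.0]
ade, fde = ade_fde(pred, truth)
print(ade, fde)  # 2.0 3.0
```

Both metrics summarize only the point prediction. Neither says anything about whether the predicted uncertainty is trustworthy, which is the gap discussed in this subsection.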

From the perspective of UAV-assisted ISAC, trajectory prediction should not be evaluated only as an isolated perception task. Existing trajectory prediction studies have achieved strong performance in terms of displacement-based metrics such as ADE and FDE. However, in narrow-beam communication systems, prediction errors may have asymmetric downstream consequences. A spatial error that appears moderate for trajectory tracking may still cause beam misalignment when a high-gain narrow beam is used. Therefore, for prediction-assisted beam management, the key issue is not only whether the predicted position is close to the ground truth, but also whether the prediction output can provide reliable information for communication control. This requirement motivates trajectory prediction models that output not only future positions, but also uncertainty estimates that can be interpreted by downstream beam-control modules.

2.2. Uncertainty Modeling and Calibration

Uncertainty modeling plays a crucial role in trajectory prediction, as it enhances both the reliability and interpretability of predictions. By characterizing the probability distribution of trajectories, uncertainty modeling reflects prediction confidence as well as multimodal characteristics. Predictive uncertainty is generally categorized into epistemic uncertainty, arising from model parameters, and aleatoric uncertainty, caused by data randomness and observation noise. Both jointly affect prediction reliability in dynamic environments [22]. In addition to probabilistic uncertainty modeling, deterministic robust estimation methods provide another important perspective for handling uncertainty in UAV systems. Nonlinear observer-based approaches, especially those based on sliding mode theory, are often designed to improve robustness against external disturbances, model uncertainties, sensor noise, and actuator faults. For example, super-twisting observer-based control has been applied to quadrotor UAVs to reconstruct unmeasured states and improve robust trajectory tracking under bounded disturbances [23]. More recently, fractional-order sliding mode observers have also been investigated for actuator fault estimation in quadrotor UAV systems, showing robustness in the presence of matched uncertainties and unmodeled dynamics [24]. These methods typically rely on bounded-error convergence or disturbance rejection guarantees rather than explicit probabilistic distributions.

Within deep learning frameworks, Kendall and Gal [25] systematically analyzed methods for modeling epistemic and aleatoric uncertainty, and proposed a unified approach that combines probabilistic regression with Bayesian approximation. In this context, Monte Carlo Dropout has been widely used to approximate Bayesian inference and estimate model uncertainty. Although this method has demonstrated effectiveness in various vision tasks, it incurs significant computational overhead and is difficult to apply efficiently in long-horizon sequence prediction. In addition, Lakshminarayanan et al. [26] proposed the Deep Ensembles approach, which improves uncertainty estimation by aggregating predictions from multiple independently trained models. However, this method also suffers from high computational and storage costs, limiting its applicability in resource-constrained scenarios. For regression tasks, probabilistic regression has gradually become a mainstream approach for uncertainty modeling. By assuming that the outputs follow a parameterized distribution, both mean and variance can be estimated simultaneously, and training is typically performed by minimizing the negative log-likelihood (NLL). However, relying solely on likelihood-based training often leads to mismatches between predicted confidence intervals and the true error distribution, resulting in biased uncertainty estimates.
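The likelihood objective mentioned above can be illustrated with a short sketch. It assumes a diagonal Gaussian output (independent variance per coordinate), which is an assumption made for illustration; the function name `gaussian_nll` and the variance floor `eps` are hypothetical.

```python
import numpy as np

def gaussian_nll(mean, var, target, eps=1e-6):
    """Mean negative log-likelihood of targets under independent Gaussians."""
    mean, var, target = (np.asarray(a, dtype=float) for a in (mean, var, target))
    var = np.maximum(var, eps)  # floor the variance to keep the log and division finite
    nll = 0.5 * (np.log(2.0 * np.pi * var) + (target - mean) ** 2 / var)
    return float(nll.mean())

# Same 1 m error, two different claimed variances.
matched = gaussian_nll([0.0], [1.0], [1.0])         # variance consistent with the error
overconfident = gaussian_nll([0.0], [0.01], [1.0])  # variance far too small
print(matched, overconfident)
```

The overconfident prediction receives a much larger loss, so NLL training does penalize variance that is too small on the training data. It does not, however, guarantee that the resulting intervals reach their nominal coverage on held-out data, which is why a separate calibration step is discussed next.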

Recent alternatives to pure likelihood-based uncertainty learning include deep evidential regression, ensemble-based uncertainty estimation, and conformal prediction. Deep evidential regression learns a higher-order evidential distribution and estimates both aleatoric and epistemic uncertainty without repeated stochastic inference [27]. Ensemble-based methods combine several independently trained models and usually provide stronger uncertainty estimates, but they increase storage and inference cost [26]. Conformal prediction offers a model-agnostic route to calibrate prediction intervals using validation residuals, with distribution-free coverage guarantees under standard exchangeability assumptions [28]. These studies suggest that likelihood-trained variance alone is not sufficient for reliable uncertainty estimation. In this work, conformal calibration is adopted as a lightweight post-processing step because it improves empirical coverage without retraining the trajectory predictor or maintaining multiple models.
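To make the conformal step concrete, the sketch below shows one common split conformal variant: residuals on a held-out calibration split are normalized by the predicted standard deviation, and a finite-sample-corrected quantile of these scores is used as a multiplier on the predicted intervals. The exact nonconformity score used in this work is not specified in this section, so the normalized-residual score, the function names, and the synthetic data are all assumptions.

```python
import math
import numpy as np

def conformal_scale(mu, sigma, y, alpha=0.05):
    """Split-conformal multiplier q so that mu +/- q * sigma targets 1 - alpha coverage."""
    scores = np.sort(np.abs(y - mu) / sigma)  # normalized residuals on the calibration split
    n = scores.size
    k = math.ceil((n + 1) * (1.0 - alpha))    # finite-sample corrected rank
    return float(scores[k - 1]) if k <= n else float("inf")

def coverage(mu, sigma, y, scale):
    """Fraction of targets that fall inside mu +/- scale * sigma."""
    return float(np.mean(np.abs(y - mu) <= scale * sigma))

# Synthetic, hypothetical data: the model claims sigma = 1, the true noise std is 2.
rng = np.random.default_rng(0)
n_cal, n_test = 2000, 2000
mu_cal, mu_test = np.zeros(n_cal), np.zeros(n_test)
sig_cal, sig_test = np.ones(n_cal), np.ones(n_test)
y_cal = rng.normal(0.0, 2.0, n_cal)
y_test = rng.normal(0.0, 2.0, n_test)

q = conformal_scale(mu_cal, sig_cal, y_cal, alpha=0.05)
raw_cov = coverage(mu_test, sig_test, y_test, 1.96)  # nominal Gaussian 95% interval
cal_cov = coverage(mu_test, sig_test, y_test, q)     # conformally rescaled interval
print(f"q={q:.2f}, raw coverage={raw_cov:.3f}, calibrated coverage={cal_cov:.3f}")
```

In this toy setting the raw interval covers far fewer than 95% of the targets, while the rescaled interval is close to the nominal level. The predictor itself is untouched; only a scalar multiplier is estimated, which is what makes the approach lightweight.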

To address the mismatch between likelihood-trained confidence intervals and the true error distribution, recent studies have increasingly focused on uncertainty calibration. Guo et al. [29] showed that deep neural networks tend to be overconfident in classification tasks and proposed post hoc calibration methods such as temperature scaling. For regression tasks, Kuleshov et al. [30] further introduced the calibrated regression framework, which applies a monotonic transformation to predicted distributions so that confidence intervals align with empirical coverage probabilities. Although these methods have shown promising results in general machine learning tasks, the calibration problem becomes more challenging in scenarios involving multi-step prediction and multimodal outputs.

In the field of trajectory prediction, some studies have attempted to explicitly model predictive distributions. For example, models based on variational autoencoders (VAE) or generative adversarial networks (GAN) can generate diverse future trajectories, thereby implicitly capturing multimodal uncertainty [16,17,18]. However, such approaches typically emphasize diversity rather than statistical consistency of uncertainty. More recently, several evaluation metrics have been introduced to assess the quality of uncertainty estimates, including Prediction Interval Coverage Probability (PICP), Calibration Error, and Continuous Ranked Probability Score (CRPS). These metrics characterize the reliability and consistency of predictive distributions from different perspectives, but are often used as auxiliary analysis tools rather than as core optimization objectives.

Although robust observer-based methods are highly valuable for UAV state estimation and fault diagnosis, their objectives differ from the focus of this work. Observer-based methods mainly aim to reconstruct system states or disturbances with robustness guarantees under certain system assumptions. By contrast, the present work focuses on multi-step trajectory prediction under multimodal inputs, where the output is a future position distribution rather than only the current system state. In this setting, the key challenge is whether the predicted uncertainty can provide statistically reliable coverage for future trajectories and whether such uncertainty can be further used by downstream beam management. Therefore, probabilistic calibration and robust observer-based estimation should be regarded as complementary perspectives: the former emphasizes empirical coverage and decision-oriented risk quantification, while the latter emphasizes bounded-error state reconstruction and disturbance robustness.

In summary, existing uncertainty-aware trajectory prediction methods still have three limitations when applied to UAV-assisted ISAC beam management. First, many probabilistic prediction models emphasize distribution generation or likelihood optimization, but likelihood-trained variance is not necessarily calibrated and may underestimate the true prediction error. Second, uncertainty is often evaluated using offline metrics, while its effect on downstream communication decisions is rarely analyzed. Third, deterministic robust estimation and probabilistic calibration address uncertainty from different perspectives, but the connection between calibrated predictive uncertainty and communication-control risk remains insufficiently studied. These limitations motivate the use of calibrated uncertainty as an intermediate risk signal for adaptive beam management.

2.3. Predictive Beam Management for ISAC

Narrow-beam transmission is a key technology for improving link gain and spectral efficiency in millimeter-wave (mmWave) and terahertz systems. However, it is highly sensitive to beam misalignment. In high-mobility UAV scenarios, frequent beam training and re-alignment introduce significant overhead and latency. As a result, predictive beam management based on location or trajectory prediction has emerged as an important research direction.

Early studies mainly relied on position-assisted beam alignment. For instance, Va et al. [31] utilized historical location information to predict future user positions in vehicular communication scenarios, thereby reducing beam search overhead. Giordani et al. [32] systematically analyzed beam tracking and handover in mmWave systems, highlighting that leveraging external sensing information can significantly improve beam management efficiency. Recent surveys on mmWave and THz beam management have shown that beam acquisition, beam tracking, and beam switching remain challenging in high-frequency mobile communication systems, especially when narrow beams, channel fluctuations, and dynamic environments are jointly considered. Existing beam management methods mainly reduce search overhead or improve beam tracking by exploiting location information, sensing features, channel measurements, or learning-based beam prediction. These studies provide important foundations for predictive beam management. However, many of them still make beam decisions based on point estimates, beam indices, or deterministic sensing features. The uncertainty of future UAV motion is often not explicitly calibrated or converted into a beam-control variable [33].

With the advancement of sensing technologies, researchers have begun to explore the use of multi-source perceptual information to enhance beam prediction. Ali et al. [34] investigated beam selection using out-of-band spatial information, demonstrating that auxiliary sensing modalities can provide useful priors for mmWave beam prediction. Kumari et al. [35] explored joint communication and sensing capabilities in mmWave systems, demonstrating the potential of leveraging sensing information to assist communication-related tasks. These studies indicate that integrating external sensing information can effectively improve beam prediction performance, particularly in complex environments or non-line-of-sight (NLoS) scenarios.

Within the ISAC framework, the deep integration of sensing and communication offers new opportunities for predictive beam management. Liu et al. [3] and Zhang et al. [36] analyzed the potential of sensing-assisted communication optimization in ISAC systems from system architecture and signal processing perspectives, respectively, and highlighted that leveraging sensing outcomes for proactive resource allocation is a promising direction. In this context, recent studies have begun to incorporate trajectory prediction into beam management, enabling proactive beam adjustment based on predicted future positions of users or UAVs, thereby reducing link interruption probability and improving system throughput.

Despite these advances, most existing methods rely on deterministic point predictions to make single beam decisions, neglecting the uncertainty arising from mobility, occlusions, and sensing noise in UAV scenarios. Relying solely on point estimates may lead to coverage risks or gain loss. Moreover, current approaches lack explicit modeling and effective utilization of uncertainty, and the connection between beam management strategies and predictive distributions remains weak, with limited systematic evaluation of reliability. How to transform uncertainty into beam control parameters and achieve a dynamic trade-off between beam gain and robustness remains an open problem. Robust ISAC and robust communication optimization methods also address uncertainty, but they usually focus on channel uncertainty, power allocation, beamforming robustness, or worst-case/risk-constrained resource optimization at the physical layer. These methods are different from the focus of this work. The present study starts from the mobility prediction side and investigates how calibrated trajectory uncertainty can be used to guide beamwidth selection under codebook constraints. In other words, the proposed framework does not replace robust beamforming or power control methods; instead, it provides a prediction-side risk input that can complement existing robust ISAC designs [37].

In summary, existing predictive beam management methods have made important progress in reducing beam training overhead and improving beam tracking efficiency. However, most of them rely on deterministic locations, predicted beam indices, or uncalibrated sensing features. Few studies explicitly examine how calibrated trajectory uncertainty can be transformed into beamwidth adaptation under finite codebook constraints. This gap is particularly important in UAV-assisted ISAC scenarios, where the reliability of the predicted future position directly affects beam coverage, outage probability, and the trade-off between array gain and robustness. Therefore, this paper focuses on linking calibrated probabilistic trajectory prediction with uncertainty-aware beam management, rather than treating beam selection as a purely deterministic prediction problem.

3. Proposed Method

3.1. Overall Framework

To achieve reliable trajectory prediction and prediction-driven beam control in UAV-assisted ISAC scenarios, this paper develops a unified methodological pipeline that integrates multimodal probabilistic prediction, uncertainty calibration, and risk-aware beam control. The overall framework is illustrated in Figure 1. Specifically, the framework first fuses historical motion states with visual context to perform probabilistic trajectory prediction, producing both position means and uncertainty estimates. The predicted uncertainty is then post-processed through calibration and reliability diagnosis. Finally, the calibrated uncertainty is explicitly mapped to beamwidth and switching decisions as a risk signal, enabling a dynamic trade-off between narrow-beam gain and link robustness.

From a system perspective, the proposed method consists of four sequential stages. In the first stage, data organization and multimodal input construction are performed. Based on the EuRoC MAV dataset, sliding-window samples are generated, where each sample consists of a historical motion state sequence of length $T_{h} = 20$, a temporally aligned image feature sequence, and future position labels of length $T_{p} = 10$. In the second stage, multimodal probabilistic trajectory prediction is conducted. The model takes motion state encoding as the backbone and employs cross-modal attention to select context from the visual sequence that is most relevant to the current motion state. A gated fusion mechanism is then used to conditionally enhance motion representations, ultimately producing the mean and variance of future 3D positions. In the third stage, uncertainty calibration and reliability analysis are performed. The scale of uncertainty is calibrated, and its statistical consistency is evaluated across multiple dimensions, including prediction horizons, spatial dimensions, and scenario difficulty levels. In the fourth stage, uncertainty-aware beam management is integrated. The calibrated uncertainty serves as a key intermediate variable linking the prediction module and the beam controller. The uncertainty output from the prediction model is transformed into a risk signal that directly drives beamwidth selection and switching cost trade-offs, thereby translating prediction accuracy into actionable control decisions.
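The first-stage sample construction can be sketched as follows. This is a minimal illustration, assuming the motion states and image features are already time-aligned row by row, that the position occupies the first three state columns, and that the window stride is one step; none of these details are stated in this section, and the function name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(states, feats, t_h=20, t_p=10, stride=1):
    """Cut aligned sequences into (history states, history features, future positions).

    states: (T, 13) motion states; feats: (T, D) image features aligned with states.
    Assumption: columns 0..2 of each state hold the 3D position.
    """
    total = states.shape[0]
    if total < t_h + t_p:
        raise ValueError("sequence is shorter than one history + prediction window")
    xs, fs, ys = [], [], []
    for s in range(0, total - t_h - t_p + 1, stride):
        xs.append(states[s:s + t_h])                   # history of length T_h
        fs.append(feats[s:s + t_h])                    # matching visual features
        ys.append(states[s + t_h:s + t_h + t_p, :3])   # future positions of length T_p
    return np.stack(xs), np.stack(fs), np.stack(ys)

# Toy sequence with 50 time steps and a hypothetical 64-dimensional image feature.
states = np.arange(50 * 13, dtype=float).reshape(50, 13)
feats = np.zeros((50, 64))
X, I_feat, Y = make_windows(states, feats)
print(X.shape, I_feat.shape, Y.shape)  # (21, 20, 13) (21, 20, 64) (21, 10, 3)
```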

It should be emphasized that the proposed framework is not intended to claim novelty from the individual use of LSTM, cross-attention, gated fusion, or conformal calibration alone. These components are adopted as functional modules within a control-oriented pipeline. The key design is the definition of calibrated predictive uncertainty as an intermediate control variable. Specifically, the prediction module provides a distributional estimate of future UAV positions, the calibration module converts the raw uncertainty into a statistically more reliable spatial risk bound, and the beam management module uses this bound to determine whether a narrow high-gain beam or a wider robust beam should be selected from the codebook. Therefore, the novelty of the framework lies in the explicit prediction–calibration–control mapping, rather than in treating trajectory prediction and beam management as two separate tasks.
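The role of the calibrated bound as a control variable can be illustrated with a deliberately simple rule. The sketch below is not the beam management strategy of this paper; it assumes a hypothetical codebook of full beamwidths in degrees, treats the calibrated bound as a risk radius around the predicted position, and picks the narrowest beam whose half-width covers the angle subtended by that radius.

```python
import math

def select_beamwidth(distance_m, risk_radius_m, codebook_deg=(5.0, 10.0, 20.0, 40.0)):
    """Pick the narrowest codebook beam that covers the calibrated risk region.

    distance_m: range from the transmitter to the predicted UAV position.
    risk_radius_m: calibrated spatial uncertainty bound around that position.
    codebook_deg: hypothetical set of available full beamwidths in degrees.
    """
    # Angle subtended by the risk radius as seen from the transmitter.
    half_angle = math.degrees(math.atan2(risk_radius_m, distance_m))
    for width in sorted(codebook_deg):
        if width / 2.0 >= half_angle:
            return width           # narrowest beam that still covers the region
    return max(codebook_deg)       # fall back to the widest beam available

print(select_beamwidth(100.0, 2.0))   # 5.0  (low risk: keep the narrow high-gain beam)
print(select_beamwidth(100.0, 10.0))  # 20.0 (higher risk: widen for robustness)
print(select_beamwidth(10.0, 10.0))   # 40.0 (risk exceeds the codebook: widest beam)
```

Even this toy rule shows why calibration matters: if the risk radius is underestimated, the controller keeps a narrow beam exactly when a wider one is needed.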

3.2. Probabilistic Multimodal Trajectory Predictor

To fully exploit the complementarity between UAV motion dynamics and environmental perception, this paper develops a multimodal trajectory prediction model. The model takes historical motion sequences as the primary input while incorporating visual perceptual features as auxiliary information. A structured fusion mechanism is employed to enhance both prediction accuracy and robustness. The overall architecture is illustrated in Figure 2.

3.2.1. Problem Formulation

At time step $t$, given a historical time window of length $T_{h}$, the UAV state sequence is defined as:

$$X_{1:T_{h}} = \{x_{t}\}_{t=1}^{T_{h}}, \quad x_{t} \in \mathbb{R}^{13}, \tag{1}$$

where each state includes position, velocity, and other motion-related information. Meanwhile, the corresponding visual observations are introduced and encoded into feature representations:

$$I_{1:T_{h}} = \{i_{t}\}_{t=1}^{T_{h}}, \quad i_{t} \in \mathbb{R}^{D}. \tag{2}$$

The objective of the model is to predict the future trajectory over $T_{p}$ time steps given the multimodal inputs:

$$\hat{Y}_{1:T_{p}} = f(X_{1:T_{h}}, I_{1:T_{h}}). \tag{3}$$

Unlike conventional deterministic prediction, this work further models the conditional probability distribution:

$$p(Y_{1:T_{p}} \mid X_{1:T_{h}}, I_{1:T_{h}}), \tag{4}$$

so as to jointly output trajectory means and uncertainty, providing a foundation for subsequent risk-sensitive decision-making.

3.2.2. Motion Encoder

The historical motion sequence is modeled using a Long Short-Term Memory (LSTM) network to capture temporal dependencies and dynamic evolution patterns:

$$h_{t} = \mathrm{LSTM}(x_{t}, h_{t-1}), \tag{5}$$

where $h_{t} \in \mathbb{R}^{H}$ denotes the hidden state at time $t$. The hidden state at the final time step is given by:

$$h_{\mathrm{motion}} = h_{T_{h}}, \tag{6}$$

which represents the current motion trend of the UAV, such as velocity variations and turning behavior.
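The recurrence in Equations (5) and (6) can be sketched with a single-layer LSTM written in NumPy. This is an illustration under assumptions: a standard LSTM cell (which also carries a cell state that Equation (5) leaves implicit), a hypothetical hidden size H = 32, and random weights in place of trained ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_encode(X, W, U, b):
    """Run a standard LSTM over X of shape (T_h, 13) and return the last hidden state.

    W: (4H, 13) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    Gate blocks are stacked in the order input, forget, candidate, output.
    """
    H = U.shape[1]
    h = np.zeros(H)  # hidden state h_0
    c = np.zeros(H)  # cell state c_0
    for x_t in X:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[:H])           # input gate
        f = sigmoid(z[H:2 * H])      # forget gate
        g = np.tanh(z[2 * H:3 * H])  # candidate cell update
        o = sigmoid(z[3 * H:])       # output gate
        c = f * c + i * g
        h = o * np.tanh(c)           # h_t as in Equation (5)
    return h                         # h_motion = h_{T_h} as in Equation (6)

rng = np.random.default_rng(0)
T_h, D_x, H = 20, 13, 32             # H = 32 is a hypothetical hidden size
X = rng.normal(size=(T_h, D_x))
W = rng.normal(scale=0.1, size=(4 * H, D_x))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h_motion = lstm_encode(X, W, U, b)
print(h_motion.shape)  # (32,)
```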

3.2.3. Visual Context Encoder

Visual information is extracted using a pre-trained convolutional neural network. To adapt these features to the downstream task, a trainable projection head is introduced to map them into a lower-dimensional embedding space:

$$i_{t} = \phi(\text{ResNet-18}(\mathrm{image}_{t})), \tag{7}$$

where $\phi(\cdot)$ denotes a linear projection followed by a nonlinear activation. This design follows a common paradigm in which a pre-trained perception backbone supplies prior features and a lightweight task-specific head adapts them, enabling visual features to retain semantic richness while being tailored to trajectory prediction.

3.2.4. Multimodal Fusion Mechanism

Cross-attention and gated fusion are used in this work not as standalone architectural novelties, but as mechanisms for constructing a control-oriented probabilistic representation. In UAV-assisted ISAC scenarios, visual context may indicate local environmental constraints and potential motion changes, while historical motion states describe short-term dynamic trends. The cross-attention module conditions visual feature selection on the current motion state, and the gated fusion module controls the extent to which the selected perceptual context should modify the motion representation. This design aims to improve not only point prediction accuracy, but also the quality of the predicted distribution that will later be calibrated and used for beam-control risk estimation. Taking the motion hidden state as the query and the visual sequence as the key and value, the cross-modal attention is defined as:

z_{attn} = Attention (Q = h_{motion}, K = I_{1 : T_{h}}, V = I_{1 : T_{h}}),

(8)

This mechanism allows the model to dynamically select the most relevant visual context conditioned on the current motion state. After obtaining the visual context, a gating mechanism is applied for adaptive modulation:

g = σ (W_{g} z_{attn}),

(9)

The final fused representation is given by:

h_{fused} = h_{motion} ⊙ (1 + α g),

(10)

where σ (\cdot) denotes the Sigmoid activation function, g represents the gating weights, α is a fusion strength coefficient, and ⊙ denotes element-wise multiplication. From the perspective of beam management, this fusion design affects the system not only through lower trajectory error, but also through more informative uncertainty estimates. A more reliable trajectory distribution reduces the probability that the beam controller underestimates spatial risk, which is essential for selecting an appropriate beamwidth under codebook constraints.
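The fusion in Equations (8)–(10) can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not the authors' implementation: the projection matrices `W_q`, `W_k`, `W_v`, the attention width `A`, the scaled dot-product form of the attention, and the default `alpha` are all assumptions introduced here.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention_gated_fusion(h_motion, visual_seq, W_q, W_k, W_v, W_g, alpha=0.5):
    """Sketch of Eqs. (8)-(10): the motion state queries the visual sequence,
    and the attended context gates the motion representation.

    h_motion:   (H,)      final LSTM hidden state
    visual_seq: (T_h, D)  per-frame visual embeddings
    W_q (H, A), W_k (D, A), W_v (D, A), W_g (A, H): hypothetical projections
    """
    q = h_motion @ W_q                                # query, shape (A,)
    k = visual_seq @ W_k                              # keys, shape (T_h, A)
    v = visual_seq @ W_v                              # values, shape (T_h, A)
    weights = softmax(k @ q / np.sqrt(q.shape[0]))    # attention over frames, (T_h,)
    z_attn = weights @ v                              # Eq. (8), shape (A,)
    g = 1.0 / (1.0 + np.exp(-(z_attn @ W_g)))         # Eq. (9), sigmoid gate in (0, 1)
    h_fused = h_motion * (1.0 + alpha * g)            # Eq. (10)
    return h_fused, g, weights
```

Because the gate lies in (0, 1), each component of the motion state is scaled by a factor between 1 and 1 + α, so the visual context can amplify the motion representation but never suppress or flip it.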

3.2.5. Probabilistic Output Head and Gaussian NLL

The fused feature is fed into a prediction head to output the parameters of the probabilistic distribution over future 3D positions for T_{p} time steps:

(μ, \log σ^{2}) \in R^{T_{p} \times 6},

(11)

For each prediction step t, the model outputs the mean vector and variance parameters of the 3D position:

μ_{t} = [μ_{t}^{x}, μ_{t}^{y}, μ_{t}^{z}], σ_{t}^{2} = [(σ_{t}^{x})^{2}, (σ_{t}^{y})^{2}, (σ_{t}^{z})^{2}],

(12)

The log-variance parameterization is adopted to ensure numerical stability, where the first three dimensions correspond to position means and the remaining three correspond to logarithmic variance. To prevent variance explosion or collapse during training, the log-variance is constrained within:

\log σ_{t}^{2} \in [-5, 3],

(13)

thereby balancing training stability and model expressiveness. During training, the Gaussian Negative Log-Likelihood (NLL) is used as the primary optimization objective. The loss at each step is defined as:

L_{t} = \frac{1}{2} (\log σ_{t}^{2} + \frac{(y_{t} - μ_{t})^{2}}{σ_{t}^{2}}),

(14)

where y_{t} \in R^{3} denotes the ground-truth future position at prediction step t, μ_{t} \in R^{3} denotes the predicted mean, and σ_{t}^{2} \in R^{3} denotes the coordinate-wise predictive variance. The overall loss is expressed as:

L_{NLL} = \frac{1}{N} \sum_{i = 1}^{N_{s}} \sum_{t = 1}^{T_{p}} \frac{1}{2} (\log σ_{i, t}^{2} + \frac{(y_{i, t} - μ_{i, t})^{2}}{σ_{i, t}^{2}}),

(15)

where N_{s} denotes the number of samples, and N = N_{s} \times T_{p} is the total number of predicted points. As a result, the model outputs not only point estimates of future positions but also probabilistic predictions with explicit confidence semantics.
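As an illustration of Equations (13)–(15), the clamped Gaussian NLL can be written as below. This is a sketch under one stated assumption: the per-point term is summed over the three coordinates before averaging over the N = N_s × T_p points, which the text does not spell out.

```python
import numpy as np

def gaussian_nll(mu, log_var, y, lo=-5.0, hi=3.0):
    """Mean Gaussian NLL of Eq. (15) with the log-variance clamp of Eq. (13).

    mu, log_var, y: arrays of shape (N_s, T_p, 3).
    Assumption: coordinate terms are summed, then averaged over N_s * T_p points.
    """
    log_var = np.clip(log_var, lo, hi)                       # keep variance in a stable range
    per_coord = 0.5 * (log_var + (y - mu) ** 2 / np.exp(log_var))
    return per_coord.sum(axis=-1).mean()
```

The clamp matters when the network drifts toward extreme variances: a log-variance of -10 is treated as -5, which bounds how strongly a single confident error can dominate the loss.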

3.3. Uncertainty Modeling

In UAV-assisted ISAC scenarios, the critical issue for trajectory prediction is how to obtain uncertainty estimates that are reliable enough to inform decisions. This work focuses on improving the practical reliability of existing probabilistic outputs and examining how calibrated uncertainty can be effectively utilized in control-oriented applications. The motivation for applying conformal calibration is to obtain an operationally meaningful uncertainty bound for downstream control. Raw Gaussian variance learned through NLL training can reflect relative prediction difficulty, but it does not guarantee that the predicted confidence region matches the empirical error distribution. Such under-calibrated uncertainty may cause the controller to select an overly narrow beam and increase the probability of outage. Therefore, calibration is introduced as a risk-regularization step before beam-control decision-making.

3.3.1. Split Conformal Calibration on Validation Set

As described in Section 3.2.5, the model predicts the Gaussian distribution of future trajectories by outputting the mean and variance at each prediction step. In multi-step predictions, errors accumulate over time, and the predicted variance may not accurately reflect this temporal propagation. Therefore, the original Gaussian uncertainty cannot be directly used for risk-sensitive ISAC decisions that require accurate coverage and reliable confidence estimation.

To address the above issue, a validation set-based split conformal calibration method is used to post-process the predicted uncertainty. A nonconformity score is defined on the validation set as:

s_{t} = \frac{{‖y_{t} - μ_{t}‖}_{2}}{σ_{t}^{r}},

(16)

where σ_{t}^{r} denotes the scalar spatial uncertainty radius obtained from the coordinate-wise standard deviations.

By analyzing the distribution of scores on the validation set, the quantile q_{α} corresponding to a confidence level 1 - α can be obtained. During testing, this quantile is used to rescale the original uncertainty, yielding calibrated confidence intervals:

{‖y_{t} - μ_{t}‖}_{2} \leq q_{α} \cdot σ_{t}^{r},

(17)

This approach does not require additional model training and relies solely on validation statistics to achieve distribution calibration. Compared with retraining-based calibration or ensemble-based uncertainty estimation, split conformal calibration is selected because it is model-agnostic, lightweight, and directly provides empirical coverage control on a held-out validation set. These properties are important for UAV-assisted communication systems. Therefore, the calibration step is suitable for serving as a practical interface between probabilistic prediction and communication control.
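The calibration procedure of Equations (16) and (17) can be sketched as follows. Two details are assumptions made for this example and are not stated in the text: the radius σ_t^r is taken as the Euclidean norm of the per-axis standard deviations, and the quantile uses the usual finite-sample split-conformal rank ⌈(n + 1)(1 − α)⌉, clamped to n.

```python
import math
import numpy as np

def spatial_radius(sigma):
    # Assumption: sigma^r is the Euclidean norm of the per-axis standard deviations.
    return np.linalg.norm(sigma, axis=-1)

def conformal_quantile(y_val, mu_val, sigma_val, alpha=0.05):
    """Quantile q_alpha of the validation scores s = ||y - mu||_2 / sigma^r (Eq. 16)."""
    scores = np.linalg.norm(y_val - mu_val, axis=-1) / spatial_radius(sigma_val)
    scores = np.sort(scores.ravel())
    n = scores.size
    rank = min(math.ceil((n + 1) * (1.0 - alpha)), n)   # finite-sample rank, clamped to n
    return scores[rank - 1]

def calibrated_radius(sigma_test, q_alpha):
    # Right-hand side of Eq. (17): radius expected to contain the true position.
    return q_alpha * spatial_radius(sigma_test)
```

A value of q_α above 1 indicates that the raw radius was too small on the validation set; the same factor is then applied unchanged to every test prediction, which is why no retraining is needed.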

3.3.2. Reliability Diagnostics Across Dimension, Horizon, and Difficulty

After post hoc calibration, a single metric is insufficient to fully evaluate the quality of uncertainty estimates. This paper conducts systematic diagnostics of predictive uncertainty from multiple perspectives to assess both statistical consistency and practical usability. First, from a global distribution perspective, the Negative Log-Likelihood (NLL) is used as a baseline metric for probabilistic prediction quality. In addition, the deviation between the empirical coverage rate and the nominal coverage is used to define the calibration error:

Calibration Error = ∣ Coverage Rate - Nominal Coverage ∣,

(18)

In the spatial domain, uncertainty is analyzed separately along the x, y, and z axes to examine the consistency between prediction errors and standard deviations in different directions. In the temporal domain, trends of ADE, standard deviation, and coverage rate are examined across prediction steps t \in {1, \dots, T_{p}} to verify whether uncertainty expands reasonably over time. In the scenario domain, the dataset is divided into easy, medium, and difficult subsets to evaluate the responsiveness of uncertainty under different complexity levels.

Through these multi-dimensional diagnostics, this work not only verifies the accuracy of uncertainty estimation, but also reveals how multimodal information fusion affects uncertainty quality.

3.3.3. Calibration-Driven Risk Quantification for Beam Control

After calibration, uncertainty is further interpreted from a system perspective as a control-oriented risk signal. The calibrated standard deviation σ_{t}^{cal} is no longer merely a statistical measure of error scale, but can be directly mapped to decision risk. A larger σ_{t}^{cal} indicates a more dispersed future position distribution, corresponding to higher uncertainty and a greater risk of link mismatch. Conversely, a smaller σ_{t}^{cal} indicates more reliable predictions, allowing for more aggressive resource allocation strategies. Accordingly, the prediction output can be unified as:

(μ_{t}, σ_{t}^{cal}) \Rightarrow position estimate + risk measure,

(19)

This enables a direct mapping from prediction to control in ISAC systems. In the subsequent beam management module, this risk signal is explicitly used to guide beamwidth adaptation, achieving a dynamic trade-off between communication gain and coverage robustness.

This section does not aim to develop a new uncertainty estimation method, but rather to enhance the reliability and practical usability of predictive uncertainty, enabling its effective integration into downstream communication control.

3.4. Risk-Aware Adaptive Beam Management

After completing multimodal trajectory prediction and uncertainty calibration, this paper incorporates the prediction outputs into an Integrated Sensing and Communication (ISAC) system, enabling end-to-end optimization from trajectory distribution estimation to communication resource control. This work treats calibrated uncertainty as a risk measure and explicitly embeds it into the beam control strategy, thereby constructing a risk-aware adaptive beam management mechanism for dynamic UAV scenarios.

In millimeter-wave communication systems, narrow beams provide higher array gain but are highly sensitive to pointing errors, whereas wide beams offer stronger coverage at the cost of reduced signal-to-noise ratio (SNR) and system capacity.

Based on the modeling results in Section 3.2 and Section 3.3, the model outputs the future position distribution parameters at time t:

(μ_{t}, σ_{t}^{cal}),

(20)

where μ_{t} denotes the predicted position mean, and σ_{t}^{cal} represents the calibrated standard deviation.

In practical mmWave systems, beams are typically selected from predefined codebooks. This work extends continuous beam control to a discrete beam selection problem under codebook constraints. Let the finite beam codebook be denoted as B = {b_{k}}_{k = 1}^{K}, where each candidate beam b_{k} is associated with a beam direction, a beamwidth θ_{k}, and a beam gain G (b_{k}). The beam selection can then be formulated as:

b_{t} \in B,

(21)

where the optimal beam must be chosen from a finite candidate set. The calibrated spatial uncertainty is converted into an angular risk margin according to the predicted link distance. Specifically, for the predicted link distance d_{t}, the angular uncertainty margin can be approximated as:

Δ θ_{t}^{cal} = \arctan (\frac{q_{α} σ_{t}^{r}}{d_{t}}) .

(22)

A candidate beam is regarded as feasible when its half beamwidth can cover both the predicted mean direction and the calibrated angular risk margin. To this end, the continuous prediction results are mapped to the discrete codebook by quantizing the predicted mean direction and determining candidate beams based on uncertainty.
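A small worked example of Equation (22) and the feasibility rule is given below. The feasibility test is one possible reading of the rule, assumed here for illustration: a beam qualifies when its half beamwidth is at least the angular offset between its boresight and the predicted mean direction (`pointing_offset_deg`, a hypothetical quantization error) plus the calibrated margin. Angles in degrees are also an assumption.

```python
import math

def angular_margin_deg(q_alpha, sigma_r, d_t):
    """Eq. (22): calibrated angular margin, returned in degrees."""
    return math.degrees(math.atan(q_alpha * sigma_r / d_t))

def feasible_beams(beamwidths_deg, margin_deg, pointing_offset_deg=0.0):
    """Indices of codebook beams whose half width covers offset + margin."""
    return [k for k, theta in enumerate(beamwidths_deg)
            if theta / 2.0 >= pointing_offset_deg + margin_deg]

# Example: q_alpha = 2, sigma_r = 0.5 m, d_t = 10 m gives a margin of about 5.71 degrees,
# so beams of 5 and 10 degrees are too narrow while 15 and 30 degrees are feasible.
margin = angular_margin_deg(2.0, 0.5, 10.0)
candidates = feasible_beams([5.0, 10.0, 15.0, 30.0], margin)
```

For a fixed spatial radius, the margin shrinks as the link distance grows, so distant UAVs can be tracked with narrower beams than nearby ones at the same positional uncertainty.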

Building upon this, a utility-driven risk-aware decision mechanism is introduced, formulating beam selection as an optimization problem that jointly considers communication performance and reliability. For any candidate beam b_{k}, the utility function is defined as:

U (b_{k}) = R (b_{k}) - λ_{out} \cdot P_{out} (b_{k}) - λ_{sw} \cdot C_{sw} (b_{k}),

(23)

where R (b_{k}) denotes the communication rate, P_{out} (b_{k}) is the link outage probability, and C_{sw} (b_{k}) represents the beam switching cost. The coefficients λ_{out} and λ_{sw} are weighting factors. The communication rate is computed according to the Shannon formula. Since the link budget is first expressed in the dB domain, the dB-scale SNR is defined as:

SNR_{dB} (b_{k}) = P_{t} + G (b_{k}) - PL (d_{t}) - N,

(24)

where P_{t} denotes the transmit power, G (b_{k}) is the beam gain of candidate beam b_{k}, PL (d_{t}) is the path loss at link distance d_{t}, and N denotes the noise power. The dB-scale SNR is then converted into the linear-scale SNR:

γ (b_{k}) = 10^{SNR_{dB} (b_{k}) / 10} .

(25)

Therefore, the achievable communication rate is given by:

R (b_{k}) = BW \log_{2} (1 + γ (b_{k})),

(26)

where BW denotes the system bandwidth, and γ (b_{k}) denotes the linear-scale SNR converted from the dB-domain link budget. The path loss model is given by:

PL (d_{t}) = 28 + 22 \log_{10} (d_{t}) + 20 \log_{10} (f_{c}),

(27)

and the beam gain, in linear scale, is inversely proportional to the square of the beamwidth:

G (b_{k}) = 10 \log_{10} (\frac{29000}{θ_{k}^{2}}),

(28)
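The link budget of Equations (24)–(28) can be evaluated numerically as in the sketch below. The units are assumptions, since the text does not state them: distance in metres, carrier frequency in GHz, beamwidth in degrees, and transmit and noise power in dBm.

```python
import math

def path_loss_db(d_m, fc_ghz):
    # Eq. (27); metres and GHz are assumed units.
    return 28.0 + 22.0 * math.log10(d_m) + 20.0 * math.log10(fc_ghz)

def beam_gain_db(theta_deg):
    # Eq. (28); beamwidth in degrees is an assumption.
    return 10.0 * math.log10(29000.0 / theta_deg ** 2)

def rate_bps(theta_deg, d_m, pt_dbm, noise_dbm, fc_ghz, bw_hz):
    """Achievable rate of Eq. (26) for one candidate beam."""
    snr_db = pt_dbm + beam_gain_db(theta_deg) - path_loss_db(d_m, fc_ghz) - noise_dbm  # Eq. (24)
    gamma = 10.0 ** (snr_db / 10.0)                                                      # Eq. (25)
    return bw_hz * math.log2(1.0 + gamma)                                               # Eq. (26)
```

Under Equation (28), halving the beamwidth adds 20 log10(2) ≈ 6.02 dB of gain, which is the quantitative form of the narrow-beam advantage that the controller trades against outage risk.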

For a given candidate beam b_{k}, the coverage condition is expressed as:

{∥ x_{t} - μ_{t} ∥}_{2} \leq d_{t} \cdot \tan (θ_{k} / 2),

(29)

where x_{t} is the ground-truth UAV position, μ_{t} is the predicted position mean, d_{t} is the predicted link distance, and θ_{k} is the beamwidth of candidate beam b_{k}. Based on the calibrated distribution, the coverage probability can be computed, leading to the link outage probability:

P_{out} (b_{k}) = 1 - P (coverage ∣ μ_{t}, σ_{t}^{cal}, b_{k}),

(30)

The calibrated uncertainty σ_{t}^{cal} has improved statistical consistency and can serve as a reliable risk measure in the decision process. In addition, to avoid excessive beam switching overhead, a switching penalty is introduced:

C_{sw} (b_{k}) = I (b_{k} \neq b_{t - 1}),

(31)

By combining the above factors, the final beam selection strategy is formulated as:

b_{t}^{*} = \arg \max_{b_{k} \in B} U (b_{k}),

(32)

This strategy enables beam control to jointly consider communication performance, link reliability, and system stability.
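The selection rule of Equations (23) and (29)–(32) can be illustrated with a short sketch. The text does not specify how the coverage probability is evaluated; the example assumes an isotropic 3D Gaussian position error with per-axis standard deviation `sigma_cal`, for which the probability of falling inside a ball has a closed form (the Maxwell-Boltzmann CDF). The rates are passed in as precomputed values, and beamwidths in degrees are again an assumption.

```python
import math

def coverage_probability(radius, sigma):
    """P(||e||_2 <= radius) for e ~ N(0, sigma^2 I_3); an assumed error model."""
    x = radius / sigma
    return math.erf(x / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * x * math.exp(-0.5 * x * x)

def select_beam(beamwidths_deg, rates, d_t, sigma_cal, prev_idx, lam_out, lam_sw):
    """Return the index maximizing the utility of Eq. (23), plus all utilities."""
    utilities = []
    for k, theta in enumerate(beamwidths_deg):
        radius = d_t * math.tan(math.radians(theta) / 2.0)      # coverage radius, Eq. (29)
        p_out = 1.0 - coverage_probability(radius, sigma_cal)   # outage probability, Eq. (30)
        c_sw = 1.0 if k != prev_idx else 0.0                    # switching cost, Eq. (31)
        utilities.append(rates[k] - lam_out * p_out - lam_sw * c_sw)
    best = max(range(len(utilities)), key=utilities.__getitem__)  # Eq. (32)
    return best, utilities

# Three beams with illustrative rates (arbitrary units), currently using beam 0.
confident, _ = select_beam([5.0, 10.0, 20.0], [8.0, 6.0, 4.0], 20.0, 0.1, 0, 10.0, 0.5)
uncertain, _ = select_beam([5.0, 10.0, 20.0], [8.0, 6.0, 4.0], 20.0, 1.0, 0, 10.0, 0.5)
```

With a small calibrated uncertainty the narrow high-rate beam is kept, whereas a tenfold larger uncertainty makes the outage term dominate and the widest beam is chosen despite its lower rate and the switching cost.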

In summary, the proposed risk-aware adaptive beam management method forms a coherent end-to-end framework that connects trajectory prediction, uncertainty calibration, and communication control. As a result, the proposed design enhances system robustness in dynamic UAV scenarios, while providing an interpretable and practically deployable framework for uncertainty-driven wireless resource optimization.

4. Experiments

4.1. Experimental Setup

The experiments are conducted on the publicly available EuRoC MAV dataset, which provides real UAV trajectories for evaluating the effectiveness of the proposed multimodal trajectory prediction and uncertainty-aware beam management framework in complex motion scenarios. We first perform unified temporal alignment of ground-truth, IMU, and camera data. The raw data are projected onto a unified 10 Hz time grid with a time interval of Δt = 0.1 s. For continuously varying ground-truth states, linear interpolation is adopted to preserve smooth trajectory evolution. IMU measurements are aligned using nearest-neighbor matching to avoid artificial high-frequency distortions introduced by interpolation. For visual data, the image frame closest to the current timestamp is selected as the observation. After alignment, the system state at each time step is represented as:

s_{t} = [x_{t}, y_{t}, z_{t}, v_{t}^{x}, v_{t}^{y}, v_{t}^{z}, a_{t}^{x}, a_{t}^{y}, a_{t}^{z}, g_{t}^{x}, g_{t}^{y}, g_{t}^{z}, ψ_{t}],

(33)

where the first three components denote position, followed by velocity, linear acceleration, angular velocity, and the yaw angle. The yaw angle ψ_{t} is obtained from the quaternion (q_{w}, q_{x}, q_{y}, q_{z}) as:

ψ_{t} = atan2 (2 (q_{w} q_{z} + q_{x} q_{y}), 1 - 2 (q_{y}^{2} + q_{z}^{2})),

(34)
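Equation (34) maps directly to the two-argument arctangent available in most languages. The following sketch assumes a unit quaternion in (w, x, y, z) order, as written in the equation.

```python
import math

def yaw_from_quaternion(qw, qx, qy, qz):
    """Yaw angle of Eq. (34) in radians; assumes a unit quaternion (w, x, y, z)."""
    return math.atan2(2.0 * (qw * qz + qx * qy), 1.0 - 2.0 * (qy * qy + qz * qz))

# A 90-degree rotation about the z axis gives a yaw of pi / 2.
quarter_turn = yaw_from_quaternion(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
```

Using atan2 rather than a plain arctangent keeps the correct quadrant, so the yaw covers the full range from -π to π.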

Based on the aligned time series, supervised samples are constructed using a sliding window strategy. For each time step t, the input and prediction target are defined as:

X_{t} = {x_{t - T_{h} + 1}, \dots, x_{t}}, Y_{t} = {p_{t + 1}, \dots, p_{t + T_{p}}},

(35)

where p_{t} = (x_{t}, y_{t}, z_{t}) denotes the position at time t. To ensure temporal consistency, windows are constructed only when the time interval between consecutive samples is exactly 0.1 s. The resulting dataset has dimensions (N, 20, 13) for historical states, (N, 20, 128) for image features, and (N, 10, 3) for future position targets. The image path, sequence name, and difficulty label are also retained.

For each time window, the corresponding sequence of consecutive images is extracted and the visual features of each frame are computed; a linear projection then yields an embedding of dimension D = 128:

I_{t} = {i_{t - T_{h} + 1}, \dots, i_{t}}, i_{t} \in R^{128},

(36)

This feature extraction is performed offline, preserving full temporal context without increasing training overhead. To reduce the influence of global position offsets between flight sequences on learning, the position p_{0} at the last time step of the historical window is taken as the local reference origin, and the position terms in the historical states and future prediction targets are uniformly translated into this local coordinate system:

{\tilde{p}}_{k} = p_{k} - p_{0},

(37)

Through this processing, the model focuses on the local motion evolution trend at the current moment, which helps improve data consistency across different sequences and training stability.
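The sliding-window construction of Equation (35) and the local-origin translation of Equation (37) can be sketched together as below. The function name `build_windows`, the column layout (position in the first three state columns), and the small tolerance used to test for a 0.1 s spacing in floating point are assumptions made for this example.

```python
import numpy as np

def build_windows(states, feats, times, t_h=20, t_p=10, dt=0.1, tol=1e-6):
    """Build (history, image-feature, future-position) samples from one aligned sequence.

    states: (T, 13) with position in columns 0..2; feats: (T, 128); times: (T,) in seconds.
    """
    hist_list, feat_list, fut_list = [], [], []
    span = t_h + t_p
    for s in range(len(states) - span + 1):
        # Skip windows that contain a timing gap (spacing different from dt).
        if np.any(np.abs(np.diff(times[s:s + span]) - dt) > tol):
            continue
        hist = states[s:s + t_h].copy()
        fut = states[s + t_h:s + span, :3].copy()
        p0 = hist[-1, :3].copy()      # position at the last history step
        hist[:, :3] -= p0             # Eq. (37) applied to history positions
        fut -= p0                     # Eq. (37) applied to future targets
        hist_list.append(hist)
        feat_list.append(feats[s:s + t_h])
        fut_list.append(fut)
    return np.array(hist_list), np.array(feat_list), np.array(fut_list)
```

After the translation, the last history position of every sample is the origin, so the targets are displacements relative to the current UAV position rather than absolute coordinates.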

4.2. Evaluation Metrics

To comprehensively evaluate the proposed multimodal trajectory prediction model, three aspects are considered: prediction accuracy, uncertainty reliability, and beam control utility.

4.2.1. Prediction Accuracy

For trajectory prediction, the most fundamental criterion is the spatial deviation between predicted and ground-truth trajectories. Let {\hat{p}}_{b, t} \in R^{3} denote the predicted position and p_{b, t} \in R^{3} denote the ground-truth position of sample b at prediction step t, where T_{p} is the prediction horizon and B is the batch size. The Average Displacement Error (ADE) is defined as the mean Euclidean distance over all prediction steps:

ADE = \frac{1}{B} \sum_{b = 1}^{B} \frac{1}{T_{p}} \sum_{t = 1}^{T_{p}} ∥ {\hat{p}}_{b, t} - p_{b, t} ∥_{2},

(38)

The Final Displacement Error (FDE) focuses only on the last prediction step:

FDE = \frac{1}{B} \sum_{b = 1}^{B} ∥ {\hat{p}}_{b, T_{p}} - p_{b, T_{p}} ∥_{2},

(39)

To further characterize the overall error distribution, the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) are also considered:

RMSE = \sqrt{\frac{1}{3 B T_{p}} \sum_{b, t} ∥ {\hat{p}}_{b, t} - p_{b, t} ∥_{2}^{2}}, MAE = \frac{1}{3 B T_{p}} \sum_{b, t} ∥ {\hat{p}}_{b, t} - p_{b, t} ∥_{1},

(40)

ADE reflects overall prediction quality, FDE emphasizes long-horizon accuracy, while RMSE and MAE capture the error distribution from second-order and first-order perspectives, respectively.
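The four accuracy metrics of Equations (38)–(40) can be computed in a few lines. In this sketch the RMSE and MAE are averaged over all 3·B·T_p coordinate entries, matching the normalization in Equation (40); the function name is illustrative.

```python
import numpy as np

def trajectory_metrics(pred, gt):
    """ADE, FDE, RMSE and MAE for arrays of shape (B, T_p, 3)."""
    err = pred - gt
    dist = np.linalg.norm(err, axis=-1)        # (B, T_p) Euclidean errors
    return {
        "ADE": dist.mean(),                    # Eq. (38)
        "FDE": dist[:, -1].mean(),             # Eq. (39), last step only
        "RMSE": np.sqrt((err ** 2).mean()),    # Eq. (40), mean over B * T_p * 3 entries
        "MAE": np.abs(err).mean(),             # Eq. (40), mean over B * T_p * 3 entries
    }
```

For a constant offset of (3, 4, 0) m at every step, ADE and FDE both equal 5 m, while RMSE is about 2.89 m and MAE about 2.33 m, which shows that the coordinate-wise metrics are on a different scale from the displacement errors.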

4.2.2. Uncertainty Reliability

For uncertainty evaluation, the Negative Log-Likelihood (NLL) is used as a fundamental metric for probabilistic prediction:

NLL = \frac{1}{B T_{p}} \sum_{b, t} \frac{1}{2} [\log σ_{b, t}^{2} + \frac{(p_{b, t} - μ_{b, t})^{2}}{σ_{b, t}^{2}}],

(41)

For a 95% confidence interval, the coverage rate is defined as:

{Coverage}_{95} = E [1 (∣ p_{t} - μ_{t} ∣ \leq 1.96 σ_{t})],

(42)

Ideally, this value should be close to the nominal level of 0.95. The calibration error is defined as:

CalibErr = ∣ {Coverage}_{95} - 0.95 ∣,

(43)

which measures the discrepancy between predicted uncertainty and actual error distribution. In the calibration-method comparison, the Expected Calibration Error (ECE) is also used to evaluate the overall reliability of the predicted uncertainty:

ECE = \frac{1}{K} \sum_{k = 1}^{K} |Cov (a_{k}) - a_{k}|,

(44)

where a_{k} denotes the k-th nominal confidence level, Cov (a_{k}) is the empirical coverage at this confidence level, and K is the number of evaluated confidence levels.
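The coverage and ECE definitions in Equations (42)–(44) can be sketched as follows. Two choices are assumptions of this example: coverage is counted per coordinate using central Gaussian intervals, and the grid of confidence levels in `levels` is illustrative, since the text does not list the levels used.

```python
from statistics import NormalDist
import numpy as np

def coverage(y, mu, sigma, level=0.95):
    """Fraction of coordinates inside the central Gaussian interval at `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)     # about 1.96 for level = 0.95
    return float(np.mean(np.abs(y - mu) <= z * sigma))

def expected_calibration_error(y, mu, sigma, levels=(0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99)):
    """Eq. (44): mean absolute gap between empirical and nominal coverage."""
    return float(np.mean([abs(coverage(y, mu, sigma, a) - a) for a in levels]))
```

The calibration error of Equation (43) is the single-level special case, `abs(coverage(y, mu, sigma, 0.95) - 0.95)`, whereas the ECE summarizes the gap along the whole calibration curve.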

4.2.3. Beam Control Utility

From a system perspective, the practical value of prediction is evaluated in the context of ISAC beam management. At time t, let the selected beam be b_{t}, the predicted position be μ_{t}, the ground-truth position be x_{t}, and the link distance be d_{t}. The Coverage Rate is defined as:

CoverageRate = \frac{1}{T} \sum_{t = 1}^{T} 1 (∥ x_{t} - μ_{t} ∥ \leq d_{t} \cdot \tan (θ_{t} / 2)),

(45)

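where 1 (\cdot) is the indicator function, θ_{t} is the beamwidth of the selected beam b_{t}, and T is the number of evaluated time steps. Next, the Mean Beam Gain measures the communication capability of the selected beams. The gain of the beam selected at time t is:

G (θ_{t}) = 10 \log_{10} (\frac{29000}{θ_{t}^{2}}),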

(46)

The average gain is defined as:

MeanGain = \frac{1}{T} \sum_{t = 1}^{T} G (θ_{t}),

(47)

The Mean Beam Width reflects the trade-off between coverage and gain:

MeanBeamWidth = \frac{1}{T} \sum_{t = 1}^{T} θ_{t},

(48)

Smaller values indicate a preference for narrow beams (higher gain), while larger values indicate a preference for robust coverage.
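The three beam-control metrics of Equations (45)–(48) can be computed from a log of beam decisions, as in this sketch. Beamwidths in degrees and the function name are assumptions of the example.

```python
import numpy as np

def beam_metrics(x_true, mu, d, theta_deg):
    """Coverage rate, mean gain and mean beamwidth over T decisions.

    x_true, mu: (T, 3) positions; d: (T,) link distances; theta_deg: (T,) selected beamwidths.
    """
    theta_deg = np.asarray(theta_deg, dtype=float)
    err = np.linalg.norm(np.asarray(x_true) - np.asarray(mu), axis=-1)
    covered = err <= np.asarray(d) * np.tan(np.radians(theta_deg) / 2.0)   # Eq. (45)
    gain_db = 10.0 * np.log10(29000.0 / theta_deg ** 2)                    # Eq. (46)
    return {
        "CoverageRate": float(covered.mean()),
        "MeanGain": float(gain_db.mean()),          # Eq. (47)
        "MeanBeamWidth": float(theta_deg.mean()),   # Eq. (48)
    }
```

Reporting the three values together makes the trade-off visible: a controller can raise the coverage rate simply by widening every beam, but that shows up immediately as a larger mean beamwidth and a lower mean gain.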

4.3. Prediction Performance

Table 1 reports the overall performance of the proposed method and the baselines. Performance on the full test set improves progressively across the models. The traditional constant-velocity model (CV Baseline) performs worst across all error metrics. In contrast, the LSTM Baseline, by modeling historical temporal states, validates the effectiveness of temporal dynamic modeling for short-term UAV trajectory prediction. On this basis, the Gated (Deterministic) model, which incorporates visual information, further reduces the FDE, indicating that environmental context plays a supplementary role in future trajectory inference. When the model is extended from deterministic prediction to probabilistic prediction, the Gated (Uncertainty) model continues to improve across all four error metrics, indicating that explicit uncertainty modeling not only enhances distributional representation capability but also benefits point prediction accuracy. The proposed CA-Gated model achieves the best overall performance among all methods. Compared with the Gated (Uncertainty) model, it reduces ADE and FDE by approximately 2.48% and 2.32%, respectively. Its NLL is also improved, indicating better distributional fitting. However, the raw uncertainty is still not sufficiently calibrated, which motivates the conformal calibration step in Section 4.4.

To avoid over-interpreting the single-run results in Table 1, we further conduct a robustness analysis over five repeated runs. The same data split and hyperparameter settings are used. Only the random seed is changed. The results in Table 2 show the same trend as Table 1. CA-Gated performs better than Gated-Unc in prediction accuracy, distribution fitting, and uncertainty coverage. The standard deviations are small. This means that the results are stable under different random seeds. The p-values further suggest that the differences between CA-Gated and Gated-Unc are statistically meaningful in these repeated runs. Still, the improvement should be interpreted carefully. It is not a breakthrough-level gain in point prediction accuracy. It is a moderate but consistent improvement. More importantly, the gain appears together with better probabilistic quality and higher uncertainty coverage. This supports the use of CA-Gated as the prediction module in the following calibration and beam management pipeline.

To further analyze the statistical characteristics of model performance, the distributions of ADE and FDE over all test samples are examined, as illustrated in Figure 3.

For ADE, the CV model exhibits the most dispersed error distribution. Its median is relatively high, and the interquartile range (IQR) is wide. A large number of high-error outliers can also be observed. These results indicate unstable performance, especially under complex motion conditions. The LSTM model shows a more concentrated distribution compared with CV. The median and spread are both reduced, suggesting improved consistency due to temporal modeling. However, a noticeable long-tail behavior still exists. High-error samples are not fully eliminated. In contrast, the proposed CA-Gated model demonstrates a more compact ADE distribution. The IQR is further reduced, and the median is lower than those of the other models. The number of extreme outliers also decreases. These observations suggest that prediction errors are more consistently controlled within a limited range.

For FDE, the differences between models become more evident. The CV model shows large variability in endpoint prediction, with a wide spread and clear long-tail behavior. The LSTM model reduces both the median and the spread. However, high-error samples are still present. The CA-Gated model exhibits the most concentrated FDE distribution among the three methods. The median is further reduced, and the overall spread becomes narrower. The upper range of errors is also lower, and fewer extreme outliers are observed. These results suggest that the model can better control error accumulation over longer prediction horizons.

Overall, the distributional analysis of ADE and FDE shows that the proposed method achieves lower prediction errors for a large portion of test samples. The error distribution is also more concentrated, indicating improved consistency across different trajectories. The model exhibits more stable behavior under challenging or highly dynamic motion patterns. These properties are beneficial for downstream uncertainty-aware prediction. They may also support more reliable decision-making in ISAC-related resource allocation tasks.

4.4. Uncertainty Reliability and Calibration

For the UAV trajectory prediction and ISAC integration scenario considered in this work, it is particularly important that the predicted uncertainty exhibits both statistical consistency and interpretability. Based on this motivation, this section provides a systematic analysis of the uncertainty quality of the proposed probabilistic model.

4.4.1. Why Raw Gaussian Uncertainty Is Insufficient

We further examine whether the raw Gaussian uncertainty can serve as a useful indicator of prediction risk from three aspects, including its relationship with prediction error, distributional behavior, and calibration properties.

First, we analyze the relationship between uncertainty and prediction error, as illustrated in Figure 4. The sample-wise predicted uncertainty shows a positive correlation with the corresponding ADE. Samples with higher predicted uncertainty tend to be associated with larger prediction errors. In contrast, samples with lower uncertainty are mostly concentrated in low-error regions. The binned statistics provide a consistent observation. As the predicted standard deviation increases, the average ADE also shows an increasing trend. This relationship appears approximately monotonic across different uncertainty intervals. The Pearson correlation coefficient r lies between approximately 0.43 and 0.51. This indicates a moderate positive correlation between uncertainty and prediction error. It suggests that the predicted uncertainty can reflect relative differences in prediction difficulty.

However, this relationship is not perfect. Therefore, the raw uncertainty should be interpreted as a relative indicator rather than a strictly calibrated measure of risk.

Further analysis across different difficulty levels shows a consistent trend among uncertainty, prediction error, and task difficulty. As the difficulty level in Figure 5 shifts from Easy to Difficult, both prediction error and uncertainty increase, while the coverage rate gradually decreases to 0.8969. This indicates the presence of under-coverage in more complex scenarios. These observations further confirm that although raw Gaussian uncertainty captures relative difficulty, its confidence intervals are not well aligned across different regions of the data distribution.

Based on these observations, the model outputs are not assumed to be inherently well calibrated. The raw Gaussian uncertainty is therefore treated as a useful but imperfect proxy of prediction risk. If used directly in the beamwidth control strategy described in Section 3.4, such uncertainty may introduce bias in risk estimation. This may, in turn, influence the stability of decision-making under certain conditions. It is beneficial to further improve calibration through post-processing, while preserving its ranking capability. This allows the uncertainty estimates to be more consistently applied in downstream uncertainty-aware beam management.

4.4.2. Effect of Conformal Calibration

To enable uncertainty to serve as a probabilistically interpretable risk measure, conformal calibration is applied as a post-processing step. Its effect is evaluated from three perspectives: coverage, calibration behavior, and distribution quality.

The analysis results from the perspective of coverage are shown in Table 3. After conformal calibration, the coverage improves noticeably. Coverage@90 increases by approximately 5.98%, while Coverage@95 improves by about 7.12%. The improvement at higher confidence levels is smaller, with Coverage@99 increasing by around 4.90%. These results indicate that calibration helps reduce under-coverage and brings the empirical coverage closer to the nominal confidence levels.

However, the calibrated results also tend to slightly exceed the target coverage, especially at higher confidence levels. This suggests a more conservative behavior after calibration. Overall, conformal calibration improves the reliability of uncertainty intervals, although it may introduce a certain degree of over-coverage.

From a global calibration perspective, the calibration curves before and after post-processing are shown in Figure 6. The original (pre-calibration) Gaussian curve deviates from the ideal diagonal line y = x. The empirical coverage lies below the ideal line in the mid-to-high confidence regions. This indicates under-coverage in the original uncertainty estimates. After conformal calibration, the empirical coverage curve shifts upward. The calibrated curve becomes closer to the ideal line across most confidence levels. In some regions, it slightly exceeds the ideal line, indicating a tendency toward over-coverage. These results suggest that calibration improves the alignment between predicted and empirical coverage. At the same time, the calibrated intervals become more conservative. This behavior may be beneficial in risk-sensitive settings, where slightly higher coverage is often preferred.

The main purpose of split conformal calibration is to reduce under-coverage and provide a more conservative spatial risk bound for downstream beam management. After calibration, the empirical coverage at high confidence levels increases noticeably, indicating that the calibrated uncertainty intervals better contain the true future positions. It should be noted that conformal calibration is not intended to improve all probabilistic metrics simultaneously. Since the calibrated intervals are expanded according to validation residuals, the resulting uncertainty bounds become more conservative. This may reduce distribution sharpness, but it is beneficial for risk-sensitive beam management, where underestimating spatial uncertainty can lead to beam outage or severe beam misalignment.

Therefore, in this work, conformal calibration is evaluated not only by trajectory-level coverage, but also by its downstream effect on beam coverage, outage, and misalignment. A more detailed comparison with other post hoc calibration methods is provided in Section 4.4.3.

4.4.3. Comparison with Alternative Calibration Methods

To further examine the choice of split conformal calibration, several post hoc calibration methods are compared in this section. The trajectory prediction model is fixed. All methods use the same predicted mean, predicted variance, validation set, and test set. Only the uncertainty calibration step is changed. The compared methods include raw Gaussian uncertainty, temperature scaling, isotonic calibration, and split conformal calibration. Raw Gaussian directly uses the uncertainty predicted by the model. Temperature scaling applies a global scaling factor to the predicted uncertainty. Isotonic calibration learns a monotonic calibration mapping from the validation set. Split conformal calibration constructs prediction intervals using normalized residuals from the validation set.
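The split conformal step with normalized residuals can be sketched as follows. This is a minimal one-dimensional illustration, not the authors' implementation: the helper names, the synthetic validation and test data, and the choice of the score |error| / σ are assumptions made for the example.

```python
import math
import random


def conformal_scale(val_errors, val_sigmas, alpha=0.05):
    """Factor q such that |error| <= q * sigma holds with about (1 - alpha) coverage."""
    # Normalized residuals on the held-out validation (calibration) split.
    scores = sorted(abs(e) / s for e, s in zip(val_errors, val_sigmas))
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # finite-sample corrected rank
    if k > n:
        return math.inf  # too few calibration samples for this alpha
    return scores[k - 1]


def coverage(errors, sigmas, q):
    """Empirical coverage of the interval mean +/- q * sigma."""
    return sum(abs(e) <= q * s for e, s in zip(errors, sigmas)) / len(errors)


rng = random.Random(1)


def draw(n):
    # Synthetic overconfident model: true spread is 1.3 times the reported sigma.
    sig = [rng.uniform(0.5, 1.5) for _ in range(n)]
    err = [rng.gauss(0.0, 1.3 * s) for s in sig]
    return err, sig


val_err, val_sig = draw(2000)
test_err, test_sig = draw(2000)

q = conformal_scale(val_err, val_sig, alpha=0.05)
raw_cov = coverage(test_err, test_sig, 1.96)  # raw Gaussian 95% interval
cal_cov = coverage(test_err, test_sig, q)     # conformally rescaled interval
print(f"q={q:.3f}  raw={raw_cov:.3f}  calibrated={cal_cov:.3f}")
```

In this sketch the conformal factor replaces the Gaussian quantile 1.96, so the interval widens exactly as much as the validation residuals require. No retraining of the predictor is involved.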

In Table 4, Coverage denotes the empirical coverage at the 95% confidence level, and Width@95 denotes the average width of the 95% prediction interval. Beam Coverage and Beam Outage are obtained by using the calibrated uncertainty in the same beam management strategy, with the predicted mean, beam codebook, and beam-control parameters kept unchanged. The raw Gaussian output gives the best NLL and the narrowest interval. However, its coverage is only 0.9055, which is below the nominal 95% level, so the original Gaussian uncertainty under-covers the true positions. For beam management, this is risky: an underestimated uncertainty region may lead to an overly narrow beam and increase outage. Temperature scaling improves the coverage to 0.9642, and its beam coverage also increases compared with the raw Gaussian result. However, its ECE becomes higher. This shows that a single global scale can improve the 95% coverage, but it may not correct the whole calibration curve well. Isotonic calibration gives the lowest ECE, which indicates good overall calibration consistency. Its beam coverage and beam outage are also close to the best results, although its coverage is slightly lower than that of split conformal calibration. Split conformal calibration achieves the highest 95% coverage, the best beam coverage, and the lowest beam outage. Its interval is the widest, and its NLL is slightly worse than the raw Gaussian result. This is expected, because split conformal calibration is more conservative.

For the beam management task considered in this paper, this conservativeness is acceptable. The beam controller needs a reliable spatial risk bound. It does not only need the sharpest distribution. If uncertainty is underestimated, the selected beam may fail to cover the actual UAV position. Therefore, split conformal calibration is selected as the main calibration method in this work. These results also show that split conformal calibration is not universally better in all metrics. Its main advantage lies in reliable coverage and downstream beam-control robustness.

To further evaluate the proposed predictor, we compare it with representative uncertainty-aware baselines, including LSTM-Gaussian NLL, MC Dropout, and Deep Ensemble. All methods are evaluated under the same data split, prediction horizon, and beam-management setting.

The results in Table 5 show a clear trade-off among prediction accuracy, uncertainty reliability, and computational cost. CA-Gated achieves the most accurate trajectory prediction among the compared methods. This indicates that the cross-attention and gated fusion structure can better exploit the complementary information between motion states and visual context. However, its raw Gaussian uncertainty is not the most reliable. MC Dropout and Deep Ensemble provide better raw coverage, but they require repeated stochastic inference or multiple models, which increases online computational cost. From the beam-management perspective, CA-Gated achieves performance close to Deep Ensemble while using much lower inference time. This suggests that the proposed predictor is more suitable for online UAV-assisted ISAC scenarios, where both prediction quality and deployment efficiency are important. At the same time, the lower raw coverage of CA-Gated also confirms that likelihood-trained uncertainty should not be directly used as a control risk bound.

Overall, this comparison supports the design of the proposed pipeline. CA-Gated is used to provide accurate and efficient probabilistic trajectory prediction, while conformal calibration is further applied to improve the reliability of its uncertainty before beam-control decisions. In this way, the framework balances prediction accuracy, uncertainty reliability, and practical deployment cost.

4.4.4. Reliability Across Step, Dimension, and Difficulty

To assess whether the observed improvements are consistent and robust, we further examine the behavior of conformal calibration across spatial dimensions, prediction horizons, and task difficulty levels.

From the spatial perspective, as illustrated in Table 6, the coverage rates and calibration errors exhibit certain variations across different coordinate dimensions. After calibration, the coverage in all three dimensions increases and becomes more conservative. However, the calibration error does not decrease uniformly across dimensions. In particular, the z dimension shows over-coverage after calibration, indicating that the conformal interval is conservative in this direction. This result suggests that calibration improves coverage robustness, but may sacrifice sharpness or dimension-wise calibration balance.

From the temporal perspective, as shown in Table 7, both uncertainty and coverage exhibit clear trends across different prediction steps. As the prediction horizon increases, the prediction error gradually grows, and the predicted uncertainty increases accordingly. However, under the raw Gaussian output, under-coverage appears in later prediction steps. This effect is less noticeable at early steps. After applying conformal calibration, the coverage improves across the entire horizon. The calibrated coverage values become closer to the target confidence level, especially at longer horizons. This suggests that calibration helps reduce the mismatch between predicted uncertainty and actual error in long-term prediction.
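A per-step coverage breakdown of this kind can be computed by grouping samples by prediction step. The sketch below is a hypothetical illustration: the record layout (step, absolute error, predicted σ), the toy values, and the two interval factors are assumptions and do not reproduce Table 7.

```python
from collections import defaultdict


def coverage_by_step(records, q):
    """Empirical coverage of |error| <= q * sigma, grouped by prediction step.

    records: iterable of (step, abs_error, sigma) tuples.
    q: interval half-width expressed in units of sigma.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for step, abs_error, sigma in records:
        totals[step] += 1
        hits[step] += abs_error <= q * sigma
    return {step: hits[step] / totals[step] for step in sorted(totals)}


# Toy records (illustrative only): the later step has one large error.
records = [
    (1, 0.5, 1.0),
    (1, 1.0, 1.0),
    (2, 2.5, 1.0),
    (2, 1.0, 1.0),
]

raw = coverage_by_step(records, q=1.96)        # raw Gaussian 95% factor
calibrated = coverage_by_step(records, q=2.6)  # hypothetical conformal factor
print(raw, calibrated)
```

With these toy values, only the later step is under-covered by the raw interval, and the wider calibrated interval recovers it, which mirrors the qualitative trend described above.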

Under the most challenging difficulty level, a similar pattern can be observed. The raw uncertainty shows under-coverage, while the calibrated results move closer to the desired confidence level. The improvement is still observable under these conditions, although the intervals become more conservative.

In summary, the effect of conformal calibration is not limited to aggregated statistics. Similar trends can be observed across spatial dimensions, prediction horizons, and difficulty levels. The calibration step improves coverage consistency under different conditions, while introducing a certain degree of conservativeness. These properties may be useful for downstream applications that rely on stable uncertainty estimates.

4.5. Beam Management Results

4.5.1. Beam Management Performance in High-Risk Scenarios

To examine whether predictive uncertainty can be effectively utilized in downstream communication decisions, an adaptive beam management experiment is conducted based on predicted trajectories and their associated uncertainty.

The results in Table 8 show that uncertainty-aware strategies influence beam selection behavior and communication outcomes. Compared with the Adaptive Utility + Codebook baseline, the proposed Adaptive Conformal-Risk + Codebook strategy increases coverage by approximately 24.36%, while reducing outage and catastrophic misalignment by 38.74% and 43.32%, respectively.

These improvements demonstrate that incorporating calibrated uncertainty enables more robust beam selection under high-risk conditions. At the same time, this increased reliability comes with a moderate reduction in efficiency, as reflected by an approximately 4.89% decrease in overall utility.

This behavior indicates a clear trade-off between robustness and efficiency, where the proposed method prioritizes reliability by adaptively widening the beam based on calibrated uncertainty, rather than directly optimizing communication gain. Overall, the results suggest that calibrated uncertainty provides an effective mechanism for controlling this trade-off in dynamic UAV communication scenarios.

To further justify the parameter setting in the utility function, we conduct a sensitivity analysis on the outage penalty and switching penalty. The tested settings represent different operating preferences. The efficiency-oriented setting gives more priority to beam gain. The reliability-oriented setting corresponds to the default configuration used in the main high-risk evaluation in Table 8. The balanced and stronger reliability-oriented settings are reported as sensitivity settings to show how utility weights affect the trade-off between communication efficiency and robustness.

Table 9 shows a clear trade-off. When the outage penalty is small, the controller tends to select narrower beams. This leads to higher beam gain and higher utility. However, the coverage is lower, and the outage is higher. When the outage penalty increases, the controller becomes more conservative. It selects wider beams to cover a larger spatial risk region. As a result, coverage improves and outage decreases. The cost is also clear. Mean gain becomes lower, and the final utility decreases. This trend is consistent with the design of the utility function. The weights do not act as arbitrary parameters. They control the operating preference between efficiency and reliability. A smaller outage penalty is suitable when link gain is more important. A larger outage penalty is suitable when link robustness is preferred.
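The interaction between the outage penalty and the selected beamwidth can be illustrated with a small codebook search. The sketch below is not the utility function used in this paper. It assumes an idealized gain model (gain in dB proportional to log10(360 / beamwidth)), a zero-mean Gaussian angular error with calibrated standard deviation sigma_deg, and hypothetical penalty values and codebook entries.

```python
import math
from statistics import NormalDist


def select_beam(codebook_deg, sigma_deg, prev_width=None,
                outage_penalty=50.0, switch_penalty=0.5):
    """Pick the beamwidth in the codebook that maximizes a simple utility."""
    best_width, best_utility = None, -math.inf
    error_model = NormalDist(0.0, sigma_deg)
    for width in codebook_deg:
        # Idealized gain: a narrower beam concentrates more power (assumption).
        gain_db = 10.0 * math.log10(360.0 / width)
        # Probability that the angular error exceeds half the beamwidth.
        p_out = 2.0 * (1.0 - error_model.cdf(width / 2.0))
        # Penalize changing the beam relative to the previous decision.
        switch = 0.0 if prev_width in (None, width) else 1.0
        utility = gain_db - outage_penalty * p_out - switch_penalty * switch
        if utility > best_utility:
            best_width, best_utility = width, utility
    return best_width


codebook = [5, 10, 20, 40]  # hypothetical beamwidths in degrees

low_risk = select_beam(codebook, sigma_deg=2.0)
high_risk = select_beam(codebook, sigma_deg=6.0)
gain_first = select_beam(codebook, sigma_deg=6.0, outage_penalty=5.0)
print(low_risk, high_risk, gain_first)
```

Under these assumptions, a larger calibrated uncertainty or a larger outage penalty pushes the choice toward wider beams, while a small outage penalty favors the narrowest beam. The search only evaluates the finite codebook, so the result is the best candidate within that codebook rather than a global optimum.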

Overall, this analysis supports the assumption behind the beam management experiment. The communication improvement comes from a controllable risk-aware beamwidth adjustment, not from a fixed or manually selected beam behavior.

4.5.2. Sensitivity to Channel Assumptions

The beam management evaluation in this work is simulation-based. To further examine whether the communication results are overly dependent on a single ideal channel setting, we test the proposed strategy under several simulated channel assumptions. The prediction model, calibrated uncertainty, beam codebook, and utility weights are kept unchanged. Only the channel condition is changed.

The results in Table 10 show a clear trend. Mild shadowing has little influence on coverage and outage. This is reasonable, because shadowing mainly changes link gain, while the geometric beam coverage is still determined by the predicted position and calibrated uncertainty. When NLOS loss is added, the mean effective link gain drops clearly, and the utility also decreases. However, coverage and outage remain close to the nominal case. This suggests that the beamwidth decision still provides a stable spatial coverage region, even when the channel gain becomes worse. Under the severe channel setting, the performance degradation becomes more visible: coverage decreases, outage increases, and the mean gain and utility also drop. This result is expected, because strong channel degradation cannot be fully compensated by trajectory prediction or beamwidth adaptation alone.

The proposed beam management strategy remains reasonably stable under different simulated channel assumptions. The results also show the limitation of the current evaluation. The method mainly addresses trajectory-uncertainty-induced beam risk. It does not replace robust channel estimation, blockage prediction, or physical-layer beamforming optimization. Real-world channel validation will be considered in future work.

4.6. Ablation Studies

To better understand the contribution of each key component in the proposed multimodal prediction framework, a series of ablation studies are conducted. The analysis focuses on three aspects: feature aggregation, gating mechanism, and fusion strategy. In addition, under the probabilistic modeling setting, the impact of different structures on uncertainty quality is also examined.

4.6.1. Effect of Feature Aggregation on Prediction Accuracy and Uncertainty Reliability

To further investigate the impact of feature aggregation strategies, we conduct a unified analysis of prediction accuracy and uncertainty reliability, as summarized in Table 11. The results show that different aggregation strategies lead to distinct trade-offs between these two aspects. The mean pooling strategy provides relatively stable performance but lacks flexibility in adapting to dynamic motion patterns, resulting in limited improvements in both accuracy and uncertainty calibration. In contrast, the last-step pooling strategy better captures recent motion dynamics, leading to improved probabilistic modeling quality, but with slightly less stable overall prediction performance. The cross-attention-based aggregation achieves a more balanced behavior across all evaluation metrics. By dynamically selecting relevant visual context conditioned on motion states, it enhances both prediction accuracy and uncertainty reliability without introducing significant degradation in either aspect. This indicates that structured cross-modal interaction is more effective than simple temporal aggregation in capturing the complex relationships between motion and environment.

Overall, these results suggest that feature aggregation plays a critical role not only in prediction accuracy, but also in shaping the quality and reliability of uncertainty estimation. The proposed cross-attention mechanism provides a practical way to balance these two objectives.

4.6.2. Effect of Gating Strength

To further examine the role of perceptual information in trajectory prediction, the gating coefficient α is varied to control the contribution of visual features. Table 12 shows a consistent improvement as α increases. When α = 0, the model performs the worst. This suggests that motion features alone may not be sufficient in more complex scenarios. Introducing a small amount of visual information leads to noticeable improvements across all metrics, suggesting that environmental perception effectively complements motion features. As the gating strength increases, performance continues to improve. The best results are obtained at α = 0.2. Compared with α = 0, ADE and FDE are reduced by approximately 6.46% and 6.40%, respectively. This suggests that incorporating visual information with an appropriate weight can improve prediction accuracy.

These results indicate that the contribution of perceptual features needs to be balanced. A moderate gating strength leads to a more stable combination of motion and visual information. Larger weights do not necessarily lead to proportional improvements, indicating a saturation effect under the current setting.
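To make the role of the gating coefficient concrete, the following sketch shows one possible form of gated fusion in which α scales a gated visual contribution that is added to the motion features. The exact gate used in the CA-Gated model is not specified in this section, so the sigmoid gate, the function names, and the feature values below are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(motion, visual, alpha):
    """Element-wise fusion: motion + alpha * gate * visual."""
    fused = []
    for m, v in zip(motion, visual):
        # Hypothetical gate; in a trained model this would be a learned layer.
        gate = sigmoid(m + v)
        fused.append(m + alpha * gate * v)
    return fused


motion = [0.5, -1.0, 2.0]   # toy motion features
visual = [1.0, 0.5, -0.5]   # toy visual features

motion_only = gated_fusion(motion, visual, alpha=0.0)
fused = gated_fusion(motion, visual, alpha=0.2)
print(motion_only, fused)
```

With α = 0 the fused features reduce to the motion features, which corresponds to the motion-only setting. With a small α, the visual branch can only shift each feature by at most α times the magnitude of the visual feature, because the gate lies between 0 and 1.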

4.6.3. Effect of Fusion Strategy

To evaluate the impact of different multimodal fusion strategies on trajectory prediction performance, three approaches are compared: gated fusion, simple additive fusion, and a motion-only setting without visual input. The results are summarized in Table 13.

The additive fusion strategy introduces visual information through simple feature combination. This leads to a moderate reduction in prediction errors compared with the no-fusion setting. However, the improvement remains limited, indicating that direct feature addition may not sufficiently model interactions between modalities.

The gated fusion strategy shows further improvements over the other two approaches. Compared with the no-fusion setting, ADE and FDE are reduced by approximately 5.06% and 4.85%, respectively. This suggests that adaptive weighting of different modalities can better utilize complementary information.

Overall, these results show that simple feature aggregation cannot fully exploit multimodal information in UAV trajectory prediction. The gating mechanism, by adaptively weighting different modalities, enables more effective information integration and leads to better overall performance.

4.7. Complexity and Practical Deployment Analysis

To further evaluate the practical feasibility of the proposed framework, the computational complexity of different models is compared in this section. The comparison includes model parameters, MACs, inference time, memory cost, and beam decision cost. The results are shown in Table 14.

The CV baseline has the lowest cost, but it does not provide learnable temporal representation or uncertainty estimation. The LSTM baseline adds temporal modeling with a small computational burden. The gated models further introduce multimodal fusion and probabilistic prediction. The proposed CA-Gated model has the largest parameter count among the compared neural models. This is expected because it includes cross-attention, gated fusion, and uncertainty output. Even so, its model size remains below 1 MB, and its inference time is below 1 ms per sample. This shows that the added modules do not introduce a heavy online prediction burden.

The beam decision cost is also moderate. The proposed method uses a finite codebook instead of continuous beam search. Thus, the beam decision only needs to evaluate a limited number of candidate beams. The calibration step is lightweight. Split conformal calibration is performed after model training and only uses validation residuals to estimate calibration factors. No model retraining is required. During inference, the online calibration cost is negligible. The proposed beam selection does not claim global optimality in the continuous beamforming space. It is optimal only within the predefined codebook and the specified utility function. This design reduces online decision cost and improves deployment feasibility, but it does not provide a global optimality guarantee.

Overall, the proposed CA-Gated model is more complex than simpler baselines, but the additional cost is controlled. The model remains small, the inference time is short, and the beam decision is based on finite candidate search. These results support the practical feasibility of the proposed framework.

4.8. Temporal Resolution Sensitivity Analysis

To examine the influence of temporal resolution, we further evaluate the proposed framework under different sampling rates. Three representative settings are tested: 5 Hz, 10 Hz, and 20 Hz. The number of history and prediction steps is kept unchanged in this experiment. Therefore, different sampling rates correspond to different physical prediction horizons. At 5 Hz, the prediction horizon is 2.0 s. At 10 Hz, it is 1.0 s. At 20 Hz, it is 0.5 s. The results are shown in Table 15.
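The relation between sampling rate and physical prediction horizon is simple arithmetic, shown below. The value of 10 prediction steps is inferred from the horizons stated above (2.0 s at 5 Hz) and is not given explicitly in this section.

```python
def horizon_seconds(pred_steps, rate_hz):
    """Physical prediction horizon covered by a fixed number of steps."""
    return pred_steps / rate_hz


# Inferred from "2.0 s at 5 Hz"; treat as an assumption.
PRED_STEPS = 10

horizons = {rate: horizon_seconds(PRED_STEPS, rate) for rate in (5, 10, 20)}
for rate, seconds in horizons.items():
    print(f"{rate} Hz -> {seconds:.1f} s")
```

Because the step count is fixed, halving the sampling rate doubles the time span the model must predict, which is why the 5 Hz setting is the hardest of the three.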

The 5 Hz setting gives the weakest performance. This is expected. A lower sampling rate makes the temporal interval larger. With the same number of prediction steps, the model needs to predict a longer physical time span. Prediction errors accumulate more easily. The uncertainty estimation also becomes less reliable. This leads to lower coverage and poorer beam management performance. The 10 Hz setting gives the best ADE, coverage, and calibration error among the tested settings. This indicates that 10 Hz provides a stable balance between trajectory prediction accuracy and uncertainty reliability. It also avoids the long-horizon error accumulation observed at 5 Hz. The 20 Hz setting achieves better FDE and beam management results. However, its prediction horizon is only 0.5 s. The task is shorter and easier from the beam-control perspective. Therefore, these beam results should not be interpreted as evidence that 20 Hz is universally better. Its coverage and calibration error are also slightly worse than those of the 10 Hz setting.

Overall, the results show that temporal resolution has a clear influence on prediction accuracy, uncertainty reliability, and beam management performance. The 10 Hz setting is not claimed to be globally optimal. It is used as the main setting because it provides a reasonable trade-off under the EuRoC multimodal synchronization condition. It keeps the prediction horizon meaningful while maintaining stable uncertainty calibration and acceptable beam-control performance.

5. Conclusions

Point prediction alone is insufficient for UAV-assisted ISAC beam management. In narrow-beam communication systems, even a moderate trajectory prediction error may lead to beam misalignment and link outage. For this reason, this paper studies UAV trajectory prediction not only as a perception task, but also as a control-oriented input for beam management. The main contribution of this work is to construct a prediction–calibration–beam management pipeline, where calibrated trajectory uncertainty is used as a risk signal for codebook-constrained beamwidth selection. The proposed framework does not claim novelty from the isolated use of LSTM, cross-attention, gated fusion, or conformal calibration. Instead, its value lies in connecting probabilistic trajectory prediction with uncertainty-aware communication control.

The experimental results show that the proposed CA-Gated model improves prediction accuracy compared with the Gated Uncertainty variant and representative uncertainty-aware baselines. Its raw uncertainty remains imperfect, which confirms the need for calibration before risk-sensitive beam management. Split conformal calibration improves the reliability of the predicted uncertainty and provides a more conservative risk bound for downstream decision-making. The model reduces ADE and FDE by approximately 2–3% in the main comparison, and the multi-run analysis further shows that the improvement is moderate but consistent. When calibrated uncertainty is used for beam management, the proposed strategy improves coverage and reduces outage in high-risk scenarios. The additional calibration-method comparison, utility-weight sensitivity analysis, and temporal-resolution analysis further show that the performance gain comes from risk-aware beamwidth adaptation rather than from a single fixed parameter setting.

The proposed framework is most useful in high-risk, high-mobility, and uncertainty-sensitive UAV-assisted ISAC scenarios. In these scenarios, the communication system needs to balance beamforming gain and coverage robustness. A narrow beam can improve gain, but it is vulnerable to prediction errors. A wider beam can improve robustness, but it reduces gain. The calibrated uncertainty provides a practical way to control this trade-off. Therefore, the framework is especially suitable for cases where UAV motion is uncertain, beam misalignment is costly, and the system needs an interpretable mechanism to adjust beamwidth according to prediction risk.

This work still has several limitations. The communication evaluation is simulation-based and does not include real UAV mmWave channel measurements. The current calibration method is post hoc, so it does not update online when the data distribution changes. The beam management strategy is also codebook-constrained and does not provide global optimality in the continuous beamforming space. Future work will consider real-world UAV channel measurement, more realistic blockage and NLOS modeling, online uncertainty calibration, and joint prediction-control training. These extensions will be important for moving the proposed framework from simulation-based validation toward practical deployment in UAV-assisted ISAC systems.

Author Contributions

Conceptualization, Q.C. and W.W.; methodology, W.W.; software, W.W.; validation, W.W. and Z.Z.; formal analysis, W.W.; investigation, W.W. and Z.Z.; resources, Q.C.; data curation, W.W.; writing—original draft preparation, W.W.; writing—review and editing, Q.C. and Z.Z.; visualization, W.W.; supervision, Q.C.; project administration, Q.C.; funding acquisition, Q.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project “Key Technologies and Demonstration Applications of Low-Altitude Integrated Flight of Manned and Unmanned Aircraft in Tibet”, grant number XZ202601ZY0091, and the Graduate Research Innovation Project of Civil Aviation Flight University of China.

Data Availability Statement

The EuRoC MAV dataset used in this study is publicly available. The processed data and experimental scripts are available from the corresponding author upon reasonable request.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT-4 for the purposes of improving the language and grammar.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Letaief, K.B.; Chen, W.; Shi, Y.; Zhang, J.; Zhang, Y.-J.A. The Roadmap to 6G: AI Empowered Wireless Networks. IEEE Commun. Mag. 2019, 57, 84–90.
Saad, W.; Bennis, M.; Chen, M. A Vision of 6G Wireless Systems: Applications, Trends, Technologies, and Open Research Problems. IEEE Netw. 2020, 34, 134–142.
Liu, F.; Cui, Y.; Masouros, C.; Xu, J.; Han, T.X.; Eldar, Y.C.; Buzzi, S. Integrated Sensing and Communications: Toward Dual-Functional Wireless Networks for 6G and Beyond. IEEE J. Sel. Areas Commun. 2022, 40, 1728–1767.
Zeng, Y.; Zhang, R.; Lim, T.J. Wireless Communications with Unmanned Aerial Vehicles: Opportunities and Challenges. IEEE Commun. Mag. 2016, 54, 36–42.
Mozaffari, M.; Saad, W.; Bennis, M.; Nam, Y.-H.; Debbah, M. A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems. IEEE Commun. Surv. Tutor. 2019, 21, 2334–2360.
Hur, S.; Kim, T.; Love, D.J.; Krogmeier, J.V.; Thomas, T.A.; Ghosh, A. Millimeter Wave Beamforming for Wireless Backhaul and Access in Small Cell Networks. IEEE Trans. Commun. 2013, 61, 4391–4403.
Ghosh, A.; Thomas, T.A.; Cudak, M.C.; Ratasuk, R.; Moorut, P.; Vook, F.W.; Rappaport, T.S.; MacCartney, G.R.; Sun, S.; Nie, S. Millimeter-Wave Enhanced Local Area Systems: A High-Data-Rate Approach for Future Wireless Networks. IEEE J. Sel. Areas Commun. 2014, 32, 1152–1163.
Va, V.; Heath, R.W. Basic Relationship between Channel Coherence Time and Beamwidth in Vehicular Channels. In 2015 IEEE 82nd Vehicular Technology Conference (VTC2015-Fall); IEEE: New York, NY, USA, 2015; pp. 1–5.
Alkhateeb, A.; Charan, G.; Osman, T.; Hredzak, A.; Morais, J.; Demirhan, U.; Srinivas, N. DeepSense 6G: A Large-Scale Real-World Multi-Modal Sensing and Communication Dataset. IEEE Commun. Mag. 2023, 61, 122–128.
Morais, J.; Behboodi, A.; Pezeshki, H.; Alkhateeb, A. Position-Aided Beam Prediction in the Real World: How Useful GPS Locations Actually Are? In ICC 2023-IEEE International Conference on Communications; IEEE: New York, NY, USA, 2023; pp. 1824–1829.
Charan, G.; Hredzak, A.; Stoddard, C.; Berrey, B.; Seth, M.; Nunez, H.; Alkhateeb, A. Towards Real-World 6G Drone Communication: Position and Camera Aided Beam Prediction. In GLOBECOM 2022–2022 IEEE Global Communications Conference; IEEE: New York, NY, USA, 2022; pp. 2951–2956.
Jiang, S.; Charan, G.; Alkhateeb, A. LiDAR Aided Future Beam Prediction in Real-World Millimeter Wave V2I Communications. IEEE Wirel. Commun. Lett. 2023, 12, 212–216.
Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35–45.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Alahi, A.; Goel, K.; Ramanathan, V.; Robicquet, A.; Fei-Fei, L.; Savarese, S. Social LSTM: Human Trajectory Prediction in Crowded Spaces. In Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2016; pp. 961–971.
Gupta, A.; Johnson, J.; Fei-Fei, L.; Savarese, S.; Alahi, A. Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2018; pp. 2255–2264.
Lee, N.; Choi, W.; Vernaza, P.; Choy, C.B.; Torr, P.H.S.; Chandraker, M. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents. In Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2017; pp. 336–345.
Salzmann, T.; Ivanovic, B.; Chakravarty, P.; Pavone, M. Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2020; pp. 683–700.
Li, Z.; Chen, Z.; Li, Y.; Xu, C. Context-Aware Trajectory Prediction for Autonomous Driving in Heterogeneous Environments. Comput.-Aided Civ. Infrastruct. Eng. 2024, 39, 120–135.
Mangalam, K.; An, Y.; Girase, H.; Malik, J. From Goals, Waypoints & Paths to Long Term Human Trajectory Forecasting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 15233–15242.
Sadeghian, A.; Kosaraju, V.; Sadeghian, A.; Hirose, N.; Rezatofighi, H.; Savarese, S. SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2019; pp. 1349–1358.
Li, G.; Li, Z.; Knoop, V.L.; van Lint, H. Unravelling Uncertainty in Trajectory Prediction Using a Non-Parametric Approach. Transp. Res. Part C Emerg. Technol. 2024, 163, 104659.
Borja-Jaimes, V.; García-Morales, J.; Escobar-Jiménez, R.F.; Guerrero-Ramírez, G.V.; Adam-Medina, M. A Backstepping Sliding Mode Control of a Quadrotor UAV Using a Super-Twisting Observer. Appl. Sci. 2025, 15, 10120.
Borja-Jaimes, V.; Coronel-Escamilla, A.; Escobar-Jiménez, R.F.; Adam-Medina, M.; Guerrero-Ramírez, G.V.; Sánchez-Coronado, E.M.; García-Morales, J. Fractional-Order Sliding Mode Observer for Actuator Fault Estimation in a Quadrotor UAV. Mathematics 2024, 12, 1247.
Kendall, A.; Gal, Y. What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? In Advances in Neural Information Processing Systems (NeurIPS); Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
Lakshminarayanan, B.; Pritzel, A.; Blundell, C. Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles. In Advances in Neural Information Processing Systems (NeurIPS); Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
Amini, A.; Schwarting, W.; Soleimany, A.; Rus, D. Deep Evidential Regression. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual, 6–12 December 2020; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 14927–14937.
Angelopoulos, A.N.; Bates, S. A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. Found. Trends Mach. Learn. 2023, 16, 494–591.
Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K.Q. On Calibration of Modern Neural Networks. In Proceedings of the 34th International Conference on Machine Learning (ICML); PMLR: Sydney, Australia, 2017; pp. 1321–1330.
Kuleshov, V.; Fenner, N.; Ermon, S. Accurate Uncertainties for Deep Learning Using Calibrated Regression. In Proceedings of the 35th International Conference on Machine Learning (ICML); PMLR: Stockholm, Sweden, 2018; pp. 2796–2804.
Va, V.; Shimizu, T.; Bansal, G.; Heath, R.W. Beam Design for Beam Switching Based Millimeter Wave Vehicle-to-Infrastructure Communications. In 2016 IEEE International Conference on Communications (ICC); IEEE: New York, NY, USA, 2016; pp. 1–6.
Giordani, M.; Polese, M.; Roy, A.; Castor, D.; Zorzi, M. A Tutorial on Beam Management for 3GPP NR at mmWave Frequencies. IEEE Commun. Surv. Tutor. 2019, 21, 173–196.
Xue, Q.; Ji, C.; Ma, S.; Guo, J.; Xu, Y.; Chen, Q.; Zhang, W. A Survey of Beam Management for mmWave and THz Communications Towards 6G. IEEE Commun. Surv. Tutor. 2024, 26, 1520–1559.
Ali, A.; González-Prelcic, N.; Heath, R.W. Millimeter Wave Beam-Selection Using Out-of-Band Spatial Information. IEEE Trans. Wirel. Commun. 2018, 17, 1038–1052.
Kumari, P.; Choi, J.; González-Prelcic, N.; Heath, R.W. IEEE 802.11ad-Based Radar: An Approach to Joint Vehicular Communication-Radar System. IEEE Trans. Veh. Technol. 2018, 67, 3012–3027.
Zhu, X.; Liu, J.; Lu, L.; Zhang, T.; Qiu, T.; Wang, C.; Liu, Y. Enabling Intelligent Connectivity: A Survey of Secure ISAC in 6G Networks. IEEE Commun. Surv. Tutor. 2025, 27, 748–781.
Shao, Z.; Zhao, X.; Liu, Z.; Zhu, X.; Wu, W.; Li, D.; Tian, F. Robust Power Control for Dual-Polarized Holographic MIMO Enabled ISAC Systems. IEEE Trans. Veh. Technol. 2026, 75, 5748–5763.

Figure 1. Overall framework of the proposed uncertainty-aware UAV trajectory prediction and adaptive beam management pipeline.

Figure 2. Architecture of the multimodal probabilistic trajectory prediction model with cross-attention and gated fusion.

Figure 3. Distribution of trajectory prediction errors across different models on the test set.

Figure 4. Relationship between predicted uncertainty and trajectory prediction error (ADE), including sample-wise distribution and binned statistics.

Figure 5. Variation in prediction error, uncertainty, and coverage rate across different difficulty levels. Green, orange, and blue bars correspond to the Easy, Medium, and Difficult trajectory subsets, respectively.

Figure 6. Calibration curves before and after conformal calibration, comparing empirical coverage with nominal confidence levels.

Table 1. Overall trajectory prediction performance comparison across different models on the test set. ADE: Average Displacement Error; FDE: Final Displacement Error; RMSE: Root Mean Square Error; MAE: Mean Absolute Error; NLL: Negative Log-Likelihood; CalibErr: Calibration Error.

Model	ADE (m)	FDE (m)	RMSE (m)	MAE (m)	NLL	CalibErr
CV Baseline	0.1893	0.4350	0.1576	0.0927	N/A	N/A
LSTM Baseline	0.1725	0.3956	0.1405	0.0856	N/A	N/A
Gated (Deterministic)	0.1700	0.3780	0.1382	0.0841	N/A	N/A
Gated (Uncertainty)	0.1575	0.3624	0.1288	0.0785	−1.6430	0.0529
CA-Gated (Ours)	0.1536	0.3540	0.1200	0.0734	−1.6845	0.0445

Table 2. Statistical robustness analysis over five random seeds. ADE: Average Displacement Error; FDE: Final Displacement Error; NLL: Negative Log-Likelihood; Coverage: Empirical Coverage Rate.

Model	ADE (m)	FDE (m)	NLL	Coverage
LSTM Baseline	0.1689 ± 0.0028	0.3915 ± 0.0069	N/A	N/A
Gated (Uncertainty)	0.1594 ± 0.0021	0.3786 ± 0.0060	−1.6014 ± 0.0162	0.9029 ± 0.0028
CA-Gated (Ours)	0.1535 ± 0.0024	0.3524 ± 0.0056	−1.6900 ± 0.0229	0.9057 ± 0.0030
p-value	0.0002	0.0107	<1 × 10⁻⁴	<1 × 10⁻⁴

Table 3. Coverage performance before and after conformal calibration at different confidence levels.

Metric	Pre	Post	Improvement
Coverage@90	0.8668	0.9186	+0.0518
Coverage@95	0.9055	0.9700	+0.0645
Coverage@99	0.9507	0.9973	+0.0466
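
The pre/post coverage figures in Table 3 can be made concrete with a minimal sketch of split conformal calibration applied to Gaussian intervals. It assumes a normalized absolute residual as the nonconformity score and a single scalar scale per confidence level; this section does not state the exact score, so treat both as assumptions. The function names (`conformal_scale`, `empirical_coverage`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def conformal_scale(mu_cal, sigma_cal, y_cal, alpha=0.05):
    """Scale q such that |y - mu| <= q * sigma covers about (1 - alpha) of new points."""
    scores = np.abs(y_cal - mu_cal) / sigma_cal            # assumed nonconformity score
    n = scores.size
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)   # finite-sample correction
    return float(np.quantile(scores, level, method="higher"))

def empirical_coverage(mu, sigma, y, scale):
    """Fraction of targets that fall inside mu +/- scale * sigma."""
    return float(np.mean(np.abs(y - mu) <= scale * sigma))

# Synthetic, overconfident predictor: predicted std 0.10, true spread 0.13.
rng = np.random.default_rng(0)
n = 4000
mu = np.zeros(n)
sigma_pred = np.full(n, 0.10)
y = rng.normal(0.0, 0.13, size=n)

cal, test = slice(0, 2000), slice(2000, None)   # calibration split / held-out split

# Raw Gaussian 95% interval uses the nominal z-value 1.96.
raw_cov = empirical_coverage(mu[test], sigma_pred[test], y[test], 1.96)

# Conformal interval replaces 1.96 with the calibrated scale q.
q = conformal_scale(mu[cal], sigma_pred[cal], y[cal], alpha=0.05)
post_cov = empirical_coverage(mu[test], sigma_pred[test], y[test], q)

print(f"raw coverage  = {raw_cov:.3f}")
print(f"conformal q   = {q:.3f}")
print(f"post coverage = {post_cov:.3f}")
```

In this toy setting the raw interval under-covers because the predicted standard deviation is too small, and the calibrated scale widens the interval until held-out coverage is close to the nominal level. The same trade-off appears in Table 4 as a larger Width@95 after calibration.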

Table 4. Comparison of uncertainty calibration methods.

Method	Coverage	ECE	NLL	Width@95	Beam Coverage	Beam Outage
Raw Gaussian	0.9055	0.0523	−1.6845	0.3637	0.6187	0.3813
Temperature Scaling	0.9642	0.1213	−1.6839	0.5348	0.6946	0.3054
Isotonic Calibration	0.9676	0.0127	−1.6735	0.5506	0.6964	0.3036
Split Conformal	0.9700	0.0131	−1.6635	0.5650	0.6971	0.3029

Table 5. Comparison with representative uncertainty-aware trajectory prediction baselines before conformal calibration.

Method	ADE (m)	FDE (m)	NLL	Coverage	CalibErr	Beam Coverage
LSTM-Gaussian NLL	0.1722	0.3826	−1.7050	0.9194	0.0306	0.5952
MC Dropout	0.1680	0.3746	−1.7782	0.9515	0.0015	0.6049
Deep Ensemble	0.1643	0.3728	−1.7730	0.9410	0.0091	0.6206
CA-Gated (Ours)	0.1536	0.3540	−1.6845	0.9055	0.0445	0.6187

Table 6. Uncertainty reliability analysis across spatial dimensions.

Dimension	Pre Coverage	Post Coverage	Coverage Improvement	Pre Calibration Error	Post Calibration Error	Calibration Error Reduction (Pre − Post)
x	0.8793	0.9933	0.1140	0.0707	0.0433	+0.0274
y	0.8983	0.9987	0.1064	0.0517	0.0487	+0.0029
z	0.9388	1.000	0.0612	0.0112	0.0500	−0.0388

Table 7. Evolution of coverage and calibration error across prediction horizons.

Step	Pre Coverage	Post Coverage	Pre Calibration Error	Post Calibration Error
t = 1	1.000	1.000	0.0500	0.0500
t = 2	1.000	1.000	0.0500	0.0500
t = 3	1.000	1.000	0.0500	0.0500
t = 4	0.9929	1.000	0.0429	0.0500
t = 5	0.9696	1.000	0.0196	0.0500
t = 6	0.9045	0.9985	0.0455	0.0485
t = 7	0.8313	0.9958	0.1187	0.0458
t = 8	0.8021	0.9935	0.1479	0.0435
t = 9	0.7780	0.9923	0.1720	0.0423
t = 10	0.7765	0.9935	0.1735	0.0435

Table 8. Beam management performance under different uncertainty-aware strategies in high-risk scenarios.

Policy	Coverage	Outage	Catastrophic	Mean Beam Width (°)	Mean Gain (dBi)	Utility
Fixed Narrow	0.4825	0.5175	0.3967	5.00	12.81	1.6757
Fixed Wide	0.8136	0.1864	0.1106	60.00	2.02	1.4270
Adaptive Utility + Codebook	0.6187	0.3813	0.2253	16.18	7.79	1.5286
Adaptive Risk + Codebook	0.7198	0.2802	0.1488	31.79	4.97	1.4809
Adaptive Conformal-Risk + Codebook	0.7664	0.2336	0.1277	42.80	3.54	1.4538
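
The relative changes quoted in the Highlights can be reproduced from the tabulated values. The short check below assumes that the beam management comparison is between the Adaptive Utility + Codebook row and the Adaptive Conformal-Risk + Codebook row of Table 8, and that the Coverage@95 figure is a relative change computed from Table 3. The choice of baseline row is an inference from the numbers, not something this section states.

```python
# Values copied from Table 3 (Coverage@95) and Table 8.
pre95, post95 = 0.9055, 0.9700

# Assumed baseline: Adaptive Utility + Codebook (uncalibrated uncertainty).
base = {"coverage": 0.6187, "outage": 0.3813, "catastrophic": 0.2253}
# Adaptive Conformal-Risk + Codebook (calibrated uncertainty).
conf = {"coverage": 0.7664, "outage": 0.2336, "catastrophic": 0.1277}

def rel_change(old, new):
    """Relative change of new with respect to old."""
    return (new - old) / old

cov95_gain = rel_change(pre95, post95)
changes = {k: rel_change(base[k], conf[k]) for k in base}

print(f"Coverage@95:  {cov95_gain:+.1%}")               # +7.1%
print(f"coverage:     {changes['coverage']:+.1%}")      # +23.9%
print(f"outage:       {changes['outage']:+.1%}")        # -38.7%
print(f"catastrophic: {changes['catastrophic']:+.1%}")  # -43.3%

# Coverage and outage are complementary in every row of Table 8.
assert abs(base["coverage"] + base["outage"] - 1.0) < 1e-9
assert abs(conf["coverage"] + conf["outage"] - 1.0) < 1e-9
```

Under this reading, the "about 24%" coverage gain and the "over 38%" reductions are relative changes, while the absolute coverage difference between the two rows is about 0.15.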

Table 9. Sensitivity analysis of utility weights.

Setting	$\lambda_{out}$	$\lambda_{sw}$	Coverage	Outage	Mean Beam Width (°)	Mean Gain (dBi)	Utility
Efficiency-oriented	0.0	0.1	0.7128	0.2872	30.06	5.19	1.5998
Balanced	0.4	0.1	0.7198	0.2802	31.79	4.97	1.4809
Reliability-oriented	0.6	0.1	0.7620	0.2380	41.66	3.64	1.4072
Strong reliability-oriented	0.8	0.1	0.7630	0.2370	41.82	3.62	1.3604

Table 10. Sensitivity analysis under different simulated channel assumptions.

Channel Setting	Shadowing/Extra Loss (dB)	Coverage	Outage	Mean Beam Width (°)	Mean Effective Link Gain	Utility
Nominal Channel	N/A	0.7638	0.2362	42.8	3.54	1.4507
Shadowing Channel	4	0.7621	0.2379	42.8	0.35	1.2738
NLOS Channel	+10	0.7564	0.2436	42.8	−6.46	0.9222
Severe Channel	6/+15	0.7424	0.2576	42.8	−11.50	0.7549

Table 11. Effect of temporal feature aggregation strategies on trajectory prediction performance.

Aggregation	ADE (m)	FDE (m)	RMSE (m)	MAE (m)	NLL	CalibErr	Coverage@95
Mean	0.1535	0.3537	0.1221	0.0738	−1.6678	0.0279	0.9221
Last	0.1536	0.3557	0.1214	0.0734	−1.7749	0.0304	0.9196
Cross-attn	0.1530	0.3488	0.1219	0.0729	−1.7750	0.0274	0.9260

Table 12. Effect of gating strength on multimodal trajectory prediction performance.

Gating Coefficient (α)	ADE (m)	FDE (m)	RMSE (m)	MAE (m)
α = 0.0	0.1797	0.4033	0.1452	0.0891
α = 0.05	0.1752	0.3850	0.1417	0.0865
α = 0.1	0.1713	0.3851	0.1406	0.0865
α = 0.2	0.1681	0.3775	0.1381	0.0865

Table 13. Synergy between feature structure and uncertainty modeling under different fusion strategies.

Fusion Strategy	ADE (m)	FDE (m)	RMSE (m)	MAE (m)
Gate	0.1650	0.3745	0.1339	0.0820
Additive	0.1730	0.3865	0.1406	0.0861
None	0.1738	0.3936	0.1451	0.0862

Table 14. Computational complexity and practical deployment cost comparison of different models.

Model	Params	MACs (M)	Inference Time (ms)	Memory (MB)	Beam Cost (ms)
CV Baseline	0	0.000	0.0250	0.0000	N/A
LSTM Baseline	77,086	1.448	0.2648	0.2941	N/A
Gated (Deterministic)	110,110	1.480	0.5737	0.4200	N/A
Gated (Uncertainty)	113,980	1.484	0.9470	0.4348	2.2421
CA-Gated (Ours)	213,052	2.833	0.6672	0.8127	2.0034

Table 15. Sensitivity analysis under different temporal resolutions.

Sampling Rate	Horizon Time (s)	ADE (m)	FDE (m)	Coverage	CalibErr	Beam Coverage	Beam Outage
5 Hz	2.0	0.5068	0.9663	0.5639	0.3861	0.4681	0.5319
10 Hz	1.0	0.1536	0.3540	0.9055	0.0445	0.6187	0.3813
20 Hz	0.5	0.1746	0.2890	0.8862	0.0638	0.6520	0.3480

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
