4.1. Simulation Setup
All simulations were implemented in MATLAB R2023b (MathWorks, Natick, MA, USA) using the parameter settings summarized in Table 2 and Table 3. For each Monte Carlo trial, UAV locations were generated according to a homogeneous PPP of the specified density inside a spherical coverage region of radius R. Independent small-scale fading coefficients were generated for each UAV–UE link.
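The deployment step described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code: the function names, the density and radius values, and the Rayleigh (unit-power complex Gaussian) fading model are assumptions made for the example, since the fading distribution is not specified here.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson(mean) count by accumulating unit-rate exponential gaps."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_uav_deployment(density, radius, rng):
    """Return UAV positions (x, y, z) of a homogeneous PPP inside a ball."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    n_uavs = sample_poisson(density * volume, rng)
    points = []
    for _ in range(n_uavs):
        # Uniform direction from a normalised Gaussian vector; the cube root
        # of a uniform variate gives a radius that is uniform over the ball.
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        r = radius * rng.random() ** (1.0 / 3.0)
        points.append(tuple(r * c / norm for c in g))
    return points

def sample_rayleigh_fading(n_links, rng):
    """Assumed fading model: one unit-power complex Gaussian per link."""
    s = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_links)]

rng = random.Random(0)
# Hypothetical density (UAVs per cubic metre) and radius (metres).
uavs = sample_uav_deployment(density=1e-5, radius=100.0, rng=rng)
fading = sample_rayleigh_fading(len(uavs), rng)
```

Conditioning on the Poisson count and then placing points uniformly in the ball is the standard way to realise a homogeneous PPP on a bounded region; one such realisation corresponds to one Monte Carlo deployment.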
The RL baselines (REINFORCE and PPO-Lagrangian) were trained for 500 episodes of 200 interaction steps each, using a two-layer MLP architecture with 64 hidden units. All reported results were averaged over 100 independent Monte Carlo deployments per operating point to ensure statistical reliability.
In this section, the proposed LSRFDA is compared with PSO, Random Search, REINFORCE, and PPO-Lagrangian under different UAV densities and coverage radii. The evaluation focuses on key initial-access metrics, including expected latency, scanning time, successful detection probability, energy consumption, and optimal beamwidth selection.
Expected beam alignment latency: As illustrated in Figure 4a,b, the expected latency decreases as UAV density increases. The performance gap between methods is more pronounced in sparse scenarios, where only a few UAVs are available. In this regime, LSRFDA maintains a latency close to one mini-slot, while PSO and Random Search exhibit significantly higher values. This advantage is explained by the ability of LSRFDA to focus rapidly on effective beam directions even when few UAVs are present. In contrast, the other methods require more exploration to locate a suitable UAV, which increases the detection time. As the density increases, the environment becomes less challenging and the performance gap narrows, since multiple UAVs are available in most directions. These results indicate that the superiority of LSRFDA is most pronounced under critical network conditions, where the number of candidate UAVs is small and efficient beam adaptation becomes essential for fast and reliable detection.
The results in Table 4 confirm that all algorithms satisfy the variance constraint, with values remaining below 0.01. LSRFDA achieves both the lowest latency (1.12 mini-slots) and the lowest variance (0.0014), indicating faster and more reliable alignment than all other methods. PPO-Lagrangian and REINFORCE show slightly higher values, while PSO and Random Search exhibit larger latency and variability, with Random Search being the least consistent. Overall, LSRFDA achieves more reliable and stable alignment behaviour under equivalent network conditions.
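The variance check reported above can be illustrated with a short sketch. The function name and the sample latencies are hypothetical, and the use of the unbiased sample variance is an assumption; only the 0.01 bound is taken from the text.

```python
import statistics

def summarise_latency(latencies, variance_threshold=0.01):
    """Return (mean, sample variance, constraint satisfied) for latency samples."""
    mean = statistics.fmean(latencies)
    var = statistics.variance(latencies)  # unbiased sample variance (assumed)
    return mean, var, var < variance_threshold

# Hypothetical per-deployment latencies in mini-slots, not data from Table 4.
samples = [1.10, 1.15, 1.08, 1.16, 1.11]
mean_latency, latency_var, satisfied = summarise_latency(samples)
```

In a full evaluation, the same summary would be computed over the 100 Monte Carlo deployments of each operating point.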
Expected scanning time: Figure 5a,b shows the scanning time required to detect a UAV under different coverage radii. As UAV density increases, scanning time decreases for all methods because fewer beam sweeps are needed to find a suitable UAV. LSRFDA achieves the shortest scanning time across all tested coverage radii R and UAV densities.
Successful detection probability: Figure 6 shows the successful detection probability as a function of UAV density for different coverage radii. The advantage of LSRFDA is most visible at small coverage radii and low UAV densities, where the number of available UAVs is limited. In this regime, LSRFDA maintains a high detection probability while the other methods experience a significant drop. As the coverage radius or the UAV density increases, this gap gradually narrows, since the presence of multiple UAVs makes the detection process less sensitive to the optimisation strategy. These results indicate that LSRFDA provides more reliable detection in challenging conditions, particularly at low UAV densities, where the other methods fail to do so.
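The saturation effect described above follows from a basic property of the PPP and can be sketched with a simplified geometric model. This is not the paper's detection-probability expression: it only computes the probability that at least one UAV lies inside a single ideal conical beam, ignoring fading, blockage, and any SNR threshold. The function names and the cone model are assumptions for illustration.

```python
import math

def cone_volume(radius, beamwidth_rad):
    """Volume of a spherical sector with full apex angle `beamwidth_rad`."""
    solid_angle = 2.0 * math.pi * (1.0 - math.cos(beamwidth_rad / 2.0))
    return solid_angle * radius ** 3 / 3.0

def prob_uav_in_beam(density, radius, beamwidth_rad):
    """PPP void-probability complement: P(at least one UAV inside the beam)."""
    return 1.0 - math.exp(-density * cone_volume(radius, beamwidth_rad))

# Hypothetical densities (UAVs per cubic metre), radius in metres, 30-degree beam.
beam = math.radians(30.0)
sparse = prob_uav_in_beam(1e-6, 100.0, beam)
dense = prob_uav_in_beam(1e-4, 100.0, beam)
```

Because the exponent scales with the density and with the cube of the radius, the probability approaches one quickly in dense or large regions, which is consistent with the shrinking gap between methods noted above.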
Optimal Beamwidth: Figure 7 shows how the optimal beamwidth varies with the coverage radius R for different UAV density settings. LSRFDA consistently selects smaller beamwidth values across all radii, which indicates that it concentrates the search on a more directional beam configuration without degrading detection performance. In practice, fewer unnecessary beam directions are explored, which is desirable for fast initial access. In contrast, PSO tends to retain wider beams, while Random Search behaves more irregularly as the radius changes. The learning-based methods, PPO-Lagrangian and REINFORCE, lie between these extremes and vary more smoothly. Overall, LSRFDA adapts the beamwidth strongly to the spatial conditions, especially when the UAV distribution becomes denser and narrower beams are sufficient.
Energy considerations: Figure 8a,b illustrates the total energy consumed during the beam-alignment process by LSRFDA, PSO, Random Search, REINFORCE, and PPO-Lagrangian for two coverage radii. At low UAV densities, LSRFDA achieves the lowest energy consumption because fewer beam scans are needed and convergence is reached faster. As UAV density increases, the energy consumption of all algorithms decreases and eventually saturates, since a nearby UAV is detected quickly. Even in this dense regime, LSRFDA maintains the most stable and energy-efficient performance.
4.2. Sensitivity Analysis
To evaluate the robustness of the proposed framework, key system parameters are varied individually around the baseline values in Table 3. Table 5 reports the resulting latency variance for LSRFDA and PSO under each condition, with all other parameters kept at their baseline values. Several observations can be drawn from Table 5. As the path-loss exponent increases from 2.0 to 4.0, the latency variance rises for both methods, with PSO showing a larger absolute increase (from 0.0031 to 0.0071) than LSRFDA (from 0.0009 to 0.0028). This confirms that risk-aware beamwidth selection becomes more critical under severe propagation conditions. A similar trend is observed as the blockage parameter increases: LSRFDA remains below the variance threshold across all tested values, whereas PSO approaches the constraint boundary. Tightening the variance threshold from 0.05 to 0.005 has a limited effect on LSRFDA, since the variance constraint is already satisfied with margin at the baseline setting. Finally, THz operation at 300 GHz leads to the highest variance values for both methods due to increased molecular absorption, further motivating risk-aware design for future 6G systems. In all tested conditions, LSRFDA maintains lower latency variance than PSO, confirming the robustness of the proposed framework across a wide range of operating environments.
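The variance values quoted above for the path-loss exponent sweep can be compared in both absolute and relative terms. The short sketch below uses only the four numbers given in the text; the helper name is hypothetical.

```python
def variance_growth(low, high):
    """Return (absolute increase, growth ratio) between two variance values."""
    return high - low, high / low

# Latency variance at path-loss exponent 2.0 and 4.0, as quoted in the text.
quoted = {"PSO": (0.0031, 0.0071), "LSRFDA": (0.0009, 0.0028)}
growth = {method: variance_growth(lo, hi) for method, (lo, hi) in quoted.items()}

for method, (delta, ratio) in growth.items():
    print(f"{method}: +{delta:.4f} absolute, x{ratio:.2f} relative")
```

With these values, the absolute increase is 0.0040 for PSO and 0.0019 for LSRFDA, while the growth ratio is about 2.29 for PSO and about 3.11 for LSRFDA. LSRFDA therefore stays lower in absolute variance at both ends of the sweep, although its variance grows by a larger factor from a smaller starting value.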
4.3. Post-Alignment Throughput and Spectral Efficiency
After beam alignment with the optimised beamwidth, the downlink achievable rate is estimated using (23), where B is the channel bandwidth and the distance is taken as the mean UAV–UE distance under the PPP model. The channel bandwidth
B at each carrier frequency is set to
,
,
, and
GHz at 28, 60, 120, and 300 GHz, respectively, following representative 3GPP NR channel allocations for the mmWave band and IEEE 802.15.3d channel plans for the THz band, consistent with the values adopted in [
15]. The SNR is computed from (7) using the optimal beamwidth returned by each method. Since LSRFDA consistently selects narrower beamwidths, the resulting antenna gain is higher, which translates directly into improved post-alignment SNR and estimated throughput.
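The link between beamwidth, antenna gain, and rate can be sketched as follows. The sketch assumes that the rate has the standard Shannon form, B log2(1 + SNR), and replaces the gain model of (7) with an idealised conical beam whose gain is 4π divided by the beam solid angle. Both are assumptions for illustration, as are the function names and the lumped `snr_isotropic` term, which stands in for transmit power, path loss, and noise.

```python
import math

def ideal_cone_gain(beamwidth_rad):
    """Assumed ideal conical beam: G = 4*pi / solid angle = 2 / (1 - cos(theta/2))."""
    return 2.0 / (1.0 - math.cos(beamwidth_rad / 2.0))

def achievable_rate(bandwidth_hz, snr_isotropic, beamwidth_rad):
    """Shannon-type rate estimate in bit/s for a given beamwidth."""
    snr = snr_isotropic * ideal_cone_gain(beamwidth_rad)
    return bandwidth_hz * math.log2(1.0 + snr)

# Hypothetical values: 1 GHz bandwidth, 0 dB SNR with an isotropic antenna.
narrow = achievable_rate(1e9, 1.0, math.radians(10.0))
wide = achievable_rate(1e9, 1.0, math.radians(40.0))
```

Under this simplified model the 10-degree beam yields a higher rate than the 40-degree beam at the same bandwidth and distance, which mirrors the qualitative argument above. The actual values in the paper depend on the gain and SNR expressions in (7).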
Table 6 reports the achievable rates across the mmWave and THz frequency range for all compared methods. The values are obtained analytically from the post-alignment SNR: for each method, the optimised beamwidth is substituted into the antenna-gain and SNR expressions in (7), after which the achievable rate is computed using (23). The table therefore reflects the indirect impact of beamwidth optimisation on post-alignment communication performance.
The results indicate that LSRFDA achieves consistently higher estimated throughput than the other methods across all tested frequencies, with the relative advantage becoming more pronounced at higher frequencies, where narrower beams are needed to compensate for increased path loss. At 300 GHz, molecular absorption increases path-loss variability, and methods producing wider beamwidths experience a more noticeable throughput reduction. These results are analytical throughput estimates derived from the post-alignment SNR model. In practical deployments, the achievable throughput would additionally depend on factors such as channel estimation accuracy, hardware impairments, and beam-tracking overhead.