4.2. Algorithm Comparison Experiment
To systematically assess the impact of different improvement modules on UAV path planning performance, this study employs a stepwise module-comparison design. Starting with the basic Double DQN model, while keeping the environment, network structure, and hyperparameters consistent, the Noise Penalty, NDM, ABN, MSL, and PER modules are sequentially introduced. By constructing progressively enhanced comparison models, the influence of each module on policy learning can be clearly identified.
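The stepwise design described above can be sketched as a list of cumulative configurations. This is a minimal illustration rather than the implementation used in this study: the function name `build_ablation_configs` and the string labels are hypothetical, and only the module order is taken from the text.

```python
# Hypothetical sketch of the stepwise module-addition design (not the study's code).
BASELINE = "Double DQN"
MODULE_ORDER = ["Noise Penalty", "NDM", "ABN", "MSL", "PER"]  # order given in the text


def build_ablation_configs(baseline, modules):
    """Return cumulative configurations: baseline, baseline + m1, baseline + m1 + m2, ..."""
    configs = [(baseline, [])]
    for i in range(1, len(modules) + 1):
        enabled = list(modules[:i])  # every module introduced so far stays enabled
        configs.append((baseline + " + " + " + ".join(enabled), enabled))
    return configs


ABLATION_CONFIGS = build_ablation_configs(BASELINE, MODULE_ORDER)
```

Each entry differs from its predecessor by exactly one module, so a change in a metric between consecutive entries can be attributed to the module added at that step, provided the environment, network structure, and hyperparameters are held fixed.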
Table 1 shows the performance changes in the path after each module is added, and Figure 7 presents the time-series curve of the sound pressure level at the nearest receiver points along the path for each model.
As shown in Table 1, the results reflect a progressive module-addition process, in which each module is introduced on top of the previous configuration. The baseline DDQN shows the worst noise performance, with the highest average SPL of 60.85 dB and the highest maximum SPL at the nearest building façade of 69 dB. Here, the maximum façade SPL is evaluated only during the steady-flight phase, excluding takeoff and landing. After adding the noise penalty, the average SPL decreases only slightly to 60.17 dB, while the maximum façade SPL remains 69 dB, indicating that the reward-level noise penalty alone has limited effect. After introducing NDM, the average SPL, safety risk, and maximum façade SPL all decrease noticeably, showing that this module can effectively suppress high-noise actions.
With the further addition of ABN and MSL, the average SPL remains at a relatively low level of about 52 dB, while flight time and safety are further improved. Among them, MSL achieves the lowest average SPL of 52.11 dB. After incorporating PER, the flight time is further reduced to 195.37 s, which is the best among all configurations, while both the average SPL and maximum façade SPL remain low. Overall, the final enhanced model achieves a good balance among flight efficiency, noise control, and safety.
Figure 8 further reflects the differences in noise compliance among the ablation models. The baseline DDQN remains above the 55 dB threshold over most of the trajectory and exhibits several pronounced peaks, indicating poor noise control. After adding the noise penalty, some high-noise segments are reduced, but repeated threshold exceedances are still observed in the middle section of the trajectory, suggesting that reward-level penalization alone is insufficient to achieve stable noise control.
After introducing NDM, the SPL curve decreases significantly and becomes smoother, with most segments remaining below the threshold, indicating that this module can effectively improve noise-compliance stability. With the further addition of ABN and MSL, the curve becomes more stable overall, showing that these modules further enhance the continuity and consistency of low-noise decisions. After incorporating PER, although slight local fluctuations appear in the final stage, the overall performance is still clearly better than that of the baseline and the noise-penalty-only model.
It should also be noted that, because different modules generate different final trajectories, the SPL data are no longer updated once the path reaches the target point in the later stage. As a result, some areas in the figure show missing SPL data.
Figure 9 compares the path-planning behaviors of different ablation models in an urban environment. As noise-related modules are gradually introduced, clear changes can be observed in both the horizontal trajectory and altitude distribution.
Figure 9a shows that, after noise constraints are introduced, the models tend to avoid dense building areas in the horizontal plane.
Figure 9b shows that the models generally begin climbing earlier and maintain higher flight altitudes to reduce noise exposure near buildings.
For the baseline DDQN, the path is relatively direct, with only slight deviations near building boundaries, and the flight altitude is mainly maintained at 50–60 m, indicating that it mainly relies on local obstacle avoidance. After introducing the Noise Penalty, the cruising altitude increases to about 60–70 m, suggesting that the model begins to reduce noise exposure by increasing altitude.
With the further addition of NDM, ABN, and MSL, the preference for low-noise regions becomes more evident, and the cruising altitude increases overall. After incorporating PER, the horizontal path becomes smoother and the altitude remains at about 90–100 m, showing better continuity and coordination.
Overall, with the progressive addition of noise-related modules, the path-planning strategy gradually evolves from low-altitude direct obstacle avoidance to a noise-constrained planning mode that combines horizontal detours and altitude compensation, thereby achieving a better balance among noise control, trajectory continuity, and operational efficiency.
Figure 10 shows the evolution of the total reward per episode during training. Overall, all models exhibit a rapid reward increase in the early stage and then gradually become stable, indicating that all models can learn feasible policies. However, their convergence behaviors differ clearly. The baseline DDQN and the Noise Penalty model show relatively large fluctuations in the early and middle stages, indicating weaker training stability.
After introducing NDM, the reward curve becomes noticeably smoother and the model enters the stable stage earlier, showing that this module can effectively reduce training oscillations. With the further addition of ABN, the reward evolution becomes more stable, indicating that it helps improve action-selection consistency and training stability. After incorporating MSL, the reward evolution becomes more continuous, suggesting that multi-step return modeling enhances long-term reward estimation. Finally, after adding PER, the model shows the most stable behavior in the late training stage, indicating that prioritized experience replay further improves training stability and sample utilization efficiency.
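For reference, the two standard ingredients named above, multi-step returns and proportional prioritized replay, can be sketched as follows. This is a generic textbook form and not the implementation used in this study: the function names, the exponent `alpha`, and the offset `eps` are illustrative assumptions, and the values actually used are not stated in this section.

```python
# Generic sketch of multi-step returns and proportional prioritized replay.
# Names and default values are assumptions for illustration only.

def n_step_target(rewards, bootstrap_value, gamma):
    """n-step return: sum_k gamma^k * r_k + gamma^n * bootstrap_value.

    `bootstrap_value` stands for the target network's estimate at the state
    reached after the n steps (use 0.0 if the episode ended inside the window).
    """
    target = bootstrap_value
    for r in reversed(rewards):  # fold from the last reward backwards
        target = r + gamma * target
    return target


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_j p_j^alpha, p_i = |delta_i| + eps."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

Under these definitions, a longer reward window propagates delayed penalties (for example, a later threshold exceedance) back to earlier decisions in fewer updates, while transitions with larger temporal-difference errors are replayed more often.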
Based on the analysis of each module’s contribution in the ablation experiment, this study further conducts an algorithm comparison experiment.
Figure 11 presents the path comparison results of TNAP-DDQN, B-APF-DQN [47], S-JPS [48], and Dueling-DQN [49], while Table 2 summarizes the performance of each algorithm.
TNAP-DDQN has a flight time of 195.37 s, which is 13.64%, 13.39%, and 19.63% higher than that of B-APF-DQN, S-JPS, and Dueling-DQN, respectively, indicating that it adopts a more conservative trajectory strategy under explicit noise constraints and therefore incurs additional time cost.
However, this time cost leads to better noise-control performance. The average SPL of TNAP-DDQN is 52.66 dB, which is 6.38 dB, 6.19 dB, and 4.19 dB lower than that of B-APF-DQN, S-JPS, and Dueling-DQN, respectively. Its maximum SPL at the nearest building façade is also the lowest, at 54.38 dB. Here, the maximum façade SPL is evaluated only during the steady-flight phase, excluding takeoff and landing. These results indicate that TNAP-DDQN can more effectively reduce noise exposure during flight.
In terms of safety, the safety-risk value of TNAP-DDQN is 2.3564 × 10⁻⁴, which is 9.57%, 18.89%, and 5.56% lower than that of B-APF-DQN, S-JPS, and Dueling-DQN, respectively. Overall, TNAP-DDQN shows better noise-control performance and lower safety risk than the compared algorithms. Although the differences in computation time are small, clear differences still exist in flight time, noise, and safety metrics.
Figure 12 shows the time-series results of the sound pressure level (SPL) at the nearest receiver points along the path for each algorithm, highlighting the differences in noise exposure across the planned trajectories. Overall, the SPL curve of TNAP-DDQN is the lowest and most stable. For most of the route, its SPL remains clearly below the 55 dB threshold, indicating that the proposed method can more effectively incorporate noise constraints into path planning and maintain a stable low-noise state along the trajectory.
In contrast, the SPL curves of B-APF-DQN, S-JPS, and Dueling-DQN are generally above the threshold for most of the cruise phase, with only a few intervals close to the threshold. These algorithms are less effective in maintaining stable noise compliance and show more pronounced noise peaks in certain intervals.
All algorithms exhibit an instantaneous SPL increase during the takeoff and climb phases, caused by the transient effect of the changing distance between the UAV and the receiver, rather than a reflection of noise control during the cruise phase. The advantage of TNAP-DDQN is that it maintains lower and more stable SPLs during the cruise phase, avoiding prolonged periods above the threshold seen in other algorithms.
It should be noted that due to the different paths generated by each algorithm, some algorithms do not have SPL data in the later stages of the path. This reflects performance differences across algorithms.
4.3. Comparison and Analysis of Paths with Different Noise Constraints
To further investigate the impact of noise constraints on urban low-altitude UAV path planning and to analyze the required AGL under different UAV source noise levels, comparative experiments were conducted by adjusting the local noise limits and the source noise intensity while keeping the environmental parameters unchanged. Based on the preceding analysis, the UAV noise source was modeled as a point source in the subsequent experiments, and a reference sound pressure level of approximately 89 dB at a reference distance of 1 m was adopted as the baseline. To compare the influence of different source noise levels on the path-planning results, the UAV source noise level was discretized at 3 dB intervals, ranging from 65 dB to 95 dB. These discrete values were used as the source-level parameter in Equation (20) to compute the received sound level at an arbitrary distance.
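As an illustration of the point-source treatment, the sketch below evaluates the received level under free-field spherical spreading, L(r) = L_ref − 20 log10(r / r_ref). This form is an assumption made here for illustration: Equation (20) may contain additional terms (for example, atmospheric absorption or building shielding), and the function name `received_spl` is hypothetical.

```python
import math

REF_DISTANCE_M = 1.0  # reference distance stated in the text


def received_spl(source_level_db, distance_m, ref_distance_m=REF_DISTANCE_M):
    """Free-field point-source level (assumed form): L(r) = L_ref - 20*log10(r / r_ref)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return source_level_db - 20.0 * math.log10(distance_m / ref_distance_m)


# Source levels from 65 dB to 95 dB in 3 dB steps, as described in the text.
SOURCE_LEVELS_DB = list(range(65, 96, 3))
```

Under this assumption, each tenfold increase in distance lowers the received level by 20 dB, so the 89 dB baseline at 1 m corresponds to 69 dB at 10 m and 49 dB at 100 m.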
The local noise limits were defined according to the standards for acoustic environment functional zones [43], including the Class 0 (50 dB) and Class 1 (55 dB) limits, while the Class 2 limit was used as the feasibility protection threshold. In this study, these limits correspond to the daytime limits specified in the acoustic-environment functional-zone standard [43].
- (i) Noise threshold of 55 dB
Appendix B (Table A3) summarizes the path-optimization results under different UAV sound pressure levels for a fixed noise threshold of 55 dB. The table reports the corresponding flight time, safety-risk value, and average sound pressure level at the nearest receiver points for each UAV sound pressure level, thereby providing a quantitative basis for analyzing how the UAV sound pressure level affects trajectory performance under the same threshold constraint.
As shown in Table A3, as the UAV source noise level gradually decreases from 95 dB to 65 dB, the average sound pressure level correspondingly declines from 55.35 dB to 38.09 dB, while the flight time is also reduced overall from 217.15 s to 162.61 s. This indicates that, under higher UAV source noise levels, the path-planning strategy tends to adopt more conservative flight maneuvers in order to satisfy the local noise limits, thereby resulting in a higher time cost. In contrast, when the UAV source noise level is lower, the feasible space under the local noise limits is relatively expanded, and the corresponding constraint cost is reduced accordingly.
It should also be noted that the safety-risk value does not exhibit a strictly monotonic trend. This is mainly because, under different UAV source noise levels, the optimization process may produce different flight paths, and the proportions of building-shielded areas and unshielded ground areas traversed by these paths are not necessarily the same. Since ground risk depends not only on the altitude above ground level (AGL) but also on the shielding conditions of the underlying area, the safety-risk value fluctuates rather than following a simple monotonic relationship with the UAV source noise level. Nevertheless, these fluctuations are relatively small, indicating that the safety-risk levels remain within a similar range under different source noise conditions.
Figure 13 illustrates the UAV trajectory characteristics under different UAV source noise levels from both the top-view and front-view perspectives. In the top view (Figure 13a), under the same local-noise-limit constraint, the overall horizontal direction of the trajectories remains consistent, while noticeable differences can still be observed in the detailed spatial layouts, especially in the middle and later segments of the routes. This indicates that, under noise constraints, different UAV source noise levels affect the spatial organization of the trajectories, leading to different path configurations while preserving a broadly similar start-to-goal direction.
In the front view (Figure 13b), the AGL profiles exhibit a clear layered platform structure, and the cruising AGL generally increases with the UAV source noise level. Specifically, lower UAV source noise levels (65–77 dB) mainly correspond to relatively low AGL platforms of about 50–60 m. A UAV source noise level of 83 dB corresponds to a cruise platform of about 60–70 m, while 86 dB further increases to about 80–90 m. At 89 dB, the cruise AGL rises to about 90–100 m, whereas 92 dB and 95 dB require significantly higher platforms of approximately 120 m and 140 m, respectively. These results indicate that, as the source becomes louder, the feasible low-altitude corridor is progressively compressed, and the planning strategy therefore becomes more reliant on earlier climbing, maintaining a higher AGL, and sustaining longer high-altitude cruise segments in order to enlarge the propagation distance margin and preserve noise compliance.
Figure 14 shows that, as the UAV source noise level increases, the SPL curves at the nearest receiver points shift upward and progressively approach the 55 dB threshold, indicating a gradual reduction in the available noise-feasibility margin. Under low UAV source noise levels (65–77 dB), the curves remain stably below the threshold over most of the trajectory, whereas under moderate UAV source noise levels (80–86 dB), several key segments become critically close to the threshold. For 89–92 dB, the curves remain near the threshold in multiple segments, suggesting a markedly narrowed noise-feasibility margin; by contrast, 95 dB is the only case that is clearly clustered around the threshold and more prone to local exceedances. In all cases, transient SPL rises are observed during the initial climb and final descent, mainly due to rapid changes in the UAV–receiver distance during these phases. Combined with Figure 13b, these results indicate that the path-planning strategy increasingly relies on a higher AGL to maintain noise compliance.
It is worth noting that some areas in the graph are missing SPL data. Because the trajectories obtained under different UAV source noise levels differ in length and flight time, the SPL data are no longer updated once a path reaches the target point, which leaves gaps in the later part of the graph for the shorter trajectories.
- (ii) Noise threshold of 50 dB
In contrast to the more relaxed 55 dB noise threshold, the detailed path-optimization results under different UAV sound pressure levels for the stricter 50 dB threshold are provided in Appendix B (Table A4). The table reports the corresponding flight time, safety-risk value, and average sound pressure level at the nearest receiver points for each UAV sound pressure level, thereby providing a quantitative basis for evaluating how a tighter noise threshold influences trajectory performance.
Similar to the results under the 55 dB threshold, Table A4 shows that, under the stricter threshold of 50 dB, both the average sound pressure level and the flight time generally decrease as the source-noise intensity is reduced. This indicates that tighter noise constraints impose a higher time cost under strong source-noise conditions, whereas lower UAV source noise levels provide a relatively larger noise-feasible margin. The safety-risk value does not vary monotonically, but instead reflects the combined effect of AGL adjustment and the building-shielding conditions beneath the UAV trajectory.
Figure 15 further compares the trajectories under different UAV source noise levels. In the top view (Figure 15a), all cases generally maintain a similar start-to-goal direction, whereas the detailed path layouts vary with the UAV source noise level, indicating that stronger noise constraints require a greater sacrifice of geometric directness. In the front view (Figure 15b), the AGL profiles show a clear stepwise increase with increasing UAV sound pressure level: 65–74 dB mainly corresponds to 50–60 m, 77 dB to 69–70 m, 80 dB to 70–80 m, 83 dB to 80–90 m, 86 dB to 90–100 m, 89 dB to 110–120 m, 92 dB to 130–140 m, and 95 dB to about 150 m. These results indicate that, under the local noise limits, the planning strategy increasingly relies on earlier climbing and sustained high-AGL cruise segments to enlarge the propagation-distance margin and maintain noise compliance.
Figure 16 shows the time-series characteristics of the sound pressure level (SPL) at the nearest receiver points under stricter local noise limits, revealing the impact of different UAV source noise levels on path planning. Overall, none of the curves exceed the feasibility protection threshold (FPT); however, as the noise threshold tightens, the SPL curves gradually approach the threshold line, indicating that the noise feasibility margin is decreasing.
A more detailed analysis shows that, at lower UAV source noise levels (65–74 dB), the SPL curves remain below the 50 dB threshold for most of the flight segments, with minimal fluctuations, indicating that there is still a sufficient noise feasibility margin. When the UAV source noise level increases to 77–86 dB, the curves approach the threshold more frequently, suggesting that the trajectory has entered a critically feasible state. As the UAV source noise level further increases to 89–95 dB, threshold contact and local exceedances become more frequent, reflecting a significant reduction in the noise feasibility margin. This indicates that, at high UAV source noise levels, the path-planning strategy can no longer maintain stable noise compliance throughout the full trajectory.
Some areas in the graph are missing SPL data because the trajectories obtained under different UAV source noise levels differ in length and flight time: the SPL data are no longer updated once a path reaches the target point, which leaves gaps in the later part of the graph for the shorter trajectories.
- (iii) Sensitivity Analysis of Noise Thresholds and Noise Source Levels
Based on the path optimization results under different UAV sound pressure levels and noise threshold conditions, Figure 17, Figure 18 and Figure 19 summarize the variations in required AGL, flight time, and noise exposure.
Figure 17 shows that the required AGL increases stepwise with UAV sound pressure level. The values shown here represent the highest cruise AGL adopted during the stable-flight phase rather than a constant altitude throughout the entire trajectory. Under relatively low UAV sound pressure levels (65–74 dB), the AGL requirements under the 50 dB and 55 dB thresholds are similar, remaining around 50–60 m. As the UAV sound pressure level increases, the stricter 50 dB threshold requires earlier and larger altitude increases than the 55 dB threshold, indicating a stronger reliance on altitude compensation.
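The stronger altitude compensation under the stricter threshold can be made plausible with a simple propagation argument. The sketch below assumes free-field spherical spreading from a point source referenced at 1 m, which is an assumption made here for illustration and not necessarily the full form of Equation (20); the function names are hypothetical. The returned values are source-to-receiver distances, not AGL, and shielding and reflections are ignored, so the sketch is not expected to reproduce the AGL values in Figure 17.

```python
def min_distance_for_limit(source_level_db, limit_db, ref_distance_m=1.0):
    """Distance at which L_ref - 20*log10(r / r_ref) equals the limit (assumed model)."""
    excess_db = source_level_db - limit_db
    return ref_distance_m * 10.0 ** (excess_db / 20.0)


def distance_ratio(limit_loose_db, limit_strict_db):
    """Ratio of required distances for two limits; independent of the source level."""
    return 10.0 ** ((limit_loose_db - limit_strict_db) / 20.0)


# Required distance ratio between the 50 dB and 55 dB thresholds.
RATIO_50_VS_55 = distance_ratio(55.0, 50.0)
```

Under this assumption, tightening the limit by 5 dB multiplies the required source-to-receiver distance by 10^(5/20), about 1.78, for every source level, which is consistent in direction with the larger altitude increases observed under the 50 dB threshold.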
Figure 18 shows that, under both threshold settings, the total flight time generally increases with UAV sound pressure level, and this increase is more pronounced under the stricter 50 dB threshold. Under the 50 dB threshold, the flight time rises from about 162.31 s at 65 dB to 223.68 s at 95 dB, whereas under the 55 dB threshold it increases from about 162.61 s to 217.15 s over the same range. The difference between the two cases is relatively small at lower UAV sound pressure levels but becomes increasingly evident as the UAV sound pressure level increases. This indicates that a stricter threshold leads to stronger AGL compensation and, consequently, a higher time cost.
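The relative time cost implied by these endpoint values can be checked with a short calculation. The helper name `relative_increase_pct` is introduced here for illustration; the flight times are those quoted above.

```python
def relative_increase_pct(start, end):
    """Percentage increase from `start` to `end`."""
    return (end - start) / start * 100.0


# Flight times (s) at 65 dB and 95 dB source levels, as quoted in the text.
FLIGHT_TIME_S = {
    50: (162.31, 223.68),  # 50 dB threshold
    55: (162.61, 217.15),  # 55 dB threshold
}

INCREASE_PCT = {
    limit: relative_increase_pct(t_low, t_high)
    for limit, (t_low, t_high) in FLIGHT_TIME_S.items()
}
```

With these values, the flight time grows by roughly 38% under the 50 dB threshold and by roughly 34% under the 55 dB threshold across the 65–95 dB range.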
Figure 19 shows that, under both threshold settings, the average SPL at the nearest receiver point increases monotonically with UAV sound pressure level. Under the 50 dB threshold, the average SPL remains below the limit from 65 dB to 86 dB but exceeds the threshold from 89 dB onward. Under the 55 dB threshold, the average SPL remains below the limit up to 92 dB and only slightly exceeds it at 95 dB. These results indicate that the stricter 50 dB threshold causes the trajectories to approach the noise boundary earlier, whereas the 55 dB threshold retains a relatively larger noise-feasibility margin over a wider range of UAV sound pressure levels. It should also be noted that the average SPL reflects only the overall noise level and does not guarantee full-process compliance with the noise constraint during the flight, excluding the takeoff and landing phases. Even when the average value remains below the 50 dB or 55 dB threshold, local or short-term exceedances may still occur in certain critical flight segments. Therefore, further analysis in combination with the maximum SPL at the nearest building façade is still required.
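The distinction between average compliance and full-process compliance can be illustrated with a small sketch. It assumes that the average SPL is an arithmetic mean of the dB samples (an energy-based average would give a different value) and that the steady-flight phase is specified by an index window; the function name and the example series are made up for illustration and are not data from the experiments.

```python
def compliance_metrics(spl_db, limit_db, cruise_start, cruise_end):
    """Mean, maximum, and exceedance share of SPL samples in [cruise_start, cruise_end).

    Samples outside the index window (takeoff and landing) are ignored, and
    None entries (no data after the target is reached) are skipped.
    """
    cruise = [v for v in spl_db[cruise_start:cruise_end] if v is not None]
    if not cruise:
        raise ValueError("no valid cruise samples")
    exceed = sum(1 for v in cruise if v > limit_db)
    return {
        "mean_db": sum(cruise) / len(cruise),
        "max_db": max(cruise),
        "exceed_fraction": exceed / len(cruise),
    }


# Made-up series for illustration only; not data from the experiments.
EXAMPLE_SPL_DB = [62.0, 58.0, 50.0, 51.0, 57.0, 50.0, 52.0, 60.0, None]
METRICS = compliance_metrics(EXAMPLE_SPL_DB, limit_db=55.0, cruise_start=2, cruise_end=7)
```

In this example the cruise-phase mean is 52 dB, below the 55 dB limit, yet one of the five cruise samples (57 dB) exceeds it, which is why the maximum SPL and the exceedance share are needed alongside the average.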