Research on Aerodynamic Force/Thrust Vector Combined Trajectory Optimization Method for Hypersonic Drones Based on Deep Reinforcement Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper addresses the cruise range maximization problem for hypersonic drones by proposing a combined aerodynamic force/thrust vector trajectory optimization method based on deep reinforcement learning. The paper constructs three progressively complex control schemes: fixed thrust vector control based on PSO algorithm, time-varying thrust vector control based on deep residual network DDPG algorithm, and height-thrust vector collaborative optimization control. Simulation results validate the significant advantages of the proposed method and the effectiveness of the deep reinforcement learning framework. However, this manuscript still has some issues that need further improvement, with specific comments as follows:
- Is the continuous linear parameterization strategy a traditional piecewise linear interpolation method? What are its unique advantages compared to existing technologies?
- The authors used one drone model for simulation experiments. Is this method applicable to cruise situations of drones with other configurations or under other conditions?
- It is recommended to supplement the discussion on computational efficiency analysis and hardware implementation requirements.
- The references on deep learning in the paper are too old. It is suggested that the authors add the following two latest references: [1] Looking Clearer with Text: A Hierarchical Context Blending Network for Occluded Person Re-Identification. [2] Learning discriminative topological structure information representation for 2D shape and social network classification via persistent homology.
- Punctuation marks need to be added after equations, and the equations lack detailed explanations of symbol meanings.
Author Response
Comments 1: Is the continuous linear parameterization strategy a traditional piecewise linear interpolation method? What are its unique advantages compared to existing technologies? |
Response 1: Thank you for this astute observation. Our continuous linear parameterization strategy is indeed conceptually grounded in piecewise linear interpolation, but incorporates significant improvements over traditional approaches: First, the recursive design automatically ensures inter-segment continuity, eliminating potential discontinuities that may occur in conventional methods. Second, unlike traditional approaches that require independent parameter specification for each segment, our method covers the complete trajectory using only 21 parameters through an incremental slope mechanism. Third, the parameterization is specifically optimized for DDPG's continuous action space requirements. |
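The recursive construction described in Response 1 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the parameter layout (one initial value followed by 20 slope increments, with equal-length segments) and the names build_profile and evaluate_profile are hypothetical and were chosen here so that the parameter count comes to 21.

```python
import numpy as np

def build_profile(params, t_final):
    """Return node times and node values of a continuous piecewise-linear profile.

    Assumed (hypothetical) layout: params[0] is the initial value and
    params[1:] are slope increments, one per equal-length segment.
    """
    params = np.asarray(params, dtype=float)
    n_seg = params.size - 1                 # 20 segments for 21 parameters
    dt = t_final / n_seg                    # equal segment duration (assumption)
    slopes = np.cumsum(params[1:])          # slope of segment k = sum of increments up to k
    # Each node value is built from the previous one, so adjacent segments
    # always share their end point and the profile cannot jump.
    values = params[0] + np.concatenate(([0.0], np.cumsum(slopes * dt)))
    times = np.linspace(0.0, t_final, n_seg + 1)
    return times, values

def evaluate_profile(params, t_final, t):
    """Evaluate the profile at time t by linear interpolation between nodes."""
    times, values = build_profile(params, t_final)
    return float(np.interp(t, times, values))
```

Because every node is derived from the previous node, continuity holds for any parameter vector, which is convenient when the parameters come directly from a continuous action space.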
Comments 2: The authors used one drone model for simulation experiments. Is this method applicable to cruise situations of drones with other configurations or under other conditions? |
Response 2: Thank you for raising this important concern. We acknowledge the limitation in the validation scope of the current study, which employs only a single UAV model for verification. However, from an algorithmic design perspective, the proposed continuous linear parameterization strategy and ResNet-DDPG framework possess the following generalizable characteristics: the state-action mapping mechanism is independent of specific configurations, and the dynamics and aerodynamic models can be parametrically adjusted for different UAVs. To address this concern and enhance the credibility of our approach, we will add a new subsection "Robustness Analysis under Aerodynamic Coefficient Uncertainties" in Section 4.4, which will include: perturbation of key aerodynamic coefficients (lift coefficient CL, drag coefficient CD, etc.) within an 80%-120% range to simulate verification of control strategy performance under parametric uncertainties; analysis of algorithm sensitivity to aerodynamic parameter variations and identification of applicable boundaries. |
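The coefficient perturbation proposed in Response 2 could be organized along the following lines. This is a hedged sketch: the function name perturbed_cases, the grid resolution, and the use of lift-to-drag ratio as the monitored quantity are assumptions made for illustration, and the nominal coefficients in the usage note are placeholders rather than values from the manuscript.

```python
import itertools
import numpy as np

def perturbed_cases(cl_nominal, cd_nominal, low=0.8, high=1.2, n=5):
    """Enumerate scaled (CL, CD) pairs on an n-by-n grid of factors in [low, high].

    Returns a list of (k_cl, k_cd, lift_to_drag) tuples. In a full study each
    case would be passed to the trajectory simulation instead of computing L/D.
    """
    factors = np.linspace(low, high, n)     # e.g. 0.8, 0.9, 1.0, 1.1, 1.2
    cases = []
    for k_cl, k_cd in itertools.product(factors, factors):
        cl = k_cl * cl_nominal
        cd = k_cd * cd_nominal
        cases.append((float(k_cl), float(k_cd), cl / cd))
    return cases
```

For example, perturbed_cases(0.5, 0.1) yields 25 cases whose lift-to-drag ratio spans the worst case (CL at 80%, CD at 120%) to the best case (CL at 120%, CD at 80%).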
Comments 3: It is recommended to supplement the discussion on computational efficiency analysis and hardware implementation requirements. |
Response 3: Thank you for your valuable suggestion. We will add a new Section 4.5 "Computational Efficiency Analysis" to provide a detailed comparison of training time and computational complexity among the three control strategies, including network architecture and parameter scale, training complexity comparison, and performance-efficiency trade-offs. |
Comments 4: The references on deep learning in the paper are too old. It is suggested that the authors add the following two latest references: [1] Looking Clearer with Text: A Hierarchical Context Blending Network for Occluded Person Re-Identification. [2] Learning discriminative topological structure information representation for 2D shape and social network classification via persistent homology. |
Response 4: Thank you for the reviewer's valuable suggestion. We acknowledge that the deep learning-related references in our paper indeed have timeliness issues, which may affect the reflection of the latest developments in this field. We will add the following recent references to the manuscript: [1] Looking Clearer with Text: A Hierarchical Context Blending Network for Occluded Person Re-Identification. [2] Learning discriminative topological structure information representation for 2D shape and social network classification via persistent homology. These additions will help contextualize our ResNet-DDPG approach within the broader landscape of contemporary deep learning methodologies and demonstrate awareness of current research trends in neural network architectures and representation learning. |
Comments 5: Punctuation marks need to be added after equations, and the equations lack detailed explanations of symbol meanings. |
Response 5: Thank you for pointing out these formatting issues. We have carefully reviewed the mathematical expressions throughout the manuscript and indeed found problems with non-standard equation punctuation and insufficient symbol explanations. We will systematically examine all mathematical expressions in the manuscript to ensure that every symbol has a clear and accurate definition, and maintain consistency in symbol usage throughout the paper. We sincerely appreciate your patient guidance, as these suggestions will significantly enhance the readability and academic rigor of our manuscript. |
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Please see attached file
Comments for author File: Comments.pdf
Author Response
Comments 1: In Section 4 it states that the cruise speed is Ma=6. Is this true for all examples and for all phases of flight? Shouldn’t the drone accelerate to Ma=6 from low speed? |
Response 1: Thank you for raising this concern regarding the description of flight speed. This clarification helps establish clear research boundaries. Research Scope Clarification: This study focuses specifically on the cruise range maximization problem for hypersonic drones during the steady-state cruise phase, with trajectory optimization centered on constant cruise conditions. Therefore, all simulations and analyses in this paper are conducted at a constant cruise Mach number of 6; the acceleration from low speed to Mach 6 is outside the scope of the study. We will clarify the research assumptions and applicable scope in the revised manuscript (lines 196-204) to prevent reader misunderstanding regarding velocity assumptions. |
Comments 2: If Mach number is constant through the flight, the temperature, speed of sound and, consequently, drone speed in m/s would change with elevation. The aerodynamic coefficient in Eq. (1) might change too. Have you accounted for it? |
Response 2: Thank you for raising this important technical question. Environmental parameter variations during flight indeed affect aerodynamic coefficients. Based on Zhou & Huang [29]'s model, we have conducted aerodynamic characteristic calculations for different flight conditions within the 20-30 km altitude range, incorporating altitude correction factors for aerodynamic coefficients. We will supplement Equation (1) with clarification regarding aerodynamic force altitude corrections. |
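The reviewer's point about constant Mach number can be made concrete with a short calculation. The sketch below uses the International Standard Atmosphere temperature profile between 20 km and 32 km (216.65 K at 20 km, rising by 1 K per km of geopotential altitude) and a perfect-gas speed of sound. It does not reproduce the altitude correction factors of Zhou & Huang [29]; the function names are hypothetical, and geometric and geopotential altitude are treated as equal for simplicity.

```python
import math

GAMMA = 1.4            # ratio of specific heats for air (perfect-gas assumption)
R_AIR = 287.05287      # specific gas constant of air, J/(kg K)

def isa_temperature(h_m):
    """ISA temperature in kelvin, valid only for 20 km <= h <= 32 km."""
    if not 20_000.0 <= h_m <= 32_000.0:
        raise ValueError("this sketch only covers the 20-32 km ISA layer")
    return 216.65 + 1.0e-3 * (h_m - 20_000.0)

def speed_of_sound(h_m):
    """Perfect-gas speed of sound in m/s at altitude h_m."""
    return math.sqrt(GAMMA * R_AIR * isa_temperature(h_m))

def true_airspeed(mach, h_m):
    """Flight speed in m/s at a given Mach number and altitude."""
    return mach * speed_of_sound(h_m)
```

Under these assumptions, the flight speed at Mach 6 differs by roughly two percent between 20 km and 30 km, which is the kind of variation that altitude-dependent aerodynamic data need to account for.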
Comments 3: When you calculate the aerodynamic coefficient C, to which degree hypersonic effects in air flow (ionization, dissociation, real gas effects) are accounted for? Otherwise, please change title using “Ma=6” in place of “hypersonic”. |
Response 3: Thank you for the reviewer's attention to the rigor of hypersonic modeling. This study directly adopts the validated aerodynamic data for the full-waverider configuration from Zhou & Huang [29]. Their research conducted detailed CFD calculations using coupled RANS equations, the SST k-ω turbulence model, and Sutherland's law, employing a computational mesh of 13.97 million cells. Zhou & Huang [29] have comprehensively validated the modeling of hypersonic effects under these flight conditions, including critical physical phenomena such as shock wave capture, boundary layer separation, and aerodynamic heating. These details have been supplemented in lines 144-151 of the manuscript. Our research is based on their peer-reviewed and validated aerodynamic data, ensuring modeling accuracy and consistency. The innovation of this study lies in the thrust vector control strategies and deep reinforcement learning methods, rather than aerodynamic modeling itself. By utilizing validated hypersonic aerodynamic data as the foundation, we are able to focus on control algorithm development and performance evaluation. |
Comments 4: P.4 Please move definitions of T and R up. Please put them after equations (1-2) not after equation (3). How do you calculate Fc-the control force- in Eq. (3)? |
Response 4: Thank you for the important question regarding the calculation method of control force Fc. Control Force Fc Source and Calculation: As shown in Figure 1's full-waverider configuration, this UAV is equipped with conventional aerodynamic control surfaces. Control force Fc represents the control forces generated by these aerodynamic control surfaces, with specific values obtained through interpolation from the aerodynamic database provided by Zhou & Huang [29] to derive control effectiveness derivatives under different flight states. The treatment approach in this study adopts an instantaneous response assumption. We will clearly specify the calculation assumptions and simplification conditions for control force Fc in the revised manuscript (supplemented in lines 173-178), ensuring readers understand the coordination relationship between thrust vector control and traditional aerodynamic control. |
Comments 5: In cruise phase of flight with zero acceleration and constant elevation can we say that T+R+Fc+mg=0? Can you comment on it? |
Response 5: Thank you for the important physical question raised by the reviewer. In the cruise phase with zero acceleration and constant altitude, the force balance condition (T+R+Fc+mg=0) is indeed satisfied. Regarding the explanation of control force Fc, as mentioned in Comment 4, this has been clarified in lines 173-178 of the manuscript. |
Comments 6: P. 5 you use r as the position vector and r as height. Isn’t it better to use variable h as height? |
Response 6: Thank you for pointing out this symbol usage issue. Such details are indeed crucial for improving the professionalism and readability of the paper. Regarding the mixed use of position vector and height symbols, your observation is accurate. We did use "r" to represent both the position vector and the height throughout the manuscript, which causes reader confusion. We will make systematic adjustments in the revised manuscript. |
Comments 7: P.5 is V vector? Please use bold V as you use bold r. Please check that you use regular font for scalars and bold for vectors throughout the paper. |
Response 7: We will systematically examine the entire manuscript to ensure: consistency of all symbol definitions, standardized use of vector and scalar fonts, and standardization of mathematical expressions. |
Comments 8: Do you use angular velocity components (p, q, r) in the study? if not, why do you define them? |
Response 8: Thank you for pointing out this issue. Your observation is correct. In this study, our current trajectory optimization is primarily based on center-of-mass motion equations and indeed does not directly utilize angular velocity components (p, q, r) for control design and analysis. The current trajectory optimization focuses mainly on translational dynamics equations. In future research, we plan to incorporate attitude dynamics into the optimization framework, where these angular velocity components will play important roles. Since they are not addressed in this paper, we will remove them. |
Comments 9: P. 6 In Eq. (6) for flight range you integrate variable v. However, in the next formula you use absolute value of v for the same purpose. Which one is correct? |
Response 9: Thank you for pointing out this important technical question. Since the research focuses on cruise phase range maximization, the ground speed v remains positive throughout the entire flight process. Therefore, the integral ∫vdt correctly reflects the actual flight distance, and the absolute value can be removed. |
Comments 10: Step 2 in p. 6. Please refer to Eq. (10) after words “fitness value”. In Step 3 what are pi and best pi? What is pg? Please define them. |
Response 10: Thank you for pointing out these symbol definition issues, which indeed affect readers' understanding of the algorithm. Regarding the fitness value reference: Your suggestion is very reasonable. We will add a reference to Equation (10) after Step 2, "Calculate the fitness value of each particle," clearly indicating that the fitness function is defined as the total flight distance. Regarding the PSO algorithm symbol definitions: We did omit key symbol definitions in the algorithm description and will supplement them in the revised manuscript:
- pi: personal best position of the i-th particle, that is, the best position experienced by this particle throughout the entire search process.
- Best pi: the position with the highest fitness value among all current personal best positions.
- pg: global best position, that is, the best position discovered by the entire particle swarm during the search process. |
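The roles of the personal best positions and the global best position can be illustrated with a textbook particle swarm update. This is a generic sketch, not the authors' algorithm: the inertia weight w, the acceleration coefficients c1 and c2, the swarm size, and the function name pso_maximize are assumptions, and the toy fitness in the usage note merely stands in for the total flight distance.

```python
import numpy as np

def pso_maximize(fitness, lower, upper, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(x) inside the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))
    v = np.zeros_like(x)
    p_i = x.copy()                                   # personal best positions
    f_i = np.array([fitness(p) for p in p_i])        # personal best fitness values
    g = int(np.argmax(f_i))
    p_g, f_g = p_i[g].copy(), float(f_i[g])          # global best position and value
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Velocity: inertia + pull toward personal best + pull toward global best.
        v = w * v + c1 * r1 * (p_i - x) + c2 * r2 * (p_g - x)
        x = np.clip(x + v, lower, upper)
        f = np.array([fitness(p) for p in x])
        improved = f > f_i
        p_i[improved] = x[improved]
        f_i[improved] = f[improved]
        g = int(np.argmax(f_i))
        if f_i[g] > f_g:                             # global best never gets worse
            p_g, f_g = p_i[g].copy(), float(f_i[g])
    return p_g, f_g
```

A call such as pso_maximize(lambda z: -np.sum((z - 2.0) ** 2), [-5, -5], [5, 5]) searches a two-dimensional box for the maximum of a simple concave function; in the fixed thrust vector scheme the fitness would instead be the simulated range.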
Comments 11: In Eq. (25) what are G1 and G2? Please define them. |
Response 11: Thank you for pointing out this symbol definition issue. In the Breguet range formula (25), we did omit the definitions of G1 and G2. We will add complete symbol definitions immediately after Equation (25) in the revised manuscript and explain the applicability of this formula to hypersonic vehicle range estimation. |
Editorial comments: Line 169: why r is subscript? |
Response Editorial comments: Thank you for pointing out this formatting error. The subscripted "r" in line 169 is indeed a typographical error and should be set in normal font. In the revision, we will systematically proofread all mathematical symbols and typographical formats throughout the manuscript to ensure their correctness and consistency and to avoid similar errors. |
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have addressed all of my concerns very well, and I recommend acceptance of the manuscript.
Author Response
Comment: The authors have addressed all of my concerns very well, and I recommend acceptance of the manuscript.
Reviewer 2 Report
Comments and Suggestions for Authors
See attached files
Comments for author File: Comments.pdf
Author Response
Response to Reviewer 2 Comments
1. Summary |
We sincerely thank the reviewers for their thorough evaluation and constructive feedback. The manuscript has been substantially revised to address all concerns raised. We will highlight all modifications in the manuscript to clearly indicate the revised sections. |
2. Point-by-point response to Comments and Suggestions for Authors |
Comments 1: Regarding Comment 3, my question was whether the authors account for what the professional community calls hypersonic effects in air flow (that is, ionization, dissociation, real gas effects). If these effects are not accounted for in the present manuscript, please change the paper title using “Ma=6” in place of “hypersonic”, as it will otherwise be confusing for readers. |
Response 1: We sincerely thank the reviewer for this insightful and important remark concerning the treatment of hypersonic effects in the present work. We fully agree that at Mach-6 flight conditions the flow may indeed exhibit phenomena commonly referred to in the professional community as “hypersonic effects,” including high-temperature real-gas behavior such as molecular dissociation, ionization, and vibrational excitation. These effects can strongly influence thermal loads and heat-transfer rates and therefore are of great significance for aerothermodynamic analyses and thermal protection system design.
In the current study, however, our primary objective is flight-control trajectory optimization rather than detailed aerothermodynamic modeling. The aerodynamic database we use, originally developed from high-fidelity CFD solutions and validated experimental data, assumes a perfect-gas model. While we acknowledge that real-gas effects may slightly alter absolute aerodynamic coefficients at Mach numbers in the hypersonic regime, the following considerations led us to retain the current modeling framework:
- Limited influence on aerodynamic force and moment trends: Published high-temperature CFD comparisons show that, for slender lifting bodies at altitudes corresponding to our cruise conditions, real-gas corrections primarily affect surface heating and pressure distributions near shock layers but cause only small variations in the integrated lift, drag, and moment coefficients that drive the vehicle’s global motion.
- Robustness analysis included in the paper: To ensure that our guidance and control method is not overly sensitive to modest coefficient variations, Section 4.4 of the manuscript already presents a parametric “bias-perturbation” study, in which aerodynamic coefficients are systematically offset to test the robustness of the optimized control strategy. The results confirm that the proposed reinforcement-learning-based controller remains effective under such deviations.
- General applicability of the vehicle model: Although Mach 6 serves as the representative test case, the control framework and vehicle model were designed for a broader speed envelope that spans the hypersonic regime. The vehicle geometry, propulsion concept, and guidance method are all typical of hypersonic cruise configurations. Replacing the term “hypersonic” with “Mach 6” in the title could therefore give the unintended impression that the method applies only to a single Mach number, whereas it is in fact suitable for a wider range of high-Mach flight scenarios.
For these reasons we respectfully believe that retaining the term “hypersonic” in the title better reflects the intended scope and the generic nature of the proposed control methodology, while we fully acknowledge that the present aerodynamic database does not explicitly resolve high-temperature chemistry. To avoid any potential confusion for readers, we have revised the manuscript text (see the beginning of Section 2 and Section 4.4) to explicitly state that:
1. The aerodynamic coefficients are generated using a perfect-gas assumption;
2. Hypersonic real-gas effects such as dissociation and ionization are not directly modeled;
3. A sensitivity analysis of aerodynamic-coefficient perturbations has been performed to demonstrate robustness of the optimization results.
We hope these clarifications address the reviewer’s concern and make the modeling assumptions transparent to readers while preserving the broader applicability of the proposed hypersonic-cruise control framework. |
Comments 2: Regarding Comment 5 in my original review, the authors answered: “In the cruise phase with zero acceleration and constant altitude, the force balance condition (T+R+Fc+mg=0) is indeed satisfied.” In this case the right-hand side is zero, correct? Then, the left-hand side is zero too. Why do we need the differential equation (3) if it is identically equal to zero? Please comment on it in the manuscript text. Please explain in which situations the equation is not trivial (zero) and is meaningful. |
Response 2: We sincerely thank the reviewer for raising this important point. The comment is well taken, and we have revised the manuscript to clarify the physical meaning and non-trivial nature of Equation (3) in different flight phases, especially during steady-state cruise. The following explanations have been incorporated into the text:
Equation (3) represents the general equations of motion for the vehicle’s center of mass in a ground-launched rotating reference frame. Although during steady-state cruise the net acceleration is zero and the forces are balanced (i.e., the right-hand side sums to zero, implying the left-hand side is also zero), the equation is non-trivial and physically meaningful in the following contexts:
- Transient Dynamics: During maneuvers such as changes in thrust vector angle or altitude, the system undergoes transient phases where acceleration is non-zero. Equation (3) captures the full dynamic response during these periods.
- Trajectory Optimization: The optimization process involves searching across feasible trajectories that may include both equilibrium and non-equilibrium segments. Equation (3) provides the necessary dynamic constraints to ensure physical realizability of the optimized path.
- Earth Curvature and Continuous Adjustment: Even during “steady” cruise, the trajectory inclination angle γ varies slightly due to Earth’s curvature to maintain constant altitude above the ellipsoid. Thus, the system is never in perfect equilibrium, and the differential equations continue to describe meaningful dynamics.
In the context of this paper, although the cruise is approximately steady in altitude and velocity, the variation in γ due to geometric and gravitational considerations implies that Equation (3) remains relevant throughout the flight envelope. Additionally, we have amended the description of Equation (4) to no longer state that “the trajectory inclination angle is constant at 0”. Instead, it is clarified that γ varies continuously to compensate for Earth’s curvature. These revisions emphasize that the equations of motion are integral to the optimization formulation, even in near-steady flight, and ensure that the resulting control strategies are dynamically consistent and physically achievable. |
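To give a sense of scale for the Earth-curvature argument, the sketch below estimates two quantities for constant-altitude flight over a spherical Earth: the rate at which the local horizontal rotates relative to a launch-fixed frame, and the fraction of the weight offset by the centripetal term. It is an order-of-magnitude illustration under simplifying assumptions (spherical, non-rotating Earth, inverse-square gravity) and is not taken from Equation (3) of the manuscript; the function names and the example speed are hypothetical.

```python
EARTH_RADIUS_M = 6_371_000.0    # mean Earth radius (spherical-Earth assumption)
G0 = 9.80665                    # standard sea-level gravity, m/s^2

def horizon_rotation_rate(speed_m_s, altitude_m):
    """Rate (rad/s) at which the local horizontal turns along a level flight path."""
    return speed_m_s / (EARTH_RADIUS_M + altitude_m)

def gravity(altitude_m):
    """Inverse-square gravity (m/s^2) at the given altitude."""
    return G0 * (EARTH_RADIUS_M / (EARTH_RADIUS_M + altitude_m)) ** 2

def centrifugal_relief(speed_m_s, altitude_m):
    """Fraction of the weight balanced by the V^2 / r term in level flight."""
    r = EARTH_RADIUS_M + altitude_m
    return speed_m_s ** 2 / (r * gravity(altitude_m))
```

For a speed of about 1770 m/s at 20 km (an assumed value close to Mach 6 in a standard atmosphere), the horizon rotation rate is on the order of 3e-4 rad/s and the centripetal term offsets roughly five percent of the weight, so the effect is small but not negligible over a long cruise.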
Editorial Comments: [[24]] should be [24]; remove the double brackets. |
Response 3: Thank you for pointing it out. The double brackets were indeed a typographical error. We have removed them in the revised draft and have conducted a comprehensive review of the entire text to ensure that similar errors do not occur again. |
Author Response File: Author Response.pdf