Approximation of Closed-Loop Sensitivities in Robust Trajectory Optimization under Parametric Uncertainty

Abstract: Trajectory optimization is an essential tool for the high-fidelity planning of missions in aerospace engineering in order to increase their safety. Robust optimal control methods are utilized in the present study to address environmental or system uncertainties. To improve robustness, holistic approaches for robust trajectory optimization using sensitivity minimization with system feedback and predicted feedback are presented. Thereby, controller gains to handle uncertainty influences are optimized. The proposed method is demonstrated in an application for UAV trajectories. The resulting trajectories are less prone to unknown factors, which increases mission safety.


Introduction
In order to plan trajectories in aerospace missions, optimal control methods can deliver highly detailed trajectories that can serve to analyze system performance or as reference trajectories for tracking purposes. For increased safety, environmental or system-related uncertainties must be considered in trajectory planning. Robust optimal control methods take these uncertainties into account and provide robust trajectories that are less prone to disturbances, i.e., under disturbances in model parameters, the trajectory deviations are reduced. Two key techniques in application-oriented studies are the stochastic and deterministic approaches.
In stochastic approaches, it is assumed that the uncertainty follows a statistical distribution. Hence, probability or chance constraints can be imposed, requiring a particular event to happen with at least a given probability. Studies on this can be found in [1], where robust transition trajectories of an electric Vertical Take-off and Landing (eVTOL) aircraft are calculated considering uncertainties in wind estimations. There, a stochastic collocation approach combined with generalized Polynomial Chaos (gPC) is used. In [2], the authors propose to evaluate chance constraints via kernel density estimation, similarly to [3], coupled with Markov Chain Monte Carlo sampling. The main challenge in these methods is the high computational effort in calculating probabilities and their gradients.
In deterministic approaches, uncertainties are considered to appear deterministically in the dynamic system. One approach here is semi-infinite optimization [4], wherein constraints are imposed on all realizations of an uncertain parameter, making the solution feasible for a set of disturbances and optimal for the ideal parameter value. However, these problems require high computational effort to be solved due to the complex constraints, including worst-case considerations.
Another approach is sensitivity minimization [5,6], wherein sensitivities are minimized within the cost function, reducing the dependency of a state of interest on an uncertain model parameter. Thereby, the variations of the system states or of the cost under parameter deviations are reduced, i.e., the robustness of the resulting trajectories is increased. In the mentioned studies, sensitivity minimization is applied to a Rayleigh system [5] and to robustify reentry trajectories regarding the heat flux under uncertain air density [6].
It is important to note that the mentioned studies are limited to open-loop optimal control. However, in many applications, it is essential to consider a closed-loop system in the optimization to describe a realistic behavior and exploit the full performance.
In gradient-based optimal control methods, incorporating a closed-loop system model is usually not compatible with the differentiability requirements of the optimization methods. Three main approaches to tackle this problem are discussed in [7]. In summary, one approach may be to smooth a given controller for use in optimization. However, the controller behavior may not be perfectly reproduced after modification, as discussed in [8], and the resulting trajectories are then limited to this one control strategy. Another approach is presented in [9,10], wherein an explicit controller is designed in the optimization process considering Lyapunov and eigenvalue stability constraints. The advantage here is that specific requirements for the controller can be set during optimization. Nevertheless, the procedure to obtain the optimal trajectory and control design highly depends on the considered system.
Another challenge is the consideration of stability of linear time-variant (LTV) systems. There are stability criteria, e.g., the fulfillment of the Riccati equation [11,12]. Nevertheless, the incorporation into an optimization problem is not straightforward since the Riccati equation needs to be solved.
An optimization-based state feedback approach is Model Predictive Control (MPC) [13]. Here, cyclic re-optimization of the controls starting from the current state yields a feedback control law. By this, deviations from the optimal trajectory can be taken into account and can ideally be corrected. However, re-solving the optimization problems can be computationally expensive. Instead of recalculating the optimal solution, post-optimal sensitivities can be used to linearly approximate the state and control history for solution updates as in [14]. Here, the post-optimal sensitivities describe the change in the optimal solution with respect to changes in the parameter values.
An approach combining sensitivity minimization with feedback is presented in [15,16]. Here, the open-loop sensitivity minimization approach is extended by control sensitivity terms regarding state feedback, such that the controls can be updated linearly according to state feedback. The control sensitivities are subject to optimization. The effectiveness of the method is presented in an academic example. The authors propose only the state feedback incorporation in case a feedforward cannot be determined, as parameter perturbations may not be measurable.
The approach of a linear control update with optimized gains is advantageous since, in the early development stages of new systems, performance assessment via trajectory optimization is possible without the actual usage of a fully developed controller. Furthermore, it provides a holistic optimization of trajectories and controller gains, which fully exploits system performance.
In the paper at hand, the sensitivity minimization approach is combined with state feedback and predicted feedback under the assumption that parameter values can be estimated during flight. In detail, generic control update terms considering state feedback and predicted feedback are introduced into the trajectory optimization problem, where they appear in the sensitivity differential equations. Thereby, the update terms improve sensitivity reduction and trajectory robustness compared to open-loop considerations. The proposed method is applied to a practically relevant UAV trajectory optimization problem of high dimension and is validated by closed-loop simulations.
This study is structured as follows. In Section 2, the theoretical background of robust open-loop optimal control using sensitivity minimization is presented. Afterwards, the closed-loop sensitivity minimization approach incorporating control updates is introduced in Section 3. An application example for UAV robust trajectory optimization is presented in Section 4. This study is concluded in Section 5.

Robust Open-Loop Optimal Control with Sensitivity Minimization
In this section, an open-loop optimal control problem formulation with sensitivity penalty is presented according to [5,6,17]. The considered time horizon of the trajectories to be optimized is denoted by T = [t_0, t_f] ⊆ R. Let x : T → R^{n_x} be the state history and u : T → R^{n_u} the control history. The parameter vector is denoted by p ∈ R^{n_p}. Then, the robust open-loop optimal control problem (1)-(5) minimizes a cost function with sensitivity penalty subject to the dynamic constraints with f : R^{n_x} × R^{n_u} × R^{n_p} → R^{n_x}, including the sensitivity differential equation (3), the initial and final boundary conditions with ψ : R^{n_x} × R^{n_x} → R^{n_ψ}, and inequality constraints for all t ∈ T with c_{ineq} : R^{n_x} × R^{n_u} → R^{n_ineq}. Here, J represents a Mayer cost function, which depends only on initial and final states and the free final time. The cost function here is formulated with a sensitivity penalty with weights w_{i,j} ∈ R, which can be chosen to find a trade-off between optimality and robustness. The higher the sensitivity weights, the higher the robustness. Please note that increased robustness usually leads to increased nominal costs J compared to the minimal cost value resulting from only minimizing J. Solving the robust optimal control problem delivers an open-loop control history u, which can be utilized to obtain a state history x that is less prone to deviations in the parameters p, as studied in [5,6]. It is worth noting that the open-loop sensitivity does not include any terms describing the controls' dependencies on parameters.
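As a reading aid, the structure of problem (1)-(5) can be sketched as follows. This is a plausible form assembled from the definitions in the text and not a verbatim reproduction. In particular, the integral quadratic form of the sensitivity penalty, the symbol S for the sensitivity matrix, and the equality form of the boundary conditions are assumptions; the text only states that sensitivities are penalized with weights w_{i,j} and that (3) is the sensitivity differential equation.

```latex
\begin{align}
\min_{x,\,u,\,t_f}\quad
  & J\bigl(x(t_0), x(t_f), t_f\bigr)
    + \sum_{i=1}^{n_x}\sum_{j=1}^{n_p} w_{i,j}
      \int_{t_0}^{t_f} S_{i,j}(t)^2 \,\mathrm{d}t \\
\text{s.t.}\quad
  & \dot{x}(t) = f\bigl(x(t), u(t), p\bigr), \\
  & \dot{S}(t) = \frac{\partial f}{\partial x}\, S(t)
      + \frac{\partial f}{\partial p},
    \qquad S(t) = \frac{\partial x(t; p)}{\partial p}, \\
  & \psi\bigl(x(t_0), x(t_f)\bigr) = 0, \\
  & c_{\mathrm{ineq}}\bigl(x(t), u(t)\bigr) \le 0
    \quad \text{for all } t \in T .
\end{align}
```

The sensitivity equation contains no term of the form (∂f/∂u)(∂u/∂p), which reflects the statement that the open-loop sensitivity does not include the controls' dependencies on parameters.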
In the following section, the open-loop approach is extended to closed-loop modeling to increase robustness.

Robust Closed-Loop Optimal Control with Sensitivity Minimization
Since, in practice, a closed-loop system is considered, it is meaningful to also consider a closed-loop system in robust optimal control. As discussed in the literature review in Section 1, few approaches exist to incorporate feedback into the optimization routine due to demanding differentiability requirements. In this section, an extension of the open-loop sensitivity minimization approach presented in Section 2 is given by modeling a linear control update with state feedback and predicted feedback. The idea of a linear feedback controller is based on [15,16] and is extended by a feedback prediction. The main advantage is that a realistic system behavior is mapped by the closed-loop system, which increases the usability of the optimized trajectories in real-life scenarios. Furthermore, the proposed approach holistically optimizes trajectories and controller gains, exploiting the full performance spectrum of the system. The extension to the predicted feedback offers more flexibility in using either actual system feedback or the prediction when uncertainties can be estimated during flight.

Sensitivity Formulation with Feedback
A common approach to design a linear feedback control law is presented in the following. Let P ⊂ R^{n_p} be the set of possible parameter realizations and p_0 ∈ P be the fixed nominal parameter vector. The optimal open-loop control for problem (1)-(5) is denoted by u(·; p_0) : T → R^{n_u}. The control update u : T × R^{n_x} × P → R^{n_u} is based on a first-order Taylor approximation around the nominal solution, where x(·; p) : T → R^{n_x} describes the perturbed state trajectory at parameter value p. In the approach proposed by [15,16], new control variables K_x : T → R^{n_u × n_x}, which represent the control sensitivities, are introduced using the fact that the state feedback is the deviation that is to be eliminated, and hence, a feedforward can be omitted. This enables the determination of an optimal control update rule u_x(t, x; p) = u(t; p_0) + K_x(t)(x(t; p) − x(t; p_0)) with u_x : T × R^{n_x} → R^{n_u}. Thereby, the sensitivity differential equation in (3) is extended by a feedback term containing the gains K_x. Additionally, problem (1)-(5) is extended by limit constraints on the newly introduced control variables, i.e., K_{x,lb} ≤ K_x(t) ≤ K_{x,ub} (11) with K_{x,lb}, K_{x,ub} ∈ R^{n_u × n_x}. Since the gains K_x can be considered as design parameters, the limits can be chosen iteratively until satisfactory results are obtained to meet the control constraints given in (5).
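The effect of the gains on the sensitivity dynamics can be illustrated on a scalar toy system. The sketch below is not the model of this study: it assumes the hypothetical dynamics f(x, u, p) = −p x + u with nominal control u = 0, and it assumes that differentiating the closed-loop dynamics with respect to p yields S' = (∂f/∂x + ∂f/∂u K_x) S + ∂f/∂p, which follows from the linear update rule when the nominal trajectory does not depend on p. The function name and all numerical values are illustrative.

```python
import math


def simulate_sensitivity(k_x, p0=1.0, x_init=1.0, t_f=2.0, n_steps=2000):
    """Integrate the nominal state x and the sensitivity S = dx/dp with RK4.

    Assumed toy dynamics: f(x, u, p) = -p * x + u, nominal control u = 0.
    Assumed closed-loop sensitivity: S' = (f_x + f_u * k_x) * S + f_p.
    Setting k_x = 0 recovers the open-loop sensitivity equation.
    """
    def rhs(x, s):
        f_x, f_u, f_p = -p0, 1.0, -x
        return -p0 * x, (f_x + f_u * k_x) * s + f_p

    h = t_f / n_steps
    x, s = x_init, 0.0  # fixed initial state, hence S(t_0) = 0
    times, sens = [0.0], [s]
    for i in range(n_steps):
        k1x, k1s = rhs(x, s)
        k2x, k2s = rhs(x + 0.5 * h * k1x, s + 0.5 * h * k1s)
        k3x, k3s = rhs(x + 0.5 * h * k2x, s + 0.5 * h * k2s)
        k4x, k4s = rhs(x + h * k3x, s + h * k3s)
        x += h / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        s += h / 6.0 * (k1s + 2.0 * k2s + 2.0 * k3s + k4s)
        times.append((i + 1) * h)
        sens.append(s)
    return times, sens


# Compare the peak sensitivity without feedback and with a constant gain.
_, s_open = simulate_sensitivity(k_x=0.0)
_, s_closed = simulate_sensitivity(k_x=-4.0)
peak_open = max(abs(v) for v in s_open)
peak_closed = max(abs(v) for v in s_closed)
print(peak_open, peak_closed)
```

For this toy system the open-loop sensitivity has the closed form S(t) = −t exp(−t), so the stabilizing gain visibly lowers the peak. In the optimization described above, K_x(t) is time-dependent and bounded by (11) rather than fixed in advance.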

Closed-Loop Sensitivity Minimization Problem with Predicted Feedback
In case the values of the uncertain parameters can be determined during flight, the approach of updating controls presented in Section 3.1 can be reformulated by making use of the fact that the states x(t; p) under parameter influences can locally be approximated with the help of the sensitivities from (9). The approximation is given by x(t; p) ≈ x(t; p_0) + S(t)(p − p_0) (12). The replacement in (8) leads to the control update formulation u_p(t; p) = u(t; p_0) + K_x(t) S(t)(p − p_0) (13) with u_p : T × P → R^{n_u}. Compared to a feedforward term (∂u(·; p_0)/∂p)(p − p_0), which is not utilized in this study, the predicted feedback term has time-varying limits under (11), and can be interpreted to be weighted with the sensitivities. The less the state is expected to deviate, the smaller the effective gain limit, which reduces the control effort, as smaller sensitivities result in smaller corrections. Furthermore, this formulation provides flexibility in using the predicted or actual feedback due to the relation in (12). This can be useful in cases wherein there is a loss of measurements, e.g., loss of GPS signals due to jamming, where the control can still be updated based on previous estimations. From an implementation perspective, this formulation eases the transfer of the nominal control limits given in (5) to the updated control (13), as the values for the control update, namely the sensitivities and upper and lower bounds of the parameters, are available within the optimization. Finally, note that, in the feedback implementation (8), a full state feedback is assumed, whereas in the predicted feedback approach (13) all that is needed is a good estimation of the constant vector p.
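The relation between the two update rules can be made concrete with a small numerical sketch. It assumes the linear state prediction x(t; p) ≈ x(t; p_0) + S(t)(p − p_0) and evaluates both updates at a single time instant. All helper names and numerical values are hypothetical and chosen only for illustration.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def vec_add(a, b):
    return [ai + bi for ai, bi in zip(a, b)]


def vec_sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]


def control_update_feedback(u_nom, gain, x_meas, x_nom):
    """State feedback update: u_nom + K_x (x - x_nom)."""
    return vec_add(u_nom, mat_vec(gain, vec_sub(x_meas, x_nom)))


def control_update_predicted(u_nom, gain, sens, p_est, p_nom):
    """Predicted feedback update: u_nom + K_x S (p - p_0)."""
    predicted_deviation = mat_vec(sens, vec_sub(p_est, p_nom))
    return vec_add(u_nom, mat_vec(gain, predicted_deviation))


# Hypothetical values at one time instant (n_x = 2, n_u = 1, n_p = 1).
u_nom = [0.4]
x_nom = [10.0, 3.0]
gain = [[-0.5, 0.2]]    # K_x(t)
sens = [[2.0], [-1.0]]  # S(t)
p_nom, p_est = [0.015], [0.0195]

# If the true state equals the linear prediction, both updates coincide.
x_pred = vec_add(x_nom, mat_vec(sens, vec_sub(p_est, p_nom)))
u_fb = control_update_feedback(u_nom, gain, x_pred, x_nom)
u_pred = control_update_predicted(u_nom, gain, sens, p_est, p_nom)
print(u_fb, u_pred)
```

The predicted update needs only an estimate of p and the stored S(t) and K_x(t), whereas the feedback update needs a measurement of the full state.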
Under the assumption that a bounded set of uncertainty realizations is considered with p_lb ≤ p ≤ p_ub and that the worst-case trajectory deviations are realized at these bounds, the control updates (13) evaluated at p_lb and p_ub are constrained to the nominal control limits from (5) for all t ∈ T, which yields the constraints (14) and (15).
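A check of such worst-case constraints on a time grid can be sketched as follows for a single control and a single uncertain parameter. The sketch assumes that the constraints take the form u_lb ≤ u(t; p_0) + K_x(t) S(t)(p_b − p_0) ≤ u_ub for p_b ∈ {p_lb, p_ub}. The function name, the scalar product K_x(t) S(t) passed in as `gain_times_sens`, and all numbers are illustrative.

```python
def worst_case_updates_feasible(u_nom, gain_times_sens, p_nom, p_lb, p_ub,
                                u_lb, u_ub):
    """Return True if the predicted control update stays within [u_lb, u_ub]
    at both parameter bounds for every grid point (scalar control and
    scalar parameter assumed)."""
    for u_k, ks_k in zip(u_nom, gain_times_sens):
        for p_bound in (p_lb, p_ub):
            u_updated = u_k + ks_k * (p_bound - p_nom)
            if not (u_lb <= u_updated <= u_ub):
                return False
    return True


# Hypothetical nominal control and gain-sensitivity products on three nodes.
u_nominal = [0.5, 0.6, 0.9]
feasible = worst_case_updates_feasible(
    u_nominal, [10.0, -20.0, 20.0],
    p_nom=0.015, p_lb=0.0105, p_ub=0.0195, u_lb=0.0, u_ub=1.0)
infeasible = worst_case_updates_feasible(
    u_nominal, [10.0, -20.0, 30.0],
    p_nom=0.015, p_lb=0.0105, p_ub=0.0195, u_lb=0.0, u_ub=1.0)
print(feasible, infeasible)
```

Within the optimization these conditions would be imposed as path constraints rather than checked afterwards; the check above only illustrates which quantities enter.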

Application to UAV Trajectory Optimization
The approaches from Sections 2 and 3 are applied to a UAV climb trajectory optimization problem wherein flight time and sensitivities are minimized. The dynamic model of a fixed-wing UAV is based on [18], where x, y, h describe the positional coordinates, χ the course angle, γ the flight path angle and V the velocity. Furthermore, m is the mass of the vehicle, g the gravitational acceleration and µ the bank angle. The thrust, lift and drag are modeled with δ_T being the thrust lever position, T_max the maximum thrust of the vehicle, ρ the air density, S the reference area, C_{D0} the zero-lift drag coefficient, k the induced drag factor and C_L the lift coefficient. The values of the model parameters are given in Table 1. The state and control vectors are defined from these quantities. The maneuver is a minimum-time climb after transition to start a particular mission at a given altitude, i.e., the final time is minimized such that the initial and final boundary conditions in Table 2 are fulfilled.
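For orientation, a common three-degree-of-freedom point-mass model that uses exactly these symbols can be sketched as below. This is an assumption about the model structure and not necessarily the formulation of [18]: the thrust is taken to act along the velocity vector, the drag polar is assumed quadratic, and the control vector is assumed to be (C_L, µ, δ_T). The parameter values are hypothetical, since Table 1 is not reproduced here.

```python
import math


def uav_point_mass_rhs(state, control, params):
    """Right-hand side of an assumed 3-DOF point-mass model.

    state   = (x, y, h, chi, gamma, V)
    control = (C_L, mu, delta_T)   # assumed control vector
    """
    _, _, _, chi, gamma, V = state
    C_L, mu, delta_T = control
    m, g = params["m"], params["g"]

    q_S = 0.5 * params["rho"] * V ** 2 * params["S"]  # dynamic pressure * area
    thrust = delta_T * params["T_max"]
    lift = q_S * C_L
    drag = q_S * (params["C_D0"] + params["k"] * C_L ** 2)

    return (
        V * math.cos(gamma) * math.cos(chi),                       # x_dot
        V * math.cos(gamma) * math.sin(chi),                       # y_dot
        V * math.sin(gamma),                                       # h_dot
        lift * math.sin(mu) / (m * V * math.cos(gamma)),           # chi_dot
        (lift * math.cos(mu) - m * g * math.cos(gamma)) / (m * V), # gamma_dot
        (thrust - drag) / m - g * math.sin(gamma),                 # V_dot
    )


# Hypothetical parameters; only C_D0 = 0.015 is consistent with the
# uncertainty interval stated in the text.
params = {"m": 10.0, "g": 9.81, "rho": 1.225, "S": 1.0,
          "C_D0": 0.015, "k": 0.05, "T_max": 50.0}

# Steady level flight: lift balances weight, thrust balances drag.
V_trim = 20.0
q_S_trim = 0.5 * params["rho"] * V_trim ** 2 * params["S"]
C_L_trim = params["m"] * params["g"] / q_S_trim
drag_trim = q_S_trim * (params["C_D0"] + params["k"] * C_L_trim ** 2)
rhs_trim = uav_point_mass_rhs(
    (0.0, 0.0, 100.0, 0.0, 0.0, V_trim),
    (C_L_trim, 0.0, drag_trim / params["T_max"]),
    params)
print(rhs_trim)
```

In this structure C_{D0} enters only the velocity equation directly, which is in line with the later observation that the drag coefficient acts on the other states through V.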
Furthermore, flight time is limited to 10 s to allocate enough flight time for the mission. This optimal control problem is solved with FALCON.m [19], which is an optimal control framework in MATLAB. It is based on direct collocation methods [17] using trapezoidal integration.
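The principle of trapezoidal collocation can be sketched independently of any particular tool. The following generic illustration is not the FALCON.m interface: the state and control are discretized on a time grid, and the dynamics are enforced by driving the defects x_{k+1} − x_k − (h_k/2)(f_k + f_{k+1}) to zero. The function name and the scalar test dynamics are assumptions.

```python
import math


def trapezoidal_defects(f, t, x, u):
    """Defects of the trapezoidal rule for scalar dynamics x' = f(x, u).

    In direct collocation these defects are imposed as equality
    constraints (defect = 0) on the discretized states and controls.
    """
    defects = []
    for k in range(len(t) - 1):
        h = t[k + 1] - t[k]
        f_k = f(x[k], u[k])
        f_k1 = f(x[k + 1], u[k + 1])
        defects.append(x[k + 1] - x[k] - 0.5 * h * (f_k + f_k1))
    return defects


# Example: x' = -x + u with u = 0; the exact solution exp(-t) leaves only
# the small discretization error of the trapezoidal rule as defect.
t_grid = [0.1 * k for k in range(11)]
x_exact = [math.exp(-t) for t in t_grid]
u_zero = [0.0] * len(t_grid)
defects = trapezoidal_defects(lambda x, u: -x + u, t_grid, x_exact, u_zero)
print(max(abs(d) for d in defects))
```

A nonlinear programming solver then adjusts the node values of x and u until all defects vanish while the cost is minimized.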
Since aerodynamic parameters, in this case the drag coefficient C_{D0}, are usually estimated based on wind tunnel and CFD data, they may be subject to uncertainty. It is assumed that the drag coefficient C_{D0} can deviate by up to 30 % and that the most probable values of the realizations of C_{D0} are in the interval P = [0.0105, 0.0195].
If the actual parameter values differ from the ideal values assumed in the model, the actual trajectory using the optimal controls may significantly change. For this reason, the robust optimal control methods presented in Sections 2 and 3 are applied. Robustness of the along-track position may be essential for specific scenarios, e.g., to ensure separation between vehicles and reduce the collision risk. Therefore, the aim is to minimize the deviations in the positional state x with respect to the parameter C_{D0} by minimizing the sensitivity S_x = ∂x/∂C_{D0}. For validation, simulations with worst-case deviations in C_{D0} are conducted. In detail, the following methods are applied and compared.

Case 1: Problem Formulation for Open-Loop Sensitivity Minimization
In order to robustify the trajectory in Section 4.1, open-loop sensitivity minimization as presented in Section 2 is conducted as a benchmark for the proposed methods. Therefore, the cost function (29) augments the final time with a weighted penalty on the sensitivity of the x-position. The constraints regarding the states and controls are equal to those of case 0 in Section 4.1.

Case 2: Problem Formulation for Closed-Loop Sensitivity Minimization with Feedback
To improve robustness, the method using state feedback as in Section 3.1 is applied. The cost function is given by (29) and is the same as in the open-loop sensitivity minimization case 1. New controls are introduced, namely the time-dependent gains K_x(t) ∈ R^{n_u × n_x}. They are limited by lower and upper bounds (30) in order to map the limitations of the controller. The post-optimal simulations are conducted with the updated control (8). An estimation of feasible gain limits can be made by analyzing the magnitude of the estimated deviation S(t)(p − p_0).

Case 3: Problem Formulation for Closed-Loop Sensitivity Minimization with Predicted Feedback
Under the assumption that the drag coefficient can be estimated during flight by energy estimations, the predicted feedback formulation is applied. Therefore, similarly to the feedback case in Section 4.3, the gains K_x(t) ∈ R^{n_u × n_x} are introduced with the same bounds as in (30). In this case, the control update in (13) is utilized in the post-optimal simulations. Furthermore, control update constraints as in (14) and (15) are imposed with the same values as in Table 2. Analogously, the same rate constraints as for the nominal controls are applied to the worst-case control updates.
An overview of the compared cases is given in Table 3.

Numerical Results
The optimization using the open-loop and closed-loop sensitivity minimization leads to the optimal state and control trajectories depicted in Figure 1 and Figure 2, respectively. In the open-loop sensitivity minimization case 1, the optimization can find optimal controls that reduce the sensitivity of the x-position and, due to system interdependencies, of the other system states (see Figure 3a). The sensitivity reduction can also be observed in the simulated trajectories under variations in C_{D0} (see Figure 1). In the nominal case 0, the variations in the x-position are up to 3 m, whereas in the open-loop sensitivity-minimal case 1, the values vary by up to 1 m. The direct influence of C_{D0} on V, which directly influences the other system states, suggests that the optimization found a solution reducing the sensitivity of V. Indeed, the solution with a sensitivity penalty on S_V shows the same maneuver structure. This suggests that a reduced velocity is more robust. To meet the final boundary conditions in Table 2, the velocity increases by the end of the maneuver.
Compared to the nominal case 0, where the flight time is 9.05 s, flight time increases to the upper time limit of 10 s when minimizing sensitivities in the open-loop case 1 due to the trade-off between optimality and robustness. The significant increase in the final time can mathematically be explained by the sensitivity having a higher order of magnitude than the final time. For this reason, the sensitivity penalty considerably influences the overall cost (29) and leads to an optimal solution, which may allow for an increased final time compared to the nominal case. Please note that, for increased time optimality, the weights of the sensitivities in (29) could be decreased. For comparability to the subsequent closed-loop cases described in Table 3, the weights are fixed for all cases.
In the closed-loop optimal solutions depicted in Figure 2, the sensitivities can further be reduced by a factor of around 10 to 1000. The sensitivity reduction can also be observed in the simulations since the deviations between the simulated trajectories at perturbed values of C_{D0} are decreased. In the feedback case 2 and the predicted feedback case 3, the deviations in the x-position are reduced to below 1 cm. Taking into account that a reasonable estimator can reduce the uncertainty in the drag estimation during flight to 5 %, simulations employing disturbed parameter values for the control updates have been shown to yield x-position deviations of up to 15 cm, hence still showing good performance in terms of robustness. Therefore, the simulations using an ideal parameter estimation are depicted for comparability with the benchmark method. Notably, the closed-loop robust optimal trajectories are similar to the nominal open-loop optimal trajectory without sensitivity minimization. Due to the nature of a closed-loop system, the overall magnitude of sensitivities is smaller, as depicted in Figure 3b, having a negligible impact on the overall cost. Hence, a solution with a nearly minimum final time can be found. Specifically, the flight time for case 2 is 9.05 s and for case 3 is 9.07 s. This indicates that the closed-loop solutions allow for increased optimality while being robust. Furthermore, both closed-loop solutions in Figure 3b show high concordance. This indicates that the state prediction used in the predicted feedback is valid and underscores the effectiveness of the proposed predicted feedback method. The effect of the control update constraints according to (14) and (15) and Table 2 can be observed in the differences in the gains. Since the updated controls in case 2 are already close to fulfilling these constraints, they differ only slightly from the updated controls in case 3. Although the gains vary rapidly over time, the constraints ensure that the updated controls are feasible.

Conclusions
In this study, a methodology for robust trajectory optimization under uncertainties via sensitivity minimization considering state feedback and predicted feedback is presented and applied to a practical trajectory optimization problem for a UAV. Thereby, time-dependent controller gains are optimized together with system state trajectories and a nominal control, where a nominal cost function and sensitivities of a state of interest are minimized. The proposed closed-loop method enhances trajectory robustness with a better trade-off between optimality and robustness, further enabling the efficient inclusion of control update constraints. Further research can be directed to compare the feedback approaches with the inclusion of a pure feedforward term. Moreover, it may be of interest to analyze the potential of the closed-loop approaches in applications considering unstable systems.

Figure 1. Optimal states and controls (solid) for the open-loop cases 0 and 1. The dashed lines are simulated trajectories for cases 0 (grey) and 1 (purple) after worst-case variations of C_{D0} by 30 %.

Figure 2. Optimal states and controls (solid), including the controller gains, for the closed-loop cases 2 and 3. The dashed lines are simulated trajectories for cases 2 (orange) and 3 (cyan) after worst-case variations of C_{D0} by 30 %.

Figure 3. Sensitivities of states with respect to variations in C_{D0} in (a) the open-loop system for the nominal case 0 (grey) and sensitivity-minimized case 1 (purple), and (b) the closed-loop system with feedback (case 2, orange) and predicted feedback (case 3, cyan).

Table 2. State and control limits and initial boundary conditions.

Table 3. Overview of compared cases for robust optimal control of the maneuver given in Section 4.1.