Neural Network and Dynamic Inversion Based Adaptive Control for a HALE-UAV against Icing Effects

Abstract: In the past few decades, in-flight icing has become a common problem for many missions, potentially reducing control effectiveness and flight stability and thereby threatening flight safety. One of the most popular methods to address this problem is adaptive control. This paper establishes a dynamic model of an iced high-altitude long-endurance unmanned aerial vehicle (HALE-UAV) with disturbance and measurement noise. Then, by combining multilayer perceptrons (MLP) with a nonlinear dynamic inversion (NDI) controller, we propose an MLP-NDI controller to compensate for inversion errors online and provide a brief proof of control stability. Two experiments were conducted: first, we compared the MLP-NDI controller with other typical controllers; second, we evaluated its robustness and adaptiveness under different icing conditions. Results indicate that the MLP-NDI controller outperforms the other typical controllers with higher tracking accuracy and exhibits strong robustness in the presence of icing errors and measurement noise, showing strong potential for ensuring flight safety.


Introduction
In-flight icing poses a significant threat to flight safety and can result in fatal accidents. Ice accretion on wings and control surfaces can greatly reduce lift and increase drag, and may even cause structural imbalances [1]. This leads to a decrease in flight stability and controllability [2]. Statistical data from the American Safety Advisor showed that 12% of all weather-related accidents were caused by icing and that 92% of the ice-induced accidents involved in-flight icing [3].
To mitigate the risks of in-flight icing, two main approaches are currently used in practice. The first approach provides pilots with detailed weather information before the flight mission to avoid potential icing conditions. The second approach relies on classical icing protection systems (IPSs), which employ deicing and anti-icing methods to remove or inhibit ice accretion, respectively [4]. Various deicing methods have been incorporated into IPSs, including bleeding hot engine exhaust to counteract frigid icing conditions and using inflatable boots to break off the accumulated ice. Chemical reagents have been considered an effective anti-icing method in recent years [5]. Despite the implementation of various engineering measures, accidents resulting from in-flight icing continue to occur. In October 1994, an accident involving an American Eagle ATR-72 near Roselawn, Indiana, resulted in the death of 68 individuals [6]. Similarly, China Eastern Airlines Flight 5210 (CRJ-200) crashed after takeoff in Baotou City in November 2004 and left 55 dead. In 2009, Air France Flight 447 (A330) crashed over the Atlantic Ocean; all the passengers on board were lost [7,8]. These accidents highlight the inadequacy of relying solely on IPSs for ensuring flight safety. A four-year tailplane icing program (TIP) was cosponsored by NASA and the FAA after the ATR-72 accident [9], which led to the proposal of the icing management system (IMS) by Bragg et al. [10]. An IMS first takes into account the deterioration of aerodynamic derivatives and can automatically activate the traditional IPS devices. It also introduces a configuration into the flight control laws to restrict maneuvers within a proper margin of safety.
For HALE-UAVs, ice accretion occurs during the climb stage of flight, particularly below an altitude of 25,000 feet. These aircraft often fly through areas with high humidity or sufficient water content in the clouds, which are naturally high-risk icing regions or strong icing conditions [11]. For these aircraft, the IPS cannot be activated by a human pilot; it relies only on the IMS. Inspired by Bragg, recent research on HALE-UAV icing has primarily focused on two key areas. The first area involves online estimation of the effects of ice accretion [12], while the second involves developing an ice-tolerant control law. Since the icing effects can be considered part of the modeling error in many ways, the use of fault-tolerant control (FTC) can bring better results [13]. Jiang and Yu divided FTC into passive and active types and compared them [14]. Passive approaches such as robust control are relatively easy to implement, but can only handle limited faults. Active approaches, on the other hand, perform better when dealing with various faults, and a lot of research has therefore been conducted on them [15][16][17][18][19][20][21][22][23][24]. Ru proposed a multiple model control method that utilizes a finite set of linear models to express the system in different conditions [16], together with an online processor that determines which model to use based on Kalman filters. Furthermore, Verhaegen et al. discussed three typical multiple model controllers [17], but the oscillation problem in model switching still needs further improvement. Shtessel constructed a two-loop cascade structure of sliding mode control (SMC) by using standard sliding mode functions [18]. It is theoretically able to handle all structural errors smaller than the assumed prior uncertainty, although limitations still exist. For instance, there must be one, and only one, control surface for every controlled variable, and its loss cannot be tolerated [17]. With the enhancement of hardware computing power in recent years, model-predictive control (MPC) has become more popular. It can effectively control a multi-input multi-output (MIMO) system with constraints as long as a reference model exists [19]. However, obtaining an accurate system model without modeling errors remains a critical challenge for MPC.
In recent years, neural networks have also been widely used due to their good global approximation properties for nonlinear functions [20], and are therefore suitable for adaptive compensation. Calise et al. first applied this approach to control a rigid robot arm and successfully incorporated it into a feedback linearization framework [21][22][23], as shown in Figure 1. Shin et al. also proposed a neural-network-based adaptive backstepping controller and achieved good performance [24]. Though feedback linearization is an effective control technology, the linearized model for a six-DOF icing aircraft can be highly time-varying, and the compensation for this approximate linear system can vary significantly between consecutive time steps. To address this issue, this paper combines the nonlinear inversion method with neural networks and divides the entire system into three layers of subsystems. This decoupling stabilizes the ideal compensation and enhances the neural network's learning effect.
To be specific, the highlights of our work are as follows:
1. We established a comprehensive fixed-wing icing model of a HALE-UAV considering wind disturbance and sensor noise.
2. We proposed an ice-tolerant control structure by combining multilayer perceptrons with the nonlinear dynamic inversion method (MLP-NDI controller) to provide robust compensation for the nonlinear and time-varying icing effects.
3. We conducted extensive comparisons between the MLP-NDI controller and three typical controllers, demonstrating its superior performance in terms of stability, accuracy, and robustness.
4. We explored the robustness and verified the effectiveness of the MLP-NDI controller under various icing scenarios.
The remaining parts of this paper are organized as follows. Section 2 analyses the icing effect on flight dynamics based on the DHC-6 nonlinear model. Section 3 gives the overall control architecture of the NDI with MLP compensator and applies the MLP-NDI controller to icing flight control, including the icing effect model. Section 4 provides the simulation results and analysis, which demonstrate the feasibility of the MLP-NDI with a preliminary assessment of its control performance. Finally, a conclusion is presented in Section 5.

Icing Effects on Flight Dynamics
Bragg et al. proposed a classical model of icing effects on a fixed-wing aircraft [25]:
$$C^{iced}_{*} = \left(1 + \eta_{ice}\, k_{C_*}\right) C^{clean}_{*} \qquad (1)$$
where $C^{clean}_{*}$ and $C^{iced}_{*}$ are the same aerodynamic derivative before and after icing, $\eta_{ice}$ is an icing severity factor, and $k_{C_*}$ is the associated slope determined from the parameter being modified.
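As a minimal sketch of how this scaling model can be applied, the snippet below assumes the commonly cited multiplicative form $C^{iced}_{*} = (1 + \eta_{ice} k_{C_*}) C^{clean}_{*}$, derives the slope from a single iced reference point, and evaluates the derivative at another severity. The function names and the numerical values are hypothetical and are not taken from Table 1.

```python
def icing_slope(c_clean, c_iced_ref, eta_ref):
    """Slope k_C implied by one iced reference value measured at severity eta_ref."""
    return (c_iced_ref / c_clean - 1.0) / eta_ref


def iced_derivative(c_clean, k_c, eta_ice):
    """Assumed scaling model: C_iced = (1 + eta_ice * k_C) * C_clean."""
    return (1.0 + eta_ice * k_c) * c_clean


# Hypothetical clean/iced pair for one aerodynamic derivative (illustrative only).
C_CLEAN = 5.0
C_ICED_REF = 4.5
ETA_REF = 0.2

k_c = icing_slope(C_CLEAN, C_ICED_REF, ETA_REF)
c_at_half_ref = iced_derivative(C_CLEAN, k_c, 0.1)  # severity halfway to the reference
```

Because the model is linear in the severity factor, a single reference point is enough to fix the slope, which is why one clean/iced data pair per derivative suffices in this kind of model.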
Considering the lack of available data on icing-related research for fixed-wing UAVs and the similarity in aerodynamic design to fixed-wing aircraft, this paper established the icing model using open-access data of the Twin Otter icing research plane DHC-6 (Figure 2), which has been extensively tested by NASA [26]. Table 1 provides detailed clean and iced parameters for the DHC-6, which were captured at $\eta_{ice} = 0$ and $\eta_{ice} = 0.2$ [5]. $k_{C_*}$ could then be calculated as the associated slope from this particular data point under different icing locations (wing icing, tail icing, and combined wing and tail icing). To accurately simulate the in-flight icing process, we utilized a time-varying model that takes into account the accumulation of in-flight icing with time [27,28].
In this model, $C_\eta$ denotes the conduciveness of the atmosphere to icing. The coefficients $N_1$, $N_2$ are determined from an assumed icing severity profile characterized by the icing duration time, $T_{cld}$, and the final and middle values of the icing severity, $\eta_{ice}(T_{cld})$ and $\eta_{ice}(T_{cld}/2)$, respectively. For all cases in this paper, the conduciveness of the atmosphere to icing is assumed to be a raised cosine. Note that there exists an uncertainty $d_\eta$ in the conduciveness model; here, we consider a zero-bias situation with $d_\eta = 0$.
Besides the clean case (no ice), two typical scenarios are investigated in this paper. Figure 3 shows the change of $\eta_{ice}$ in the different cases. For the moderate icing case, $T_{cld} = 360$ s, $\eta_{ice}(T_{cld}) = 0.2$, and $\eta_{ice}(T_{cld}/2) = 0.12$. For the severe/rapid icing case, $T_{cld} = 120$ s, $\eta_{ice}(T_{cld}) = 0.9$, and $\eta_{ice}(T_{cld}/2) = 0.5$. In both cases, ice accretion starts at $t = 30$ s with combined wing and tail icing, and the whole experiment lasts for 542 s, which is sufficient to fully evaluate the performance of the different controllers.
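The severity profiles can be illustrated with a short sketch. The growth law used here, $\dot{\eta}_{ice} = N_1 (1 + N_2 \eta_{ice}) C_\eta(t)$ with the raised cosine $C_\eta(t) = \tfrac{1}{2}[1 - \cos(2\pi t / T_{cld})]$, is an assumed form chosen to match the description above (two coefficients fixed by the middle and final severities); it is not reproduced from the paper's equations. Under that assumption the profile has a closed form, and the function names are hypothetical.

```python
import math


def fit_severity_profile(t_cld, eta_mid, eta_end):
    """Solve for N1, N2 so that eta(T_cld/2) = eta_mid and eta(T_cld) = eta_end."""
    a = eta_end / eta_mid - 1.0  # equals exp(N1 * N2 * t_cld / 4) under the assumed law
    if a <= 0.0 or abs(a - 1.0) < 1e-12:
        raise ValueError("profile not representable by the assumed growth law")
    n2 = (a - 1.0) / eta_mid
    n1 = 4.0 * math.log(a) / (n2 * t_cld)
    return n1, n2


def icing_severity(t, t_start, t_cld, n1, n2):
    """Closed-form severity; zero before t_start and held constant after the cloud."""
    tau = min(max(t - t_start, 0.0), t_cld)
    # Integral of the raised cosine 0.5 * (1 - cos(2*pi*s / t_cld)) from 0 to tau.
    exposure = 0.5 * (tau - t_cld / (2.0 * math.pi) * math.sin(2.0 * math.pi * tau / t_cld))
    return (math.exp(n1 * n2 * exposure) - 1.0) / n2


# The two scenarios described in the text (icing starts at t = 30 s).
moderate = fit_severity_profile(360.0, 0.12, 0.2)
severe = fit_severity_profile(120.0, 0.5, 0.9)
```

Calling `icing_severity(t, 30.0, 360.0, *moderate)` over the 542 s run then gives a smooth rise to 0.2 that stays flat once the icing encounter ends.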

Nonlinear Dynamics
In this paper, the six-DOF motion equations of a HALE-UAV were established in the body axes [29]. To ensure consistency across all testing cases, the simulations were initialized from a trimmed point of steady, level flight at an altitude of 3000 m and a velocity of 60 m/s. At the start of the simulation, the icing severity of the HALE-UAV was 0, and all icing effects began at $t = 30$ s. Throughout the simulation period, the icing severity was modeled using Equations (2)-(4), with the different trends shown in Figure 3.

Disturbances and Measurement Noise
The effects of disturbance and measurement noise on an icing fixed-wing aircraft were first considered by Bragg et al. [25], after which a lot of further research has been conducted [4,5,27,28,30]. In this paper, we modeled the disturbance as zero-mean, band-limited white Gaussian noise with a 50 Hz bandwidth. Since the linearized transformation between the UAV's wind and body axes is $V \approx u$, $\alpha \approx w/V$, $\beta \approx v/V$, the disturbance is modeled as a perturbation to the velocities in the body axes, with its intensity set to a severe case. Similarly, the measurement noise is also constructed as zero-mean, band-limited Gaussian noise. The noise intensities, which depend largely on the UAV's sensor resolution, were taken from the detailed information on in-flight instruments [26]. Some of these are listed in Table 2.
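One way to generate such a signal is sketched below with NumPy. The brick-wall filtering in the frequency domain, the sampling rate, and the intensity value are assumptions made for illustration; the paper does not state how the band limit was imposed, and the function name is hypothetical.

```python
import numpy as np


def band_limited_gaussian(n_samples, fs, bandwidth, sigma, rng):
    """Zero-mean Gaussian noise with no spectral content above `bandwidth` (Hz)."""
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectrum[freqs > bandwidth] = 0.0  # discard content above the bandwidth
    spectrum[0] = 0.0                  # discard the DC bin so the mean is zero
    noise = np.fft.irfft(spectrum, n=n_samples)
    return sigma * noise / noise.std()  # rescale to the requested intensity


rng = np.random.default_rng(0)
FS = 1000.0  # assumed sampling rate in Hz
# Hypothetical intensity of 0.5 (same units as the perturbed body-axis velocity).
gust = band_limited_gaussian(10_000, FS, 50.0, 0.5, rng)
```

Linear filtering preserves Gaussianity, so the result remains Gaussian while its spectrum is confined to the 0 to 50 Hz band.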
Consider a nonlinear system $\dot{x} = f(x, u)$, where $x \in \mathbb{R}^m$ represents the state vector and $u \in \mathbb{R}^n$ represents the control vector. Define a pseudo-control vector $v$ through the approximate model, $v = \hat{f}(x, u)$. The inverse system can then be written as $u = \hat{f}^{-1}(x, v)$, where $\hat{f}^{-1}$ represents the approximation of the inverse model. Since it is always hard to obtain the explicit model, the inversion error $\Delta$ is defined as the mismatch between the true dynamics $f(x, u)$ and the approximate model $\hat{f}(x, u)$, and the closed-loop dynamic system is driven by the pseudo-control $v$ together with $\Delta$.
Mathematically speaking, the inversion error is a function that is nonlinear, time-varying, and dependent on the command signal, states, and inputs, making it difficult to model precisely under a severe icing scenario. Furthermore, the aerodynamic derivatives of icing conditions in Table 1 can only be used for modeling purposes. Consequently, the control law should be based solely on the parameters of the clean aircraft. It is therefore theoretically challenging to eliminate the inversion error within a dynamic inversion framework alone.
Given the good approximation property of neural networks for continuous nonlinear functions, an MLP can be employed here as an error compensator. By adding an additional term $v_{nn}$ to the pseudo-control signal $v$ at each moment, the inversion error can be sufficiently mitigated. The resulting control structure is depicted in Figure 4. The pseudo-control signal $v$ is then composed of three parts: a derivative term of the reference signal $v_f$, a proportional term of the error $v_p$, and an adaptive compensation term $v_{nn}$. Substituting (11) and (12) into (10) yields the error system. Hence, if the adaptive compensator can fully offset the inversion error, the error system is asymptotically stable.
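The role of the compensation term can be seen in a scalar toy example. The sketch below is not the aircraft model: it assumes a hypothetical plant $\dot{x} = -a\,x|x| + b\,u$ whose coefficient $a$ differs from the value used in the inversion, adopts the sign convention $\Delta = f - \hat{f}$ with $v = v_f + v_p - v_{nn}$, and uses an oracle that returns the exact inversion error in place of a trained MLP.

```python
def simulate_ndi(compensate, a_true=1.5, a_model=1.0, b=1.0, gain=2.0,
                 x_ref=1.0, dt=0.01, t_end=5.0):
    """Euler simulation of NDI tracking a constant reference; returns the final error."""
    x = 0.0
    for _ in range(int(round(t_end / dt))):
        error = x_ref - x
        delta = (a_model - a_true) * x * abs(x)  # inversion error f - f_hat
        v_f = 0.0                                # derivative of the constant reference
        v_p = gain * error                       # proportional term
        v_nn = delta if compensate else 0.0      # oracle stand-in for the MLP output
        v = v_f + v_p - v_nn
        u = (v + a_model * x * abs(x)) / b       # invert the clean (model) dynamics
        x_dot = -a_true * x * abs(x) + b * u     # true plant, equals v + delta
        x += dt * x_dot
    return x_ref - x


error_plain = simulate_ndi(compensate=False)
error_compensated = simulate_ndi(compensate=True)
```

Without compensation the mismatch leaves a persistent offset of about 0.17 in this toy case, whereas exact compensation reduces the error dynamics to $\dot{e} = -K e$, so the error decays exponentially.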

MLP-Based Adaptive Compensator
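In this paper, a multilayer perceptron was applied to reconstruct the inversion error [31]. As shown in Figure 5, the input-output map of the MLP structure can be written as
$$y_k = \sum_{j=1}^{n_2} \left[ w_{jk}\, \sigma\!\left( \sum_{i=1}^{n_1} v_{ij} x_i + \theta_{v_j} \right) \right] + \theta_{w_k},$$
where $k = 1, 2, \ldots, n_3$, and $n_1$, $n_2$, $n_3$ represent the input size, the hidden layer size, and the output size of the neural network, respectively. $w_{jk}$, $v_{ij}$ are the layer weights and $\theta$ is the bias. $\sigma(z)$ is an activation function; here, we chose the sigmoid function, defined as
$$\sigma(z) = \frac{1}{1 + e^{-z}}.$$
In addition, we define two weight matrices $V$, $W$, which collect the layer weights and biases, and two augmented vectors $\bar{x}$, $\bar{\sigma}$, which extend the input and the hidden-layer output with a constant bias entry. Equations (14) and (15) can then be expressed in the compact form $y = W^T \bar{\sigma}(V^T \bar{x})$.
The universal approximation property of neural networks ensures that for any $\varepsilon_0 > 0$ and $x \in D$, where $D$ is a bounded domain, there always exists a set of $V^*$, $W^*$ that satisfies
$$\Delta = W^{*T} \bar{\sigma}(V^{*T} \bar{x}) + \varepsilon,$$
where $\|\varepsilon\| < \varepsilon_0$. This set of $V^*$, $W^*$ is often considered as the ideal weights of the MLP. Therefore, it is always theoretically possible to obtain an ideal estimate of the inversion error, $v^*_{nn} = \Delta$, so as to make the tracking error tend to 0.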
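The single-hidden-layer map described above can be sketched in a few lines of NumPy. The layout is an assumption for illustration: biases are folded into the weight matrices by prepending a constant 1 to the input and to the hidden-layer output, so that $V$ has shape $(n_1 + 1) \times n_2$ and $W$ has shape $(n_2 + 1) \times n_3$. The function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def mlp_forward(x, V, W):
    """Output of a single-hidden-layer perceptron with biases folded into V and W."""
    x_bar = np.concatenate(([1.0], x))          # augmented input, shape (n1 + 1,)
    sigma = sigmoid(V.T @ x_bar)                # hidden-layer output, shape (n2,)
    sigma_bar = np.concatenate(([1.0], sigma))  # augmented hidden output, shape (n2 + 1,)
    return W.T @ sigma_bar                      # network output, shape (n3,)


n1, n2, n3 = 3, 15, 1  # 15 hidden neurons, matching the experiment section
rng = np.random.default_rng(0)
V = rng.standard_normal((n1 + 1, n2))
W = rng.standard_normal((n2 + 1, n3))
v_nn = mlp_forward(np.array([0.1, -0.2, 0.05]), V, W)
```

Folding the biases into the matrices keeps the later weight-update expressions in pure matrix form, which is the usual motivation for the augmented vectors.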

Application to Icing Flight Control
According to the multiscale singular perturbation theory, the flight state variables can be divided into three groups: the fastest variables, the angular velocities $p$, $q$, $r$; the slower variables, the attitude angles $\alpha$, $\beta$, $\mu$; and the slowest variables, the speed and track angles $V$, $\gamma$, $\chi$. Thus, we adopted the control framework shown in Figure 6, where dT, da, de, and dr denote the control inputs of propulsion and deflection angles.
In this structure with time-scale separation, the maneuver generator only calculates the inversion solution from translational displacement to rotational attitude. This subsystem only involves kinematic equations, which are less affected by in-flight icing. Therefore, adopting adaptive controllers in the inner systems is sufficient.
As shown in Figure 7, we added MLP compensators to the channels of $\alpha$, $\beta$, and $\mu$ in the slow attitude system. Taking channel $\alpha$ as an example, the network input and output are defined using $\hat{Z} = \begin{bmatrix} \hat{V}_\alpha & 0 \\ 0 & \hat{W}_\alpha \end{bmatrix}$, where $\|\cdot\|$ denotes the Frobenius norm. The extra term $v_{nn\_\alpha}$ is then added to the original NDI controller before generating the commanded signals $p_d$, $q_d$, $r_d$ for the inner loop [32]. However, we can hardly obtain the value of the inversion error $\Delta$ due to the effects of in-flight icing and the uncertainty of the process and measurements. Thus, it is necessary to update the weight matrices $\hat{V}$, $\hat{W}$ over time to estimate the ideal weights. In the resulting compensation (Equation (23)), $\hat{v}_{nn}$ represents the estimate of $\Delta$ by the MLP at the current time step, and $v_r$ represents a term that provides robustness against the higher-order terms in the Taylor series detailed later. For convenience, let $\hat{\sigma} = \sigma(\hat{V}^T \bar{x})$.
Substituting Equation (23) into (13) yields the error system in terms of the network estimate. In addition, the approximation of $\Delta$ in Equation (20) is considered, together with a Taylor-series expansion of $\sigma^*(z)$ at $z = \hat{z}$, in which $O([V^{*T} - \hat{V}^T]\bar{x})^2$ denotes the higher-order terms of the Taylor series and $\hat{\sigma}'$ is the Jacobian matrix of $\hat{\sigma}$. Substituting (28) into (25) and organizing the result gives an error equation containing the term $\omega$. Lewis et al. give the general bounding expression for $\omega$, where $\gamma = \tilde{x}^T K^{-1}$ denotes the generalized error [31].
Supposing the command signal and the norm of the network weights are bounded, the coefficients in Equation (32) take a corresponding form. Consider a candidate Lyapunov function together with the weight adaptation law, where $\Gamma_w$, $\Gamma_v$ represent the learning rates of the weight matrices and $\lambda$ is the weight between the gradient update and the error update. In order to speed up the convergence of the neural networks, we adjusted the learning rates online with respect to the latest error, where $\eta > 0$ represents the adjustment factor of the learning rate. With the robust term described in (23) designed accordingly, the derivative of the Lyapunov function follows from (32), (34)-(36), and (38). Thus, $a_0 > 0$ implies that $K_{r1} > c_2$, which is sufficient to show that $\dot{L}$ is negative semidefinite when $\|\gamma\| \geq |a_1/a_0|$. Recall that $\gamma = \tilde{x}^T K^{-1}$; thus, $\dot{L} = 0$ is satisfied only when $\tilde{x} = 0$. According to LaSalle's invariance theorem, the tracking error $\tilde{x}(t)$ is asymptotically stable.
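For orientation, the sketch below evaluates one commonly used form of such an adaptation law, following the style of the tuning rules of Lewis et al.: a gradient-like term built from the hidden-layer output and its Jacobian, plus a damping term weighted by $\lambda$ and the norm of the generalized error. It is an assumed form, not the exact law of this paper; in particular the sign convention, the omission of the hidden-layer bias, and the fixed scalar learning rates (no online adjustment by $\eta$) are simplifications, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def weight_derivatives(x_bar, gamma, V_hat, W_hat, rate_w, rate_v, lam):
    """Time derivatives of the weight estimates for one channel (assumed form).

    x_bar: augmented input (n1 + 1,); gamma: generalized error (n3,)
    V_hat: (n1 + 1, n2); W_hat: (n2, n3) (hidden-layer bias omitted for brevity)
    """
    sigma_hat = sigmoid(V_hat.T @ x_bar)                  # hidden output, (n2,)
    sigma_prime = np.diag(sigma_hat * (1.0 - sigma_hat))  # sigmoid Jacobian, (n2, n2)
    gamma_norm = np.linalg.norm(gamma)
    W_dot = rate_w * (np.outer(sigma_hat - sigma_prime @ V_hat.T @ x_bar, gamma)
                      - lam * gamma_norm * W_hat)
    V_dot = rate_v * (np.outer(x_bar, sigma_prime @ W_hat @ gamma)
                      - lam * gamma_norm * V_hat)
    return W_dot, V_dot


x_bar = np.array([1.0, 0.2, -0.3])
V_hat = np.zeros((3, 4))
W_hat = np.ones((4, 1))
W_dot, V_dot = weight_derivatives(x_bar, np.array([2.0]), V_hat, W_hat, 1.0, 1.0, 0.0)
```

In a simulation loop the estimates would be advanced by an explicit step such as `W_hat += dt * W_dot`. The damping term keeps the weights bounded when the error does not vanish, which is the role the text assigns to $\lambda$.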
Similarly, we also adopted MLP compensators in the fast rotational system and obtained the mapping from the reference signals $p_d$, $q_d$, $r_d$ to the controller outputs da, de, dr. The overall structure is shown in Figure 8.

Experiment Evaluation and Comparison
In this section, the effectiveness of the proposed algorithm is demonstrated through experiments consisting of two scenarios. The UAV model with measurement noise proposed in Section 2 was used for simulation under the effects of ice and disturbance. Since ice accretion is a continuous process, a long pentagonal route that includes both longitudinal and lateral maneuvers was chosen as the experiment trajectory. The command signals of $V$, $\gamma$, $\chi$ with respect to time are presented in Table 3. All experiments were implemented in MATLAB and executed on a server with a 2.60 GHz CPU and 16.0 GB of RAM.
In this comparison scenario, all experiments were conducted under the moderate icing case described in Figure 3, where the severity factor $\eta_{ice}$ increased from 0 to 0.2 over a simulation time of 30 to 390 s. To obtain an overall assessment of control performance, three other typical methods were compared with MLP-NDI: MPC, SMC, and L1 adaptive control.

• MPC is widely recognized as a highly effective time-domain controller due to its ability to predict the system's future response and handle various process constraints in a systematic manner. Inspired by Wang [33], we utilized orthogonal basis functions such as Laguerre and Kautz functions to establish the trajectory model of the control input signal $u(t)$. By doing so, we were able to obtain a concise cost function $J$, which could be optimized using quadratic programming (QP) to provide the optimal input series within each time horizon.

• The SMC method is a popular nonlinear control approach known for its robustness and ability to handle modeling error within a certain range [34,35]. In this paper, first- and second-order dynamic sliding mode technologies were employed to construct a sliding surface for the attitude control system [36], which was then used to derive the control law. In addition, a proportional control method with a low-pass filter was introduced outside the attitude loop to enable the tracking of velocity and track angles.

• L1 adaptive control is an adaptive method that can handle system uncertainty and parameter variation with sufficient robustness [37,38]. By designing a PI controller with a state observer using the linear quadratic regulator (LQR) technique, the L1 adaptive control is then applied to the traditional NDI framework to improve the system tracking performance under icing scenarios [39].

• The MLP-NDI controller described in Section 3.3 was tested here and initialized with random weights. The hidden layer consisted of 15 neurons and the adjustment factor of the learning rate was $\eta = 0.01$. The weight matrices were updated online according to Equation (36).
Figure 9 shows the complete tracking trajectories of the command signals and Figure 10 provides the tracking error. Since the wind disturbance and icing effects cannot be modeled in advance, the MPC controller can only make future predictions based on a clean DHC-6 model, resulting in considerable oscillations beyond the 5% error band. Therefore, Table 4 mainly compares the response performance of the remaining three controllers. The metrics used in the comparison, namely, settling time $t_p$, rise time $t_r$, maximum overshoot $\sigma\%$, and steady-state error $e_{ss}$, are listed in the first column. Three step signals were generated for each of the $V$ and $\gamma$ channels, and two ramp signals for the $\chi$ channel, with the arrival time of each command signal displayed in the second row. For ramp signals in particular, the settling time was defined as the time required to achieve a stable slope, while the rise time and maximum overshoot metrics were omitted. Results indicate that the SMC controller has a shorter settling time and rise time than L1 and MLP-NDI in channel $V$, while it has a steady-state error in channel $\gamma$ and is unable to stabilize within the 5% error band, leading to an invalid rise time metric. In addition, each time SMC receives a new command, there is a buffeting effect that results in a large overshoot. Both L1 and SMC show degraded lateral tracking in channel $\chi$, with a steady-state error always present. This suggests the existence of a delay in tracking the ramp signal. In contrast, the MLP-NDI controller is able to adjust its weight matrices to catch up with the reference ramp signal and minimize the steady-state error. The specific reasons behind the different performance characteristics of each controller are analyzed as follows.

• In the case of the MPC controller, the disturbance and measurement noise could not be modeled, which led to oscillations during the optimization of the Laguerre functions [33]. However, since the uncertainty was modeled with a zero-mean Gaussian function, the output of the MPC controller still remained close to the ideal output.

• In the SMC controller, the constructed nonlinear system is different on either side of the sliding mode region, leading to different paths towards the termination point [40]. This can result in buffeting due to the sensitivity of the system. The trajectory of SMC shows that the HALE-UAV experiences a long oscillation above the command signal, while on the other side of the sliding surface, the system responds more sensitively and much more quickly. As a result, the accumulated time of climbing is always larger than that of descending, leading to a steady-state error in channel $\gamma$.

• Due to the effect of the low-pass filter in the L1 adaptive law [41] and also in SMC [36], the tracking of the command ramp signal in channel $\chi$ experiences a short delay compared with MLP-NDI, which results in larger offsets in terms of displacement.

• Since the neural network of MLP-NDI was initialized with random weights, it requires some time to update the weight matrices before converging to a local optimum. Similarly, the adaptability of L1 also comes from its construction of the error system at each time instant; thus, both respond much more slowly than SMC, which mainly relies on the robustness of its default sliding surface.
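The step-response metrics used in the comparison above can be computed from a sampled response as sketched below. The definitions are common textbook choices and are assumptions here: rise time is taken from 10% to 90% of the step, settling time is the first sample after which the response stays within the 5% band, and the steady-state error is read from the last sample. The function name and the first-order test signal are hypothetical.

```python
import numpy as np


def step_metrics(t, y, y_start, y_ref, band=0.05):
    """Rise time, settling time, overshoot (%), and steady-state error of a step response."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    norm = (y - y_start) / (y_ref - y_start)  # 0 at the start, 1 at the command

    reached_90 = norm >= 0.9
    if reached_90.any():
        rise_time = float(t[np.argmax(reached_90)] - t[np.argmax(norm >= 0.1)])
    else:
        rise_time = float("nan")  # response never reaches 90% of the step

    overshoot = max(0.0, float(norm.max()) - 1.0) * 100.0

    outside = np.flatnonzero(np.abs(norm - 1.0) > band)
    if outside.size == 0:
        settling_time = float(t[0])
    elif outside[-1] == t.size - 1:
        settling_time = float("nan")  # response never settles inside the band
    else:
        settling_time = float(t[outside[-1] + 1])

    return {"rise_time": rise_time, "settling_time": settling_time,
            "overshoot_pct": overshoot, "steady_state_error": float(y_ref - y[-1])}


# Hypothetical first-order response to a unit step arriving at t = 0.
t = np.linspace(0.0, 10.0, 10001)
metrics = step_metrics(t, 1.0 - np.exp(-t), y_start=0.0, y_ref=1.0)
```

Times are reported on the same axis as `t`, so passing a time vector that starts at the command arrival yields values relative to that arrival. A response that never enters the error band returns `nan`, which mirrors the invalid rise time reported for SMC in channel $\gamma$.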
Figure 11 depicts the complete tracking trajectories of the command signals, while Table 5 shows the tracking errors in displacement. We divided the case into three phases: before icing (clean), during icing (icing), and after icing (iced). The three phases represent three different system types, respectively: a time-invariant system with known parameters, a time-variant system with unknown parameters, and a time-invariant system with unknown parameters. Results indicate that all controllers achieve sufficient control accuracy for a clean aircraft, but show significant performance degradation in the presence of icing. Consistent with previous results, MPC exhibits the largest tracking error due to its inability to model icing. On the other hand, MLP-NDI has the minimum tracking error and standard deviation in the icing scenario, indicating the best control performance and stability. Notably, the L1 adaptive controller exhibits the minimum average error in the iced case, despite its suboptimal standard deviation.
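The per-phase statistics discussed above can be obtained by splitting the error history at the start and end of ice accretion. The sketch below assumes the moderate case (accretion from 30 s to 390 s within a 542 s run) and uses a synthetic error signal purely for illustration; the function name is hypothetical and no values from Table 5 are reproduced.

```python
import numpy as np


def phase_statistics(t, error, t_ice_start, t_ice_end):
    """Mean and standard deviation of the tracking error in each icing phase."""
    t = np.asarray(t, dtype=float)
    error = np.asarray(error, dtype=float)
    masks = {
        "clean": t < t_ice_start,                      # before icing
        "icing": (t >= t_ice_start) & (t < t_ice_end),  # during ice accretion
        "iced": t >= t_ice_end,                         # after accretion has stopped
    }
    return {name: (float(error[m].mean()), float(error[m].std()))
            for name, m in masks.items()}


# Synthetic displacement error: constant within each phase (illustrative only).
t = np.arange(0.0, 542.0, 1.0)
error = np.where(t < 30.0, 1.0, np.where(t < 390.0, 3.0, 2.0))
stats = phase_statistics(t, error, t_ice_start=30.0, t_ice_end=390.0)
```

Reporting the standard deviation alongside the mean separates a controller that is accurate on average from one that is also steady, which is the distinction drawn between L1 and MLP-NDI in the iced phase.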
In addition, the inner-loop states are shown in Figure 12.Since different controllers have different control strategies every time a command signal comes, SMC exhibits more aggressive behavior compared to L1 and MLP-NDI.Overall, the MLP-NDI controller shows the strongest robustness and best adaptability for in-flight icing scenarios, while maintaining an acceptable rate of convergence.

Scenario 2: Ice-Tolerant Robustness
The second experiment aimed to assess the robustness of the MLP-NDI controller under different levels of icing severity, as described in Figure 3. For the moderate icing scenario, $\eta_{ice}$ increased from 0 to 0.2 within a simulation time of 30 to 390 s, while for the severe/rapid scenario, $\eta_{ice}$ rose from 0 to 0.9 within a simulation time of 30 to 150 s. The tracking trajectories of the command signals are shown in Figure 13 and the tracking error is given in Figure 14. In all cases, the MLP-NDI controller can follow the command while maintaining stability. However, as the severity of icing increased from the clean scenario to the moderate and severe/rapid scenarios, the tracking performance of the MLP-NDI controller gradually deteriorated, with the longitudinal tracking being more affected than the lateral tracking. Figure 15 shows that the compensation term of the MLP in each channel increased significantly as the icing severity and rate increased. Additionally, the response of the inner-loop states to the 50 Hz perturbation and measurement noise is depicted in Figure 16, revealing high-frequency changes. The rotational states show similar trends in the three scenarios, with different amplitudes. Figure 17 gives the overall trajectories and Table 6 shows the tracking error of the clean, moderate icing, and severe/rapid scenarios in terms of displacement. These results confirm the MLP-NDI controller's sufficient robustness to different icing conditions.

Conclusions
In this paper, a novel MLP-NDI controller was proposed and its performance in ice-tolerant control was demonstrated.To implement and test the controller, a DHC-6 model was constructed that includes icing effect, wind disturbance, and measurement noise.In addition to the MLP-NDI controller, three other controllers were also tested on this icing model, and the performance of all controllers was evaluated during various combined maneuvers.
In the aforementioned modeling scenario, the traditional NDI method was found to have a large inversion error caused by model uncertainty. To solve this problem, a compensator based on neural networks was designed for each of the three control channels. Two simulations were conducted with the following purposes: the first compared the MLP-NDI controller's performance with other typical controllers, and the second tested its robustness under three icing scenarios. The results demonstrate that the MLP-NDI controller is capable of adapting to different icing conditions and exhibits strong robustness against in-flight icing.
Future work will involve the application of deep neural networks to the controller, and the consideration of more complex models, such as those involving center of gravity offset and shape asymmetry.In addition, the current controller will be tested and improved in more extreme conditions, including addressing actuator limitations or failures.

Figure 3. Icing severity for the clean and two icing cases.

Figure 7. Structure of MLP-NDI in slow attitude system.

Figure 8. Structure of MLP-NDI in fast rotational system.

Figure 12. Inner-loop states (p, q, r) of different controllers.

Figure 17. Overall flight trajectories under three icing scenarios.

Table 1. Dynamic parameters of Twin Otter in clean and iced configurations.

Table 2. Part of the sensor resolution of Twin Otter.

Table 3. Command signals in simulation.

Table 5. Tracking error in displacement.

Table 6. Tracking error in 3 icing cases.