Application of Norm Optimal Iterative Learning Control to Quadrotor Unmanned Aerial Vehicle for Monitoring Overhead Power System

Abstract: Wind disturbances and noise severely affect Unmanned Aerial Vehicles (UAVs) when monitoring and finding faults in overhead power lines. Accordingly, we propose repetitive learning as a new solution to the problem. In particular, the performance of two Iterative Learning Control (ILC) algorithms based on optimal approaches is examined, namely (i) Gradient-based ILC and (ii) Norm Optimal ILC. Exploiting the repetitive nature of fault-finding tasks for electrical overhead power lines, this study develops, implements and evaluates optimal ILC algorithms for a UAV model. Moreover, we suggest varying the learning gain of the standard optimal algorithms from iteration to iteration instead of heuristically selecting a value from the admissible range. The results of both simulations and experiments with gradient-based and norm optimal ILC reveal that the proposed ILC algorithms deliver not only good trajectory tracking, but also good convergence speed and the ability to cope with exogenous disturbances such as wind gusts.


Introduction
Overhead electrical power lines are a vital component of the power supply infrastructure. It is essential to carry out preventive monitoring of high voltage transmission lines more safely and efficiently in order to meet consumer demand [1][2][3]. UAVs have the potential to perform these monitoring tasks, due to their inherent advantages of cost, manoeuvrability, speed, and easy set-up [4,5]. They can attain the required heights and positions that are needed for performing major inspection tasks and do not require contact. However, UAVs are susceptible to aerodynamic disturbances [6,7]. Because of this, the autonomous identification of faults in terms of performance and accuracy for overhead power lines remains a difficult problem. To ensure the quality of the tasks in fault-finding operations for overhead power lines, precise and accurate trajectory tracking is important [8,9].
The UAV performance relies both on the chosen control strategy and on the underlying vehicle dynamics. Quadrotors are under-actuated and open-loop unstable with significant non-linearity and strong dynamic coupling. Furthermore, exogenous disturbances can easily compromise quadrotor stability [10]. These make the control problem non-trivial. To design and test the control system, models of the dynamics are required. However, the system is subject to modelling, plant and disturbance uncertainties that should be accounted for.
Researchers have developed a range of control methods to overcome these limitations. Some depend on classical control (e.g., PID control), which is simple and capable of providing acceptable performance [11]. Nevertheless, this method needs an accurate mathematical model, as do Linear Quadratic Regulator (LQR) [12] approaches. Others are based on non-linear control such as sliding-mode [13] or back-stepping [14]. These works make use of conventional feedback and feedforward, and improve the control performance by augmenting existing controllers (e.g., PID controllers). However, feedback controllers react only after the effect of a reference change or disturbance has been observed, which results in a delayed tracking response [15]. In these regimes, feedback is unable to react in time.
To achieve high performance, extending classical control methods with learning schemes is another approach. Iterative Learning Control (ILC) algorithms provide some advantages over conventional feedback controllers in that they develop an element of intelligence by learning from previous executions of the task [16]. ILC has been applied to quadrotors using various approaches. These include derivative-type ILC [17], Proportional-Integral-Derivative (PID-type) ILC [18], and basic optimal ILC approaches [19] that assume the reference (r) is specified over the entire finite time horizon, [0, T], and that do not require a model. ILC has also been applied to more complicated systems, such as precise speed-tracking control of a robotic fish [20], multi-agent systems [21], and, recently, the distributed ILC of multiple flexible manipulators in the presence of uncertain disturbances and actuator dead zones [22].
We propose using two ILC methods that are based on optimal approaches. These are the Gradient-based ILC and the Norm Optimal ILC. They are chosen because of their prevalence, their use of the same underlying cost function, and because they have similar implementations [23]. This paper seeks to determine whether the proposed optimal algorithms can be applied to quadrotors to improve tracking performance. Furthermore, it motivates, develops, and evaluates the required flight controller, and assesses whether it can address the deficiencies of the previous methods.
Existing ILC approaches to UAV control are reviewed in the next section. The ILC design, application, and optimal algorithm designs are described in Section 3. A suitable experimental system is selected to provide a test platform, and results are obtained for the proposed G-ILC and NO-ILC algorithms; these are presented in Sections 4 and 5, respectively. Section 6 provides some concluding remarks.

Basic ILC
ILC relies on performing similar missions multiple times, so that the control can be modified through learning to improve performance over previous operations (i.e., trials, iterations, or passes). Non-learning systems, by contrast, do not improve their performance: they produce the same tracking error on each iteration when faced with large model uncertainty and repeating disturbances [23]. Learning-type control strategies can be classified into ILC, Repetitive Control (RC), neural networks, and adaptive control. Whilst ILC strategies modify the input signal (i.e., the control input), adaptive and neural network learning control methods modify the system (i.e., the controller) and the controller parameters, respectively [24]. Additionally, ILC usually achieves fast convergence within just a few iterations, whereas the alternative strategies may not [25].
ILC has been applied to quadrotors in a few cases. It can be used for systems in which a finite-duration task is repeated. Every iteration should have the same initial conditions and, as the number of trials increases, ILC updates the input signal so that the system output converges to a reference signal. ILC has been applied in many fields, including robotics [26].
The simple D-type ILC form used in [27] for UAV trajectory tracking was based on the Additive State Decomposition (ASD) method. The block diagram shown in Figure 1 demonstrates the D-type algorithm, which employs the rate of change of the error, in lieu of the error itself, to modify the input for the next iteration. The derivative part in the ILC algorithm amplifies small noise signals, which may destabilize the system. Ref. [28] has developed another special case of D-type ILC in UAV applications in order to obtain better tracking performance and lower errors than the typical D-type update rule in [27]. However, although both [27] and [28] use the D-type ILC, they still do not guarantee the rate of convergence in the presence of disturbances.
Figure 1. Block diagram for D-type ILC. Here $u_k(n)$ is the input signal used on the $k$th iteration, $L_{opt}$ is the D-type gain, $\dot{e}_k(n)$ is the derivative of the error, and $r(n)$ and $y_k(n)$ are the reference and the plant output, respectively.
In [29,30], the P and D terms are combined as a PD-type ILC to increase the convergence rate. In [29], three different methods are additionally applied: offline ILC, online ILC, and a combination of both. The P-type ILC has the form
$$u_{k+1}(n) = u_k(n) + K_P\, e_k(n)$$
where the subscript $k$ denotes the iteration number and $n$ the sample number. An inner online PD-type ILC update was designed in [29] for quadrotor trajectory tracking control to stabilise the UAV system without taking disturbances into account. The algorithm is
$$u_{k+1}(n) = u_k(n) + K_P\, e_k(n+1) + K_D\,[e_k(n+1) - e_k(n)]$$
The results show a high initial tracking error, which the ILC was able to reduce over subsequent iterations, but with slow convergence. In [30], ILC was augmented with an adaptive term to enhance performance and robustness. This controller was implemented on a quadrotor platform, where the test results showed improvement in tracking performance despite the presence of disturbances. In [31], another ILC form is proposed that extends the previous two simple structures with an integral term and is termed PID-type ILC. A controllable flight was optimized using the PID-type ILC after a change in the mass of the quadrotor. The method is based only on manual auto-tuning of the parameters. In summary, all the previous structures (D-type, PD-type, and PID-type ILC) are susceptible to process disturbance and measurement error, and are rarely utilized in practical applications.
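The PD-type update quoted above, $u_{k+1}(n) = u_k(n) + K_P e_k(n+1) + K_D[e_k(n+1) - e_k(n)]$, can be illustrated with a minimal sketch. The first-order plant, the gains `kp` and `kd`, and the sine reference are all hypothetical choices made for illustration; they are not the quadrotor model or the tuning used in [29].

```python
import math

def simulate(u, a=0.5, b=1.0):
    """Hypothetical first-order plant y(n+1) = a*y(n) + b*u(n) with y(0) = 0."""
    y = [0.0]
    for un in u:
        y.append(a * y[-1] + b * un)
    return y

def pd_ilc(r, iterations=10, kp=0.5, kd=0.2):
    """Repeat the same task and apply the PD-type ILC update after each trial."""
    n_steps = len(r) - 1
    u = [0.0] * n_steps              # input on the first trial
    norms = []                       # 2-norm of the tracking error per trial
    for _ in range(iterations):
        y = simulate(u)              # every trial starts from the same state
        e = [ri - yi for ri, yi in zip(r, y)]
        norms.append(math.sqrt(sum(ei * ei for ei in e)))
        # u_{k+1}(n) = u_k(n) + Kp*e_k(n+1) + Kd*(e_k(n+1) - e_k(n))
        u = [u[n] + kp * e[n + 1] + kd * (e[n + 1] - e[n])
             for n in range(n_steps)]
    return u, norms

N = 50
reference = [math.sin(2 * math.pi * n / N) for n in range(N + 1)]
u_final, error_norms = pd_ilc(reference)
```

For this particular plant and these gains the error norm shrinks on every trial; with other gains or plants the same update may converge slowly or not at all, which is the limitation the text attributes to the simple structures.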
ILC was applied to achieve quadrotor trajectory tracking while balancing an inverted pendulum [32]. The learning algorithm used there is formulated on the lifted system, where F is the lifted system matrix, α weights the additional penalty term, and d k is an updated estimate of the disturbance. Via the matrix D, the input derivatives can be penalized. The matrix S allows the error signal to be scaled or filtered.
The aforementioned approaches are very limited in accuracy. Apart from initial identification procedures and tuning, these approaches do not demand a large level of computation and do not require an explicit model. Although this simplicity makes them easy to use, it necessarily degrades performance. There is a great opportunity to assess a wide variety of ILC approaches on UAVs. No single algorithm delivers all of the features required for high performance control in the face of uncertain dynamics and environmental factors. Overall, ILC approaches demonstrated the best tracking performance only with medium complexity. Relatively few ILC schemes have been applied to quadrotors, and their evaluation is quite limited.

Optimal Approach ILCs
The properties of linear optimal algorithms have been studied extensively [33][34][35][36][37]. Leading ILC examples are now introduced with their own specific features.

Gradient-Based Iterative Learning Control
Due to their attractive theoretical properties, Gradient-based (G-ILC) algorithms have received considerable attention in the literature. When compared to generic ILC approaches, the optimal gradient ILC achieves faster error convergence by relying on the system model and utilizing the characteristics of gradient-descent in order to structure the ILC control action update. In [38][39][40], Gradient ILC has been used for SISO systems and was derived for MIMO systems.
The common form of the standard ILC input update law is specified as [23]
$$u_{k+1}(n) = u_k(n) + L_{opt}\, e_k(n) \tag{5}$$
where $L_{opt}$ is a learning operator, $y_d$ is the desired reference signal, and $e_k = y_d - y_k$ is the error. One way of guaranteeing convergence of Equation (5) is to build the learning operator from the transpose, $G^T$, of the lifted system matrix $G$ (the lifted form of the more general adjoint operator $G^*$). This leads to
$$L_{opt} = \beta G^T \tag{6}$$
which yields the update law
$$u_{k+1} = u_k + \beta G^T e_k \tag{7}$$
where the scalar $\beta$ is the learning gain. Equation (7) is interpreted as the gradient descent solution to the minimisation problem $\min_u J(u_k) = \|y_d - G u_k\|^2$. The spectral radius condition
$$\rho(I - G L_{opt}) < 1 \tag{8}$$
is a necessary and sufficient condition for convergence. Substituting $L_{opt} = \beta G^T$ into the general convergence condition, Equation (8), yields
$$\rho(I - \beta G G^T) < 1 \tag{9}$$
Here, $GG^T$ is positive definite, so the convergence condition becomes
$$0 < \beta < \frac{2}{\bar{\sigma}^2(G)} \tag{10}$$
where $\bar{\sigma}(G)$ is the largest singular value of $G$. Therefore, as the number of trials, $k$, approaches infinity, Equation (10) ensures that the error converges monotonically to zero.
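As a minimal sketch of the gradient update $u_{k+1} = u_k + \beta G^T e_k$, the following uses a hypothetical lifted matrix $G$ built from the Markov parameters $g_j = 0.5^j$ and picks $\beta$ at half of the bound $2/\bar{\sigma}^2(G)$, where $\bar{\sigma}(G)$ is the largest singular value. The plant, horizon, and reference are assumptions made for illustration only.

```python
import numpy as np

def lifted_matrix(markov):
    """Lower-triangular Toeplitz matrix with G[i, j] = markov[i - j]."""
    n = len(markov)
    G = np.zeros((n, n))
    for i in range(n):
        G[i, : i + 1] = markov[i::-1]
    return G

def gradient_ilc(G, y_d, beta, iterations):
    """Run u_{k+1} = u_k + beta * G^T e_k and record ||e_k|| on each trial."""
    u = np.zeros(G.shape[1])
    norms = []
    for _ in range(iterations):
        e = y_d - G @ u                      # lifted tracking error
        norms.append(float(np.linalg.norm(e)))
        u = u + beta * (G.T @ e)             # steepest-descent step
    return u, norms

N = 50
G = lifted_matrix(0.5 ** np.arange(N))       # hypothetical plant, g_j = 0.5**j
y_d = np.sin(2 * np.pi * np.arange(1, N + 1) / N)
beta_max = 2.0 / np.linalg.norm(G, 2) ** 2   # 2 / (largest singular value)^2
u, norms = gradient_ilc(G, y_d, beta=0.5 * beta_max, iterations=20)
```

Because every eigenvalue of $I - \beta GG^T$ then lies in $[0, 1)$, the recorded error norms decrease from trial to trial in this example.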

Norm Optimal ILC
The model-based Norm Optimal ILC (NO-ILC) algorithm was introduced in [41]. The ILC input for the following trial is obtained by optimising a specific performance index that balances error convergence against input energy. NO-ILC has been used in many applications and has been extended, for instance to a predictive approach based on norm-optimal ILC [42]. NO-ILC uses the quadratic cost function
$$J_{k+1}(u_{k+1}) = \|e_{k+1}\|_Q^2 + \|u_{k+1} - u_k\|_R^2 \tag{11}$$
where $\|x\|_W^2 = x^T W x$ and the weighting matrices $R(t)$ and $Q(t)$ are symmetric, with $Q(t)$ positive semi-definite and $R(t)$ positive definite.
The requirement is to minimize the tracking error by modifying the control input from one trial to the next. This is done by generating the control action $u_{k+1}$ for the next trial. The problem at each iteration is thus
$$\min_{u_{k+1}} J(u_{k+1}) \tag{12}$$
which can be solved by differentiating $J_{k+1}$ with respect to $u_{k+1}$ and determining the stationary point, $\partial J_{k+1}/\partial u_{k+1} = 0$. This leads to the update law
$$u_{k+1} = u_k + G^* e_{k+1} \tag{13}$$
where $G^*$ is the adjoint operator of the system $G$, with $G^* = R^{-1} G^T Q$. It can be demonstrated that the convergence condition in Equation (8) is always satisfied and, moreover,
$$\|e_{k+1}\| \le \frac{1}{1 + \sigma^2(GG^*)}\, \|e_k\| \tag{14}$$
where $\sigma^2(GG^*)$ is the smallest eigenvalue of the symmetric, positive definite operator $GG^*$. The update law (13), being non-causal (it depends on $e_{k+1}$), can be manipulated to generate either a causal feed-forward form or a feedback plus feed-forward form, following [43], where
$$e_{k+1} = (I + GG^*)^{-1} e_k \tag{15}$$
Therefore, the update law is
$$u_{k+1} = u_k + G^* (I + GG^*)^{-1} e_k \tag{16}$$
Note that the implementation of optimal algorithms in such cases needs to be investigated. This is all the more important when UAVs are applied to the monitoring of overhead power systems, especially since the quadrotor UAV has more than one degree of freedom. Moreover, the electric power system inspection task is inherently repetitive, which makes it a natural setting in which to apply optimal ILC algorithms, compare their performance critically, and inform design.

ILC Design and Application to Quadrotor
This section puts forward the optimal algorithms (G-ILC, NO-ILC) for the UAV quadrotor. The design of the optimal algorithms is based on the following assumptions and steps:
I. The system is presumed to operate in a repetitive manner (iteratively) for both optimal algorithms, G-ILC and NO-ILC.
II. At the end of every iteration, the state is reset, so that each repetition starts from the same initial condition, independently of the previous operation.
III. A new control signal may be applied on each repetition. A reference signal, $r(t)$, is presumed to be known, and the ultimate control objective is to determine an input function $u^*(t)$ such that the output function satisfies $y(t) = r(t)$ on $[1, N]$.
IV. For G-ILC, the value of the learning gain $\beta_{old}(k)$ is heuristically selected for the first step, and the gain $\beta_{new}(k)$ is then calculated automatically from the varying-gain equations. The same procedure is repeated for NO-ILC, but with a different learning gain, $Q_k$.
V. To guarantee error convergence, the gain is selected with respect to the cost $J(\beta_{new}(k)) = \|e_{k+1}\|^2 + \zeta\, \beta_{new}(k)^2$.
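Now, the SISO model is a discrete state-space system, linear in the state and input:
$$x_k(t+1) = A x_k(t) + B u_k(t), \qquad y_k(t) = C x_k(t) \tag{17}$$
where $x_k(t) \in \mathbb{R}^n$, $u_k(t) \in \mathbb{R}^m$, $y_k(t) \in \mathbb{R}^p$, and $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$ are the system matrices. Moreover, $x_k$, $u_k$, and $y_k$ are the state vector, input, and output, respectively, for trial $k$.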
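A state-space model with system matrices A, B, C and a zero initial state can be stacked over one trial into a lifted matrix $G$ whose entries are the Markov parameters $C A^{i-j} B$. The sketch below assumes a relative degree of one and uses hypothetical matrices chosen only for illustration; it checks that $G u$ reproduces a step-by-step simulation.

```python
import numpy as np

def lifted_from_state_space(A, B, C, horizon):
    """G[i, j] = C A^(i-j) B for i >= j (relative degree 1, zero initial state)."""
    markov = []
    Ak = np.eye(A.shape[0])
    for _ in range(horizon):
        markov.append((C @ Ak @ B).item())   # scalar Markov parameter C A^k B
        Ak = A @ Ak
    G = np.zeros((horizon, horizon))
    for i in range(horizon):
        for j in range(i + 1):
            G[i, j] = markov[i - j]
    return G

def simulate(A, B, C, u):
    """Outputs y(1), ..., y(N) produced by inputs u(0), ..., u(N-1)."""
    x = np.zeros((A.shape[0], 1))
    y = []
    for un in u:
        x = A @ x + B * un
        y.append((C @ x).item())
    return np.array(y)

# Hypothetical second-order SISO example (not the quadrotor model).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
u_test = np.sin(0.3 * np.arange(20))
G = lifted_from_state_space(A, B, C, horizon=20)
```

Writing the trial as $y_k = G u_k$ is what allows the update laws in this section to be expressed with $G^T$ and $GG^T$.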

Gradient-Based ILC (G-ILC)
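When compared to generic ILC approaches, the optimal gradient ILC depends on the system model to obtain faster error convergence. It constructs the ILC control action update using the properties of gradient descent. This happens through minimising the cost function
$$\min_{u_k} J(u_k) = \|e_k\|^2 = \|y_d - G u_k\|^2 \tag{18}$$
where $G$ is the lifted system matrix built from the Markov parameters
$$g_N = C A^{N+r-1} B, \qquad N = 0, 1, 2, \ldots, h-1$$
and the tracking error $e_k$ on the $k$th trial, that is, the error between the actual output $y_k$ of the system and the desired reference signal $y_d$, is
$$e_k = y_d - y_k = y_d - G u_k$$
Using gradient descent to solve the optimisation problem given by Equation (18) gives
$$u_{k+1} = u_k + \beta G^T e_k \tag{24}$$
where $\beta$ represents the learning gain. From Equation (24), the error evolution of the G-ILC can be derived as
$$e_{k+1} = (I - \beta G G^T)\, e_k$$
By choosing the learning gain $\beta$ from the range $0 < \beta < 2/\bar{\sigma}^2(G)$, where $\bar{\sigma}(G)$ is the largest singular value of the matrix $G$, it can be easily shown that $\|I - \beta GG^T\| < 1$. Therefore, the error converges monotonically to zero as the number of trials $k$ goes to infinity. Instead of arbitrarily selecting a value of $\beta_{old}(k)$ from this range, the error convergence rate can be optimized. Repeating the update and the error evolution with an iteration-varying gain $\beta_k = \beta_{new}(k)$ gives
$$u_{k+1} = u_k + \beta_k G^T e_k \tag{28}$$
$$e_{k+1} = (I - \beta_k G G^T)\, e_k \tag{29}$$
and the optimal iteration-varying $\beta_{new}(k)$ can be obtained by minimising
$$J(\beta_k) = \|e_{k+1}\|^2 + \zeta \beta_k^2 \tag{30}$$
where $\zeta$ is a small positive weighting constant. Substituting Equation (29) into Equation (30) we get
$$J(\beta_k) = \|(I - \beta_k GG^T)\, e_k\|^2 + \zeta \beta_k^2 \tag{31}$$
$$= e_k^T e_k - 2\beta_k\, e_k^T GG^T e_k + \beta_k^2\, e_k^T GG^T GG^T e_k + \zeta \beta_k^2 \tag{32}$$
Differentiating Equation (32) with respect to $\beta_{new}(k)$ and equating to zero gives the optimal learning gain:
$$\beta_{new}(k) = \frac{e_k^T GG^T e_k}{\zeta + \|GG^T e_k\|^2} \tag{34}$$
Thus the necessary and sufficient conditions for guaranteeing convergence of the error are
$$\|e_{k+1}\| < \|e_k\| \ \text{ for all } k \ge 0 \quad \text{and} \quad \lim_{k \to \infty} e_k = 0 \tag{36}$$
From Equation (29) we get
$$\|e_{k+1}\|^2 - \|e_k\|^2 = e_k^T \big( (I - \beta_k GG^T)^2 - I \big) e_k \tag{38}$$
$$= e_k^T \big( -2\beta_k GG^T + \beta_k^2 GG^T GG^T \big) e_k \tag{39}$$
Furthermore, from Equation (34), we get
$$\beta_k^2\, e_k^T GG^T GG^T e_k = \beta_k\, e_k^T GG^T e_k - \zeta \beta_k^2 \tag{41}$$
Substituting Equation (41) into Equation (39) gives
$$\|e_{k+1}\|^2 - \|e_k\|^2 = -\beta_k\, e_k^T GG^T e_k - \zeta \beta_k^2 \le 0 \tag{43}$$
where the inequality holds because $\beta_k \ge 0$ by Equation (34). From Equation (43) it can be deduced that $\|e_{k+1}\| = \|e_k\|$ if and only if $\beta_k = 0$. Because $GG^T$ is a positive definite matrix, from Equation (34) we have that $\beta_{new}(k) = 0$ if and only if $e_k = 0$. Thus the conditions of Equation (36) are satisfied and the system has monotonic convergence.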
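The iteration-varying gain can be sketched in a few lines. The code below assumes the gain that minimises $\|e_{k+1}\|^2 + \zeta \beta^2$ with $e_{k+1} = (I - \beta GG^T) e_k$, namely $\beta = e_k^T GG^T e_k / (\zeta + \|GG^T e_k\|^2)$; the lifted matrix, the reference, and the value of `zeta` are hypothetical and chosen for illustration.

```python
import numpy as np

def adaptive_gain(G, e, zeta):
    """Gain minimising ||(I - beta*G*G^T) e||^2 + zeta*beta^2 over beta."""
    w = G @ (G.T @ e)                         # w = G G^T e
    return float(e @ w) / (zeta + float(w @ w))

def g_ilc_adaptive(G, y_d, iterations, zeta=1e-3):
    """Gradient ILC in which the learning gain is recomputed on every trial."""
    u = np.zeros(G.shape[1])
    norms, gains = [], []
    for _ in range(iterations):
        e = y_d - G @ u
        beta = adaptive_gain(G, e, zeta)
        norms.append(float(np.linalg.norm(e)))
        gains.append(beta)
        u = u + beta * (G.T @ e)
    return u, norms, gains

N = 50
idx = np.arange(N)
# Hypothetical lifted plant: lower-triangular Toeplitz with g_j = 0.5**j.
G = np.tril(0.5 ** np.abs(np.subtract.outer(idx, idx)))
y_d = np.sin(2 * np.pi * (idx + 1) / N)
u, norms, gains = g_ilc_adaptive(G, y_d, iterations=15)
```

In this example no gain has to be picked by hand, and the recorded error norms are non-increasing, in line with the monotonic convergence argument above.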

Norm Optimal ILC (NO-ILC)
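To produce the optimal ILC action $u_{k+1}$, we recall Equation (11). Setting the gradient with respect to $u_{k+1}$ to zero gives
$$-G^T Q\, e_{k+1} + R\,(u_{k+1} - u_k) = 0$$
Since the matrix $R$ is positive definite, and therefore non-singular, rearranging gives
$$u_{k+1} = u_k + R^{-1} G^T Q\, e_{k+1} \tag{45}$$
From Equation (45), the error evolves as
$$e_{k+1} = e_k - G R^{-1} G^T Q\, e_{k+1} \quad \Longrightarrow \quad e_{k+1} = (I + G R^{-1} G^T Q)^{-1} e_k \tag{48}$$
However, Equation (45) is implicit, because its right-hand side contains $e_{k+1}$. It can be solved by writing $G^* = R^{-1} G^T Q$ and substituting Equation (48) into Equation (45), which gives
$$u_{k+1} = u_k + G^* (I + GG^*)^{-1} e_k$$
Monotonic convergence can be shown as follows. Because $u_{k+1}$ minimises Equation (11) and $u_{k+1} = u_k$ is an admissible choice,
$$\|e_{k+1}\|_Q^2 \le \|e_{k+1}\|_Q^2 + \|u_{k+1} - u_k\|_R^2 \le \|e_k\|_Q^2$$
with equality only if $u_{k+1} = u_k$, that is, only if $G^T Q\, e_{k+1} = 0$. We get error convergence to zero if either there exists no $e \ne 0$ such that $G^T e = 0$, so that $\lim_{k \to \infty} e_{k+1} = 0$, or if $y_d \in \operatorname{range}(G)$.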
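A minimal sketch of the resulting feed-forward NO-ILC update, $u_{k+1} = u_k + G^*(I + GG^*)^{-1} e_k$ with $G^* = R^{-1} G^T Q$, is given below. The lifted matrix and the weights $Q = I$, $R = 0.1 I$ are hypothetical illustration values, not the weights used in the experiments.

```python
import numpy as np

def no_ilc_operators(G, Q, R):
    """Return G* = R^{-1} G^T Q and the feed-forward gain G* (I + G G*)^{-1}."""
    G_star = np.linalg.solve(R, G.T @ Q)
    gain = G_star @ np.linalg.inv(np.eye(G.shape[0]) + G @ G_star)
    return G_star, gain

def no_ilc(G, y_d, Q, R, iterations):
    """Apply the causal feed-forward NO-ILC update on each trial."""
    G_star, gain = no_ilc_operators(G, Q, R)
    u = np.zeros(G.shape[1])
    inputs, norms = [u.copy()], []
    for _ in range(iterations):
        e = y_d - G @ u
        norms.append(float(np.linalg.norm(e)))
        u = u + gain @ e                      # uses only the previous-trial error
        inputs.append(u.copy())
    return inputs, norms, G_star

N = 30
idx = np.arange(N)
G = np.tril(0.5 ** np.abs(np.subtract.outer(idx, idx)))   # hypothetical plant
y_d = np.sin(2 * np.pi * (idx + 1) / N)
Q = np.eye(N)            # assumed error weight
R = 0.1 * np.eye(N)      # assumed input-change weight
inputs, norms, G_star = no_ilc(G, y_d, Q, R, iterations=10)
```

The feed-forward form needs only the error from the completed trial, yet each step still satisfies the implicit relation $u_{k+1} - u_k = G^* e_{k+1}$. Increasing the ratio of Q to R speeds up convergence at the price of larger input changes between trials.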

Application to Quadrotors
The dynamics of standard quadrotors are well established; the main equations are given here. Details of the dynamic model can be found in [34], for example.
The control inputs are related to each rotor speed Ω i through the rotor thrust and drag coefficients, b and d, respectively, and l is the arm length. The dynamic model for the quadrotor attitude is expressed in terms of the triplet (p, q, r) of rotation rates about the body axes, where I xx , I yy , I zz are the moments of inertia about the body axes, and J p is the rotor moment of inertia about the rotor rotation axis.
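One commonly used convention for relating rotor speeds to the control inputs is sketched below. It is an assumption made for illustration: rotor numbering and sign conventions differ between references, and the coefficient values are hypothetical rather than the Hummingbird parameters.

```python
def rotor_speeds_to_inputs(omega, b, d):
    """Map four rotor speeds (rad/s) to (U1, U2, U3, U4) and the speed sum.

    Assumed convention: rotors 1 and 3 spin opposite to rotors 2 and 4,
    b is the thrust coefficient and d is the drag coefficient.
    """
    w1, w2, w3, w4 = omega
    u1 = b * (w1**2 + w2**2 + w3**2 + w4**2)   # total thrust
    u2 = b * (w4**2 - w2**2)                   # roll input
    u3 = b * (w3**2 - w1**2)                   # pitch input
    u4 = d * (w2**2 + w4**2 - w1**2 - w3**2)   # yaw input
    omega_sum = w2 + w4 - w1 - w3              # residual rotor speed
    return (u1, u2, u3, u4), omega_sum

# Hypothetical coefficients, for illustration only.
B_THRUST = 3.0e-5
D_DRAG = 7.5e-7
hover_inputs, hover_sum = rotor_speeds_to_inputs((400.0,) * 4, B_THRUST, D_DRAG)
roll_inputs, _ = rotor_speeds_to_inputs((400.0, 390.0, 400.0, 410.0), B_THRUST, D_DRAG)
```

With equal rotor speeds only the thrust input is non-zero; speeding up rotor 4 and slowing rotor 2 by the same amount produces a roll input while leaving the pitch input at zero.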
We define a state variable vector in which the triplet $(x, y, z)$ is the position of the vehicle in the earth axes, and $(\phi, \theta, \psi)$ are the standard aerospace Euler angles. By approximating the rotation rate triplet $(p, q, r)$ by the Euler angle derivatives $(\dot{\phi}, \dot{\theta}, \dot{\psi})$ and using the standard aeronautics navigation equations, we get the dynamic model, in which $m$ is the mass, $g$ is the gravitational acceleration, $\Omega$ is the rotor rotation rate sum, $a_1 = (I_{yy} - I_{zz})/I_{xx}$, $a_2 = J_p/I_{xx}$, $a_3 = (I_{zz} - I_{xx})/I_{yy}$, $a_4 = J_p/I_{yy}$, $a_5 = (I_{xx} - I_{yy})/I_{zz}$, $b_1 = l/I_{xx}$, $b_2 = l/I_{yy}$, and $b_3 = l/I_{zz}$.
The SISO structure of Equation (17) is extended to MIMO dynamics to give
$$x_k(n+1) = f(x_k(n), u_k(n))$$
The model $\dot{x} = f(x, u)$ can be discretized by an Euler approximation. Full state feedback is assumed, that is, $y_k = x_k$.
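The Euler approximation mentioned above can be sketched as $x(n+1) = x(n) + T_s f(x(n), u(n))$. The attitude model used here is an assumed form built from the coefficients $a_1, \ldots, a_5$ and $b_1, b_2, b_3$; its sign conventions follow one common formulation and may differ from the authors' equations, and the parameter values are hypothetical.

```python
def attitude_derivative(state, inputs, omega_sum, p):
    """Assumed attitude model; state = (phi, dphi, theta, dtheta, psi, dpsi)."""
    phi, dphi, theta, dtheta, psi, dpsi = state
    u2, u3, u4 = inputs
    ddphi = dtheta * dpsi * p["a1"] + dtheta * p["a2"] * omega_sum + p["b1"] * u2
    ddtheta = dphi * dpsi * p["a3"] - dphi * p["a4"] * omega_sum + p["b2"] * u3
    ddpsi = dtheta * dphi * p["a5"] + p["b3"] * u4
    return (dphi, ddphi, dtheta, ddtheta, dpsi, ddpsi)

def euler_step(state, inputs, omega_sum, p, ts):
    """Forward Euler: x(n+1) = x(n) + ts * f(x(n), u(n))."""
    deriv = attitude_derivative(state, inputs, omega_sum, p)
    return tuple(s + ts * ds for s, ds in zip(state, deriv))

# Hypothetical inertia, rotor, and arm-length values (not taken from Table 2).
IXX, IYY, IZZ, JP, ARM = 6.4e-3, 6.4e-3, 1.2e-2, 6.0e-5, 0.17
params = {
    "a1": (IYY - IZZ) / IXX, "a2": JP / IXX,
    "a3": (IZZ - IXX) / IYY, "a4": JP / IYY,
    "a5": (IXX - IYY) / IZZ,
    "b1": ARM / IXX, "b2": ARM / IYY, "b3": ARM / IZZ,
}

TS, STEPS, U2 = 0.01, 100, 0.1
state = (0.0,) * 6
for _ in range(STEPS):
    state = euler_step(state, (U2, 0.0, 0.0), 0.0, params, TS)
```

With a constant roll input and no other excitation, the coupling terms stay at zero, so the discretized roll rate grows linearly with the step count. This gives a simple sanity check on the discretization.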

Physical Parameters
The AscTec Hummingbird is chosen as the experimental test platform. This quadrotor is popular, performs well, and is lightweight and maneuverable. It has a payload of 200 g and a flight endurance of nearly 20 min. The airframe is made of balsa wood and carbon fiber. The vehicle is powered by four brushless DC motors running off an 11.1 V Lithium Polymer (LiPo) battery pack. It is equipped with an accelerometer, pressure sensor, magnetic sensor, gyros, and GPS module, which together provide the vehicle state. Some of the technical details are listed in Table 1 [44]. The model parameters are given in Table 2.

Test Bed
A test bed, designed for analysing motor performance and enabling controller tuning, is constructed from steel tube, finished in black paint, and fitted with bearings so that it allows three DOF of rotation. Steel tube was selected because it is readily available and its high density gives the rig stability and rigidity. The UAV is secured in place with a spherical rolling joint. The assembled mechanical design is shown in Figures 2 and 3, with the UAV installed on top. A Raspberry Pi 3 is used for the control.

Results and Discussion
The G-ILC and NO-ILC algorithms are applied to the test system. Simulations are also performed. The simulations and experiments were conducted on a ThinkPad P1 mobile workstation laptop (i7, 2.20 GHz, 16 GB RAM) using MATLAB R2018b. The reference trajectory is shown in Figure 4. The trajectory consists of a single-period sine wave and is non-smooth; hence it is a challenging task for the ILC algorithm. Sixteen iteration trials were performed for each algorithm. The input updates for the G-ILC and NO-ILC algorithms were obtained from Equations (28) and (35), respectively, with the help of the linearized quadrotor dynamics from Equation (49).
Demonstrating the monotonic convergence of the G-ILC algorithm is also important. The simulation results show a notable decrease in the error over successive trial iterations. Figure 5 shows the decrease of the 2-norm of the error, which reaches a value of 0.3092 at the 16th iteration. The variation in φ and θ over time for different iterations is also shown in Figure 5.
The G-ILC with the updating Equations (28) and (29) was implemented to track the reference signal. The gain β is chosen between 0.01 and 1.0. After testing a wide range of values of β, the best performance was found with β = 0.1. The experimental results show a significant decline in the error over the first five trial iterations, as shown in Figure 7a. A slight increase occurred at the 7th and 10th trials, but the overall trend was downward, from 1.277 at the first trial to 0.574 at the 6th trial.
The performance of the NO-ILC algorithm is also investigated with the reference signal shown in Figure 4. The results are shown in Figure 8a. The weighting parameter is set to Q = 0.1. The value of Q can be increased to improve convergence, but Figure 8b shows that the convergence is similar to that of the G-ILC experiment, with the latter slightly better. Convergence was achieved after 8 iterations.
Note that although convergence is established theoretically, in practice the system is subject to disturbances and uncertainty. The effect of disturbances is evaluated in simulation and by experiment for the two approaches. First, the performances of the two methods in simulation without disturbance are quantified and compared. The results are shown in Table 3. The NO-ILC method had significantly better performance and convergence properties in simulation.
The disturbances took the form of torques that were injected in the φ and θ channels. The disturbances are defined as exponentially decaying sinusoidal functions, $\delta\tau = e^{-0.1t}(\sin t, \cos t, 0)$ for $t \in (2, 6)$ s. The results for the experiment are shown in Table 4. These show the better performance of the NO-ILC, but the difference is less marked.
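The disturbance profile can be written as a small function, as sketched below. The function name and the convention of returning zero torque outside the interval (2, 6) s are assumptions made for illustration; the components are ordered as roll, pitch, and yaw.

```python
import math

def disturbance_torque(t):
    """Return (roll, pitch, yaw) disturbance torque at time t in seconds.

    Inside the open interval (2, 6) s the torque is exp(-0.1*t) * (sin t, cos t, 0);
    outside it, the disturbance is assumed to be zero.
    """
    if 2.0 < t < 6.0:
        decay = math.exp(-0.1 * t)
        return (decay * math.sin(t), decay * math.cos(t), 0.0)
    return (0.0, 0.0, 0.0)

# Sample the profile on a 0.5 s grid over a 10 s trial (illustrative grid).
samples = [(0.5 * i, disturbance_torque(0.5 * i)) for i in range(21)]
```

Because the same profile is injected on every trial, it acts as a repeating disturbance, which is the kind of disturbance ILC is able to learn to reject.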
To improve the performance of the G-ILC algorithm, the value of the learning gain β can be changed. Figure 9 shows the effect of β on the convergence rate.
For a large class of practical systems, such as UAV reference tracking (as required for power line surveillance and monitoring), it is required that the output achieves perfect tracking at more than one defined time instant and that the norm of the system error converges to zero as rapidly as possible. Consequently, future work includes an alternative controller (i.e., ILC with a hybrid controller) as an extension to enhance the tracking performance at a subset of time instants that correspond to critical positions.

Conclusions
The suggested G-ILC and NO-ILC have been formulated and applied to the problem of reference tracking for a UAV. When comparing the findings, the NO-ILC has shown superior tracking performance. Furthermore, the suggested NO-ILC has shown substantial improvement over the G-ILC in terms of error decrease and monotonic convergence. The results of the simulations and experiments, both with and without an external disturbance, show the performance of the two proposed ILC methods. The results show the potential to achieve good trajectory tracking.
The NO-ILC method could form the basis for a power line inspection system. The repetitive nature of the power line geometries lends itself to this approach. However, there are many control challenges to be faced, such as disturbances in the form of steady wind and unsteady wind gusts, and decision-making in the face of uncertainty. This points to the urgent need for future work on expanding ILC (i.e., point-to-point ILC with a hybrid controller) for tracking identification, for instance along a straight conductor in an electrical overhead conductor monitoring task.