Design of Aquila Optimization Heuristic for Identification of Control Autoregressive Systems

Swarm intelligence-based metaheuristic algorithms have attracted the attention of the research community and have been exploited for effectively solving different optimization problems of engineering, science, and technology. This paper considers the parameter estimation of the control autoregressive (CAR) model by applying a novel swarm intelligence-based optimization algorithm called the Aquila optimizer (AO). The parameter tuning of AO is performed statistically for different generations and population sizes. The performance of the AO is investigated statistically at various noise levels for the parameters with the best tuning. The robustness and reliability of the AO are carefully examined under various scenarios for CAR identification. The experimental results indicate that the AO is accurate, convergent, and robust for parameter estimation of CAR systems. The comparison of the AO heuristic with recent state-of-the-art counterparts through nonparametric statistical tests established the efficacy of the proposed scheme for CAR estimation.


Literature Review
In recent years, system identification has gained significant attention in various areas such as signal processing, parameter estimation, and multiple-input multiple-output systems [1][2][3][4]. Parameter estimation refers to the determination of the best fitness values for each parameter using local or global optimization techniques. Parameter estimation consists of three steps. First, construct a mathematical model for a given system such that it replicates the system's exact behaviour under the same conditions. Second, define a fitness function for a given set of parameters using various approximations such as least squares, weighted least squares, and generalized least squares. Third, select an optimization technique for finding the best fitness values through iteration [5].
The research community has shown great interest in parameter estimation of control autoregressive (CAR) systems because of their importance and significance in effectively modelling a variety of engineering problems, including power system optimization [6], electricity load prediction [7], battery charge estimation [8], forecasting groundwater flooding [9], and CO2 emission forecasting [10]. Various methods for parameter estimation of CAR models have been proposed in the literature. Ding et al. [11] decompose a

• The strength of a swarm intelligence-based Aquila optimizer (AO) heuristic is exploited for solving parameter estimation in a control autoregressive (CAR) model.

• The convergence, accuracy, and robustness analyses of the AO are conducted for different noise levels considered in the CAR model.

• Statistical analyses for parameter tuning of the AO as well as for reliability and stability assessment are conducted for different generations and population sizes.

Paper Organization
The rest of the paper is organized as follows: the CAR system model is given in Section 2. The AO-based methodology is presented in Section 3. The performance analysis of the CAR model is provided in Section 4. The main conclusions and some future research directions are listed in Section 5.

Mathematical Model of CAR Systems
Consider the second-order CAR model presented in (1):
$$R(z)\,q(t) = S(z)\,\theta(t) + \omega(t), \qquad (1)$$
where θ(t) is the input of the model, q(t) is the output of the model, and ω(t) is zero-mean white noise. R(z) and S(z) are polynomials in the backward shift operator $z^{-1}$, given in (2) and (3):
$$R(z) = 1 + r_1 z^{-1} + r_2 z^{-2} + \ldots + r_{n_r} z^{-n_r}, \qquad (2)$$
$$S(z) = s_1 z^{-1} + s_2 z^{-2} + \ldots + s_{n_s} z^{-n_s}. \qquad (3)$$
The corresponding information vectors are given in (6) and (7):
$$\varphi_q(t) = [-q(t-1), \ldots, -q(t-n_r)]^{T}, \qquad (6)$$
$$\varphi_\theta(t) = [\theta(t-1), \ldots, \theta(t-n_s)]^{T}. \qquad (7)$$
The model presented in (1) can be rewritten as given in (8) and simplified in (9):
$$q(t) = [1-R(z)]\,q(t) + S(z)\,\theta(t) + \omega(t), \qquad (8)$$
$$q(t) = \varphi(t)^{T}\,\Theta + \omega(t). \qquad (9)$$
The overall information and parameter vectors of the CAR model are given as
$$\varphi(t) = [\varphi_q(t)^{T}, \varphi_\theta(t)^{T}]^{T}, \qquad \Theta = [r_1, \ldots, r_{n_r}, s_1, \ldots, s_{n_s}]^{T}.$$
Then, the identification problem for the CAR system becomes finding the estimate $\hat{\Theta}$ that minimizes the error between the measured output $q(t)$ and the predicted output $\varphi(t)^{T}\hat{\Theta}$.
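To make the regression form above concrete, the second-order CAR model can be simulated directly from the recursion implied by (9). The sketch below is illustrative only: the coefficient values and the input signal are placeholders, not the model of the paper's simulation study.

```python
import random

def simulate_car(r, s, theta, noise_std=0.0, seed=0):
    """Simulate a CAR model R(z) q(t) = S(z) theta(t) + w(t), i.e.
    q(t) = -sum_i r_i q(t-i) + sum_j s_j theta(t-j) + w(t),
    with w(t) zero-mean white noise of standard deviation noise_std."""
    rng = random.Random(seed)
    q = []
    for t in range(len(theta)):
        val = rng.gauss(0.0, noise_std)  # zero-mean white noise w(t)
        for i, ri in enumerate(r, start=1):   # autoregressive part of R(z)
            if t - i >= 0:
                val -= ri * q[t - i]
        for j, sj in enumerate(s, start=1):   # moving part of S(z)
            if t - j >= 0:
                val += sj * theta[t - j]
        q.append(val)
    return q

# Illustrative second-order model (placeholder coefficients) driven by a
# square-wave input; noise_std=0.0 gives the noise-free output.
theta_in = [1.0 if t % 2 == 0 else -1.0 for t in range(200)]
q_out = simulate_car(r=[0.3, 0.2], s=[0.5, 0.25], theta=theta_in)
```

With zero noise, the first few outputs follow directly from the recursion, which makes the generator easy to check by hand.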

Methodology
In this section, the AO-based methodology for parameter estimation of a CAR model is presented. The graphical abstract of the proposed methodology for the CAR model is presented in Figure 1. It provides an overview of the proposed study, which comprises the parameter estimation of a CAR model by applying the swarm intelligence-based Aquila optimizer. The optimum parameters are evaluated on the basis of the square of the difference between the estimated and true values, with the number of generations as the termination criterion.


Aquila Optimization (AO) Method
The AO is a swarm intelligence-based method for solving global optimization problems [33]. It has been applied in various domains such as the internet of things (IoT) [37], power electronics [35], image processing [38], oil production forecasting [39], Francis turbines [40], hybrid solid oxide fuel cells (SOFC) [41], wind energy [42], and population forecasting [43]. AO is a population-based optimization method inspired by the Aquila's prey-hunting ability. It uses four hunting techniques and has the ability to switch between them. The mathematical model, pseudocode, and flowchart are presented below.

Population Initialization
AO starts with the initialization of the population of candidate solutions (W) as given in (13):
$$W = \begin{bmatrix} w_{1,1} & \cdots & w_{1,D} \\ \vdots & \ddots & \vdots \\ w_{N_p,1} & \cdots & w_{N_p,D} \end{bmatrix}. \qquad (13)$$
The population is randomly generated using (14):
$$w_{i,j} = \mathrm{rand} \times (UB_j - LB_j) + LB_j, \quad i = 1,\ldots,N_p,\; j = 1,\ldots,D, \qquad (14)$$
where $N_p$ is the total population size, and $D$ is the number of decision variables.
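The initialization rule (14) amounts to scaling uniform random numbers into the decision-variable bounds; a minimal sketch (function and variable names are illustrative):

```python
import random

def init_population(n_pop, dim, lb, ub, seed=0):
    """Random population in [lb, ub]^dim, per (14): w = rand*(UB-LB)+LB."""
    rng = random.Random(seed)
    return [[rng.random() * (ub - lb) + lb for _ in range(dim)]
            for _ in range(n_pop)]

# A population of 30 candidate solutions with 4 decision variables.
pop = init_population(n_pop=30, dim=4, lb=-2.0, ub=2.0)
```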

The Mathematical Model
The mathematical formulation is divided into four steps which are presented below.
Expanded Exploration (W1)

In the first method, W1, the Aquila explores the prey area by involving a high soar and a vertical stoop. It is presented in (15):
$$W_1(t+1) = W_{best}(t)\times\left(1-\frac{t}{T}\right) + \left(W_M(t)-W_{best}(t)\right)\times \mathrm{rand}, \qquad (15)$$
where $W_1(t+1)$ is the next-iteration solution for $W_1$, $W_{best}(t)$ is the best solution obtained so far, $\left(1-\frac{t}{T}\right)$ is used to control the search space over the iterations, and $W_M(t)$ is the mean of the current solutions, which is calculated using (16):
$$W_M(t) = \frac{1}{N_p}\sum_{i=1}^{N_p} W_i(t), \qquad (16)$$
where $N_p$ is the total population size, and $D$ is the number of decision variables.
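Assuming the standard AO form of the expanded-exploration rule, the update can be sketched per component as follows (names are illustrative):

```python
import random

def w1_update(w_best, w_mean, t, T, rng=random.Random(1)):
    """Expanded exploration, Eq. (15): move around the best solution,
    shrinking the step via (1 - t/T) as iteration t approaches T."""
    shrink = 1.0 - t / T
    return [wb * shrink + (wm - wb) * rng.random()
            for wb, wm in zip(w_best, w_mean)]

# Early in the run (t = 10 of T = 100) the step is still large.
candidate = w1_update([1.0, -0.5], [0.2, 0.1], t=10, T=100)
```

At the final iteration (t = T) the shrink factor vanishes, so a converged population (mean equal to best) yields the zero step, which is a handy sanity check.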

Narrowed Exploration (W2)
In the second method (W2), upon finding the prey area, the Aquila circles above the target and uses a method called contour flight with a short glide attack. In W2, AO narrowly explores the area in preparation for the attack on the target, which is calculated using (17):
$$W_2(t+1) = W_{best}(t)\times LEF(DI) + W_R(t) + (y-w)\times \mathrm{rand}, \qquad (17)$$
where $W_2(t+1)$ is the next-iteration solution for $W_2$, $DI$ is the dimension of the search space, $W_R(t)$ is a random solution drawn from $[1, N_p]$, and $LEF(DI)$ is the Lévy flight distribution function calculated in (18):
$$LEF(DI) = d\times\frac{e\times\sigma}{|f|^{1/\delta}}, \qquad (18)$$
where $d$ is a constant equal to 0.01, and $e$ and $f$ are random numbers between 0 and 1. $\sigma$ is calculated using (19):
$$\sigma = \left(\frac{\Gamma(1+\delta)\times \sin(\pi\delta/2)}{\Gamma\left(\frac{1+\delta}{2}\right)\times \delta \times 2^{(\delta-1)/2}}\right)^{1/\delta}, \qquad (19)$$
where $\delta$ is fixed at 1.5. The terms $y$ and $w$ in (17) trace the spiral shape of the search and are calculated as follows:
$$y = r\times\cos(\phi), \qquad (20)$$
$$w = r\times\sin(\phi), \qquad (21)$$
$$r = g + V\times DI_1, \qquad (22)$$
$$\phi = -\varepsilon\times DI_1 + \frac{3\pi}{2}, \qquad (23)$$
where $g$ is a fixed number of search cycles between 1 and 20, $V$ is fixed to 0.00565, $\varepsilon$ is fixed to 0.005, and $DI_1$ contains integers from 1 to the dimension $DI$.
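The Lévy-flight machinery of (18) and (19) can be sketched as below. One assumption to flag: the common Mantegna implementation draws the random terms e and f from normal distributions rather than uniformly on [0, 1], and that is what this sketch does.

```python
import math
import random

def levy_sigma(delta=1.5):
    """Sigma term of (19) for the Mantegna Levy-flight scheme."""
    return (math.gamma(1 + delta) * math.sin(math.pi * delta / 2)
            / (math.gamma((1 + delta) / 2) * delta
               * 2 ** ((delta - 1) / 2))) ** (1 / delta)

def levy_flight(dim, delta=1.5, d=0.01, rng=random.Random(0)):
    """One Levy-flight step per component, per (18): d * e * sigma / |f|^(1/delta),
    with e and f drawn as standard normals (Mantegna's method)."""
    sigma = levy_sigma(delta)
    return [d * rng.gauss(0, 1) * sigma / abs(rng.gauss(0, 1)) ** (1 / delta)
            for _ in range(dim)]

step = levy_flight(5)
```

For the paper's fixed delta = 1.5, sigma evaluates to roughly 0.697, which gives mostly small steps with occasional long jumps.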

Expanded Exploitation (W3)
In the third method (W3), AO exploits the search space by descending vertically to discover the prey reaction before landing and attacking. It is given in (25):
$$W_3(t+1) = \left(W_{best}(t)-W_M(t)\right)\times\beta - \mathrm{rand} + \left((UB-LB)\times \mathrm{rand} + LB\right)\times\mu, \qquad (25)$$
where $W_3(t+1)$ is the next-iteration solution for $W_3$, $\beta$ and $\mu$ are the exploitation adjustment factors, $W_{best}(t)$ is the best solution, $UB$ and $LB$ are the problem-dependent upper and lower bounds, and $W_M(t)$ is the mean of the current solutions, calculated using (16).

Narrowed Exploitation (W4)
In the fourth method (W4), AO uses the method of walking and grabbing the prey by getting closer to it and attacking based on stochastic movement, as presented in (26):
$$W_4(t+1) = QYF\times W_{best}(t) - \left(O_1\times W(t)\times \mathrm{rand}\right) - O_2\times LEF(DI) + \mathrm{rand}\times O_1, \qquad (26)$$
where $W_4(t+1)$ is the next-iteration solution for $W_4$ and $QYF$ is the quality factor used to balance the search, calculated using (27):
$$QYF(t) = t^{\frac{2\times \mathrm{rand}-1}{(1-T)^2}}. \qquad (27)$$
$O_1$ and $O_2$ indicate the variations of motion during the attack, which are calculated using (28) and (29):
$$O_1 = 2\times \mathrm{rand} - 1, \qquad (28)$$
$$O_2 = 2\times\left(1-\frac{t}{T}\right). \qquad (29)$$
The flowchart for AO is shown in Figure 2.
The pseudocode of the AO is presented in Algorithm 1.

Algorithm 1: Pseudocode of the AO method

Initialization:
Initialize the population W and the parameters of AO such as σ, β, µ, etc.
WHILE (t < T) do
    Calculate the fitness values.
    Determine the best obtained solution W_best(t).
    for k = 1 : N_p
        Update W_M(t), y, w, O_1, O_2, QYF, and LEF(DI).
        if t ≤ (2/3) × T then (exploration)
            if rand ≤ 0.5, generate W_1(t + 1) using (15); else, generate W_2(t + 1) using (17)
        else (exploitation)
            if rand ≤ 0.5, generate W_3(t + 1) using (25); else, generate W_4(t + 1) using (26)
        end
        if the fitness of the new solution is better than that of W_best(t), update the best solution, e.g., W_best(t) = W_4(t + 1)
        end
    end
end
return W_best
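Putting the four update rules together, the overall loop can be sketched as follows. This is a simplified sketch, not the authors' implementation: a sphere function stands in for the CAR fitness, and the spiral terms of the W2 glide are reduced to random perturbations.

```python
import math
import random

def aquila_optimizer(fitness, dim, lb, ub, n_pop=30, T=200, seed=0,
                     beta=0.1, mu=0.1):
    """Compact AO sketch following Algorithm 1: exploration (W1/W2) for the
    first two-thirds of the iterations, exploitation (W3/W4) afterwards."""
    rng = random.Random(seed)

    def levy(d=1.5):
        # Mantegna Levy-flight step, per (18)-(19), scaled by 0.01.
        sigma = (math.gamma(1 + d) * math.sin(math.pi * d / 2)
                 / (math.gamma((1 + d) / 2) * d * 2 ** ((d - 1) / 2))) ** (1 / d)
        return 0.01 * rng.gauss(0, sigma) / abs(rng.gauss(0, 1)) ** (1 / d)

    def clip(w):
        return [min(max(x, lb), ub) for x in w]

    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_pop)]
    best = min(pop, key=fitness)
    for t in range(1, T + 1):
        mean = [sum(w[j] for w in pop) / n_pop for j in range(dim)]
        for k in range(n_pop):
            if t <= (2 / 3) * T:                      # exploration phase
                if rng.random() <= 0.5:               # W1, Eq. (15)
                    cand = [best[j] * (1 - t / T)
                            + (mean[j] - best[j]) * rng.random()
                            for j in range(dim)]
                else:                                 # W2, Eq. (17), simplified
                    wr = pop[rng.randrange(n_pop)]
                    cand = [best[j] * levy() + wr[j]
                            + (rng.random() - rng.random()) * rng.random()
                            for j in range(dim)]
            else:                                     # exploitation phase
                if rng.random() <= 0.5:               # W3, Eq. (25)
                    cand = [(best[j] - mean[j]) * beta - rng.random()
                            + ((ub - lb) * rng.random() + lb) * mu
                            for j in range(dim)]
                else:                                 # W4, Eq. (26)
                    qyf = t ** ((2 * rng.random() - 1) / (1 - T) ** 2)
                    o1 = 2 * rng.random() - 1
                    o2 = 2 * (1 - t / T)
                    cand = [qyf * best[j] - o1 * pop[k][j] * rng.random()
                            - o2 * levy() + rng.random() * o1
                            for j in range(dim)]
            cand = clip(cand)
            if fitness(cand) < fitness(pop[k]):       # greedy acceptance
                pop[k] = cand
            if fitness(pop[k]) < fitness(best):
                best = list(pop[k])
    return best, fitness(best)

# Demo on a simple sphere function (a stand-in for the CAR fitness).
w, f = aquila_optimizer(lambda w: sum(x * x for x in w), dim=4,
                        lb=-2.0, ub=2.0)
```

Because the best solution is only ever replaced by an improvement, the returned fitness is monotonically non-increasing over the run.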

Performance Analysis
In this section, the performance analysis of AO for the CAR model is presented. The identification of the CAR model is conducted for various noise levels, generations, and population sizes. The algorithm is weighed in terms of accuracy, which is measured by the fitness function presented in (30):
$$\mathrm{Fitness} = \frac{1}{N}\sum_{t=1}^{N}\bigl(z(t) - \hat{z}(t)\bigr)^{2}, \qquad (30)$$
where $\hat{z}$ is the estimated response obtained through the proposed evolutionary algorithm and $z$ is the desired response. For the simulation study, we considered the second-order CAR model from [6] as presented in (31) and (32).
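Under a mean-squared-error reading of the fitness in (30), the fitness of a candidate parameter vector can be sketched as below. The true coefficients and the input signal here are illustrative placeholders, not the model of (31) and (32).

```python
def car_fitness(params, theta, q):
    """Mean squared error between the measured output q and the one-step
    prediction of a second-order CAR model for the candidate parameters."""
    r1, r2, s1, s2 = params
    err = 0.0
    for t in range(len(q)):
        q1 = q[t - 1] if t >= 1 else 0.0
        q2 = q[t - 2] if t >= 2 else 0.0
        th1 = theta[t - 1] if t >= 1 else 0.0
        th2 = theta[t - 2] if t >= 2 else 0.0
        q_hat = -r1 * q1 - r2 * q2 + s1 * th1 + s2 * th2
        err += (q[t] - q_hat) ** 2
    return err / len(q)

# Noise-free data generated from illustrative (not the paper's) parameters.
true = (0.3, 0.2, 0.5, 0.25)
theta = [1.0 if t % 3 else -1.0 for t in range(100)]
q = []
for t in range(100):
    r1, r2, s1, s2 = true
    q1 = q[t - 1] if t >= 1 else 0.0
    q2 = q[t - 2] if t >= 2 else 0.0
    th1 = theta[t - 1] if t >= 1 else 0.0
    th2 = theta[t - 2] if t >= 2 else 0.0
    q.append(-r1 * q1 - r2 * q2 + s1 * th1 + s2 * th2)
```

In the noise-free case, the fitness vanishes at the true parameters, which is why the noise-free tuning runs in the next subsection can drive the fit close to zero.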

Parameter Tuning of AO
Learning optimal weights plays a significant role in boosting the performance of the AO method. Hence, the aim is to use the best values of the exploitation adjustment factors (β and µ) for learning the optimum weights W3 using the update rule given in (25). The best values of both parameters are obtained through hyper-parameter tuning. Using grid search, various combinations of β and µ are split into nine cases, case 1 to case 9, and the chosen values of β and µ are presented in Table 1. Hyper-parameter tuning is performed for different generations and population sizes in a noise-free environment (zero noise). Each case is executed for three generations, i.e., 1000, 1500, and 2000, and three populations, i.e., 30, 40, and 50, whereas the simulations for each combination of one generation and one population are performed for 15 runs to obtain the average fit, best fit, worst fit, and standard deviation. The different scenarios (cases) reflecting the tuning of β and µ for the optimal weight update mechanism, along with the average fit, best fit, worst fit, and standard deviation for the different generations and populations, are presented in Tables 2-10. Varying β and µ causes the fit to vary with a change in generation or population size, and the tables show the fitness trends. It is observed from the results given in Tables 2-10 that the optimal fit for different generations and populations is achieved with case 9, i.e., β = 0.1 and µ = 0.1.
Apart from the fitness results presented in Tables 2-10, the mean fit values achieved with the nine (β and µ) variations, three generations, and three populations are presented in Table 11. It is observed from the mean fit values in Table 11 that AO obtained the minimum mean fit for case 9, i.e., 440 × 10⁻⁶. Therefore, for optimal AO performance, the remaining simulation results are presented with the best hyper-parameter values, i.e., β = 0.1 and µ = 0.1 (Tables 2-10 report the AO parameter analysis for cases 1-9, respectively). The fitness plots for case 9 with zero noise for the three generations and populations are shown in Figures 3 and 4.
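The grid search over the (β, µ) cases can be sketched as follows. The three candidate values per factor and the toy objective below are illustrative stand-ins for the nine cases of Table 1 and the actual AO runs.

```python
import itertools

def grid_search(evaluate, betas, mus):
    """Evaluate every (beta, mu) pair and return the one with the
    smallest mean fit, mirroring the case-by-case tuning procedure."""
    results = {}
    for beta, mu in itertools.product(betas, mus):
        results[(beta, mu)] = evaluate(beta, mu)
    best = min(results, key=results.get)
    return best, results[best]

# Toy evaluator that is minimized at beta = mu = 0.1 (illustrative only);
# in the real study, evaluate() would average the AO fit over 15 runs.
best_pair, best_fit = grid_search(
    lambda b, m: (b - 0.1) ** 2 + (m - 0.1) ** 2,
    betas=[0.1, 0.5, 0.9], mus=[0.1, 0.5, 0.9])
```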


Statistical Convergence Analysis
In this section, the performance of the AO algorithm is assessed by introducing three noise levels into the CAR model. Moreover, the fit of AO is estimated through three variations of generation (1000, 1500, 2000) and population (30, 40, 50). The evaluation metrics used to assess the performance of AO for CAR are average fit, best fit, worst fit, and standard deviation (STD).
The performance in terms of fit variations and standard deviations for the three noise levels, i.e., 0.04, 0.06, and 0.08, is demonstrated in Tables 12-14, respectively (the AO analysis with respect to generation and population sizes at 0.04, 0.06, and 0.08 noise variance). It is witnessed from Tables 12-14 that the AO fit decreases with increasing population and generation size. It is noticed from Table 12 that the minimum average fit, best fit, and worst fit achieved for noise level 0.04 are 1.7 × 10⁻³, 1.0 × 10⁻³, and 3.0 × 10⁻³, respectively. Similarly, the corresponding values for noise levels 0.06 and 0.08, given in Tables 13 and 14, are 2.7 × 10⁻³, 2.3 × 10⁻³, and 3.2 × 10⁻³ and 4.5 × 10⁻³, 4.1 × 10⁻³, and 4.9 × 10⁻³, respectively. The performance of the AO method in terms of best fit for the three noise levels is evaluated for the three variations in generation, 1000, 1500, and 2000, and population size, 30, 40, and 50. Figure 5 shows the fitness plots. The fitness curves in Figure 5a-c represent the best fit of the AO algorithm for noise variance = 0.04. In contrast, Figure 5d-f show the best fit curves for noise variance = 0.06. Likewise, the best fit plots for noise variance = 0.08 are given in Figure 5g-i. Figure 5a-i shows that the AO fit for the three noise levels decreases significantly with increasing generation and population size. However, better fit results are achieved for smaller values of noise with a greater number of generations and populations.
To confirm the natural behaviour of the AO strategy for different noise values, the performance of the AO method is also verified by fixing the population size (30, 40, or 50) and changing the generation size (1000, 1500, or 2000) for the three values of noise variance (0.04, 0.06, and 0.08); the fitness-based learning curves are presented in Figure 6. Figure 6a-c represent the AO fit with population = 30, the fitness plots for population = 40 are given in Figure 6d-f, and Figure 6g-i denote the fitness plots for population = 50. It is realized from the fitness curves given in Figure 6a-i that, for a fixed population and generation size, the AO fit for the low noise levels, i.e., 0.04 and 0.06, is considerably lower than the fit for high noise, i.e., 0.08.
Nevertheless, AO accomplishes the minimum fit for the smallest noise level, i.e., 0.04, for a fixed population size. Therefore, it is confirmed from the curves in Figure 6 that the performance of AO degrades noticeably for higher noise values.


Results Comparison with other Heuristics
To further investigate the exploration and exploitation phases of the AO, it is compared with the arithmetic optimization algorithm (AOA) [44], the sine cosine algorithm (SCA) [45], and the reptile search algorithm (RSA) [46] for 15 independent runs with three variations of generation (1000, 1500, 2000) and population (30, 40, 50). Tables 15-17 show the performance of all algorithms in terms of estimated weights and best fit for the 0.04, 0.06, and 0.08 noise levels. It is seen that each algorithm gives better results at low noise, i.e., 0.04, than at high noise. Moreover, for low noise, the estimated weights are closer to the true values with minimum fit. Tables 18-20 show the performance of the AO, AOA, RSA, and SCA algorithms in terms of average fit for all noise variances. It is seen that, for all noise variances and for all generations and populations, the AO algorithm gives better results than RSA, SCA, and AOA. The statistical analysis of AO, SCA, AOA, and RSA for multiple runs, noise variances, and population sizes with a constant generation size is shown in Figure 7. It is witnessed that, for all noise variances, the AO achieves a better (lower) fit than RSA, AOA, and SCA. It is also observed that increasing the noise level degrades the performance of all algorithms. However, AO achieves the optimal fit in all scenarios.
To further investigate the performance of AO vs. RSA, AO vs. AOA, and AO vs. SCA, a nonparametric Mann-Whitney U test [47] is performed on the average fit values of all algorithms for noise variances 0.04, 0.06, and 0.08 with generations 1000, 1500, and 2000 and populations 30, 40, and 50. The Mann-Whitney U test is a nonparametric equivalent of the two-sample t-test. The significance level is 0.01 and a one-tailed hypothesis is used. The computed z-score is −6.29719 and the p-value is less than 0.00001; the result is significant at p < 0.01, as presented in Figures 8-10. The results of the detailed simulations and the statistics indicate that the AO-based swarming optimization heuristic effectively estimates the parameters of CAR systems.
However, the real-time implementation of swarm optimization algorithms for practical system identification problems requires further investigation.
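The reported comparison can be reproduced with any standard Mann-Whitney U implementation; a minimal rank-sum sketch with the normal-approximation z-score is given below (the sample data are illustrative, not the fit values of Tables 18-20).

```python
import math

def mann_whitney_u(a, b):
    """U statistic for samples a vs. b, with the normal-approximation
    z-score; ties receive the average of the ranks they span."""
    n1, n2 = len(a), len(b)
    combined = sorted((v, i) for i, v in enumerate(a + b))
    ranks = [0.0] * (n1 + n2)
    i = 0
    while i < len(combined):          # assign average ranks over ties
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg = (i + j + 1) / 2.0       # ranks are 1-based
        for k in range(i, j):
            ranks[combined[k][1]] = avg
        i = j
    r1 = sum(ranks[:n1])              # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u1, (u1 - mean_u) / sd_u

# Fully separated samples give the extreme U = 0 and a negative z-score.
u, z = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

For realistic sample sizes (as in the 15-run comparisons here), SciPy's `scipy.stats.mannwhitneyu` with its exact or tie-corrected p-values would be the more rigorous choice.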

Conclusions
Following are the conclusions drawn from the extensive simulation results presented in the last section:

• The strength of swarm intelligence of the Aquila optimizer (AO) is effectively exploited for parameter estimation of control autoregressive (CAR) systems.

• The performance of the AO is enhanced by escalating the population and generation sizes at the expense of computational cost, and the optimal fitness for different generations and populations is achieved with the exploitation adjustment factors β = 0.1 and µ = 0.1.

• The robustness and accuracy of the AO decrease as the noise level increases.

• The comparative study of the AO with other heuristics based on AOA, SCA, and RSA established the efficacy of the proposed scheme, and the statistical analysis through the Mann-Whitney U test endorsed the reliability of the AO scheme for CAR system identification.

The current study expands the application domain of swarm intelligence-based optimizers by exploiting the strength of the AO approach for system identification. However, future work may consider applying the proposed methodology for solving other complex problems [48][49][50][51][52][53][54].