A Novel Adaptive LMS Algorithm with Genetic Search Capabilities for System Identification of Adaptive FIR and IIR Filters

In this paper, we introduce a novel adaptation algorithm for adaptive filtering of FIR and IIR digital filters within the context of system identification. The standard LMS algorithm is hybridized with the Genetic Algorithm (GA) to obtain a new integrated learning algorithm, namely, LMS-GA. The main aim of the proposed learning tool is to evade local minima, a common problem with the standard LMS algorithm and its variants, and to approach the global minimum by calculating the optimum parameters of the weights vector when only estimated data are accessible. The proposed LMS-GA technique first works as the standard LMS algorithm and calculates the optimum filter coefficients that minimize the mean square error; once the standard LMS algorithm gets stuck in a local minimum, LMS-GA switches to the GA to update the filter coefficients and explore new regions of the search space by applying the cross-over and mutation operators. The proposed LMS-GA is tested under different conditions of the input signal, such as input signals with colored characteristics, i.e., correlated input signals, and is investigated on the FIR adaptive filter using the power spectral density of the input signal and the Fourier-transform of the input's correlation matrix. Demonstrations via simulations on system identification of IIR and FIR adaptive digital filters reveal the effectiveness of the proposed LMS-GA under input signals with different characteristics.


Introduction
Adaptive filters are systems whose configuration is adaptable or adjustable so that their performance improves through interaction with their environment. These filters can repeatedly adjust to evolving conditions, they can be trained to achieve appropriate filtering, and they do not require the elaborate synthesis procedures usually needed for nonadaptive systems; other characteristics can be found in [1].
Typically, traditional nonadaptive filters that are utilized to extract data from a particular input sequence have the linearity and time-invariance properties. In adaptive filters, the invariance restriction is removed by enabling the filter to update its weights according to a specific predetermined optimization process. Digital adaptive filters can be classified into the adaptive finite impulse response (FIR) filter, commonly known as an Adaptive Linear Combiner, which is unconditionally stable, and the adaptive infinite impulse response (IIR) filter, which offers a prospective performance enhancement at lower computational cost than the corresponding adaptive FIR filter [1]. Typical applications of adaptive filters are noise cancelation, inverse modeling, prediction, jammer suppression [2][3][4], and system identification, which is the main topic of this paper.
Adaptive system identification has a long history of research, ranging from implementations of neural networks [5][6][7][8][9] to swarm optimization algorithms [10][11][12], and reaching to applications of the LMS adaptation algorithm on IIR and FIR adaptive filters proposed in [13,14], with different techniques and applications in [15][16][17][18][19]. Applications of the Genetic Algorithm (GA) in system identification are studied in [20,21]. Standard LMS algorithms with variable step size are studied in [22][23][24][25], with [22] emphasizing the impact of link noise in wireless sensor networks with the noise modeled as a Gaussian distribution, while [25] assumed impulsive noise in the network. The work in [24] applied the proposed variable step-size LMS algorithm to an active vibration control case study. A stochastic analysis of the LMS algorithm with colored input signals can be found in [26] and the references therein. To reduce the number of calculations required to process the input data in the LMS algorithm, a block-by-block manipulation of the noise data based on the fast Fourier-transform (FFT) and the overlap-save method is proposed in [27]; it has been validated by testing its performance for active vehicle interior noise control. Adaptive filtering is also extremely important in model predictive control [28,29], and the identification of AR latent-variable graphical models is studied in [30].
From the above studies, even though extensive investigation has been carried out to speed up the standard LMS adaptive filtering algorithm on FIR and IIR filters in different applications and to recast the standard LMS algorithm to work well in noisy environments, the essential limitation of the standard Least Mean Square (LMS) algorithm in IIR-based system identification still exists, and can be briefly described as follows. The adaptive IIR digital filter suffers from the multimodality of the error surface versus the filter coefficients. It is natural that adaptation techniques (e.g., the standard LMS algorithm) get stuck at one of the local minima and diverge away from the globally optimal solution. The global optimum of the error surface is sought in the LMS algorithm by traveling in the negative direction of the error gradient. In the case of a multimodal error surface, the LMS algorithm, like the vast majority of learning techniques, may drive the adaptive digital filter into a local minimum. Moreover, the initial choice of the filter coefficients and the proper selection of the step size mainly determine the convergence behavior of the LMS algorithm.
An evolutionary algorithm, the Genetic Algorithm (GA), has been presented for searching multimodal error surfaces in IIR adaptive filtering. GA has the capability to create wide diversity in its populations through its cross-over and mutation operators. Nevertheless, high computational complexity and slow convergence are the main drawbacks of utilizing such an algorithm. On the other hand, gradient descent algorithms (e.g., the LMS algorithm) only do well locally, i.e., once stuck at a local minimum rather than the global minimum of the error surface, they cannot leave it; hence, the solution is suboptimal. The main motivation of the proposed work is to override the suboptimal solution of the gradient-descent-based LMS algorithm and obtain the optimal (globally optimum) one. Starting from the benefits and deficiencies of the evolutionary algorithm and the gradient descent algorithm, we build a novel integrated search algorithm by combining the standard LMS and GA algorithms, namely, LMS-GA. The proposed algorithm can explore regions in the search space of the multidimensional error surface that might not be visited by the standard LMS algorithm and reaches the global minimum (optimal solution) by finding a new set of chromosomes (filter coefficients) very close to the global optimum. The proposed LMS-GA algorithm has the attributes of simple implementation, global search ability, rapid convergence, and low sensitivity to parameter selection.
Paper Findings. This paper reviews the implementation of the LMS algorithm in the adaptation of FIR digital filters with an application to system identification and discusses the effect of colored input signals on the convergence rate of the adaptation process. Furthermore, it develops a new search algorithm, namely, LMS-GA, for learning adaptive IIR digital filter coefficients using the gradient descent algorithm integrated with evolutionary computation. The algorithm is designed in such a way that, as soon as the adaptive IIR filter is found to have sluggish convergence or to be trapped at a local minimum, the adaptive IIR digital filter parameters are updated in a random fashion to move away from the local minimum, gaining a higher chance of traveling toward the globally optimal solution.
The current paper is structured as follows: The basic structures of FIR and IIR adaptive digital filters together with the application of LMS algorithm on both adaptive FIR and IIR digital filters are given in Section 2. A concise overview of GA is introduced in Section 3. The main results are presented in Section 4, including the discussion of the effect of the colored input signal on the adaptation process and investigating the new LMS-GA learning technique with its application as an adaptive filtering tool. The numerical results and simulations are presented and discussed in Section 5. Finally, the paper is concluded in Section 6.

Preliminaries on Adaptive FIR and IIR Filtering
The FIR filter is shown in Figure 1 in the form of a single-input transversal filter. "Adaptation" is the process by which the weights are adjusted in response to a function of the error signal. While the weights are in the course of adaptation, they are themselves a function of the input samples, so that the transversal filter's output is not characterized as linear in the input.
The filter output is

y(n) = Σ_{i=0}^{N−1} c_i(n) x(n − i), (1)

where N is the filter order. In vector notation,

y(n) = X_N^T(n) C_N(n), (2)

and the error signal is given as

ε(n) = d(n) − y(n), (3)

where d(n) is the desired signal, y(n) is the filter output, X_N(n) = [x(n), x(n − 1), · · · , x(n − N + 1)]^T is the input signal vector, C_N(n) = [c_0(n), c_1(n), · · · , c_{N−1}(n)]^T is the coefficient vector, n is the time index, the superscript T denotes the transpose operator, and the subscript N represents the dimension of a vector.

Remark 1:
To a set of points of the input sequence x(n) and the reference waveform d(n) there corresponds an optimum coefficient vector, or impulse response, C_N(n). Given another set of points, there is no guarantee that the resulting optimum vector is related to the first unless the properties of the waveform do not change over different sections. Based on this, we can formulate the following assumption.
Assumption (H1). The input sequence x(n) and the reference waveform d(n) are stochastic processes; then the error ε(n) defined by (3) is also stochastic. The performance function, or mean-square error, E_ms is defined as

E_ms = E[ε(n) ε*(n)] = E[|ε(n)|²].

Expanding and using (2), we obtain

E_ms = E[d(n) d*(n)] − C_N^H E[X_N(n) d*(n)] − E[d(n) X_N^H(n)] C_N + C_N^H E[X_N(n) X_N^H(n)] C_N,

where * and H denote conjugate and conjugate transpose, respectively, and the time index n is omitted from C_N for simplicity. Let us define the expected value of |d(n)|² as

E_d = E[d(n) d*(n)],

the ensemble (statistical) autocorrelation matrix R_ms of x(n) as

R_ms = E[X_N(n) X_N^H(n)],

which is a Toeplitz matrix, and the ensemble-average cross-correlation vector as

P_ms = E[d*(n) X_N(n)].

Remark 2: The matrix R_ms is designated the "input correlation matrix", while P_ms holds the cross-correlations between the desired response and the input components. The main diagonal terms of the matrix R_ms are the mean squares of the input components, and the off-diagonal terms are the cross-correlations among the input components. The elements of R_ms are all constant second-order statistics when X_N(n) and d(n) are stationary.
Then, E_ms can be written as

E_ms = E_d − C_N^H P_ms − P_ms^H C_N + C_N^H R_ms C_N,

which shows that E_ms has a quadratic form. To find the choice of C_N that minimizes E_ms, we take the gradient of E_ms with respect to C_N and find the optimum value of the weights, C_o^ms, which sets it to zero. This leads to

∇_C E_ms = 2 R_ms C_N − 2 P_ms = 0.

The solution is unique if R_ms is invertible, and then

C_o^ms = R_ms^{−1} P_ms,

which is the Wiener-Hopf equation in matrix form. Recursive filters, like the IIR, with poles as well as zeros, would offer the same advantages (resonance, sharper cut-off, etc.) that nonrecursive filters offer in time-invariant applications. Recursive filters have two main weaknesses: they become unstable if the poles move outside the unit circle, and their performance indices are generally nonquadratic and may even have local minima. The adaptive IIR filter may be represented in the standard adaptive model as illustrated in Figure 2. The input-output relationship is expressed as

y(n) = Σ_{k=0}^{N−1} b_k x(n − k) + Σ_{k=1}^{N−1} a_k y(n − k), (13)

where the b_k's and a_k's are the IIR filter coefficients, and y(n) and x(n) are the output and input of the IIR filter, respectively. The IIR filter is characterized by the following transfer function [1]:

H(z) = (Σ_{k=0}^{N−1} b_k z^{−k}) / (1 − Σ_{k=1}^{N−1} a_k z^{−k}).
Note that in (13) the current output sample is a function of the past outputs y(n − k), as well as the present and past input samples x(n) and x(n − k), respectively. The power of the IIR filter comes from its feedback connections, which provide additional flexibility. Due to this, the IIR filter usually needs fewer parameters than the FIR filter for the same set of specifications.
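To make (13) concrete, the recursion can be evaluated directly sample by sample; the following minimal Python sketch (the first-order coefficient values in the usage example are hypothetical) shows how each output depends on past outputs through the feedback terms:

```python
def iir_output(x, b, a):
    """Direct evaluation of Eq. (13):
    y(n) = sum_k b[k] x(n-k) + sum_k a[k] y(n-k),
    with b = [b_0, ..., b_M] (feedforward) and a = [a_1, ..., a_N] (feedback)."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        # feedback terms: current output depends on previously computed outputs
        acc += sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(acc)
    return y

# First-order example (hypothetical values): y(n) = 0.5 x(n) + 0.9 y(n-1).
# The impulse response decays geometrically through the feedback term.
print(iir_output([1.0, 0.0, 0.0, 0.0], b=[0.5], a=[0.9]))
```

With only two coefficients, this filter has an infinitely long impulse response, which is the flexibility the text refers to.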
Newton's and steepest descent methods are used for descending toward the minimum on the performance surface. Both require an estimate of the gradient in each iteration. These gradient-estimation methods are general because they are based on taking differences between estimated points on the performance surface, that is, differences between estimates of the error ε(n). In this subsection, we present another algorithm for descending on the performance surface, known as the Least Mean Square (LMS) algorithm, and investigate it on both FIR and IIR digital filters.

The LMS Algorithm and Adaptive FIR Filtering
Given a record of data (x(n), d(n)), one can compute R_ms and P_ms; however, R_ms might not be invertible and, if it is, its inversion requires high numerical precision. A method depending on search techniques has the advantage of being simple to implement, at the expense of some inaccuracy in the final estimate. We use the gradient (steepest descent) search technique to find C_o^ms iteratively. This technique is applicable to the minimization of the quadratic performance function E_ms, since it is a convex function of the coefficients C_N, i.e., it possesses a global minimum. A gradient vector with components ∂E_ms/∂c_i, i = 0, 1, · · · , N − 1, is computed; each tap weight is then changed in the direction opposite to its corresponding gradient component and by an amount proportional to the size of that component. Therefore

C_N(l + 1) = C_N(l) − µ ∇_C E_ms,

where ∇_C E_ms is the gradient vector, the subscript indicating that the gradient is taken with respect to the components of the filter coefficients vector C_N, l is the iteration number, and µ is the convergence factor that regulates the speed and the stability of the adaptation. It is clear from Figure 3, which represents the one-dimensional case, how repeating this procedure leads to the minimum of E_ms and hence to the optimal value C_o^ms. Approximating the gradient of E_ms by the gradient of the instantaneous squared error, i.e.,

∇_C E_ms ≈ ∇_C ε²(n) = −2 ε(n) X_N(n),

gives the LMS update

C_N(n + 1) = C_N(n) + 2 µ ε(n) X_N(n).
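The FIR-LMS recursion above can be sketched in a few lines of Python; the 3-tap unknown system, step size, and iteration count below are hypothetical choices for a noise-free identification run (for unit-variance white input, λ_max ≈ 1, so µ = 0.05 is well inside the stable range):

```python
import random

random.seed(1)

# Hypothetical unknown FIR system to be identified
c_true = [0.8, -0.4, 0.2]
N = len(c_true)

mu = 0.05          # step size; must satisfy 0 < mu < 1/lambda_max
c = [0.0] * N      # adaptive filter coefficients C_N
x_buf = [0.0] * N  # input vector X_N(n) = [x(n), x(n-1), ..., x(n-N+1)]

for n in range(2000):
    x = random.gauss(0.0, 1.0)                            # white input sample
    x_buf = [x] + x_buf[:-1]
    d = sum(ct * xi for ct, xi in zip(c_true, x_buf))     # desired signal d(n)
    y = sum(ci * xi for ci, xi in zip(c, x_buf))          # filter output y(n)
    e = d - y                                             # error eps(n)
    c = [ci + 2 * mu * e * xi for ci, xi in zip(c, x_buf)]  # LMS update

print([round(ci, 3) for ci in c])  # converges toward c_true
```

Because the FIR error surface is quadratic with a single minimum, the estimate settles on the Wiener solution without any risk of local minima.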

Usually, C_N(l) is updated for every sample, as is the case when variations are to be tracked in an estimation process; when this is true, l = n. A flowchart for the LMS algorithm is given in Figure 4. A convergence analysis of the LMS algorithm has been carried out in [1], which concluded that to achieve convergence the value of µ must satisfy

0 < µ < 1/λ_max,

where λ_max is the maximum eigenvalue of R_ms.

Adaptation of IIR Digital Filter Based on LMS Algorithm
To develop an algorithm for the recursive IIR filter, let us define the time-varying coefficient vector C_N(n) and the signal vector U(n) as

C_N(n) = [b_0(n), b_1(n), · · · , a_1(n), a_2(n), · · ·]^T,
U(n) = [x(n), x(n − 1), · · · , y(n − 1), y(n − 2), · · ·]^T.

From Figure 2 and Equation (10), we can write

y(n) = C_N^T(n) U(n), ε(n) = d(n) − y(n).

This is quite similar to the nonrecursive case in (3), the main difference being that U(n) contains values of y(n) as well as x(n). We again use the gradient approximation

∇_C E_ms ≈ −2 ε(n) ∂y(n)/∂C_N. (23)

The derivatives in (23) present a special problem because y(n) is now a recursive function. Using (13) we define

α_k(n) = ∂y(n)/∂b_k = x(n − k) + Σ_j a_j(n) α_k(n − j), (24)

β_k(n) = ∂y(n)/∂a_k = y(n − k) + Σ_j a_j(n) β_k(n − j). (25)

With the derivatives defined in this manner, the gradient estimate becomes

∇̂(n) = −2 ε(n) [α_0(n), α_1(n), · · · , β_1(n), β_2(n), · · ·]^T.

Now we write the LMS algorithm as

b_k(n + 1) = b_k(n) + 2 µ_b ε(n) α_k(n),
a_k(n + 1) = a_k(n) + 2 µ_a ε(n) β_k(n).

With a nonquadratic error surface, we now have a convergence parameter µ for both the a's and the b's; we may even wish to have these factors vary with time. Using the current values of the a's and b's in (24) and (25), the LMS computation for the recursive adaptive IIR filter proceeds sample by sample as above. Initialization is the same as in the case of the adaptive FIR filter, except that here the α's and β's should be set initially to zero unless their values are known.
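The recursions (24) and (25) can be sketched for the simplest case, a first-order filter with one feedforward coefficient b_0 and one feedback coefficient a_1; the plant values, step sizes, and stability clamp below are hypothetical choices, not values from the paper:

```python
import random

random.seed(2)

# Hypothetical unknown first-order IIR plant: d(n) = 0.5 x(n) + 0.4 d(n-1)
b_true, a_true = 0.5, 0.4

b, a = 0.0, 0.0               # adaptive coefficients b_0 and a_1
mu_b, mu_a = 0.01, 0.01       # separate step sizes for the b's and the a's
y_prev = d_prev = 0.0
alpha_prev = beta_prev = 0.0  # gradient signals, initialized to zero

for n in range(20000):
    x = random.gauss(0.0, 1.0)
    d = b_true * x + a_true * d_prev       # plant output (desired signal)
    y = b * x + a * y_prev                 # adaptive filter output, Eq. (13)
    e = d - y
    alpha = x + a * alpha_prev             # dy/db_0 recursion, Eq. (24)
    beta = y_prev + a * beta_prev          # dy/da_1 recursion, Eq. (25)
    b += 2 * mu_b * e * alpha              # recursive LMS updates
    a += 2 * mu_a * e * beta
    a = max(min(a, 0.95), -0.95)           # practical guard: keep pole inside unit circle
    y_prev, d_prev = y, d
    alpha_prev, beta_prev = alpha, beta

print(round(b, 2), round(a, 2))  # approaches (b_true, a_true)
```

The clamp on `a` reflects the stability weakness noted earlier: the adaptation itself does not prevent the pole from wandering outside the unit circle.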

Evolutionary Computation: The Genetic Algorithm (GA)
The GA is a global, parallel, robust stochastic search algorithm based on the principles of natural selection and natural genetics; it combines the notions of survival of the fittest, random search, and parallel evaluation of points in the search space. A GA starts with an initial population of strings (chromosomes), which encode individuals (candidate solutions) of the optimization problem, and the population evolves toward better solutions using the GA operators described next. Typically, the objective of the GA is stated as the maximization of a fitness function F(t) that decreases as the cost function f(t) to be minimized increases [31,32]. In adaptive filtering, the GA operates on a set of filter parameters (the population of chromosomes), and a fitness value is assigned to each individual chromosome. The cost function f(t) in adaptive filtering is taken as the windowed Mean Square Error (MSE),

ε_j² = Σ_{n=1}^{t_e} [d(n) − y_j(n)]², (30)

where t_e is the size of the window over which the errors are added and stored, y_j(n) is the estimated output associated with the j-th set of estimated filter coefficients, and j is the chromosome index. The GA consists of three key stages: (1) selection, (2) cross-over, and (3) mutation. Selection denotes the procedure of choosing a group of chromosomes from the population that will contribute to the formation of the offspring for the subsequent generations. Chromosomes with greater fitness values in the population get a higher probability of being selected for the following generation. Numerous techniques have been proposed for the selection process; see, for example, [32]. The evolutionary process usually starts from a randomly generated population; in each generation, the fitness of each individual in the population is evaluated using the problem-specific fitness function based on (30).
Then, multiple individuals are randomly selected from the current population based on their fitness and combined through the cross-over operation and/or subjected to the mutation operation. The mutation operator is another means by which the GA explores the cost surface; it can create chromosomes that never occurred in earlier populations and keeps the GA from converging prematurely before the entire error surface has been examined [33,34]. The new offspring (sets of filter coefficients) then form the basis of the next generation. The basic cycle of the GA is depicted in Figure 5.
Also, the GA searches within each generation for the minimum of the estimation error, ε_min = min_j(ε_j²), over all chromosomes in the population, attempting to drive ε_min to an acceptably small value, or to zero, in the subsequent generations.
Adopting the standard GA by itself in adaptive filtering leads to a slow convergence rate; this is due to the tremendously large search space of the GA, which causes the mutation (randomization) process to waste time examining solutions in improper directions.
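The GA cycle described above (selection, cross-over, mutation, with the windowed MSE of (30) as the cost) can be sketched as follows; the tournament selection, 2-tap system, population size, mutation rate, and noise scale are all hypothetical choices for illustration:

```python
import random

random.seed(3)

def mse_cost(coeffs, x, d):
    """Windowed squared-error cost of Eq. (30) for one chromosome
    (a candidate set of FIR filter coefficients)."""
    err = 0.0
    for n in range(len(x)):
        y = sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        err += (d[n] - y) ** 2
    return err

def tournament(pop, x, d):
    """Selection: the fitter of two randomly drawn chromosomes survives."""
    i, j = random.sample(range(len(pop)), 2)
    return min(pop[i], pop[j], key=lambda c: mse_cost(c, x, d))

# Hypothetical identification data from an unknown 2-tap FIR system
x = [random.gauss(0, 1) for _ in range(64)]
d = [0.7 * x[n] + (0.3 * x[n - 1] if n >= 1 else 0.0) for n in range(64)]

pop = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(30)]
for gen in range(100):
    nxt = []
    while len(nxt) < len(pop):
        p1, p2 = tournament(pop, x, d), tournament(pop, x, d)
        child = [p1[0], p2[1]]                 # single-point cross-over
        if random.random() < 0.2:              # mutation: random exploration
            k = random.randrange(len(child))
            child[k] += random.gauss(0, 0.1)
        nxt.append(child)
    pop = nxt

best = min(pop, key=lambda c: mse_cost(c, x, d))
print([round(c, 2) for c in best])  # best chromosome approaches [0.7, 0.3]
```

Even on this tiny quadratic problem the GA needs thousands of cost evaluations, which illustrates the slow convergence that motivates hybridizing it with LMS.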

Proposed Adaptive System Identification for FIR and IIR Digital Filters
In this section, the main results of this work are presented and analyzed with both white and colored signals. The effect of colored input signals on the adaptation process is discussed first, followed by the proposed LMS-GA learning tool.

The Effect of Colored Signal On the Adaptation Process of LMS Algorithm
Suppose that the input signal x(n) is passed through a digital Low-Pass Filter (LPF), and the output of the digital filter is then applied to the adaptive system as shown in Figure 6. To show the effect of the digital low-pass filter on the adaptation process, let us examine the difference between the input and output signals of the digital filter. The input to the digital filter is shown in Figure 7a, and Figure 7b shows the autocorrelation φ_ij(t, n), where φ_ij(t, n) = E[x_i x_j], with x_i and x_j samples of x(n), i, j = 1, 2, · · · the time indices of x(n), and t = |i − j|. We notice from Figure 7 that each sample of the input signal x(n) is correlated with itself only and never with other samples, i.e., the autocorrelation is an impulse. The output of the digital low-pass filter is also a random signal, as shown in Figure 8.
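The coloring effect of the LPF can be checked numerically; the sketch below (a hypothetical 4-tap moving-average LPF) estimates the lag-1 autocorrelation of the white input and of the filtered output:

```python
import random

random.seed(4)

def autocorr(sig, lag):
    """Sample estimate of phi(lag) = E[x(i) x(i - lag)]."""
    pairs = [(sig[i], sig[i - lag]) for i in range(lag, len(sig))]
    return sum(a * b for a, b in pairs) / len(pairs)

x = [random.gauss(0, 1) for _ in range(50000)]              # white input
# Hypothetical LPF: 4-tap moving average
xf = [sum(x[n - k] for k in range(4)) / 4 for n in range(4, len(x))]

print(round(autocorr(x, 1), 2))   # near 0: white samples are uncorrelated
print(round(autocorr(xf, 1), 2))  # clearly nonzero: the LPF output is correlated
```

The nonzero off-lag correlation of the filtered signal is exactly the structure that appears as off-diagonal terms in R_ms.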

The spectral characteristics of the random signal are obtained by calculating the Fourier-transform of the correlation φ_ij(t, n) of the input signal x(n). The power spectral density of the input signal x(n) is shown in Figure 9, which is flat for all frequencies (white spectrum).

It is seen from the demonstration of the LMS convergence that the algorithm convergence time τ is determined as

τ = (1/α) (λ_max/λ_min), (31)

where α = µ λ_max/2, and λ_max and λ_min are the maximum and minimum eigenvalues of the autocorrelation matrix R_ms, respectively. The ratio of the maximum to the minimum eigenvalue is called the eigenvalue disparity and determines the speed of convergence. The convergence time depends only on the nature of the input sequence x(n) and not on the desired signal d(n). The matrix R_ms is the Toeplitz matrix whose (i, j)-th element is the autocorrelation φ(|i − j|):

R_ms = Toeplitz[φ(0), φ(1), · · · , φ(N − 1)]. (32)

It is seen from (31) that the convergence time is inversely proportional to µ and depends only on the nature of the input sequence x(n). The physical interpretation of the eigenvalues λ_i of R_ms can be illustrated by comparing them with the spectrum of the input signal x(n). It is a classical result of Toeplitz form theory that the eigenvalues are bounded by

X(e^jω)_min < λ_i < X(e^jω)_max, i = 0, 1, . . . , N − 1, (33)

where X(e^jω) is the power spectral density of the input, i.e., the Fourier-transform of the autocorrelation function φ_ij(t, n) (the elements of the R_ms matrix). As the order of the matrix, N, tends to infinity,

λ_max/λ_min → X(e^jω)_max / X(e^jω)_min. (34)

Given that the convergence time τ can be expressed as in (31), we infer that spectra for which the ratio of the maximum to the minimum spectral value is large result in sluggish convergence, while spectra with an eigenvalue disparity near unity (i.e., flat spectra) lead to rapid convergence.

The conjecture about these results is that large correlation among the input samples corresponds to a large eigenvalue disparity, which in turn decelerates the convergence of the adaptation process of the FIR filter. We now show the effect of the colored signal x′(n) on the convergence speed of the adaptation process. The autocorrelation function φ′_ij(t, n) of the colored signal x′(n) is shown in Figure 10. From Figure 10, we can verify that the impulse x′(0) is more correlated with itself and less with other impulses. The spectral density of the colored input can be obtained by computing the Fourier-transform of its autocorrelation function (the elements of the R_ms matrix), as shown in Figure 11, so the eigenvalue disparity λ_max/λ_min → X(e^jω)_max/X(e^jω)_min will be larger than for the white signal. Therefore, the colored input signal results in slow convergence.
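The eigenvalue-disparity argument can be verified directly: estimate R_ms from white and low-pass-filtered inputs and compare λ_max/λ_min. The sketch below uses NumPy; the 4-tap averaging LPF and the matrix order N = 4 are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100000)                    # white input
xf = np.convolve(x, np.ones(4) / 4, mode="valid")  # colored (low-pass) input

def disparity(sig, N=4):
    """lambda_max / lambda_min of the N x N input correlation matrix R_ms."""
    # Toeplitz R_ms built from sample autocorrelations phi(0..N-1)
    phi = [np.mean(sig[k:] * sig[: len(sig) - k]) for k in range(N)]
    R = np.array([[phi[abs(i - j)] for j in range(N)] for i in range(N)])
    lam = np.linalg.eigvalsh(R)   # eigenvalues in ascending order
    return lam[-1] / lam[0]

print(round(disparity(x), 1))   # near 1: flat (white) spectrum
print(round(disparity(xf), 1))  # much larger: colored input slows LMS convergence
```

The large disparity for the filtered input corresponds, via (31), to a long convergence time for the slowest mode of the adaptation.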

The Integrated LMS Algorithm with Genetic Search Approach (LMS-GA)
As said earlier, the GA is slow for tuning in adaptive filtering, while gradient descent techniques behave well only in local regions. To overcome the difficulties of the GA and gradient descent methods, we propose in this section a novel approach for learning the adaptive FIR and IIR filters that incorporates the essence and features of both algorithms.
The essential principle of our novel learning algorithm is to combine the evolutionary idea of the GA with the gradient descent technique to provide an organized random search amid the gradient descent calculations. In our proposed technique, the filter coefficients are represented as a chromosome, a list of real numbers. The LMS algorithm is embedded in the mutation process of the GA to discover the fastest path toward the optimal solution during the learning process. Each time the LMS learning tool gets caught in a local minimum, or the convergence of the LMS algorithm is slow (i.e., the gradient of the error is within a specific range), we invoke the GA by randomly varying the estimated filter parameter values to obtain new sets of filter coefficients. The proposed learning algorithm, namely, LMS-GA, chooses, from among the new filters and the original one, the filter with the smallest MSE (best fitness value) as the candidate for the next evolution. This process is repeated whenever the convergence speed is detected to be sluggish at uniform intervals or the LMS algorithm gets stuck in another local minimum. In the suggested learning technique, the parameters of the filter are varied during each evolution according to

Θ_i = Θ + σ_i D, 1 ≤ i ≤ m, (35)

where D is the permissible offset range for each evolution, m is the number of offspring produced in each evolution, Θ_i denotes the i-th offspring produced by the parent filter Θ, and σ_i ∈ [−1, +1] is a random number. To pick the optimum filter to be the next candidate amongst the set of new offspring during each evolution, we calculate the MSE of each new filter (Θ_i, 1 ≤ i ≤ m) by (30) over a block of time t_e. The filter with the smallest MSE is selected as the candidate for the subsequent phase of the learning process. The behavior of the proposed LMS-GA is represented by the flowchart shown in Figure 12.
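The evolution step above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the helper name `evolve` and the toy error surface are ours, and applying an independent offset to each coefficient is our reading of the perturbation σ_i D (the paper describes σ_i as a random number in [−1, +1]).

```python
# Sketch of one LMS-GA "evolution": perturb the parent coefficient vector
# theta within +/-D and keep the candidate (parent included) with the
# smallest MSE. Including the parent makes the MSE non-increasing.
import numpy as np


def evolve(theta, mse, m=5, D=0.02, rng=np.random.default_rng()):
    # theta : parent filter coefficients (1-D array)
    # mse   : callable returning the mean square error of a coefficient set
    candidates = [theta]
    for _ in range(m):
        sigma = rng.uniform(-1.0, 1.0, size=theta.shape)  # sigma in [-1, +1]
        candidates.append(theta + sigma * D)              # Theta_i = Theta + sigma_i * D
    return min(candidates, key=mse)


# Toy usage: a quadratic error surface with its minimum at [0.5, -0.3].
target = np.array([0.5, -0.3])
mse = lambda w: float(np.sum((w - target) ** 2))
theta = np.zeros(2)
for _ in range(2000):
    theta = evolve(theta, mse)
print(theta)   # drifts toward the target; never worse than the parent
```

In the full algorithm, `mse` would be the block estimate of (30) over the time window t_e rather than a closed-form surface.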
The ∆E(n) in the flowchart of Figure 12 is the error gradient, defined as

∆E(n) = [E(n) − E(n − δ)]/δ, (36)

where δ is the window size for the estimation of ∆E. The computational complexity of the LMS algorithm for the FIR filter in the case of the white input signal is found to be 2N multiplications per iteration, where N is the length of the FIR filter, while the computational complexity required for the IIR filter is (M + L)(L + 2), where L is the backward length and M is the forward length of the IIR filter, for the same order of both FIR and IIR filters (i.e., N = L + M). The computational complexity of the LMS algorithm for the FIR filter with a colored input signal is N·P + 2N, where P is the length of the digital LPF.
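The switching test in the flowchart can be sketched as follows; the function name and the exact form of the slope estimate are assumptions consistent with the definition of ∆E(n) and GT given in this paper.

```python
# Sketch of the stall test: estimate the slope of the learning curve over a
# window of delta iterations and switch to the GA step when it falls below
# the Gradient Threshold (GT), i.e., when the curve has gone flat.
def should_switch(error_history, delta, gt):
    # error_history : MSE values E(n), one per LMS iteration
    if len(error_history) <= delta:
        return False
    dE = (error_history[-1] - error_history[-1 - delta]) / delta
    return abs(dE) < gt   # flat learning curve => LMS slow or stuck

# GT itself is taken from the swing of the flat part of a reference LMS
# learning curve: GT = (max_swing - min_swing) / delta.
```

While the slope stays above GT, the ordinary LMS update continues; once it drops below GT, the coefficients are perturbed by the genetic step.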
Information 2019, 10, x 12 of 20

Numerical Results and Simulations
Consider the 4-tap adaptive FIR filter in a channel-equalization application as a plant model for system identification, whose transfer function is given in (37). The basic idea of system identification using adaptive FIR filtering depends on matching the coefficients of the adaptive filter to those of the plant. The convergence factor µ regulates the stability and convergence speed of the adaptation. A special class of input signal is generated to train the weights of the adaptive FIR filter; it consists of four level values (−3, −1, +1, and +3) governed by a uniformly generated random number R ∈ [0, 1], as shown in Figure 13, e.g., if R ∈ [0, 0.25], then x(n) = −3, and similarly for the other ranges of R. These four-level input values, generated using the scheme of Figure 13, are entered repeatedly into the input channel x(n) of Figure 14 until convergence is reached or a maximum number of iterations is achieved. The results of applying the standard LMS algorithm on an adaptive FIR filter to identify the parameters of (37) with different values of µ are shown in Table 1, and the learning curves for three different values of µ are shown in Figure 15.
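A minimal sketch of this experiment follows. The plant coefficients below are hypothetical stand-ins for (37), which is not reproduced here; the four-level mapping follows the quarter intervals of Figure 13.

```python
# Sketch: standard LMS identification of a 4-tap FIR plant driven by the
# four-level training signal. Plant taps and mu are illustrative values.
import numpy as np

rng = np.random.default_rng(1)


def four_level(n):
    # Map uniform R in [0, 1) to the levels -3, -1, +1, +3 by quarter ranges.
    return np.array([(-3.0, -1.0, 1.0, 3.0)[int(r * 4) % 4]
                     for r in rng.random(n)])


plant = np.array([0.26, 0.93, 0.26, 0.10])   # hypothetical 4-tap plant
N, mu = len(plant), 0.01
w = np.zeros(N)
x = four_level(5000)
for n in range(N, len(x)):
    u = x[n - N + 1:n + 1][::-1]             # current input vector
    e = np.dot(plant, u) - np.dot(w, u)      # desired minus filter output
    w = w + 2 * mu * e * u                   # standard LMS weight update
print(w)   # converges to the plant coefficients
```

Because the error surface of the FIR case is unimodal, the plain LMS recursion is sufficient here; no genetic step is needed.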
From the above results, we observe that the optimum value of µ, corresponding to the fewest iterations, is found to be 0.045. For the case of the colored input signal, the same input produced using the scheme of Figure 13 is applied to the input channel of Figure 6. As concluded earlier, spectra for which the ratio of the maximum to the minimum value is large result in sluggish convergence, whereas spectra with an eigenvalue disparity near unity (i.e., flat spectra) lead to rapid convergence. The digital filter of Figure 6 used in this simulation is an 8-tap FIR LPF, and the plant dynamics are given in (37). Different values of µ have been used in the colored-input case study, with the results given in Table 2. We note that the optimum value of µ, corresponding to the fewest iterations, is found to be 3. The learning curves for different values of µ are illustrated in Figure 16.
The best value of µ is found with an MSE of −163.131 dB. Concerning the adaptive IIR filter, the error surface is generally multimodal in the filter parameters. Adaptation techniques for the adaptive IIR filter can therefore easily get stuck at a local minimum away from the global minimum: some of the adaptive IIR filter coefficients will match those of the plant, while the others settle at specific constant values, which means that these coefficients are stuck at local minima. A first-order transfer function H(z) is used to represent the plant. The results of the system identification using IIR adaptive filtering are shown in Table 3. As can be seen, the best value of µ was found to be 0.065. The IIR error surface here is a special case, as it is unimodal (local minima do not exist) and has a global optimum only. However, a practical problem still exists: the pole of the adaptive filter may move outside of the unit circle, resulting in an unstable system. To solve this problem, we use a criterion which states that whenever the magnitude of the pole exceeds unity, we limit its magnitude to be less than one.
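The pole-limiting criterion can be sketched as follows. This is an illustration only: the plant pole and gain, the step size, and the equation-error formulation (regressing on the plant's past output) are our assumptions, not values or choices taken from the paper.

```python
# Sketch: first-order adaptive IIR filter y(n) = b*x(n) + a*y(n-1) trained
# with an equation-error style LMS update, clamping the pole magnitude
# below one whenever an update pushes it toward instability.
import numpy as np

rng = np.random.default_rng(2)
a_p, b_p = 0.8, 0.5          # hypothetical stable plant pole and gain
a, b, mu = 0.0, 0.0, 0.01
POLE_LIMIT = 0.99

x = rng.uniform(-1, 1, 20000)
d_prev = 0.0
for n in range(len(x)):
    d = b_p * x[n] + a_p * d_prev        # plant (desired) output
    y = b * x[n] + a * d_prev            # equation-error form uses d(n-1)
    e = d - y
    b += 2 * mu * e * x[n]
    a += 2 * mu * e * d_prev
    if abs(a) >= 1.0:                    # stability criterion: keep the
        a = np.sign(a) * POLE_LIMIT      # pole inside the unit circle
    d_prev = d
print(a, b)   # approach the plant's pole and gain
```

In a true output-error adaptation the regressor would be the filter's own past output, which is what creates the multimodal error surface discussed above; the clamp applies identically in that case.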
The learning curves for different values of µ are shown in Figure 17.

Table 3. Simulation of adaptive IIR filter (white signal).
The new LMS-GA learning tool can be applied to the system identification setup shown in Figure 14 with an FIR adaptive filter instead of an IIR one. We can then deduce the learning curve with window size δ = 8, number of offspring m = 5, and offset D = 0.02, as shown in Figure 18; the comparison results with the standard LMS algorithm are listed in Table 4. One can see that the simple LMS algorithm is faster than the new learning algorithm, because the LMS-GA is a random technique intended for a multimodal error surface. In the case of a unimodal error surface (as for the FIR adaptive filter), the simple LMS algorithm is a better choice than algorithms like the LMS-GA learning tool. The proposed LMS-GA learning tool is then exploited to learn an adaptive IIR filter, to recover the performance of the gradient descent technique (e.g., the LMS algorithm) on a multimodal error surface.
To compare the new LMS-GA learning tool with the standard LMS algorithm, we must determine the window size δ used for the estimation of ∆E and the Gradient Threshold (GT). These can be calculated from the learning curve of the simple LMS algorithm as follows: the window size δ is taken as the number of iterations between the first iteration and the iteration at which the learning curve begins to fluctuate with small variations, and GT is calculated from the maximum and minimum values of these fluctuations as GT = (max. swing − min. swing)/δ. Once these two parameters are calculated, we can apply the procedure of the new learning algorithm of Figure 12 to an adaptive IIR filter with a unimodal error surface as in (39). We conclude that the new learning algorithm converges to the same MSE as the pure LMS algorithm, but with a fewer number of iterations, as shown in Figure 19.

Figure 19. The convergence performance of the LMS-GA tool for the adaptive IIR filter.

Conclusions
In this work, adaptive algorithms are utilized to learn the parameters of digital FIR and IIR filters such that the error signal is minimized. These algorithms are the standard LMS algorithm and the proposed LMS-GA. The numerical instabilities inherent in other adaptive techniques do not exist in the conventional LMS algorithm; moreover, prior information on the signal statistics, i.e., the autocorrelation and cross-correlation matrices, is not required. The LMS algorithm produces only estimates of the adaptive filter coefficients; these estimates progressively match those of the plant over time as the parameters are updated, so the adaptive filter learns the signal characteristics and then identifies the underlying plant. Due to the multimodality of the error surface of adaptive IIR filters, a new learning algorithm, namely, LMS-GA, is proposed in this paper, which integrates the genetic search methodology with the LMS algorithm, speeds up the adaptation procedure, and offers a global searching ability. Besides, the LMS-GA preserves the characteristics and simplicity of the standard LMS learning algorithm, entails comparatively fewer computations, and has a faster convergence rate than the standard GA. The numerical simulations elucidated that the LMS-GA outperforms the standard LMS in terms of the capability to determine the global optimum solution and the convergence rate toward this solution. Future work includes investigating the stability of the proposed algorithm as well as validating it in more realistic applications such as echo and noise cancellation. Comparing the performance of the proposed LMS-GA algorithm with that of the Prediction Error Method (PEM) is another good point for future study. Finally, the hardware complexity of adaptive algorithms such as the LMS is very high, especially for higher accuracy and floating-point calculations, which adds more overhead; future work might therefore be to design a systolic array architecture to reduce this overhead.