Normalized Minimum Error Entropy Algorithm with Recursive Power Estimation

Abstract: The minimum error entropy (MEE) algorithm is known to be superior in signal processing applications under impulsive noise. In this paper, based on an analysis of the behavior of the optimum weight and of the properties that provide robustness against impulsive noise, a normalized version of the MEE algorithm is proposed. The step size of the MEE algorithm is normalized by the power of the input entropy, which is estimated recursively to reduce the computational complexity. In the equalization simulation, the proposed algorithm yields a lower minimum MSE (mean squared error) and a faster convergence speed simultaneously compared with the original MEE algorithm. Under the condition of the same convergence speed, its steady state MSE improvement is above 3 dB.


Introduction
Wired and wireless communication channels suffer from multipath fading as well as impulsive noise from various sources [1,2]. Impulsive noise can cause large instantaneous errors and system failure, so enhanced signal processing algorithms for coping with such obstacles are needed. Most algorithms are designed based on the mean squared error (MSE) criterion, but that criterion often fails in impulsive noise environments [3]. As one of the cost functions based on information theoretic learning (ITL), minimum error entropy (MEE) has been developed by Erdogmus [4]. As a nonlinear version of MEE, the decision feedback MEE (DF-MEE) algorithm is known to yield superior performance under severe channel distortions and impulsive noise [5]. It has also been shown for shallow underwater communication channels that the DF-MEE algorithm is not only robust against impulsive noise and severe multipath fading but can be further improved by modification of the kernel size [6].
One of the problems of the MEE algorithm is its heavy computational complexity, caused by the double summations required for the gradient estimation at each iteration. In [7], a computation-reducing method based on recursive gradient estimation of the DF-MEE has been proposed for practical implementation. Though those practical difficulties have been removed by the recursive method, an in-depth theoretical analysis of its optimum solutions and their behavior, which could lead to further enhancement of the algorithm, has not yet been carried out.
In this paper, based on an analysis of the behavior of the optimum weight and of the factors that mitigate the influence of large errors due to impulsive noise, we propose to employ a time-varying step size normalized by the input power, which is estimated recursively for computational efficiency. The performance comparison with MEE is discussed and verified through simulations of equalization as well as system identification problems with impulsive noise, such as can be encountered in experiments investigating physical phenomena [8].

MSE Criterion and Related Algorithms
The overall communication system model for this work is described in Figure 1. The transmitter sends a symbol $d_k$ at time $k$ through the multipath channel described by the z-transform $H(z) = \sum_i h_i z^{-i}$, and then impulsive noise $n_k$ is added to the channel output to form the received signal
$x_k = \sum_i h_i d_{k-i} + n_k$, (1)
so that the adaptive system input $x_k$ contains the noise $n_k$ and the intersymbol interference (ISI) caused by the channel's multipath [9].

With the input $X_k = [x_k, x_{k-1}, \ldots, x_{k-j}, \ldots, x_{k-L+1}]^T$ and the weight $W_k = [w_{0,k}, w_{1,k}, \ldots, w_{j,k}, \ldots, w_{L-1,k}]^T$ of the tapped delay line (TDL) equalizer, the output and the error become
$y_k = W_k^T X_k$, (2)
$e_k = d_k - y_k$. (3)
With the current weight $W_k$, a set of error samples and a set of input samples, the adaptive algorithms designed according to their own criteria, such as MSE or MEE, produce the updated weight $W_{k+1}$ with which the adaptive system makes the next output $y_{k+1}$.
Taking the statistical average $E[\cdot]$ of the error power $e_k^2$, the MSE criterion is defined as $E[e_k^2]$.
For practical reasons, the instantaneous error power $e_k^2$ can be used, and the LMS (least mean square) algorithm has been developed based on the minimization of $e_k^2$ [9]. The minimization of $e_k^2$ can be carried out by the steepest descent method utilizing the gradient of $e_k^2$,
$\partial e_k^2 / \partial W_k = -2 e_k X_k$. (4)
With Equation (4) and the step size $\mu_{LMS}$, the well-known LMS algorithm is presented as
$W_{k+1} = W_k + 2\mu_{LMS}\, e_k X_k$. (5)
By letting the gradient $\partial e_k^2 / \partial W_k$ be zero, we have the optimum condition of the LMS as
$e_k X_k = 0$. (6)
Taking the statistical average $E[\cdot]$ of Equation (6) leads us to the optimum condition of the MSE criterion,
$E[e_k X_k] = 0$. (7)
Inserting (3) into (6), we get the optimum weight of the LMS algorithm in (8). The optimum weight $W^{opt}_{LMS}$ in (8) can be expected to fluctuate wildly in impulsive noise situations, since it has no measures to protect it from impulses existing in the input vector $X_k$.
When the effect of fluctuations in the input power level is considered, the fact that the step size $\mu_{LMS}$ of the LMS algorithm should be inversely proportional to the power of the input signal $X_k$ leads to the normalized LMS algorithm (NLMS), where the step size is normalized by the squared norm of the input vector $||X_k||^2$, that is, $\mu_{NLMS}/||X_k||^2$ [9]. One of the principal characteristics of the NLMS algorithm is that the parameter $\mu_{NLMS}$ is dimensionless, whereas $\mu_{LMS}$ has the dimension of inverse power, as mentioned above. Therefore, we may view the NLMS algorithm as having an input power-dependent adaptation step size, so that the effect of fluctuations in the power level of the input signal is compensated at the adaptation level. When we assume that in the steady state $e_k$ and $X_k$ are independent, the input vector $X_k$ can be viewed as being normalized by its squared norm. Unlike the LMS or NLMS, the MEE algorithm based on the error entropy criterion is known for its robustness against impulsive noise [6]. In the following section, the MEE algorithm will be analyzed with respect to its weight behavior under impulsive noise environments.
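As an illustration of the difference between the fixed and the power-normalized step size, a minimal NumPy sketch of single LMS and NLMS iterations is given below; the function names, the regularization constant eps, and the vector conventions are our own assumptions for illustration, not taken from [9].

```python
import numpy as np

def lms_update(W, X, d, mu):
    """One LMS iteration: W <- W + mu * e_k * X_k (mu has the dimension of inverse power)."""
    e = d - W @ X
    return W + mu * e * X, e

def nlms_update(W, X, d, mu, eps=1e-8):
    """One NLMS iteration: the step size is normalized by the input power ||X_k||^2,
    so mu is dimensionless and input power fluctuations are compensated."""
    e = d - W @ X
    return W + (mu / (X @ X + eps)) * e * X, e
```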

MEE Algorithm and Magnitude Controlled Input Entropy
The MSE criterion is effective under the assumptions of linearity and Gaussianity, since it uses only second-order statistics of the error signal. When the noise is impulsive, a criterion considering all the higher-order statistics of the error signal would be more appropriate.
Error entropy as a scalar quantity provides a measure of the average information contained in a given error distribution. With $N$ error samples (sample size $N$), $\{e_k, e_{k-1}, \ldots, e_{k-N+1}\}$, the distribution function of the error, $f_E(e)$, can be constructed by kernel density estimation as in Equation (9) [10].
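For illustration, the Parzen-window (kernel density) estimate of the error distribution with a Gaussian kernel, as used in Equation (9), can be sketched as follows; the helper names and the way the window of N error samples is passed in are assumptions.

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel G_sigma(x) = exp(-x^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def error_density(e, errors, sigma):
    """Kernel density estimate f_E(e) built from the last N error samples, as in Equation (9)."""
    errors = np.asarray(errors)
    return np.mean(gaussian_kernel(e - errors, sigma))
```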
Since Shannon's entropy of the error distribution in (9) is hard to estimate and to minimize due to the integral of the logarithm of the distribution function, Renyi's quadratic error entropy $H(e)$ has been effectively used in ITL methods, as described in (10).
When the error entropy $H(e)$ in (10) is minimized, the error distribution $f_E(e)$ of the adaptive system is contracted and all higher-order moments are minimized [4].
Inserting (9) into (10) leads to the following $H(e)$, which can be interpreted in terms of interactions among pairs of error samples, where the error samples act as physical particles:
$H(e) = -\log\left[\frac{1}{N^2}\sum_{j}\sum_{i} G_{\sigma\sqrt{2}}(e_j - e_i)\right]$. (11)
Since the Gaussian kernel $G_{\sigma\sqrt{2}}(e_j - e_i)$ is always positive and decays exponentially with the squared distance, it may be considered to create a potential field. The sum of all pairwise interactions in the argument of $\log[\cdot]$ in (11) is called the information potential [4]:
$IP_e = \frac{1}{N^2}\sum_{j}\sum_{i} G_{\sigma\sqrt{2}}(e_j - e_i)$. (12)
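A direct NumPy sketch of the information potential and the corresponding Renyi's quadratic entropy $H(e) = -\log(IP_e)$ is given below; the $O(N^2)$ double sum it evaluates is exactly the computational burden mentioned in the Introduction. The function names and scaling are illustrative assumptions.

```python
import numpy as np

def information_potential(errors, sigma):
    """IP_e = (1/N^2) * sum_j sum_i G_{sigma*sqrt(2)}(e_j - e_i), as in (12)."""
    e = np.asarray(errors)
    diff = e[:, None] - e[None, :]            # all pairwise error differences e_j - e_i
    s = sigma * np.sqrt(2.0)
    G = np.exp(-diff**2 / (2.0 * s**2)) / (np.sqrt(2.0 * np.pi) * s)
    return G.mean()

def renyi_quadratic_entropy(errors, sigma):
    """H(e) = -log(IP_e); minimizing H(e) is equivalent to maximizing IP_e."""
    return -np.log(information_potential(errors, sigma))
```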
Then, minimization of the error entropy is equivalent to maximization of $IP_e$. For the maximization of $IP_e$, the gradient of (12) becomes
$\frac{\partial IP_e}{\partial W} = \frac{1}{2\sigma^2 N^2}\sum_{j}\sum_{i}(e_j - e_i)\, G_{\sigma\sqrt{2}}(e_j - e_i)\,(X_j - X_i)$. (13)
At the optimum state ($\partial IP_e/\partial W = 0$), we have
$\sum_{j}\sum_{i}(e_j - e_i)\, G_{\sigma\sqrt{2}}(e_j - e_i)\,(X_j - X_i) = 0$. (14)
Since the term $(e_j - e_k)$ indicates how far the current error $e_k$ is located from each error sample $e_j$, we may define the error pair $(e_j - e_k)$ as $e_{j,k}$, which is generated from the error space $E$ at each iteration, as in Figure 2. The term $e_{j,k}$ can be considered to contain information on the extent of the spread of the error samples. Considering that entropy is a measure of how evenly energy is distributed, or of the range of positions of the components of a system, we will refer to this information as error entropy (EE) in this paper for convenience.
Similarly, the term $(X_j - X_k)$ indicates the distance between the current input vector $X_k$ and another input vector $X_j$ in the input vector space. Therefore, with the following definition, we can say that $X_{j,k}$ contains information on the extent of the spread of the input vectors, that is, input entropy (IE). Likewise, we will refer to $X_{j,k}$ as an IE vector in this paper:
$X_{j,k} = (X_j - X_k)$. (15)
Then, with the EE samples $e_{j,i}$ and IE vectors $X_{j,i}$, Equation (14) can be rewritten as
$\sum_{j}\sum_{i} e_{j,i}\, G_{\sigma\sqrt{2}}(e_{j,i})\, X_{j,i} = 0$. (16)

Figure 2. Error space E and error entropy samples generated from error pairs.
If we consider that the sample-averaged operation $(1/N^2)\sum_j\sum_i(\cdot)$ in (16) can be replaced with the statistical average $E[\cdot]$, or vice versa, for practical reasons, the comparison between (16) and the optimum condition of the MSE criterion, $E[e_k X_k] = 0$ in (7), provides the insight that $e_k$ of the MSE criterion corresponds to the EE sample $e_{j,k}$, and $X_k$ of the MSE criterion can be related to $G_{\sigma\sqrt{2}}(e_{j,k})\, X_{j,k}$ as a kind of modified input entropy vector. We also see that the term $G_{\sigma\sqrt{2}}(e_{j,k})\, X_{j,k}$ in (16) implies that the magnitude of $X_{j,k}$ is controlled by $G_{\sigma\sqrt{2}}(e_{j,k})$. At the occurrence of a strong impulse in $n_k$, $e_k$ can be located far away from $e_j$, so that the EE sample $e_{j,k}$ takes a very large value. Then, the Gaussian function output $G_{\sigma\sqrt{2}}(e_{j,k})$ becomes very small, since it decays exponentially with $e_{j,k}^2$. In turn, the magnitude of the IE vector $X_{j,k}$ is reduced by the multiplication by $G_{\sigma\sqrt{2}}(e_{j,k})$. In this regard, the term $G_{\sigma\sqrt{2}}(e_{j,k})\, X_{j,k}$ in (16) can be interpreted as a magnitude-controlled version of the IE vector. Defining
$X^{MCIE}_{j,k} = G_{\sigma\sqrt{2}}(e_{j,k})\, X_{j,k}$ (17)
as the magnitude-controlled input entropy (MCIE), this process can be described as in Figure 3.
In an element expression, the $m$-th element of the MCIE vector in (17) is
$x^{MCIE}_{j,i,m} = G_{\sigma\sqrt{2}}(e_{j,i})\,(x_{j-m} - x_{i-m})$, (18)
and with the step size $\mu_{MEE}$ the MEE algorithm becomes
$W_{k+1} = W_k + \mu_{MEE}\,\frac{\partial IP_e}{\partial W} = W_k + \frac{\mu_{MEE}}{2\sigma^2 N^2}\sum_{j}\sum_{i} e_{j,i}\, X^{MCIE}_{j,i}$. (19)
The optimum condition in (16) can be rewritten as
$\sum_{j}\sum_{i} e_{j,i}\, X^{MCIE}_{j,i} = 0$. (20)
We may observe that the optimum condition of the MEE algorithm in (20) is very similar to (7) in terms of the error and input terms. One difference is that the MEE algorithm consists of summations of error entropy samples and input entropy vectors, while the LMS has just a single error sample and a single input vector.
On the other hand, it can be noticed that the MCIE $X^{MCIE}_{j,i}$ can keep the algorithm stable even at occurrences of large error entropy values, which arise mostly when the input is contaminated by impulsive noise. The summation over $e_{j,i}\, X^{MCIE}_{j,i}$ can also mitigate the influence of impulses, but it does not contribute much to deterring the influence of large errors, since even a single impulse can dominate the averaging (summation) operation.
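A minimal sketch of an MEE-style weight update written in terms of EE samples and MCIE vectors is shown below, assuming the double sums in (19) run over a window of the most recent N errors and input vectors and absorbing constant factors into the step size; the windowing and scaling conventions are illustrative rather than the paper's exact ones.

```python
import numpy as np

def mee_update(W, X_win, e_win, mu_mee, sigma):
    """One MEE iteration using the last N input vectors X_win (N x L) and errors e_win (N,).

    grad ~ sum_j sum_i e_{j,i} * G_{sigma*sqrt(2)}(e_{j,i}) * X_{j,i},
    where e_{j,i} = e_j - e_i (EE sample) and X_{j,i} = X_j - X_i (IE vector).
    The Gaussian factor shrinks X_{j,i} when |e_{j,i}| is large (the MCIE), which is
    what keeps the update stable under impulsive noise.
    """
    e_pair = e_win[:, None] - e_win[None, :]          # EE samples e_{j,i}
    X_pair = X_win[:, None, :] - X_win[None, :, :]    # IE vectors X_{j,i}
    s = sigma * np.sqrt(2.0)
    G = np.exp(-e_pair**2 / (2.0 * s**2)) / (np.sqrt(2.0 * np.pi) * s)
    X_mcie = G[:, :, None] * X_pair                   # magnitude-controlled IE vectors
    grad = np.einsum('ji,jil->l', e_pair, X_mcie) / e_win.size**2
    return W + mu_mee * grad
```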

Recursive Power Estimation of MCIE
The fixed step size of the MEE algorithm may require an understanding of the statistics of the input entropy prior to the adaptive filtering operation. This makes it hard in practice to choose an appropriate step size $\mu_{MEE}$ that controls the learning speed and stability.
Like the approach of the normalized LMS, which solves this kind of problem through normalization by the summed power of the current input samples [9,11], we propose heuristically to normalize the step size by the summed power of the current MCIE elements in (18), that is,
$\mu^{N}_{MEE} = \mu \,/\, \sum_{j} ||X^{MCIE}_{j,k}||^2$. (21)
Considering the fact that impulses can defeat the averaging operation, as explained in Section 3, we notice that the denominator may become large at an incidence of impulsive noise; in turn, $\mu^{N}_{MEE}$ becomes very small, which may induce very slow convergence. To avoid this kind of situation, we may adopt a sliding window as in (22). However, this approach places a heavier computational burden on the MEE algorithm. To reduce the burdensome computation, we track the power recursively using a single-pole low-pass filter, i.e.,
$P(k) = \beta\, P(k-1) + (1-\beta) \sum_{j} ||X^{MCIE}_{j,k}||^2$, (23)
where $\beta$ ($0 < \beta < 1$) controls the bandwidth and time constant of the system, whose transfer function $T(z)$ with respect to its input $\sum_{j} ||X^{MCIE}_{j,k}||^2$ is given by
$T(z) = (1-\beta)\,\frac{z}{z-\beta}$. (24)
Then, the resulting algorithm, which we will refer to in this paper as the normalized MEE (NMEE), becomes
$W_{k+1} = W_k + \frac{\mu}{P(k)}\,\frac{\partial IP_e}{\partial W}$. (25)
On the other hand, the NLMS has been developed based on the principle of minimum disturbance, which states that the tap-weight change of an adaptive filter from one iteration to the next, that is, the squared Euclidean norm (SEN) of the change in the tap-weight vector, $||W_{k+1} - W_k||^2$, should be minimal [9]. From that perspective, the effectiveness of the proposed NMEE algorithm can be analyzed based on the disturbance (SEN) around the optimum state, as in (26): for the existing MEE algorithm of (19) the SEN is given by (27), and for the proposed NMEE algorithm by (28); comparing the two leads to (29) and (30). This result indicates that the proposed method is more suitable than the conventional MEE when the MCIE power is greater than $\mu/\mu_{MEE}$, which means when a smaller $\mu_{MEE}$ is demanded, such as when the input signal is contaminated with strong impulsive noise. On the other hand, it can be noticed that when the input signal is not large, so that a bigger $\mu_{MEE}$ can be employed for faster convergence, the proposed method may not be guaranteed to be better than the fixed step size MEE algorithm.
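The recursive MCIE power tracking of (23) and the resulting normalized step size of (25) can be sketched as follows; the initial value of P, the regularization constant eps, and the exact quantity whose squared norm is accumulated are assumptions for illustration.

```python
import numpy as np

def update_mcie_power(P_prev, X_mcie, beta):
    """Single-pole low-pass power tracker: P(k) = beta*P(k-1) + (1-beta)*sum(||X_mcie||^2),
    i.e., the recursion whose transfer function is T(z) = (1-beta) z / (z - beta)."""
    return beta * P_prev + (1.0 - beta) * np.sum(np.asarray(X_mcie)**2)

def nmee_step_size(mu, P_k, eps=1e-8):
    """Normalized step size of the proposed NMEE: mu divided by the tracked MCIE power."""
    return mu / (P_k + eps)
```

In an adaptation loop, the returned step size simply replaces $\mu_{MEE}$ in the MEE update sketched earlier, which is the entire computational overhead of the proposed normalization.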
On the other hand, there are many step size selection methods for gradient-based algorithms, and we need to verify that this approach is the right one for the MEE problem. Considering that the proposed step size selection method is motivated and designed by the concept of input power normalization, as in the NLMS algorithm, it is reasonable to investigate whether input power normalization is effective in the MEE algorithm under impulsive noise.
When we employ the squared norm of the input vector, $||X_k||^2$, for the step size of the MEE algorithm (we will refer to this as NMEE2 for convenience), the squared Euclidean norm of the weight change becomes (33). Assuming that the error entropy $e_{j,i}$ and the MCIE $X^{MCIE}_{j,i}$ are independent in the steady state, (33) becomes (34), that is, the SEN in (29) with the squared norm of the unprocessed input adopted in place of the MCIE power. This indicates that $SEN_{NMEE2}$ might vary considerably, since the denominator containing the impulsive noise can fluctuate from small values to large values as strong impulses dominate the sum operation. From this analysis, the fact that the MCIE in $SEN_{NMEE}$ is normalized by the MCIE power $P(k)$, which uses the output of the magnitude controller that cuts off the outliers from strong impulses, leads us to the argument that the proposed method is appropriate for impulsive noise situations. This will be tested in Section 5. As observed in (30), when the input signal is not in a strong impulsive noise environment, the proposed method may not be better than the existing MEE algorithm. The effectiveness of the proposed NMEE algorithm under strong impulsive noise will be investigated in the following section.

Results and Discussion
The simulation for observing the optimum weight behavior of the MEE algorithm is carried out for equalization of the multipath channel $H(z) = 0.26 + 0.93z^{-1} + 0.26z^{-2}$ [12]. The transmitted symbol $d_k$ sent at time $k$ is randomly chosen from the symbol set $\{d_1 = -3, d_2 = -1, d_3 = 1, d_4 = 3\}$ ($M = 4$). The impulsive noise $n_k$ in (1) consists of background white Gaussian noise (BWGN) and impulses (IM) with variances $\sigma^2_{BWGN}$ and $\sigma^2_{IN}$, respectively. The impulses are generated according to a Poisson process with incident rate $\varepsilon$ [10], so that the distribution $f_N(n_k)$ of the noise is a mixture of the BWGN and the Gaussian impulses. The BWGN with $\sigma^2_{BWGN}$ = 0.001 is added to the channel output over the whole time, and the impulses are generated with variance $\sigma^2_{IN}$ = 50. The TDL equalizer has 11 tap weights ($L = 11$). As parameters of the MEE algorithm, the sample size $N$, the kernel size $\sigma$ and the convergence parameter $\mu_{MEE}$ are 20, 0.7 and 0.01, respectively. The step size $\mu_{LMS}$ for the LMS algorithm is 0.001. All parameter values are selected so as to produce the lowest minimum MSE in this simulation.
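The impulsive noise used in the simulations can be generated as in the following sketch, which reads the setup above as BWGN present at all times plus Gaussian impulses occurring at the incident rate ε; the Bernoulli approximation of the Poisson arrivals and the RNG details are assumptions.

```python
import numpy as np

def impulsive_noise(n_samples, var_bwgn=0.001, var_imp=50.0, eps_rate=0.03, seed=0):
    """BWGN of variance var_bwgn at all times; impulses of variance var_imp occur with
    incident rate eps_rate (a Bernoulli approximation of the Poisson arrival process)."""
    rng = np.random.default_rng(seed)
    bwgn = rng.normal(0.0, np.sqrt(var_bwgn), n_samples)
    hits = rng.random(n_samples) < eps_rate
    impulses = hits * rng.normal(0.0, np.sqrt(var_imp), n_samples)
    return bwgn + impulses
```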
Firstly, the weight traces are investigated through simulation in order to verify the property of robustness against impulsive noise. The impulses are generated with $\varepsilon$ = 0.01 for clear observation of the weight behavior. The impulse noise depicted in Figure 4 is applied to the channel output in the steady state, that is, after convergence.
Figure 5 shows the learning curves of the weights $w_{4,k}$ and $w_{5,k}$ (only two weights are chosen due to the page limitation). At around 5000 samples, both reach their optima completely, and then they undergo impulsive noise like that in Figure 4. In Figure 5, it is observed that MEE and LMS have the same steady state weight values, and each weight trace of MEE in the steady state shows no fluctuations, remaining undisturbed under the strong impulses. This is obviously in contrast to the case of the LMS algorithm, where the traces of $w_{4,k}$ and $w_{5,k}$ have sharp perturbations at impulse occurrences and remain perturbed for a long time, though gradually dying out. Comparing with the optimum weight of the LMS in Equation (8), we can notice that the optimum weight of MEE involves averaging operations and the MCIE. Since the averaging operations can easily be defeated even by just one strong impulse, we can conclude that the dominant contributor to robustness against impulsive noise is the MCIE.
Secondly, the effectiveness of the proposed NMEE algorithm (25) designed with the MCIE is investigated through a learning performance comparison with the original MEE algorithm in (19) under the same impulsive noise with $\sigma^2_{IN}$ = 50 and $\varepsilon$ = 0.03 as in the work [5], in which the impulsive noise is applied over the whole time. The MSE learning results are shown in Figure 6.
The LMS algorithm converges very slowly and stays at about −8 dB of MSE in the steady state. This result can be explained by the expression of $W^{opt}_{LMS}$ in (8) having no measures to protect it from fluctuations caused by impulsive noise, as discussed in Section 3. On the other hand, the MEE algorithm, equipped with the magnitude controller for the IE, converges in about 1000 samples even under the strong impulsive noise. This result supports the analysis that the MCIE $X^{MCIE}_{j,k}$ keeps the algorithm (19) and its steady state weight undisturbed by large error values that may be induced by excessive noise such as impulses.
As for the performance comparison between MEE and NMEE in Figure 6, NMEE shows a lower minimum MSE and a faster convergence speed simultaneously. The difference in convergence speed is about 500 samples and that in minimum MSE is around 1 dB. When compared under the condition of the same convergence speed, the difference in minimum MSE is about 3 dB. This performance gap indicates that the proposed method of tracking the power of the MCIE recursively and using it to normalize the step size is significantly effective in terms of performance as well as computational complexity.
In Figure 7, the MCIE power becomes large as the MEE algorithm converges, and after convergence the trace shows large variations, mostly above 6. The condition $P(k) > 6$ in this simulation implies that when NMEE is employed, the value $\mu$ according to (32) must be greater than $6\mu_{MEE}$ for better performance. The fact that this is exactly in accordance with the choice $\mu = 6\mu_{MEE}$ described in Figure 6 justifies the effectiveness of the proposed method by simulation. In the same simulation environment, the MSE learning curves for the two input power normalization approaches, NMEE and NMEE2, are compared in Figure 8. As observed in Figure 8, the input power normalization approach for variable step size selection in the MEE algorithm shows different MSE performance according to which signal power is used for the normalization. When NMEE is employed, in which the magnitude-controlled input entropy (MCIE) is used for the power normalization, the MSE learning performance yields a steady state MSE better by above 2 dB and a convergence speed faster by about 1000 samples than when NMEE2 is adopted, in which the squared norm of the unprocessed input, $||X_k||^2$, is used for the normalization. As discussed in Section 4, under strong impulsive noise the power of the MCIE is the right choice for step size normalization for better performance.
In system identification applications of adaptive filtering, as in the work [8], the desired signal is derived by passing a white Gaussian input through the unknown system. The unknown system in this simulation is of length 9. The impulse response of the unknown system is chosen to follow a triangular wave form that is symmetric with respect to the central tap point [9,13]. The TDL filter has 9 tap weights. The input signal is a white Gaussian process with zero mean and unit variance. The same impulsive noise as used in Figure 6, uncorrelated with the input, is added to the output of the unknown system. The MSE learning curves are depicted in Figure 9.
One can observe from Figure 9 that the proposed NMEE achieves a lower steady-state MSE than the conventional MEE algorithm in the system identification problem as well.
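For completeness, the system identification setup described above can be sketched as follows; the specific triangular tap values and their normalization are illustrative assumptions, since the paper does not list them.

```python
import numpy as np

def make_system_id_data(n_samples, var_bwgn=0.001, var_imp=50.0, eps_rate=0.03, seed=1):
    """Desired signal for system identification: white Gaussian input passed through an
    assumed 9-tap triangular unknown system, plus BWGN and impulses at rate eps_rate."""
    rng = np.random.default_rng(seed)
    h = np.array([1., 2., 3., 4., 5., 4., 3., 2., 1.])   # triangular, symmetric about tap 5
    h = h / np.linalg.norm(h)                             # illustrative normalization
    x = rng.normal(0.0, 1.0, n_samples)                   # zero-mean, unit-variance input
    clean = np.convolve(x, h, mode='full')[:n_samples]    # unknown system output
    noise = rng.normal(0.0, np.sqrt(var_bwgn), n_samples)
    hits = rng.random(n_samples) < eps_rate
    noise += hits * rng.normal(0.0, np.sqrt(var_imp), n_samples)
    return x, clean + noise, h
```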

Conclusions
The MEE algorithm is known to outperform MSE-based algorithms in most signal processing applications in impulsive noise environments. The conventional MEE algorithm has a fixed step size, so in practice it may be necessary to employ a time-varying step size that appropriately controls its learning performance.
Based on the analysis of the behavior of the optimum weight and of the role of the MCIE in mitigating the influence of large errors, it was found in this paper that the NMEE, employing a step size normalized by the power of the current MCIE elements, can yield a lower minimum MSE and a faster convergence speed simultaneously. Under the condition of the same convergence speed, the performance enhancement of about 3 dB in the equalization simulation leads us to conclude that the proposed method of recursively estimating the MCIE power for normalization of the step size is significantly effective in terms of both performance and computational complexity.

Figure 4. The impulse and background noise for the simulation of the behavior of the optimum weight.


Figure 5. The behavior of the weight values $w_{4,k}$ and $w_{5,k}$ with impulsive noise being added in the steady state.

Figure 7. The trace of the MCIE power $P(k)$ of the MEE algorithm under the same simulation conditions used for Figure 6.

Figure 8. Learning curves of the NMEE and NMEE2 algorithms for comparison of the two methods of input power normalization.


Figure 9. Learning curves of MEE and NMEE for system identification.
