Norm Penalized Joint-Optimization NLMS Algorithms for Broadband Sparse Adaptive Channel Estimation

A joint-optimization method is proposed for enhancing the behavior of the $l_1$-norm- and sum-log-norm-penalized NLMS algorithms to meet the requirements of sparse adaptive channel estimation. The improved channel estimation algorithms are realized by using a state variable model to formulate a joint-optimization problem that gives a proper trade-off between convergence and channel estimation behavior. The joint-optimization problem optimizes the step size and the regularization parameter to minimize the estimation bias of the channel. Numerical results obtained from a broadband sparse channel estimation indicate the good behavior of the developed joint-optimized NLMS algorithms in comparison with the previously proposed $l_1$-norm- and sum-log-norm-penalized NLMS and least mean square (LMS) algorithms.


Introduction
Recently, adaptive channel estimation has been extensively studied [1,2]. Among the adaptive filtering frameworks, the LMS algorithm has been widely discussed for adaptive control, system identification and channel estimation applications owing to its low computational complexity and easy practical realization [3][4][5]. Although the LMS algorithm can effectively estimate the broadband multi-path channel, it is sensitive to the scaling of the input signal, and it is difficult to select a suitable learning rate that achieves stable and robust channel estimation behavior [6][7][8][9][10][11]. Subsequently, the normalized LMS (NLMS) algorithm was proposed, which normalizes the update by the power of the input training signal in order to overcome this problem [12,13]. However, the conventional NLMS algorithm does not perform well for sparse channel estimation.
Additionally, the broadband multi-path channel or the underwater communication channel might be a sparse channel, which has been studied in recent decades [13][14][15][16][17]. Measurements of wireless channels show that the channel impulse response can be regarded as sparse. That is to say, only a few taps of the impulse response are dominant in most multi-path channels, while the majority of the taps are zero or have near-zero magnitudes. The traditional LMS- and NLMS-based channel estimation methods cannot make use of the inherent sparse properties of these broadband multi-path channels [6][7][8][9][10][11][12][13]. To utilize the sparse structure of the broadband sparse channel, compressed sensing (CS) methods have been introduced for developing various channel estimation algorithms for sparse cases [18][19][20]. Although some of these CS-based sparse channel estimators can achieve robust estimation performance, they may have high complexity when dealing with time-varying channels, or they have difficulty in constructing the desired measurement matrices under the restricted isometry property limitation [21]. Thus, simple sparse channel estimation developments have attracted much attention in recent decades.
Sparse LMS algorithms have been presented under the inspiration of the CS techniques [22,23]; they are realized by adding a norm constraint term to the cost function of the LMS. The first sparse LMS algorithm motivated by the CS technique introduces an $l_1$-norm constraint term into the basic LMS to exploit the inherent sparse characteristics of the broadband multi-path channel [24][25][26][27]. As a result, a zero attractor appears in the updating equation of the sparse LMS algorithm, which gives the zero-attracting (ZA) LMS (ZA-LMS) algorithm. Furthermore, a reweighted ZA-LMS (RZA-LMS) was reported by using a sum-log constraint instead of the $l_1$-norm penalty in the ZA-LMS algorithm [24,27]. Subsequently, the zero-attracting techniques have been widely researched, and a great number of sparse LMS algorithms have been developed by using different norm constraints, such as the $l_p$-norm and smooth approximations of the $l_0$-norm [28][29][30][31][32]. Furthermore, the ZA technique has also been extended to the affine projection algorithm and the NLMS algorithm to broaden the applications of the ZA algorithms [12][33][34][35][36][37][38][39][40], which includes the ZA-NLMS and RZA-NLMS algorithms. However, the affine projection algorithm has higher complexity than the NLMS algorithm, which limits its applications. Thus, the sparse NLMS algorithms have been extensively studied and have been used for sparse channel estimation. However, the behavior of these NLMS-based channel estimators is affected by the normalized step size and the regularization parameter. The normalized step size has an important effect on the compromise between the channel estimation behavior and the convergence speed, while the regularization parameter depends on the SNR of the system [41]. The joint-optimization method proposed in [41], however, cannot utilize the sparsity of multi-path channels. In addition, the step size and the regularization parameter should be selected to address the conflicting requirements of the channel estimation behavior and the convergence speed.
The rest of this paper is organized as follows. In Section 2, we review the basic NLMS and the previously-reported ZA-NLMS and RZA-NLMS algorithms in the context of estimating a broadband multipath wireless communication channel. Section 3 gives the derivation of the joint-optimization scheme and the proposed ZAJO-NLMS and RZAJO-NLMS algorithms. In Section 4, our ZAJO-NLMS and RZAJO-NLMS algorithms are evaluated through a broadband sparse multipath channel, and their channel estimation behaviors are discussed and compared with the previous ZA-NLMS, RZA-NLMS, ZA-LMS, RZA-LMS, traditional LMS and NLMS algorithms. Finally, a short summary is given in Section 5.

NLMS-Based Sparse Adaptive Channel Estimation Algorithm
We review the traditional NLMS and its sparse forms, which include the ZA-NLMS and RZA-NLMS, over a sparse multipath wireless channel, which is given in Figure 1.
$\mathbf{x}(n) = [x(n), x(n-1), \cdots, x(n-N+1)]^T$ is used as a training input signal, which is transmitted over an FIR channel $\mathbf{w} = [w_0, w_1, \cdots, w_{N-1}]^T$. In this paper, $N$ denotes the length of the multipath channel, and $(\cdot)^T$ is the transpose operation. The channel output is $y(n) = \mathbf{x}^T(n)\mathbf{w}$. The expected signal $d(n) = y(n) + v(n)$ is acquired at the receiver, where $v(n)$ denotes additive white Gaussian noise (AWGN). The NLMS-based channel estimation algorithms aim to estimate the unknown sparse channel $\mathbf{w}$ by utilizing $\mathbf{x}(n)$ and $d(n)$ to minimize the instantaneous error $e(n) = d(n) - \mathbf{x}^T(n)\hat{\mathbf{w}}(n)$, where $\hat{\mathbf{w}}(n)$ denotes the estimated channel vector. The cost function of the traditional NLMS is depicted as:

$$\min \|\hat{\mathbf{w}}(n+1) - \hat{\mathbf{w}}(n)\|_2^2 \quad \text{subject to} \quad d(n) - \mathbf{x}^T(n)\hat{\mathbf{w}}(n+1) = 0. \qquad (1)$$

By using the Lagrange multiplier method to carry out the minimization of (1) and introducing a controlling step size, the updating equation of the traditional NLMS is obtained [3][4][5]:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \mu \frac{e(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{x}(n) + \delta}, \qquad (2)$$

where $\mu$ denotes the step size of the NLMS algorithm, and $\delta$ represents a regularization parameter with a small value that prevents division by zero. Although the NLMS algorithm can give a good estimate of the sparse channel $\mathbf{w}$ by the use of the update Equation (2), it cannot exploit the sparsity of practical multi-path channels. Recently, the ZA technique has been introduced into the cost function of the traditional NLMS for exploiting the sparseness of the channel. The ZA-NLMS algorithm uses an $l_1$-norm penalty to modify the original cost function of the traditional NLMS. Thus, the modified cost function for the reported ZA-NLMS is written as:

$$\min \|\hat{\mathbf{w}}(n+1) - \hat{\mathbf{w}}(n)\|_2^2 + \gamma_1 \|\hat{\mathbf{w}}(n+1)\|_1 \quad \text{subject to} \quad d(n) - \mathbf{x}^T(n)\hat{\mathbf{w}}(n+1) = 0, \qquad (3)$$

where $\gamma_1$ denotes a zero-attracting strength parameter used to balance the estimation behavior and the sparseness measure $\|\hat{\mathbf{w}}(n+1)\|_1$. Furthermore, the Lagrange multiplier method is utilized to find a solution of (3). As a result, the updating equation of ZA-NLMS is described as [12]:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \mu_{\text{ZA-NLMS}} \frac{e(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{x}(n) + \delta} - \rho_{\text{ZA-NLMS}}\,\mathrm{sgn}\{\hat{\mathbf{w}}(n)\}, \qquad (4)$$

where $\mu_{\text{ZA-NLMS}}$ denotes the step size and $\rho_{\text{ZA-NLMS}}$ denotes the zero-attracting strength controlling factor of the ZA-NLMS algorithm. The ZA-NLMS algorithm utilizes the sparse characteristic of the multi-path wireless channel. However, it exerts a uniform zero attraction on all of the channel taps. Thus, the ZA-NLMS algorithm may degrade the estimation behavior when the channel is less sparse. To address this problem, a sum-log penalty [42] is utilized to form the RZA-NLMS algorithm, whose cost function, with zero-attracting strength parameter $\gamma_2$, is written as:

$$\min \|\hat{\mathbf{w}}(n+1) - \hat{\mathbf{w}}(n)\|_2^2 + \gamma_2 \sum_{i=0}^{N-1} \log\left(1 + \varepsilon|\hat{w}_i(n+1)|\right) \quad \text{subject to} \quad d(n) - \mathbf{x}^T(n)\hat{\mathbf{w}}(n+1) = 0. \qquad (5)$$

We also employ the Lagrange multiplier method to solve Equation (5). Then, the channel coefficients of the RZA-NLMS algorithm are updated by [12]:

$$\hat{w}_i(n+1) = \hat{w}_i(n) + \mu_{\text{RZA-NLMS}} \frac{e(n)x(n-i)}{\mathbf{x}^T(n)\mathbf{x}(n) + \delta} - \rho_{\text{RZA-NLMS}} \frac{\mathrm{sgn}\{\hat{w}_i(n)\}}{1 + \varepsilon|\hat{w}_i(n)|}, \qquad (6)$$

or in vector form:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \mu_{\text{RZA-NLMS}} \frac{e(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{x}(n) + \delta} - \rho_{\text{RZA-NLMS}} \frac{\mathrm{sgn}\{\hat{\mathbf{w}}(n)\}}{1 + \varepsilon|\hat{\mathbf{w}}(n)|}, \qquad (7)$$

where $\mu_{\text{RZA-NLMS}}$ is the step size, and $\rho_{\text{RZA-NLMS}}$ denotes the zero-attracting strength controlling factor of the RZA-NLMS algorithm. Here, $\varepsilon$ is a threshold that controls the reweighting factor $1/(1 + \varepsilon|\hat{w}_i(n)|)$. From the above discussions, it is observed that the previously-proposed ZA-NLMS and RZA-NLMS algorithms exert a zero attractor on each iteration. These zero attractors drive the zero channel coefficients to zero more quickly than the traditional NLMS algorithm does. Thus, the sparse NLMS algorithms, namely ZA-NLMS and RZA-NLMS, utilize different sparse penalties to obtain different zero attractors. The traditional NLMS-based channel estimation algorithm can be summarized as:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \text{adaptive error update}. \qquad (8)$$

Compared with the traditional NLMS method in (8), the sparse NLMS-based channel estimation algorithms, ZA-NLMS and RZA-NLMS, add a zero attractor, and hence, their updating equation is:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \text{adaptive error update} + \text{zero attractor}. \qquad (9)$$
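The three updates reviewed in this section differ only in the zero-attractor term, which the following minimal Python sketch tries to make explicit. It assumes the standard forms of the NLMS, ZA-NLMS and RZA-NLMS updates; the function names (`nlms_step`, `run`, `sq_dev`), the default parameter values and the white Gaussian training signal are illustrative assumptions, not values taken from the paper.

```python
import random


def sgn(v):
    """Sign function: returns -1, 0 or 1."""
    return (v > 0) - (v < 0)


def nlms_step(w_hat, x, d, mu=0.5, delta=1e-3, rho=0.0, eps=None):
    """One coefficient update.

    rho == 0            -> traditional NLMS
    rho > 0, eps None   -> ZA-NLMS (uniform zero attractor)
    rho > 0, eps given  -> RZA-NLMS (reweighted zero attractor)
    All default values are illustrative assumptions.
    """
    e = d - sum(wi * xi for wi, xi in zip(w_hat, x))  # instantaneous error e(n)
    norm = sum(xi * xi for xi in x) + delta           # x^T(n) x(n) + delta
    new_w = []
    for wi, xi in zip(w_hat, x):
        attract = rho * sgn(wi)                       # zero attractor
        if eps is not None:
            attract /= 1.0 + eps * abs(wi)            # reweighting factor
        new_w.append(wi + mu * e * xi / norm - attract)
    return new_w, e


def run(w, n_iter=2000, noise_std=0.01, seed=0, **kw):
    """Estimate channel w from white Gaussian training data (assumed setup)."""
    rng = random.Random(seed)
    w_hat = [0.0] * len(w)
    x = [0.0] * len(w)
    for _ in range(n_iter):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]            # shift in a new input sample
        d = sum(wi * xi for wi, xi in zip(w, x)) + rng.gauss(0.0, noise_std)
        w_hat, _ = nlms_step(w_hat, x, d, **kw)
    return w_hat


def sq_dev(w, w_hat):
    """Squared deviation between the true and estimated channels."""
    return sum((a - b) ** 2 for a, b in zip(w, w_hat))
```

Calling `run(w)` gives the plain NLMS estimate, `run(w, rho=1e-4)` the ZA-NLMS estimate and `run(w, rho=1e-4, eps=10.0)` the RZA-NLMS estimate, so the three can be compared on the same channel with `sq_dev`.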

Proposed Sparse Joint-Optimization NLMS Algorithms
Although the sparse NLMS algorithms reviewed above effectively utilize the inherent sparsity of the wireless multi-path channel to achieve superior channel estimation behavior, their performance is affected by the step size, the regularization parameter $\delta$ and the zero-attracting strength controlling factor. Consequently, various techniques have been presented to enhance the behavior of the ZA-NLMS and RZA-NLMS algorithms, including variable step size methods and parameter-adjusting techniques [32]. Herein, we concentrate on constructing a joint-optimization method for the regularization parameter $\delta$ and the step size.
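Next, we introduce our proposed ZAJO-NLMS algorithm in detail. Here, we consider the updating equation of the ZA-NLMS algorithm:

$$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \mu_{\text{ZA-NLMS}} \frac{e(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{x}(n) + \delta} - \rho_{\text{ZA-NLMS}}\,\mathrm{sgn}\{\hat{\mathbf{w}}(n)\}, \qquad (10)$$

where $\mu_{\text{ZA-NLMS}}$ denotes the step size, $\delta$ represents the regularization parameter and:

$$e(n) = d(n) - \mathbf{x}^T(n)\hat{\mathbf{w}}(n) \qquad (11)$$

denotes the instantaneous channel estimation error at instant $n$.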
As in [41], $E\{\cdot\}$ represents expectation, and $\delta_x^2$ represents the variance of $x(n)$. For large $N$, we can get $\mathbf{x}^T(n)\mathbf{x}(n) \approx N\delta_x^2$ [43]. Then, we have: Next, we consider $\mathbf{w}(n)$ as a channel that can be modeled by a simplified first-order Markov model [41], where $\mathbf{g}(n)$ represents an AWGN signal with zero mean. Furthermore, we assume that $\mathbf{g}(n)$ is independent of the channel $\mathbf{w}(n)$. Therefore, we can get: where $\mathbf{I}_N$ is an $N \times N$ identity matrix. Thus, the variance $\delta_g^2$ captures the uncertainty in $\mathbf{w}(n)$. Here, we define the a posteriori bias as: Here, we can get: Taking the $l_2$-norm of (16), taking expectations on both sides and removing the uncorrelated products based on the i.i.d. assumptions, we have: Now, we concentrate on the last five terms in Equation (17). From the above discussion, the instantaneous error is expressed as: By taking Equation (18) into account, we can get: where $\mathrm{tr}[\cdot]$ represents the trace of a matrix. Assume that the misalignments at instants $n$ and $(n+1)$ are uncorrelated. In the steady state, the a posteriori bias correlation matrix is approximated as a diagonal matrix [41] because the biases of the individual coefficients tend to be uncorrelated. Thus, we obtain: Similarly, the cross-correlation $E\{\mathbf{x}^T(n)\mathbf{g}(n)e(n)\}$ can also be obtained based on (18), where the correlation matrix of $\mathbf{g}(n)$ is regarded as a diagonal matrix. By eliminating the uncorrelated terms, we obtain: Then, the expectation term $E\{e^2(n)\mathbf{x}^T(n)\mathbf{x}(n)\}$ can also be calculated by taking Equation (18) into consideration. Similarly, we have: Here, we assume that the correlation matrix of $\mathbf{x}(n)$ approximates a diagonal matrix, an assumption that has been widely used for simplifying the analysis [41,44]. Furthermore, this also motivates us to further develop the second term in Equation (22) based on the Gaussian moment factoring theorem [45]. Then, we obtain: As we know, in the steady state, we get: and $\rho_{\text{ZA-NLMS}}$ is very small. Thus, Equation (17) can be approximated as: From the discussions above, we substitute Equations (20), (21) and (23) into Equation (25) and denote $m(n) = E\{\|\Delta(n+1)\|_2^2\}$; we have: where: and: From the result in Equation (26), we can see that the convergence speed and the misadjustment components are separated from each other. The first term of Equation (26) plays a significant role in the convergence speed of adaptive filters, which depends on $\mu_{\text{ZA-NLMS}}$, $\delta$, the input signal power and the filter length. It is worth noting that the convergence speed component does not rely on $\delta_v^2$ and $\delta_g^2$ of the model [41]. Thus, the noise power $\delta_v^2$ and the uncertainty $\delta_g^2$ have no effect on the convergence. The fastest convergence speed is achieved when Equation (27) reaches its minimum. Then, we have: Therefore, we can get: where $\mu_H$ is the step size that gives the fastest convergence. As we know, the regularization parameter $\delta$ is small, and the length $N$ is large. Thus, $\mu_H \approx 1$ achieves the fastest convergence speed, which is a well-known result [3,44]. Furthermore, the stability condition is obtained by letting $f(\mu_{\text{ZA-NLMS}}, \delta, N, \delta_x^2) < 1$, which results in: Again, by considering $\delta = 0$ and $N \gg 2$, the stability condition of the NLMS and ZA-NLMS algorithms is obtained, which is written as $0 < \mu_{\text{NLMS/ZA-NLMS}} < 2$.
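The large-$N$ approximation $\mathbf{x}^T(n)\mathbf{x}(n) \approx N\delta_x^2$ used in the derivation above can be checked numerically. The sketch below is only an illustration under the assumption of a white, zero-mean Gaussian input; the function name `worst_relative_error` and the chosen lengths and trial counts are hypothetical.

```python
import random


def worst_relative_error(N, var_x=1.0, trials=100, seed=0):
    """Largest |x^T x - N*var_x| / (N*var_x) observed over several random vectors.

    Assumes i.i.d. zero-mean Gaussian samples with variance var_x.
    """
    rng = random.Random(seed)
    std_x = var_x ** 0.5
    worst = 0.0
    for _ in range(trials):
        energy = sum(rng.gauss(0.0, std_x) ** 2 for _ in range(N))  # x^T x
        worst = max(worst, abs(energy - N * var_x) / (N * var_x))
    return worst


if __name__ == "__main__":
    for N in (16, 256, 4096):
        print(N, worst_relative_error(N))
```

The relative error shrinks roughly as $\sqrt{2/N}$, so the approximation is loose for a short channel such as $N = 16$ and tight only for long filters.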
The second term of Equation (26) has a large effect on the misalignment of the proposed algorithm, and it depends significantly on the noise power $\delta_v^2$ and the uncertainty $\delta_g^2$ of the model. As these two parameters increase, the misalignment also increases [41]. The lowest misalignment is obtained when Equation (28) reaches its minimum. Furthermore, by taking the step size into account, we can get the lowest misalignment, which can be expressed as: As we know, the broadband channel is always time-varying, and hence, $\delta_g^2 = 0$. In this case, $\mu_L = 0$. That is to say, the lowest misalignment is achieved as the step size approaches zero [3]. Based on the convergence analysis above, we follow an optimization criterion that minimizes the channel estimation misalignment to obtain the optimized sparse RZA-NLMS and ZA-NLMS algorithms.
An ideal adaptive filtering algorithm needs low misalignment and a rapid convergence speed. Unfortunately, the results in Equations (30) and (32) point in opposite directions. Thus, we need to optimize the step size to enhance the channel estimation behavior. Furthermore, the regularization affects the behavior of the sparse NLMS algorithms. From (27), we can see that the convergence speed decreases when the regularization parameter $\delta$ increases, while the misalignment in Equation (28) always increases when the regularization parameter $\delta$ decreases. Thus, we should control the step size and $\delta$ to balance their effects on the performance of the channel estimation algorithms. Additionally, we also follow a minimization problem with respect to the channel estimation misalignment. According to Equation (26) and assuming that these two parameters are time dependent, we can impose [41]: Then, we can get the same result, which is expressed as: which gives a joint-optimization procedure. Then, we introduce Equation (34) into Equation (10) to obtain the updating equation of our ZAJO-NLMS algorithm. Then, we can get: Obviously, we should update $m(n)$ in (35). Then, substituting (34) into (26) results in: When $n \to \infty$, we have: whose solution is expressed as: Similarly to the derivation of the ZAJO-NLMS, a reweighting factor is incorporated into the ZAJO-NLMS. As a result, the updating equation of the RZAJO-NLMS algorithm is obtained: In the proposed ZAJO- and RZAJO-NLMS algorithms, the regularization parameter $\delta$ and the step size are jointly optimized, while the zero attractor remains unchanged. From the above discussions, we find that there are two additional zero-attraction terms, namely $-\gamma_{\text{ZAJO}}\,\mathrm{sgn}\{\hat{\mathbf{w}}(n)\}$ and $-\gamma_{\text{RZAJO}}\,\mathrm{sgn}\{\hat{\mathbf{w}}(n)\}/(1 + \varepsilon|\hat{\mathbf{w}}(n)|)$, in the ZAJO-NLMS and RZAJO-NLMS algorithms, which distinguish them from the algorithm in [41]. In this paper, the proposed ZAJO-NLMS and RZAJO-NLMS algorithms are implemented by combining the zero-attraction-based NLMS with the joint-optimization method in [41]. As a result, the ZAJO-NLMS and RZAJO-NLMS algorithms are constructed to deal with sparse channel estimation, and they can give better performance owing to the joint-optimization and the zero attractors. Our contributions can be summarized as follows:
(1) Two optimized algorithms with zero attractors, ZAJO-NLMS and RZAJO-NLMS, have been proposed for sparse channel estimation in the context of the state variable model.
(2) The proposed ZAJO-NLMS and RZAJO-NLMS algorithms are realized by using the joint-optimization method and the zero-attraction techniques to minimize the channel estimation misalignment.
(3) The behaviors of the proposed ZAJO-NLMS and RZAJO-NLMS algorithms are evaluated for estimating sparse channels.
(4) The ZAJO-NLMS and RZAJO-NLMS algorithms can achieve both faster convergence and lower misalignment than the ZA- and RZA-NLMS algorithms owing to the joint-optimization, which effectively adjusts the step size and the regularization parameter.
In the future, we will develop an algorithm to optimize the zero-attractor terms.

Results and Discussion
We construct several experiments to investigate the estimation behavior of our ZAJO-NLMS and RZAJO-NLMS algorithms over a multi-path wireless communication channel, which is a general sparse channel model obtained from measurements [14,17] and which has been widely used for verifying the estimation performance of NLMS-based channel estimators [6,7,9,12,14,24][27][28][29][30][31][32][33][34][35][36][37][38][39][40]. The channel estimation behavior is evaluated using the mean square error (MSE), and the channel estimation performance is compared with the traditional LMS, NLMS, ZA-LMS, RZA-LMS, ZA-NLMS and RZA-NLMS algorithms. Here, the multipath channel has a length of $N = 16$, and the estimation performance is analyzed in detail for different sparsity levels $K$. The $K$ dominant coefficients are distributed randomly in the channel, and the channel is normalized so that $\|\mathbf{w}\|_2^2 = 1$. An example of a typical broadband sparse multi-path wireless channel is given in Figure 2.
Here, the sparse channel has a length of 16, and it has three dominant coefficients. The number of non-zero coefficients is denoted as the sparsity level $K$. In all of the experiments, $x(n)$ is a Gaussian signal, and $v(n)$ is Gaussian noise. The received signal has a power of $E_b = 1$. The power of the noise $v(n)$ is given by $\delta_v^2$. The estimation behaviors of our ZAJO-NLMS and RZAJO-NLMS algorithms are assessed by the MSE given by:

$$\mathrm{MSE}\{\hat{\mathbf{w}}(n)\} = E\{\|\mathbf{w} - \hat{\mathbf{w}}(n)\|_2^2\}. \qquad (40)$$

Since the step size and the regularization parameter have been optimized in the derivation of our ZAJO-NLMS and RZAJO-NLMS, only the zero-attracting parameter remains to affect the performance of the proposed algorithms. Thus, we investigated the effects of $\gamma_{\text{ZAJO}}$ and $\gamma_{\text{RZAJO}}$ on the MSE. The performance is shown in Figures 3 and 4 for $\gamma_{\text{ZAJO}}$ and $\gamma_{\text{RZAJO}}$, respectively. We can see that both $\gamma_{\text{ZAJO}}$ and $\gamma_{\text{RZAJO}}$ have important effects on the sparse channel estimation. As $\gamma_{\text{ZAJO}}$ decreases from $3 \times 10^{-3}$ to $1 \times 10^{-4}$, the MSE of the ZAJO-NLMS algorithm shown in Figure 3 improves. For $\gamma_{\text{ZAJO}} = 1 \times 10^{-5}$, both the convergence speed and the MSE deteriorate. From Figure 4, it is observed that $\gamma_{\text{RZAJO}}$ shows the same trend as $\gamma_{\text{ZAJO}}$. Thus, $\gamma_{\text{ZAJO}}$ and $\gamma_{\text{RZAJO}}$ can be properly selected to achieve the desired channel estimation performance. According to the parameter effects in Figures 3 and 4, we choose $\gamma_{\text{ZAJO}} = \gamma_{\text{RZAJO}} = 3 \times 10^{-4}$ to investigate the effects of the sparsity level on the sparse channel estimation performance.
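The channel model and the error metric described above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' simulation code: the Gaussian distribution of the dominant tap values, the averaging over independent estimates as a stand-in for the expectation, and the function names `sparse_channel`, `squared_deviation` and `empirical_mse` are all assumptions.

```python
import math
import random


def sparse_channel(N=16, K=3, seed=0):
    """Length-N channel with K randomly placed dominant taps and unit l2-norm.

    Gaussian tap amplitudes are an assumption; the text only fixes N, K,
    the random positions and the normalization.
    """
    rng = random.Random(seed)
    w = [0.0] * N
    for idx in rng.sample(range(N), K):   # K distinct random tap positions
        w[idx] = rng.gauss(0.0, 1.0)
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]          # enforce ||w||_2^2 = 1


def squared_deviation(w, w_hat):
    """Squared l2 distance between the true channel and one estimate."""
    return sum((a - b) ** 2 for a, b in zip(w, w_hat))


def empirical_mse(w, estimates):
    """Average squared deviation over several independent estimates."""
    return sum(squared_deviation(w, est) for est in estimates) / len(estimates)
```

For example, `sparse_channel(16, 3)` produces a channel of the kind the text describes for Figure 2, and `empirical_mse(w, estimates)` approximates the MSE when `estimates` holds the coefficient vectors from repeated runs at the same iteration.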
Next, we set the sparsity level of the broadband sparse multi-path channel to $K = 1, 2, 4$ and $8$ to analyze the channel estimation performance. The simulation parameters are The results are comparatively shown in Figures 5-8 for $K = 1$, $K = 2$, $K = 4$ and $K = 8$, respectively. It is observed from Figure 5 that our RZAJO-NLMS algorithm achieves the quickest convergence and the smallest steady-state error compared with the traditional LMS, NLMS, ZA-LMS, RZA-LMS, ZA-NLMS, RZA-NLMS and ZAJO-NLMS algorithms. The proposed RZAJO-NLMS algorithm achieves a much larger gain than the RZA-NLMS algorithm. Additionally, our proposed ZAJO-NLMS algorithm achieves better channel estimation behavior than the RZA-NLMS algorithm for $K = 1$. When the sparsity level $K$ increases from 2 to 8, the gain of the RZAJO-NLMS algorithm over the RZA-NLMS algorithm becomes smaller. However, our proposed RZAJO-NLMS algorithm still outperforms all of the other sparse channel estimation algorithms with respect to both the MSE and the convergence. When $K = 8$, our proposed ZAJO-NLMS algorithm converges more quickly than the traditional NLMS while reaching the same MSE level. In contrast, the behavior of the ZA-NLMS becomes worse than that of the traditional NLMS. Therefore, our proposed RZAJO- and ZAJO-NLMS algorithms are more useful for adaptive sparse channel estimation applications.

Conclusions
We proposed joint-optimization sparse NLMS algorithms, namely RZAJO-NLMS and ZAJO-NLMS. The joint-optimization was realized by using a state model to improve the channel estimation performance of both the ZA- and RZA-NLMS algorithms. Our RZAJO- and ZAJO-NLMS algorithms are based on the joint-optimization of the step size and the regularization parameter. The proposed joint-optimization was derived in detail. Furthermore, the estimation behavior of our RZAJO- and ZAJO-NLMS algorithms was evaluated on a broadband sparse multi-path channel with different sparsity levels. The results verified that our RZAJO-NLMS algorithm provides the fastest convergence and the lowest MSE. In addition, the proposed ZAJO-NLMS outperforms the previously-reported ZA-NLMS and traditional NLMS algorithms.
This study provided the RZAJO-NLMS and ZAJO-NLMS algorithms based on the zero-attracting and joint-optimization techniques. The proposed joint-optimization technique can be extended to the $l_p$-norm ($0 \le p \le 1$) constrained NLMS, the normalized LMF (NLMF) and the normalized least mean mixed norm (NLMMN) algorithms to enhance sparse channel estimation or sparse system identification performance. In addition, the proposed method can be applied to two-dimensional (2D) adaptive filters for image processing, which can be used for medical image denoising applications [46]. Moreover, our proposed RZAJO-NLMS and ZAJO-NLMS algorithms can be integrated into orthogonal frequency-division multiplexing (OFDM) and multiple-input multiple-output (MIMO) OFDM systems to improve the quality of the communication systems [47,48].

Figure 2. A typical broadband sparse multi-path channel.

Figure 3. The effect of $\gamma_{\text{ZAJO}}$ on the zero-attracting joint-optimization (ZAJO)-NLMS for sparse channel estimation.