Regularization Factor Selection Method for l1-Regularized RLS and Its Modification against Uncertainty in the Regularization Factor

Featured Application: This algorithm can be applied to various kinds of sparse channel estimation, e.g., room impulse response, early reflection, and underwater channel response.

Abstract: This paper presents a new l1-RLS method for sparse impulse response estimation. A new regularization factor calculation method is proposed for l1-RLS that requires no information about the true channel response in advance. In addition, we derive a new model to compensate for uncertainty in the regularization factor. Estimation results for many different kinds of sparse impulse responses show that the proposed method without a priori channel information is comparable to the conventional method with a priori channel information.


Introduction
Room impulse response (RIR) estimation is a problem that arises in many applications of acoustic signal processing. RIR identification [1] is fundamental for various applications such as room-geometry-related spatial audio [2][3][4][5], acoustic echo cancellation (AEC) [6], speech enhancement [7], and dereverberation [8]. As reported in [9], the RIR has relatively large magnitude values during the early part of the reverberation and fades to smaller values during the later part. This indicates that most RIR entries have values close to zero, i.e., the RIR has a sparse structure. The sparse RIR model is useful for estimating RIRs in real acoustic environments when the source is given a priori [10]. There has been recent interest in adaptive algorithms that exploit sparsity in various signals and systems [11][12][13][14][15][16][17][18][19][20][21][22]. Many adaptive algorithms based on least mean square (LMS) [11,12] and recursive least squares (RLS) [14][15][16][17] have been reported with different penalty functions. In sparse estimation research, Eksioglu and Tanc [17] proposed a sparse RLS algorithm, l1-RLS, which is fully recursive like the plain RLS algorithm, together with a calculation method for the regularization factor. These recursive algorithms have the potential for sparse RIR estimation; however, the regularization factor must be set before applying them, and its calculation requires information about the true sparse channel response for good performance. The authors in [18,19] also proposed recursive regularization factor selection methods; however, these methods still need the true impulse response in advance.
In this paper, we propose a new regularization factor calculation method for the l1-RLS algorithm in [17]. The new regularization factor calculation needs no information about the true channel response in advance.

In the sparse channel estimation problem of interest, the system observes a signal represented by an M × 1 vector x(k) = [x_k, ..., x_{k−M+1}]^T at time instant k, performs filtering, and obtains the output y(k) = x^T(k) w_o(k), where w_o(k) is the M-dimensional actual system of finite impulse response (FIR) type. For system estimation, an adaptive filter applies an M-dimensional vector w(k) to the same signal vector x(k), produces an estimated output ŷ(k) = x^T(k) w(k), and calculates the error signal e(k) = y(k) + n(k) − ŷ(k), where n(k) is the measurement noise, y(k) is the output of the actual system, and ŷ(k) is the estimated output. In order to estimate the channel impulse response, an adaptive algorithm minimizes the cost function defined in Equation (1). Gradient-based minimization of Equation (1) yields Equation (2), which is the normal equation for the least squares solution. In particular, w_o(k) is considered a sparse system when the number of nonzero coefficients K is less than the system order M. In order to estimate the sparse system, most estimation algorithms exploit the nonzero coefficients in the system [11][12][13][14][15][16][17]. In [17], Eksioglu and Tanc proposed a fully recursive l1-regularized algorithm by minimizing the objective function shown in Equation (3).
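The filtering and error-signal computation described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's algorithm: the filter length, tap positions, and noise level are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 64                       # filter length
w_o = np.zeros(M)            # sparse "true" system: K = 4 nonzero taps out of M
w_o[[3, 17, 40, 55]] = rng.standard_normal(4)

def regressor(x, k, M):
    """Build x(k) = [x_k, ..., x_{k-M+1}]^T, zero-padded for k < M - 1."""
    xk = np.zeros(M)
    n = min(k + 1, M)
    xk[:n] = x[k::-1][:n]
    return xk

x = rng.standard_normal(1000)           # input signal
w = np.zeros(M)                         # adaptive filter estimate w(k), here uninitialized
k = 500
xk = regressor(x, k, M)
y = xk @ w_o                            # actual system output y(k)
d = y + 0.01 * rng.standard_normal()    # noisy observation y(k) + n(k)
e = d - xk @ w                          # error signal e(k) = y(k) + n(k) - y_hat(k)
```

An adaptive algorithm would now use e(k) and x(k) to update w(k) toward w_o; the l1-regularized update itself is given by the recursions of [17].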
When we solve Equation (4), we should select the regularization factor as shown in Equation (5).
Appl. Sci. 2019, 9, 202
where f(w(k)) = ||w(k)||_1 and the subgradient of f(w) is ∇_s ||w||_1 = sgn(w). In Equation (5), the regularization factor contains a parameter, ρ, which should be set beforehand. In [17], the parameter was set as ρ = f(w_true) = ||w_true||_1, with w_true denoting the impulse response of the true channel. There was no further discussion of how to set ρ. However, it is not practical to know the true channel in advance.

Measure of Sparseness
In [20], the sparseness of a channel impulse response is measured by Equation (6):

χ(ŵ) = (L / (L − √L)) (1 − ||ŵ||_1 / (√L ||ŵ||_2)),   (6)

where ||ŵ||_p is the p-norm of ŵ and L is the dimension of ŵ. The range of χ is 0 ≤ χ ≤ 1, and its value depends on the sparseness of ŵ: as ŵ becomes sparser, the sparsity χ approaches 1, and as ŵ becomes denser, χ approaches 0. We often obtain a small but nonzero value of χ even for a dense channel. For example, Figure 1 shows the relation between the value of χ and the percentage of nonzero components in ŵ with L = 215; all possible numbers of nonzero components in ŵ are considered.
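The sparseness measure of Equation (6) is straightforward to evaluate; the sketch below checks its two extremes (a single nonzero tap gives χ = 1, a uniform vector gives χ = 0). The function name and test vectors are our own.

```python
import numpy as np

def sparseness(w):
    """Sparseness measure chi of Equation (6): ranges from 0 (dense,
    all taps equal) to 1 (a single nonzero tap)."""
    L = len(w)
    l1 = np.sum(np.abs(w))
    l2 = np.sqrt(np.sum(w ** 2))
    return (L / (L - np.sqrt(L))) * (1.0 - l1 / (np.sqrt(L) * l2))

sparse = np.zeros(215); sparse[0] = 1.0   # single nonzero tap -> chi = 1
dense = np.ones(215)                      # all taps equal     -> chi = 0
```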


New ρ Selection Method in the Sparsity Regularization Constant γ_k
Section 2 shows that the regularization constant γ_k in Equation (5) needs ρ to be set as ρ = ||true system impulse response||_1 = ||w_true||_1. However, a new constant selection method is needed because Equation (5) is not practical. Therefore, Section 4 proposes a new method to set this constant.
For a practical constant selection, we can consider using the estimated vector ŵ instead of the true vector w_true, because ŵ, the solution with the l1-norm penalty, will be closer to the sparse true vector than the solution of the conventional RLS. As the iterations proceed, ŵ converges to the true value. Conventional RLS also converges to the true value; however, the solution with the l1-norm penalty is closer to the sparse true value. Therefore, we can use the sparse estimate ŵ instead of w_true when we set ρ, and the uncertainty arising from this substitution is compensated through a TLS solution in the next section. When we determine ρ using the estimated ŵ, we choose between the averaged ρ and the current estimate ||ŵ||_1. Table 1 summarizes the ρ selection steps.
The determination of the ρ value shown in Table 1 proceeds as follows. In Step 1, the sparsity of the estimated ŵ is calculated; the sparsity expresses the sparseness of ŵ as a number [23]. In Step 2, the l1-norm of the estimated ŵ is scaled and the result is averaged with the previous ρ value. The scaling factor approaches 1 as the sparsity χ gets close to 1, and approaches e^{−1} ≈ 0.37 as χ gets close to 0. Therefore, the scaling leaves the l1-norm of ŵ almost unchanged for a sparse ŵ, and instead shrinks the l1-norm for a dense ŵ. In Step 3, the smaller of the averaged ρ and the l1-norm of the estimated ŵ is selected as the new ρ value. In this case, the ρ value becomes completely new if the l1-norm of the estimated ŵ is selected; otherwise, the previous trend is maintained. As Figure 1 shows, the reference value 0.75 used in Step 3 corresponds to fewer than 16% of the impulse response taps being nonzero.

Table 1. ρ selection method in the sparsity regularization constant γ_k.
Step 1. Sparsity: χ(ŵ) = (L / (L − √L)) (1 − ||ŵ||_1 / (√L ||ŵ||_2)), where L is the length of the impulse response.
Step 2. Scaling and averaging: the l1-norm of ŵ is scaled by a factor that approaches 1 as χ → 1 and e^{−1} as χ → 0, and the result is averaged with the previous ρ.
Step 3. Selection: the smaller of the averaged ρ and ||ŵ||_1 is taken as the new ρ, with the sparsity reference value 0.75.
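A minimal sketch of one ρ selection iteration follows. The scaling factor e^{χ−1} is our assumption: it matches the behavior described above (it tends to 1 as χ → 1 and to e^{−1} ≈ 0.37 as χ → 0), but the paper's exact table entries may differ, so treat this as illustrative only.

```python
import numpy as np

def select_rho(rho_prev, w_hat):
    """One iteration of the rho selection steps (sketch).

    ASSUMPTION: the scaling factor exp(chi - 1) is inferred from the
    described limits (-> 1 as chi -> 1, -> e^-1 as chi -> 0)."""
    L = len(w_hat)
    l1 = np.sum(np.abs(w_hat))
    l2 = np.linalg.norm(w_hat)
    # Step 1: sparsity of the current estimate
    chi = (L / (L - np.sqrt(L))) * (1.0 - l1 / (np.sqrt(L) * l2))
    # Step 2: scale the l1-norm and average with the previous rho
    rho_avg = 0.5 * (rho_prev + np.exp(chi - 1.0) * l1)
    # Step 3: take the smaller of the averaged rho and the current l1-norm
    return min(rho_avg, l1), chi

w_hat = np.zeros(64); w_hat[[5, 20, 41]] = [0.9, -0.4, 0.6]
rho, chi = select_rho(rho_prev=0.8, w_hat=w_hat)
```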

New Modeling for l 1 -RLS with Uncertainty in the Regularization Factor
If we set ρ to a constant, the regularization factor in Equation (5) becomes Equation (7). Then, using Equation (8), Equation (4) becomes Equation (9). The subgradient ∇_s ||w||_1 = sgn(w) is represented as in Equation (10). By applying Equation (10) to Equation (9), we obtain Equation (11), where w_i is the i-th element of w(k); this is then simplified into Equation (12).

Equation (12) is very similar to the system model in Figure 2, which is contaminated by noise both at the input and at the output. Suppose that an example of the system in Figure 2 is represented as Equation (13), where x_k is x(k), n_{i,k} is n_i(k), and n_{o,k} is n_o(k). Equation (13) is simplified as Equation (14). If we multiply Equation (14) by A^H and average, we get Equation (15), which can be rewritten as Equation (16) and then represented as Equation (17).

When we compare Equation (12) with Equation (17), the two system models have almost the same form. Therefore, the TLS method can feasibly be applied to Equation (12) [24][25][26][27][28][29][30]. We thus expect to obtain almost the same performance as l1-RLS with the true channel response if we apply the TLS method with the regularization factor using the new ρ in Table 1. In the next section, we summarize the l1-RTLS (recursive total least squares) algorithm in [29].

Summary of l1-RTLS for the Solution of l1-RLS with Uncertainty in the Regularization Factor
Lim, one of the authors of this paper, proposed the TLS solution for l1-RLS known as l1-RTLS [30]. In this section, we summarize the l1-RTLS algorithm of [30] as the solution of Equation (11).
The TLS system model assumes that both input and output are contaminated by additive noise, as in Figure 2. The output is given by Equation (18), where the output noise n_o(k) is Gaussian white noise with variance σ_o². The noisy input vector of the system is modeled by Equation (19), where n_i(k) = [n_i(k), n_i(k − 1), ..., n_i(k − M + 1)]^T and the input noise n_i(k) is Gaussian white noise with variance σ_i². For the TLS solution, we set the augmented data vector as in Equation (20). The correlation matrix is represented as in Equation (21). As in [27,28], the TLS problem then becomes that of finding the eigenvector associated with the smallest eigenvalue of R, and Equation (22) is the typical cost function for finding this eigenvector.
where R(k) is the sample correlation matrix at the k-th instant, and w̄(k) = [ŵ^T(k), −1]^T, in which ŵ(k) is the estimate of the unknown system at the k-th instant. We modify the cost function by adding a penalty function in order to reflect prior knowledge about the sparsity of the true system.
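The smallest-eigenvector principle behind TLS can be demonstrated with a batch (non-recursive) sketch. This is not the paper's recursive l1-RTLS; it only illustrates how the estimate [ŵ^T, −1]^T emerges from the augmented correlation matrix. All sizes and noise levels are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 5000
w_o = rng.standard_normal(M)                    # unknown system

X = rng.standard_normal((N, M))                 # clean input regressors
d = X @ w_o                                     # clean outputs
sigma = 0.05
Xn = X + sigma * rng.standard_normal((N, M))    # input corrupted by n_i(k)
dn = d + sigma * rng.standard_normal(N)         # output corrupted by n_o(k)

# Augmented data vectors [x(k)^T + n_i(k)^T, d(k) + n_o(k)]^T, stacked row-wise
Xbar = np.hstack([Xn, dn[:, None]])
R = (Xbar.T @ Xbar) / N                         # (M+1) x (M+1) sample correlation matrix

# TLS: eigenvector of R for the smallest eigenvalue, scaled so that the
# last entry equals -1, i.e. wbar = [w_hat^T, -1]^T
vals, vecs = np.linalg.eigh(R)                  # eigenvalues in ascending order
v = vecs[:, 0]
w_tls = -v[:M] / v[M]
```

Because input and output noise have equal variance here, the TLS estimate w_tls is close to w_o, whereas ordinary least squares on (Xn, dn) would be biased toward zero by the input noise.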

Simulation Results
This section confirms the performance of the proposed algorithm in sparse channel estimation. In the first experiment, the channel estimation performance is compared with other algorithms using randomly generated sparse channels. In this simulation, we follow the same experimental scenario as [17]. The true system vector w_true has 64 coefficients. In order to generate the sparse channel, we set the number of nonzero coefficients, S, among the 64 coefficients and randomly position the nonzero coefficients. The values of the coefficients are drawn from an N(0, 1/S) distribution, where N(·) denotes the normal distribution. In the simulation, we estimate the channel impulse response by the proposed algorithms, i.e., l1-RLS using the ρ in Table 1 and l1-RTLS using the ρ in Table 1. For comparison, we estimate the channel impulse response by l1-RLS using the true channel response; in addition, we also run the regular RLS algorithm in an oracle setting (oracle-RLS) where the positions of the true nonzero system parameters are assumed to be known. For the estimated channel results, we calculate the mean square deviation (MSD), MSD = E[||ŵ − w_true||²], where ŵ is the estimated channel response and w_true is the true channel response. For the performance evaluation, we simulate the algorithms on the sparse channels for S = 4, 8, 16, and 32. Figure 3 illustrates the MSD curves. For S = 4, Figure 3a shows that the estimation performance of l1-RTLS using the regularization factor with the ρ in Table 1 is almost the same as that of l1-RLS using regularization with the true channel impulse response. However, the performance of l1-RLS using the regularization factor with the ρ in Table 1 gradually degrades and shows a kind of uncertainty accumulation effect. For the other values of S, we observe the same trend in the MSD curves.
Therefore, we can confirm that the new regularization factor selection method and the new modeling for l1-RLS can estimate the sparse channel as well as l1-RLS using regularization with the true channel impulse response. In all simulation scenarios, the oracle-RLS algorithm produces the lowest MSD, as expected.
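The channel generation and the MSD figure of merit from the first experiment can be sketched as follows. The helper names and the dB conversion are our own choices; the paper specifies only the N(0, 1/S) draw, the random positions, and the deviation measure.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_sparse_channel(M=64, S=4):
    """Random sparse channel as in the first experiment: S nonzero taps
    at random positions out of M, values drawn from N(0, 1/S)."""
    w = np.zeros(M)
    idx = rng.choice(M, size=S, replace=False)
    w[idx] = rng.normal(0.0, np.sqrt(1.0 / S), size=S)
    return w

def msd_db(w_hat, w_true):
    """Squared deviation ||w_hat - w_true||^2 in dB for one realization;
    averaging over realizations gives the MSD curve."""
    return 10.0 * np.log10(np.sum((w_hat - w_true) ** 2))

w_true = random_sparse_channel(M=64, S=4)
baseline = msd_db(np.zeros(64), w_true)   # MSD of the all-zero initial estimate
```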
Table 2 summarizes the steady-state MSD values as S varies from 4 to 32. The results show that the proposed l1-RTLS with the new ρ is comparable to l1-RLS with the true channel.
In the second experiment, we compare the channel estimation performance using a room impulse response. The size of the room is (7.49, 6.24, 3.88) m. The position of the sound source is (1.53, 0.96, 1.12) m and the position of the receiver is (1.81, 5.17, 0.71) m. T60 is set to 100 ms and 400 ms. The impulse response of the room is generated using the program in [31]. We focus on the direct and early reflection parts of the RIR, because these parts have a sparse property; this is the part that is estimated in AEC applications [32]. This part is also related to localization and clarity in room acoustics [33][34][35]. Comparing the impulse response (IR) generated with T60 = 100 ms to the channel with 65 coefficients used in the first experiment, it is equivalent to S = 4; in the same manner, the IR generated with T60 = 400 ms is equivalent to S = 10. Table 3 summarizes the steady-state MSD values. The results show the same trend as Table 2: in RIR estimation, the proposed l1-RTLS with the new ρ is also comparable to l1-RLS with the true channel.

Conclusions
In this paper, we have proposed a regularization factor selection method for recursive adaptive sparse estimation. The regularization factor needs no prior knowledge of the true channel impulse response. We have also reformulated the recursive estimation algorithm in an l1-RTLS form. This formulation is robust to the uncertainty in the regularization factor without a priori knowledge of the true channel impulse response. Simulations show that the proposed regularization factor and the l1-RTLS algorithm provide performance comparable to l1-RLS with knowledge of the true channel impulse response.