Frequency-Domain Filtered-x LMS Algorithms for Active Noise Control: A Review and New Insights

Abstract: This paper presents a comprehensive overview of the frequency-domain filtered-x least mean-square (FxLMS) algorithms for active noise control (ANC). The direct use of frequency-domain adaptive filters for ANC results in two kinds of delays, i


Introduction
Acoustic noise control is essential for reducing the noise level in modern society since noise seriously affects human health [1,2]. Passive noise control, which is based on using reactive devices, e.g., Helmholtz resonators and quarter wavelength resonators, and using resistive materials, e.g., acoustic linings and porous membranes, is very effective for reducing high-frequency noise, but not so effective for low-frequency noise reduction. Active noise control (ANC) based on the principle of superposition is an appealing method for low-frequency noise reduction [3][4][5][6]. In practice, passive noise control and ANC methods are usually combined to provide wideband noise reduction.
In ANC systems, the filtered-x least mean-square (FxLMS) algorithm is widely used to update the weights of the control filter, but the complexity of the FxLMS algorithm increases linearly with the filter length. In certain applications, the length of the control filter is on the order of several thousand taps. Therefore, the FxLMS and other time-domain algorithms, e.g., the filtered-x affine projection (FxAP) [7][8][9], are prohibitively complex for real-time systems. The frequency-domain adaptive filter (FDAF) [10][11][12][13][14][15][16][17] has been successfully used in echo cancellation [18][19][20][21], acoustic feedback cancellation [22], and beamforming [23] due to its good convergence behavior and low complexity. In [24][25][26][27], the block LMS (BLMS) was extended to ANC applications and the corresponding frequency-domain implementation was provided. The experiment in [27] used both periodic and band-limited signals as the reference signals, but the delay problem was not discussed. In [28], a more efficient implementation of BLMS was presented, where the filtering of the reference signal by the secondary path was also implemented using the FFT. Meanwhile, the fast Hartley transform (FHT) was adopted to reduce the complexity. The frequency-domain implementation was extended to the multichannel case in [26,29]. The aforementioned algorithms are obtained by applying the FDAF originally used in echo cancellation to the ANC problem. However, the algorithm requirements for echo cancellation and ANC are quite different. Specifically, the direct use of the FDAF for ANC introduces two kinds of delays. First, there is at least a one-block delay between the input of the reference signal and the output of the cancelling signal because the FDAF algorithm is implemented on a block-by-block basis. This delay can violate the causality constraint, which is a major concern for broadband ANC [4,30,31]. Second, a delay between the weight adaptation and the observation of the error signal is introduced by the secondary path. The behavior of the filtered-x algorithm is then similar to that of the delayed-LMS algorithm, which reduces the upper bound of the step size and slows the convergence and reconvergence rate [32]. In addition, the peak complexity of the FDAF algorithms presented in [28] is quite high because the FFT operations must be completed in one sampling interval; therefore, these algorithms are impractical for real-time implementation.
Many approaches have been proposed to reduce the two kinds of delays. The modified filtered-x structure [33,34] was used in the frequency-domain ANC in [35], which makes the adaptive algorithm behave like the standard FDAF and thus removes the delay in the weight adaptation. The partitioned-block FDAF (PBFDAF) algorithm was introduced for ANC to reduce the delay in the signal path [36][37][38][39]. This is achieved by partitioning the whole impulse response into several small sections, but this method does not completely eliminate the input-output delay. Several delayless algorithms were then proposed to totally remove the delay in the signal path. The calculation of the adaptive filter output can be implemented directly using the time-domain convolution and hence the delay in the signal path is removed [40][41][42][43]. However, the complexity of the convolution in the time domain is still high. In [44], a delayless FDAF algorithm originally used in echo cancellation [45] was extended to ANC, but the computational burden at certain sampling periods is rather high. Two computationally efficient FDAF algorithms for ANC were proposed in [46,47] using the delayless approach in [48,49], but the delay in the filter adaptation is not removed.
Although many efforts have been devoted to the frequency-domain FxLMS algorithms, these algorithms have not been thoroughly compared and analyzed. To fill this gap, this paper presents a comprehensive review of the frequency-domain FxLMS algorithms. Only mono-channel algorithms are considered in this paper, since they can be straightforwardly extended to the multi-channel case. The conventional frequency-domain FxLMS algorithms are reviewed, and then the effects of the two kinds of delays on the overall performance are analyzed. Specifically, it was found that calculating the adaptive filter output and the weight vector update in different orders leads to different convergence performances, which is quite different from system identification problems. The update-first approach is then proposed to improve the stability. Several delayless frequency-domain algorithms for ANC are surveyed, but these algorithms do not remove the delay related to the secondary path. To address this problem, we present a new delayless frequency-domain FxLMS algorithm that completely removes the aforementioned two types of delays and overcomes the shortcomings of the state-of-the-art frequency-domain FxLMS algorithms. The complexity of the frequency-domain approaches is evaluated in terms of both peak and average multiplications per sample. Simulations are carried out to evaluate the convergence performance and stability of the frequency-domain FxLMS algorithms.

FxLMS Algorithm
The diagram of the FxLMS algorithm is presented in Figure 1, where $P(z)$ is the primary path and $S(z)$ is the secondary path. We define the weight vector of the control filter at the time index $n$ as $\mathbf{w}(n) = [w_0(n), w_1(n), \cdots, w_{N_w-1}(n)]^T$ with a length of $N_w$. The reference signal $x(n)$ is picked up by a reference microphone and then filtered by the adaptive filter to generate the cancelling signal $y(n)$ that drives a secondary loudspeaker:

$$y(n) = \mathbf{x}^T(n)\mathbf{w}(n) \tag{1}$$

where $\mathbf{x}(n) = [x(n), x(n-1), \cdots, x(n-N_w+1)]^T$ is the reference signal vector. The cancelling signal $y(n)$ is then filtered by the secondary path $S(z)$ to obtain the control signal $z(n)$ at the location of the error microphone:

$$z(n) = \bar{\mathbf{y}}^T(n)\mathbf{s} \tag{2}$$

where $\bar{\mathbf{y}}(n) = [y(n), y(n-1), \cdots, y(n-N_s+1)]^T$, and $\mathbf{s} = [s_0, s_1, \cdots, s_{N_s-1}]^T$ is the weight vector of the secondary path with a length of $N_s$. The residual error signal picked up by the error microphone is

$$e(n) = d(n) + z(n) \tag{3}$$

where $d(n)$ is the disturbance signal at the error microphone.
The FxLMS algorithm is commonly used to update the weight vector [3,4]:

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \mu\mathbf{v}(n)e(n) \tag{4}$$

where $\mu$ is the step size, and $\mathbf{v}(n) = [v(n), v(n-1), \cdots, v(n-N_w+1)]^T$ is the filtered reference signal vector with

$$v(n) = \mathbf{x}_s^T(n)\hat{\mathbf{s}} \tag{5}$$

where $\mathbf{x}_s(n) = [x(n), x(n-1), \cdots, x(n-N_s+1)]^T$, and $\hat{\mathbf{s}} = [\hat{s}_0, \hat{s}_1, \cdots, \hat{s}_{N_s-1}]^T$ is an estimate of the actual weight vector of the secondary path. An accurate estimate of the secondary path is required for the FxLMS to work properly, which can be obtained by an offline or online modeling method [4,6]. The FxLMS algorithm requires $2N_w + N_s$ multiplications per sample. In certain scenarios, the length of the weight vector may reach several thousand taps. Thus, the computational complexity is a major concern for real-time systems.
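As an illustration of Equations (1)-(5), the following is a minimal time-domain sketch in Python with NumPy, not the authors' implementation. The function name `fxlms`, the toy primary path `p`, the toy secondary path `s`, the filter length, and the step size are all assumptions chosen for the demonstration; the secondary-path estimate is assumed to be perfect.

```python
import numpy as np

def fxlms(x, d, s, s_hat, n_w, mu):
    """Time-domain FxLMS sketch; returns the error signal e(n) and final weights."""
    w = np.zeros(n_w)
    x_buf = np.zeros(max(n_w, len(s_hat)))  # [x(n), x(n-1), ...]
    y_buf = np.zeros(len(s))                # [y(n), y(n-1), ...]
    v_buf = np.zeros(n_w)                   # [v(n), v(n-1), ...]
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y = x_buf[:n_w] @ w                 # Eq. (1): control filter output
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = y
        z = y_buf @ s                       # Eq. (2): secondary path
        e[n] = d[n] + z                     # Eq. (3): residual error
        v = x_buf[:len(s_hat)] @ s_hat      # Eq. (5): filtered reference
        v_buf = np.roll(v_buf, 1)
        v_buf[0] = v
        w = w - mu * v_buf * e[n]           # Eq. (4): weight update
    return e, w

# Toy scenario (hypothetical paths, white-noise reference).
rng = np.random.default_rng(0)
x = rng.standard_normal(8000)
p = np.array([0.0, 0.0, 0.0, 0.8, 0.4, -0.2])   # assumed primary path
s = np.array([0.0, 1.0, 0.5])                    # assumed secondary path
d = np.convolve(x, p)[:len(x)]
e, w = fxlms(x, d, s, s_hat=s, n_w=32, mu=0.005)
reduction_db = 10 * np.log10(np.mean(e[-1000:] ** 2) / np.mean(d[-1000:] ** 2))
```

Each iteration costs roughly $N_w$ multiplications for (1), $N_w$ for (4), and $N_s$ for (5), which matches the $2N_w + N_s$ count quoted above.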

Conventional Frequency-Domain ANC Algorithms
The conventional frequency-domain ANC algorithms in [24][25][26][27][28]37,39] can be understood as direct frequency-domain implementations of the FxLMS algorithm. Since the FDAF algorithm is a special case of PBFDAF, we only present the PBFDAF in this paper. The corresponding method using the conventional filtered-x scheme is referred to as the frequency-domain partitioned block filtered-x LMS (FPBFxLMS) algorithm. The method using the modified filtered-x scheme is referred to as the frequency-domain partitioned block modified filtered-x LMS (FPBMFxLMS) algorithm.

FPBFxLMS Algorithm
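The diagram of the FPBFxLMS algorithm is shown in Figure 2. The linear convolution operations (1) and (5) and the cross-correlation (4) can be implemented using FFTs. The adaptive filter $\mathbf{w}(n)$ is segmented into $P_w$ partitions as $\mathbf{w}(n) = [\mathbf{w}_0^T(n), ..., \mathbf{w}_{P_w-1}^T(n)]^T$, where $\mathbf{w}_p(n) = [w_{pL}(n), ..., w_{(p+1)L-1}(n)]^T$ is the $p$-th subfilter with $L = N_w/P_w$ taps. The frequency-domain weight vector of the $p$-th partition is

$$\mathbf{W}_p(k) = \mathbf{F}[\mathbf{w}_p^T(k), \mathbf{0}_{1\times L}]^T$$

where $k$ denotes the frame index, $\mathbf{0}_{1\times L}$ is a $1 \times L$ all-zero vector, and $\mathbf{F}$ is the $2L \times 2L$ Fourier transform matrix whose $(p, q)$-th element is $\exp(-j2\pi pq/(2L))$. The output of the control filter at the $k$-th frame is

$$\mathbf{y}(k) = [y(kL-L+1), ..., y(kL)]^T = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=0}^{P_w-1}\mathbf{X}_p(k)\mathbf{W}_p(k) \tag{7}$$

where $\mathbf{Q}_{01} = [\mathbf{0}_L\ \mathbf{I}_L]$ is the projection matrix (the notations $\mathbf{0}_L$ and $\mathbf{I}_L$ being the $L \times L$ zero and identity matrices, respectively), and

$$\mathbf{X}_p(k) = \mathrm{diag}\{\mathbf{F}\mathbf{x}_p(k)\}$$

is the frequency-domain reference matrix with $\mathbf{x}_p(k) = \{x[(k-p-2)L+1], ..., x[(k-p)L]\}^T$.

For the FxLMS algorithm, the output of the control filter is calculated on a sample-by-sample basis and hence there is only a one-sample delay for the generation of the cancelling signal. For the frequency-domain implementation in (7), however, a block of the reference signal must be collected before $\mathbf{y}(k)$ can be calculated. There is at least a one-block delay even if (7) is completed in one sample period. Thus, the cancelling signal $u(n)$ that drives the loudspeaker is a delayed version of $y(n)$:

$$u(n) = y(n-D)$$

where $D$ denotes the delay, which includes both the buffering time and the processing time. In [28,37], the calculation of (7) is completed in one sample period and hence the total delay is $D = L$. In [39], the calculation of (7) is distributed over one block $kL \le n \le kL + L$, and thus the delay is $D = 2L$. The advantages and limitations of each implementation method are discussed later.

The impulse response of the secondary path is segmented into $P_s$ partitions as $\hat{\mathbf{s}} = [\hat{\mathbf{s}}_0^T, \hat{\mathbf{s}}_1^T, ..., \hat{\mathbf{s}}_{P_s-1}^T]^T$, where $\hat{\mathbf{s}}_p = [\hat{s}_{pL}, ..., \hat{s}_{(p+1)L-1}]^T$ is the $p$-th subfilter with $L = N_s/P_s$ taps. The Fourier transform of $\hat{\mathbf{s}}_p$ is

$$\hat{\mathbf{S}}_p = \mathbf{F}[\hat{\mathbf{s}}_p^T, \mathbf{0}_{1\times L}]^T$$

The linear convolution in (5) can also be implemented using the FFT:

$$\mathbf{v}(k) = [v(kL-L+1), ..., v(kL)]^T = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=0}^{P_s-1}\mathbf{X}_p(k)\hat{\mathbf{S}}_p$$

The update equation of the FPBFxLMS algorithm is [10,16,37,39]

$$\mathbf{W}_p(k+1) = \mathbf{W}_p(k) - \mu\mathbf{G}\boldsymbol{\Lambda}^{-1}(k)\mathbf{V}_p^H(k-m)\mathbf{E}(k) \tag{12}$$

where $(\cdot)^H$ denotes the Hermitian operation, $\mathbf{V}_p(k) = \mathrm{diag}\{\mathbf{F}\mathbf{v}_p(k)\}$ is the frequency-domain filtered reference matrix with $\mathbf{v}_p(k) = \{v[(k-p-2)L+1], ..., v[(k-p)L]\}^T$,

$$\mathbf{G} = \mathbf{F}\begin{bmatrix}\mathbf{I}_L & \mathbf{0}_L\\ \mathbf{0}_L & \mathbf{0}_L\end{bmatrix}\mathbf{F}^{-1}$$

is the windowing matrix that forces the last $L$ time-domain elements to zero,

$$\mathbf{E}(k) = \mathbf{F}[\mathbf{0}_{1\times L}, \mathbf{e}^T(k)]^T$$

is the frequency-domain error vector with $\mathbf{e}(k) = [e(kL-L+1), ..., e(kL)]^T$, $\boldsymbol{\Lambda}(k)$ is the PSD matrix of the filtered reference signal, which is used to improve the convergence, and the parameter $m$ is related to the algorithm latency through $m = D/L$, i.e., $m = 1$ for $D = L$ and $m = 2$ for $D = 2L$. The PSD matrix $\boldsymbol{\Lambda}(k)$ can be computed recursively [10,16] using a smoothing factor $\lambda$, $0 < \lambda < 1$.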
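The partitioned overlap-save filtering used for the control filter output in (7) can be checked numerically. The sketch below, in Python with NumPy, assumes that the reference signal is zero before time 0 and that the weights are held fixed; the function name `pb_filter_block` and the test signal are hypothetical and serve only to show that the frequency-domain block equals the corresponding samples of the time-domain convolution (1).

```python
import numpy as np

def pb_filter_block(x, w, L, k):
    """Return y(k) = [y(kL-L+1), ..., y(kL)] using P = len(w)/L partitions."""
    P = len(w) // L
    acc = np.zeros(2 * L, dtype=complex)
    for p in range(P):
        # W_p = F [w_p^T, 0_{1xL}]^T (zero-padded subfilter)
        W_p = np.fft.fft(np.concatenate([w[p * L:(p + 1) * L], np.zeros(L)]))
        # x_p(k) = {x[(k-p-2)L+1], ..., x[(k-p)L]}, with x(n) = 0 for n < 0
        idx = np.arange((k - p - 2) * L + 1, (k - p) * L + 1)
        x_p = np.where(idx >= 0, x[np.clip(idx, 0, None)], 0.0)
        acc += np.fft.fft(x_p) * W_p          # X_p(k) W_p(k)
    # Q_01 keeps the last L samples, which are free of circular wrap-around.
    return np.real(np.fft.ifft(acc))[L:]

rng = np.random.default_rng(1)
x = rng.standard_normal(64)
w = rng.standard_normal(24)                   # N_w = 24, L = 8, P_w = 3
L, k = 8, 5
y_block = pb_filter_block(x, w, L, k)
y_ref = np.convolve(x, w)[k * L - L + 1:k * L + 1]
```

Note that the block for frame $k$ can only be formed once $x(kL)$ has arrived, which is the origin of the one-block buffering delay discussed above.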
To reduce the complexity while retaining good convergence properties, the constraint in (12) can be applied periodically [50,51]. Another way to reduce the complexity of (12) is to use the unconstrained algorithm [13], which omits the windowing matrix:

$$\mathbf{W}_p(k+1) = \mathbf{W}_p(k) - \mu\boldsymbol{\Lambda}^{-1}(k)\mathbf{V}_p^H(k-m)\mathbf{E}(k)$$

The unconstrained PBFDAF exhibits a lower complexity than the constrained version, but the constrained version has better convergence. Thus, we only consider the constrained FDAF.
The frequency-domain FxLMS algorithms with D = L and D = 2L are presented in Tables 1 and 2, respectively, where A denotes the number of multiplications of the 2L-point FFT operation. The differences between the two algorithms are twofold. First, the FPBFxLMS II algorithm has a larger delay, which limits its application in broadband ANC. Second, the complexity of the FPBFxLMS II algorithm is distributed evenly over one frame, while steps (1)-(7) of the FPBFxLMS I algorithm must be completed in one sample period. Therefore, the peak complexity of the FPBFxLMS I algorithm is rather high. This means that a powerful processor must be used, and a large portion of its processing power is wasted due to the unbalanced complexity [46,48,49]. From the perspective of real-time implementation, the FPBFxLMS I algorithm is not practical. A real-time multichannel ANC system using the FPBFxLMS II algorithm was implemented on a graphics processing unit (GPU) in [39], and a comprehensive complexity evaluation was carried out.
Table 1. FPBFxLMS I [28,37].

Analysis
In this section, we analyze theoretically how the two kinds of delays affect the convergence behavior of the adaptive algorithm. When the FxLMS converges, the optimal solution (which may not be realizable) in the z-domain is [4]

$$W_o(z) = -\frac{P(z)}{S(z)}$$

For the FPBFxLMS algorithms, a $D$-sample delay is introduced to the signal path, and hence the optimal solution becomes

$$W_o(z) = -z^{D}\frac{P(z)}{S(z)} \tag{20}$$

When the FDAF algorithm is applied to echo cancellation, the algorithm delay has no effect on the algorithm convergence. However, this is not the case for broadband ANC because a causal solution of (20) may not exist and the extra delay in the signal path deteriorates the noise reduction performance [30,31]. For the broadband ANC system to work properly, the causality condition should be satisfied [3]:

$$\tau_{\mathrm{sec}} + \tau_{\mathrm{buff}} + \tau_{\mathrm{proc}} \le \tau_{\mathrm{AD}}$$

where $\tau_{\mathrm{sec}}$ includes the delay in the antialiasing filter, A/D converter, D/A converter, reconstruction filter and loudspeaker and the acoustic delay between the secondary loudspeaker and the error microphone, $\tau_{\mathrm{buff}}$ denotes the buffering time, $\tau_{\mathrm{proc}}$ is the processing time of the adaptive algorithm, and $\tau_{\mathrm{AD}}$ is the acoustic delay between the reference sensor and the error microphone. The algorithm latency is the sum of $\tau_{\mathrm{buff}}$ and $\tau_{\mathrm{proc}}$, and we have $\tau_{\mathrm{buff}} + \tau_{\mathrm{proc}} = DT_s$, where $T_s$ is the system sampling period. For the FxLMS algorithm, only a one-sample delay is introduced, i.e., $D = 1$. For the aforementioned two FPBFxLMS algorithms, however, the algorithm latency is very large, i.e., $D = L$ or $D = 2L$. In the frequency-domain ANC algorithm, the distance between the reference sensor and the secondary loudspeaker should be large enough to satisfy the causality condition [30,31]. However, this is not always achievable because of practical installation constraints. Therefore, the delay in FPBFxLMS algorithms is a major limitation for broadband ANC.

We now investigate the effect of the secondary path on the weight update. Using (3), we rewrite the error vector $\mathbf{e}(k)$ as

$$\mathbf{e}(k) = \mathbf{d}(k) + [u(kL-L+1) * \mathbf{s}, ..., u(kL) * \mathbf{s}]^T$$

where $\mathbf{d}(k) = [d(kL-L+1), ..., d(kL)]^T$, $u(n) = y(n-D)$ is the cancelling signal that drives the loudspeaker, and $*$ denotes the linear convolution. The last element of $\mathbf{e}(k)$ can be expressed as

$$e(kL) = d(kL) + \sum_{p=0}^{P_s-1}\tilde{\mathbf{s}}_p^T\mathbf{y}(k-m-p) \tag{23}$$

where $\tilde{\mathbf{s}}_p = [s_{(p+1)L-1}, ..., s_{pL}]^T$ and $m = D/L$. From (23), it can be seen that the generation of $e(kL)$ requires the $P_s$ past weight vectors $\mathbf{W}_q(k-m-p)$, $p = 0, ..., P_s-1$. Similarly, the generation of $\mathbf{e}(k)$ requires $\mathbf{W}_q(k-m-p)$, $p = 0, ..., P_s$. Therefore, the following slow-variation assumption is used implicitly for deriving (12):

$$\mathbf{W}_q(k-m-p) \approx \mathbf{W}_q(k), \quad p = 0, ..., P_s \tag{24}$$

The relationship in (24) holds only when the step size $\mu$ is very small, and the frequency-domain algorithm used in ANC is very similar to the delayed LMS algorithm [32]. Hence, the maximum step size that guarantees the stability of the algorithm is reduced, and the fastest achievable convergence speed is lower. Therefore, the frequency-domain ANC algorithm becomes quite different from the standard FDAF algorithm in terms of convergence behavior.
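The signal-path part of this analysis can be turned into a quick feasibility check. The sketch below is a simplified reading of the causality condition with all delays expressed in samples; the helper names are hypothetical, and the delay budget of 256 samples is taken from the evaluation setup reported later in the paper (Figure 7c), not derived here.

```python
def signal_path_delay(block_len, distributed):
    """Algorithm latency D in samples: 2L if the block computation is spread
    over one frame (D = 2L case), L if it is finished in one sample period."""
    return 2 * block_len if distributed else block_len

def is_causal(delay_samples, max_delay_samples):
    """True if the algorithm latency fits within the allowed signal-path delay."""
    return delay_samples <= max_delay_samples

MAX_DELAY = 256  # maximum allowed signal-path delay in samples (assumed budget)
report = {
    (L, distributed): is_causal(signal_path_delay(L, distributed), MAX_DELAY)
    for L in (64, 128, 256)
    for distributed in (False, True)
}
```

Under this budget, a block length of 128 is the largest that keeps the D = 2L implementation within the causality limit, while L = 256 is only admissible when D = L.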
The above analysis provides another interesting insight into the implementation of the frequency-domain algorithm. For the two algorithms in Tables 1 and 2, we first calculate the filter output using $\mathbf{W}_p(k)$ and then update $\mathbf{W}_p(k)$, which is called the filtering-first approach. We propose an alternative method, namely, the update-first approach. That is, we first update the weight vector to obtain $\mathbf{W}_p(k+1)$ using (18), and then we use the new weight vector $\mathbf{W}_p(k+1)$ to calculate the filter output:

$$\mathbf{y}(k) = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=0}^{P_w-1}\mathbf{X}_p(k)\mathbf{W}_p(k+1) \tag{25}$$

By doing so, the delay between the weight update and the observation of the error signal is reduced, and hence the convergence performance is improved. We then obtain the two algorithms in Tables 3 and 4 by applying the update-first approach to the algorithms in Tables 1 and 2, respectively. The update-first approach exhibits better convergence behavior than the filtering-first approach, while the two approaches have a similar computational complexity.
Table 3. FPBFxLMS III.

FPBMFxLMS Algorithm
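The aforementioned frequency-domain algorithms directly employ the residual error signal $e(n)$ to update the weight vector, resulting in a delay between the weight adaptation and the observation of the error signal. To reduce this type of delay, the modified filtered-x structure was adopted to update the weight vector in [35], as shown in Figure 3. The basic idea is summarized as follows. The disturbance signal $\hat{d}(n)$ at the error microphone is estimated as

$$\hat{d}(n) = e(n) - \hat{z}(n)$$

where

$$\hat{z}(n) = u(n) * \hat{\mathbf{s}} \tag{27}$$

is the estimate of the control signal at the location of the error microphone. To reduce the complexity, the linear convolution in (27) is implemented using the FFT. Then, $\hat{z}(n)$, $kL-L+1 \le n \le kL$, can be computed as

$$\hat{\mathbf{z}}(k) = [\hat{z}(kL-L+1), ..., \hat{z}(kL)]^T = [u(kL-L+1) * \hat{\mathbf{s}}, ..., u(kL) * \hat{\mathbf{s}}]^T = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=0}^{P_s-1}\mathbf{U}_p(k)\hat{\mathbf{S}}_p$$

where

$$\mathbf{U}_p(k) = \mathrm{diag}\{\mathbf{F}\mathbf{u}_p(k)\}$$

is the frequency-domain cancelling signal matrix with $\mathbf{u}_p(k) = \{u[(k-p-2)L+1], ..., u[(k-p)L]\}^T$. The estimated disturbance signal vector is

$$\hat{\mathbf{d}}(k) = \mathbf{e}(k) - \hat{\mathbf{z}}(k) \tag{30}$$

Recall that $\hat{\mathbf{d}}(k)$ is generated by $\mathbf{V}_p(k-2)$, and the pseudo frequency-domain error vector can be calculated as

$$\hat{\mathbf{E}}(k) = \hat{\mathbf{D}}(k) + \mathbf{G}_{01}\sum_{p=0}^{P_w-1}\mathbf{V}_p(k-2)\mathbf{W}_p(k) \tag{31}$$

where

$$\mathbf{G}_{01} = \mathbf{F}\mathbf{Q}_{01}^T\mathbf{Q}_{01}\mathbf{F}^{-1}$$

is the overlap-save projection matrix, and

$$\hat{\mathbf{D}}(k) = \mathbf{F}[\mathbf{0}_{1\times L}, \hat{\mathbf{d}}^T(k)]^T$$

is the reconstructed frequency-domain disturbance signal vector. The pseudo frequency-domain error vector $\hat{\mathbf{E}}(k)$ is only used for the weight update. The weight update equation is

$$\mathbf{W}_p(k+1) = \mathbf{W}_p(k) - \mu\mathbf{G}\boldsymbol{\Lambda}^{-1}(k)\mathbf{V}_p^H(k-2)\hat{\mathbf{E}}(k) \tag{34}$$

Equations (31) and (34) describe the standard PBFDAF algorithm, which removes the effect of the secondary path. The output of the control filter can be computed using $\mathbf{W}_p(k+1)$ according to (25). However, the delay in the signal path still exists. The FPBMFxLMS algorithm is presented in Table 5.
Table 5. FPBMFxLMS algorithm [35].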
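The idea behind the modified filtered-x structure is easiest to see in the time domain. The following Python/NumPy sketch is a sample-by-sample analogue, not the block frequency-domain FPBMFxLMS of Table 5: it estimates the disturbance by subtracting the estimated control signal from the measured error, forms a pseudo error with the current weights, and adapts on that pseudo error. The function name `mfxlms`, the toy paths, and the step size are assumptions for illustration, and the secondary-path estimate is assumed to be exact.

```python
import numpy as np

def mfxlms(x, d, s, s_hat, n_w, mu):
    """Time-domain modified filtered-x LMS sketch; returns e(n) and final weights."""
    n_s = len(s_hat)
    w = np.zeros(n_w)
    x_buf = np.zeros(max(n_w, n_s))         # [x(n), x(n-1), ...]
    u_buf = np.zeros(max(len(s), n_s))      # [u(n), u(n-1), ...]
    v_buf = np.zeros(n_w)                   # [v(n), v(n-1), ...]
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        u_buf = np.roll(u_buf, 1)
        u_buf[0] = x_buf[:n_w] @ w          # cancelling signal
        e[n] = d[n] + u_buf[:len(s)] @ s    # error measured at the microphone
        z_hat = u_buf[:n_s] @ s_hat         # estimated control signal
        d_hat = e[n] - z_hat                # estimated disturbance
        v_buf = np.roll(v_buf, 1)
        v_buf[0] = x_buf[:n_s] @ s_hat      # filtered reference
        e_pseudo = d_hat + v_buf @ w        # pseudo error using current weights
        w = w - mu * v_buf * e_pseudo       # standard LMS on the pseudo error
    return e, w

rng = np.random.default_rng(2)
x = rng.standard_normal(8000)
p = np.array([0.0, 0.0, 0.0, 0.8, 0.4, -0.2])   # assumed primary path
s = np.array([0.0, 1.0, 0.5])                    # assumed secondary path
d = np.convolve(x, p)[:len(x)]
e, w = mfxlms(x, d, s, s_hat=s, n_w=32, mu=0.01)
reduction_db = 10 * np.log10(np.mean(e[-1000:] ** 2) / np.mean(d[-1000:] ** 2))
```

Because the pseudo error depends only on the current weights, the adaptation loop no longer contains the secondary-path delay, which is the property exploited by the frequency-domain version described above.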

Delayless Frequency-Domain ANC Algorithms
The algorithm in Table 5 removes the delay in the weight update but does not remove the delay in the signal path. It is more essential to remove the delay in the signal path because this delay has a major effect on the performance of broadband ANC systems. Several delayless frequency-domain ANC algorithms have been presented to remove delays in the signal path [44,46,47]. This section first reviews several delayless frequency-domain ANC algorithms, and then, a new delayless algorithm with the modified filtered-x scheme is proposed.

Qiu's Delayless Algorithm
The delayless frequency-domain ANC is presented in Figure 4. To avoid the delay in the signal path, the calculation of the adaptive filter output can be directly implemented using the time-domain convolution [40][41][42][43]:

$$y(n) = \mathbf{x}^T(n)\mathbf{w}(n) \tag{35}$$

If the filter length $N_w$ is small, the complexity required in (35) may be acceptable. However, when $N_w$ is large, the computational complexity may be too high for a resource-limited system. To resolve this problem, a hybrid fast convolution scheme is adopted to calculate the control filter output [44]:

$$\mathbf{y}(k+1) = [y(kL+1), ..., y(kL+L)]^T = \begin{bmatrix}\tilde{\mathbf{x}}^T(kL+1)\mathbf{w}_0(k+1)\\ \vdots\\ \tilde{\mathbf{x}}^T(kL+L)\mathbf{w}_0(k+1)\end{bmatrix} + \boldsymbol{\Theta}(k+1) \tag{36}$$

where $\tilde{\mathbf{x}}(n) = [x(n), x(n-1), ..., x(n-L+1)]^T$, and

$$\boldsymbol{\Theta}(k+1) = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=1}^{P_w-1}\mathbf{X}_{p-1}(k)\mathbf{W}_p(k+1)$$

Please note that the calculation of $\boldsymbol{\Theta}(k+1)$ should be completed between $kL$ and $kL+1$, and the first term on the right-hand side of (36) is calculated in the time domain on a sample-by-sample basis. The update equation is

$$\mathbf{W}_p(k+1) = \mathbf{W}_p(k) - \mu\mathbf{G}\boldsymbol{\Lambda}^{-1}(k)\mathbf{V}_p^H(k)\mathbf{E}(k) \tag{38}$$

The calculation of $\mathbf{w}_0(k)$ does not require extra computational cost, because it is available as a by-product of the windowing operation, and is expressed by

$$\mathbf{w}_0(k) = [\mathbf{I}_L\ \mathbf{0}_L]\mathbf{F}^{-1}\mathbf{W}_0(k)$$

Because the calculation of $\boldsymbol{\Theta}(k+1)$ uses $\mathbf{W}_p(k+1)$, the weight update in (38) should be completed between $kL$ and $kL+1$, which means the update-first approach must be used. By using this approach, the delay in the signal path is removed. However, this algorithm is not practical because (36) and (38) should be completed within one sampling period, which requires $(5 + 2P_w)A + 10LP_w + 4LP_s + L$ multiplications per sample. In addition, the algorithm requires $L$ multiplications per sample between $kL+2$ and $kL+L$. Thus, the complexity of the algorithm is not evenly distributed over time, i.e., its peak complexity is rather high. A complexity measurement carried out in [46] indicated that the complexity of this algorithm in the first sample of each block is up to 200 times higher than the mean computational load. Even if a high-performance processor is employed to complete this task, most of its processing power is wasted. Furthermore, the delay in the weight update still exists, which limits the available step size for adaptation. Qiu's delayless algorithm is summarized in Table 6.
Table 6. Qiu's delayless algorithm [44].
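The hybrid fast convolution can be illustrated with fixed weights. In the Python/NumPy sketch below, the contribution of the partitions p ≥ 1 is computed in the frequency domain from reference samples up to time kL only, while the first partition is convolved in the time domain as each new sample arrives. The function names `tail_block` and `hybrid_output` are hypothetical, the reference is assumed to be zero before time 0, and weight adaptation is deliberately left out.

```python
import numpy as np

def tail_block(x_past, w, L, k):
    """Frequency-domain contribution of partitions p >= 1 to y(kL+1..kL+L).
    Only samples x(0..kL) are needed, so it can be prepared in advance."""
    P = len(w) // L
    acc = np.zeros(2 * L, dtype=complex)
    for p in range(1, P):
        W_p = np.fft.fft(np.concatenate([w[p * L:(p + 1) * L], np.zeros(L)]))
        # x_{p-1}(k) = {x[(k-p-1)L+1], ..., x[(k-p+1)L]}, zero for n < 0
        idx = np.arange((k - p - 1) * L + 1, (k - p + 1) * L + 1)
        x_blk = np.where(idx >= 0, x_past[np.clip(idx, 0, None)], 0.0)
        acc += np.fft.fft(x_blk) * W_p
    return np.real(np.fft.ifft(acc))[L:]

def hybrid_output(x, w, L, k):
    """Return [y(kL+1), ..., y(kL+L)] without buffering the current block."""
    theta = tail_block(x[:k * L + 1], w, L, k)   # uses no sample after kL
    y = np.empty(L)
    for i in range(1, L + 1):
        n = k * L + i                            # sample that has just arrived
        head = sum(w[j] * x[n - j] for j in range(L) if n - j >= 0)
        y[i - 1] = head + theta[i - 1]           # time-domain head + FFT tail
    return y

rng = np.random.default_rng(3)
x = rng.standard_normal(80)
w = rng.standard_normal(24)                      # N_w = 24, L = 8, P_w = 3
L, k = 8, 6
y_hybrid = hybrid_output(x, w, L, k)
y_ref = np.convolve(x, w)[k * L + 1:(k + 1) * L + 1]
```

Each output sample costs only L time-domain multiplications once the frequency-domain tail is available, which is why the remaining difficulty lies in when that tail and the weight update are computed.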

Fink's Delayless Algorithm
A computationally efficient delayless FDAF algorithm was presented in [46] that uses the fast filtering approach of [48]. The difference between Qiu's and Fink's delayless algorithms is twofold. First, the convolutions in the first and second partitions are calculated in the time domain in Fink's algorithm. Second, the previous weight vector $\mathbf{W}_p(k)$ instead of $\mathbf{W}_p(k+1)$ is used to calculate the control filter output:

$$\mathbf{y}(k+1) = \mathbf{y}_0(k+1) + \mathbf{y}_1(k+1) + \mathbf{b}(k+1) \tag{40}$$

where $\mathbf{b}(k+1)$ collects the contributions of the partitions $p \ge 2$, which are computed in the frequency domain:

$$\mathbf{b}(k+1) = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=2}^{P_w-1}\mathbf{X}_{p-1}(k)\mathbf{W}_p(k)$$

It should be noted that (42) is completed in the $k$-th frame but not in one sampling period as in Qiu's algorithm, and hence the high peak complexity is avoided. The elements of $\mathbf{y}_0(k+1)$ and $\mathbf{y}_1(k+1)$ are the outputs of the first and second subfilters, respectively, and are computed in the time domain on a sample-by-sample basis. The weight vector is updated using (38), which is completed between $kL+1$ and $kL+L$. Accordingly, the total complexity of the algorithm is distributed evenly over one frame. However, the delay in the weight adaptation is not removed. Fink's delayless algorithm is presented in Table 7.
The pipeline of the process should be considered carefully. There are two tasks: the time-domain convolution on a sample-by-sample basis and the frequency-domain procedure on a block-by-block basis. When a new block of data has been collected, the weight update in (38) and the calculation of b(k + 1) are scheduled with low priority and should be completed within one block. When a new sample is available, the frequency-domain task is interrupted by the time-domain task. The convolution operation is then scheduled with high priority and should be completed in one sampling period. After the convolution operation has been executed, the interrupted frequency-domain task is resumed.
Table 7. Fink's delayless algorithm [46].

Proposed Delayless Algorithm
Several approaches have been proposed to remove the delay in the signal path or the weight update delay, but none of these algorithms can remove both kinds of delays. To address this problem, we propose a new delayless frequency-domain FxLMS algorithm as shown in Figure 5. First, we employ the modified filtered-x scheme to the frequency-domain ANC to remove the delay in the weight adaptation. Second, the delayless fast filtering approach in Section 4.2 is used to remove the signal path delay.
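The estimate of the control signal at the location of the error microphone is

$$\hat{\mathbf{z}}(k) = \mathbf{Q}_{01}\mathbf{F}^{-1}\sum_{p=0}^{P_s-1}\mathbf{Y}_p(k)\hat{\mathbf{S}}_p$$

where

$$\mathbf{Y}_p(k) = \mathrm{diag}\{\mathbf{F}\mathbf{y}_p(k)\}$$

is the frequency-domain cancelling signal matrix with $\mathbf{y}_p(k) = \{y[(k-p-2)L+1], ..., y[(k-p)L]\}^T$. Then, we can use (30) to estimate the disturbance signal vector $\hat{\mathbf{d}}(k)$. The weight vector is updated as

$$\hat{\mathbf{E}}(k) = \hat{\mathbf{D}}(k) + \mathbf{G}_{01}\sum_{p=0}^{P_w-1}\mathbf{V}_p(k)\mathbf{W}_p(k)$$

$$\mathbf{W}_p(k+1) = \mathbf{W}_p(k) - \mu\mathbf{G}\boldsymbol{\Lambda}^{-1}(k)\mathbf{V}_p^H(k)\hat{\mathbf{E}}(k)$$

where $\hat{\mathbf{D}}(k)$ is the reconstructed frequency-domain disturbance signal vector and $\mathbf{G}_{01}$ is the overlap-save projection matrix, both defined as in the FPBMFxLMS algorithm. Once we obtain the new weight vector $\mathbf{W}_p(k+1)$, we can use (40) to calculate the control filter output. By doing so, the delay in the signal path is removed by the hybrid fast filtering approach, and the delay in the weight update is removed by means of the modified filtered-x structure. Furthermore, the total complexity is distributed evenly over one block, which makes the algorithm practical. The proposed delayless algorithm is presented in Table 8.
Table 8. Proposed delayless algorithm.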

Computational Complexity Analysis
The complexity of the aforementioned frequency-domain FxLMS algorithms is summarized in Table 9. We assume that one 2L-point real FFT can be realized using one L-point complex FFT plus additional operations, requiring $2L\log_2(L) + 4L$ real multiplications [4]. The peak and average multiplications per sample required for the frequency-domain ANC algorithms involved are presented in Figure 6, where we use L = 256 and $P_s$ = 2. The average complexity of all the frequency-domain algorithms is lower than that of the FxLMS algorithm, which shows the complexity advantage of the frequency-domain adaptive algorithms. The FPBFxLMS I in [28,37] and Qiu's delayless algorithm in [44] both have high peak complexity and hence are unrealizable in real-time systems. The proposed delayless method achieves a load-balanced implementation and is only slightly more complex than Fink's delayless algorithm in [46].
Table 9. Complexity of the frequency-domain ANC algorithms.
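As a rough worked example of these counts, the Python sketch below evaluates the FFT cost assumed above, the per-sample cost of the time-domain FxLMS, and the peak cost quoted for Qiu's delayless algorithm. The function names are hypothetical, and the choice of $P_w$ = 6 partitions (a 1536-tap control filter with L = 256) is an assumption made only for this example; the entries of Table 9 are not reproduced here.

```python
import math

def fft_mults(L):
    """Real multiplications A of one 2L-point real FFT: 2L*log2(L) + 4L."""
    return 2 * L * math.log2(L) + 4 * L

def fxlms_mults_per_sample(n_w, n_s):
    """Time-domain FxLMS: 2*N_w + N_s multiplications per sample."""
    return 2 * n_w + n_s

def qiu_peak_mults(L, p_w, p_s):
    """Peak load of Qiu's delayless algorithm in the first sample of a block:
    (5 + 2*P_w)*A + 10*L*P_w + 4*L*P_s + L."""
    return (5 + 2 * p_w) * fft_mults(L) + 10 * L * p_w + 4 * L * p_s + L

L, P_W, P_S = 256, 6, 2            # P_W is an assumed value for illustration
A = fft_mults(L)                   # cost of one 2L-point FFT
fxlms_cost = fxlms_mults_per_sample(L * P_W, L * P_S)
qiu_peak = qiu_peak_mults(L, P_W, P_S)
peak_ratio = qiu_peak / fxlms_cost
```

With these assumed values, the peak load of Qiu's algorithm in a single sample period is tens of times the per-sample cost of the time-domain FxLMS, which illustrates why a low average complexity alone does not make an algorithm realizable in real time.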

Evaluations
In this section, we carry out computer simulations to evaluate the convergence performance of several frequency-domain algorithms in the context of ANC. The sampling rate is $f_s$ = 32768 Hz. The lengths of the primary and secondary paths are N = 1536 and $N_s$ = 256, respectively. The optimal solution of the control filter is computed using the MINT method [52] and has a length of $N_w$ = 1536. Their corresponding time-domain impulse responses are plotted in Figure 7. Figure 7c shows that the maximum delay allowed in the signal path is D = 256 to guarantee the causality condition of the system. The signal-to-noise ratio (SNR) measured at the error microphone is 30 dB. Two signals, i.e., a narrowband noise signal between 100 and 800 Hz and a multi-tone signal at 40, 340 and 700 Hz, are used as the reference. The convergence performance is evaluated by the learning curve, which is computed from $P_d(k)$ and $P_e(k)$, the powers of the disturbance signal d(n) and the error signal e(n); both powers are estimated recursively with a smoothing factor β. In the following, we use λ = 0.8 and β = 0.8.

Figure 8 compares the convergence performance of the algorithms in Tables 1-4. The narrowband noise is used as input, and the block length is L = 64. The update-first approach is more stable than the filtering-first approach given the same step size, which agrees with the above analysis. In addition, the two methods have the same complexity, and hence the update-first approach is recommended for practical implementation.

The effect of the block length on the convergence performance of several frequency-domain FxLMS algorithms is investigated in Figures 9 and 10. In Figure 9, we use the multi-tone signal as the reference, and the step size is µ = 0.03. The block length is set to L = 64, 128, 256. All the algorithms converge for L = 64 and 128, while they diverge for L = 256. We found that all the algorithms converge for the block length L = 256 when a smaller step size, µ = 0.001, is used. For the multi-tone input, all the frequency-domain algorithms can converge if a sufficiently small step size is employed.

The experiment in Figure 9 is repeated using the narrowband noise as input, and the learning curves are presented in Figure 10. The three delayless frequency-domain algorithms are stable and have a similar convergence for narrowband noise input in all the experiments. However, their convergence rate decreases as the block length increases. The convergence performance of the traditional frequency-domain FxLMS algorithms is dramatically affected by the block length. For L = 64, the causality condition is fulfilled for all the algorithms involved, and they exhibit a similar convergence. When the block length is L = 256, the FPBFxLMS IV and the FPBMFxLMS algorithms with a delay of D = 512 do not achieve any noise reduction, and the FPBFxLMS III algorithm with a delay of D = 256 only exhibits a 10-dB noise reduction. The experiment demonstrates that the delay in the signal path has a major effect on the broadband noise reduction.
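A learning curve of the kind used in these experiments can be computed with a few lines of Python. The sketch below is only one plausible form and is stated as an assumption: it takes the block-wise mean power of d(n) and e(n), smooths each with a first-order recursion using the factor β, and reports their ratio in decibels. The function name `learning_curve` and the small constant `eps` that guards against division by zero are hypothetical.

```python
import math

def learning_curve(d, e, L, beta=0.8, eps=1e-12):
    """Return 10*log10(P_e(k) / P_d(k)) for each block k of length L,
    with P(k) = beta * P(k-1) + (1 - beta) * (mean block power)."""
    p_d = 0.0
    p_e = 0.0
    curve = []
    for k in range(len(d) // L):
        d_blk = d[k * L:(k + 1) * L]
        e_blk = e[k * L:(k + 1) * L]
        p_d = beta * p_d + (1.0 - beta) * sum(v * v for v in d_blk) / L
        p_e = beta * p_e + (1.0 - beta) * sum(v * v for v in e_blk) / L
        curve.append(10.0 * math.log10((p_e + eps) / (p_d + eps)))
    return curve

# Sanity check: an error that is ten times smaller than the disturbance
# in amplitude corresponds to a 20-dB noise reduction.
d_demo = [1.0] * 64
e_demo = [0.1] * 64
curve_demo = learning_curve(d_demo, e_demo, L=8)
```

More negative values indicate a larger noise reduction, so a diverging algorithm shows up as a curve that rises above 0 dB.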
In Figure 11, we investigate the stability of the frequency-domain algorithms with and without the modified filtered-x structure. Specifically, we compare the FPBFxLMS IV, FPBMFxLMS, and three delayless algorithms. The narrow band white noise is used as the reference. We use a small step size µ = 0.01 for Figure 11a and a large step size µ = 0.08 for Figure 11b. All the algorithms involved exhibit a similar convergence performance when the step size is small. However, when a large step size is adopted, the frequency-domain algorithms without the modified filtered-x structure diverge, but the modified filtered-x algorithms are stable because the available step size of the modified filtered-x algorithm is larger than those of the FxLMS-type algorithms. The proposed delayless algorithm outperforms all the other methods.

Conclusions
This paper presented a comprehensive review of the frequency-domain FxLMS algorithms in terms of convergence performance, stability, and computational burden. Practical applications include active noise control of open windows [53], the ventilation system [54], and automotive cabins [55], where several thousands of coefficients are required to model the control filters. The frequency-domain algorithms used in ANC and system identification are quite different. First, broadband ANC is sensitive to algorithm latency, and a large delay may cause the system to fail. Second, direct use of an FDAF for ANC results in a delay in the weight update and the observation of the error signal, which reduces the maximum step size of the algorithm. This paper reviewed the delayless filtering methods and the modified filtered-x structures to address the two delay problems. Specifically, the update-first approach was proposed to improve the stability of the FPBFxLMS algorithms. Several frequency-domain FxLMS algorithms have a very high peak complexity and thus are not practical for real-time systems, although these algorithms are correct in theory. A computationally efficient delayless frequency-domain algorithm was proposed that combines the hybrid fast filtering approaches and the modified filtered-x structure, which completely removes the aforementioned two types of delays. Extensive simulations were carried out to evaluate the convergence performance of the state-of-the-art frequency-domain FxLMS algorithms.
However, more work needs to be done regarding the frequency-domain FxLMS algorithms. For instance, the step size bound of the traditional frequency-domain FxLMS algorithms in Tables 1-4 should be determined. Analyzing the statistical convergence behavior of the frequency-domain FxLMS algorithms is an interesting topic [56][57][58][59].
Author Contributions: F.Y. performed the experiments and wrote the paper; Y.C., M.W. and F.A. gave valuable comments and edited the paper which improves the quality of the article; J.Y. conceived the experiments and proofread the paper.