Modified GSC Method to Reduce the Distortion of the Enhanced Speech Signal Using Cross-Correlation and Sidelobe Neutralization

Abstract: The generalized sidelobe canceller (GSC) method is a common algorithm for enhancing audio signals using a microphone array. Distortion of the enhanced audio signal consists of two parts: the residual acoustic noise and the distortion of the desired audio signal, which means that the desired audio signal is damaged. This paper proposes a modified GSC method to reduce both kinds of distortion when the desired audio signal is a non-stationary speech signal. First, the cross-correlation coefficient between the canceling signal and the error signal of the least mean square (LMS) algorithm was added to the adaptive process of the GSC method to reduce the distortion of the enhanced signal when the energy of the desired signal frame increased suddenly. The sidelobe pattern of beamforming was then presented to estimate the noise signal in the beamforming output signal of the GSC method. The noise component of the beamforming output signal was decreased by subtracting the estimated noise signal to improve the denoising performance of the GSC method. Finally, the GSC-SN-MCC method was proposed by merging the above two methods. The experiment was performed in an anechoic chamber to validate the proposed method under various SNR conditions. Furthermore, a simulation with inaccurate noise directions was conducted based on the experiment data to inspect the robustness of the proposed method to errors in the estimated noise direction. The experiment data and calculation results indicated that the proposed method could reduce the distortion effectively under various SNR conditions and would not cause more distortion if the estimated noise direction is far from the actual noise direction.


Introduction
Online meetings are now widely used, and the quality of meeting audio has become increasingly important. Some problems of meeting audio, such as interfering speech, background noise, and speech reverberation, could be alleviated using several signal-processing methods with a single-channel microphone. These problems are all related to the spatial direction. Therefore, a multichannel microphone array is used to extract the desired signal by analyzing the spatial information of the signals.
The primary method of a microphone array is the delay-and-sum beamforming (DSB) method presented by Flanagan et al. [1]. The multichannel signals are aligned at the look direction by compensation with the proper delay. The aligned signals are summed to ensure that the gain of the look direction is maximized. In other words, the signals from other directions are suppressed. More microphones should be employed to suppress the non-look direction signal more effectively. Capon [2] proposed the minimum variance distortionless response (MVDR) method in 1969 to achieve better de-noising performance without increasing the number of microphones. The weight coefficients of different channels were adjusted to preserve the gain of the look direction and decrease the total power of the final output signal. Frost [3] developed the MVDR method that extended the constraint in the MVDR method to linear equations. Therefore, the weight coefficients could be adjusted adaptively based on the constraints. This method is referred to as the linearly constrained minimum variance (LCMV) method.
The constraints can be separated from the LCMV optimization to simplify the algorithm. Griffiths and Jim [4] proposed the generalized sidelobe canceller (GSC) method in 1982. The GSC method consists of three parts. The first part is a conventional beamformer that reduces the noise roughly. The second part is a block matrix that generates a reference noise signal by removing the desired signal from the original multichannel signals. The third part is an adaptive filter that estimates the noise component of the first part's output signal. The estimated noise is then subtracted from the output signal of the first part to obtain a cleaner desired signal. The GSC method converts the optimization goal of the LCMV method from finding the optimal weights with constraints to finding the optimal weights without constraints. Therefore, a simple adaptive algorithm can be adopted to reduce the residual noise in the final output signal.
The three parts of the GSC method could be optimized individually to improve the aggregate denoising performance of the adaptive algorithm. Hoshuyama et al. [5] and Lee et al. [6] modified the block matrix to decrease the sensitivity of the GSC method to a mismatch between the estimated and actual direction of arrival (DOA) of the desired signal. The robustness of the GSC method was increased by reducing the leakage of the desired signal into the reference noise signal [7]. Gannot et al. [8] considered the complicated acoustic environments and proposed a transfer function GSC (TF-GSC). The block matrix of the TF-GSC method was modified with the transfer function ratio (between different microphones in an array) to adapt the GSC method to reverberation conditions. Reuven et al. [9] combined the TF-GSC with an acoustic echo cancellation to improve the performance of the algorithm in noisy and reverberant environments. Reuven et al. [10] also extended the TF-GSC method to the DFT-GSC method for double talk scenarios to cancel the nonstationary interferences signal. Rombouts et al. [11] estimated a room impulse response and a desired speech signal model jointly to reduce the computational complexity dramatically for specific applications. Krueger et al. [12] estimated the acoustical transfer function ratios in the presence of stationary noise and tracked the eigenvector adaptively to obtain better noise and interference reduction. The GSC method also could be optimized by using efficient complex value arithmetic to reduce the calculation complexity [13] and by introducing the external microphones into the local microphone array to improve the performance of the speech estimate [14].
The least mean square (LMS) algorithm is a common adaptive algorithm for the GSC method to estimate the noise signal. The LMS algorithm adjusts the filter coefficients to make the mean square of the error signal (difference between the canceling signal and the received noise signal) lowest [15,16]. The stochastic gradient descent method has been used to ensure that the error signal can be decreased iteratively in real-time, which was derived by Widrow and Hoff [17] in 1960. Several modified LMS algorithms were proposed for some practical consideration. Nagumo and Noda [18] introduced the normalized LMS (NLMS) algorithm to make the algorithm convergence speed independent of the reference signal power, because the step size of the filter coefficient updating was inversely proportional to the reference signal power. Shan and Kailath [19] proposed the correlation LMS algorithm, which adjusted the step size of the filter coefficient updating to be proportional to the cross-correlation coefficient between the reference signal and the error signal. This means that when the reference signal and the error signal are uncorrelated, the step size will be too small to update the weight coefficients of the adaptive filter. This step size control scheme makes the adaptive algorithm more robust to disturbances, such as the system-measured noise. Gitlin et al. [20] proposed the leaky LMS algorithm, in which a leakage factor was added to the adaptive weight update path. The leakage factor could avoid the overflow of the unconstrained weight to increase the stability of the algorithm.
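The normalized LMS update described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `nlms_cancel`, the tap count, and the step size are hypothetical choices, not values taken from the cited works. The filter predicts the noise in a primary channel from a reference channel, and the error signal is the cleaned output.

```python
import numpy as np

def nlms_cancel(primary, reference, taps=8, mu=0.5, eps=1e-8):
    """Remove the part of `primary` that is linearly predictable from `reference`.

    Hypothetical helper: `taps`, `mu`, and `eps` are illustrative defaults.
    """
    w = np.zeros(taps)                      # adaptive filter coefficients
    out = np.zeros(len(primary))            # error signal, i.e. the cleaned output
    for n in range(taps - 1, len(primary)):
        x = reference[n - taps + 1:n + 1][::-1]   # newest reference sample first
        y_hat = w @ x                             # cancelling signal
        e = primary[n] - y_hat                    # error signal
        # Normalized update: the step is divided by the reference power,
        # so convergence speed does not depend on the reference level.
        w += mu * e * x / (x @ x + eps)
        out[n] = e
    return out, w
```

Dropping the division by `x @ x + eps` gives the plain LMS update, whose stable step-size range then depends on the reference power.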
Some studies [21][22][23] introduced post-filtering methods to the microphone array for additional noise reduction to the beamforming output signal. Some studies [24][25][26] decomposed the multichannel signal into two subspace domains, the desired signal subspace domain and noise signal subspace domain, based on statistic features (like the singular value decomposition of the covariance matrix of the multichannel signal). The signal in the noise subspace domain was removed or suppressed, and the desired signal was restored from the desired signal subspace domain. Some researchers [27,28] attempted to extract the desired signal that had been contaminated by noise without a signal model or transmission model, which can be referred to as blind source separation or independent component analysis, most of which were based on the statistic features or a neural network. This research is mainly in the laboratory stage because of the complexity of calculation and the feasibility in actual environments.
The distortion of the enhanced signal consists of two parts: the residual noise and the distortion of the desired signal, which means that the desired signal is damaged. Most researchers have attempted to reduce the residual noise in the output of the microphone array. However, the desired signal is sometimes nonstationary, as in speech applications. The varying speech signal may cause the adaptive algorithm to damage the desired signal and may degrade its denoising performance. Some researchers [29] employed the voice activity detection (VAD) method so that the algorithms were adapted only in speech-absent frames, avoiding the effect of a varying speech signal. On the other hand, the VAD method is not always practical, especially when the background noise is heavy or the sound field is complex.
In this study, the conventional GSC method was modified to reduce the enhanced speech distortion without the VAD method. The step size of the adaptive algorithm in the GSC method was controlled by the cross-correlation coefficient between the cancelling signal and the error signal to reduce the distortion caused by changes in the speech energy. Furthermore, based on the excess mean-square error (MSE) of the LMS algorithm [30,31] and the theoretical limits of the noise reduction performance [32], the sidelobe neutralization (SN) method was presented to reduce the noise component in the beamforming output signal of the GSC method and thereby improve the denoising performance. This paper is organized as follows. Section 1 introduces the research background and reviews the related work. Section 2 proposes the GSC method with the sidelobe neutralization method and the GSC method with the cross-correlation coefficient method. The modified GSC algorithm is given by combining them. Section 3 presents the implementation of the experiment and the result analysis. Section 4 provides the conclusions of this study.

The Conventional GSC Method
The GSC algorithm can be separated into two transmission paths: the primary path and the auxiliary path, as shown in Figure 1. The primary path generates a denoised signal with residual noise by the fixed beamforming method. The auxiliary path estimates the noise component in the output signal of the primary path. ωm is the weight vector that aligns the microphone array signal x(n) in the look direction to suppress noise from other directions. ym(n) is the sum of the aligned signals and serves as the beamforming output signal of the primary path. The auxiliary path estimates the noise component yb(n) in the beamforming output signal of the primary path as the cancelling signal. The cancelling signal is removed from the beamforming output signal. Hence, the enhanced signal yout(n), which consists of the desired signal and the residual noise signal, is obtained.
The block matrix (BM) is adopted in the auxiliary path to generate the reference noise signal by excluding the desired signal from the noisy signal, so that the noise component of the beamforming output signal can be estimated. Various methods can be used to design the block matrix for different purposes. For the typical block matrix used here, M denotes the number of microphones in the microphone array. This matrix can block the desired signal completely in theory and has a low calculation complexity. The output zb(n) of the block matrix is passed to the adaptive filter as the reference noise signal to estimate the noise component of the beamforming output signal. The LMS algorithm is commonly used in the adaptive filter to adjust the filter coefficients ωf. Thus, the error signal of the LMS algorithm yout(n), which is also the output signal of the GSC method, is obtained by removing the estimated noise signal from the beamforming output signal. The filter coefficient ωf is then updated based on the reference noise signal and the error signal.
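The two paths described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that the channels have already been time-aligned to the look direction, it uses the pairwise-difference block matrix that is common in the GSC literature (an assumption about the form of the typical block matrix), and it uses a normalized LMS update with hypothetical parameter values.

```python
import numpy as np

def block_matrix(M):
    """(M-1) x M pairwise-difference matrix (assumed form): row i is x_i - x_{i+1}."""
    B = np.zeros((M - 1, M))
    idx = np.arange(M - 1)
    B[idx, idx] = 1.0
    B[idx, idx + 1] = -1.0
    return B

def gsc(x_aligned, taps=8, mu=0.1, eps=1e-8):
    """Conventional GSC on channels already steered to the look direction.

    x_aligned has shape (M, N). Returns (y_out, y_m).
    """
    M, N = x_aligned.shape
    y_m = x_aligned.mean(axis=0)            # primary path: fixed beamformer output
    z_b = block_matrix(M) @ x_aligned       # auxiliary path: reference noise signals
    w_f = np.zeros((M - 1, taps))           # adaptive filter coefficients
    y_out = np.zeros(N)
    for n in range(taps - 1, N):
        z = z_b[:, n - taps + 1:n + 1][:, ::-1]   # newest sample first
        y_b = np.sum(w_f * z)                     # cancelling signal
        y_out[n] = y_m[n] - y_b                   # error signal = GSC output
        w_f += mu * y_out[n] * z / (np.sum(z * z) + eps)
    return y_out, y_m
```

Because every row of the assumed block matrix sums to zero, a desired signal that is identical in all aligned channels does not reach the reference path, which is the blocking property the text relies on.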

GSC Method with Cross-Correlation Coefficient
In practice, the error signal of the GSC method consists of a residual noise signal and an enhanced desired signal. In speech applications, the desired signal is often a non-stationary speech signal. When the amplitude of the speech signal increases suddenly, the amplitude of the error signal also increases synchronously, while the amplitude of the residual noise signal may remain stable. Hence, over-subtraction may be caused in the subsequent frames. This problem can be solved using the GSC method with the cross-correlation coefficient (GSC-CC). In this method, the cross-correlation coefficient is introduced to the weight updating path, as shown in Figure 2. Because the speech signal is usually uncorrelated with the noise signal in practical situations, the cross-correlation coefficient between the estimated noise yb(n) and the error signal yout(n) is used to control the step size of the adaptive filter weight update process, as expressed in Equations (1)-(4), where ρ(k) is the cross-correlation coefficient of the kth frame between the estimated noise yb(k) and the error signal yout(k), sspeech(k) represents the clean desired speech signal, and nresidual(k) is the residual noise component of the output signal. Because the speech signal is commonly assumed to be uncorrelated with the noise signal, the cross-correlation coefficient between the estimated noise and the speech signal is close to zero and can be neglected. Therefore, ρ(k) is equal to the cross-correlation coefficient between the estimated noise signal yb(k) and the residual noise signal nresidual(k), as expressed in Equation (4). This means that if ρ(k) is small, the residual noise is nearly uncorrelated with the estimated noise, so the weight coefficient is changed only slightly. Otherwise, the weight coefficient is changed considerably to reduce the residual noise rapidly.
Figures 3 and 4 illustrate the influence of the varying speech signal on the estimated noise signal. The original input speech signal was extracted from the TIMIT database [33], and the noise signal was the white noise obtained from the NOISEX-92 database [34]. Figure 3a shows the original speech and noise signal, in which the signal-to-noise ratio (SNR) was equal to −5 dB (the ratio of the energy of the whole speech signal to the energy of the noise signal during the same time). Detailed information about the experiment implementation is given in Section 3. Figure 3b presents the cross-correlation coefficient between the estimated noise signal and the output signal with the conventional GSC method and the GSC-CC method. The negative peak of the cross-correlation coefficient with the conventional GSC method was generated when the amplitude of the speech signal increased suddenly (Figure 3b), which indicates over-subtraction. Figure 4 compares the energy of the estimated noise frame with different methods. Figure 4b displays the partially enlarged view of Figure 4a from the 50th to the 100th frame. The energy of the estimated noise frame with the conventional GSC method increased when the energy of the speech frame was much higher than that of the actual noise frame. The GSC-CC method could flatten the energy curve of the estimated noise frame, which is consistent with the trend of the energy curve of the original white noise signal. This means that the desired signal was damaged less, and the distortion of the enhanced speech signal was therefore reduced, when the amplitude of the speech signal changed. The GSC-CC method could decrease the overestimation of the noise efficiently when the desired speech signal amplitude increased suddenly (Figure 4). On the other hand, the figures also show that the energy curve of the estimated noise frames with the GSC-CC method was lower than that with the conventional GSC method.
As the correlation coefficient was less than or equal to one, the step size of the adaptive filter of the GSC-CC method was smaller than that of the GSC method. Consequently, the energy of the residual noise signal with the GSC-CC method was higher than that with the conventional GSC method.
The GSC method with the minimum cross-correlation coefficient (GSC-MCC) method was proposed by combining the conventional GSC method and the GSC-CC method to utilize the advantages of the two methods. The conventional GSC method and the GSC-CC method ran synchronously. The output signal frame, which had the smaller cross-correlation coefficient (between the estimated noise signal and the corresponding output signal) of the two methods, was adopted to synthesize the final output signal. Considering the overlap of speech frames and the nonstationary nature of a speech signal, the principle of the GSC-MCC method to choose the frame to synthesize the final output signal was modified using Equation (5).
where yout-corr(k) and yout-conv(k) represent the output signal frame of the GSC-CC method and the conventional GSC method, respectively.
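The frame-wise correlation coefficient and the basic frame selection idea of the GSC-MCC method can be sketched as below. Several points are assumptions made for illustration only: the step size is scaled by the magnitude of ρ(k), frames are non-overlapping, and the selection simply keeps the frame with the smaller |ρ|. The refined selection rule of Equation (5) is not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def frame_corrcoef(a, b, eps=1e-12):
    """Correlation coefficient between two frames (mean removed)."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.sqrt((a @ a) * (b @ b)) + eps))

def cc_step_size(mu, rho):
    """Assumed GSC-CC step-size control: scale the base step by |rho|."""
    return mu * abs(rho)

def select_min_corr_frames(y_b_conv, y_out_conv, y_b_corr, y_out_corr, frame_len):
    """Per frame, keep the output whose |rho| with its estimated noise is smaller."""
    n_frames = len(y_out_conv) // frame_len
    out = np.zeros(n_frames * frame_len)
    choice = []
    for k in range(n_frames):
        sl = slice(k * frame_len, (k + 1) * frame_len)
        rho_conv = abs(frame_corrcoef(y_b_conv[sl], y_out_conv[sl]))
        rho_corr = abs(frame_corrcoef(y_b_corr[sl], y_out_corr[sl]))
        use_corr = rho_corr < rho_conv
        out[sl] = y_out_corr[sl] if use_corr else y_out_conv[sl]
        choice.append("cc" if use_corr else "conv")
    return out, choice
```

Running the two filters in parallel doubles the adaptive-filter cost, which is the price paid for keeping the lower residual noise of the conventional update in frames where the speech energy is stable.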

Beamforming Pattern
In this study, the microphone array was assumed to be a uniform circular array (UCA), with M microphones deployed uniformly on a circle with a radius of r. The sound source and the microphone array were assumed to be on the same horizontal plane for convenience. That is, the incident elevation angles of the desired signal and the noise signal were both 0°. The sound source direction vector ν was obtained as shown in Equation (8), based on the angle interval αm and the relative distance xr between microphones.
where the argument of the pattern contains the factor $2\sin\left(\frac{\Phi_i-\Phi_s}{2}\right)$, and ω(s,m) is the weight of the microphone channel that adjusts the look direction Φs of the microphone array. Φi represents the actual incident direction of the sound source.
Based on the Jacobi-Anger identity, $e^{jz\cos\theta}=\sum_{n=-\infty}^{\infty} j^{n} J_n(z)\, e^{jn\theta}$, where $J_n$ is the nth-order Bessel function of the first kind, and letting b = n/M (with the case of even M treated separately), the beamforming pattern could be expressed in terms of Bessel functions. Considering the property of the Bessel function, the value of the second term of Equation (13) will be close to zero and can be neglected with an increasing number of microphones. Thus, the beamforming pattern could be approximated as a zero-order Bessel function.
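The zero-order Bessel approximation can be checked numerically. The sketch below is an illustration under stated assumptions: delay-and-sum weights with unit magnitude, sources in the array plane, a frequency of 500 Hz chosen for the example, and J0 evaluated from its integral representation so that only NumPy is needed. The function names are hypothetical.

```python
import numpy as np

def uca_pattern(phi_i, phi_s, M=6, r=0.1, f=500.0, c=343.0):
    """Delay-and-sum response of a UCA steered to phi_s for a source at phi_i."""
    k = 2 * np.pi * f / c                          # wavenumber
    alpha = 2 * np.pi * np.arange(M) / M           # microphone angles
    phase = k * r * (np.cos(phi_i - alpha) - np.cos(phi_s - alpha))
    return np.mean(np.exp(1j * phase))

def bessel_j0(x, n=2000):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule."""
    t = (np.arange(n) + 0.5) * np.pi / n
    return np.mean(np.cos(x * np.sin(t)))

def j0_approx(phi_i, phi_s, r=0.1, f=500.0, c=343.0):
    """Zero-order Bessel approximation of the UCA pattern."""
    k = 2 * np.pi * f / c
    return bessel_j0(2 * k * r * np.sin((phi_i - phi_s) / 2))

if __name__ == "__main__":
    for deg in (0, 30, 60, 90, 180):
        a = np.deg2rad(deg)
        print(deg, abs(uca_pattern(a, 0.0)), j0_approx(a, 0.0))
```

With a fixed number of microphones the neglected higher-order Bessel terms grow with frequency, so the agreement is expected to be closest in the lower part of the band.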

Beamforming Sidelobe Neutralization
When the configuration of the microphone array was determined, the sidelobe of the beamforming pattern could be calculated as in Equation (13). If there are two sound sources, s1 and s2, located in the directions ν1 and ν2, respectively, in the sound field, the sidelobe neutralization method could be derived as follows. The received signal of each microphone xm is the superposition of both sources, and the output signal of the beamforming function toward each direction contains the source in that direction plus the other source scaled by an attenuation coefficient, where uab represents the attenuation coefficient of the sidelobe when the sound source is placed at direction b and beamformed to direction a. Moreover, the attenuation coefficient becomes the conjugate of uab when the two directions are exchanged.
Hence, Equation (19) could be used for direction ν1 to reduce the noise received from direction ν2. This equation shows that the noise signal from the interference direction can theoretically be removed entirely from the beamforming output signal of the look direction. When the attenuation coefficient is not equal to one, the signal amplitude from the look direction could be maintained by multiplying by a proper scale factor. On the other hand, if the attenuation coefficient at a given frequency is equal to one, the signal at that frequency would be lost completely and could not be compensated.
Therefore, the sidelobe neutralization method could be modified as in Equations (20) and (21). The desired signal component in the noise direction was calculated and subtracted from the beamforming output of the noise direction, as in Equation (20). Hence, the estimated noise signal becomes pure noise without the desired signal. The output of the sidelobe neutralization method was calculated using Equation (21). This equation shows that the sidelobe neutralization method does not lose the signal at a frequency even when the attenuation coefficient at that frequency equals one. Because uab is less than 1 most of the time, the cube of uab is smaller than uab itself. Comparing Equations (16) and (21), this means that the noise term scaled by (u12)³ in Equation (21) is smaller than the noise term scaled by u12 in Equation (16).
However, the estimated noise direction is often inaccurate in practice, and this error affects the performance of the sidelobe neutralization method. Let ν3 be the inaccurately estimated noise direction, with Equations (20) and (21) rewritten accordingly, and let rν3 be the ratio of the noise signal processed by the sidelobe neutralization method with the inaccurate noise direction to that processed by the beamforming method. rν3 can be analyzed qualitatively in two cases: (1) u12 is small or near zero; (2) ν3 is far from ν2, so that u32 is small.
As an illustration of how an inaccurately estimated noise direction affects the residual noise, assume that ν1 is 0° and ν2 is 60°. According to Equation (25), Figure 5 shows rν3 for different estimated noise directions (ν3 from −180° to +180°). The blue gap near 60° in Figure 5a shows that the noise signal processed by the sidelobe neutralization method with an inaccurate noise direction was less than that processed by the original beamforming method in a relatively wide direction range near the actual noise direction.
The ratio in the frequency band between 1000 to 1500 Hz in all directions was high, because the sidelobe attenuation coefficient of the actual noise direction in this frequency range was close to zero (u12, as shown in Figure 5b).
When the estimated noise direction was far away from the actual noise direction, the ratio was close to one over most of the figure, which means that the energy of the noise signal processed by the sidelobe neutralization method was almost equal to that processed by the beamforming method, except in the high-frequency band in several directions. Even in these areas, the ratio was still less than two.
Therefore, the conclusion could be derived that the sidelobe neutralization method could work effectively when the estimated noise direction was near the actual noise direction. When the estimated noise direction was far away from the actual noise direction, the denoising performance of the sidelobe neutralization method was similar to the beamforming method, except in the frequency range, in which the sidelobe attenuation of the actual noise direction was strong.
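The behavior summarized in this section can be reproduced with a single-frequency model. The sketch below is a reconstruction under stated assumptions rather than the paper's own equations: each beamformer output is modeled as the source in its own direction plus the other source scaled by an attenuation coefficient, the coefficient for the exchanged directions is the complex conjugate, and the pure-noise estimate is scaled by the attenuation coefficient before subtraction. The function names and numerical values are hypothetical.

```python
import numpy as np

def neutralize(y_look, y_noise, u_ln, u_nl):
    """Sidelobe neutralization for one frequency bin.

    u_ln: coefficient of a source in the noise direction seen in the look beam.
    u_nl: coefficient of a source in the look direction seen in the noise beam.
    """
    y_pure_noise = y_noise - u_nl * y_look     # remove desired signal (cf. Eq. (20))
    return y_look - u_ln * y_pure_noise        # remove estimated noise (cf. Eq. (21))

def residual_ratio(u12, u13, u31, u32):
    """Residual noise with estimated direction 3, relative to plain beamforming."""
    coeff = u12 - u13 * (u32 - u31 * u12)
    return abs(coeff) / abs(u12)

if __name__ == "__main__":
    s1, s2 = 1.0 + 0.5j, -0.3 + 2.0j           # desired and noise components
    u12 = 0.4 * np.exp(0.7j)
    u21 = np.conj(u12)
    y1 = s1 + u12 * s2                          # beam toward the look direction
    y2 = u21 * s1 + s2                          # beam toward the noise direction
    out = neutralize(y1, y2, u12, u21)
    print(abs(out - s1) / abs(s2), abs(u12) ** 3)
```

In this model an exact noise direction reduces the noise coefficient from |u12| to |u12|³, and a badly wrong direction with u32 close to zero gives a ratio of 1 + |u13|², which stays below two, in line with the qualitative conclusions above.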

The GSC Method with Sidelobe Neutralization
In LMS theory, the denoising performance of the algorithm is related to the energy of the received noise signal [30]. Therefore, the GSC-SN method was proposed by adding the sidelobe neutralization method to the primary path of the conventional GSC method. The sidelobe neutralization method was used to reduce the received noise and thereby improve the performance of the GSC method. When the noise weight vector was estimated to be ωn, the noise signal yn(n) in the noise direction could be estimated using Equation (17). The desired signal in the noise direction ynm(n) was calculated and subtracted from the estimated noise signal, as in Equation (20). Therefore, the pure estimated noise signal ynsm(n) was obtained without the desired signal. The pure estimated noise signal was subtracted from the beamforming output signal of the desired signal direction ym(n). Thus, the beamforming output of the primary path with less noise was obtained, as in Equation (21). The denoising performance of the GSC method could be improved by using this beamforming output signal with less noise.

The Proposed GSC-SN-MCC Method
Finally, the GSC-SN-MCC method was proposed by combining the GSC-MCC method and the GSC-SN method to improve the performance of the GSC method. The GSC-MCC method was used to reduce the distortion of the enhanced signal caused by the varying desired signal. The sidelobe neutralization method was placed in the primary path of the GSC-MCC method to generate the beamforming output signal with less noise to improve the denoising ability of the proposed method.
As shown in Figure 6, the received signal of the microphone array was beamformed toward the desired signal direction with ωm and toward the noise signal direction with ωn. Based on Equations (16), (17), (20), and (21), the beamforming output signal with less noise than that of the original beamforming method was obtained. Then, the system was run as the usual GSC method up to the update of the adaptive filter coefficients. The cross-correlation coefficient ρ (between the estimated noise signal and the corresponding output signal) was added to the adaptive filter coefficient update equation to control the overestimation of the noise component, as in Equation (1).

Experiment Implementation
The laboratory experiments were conducted in the anechoic chamber to validate the proposed method. The experimental layout is shown in Figure 7. The laptop was the control center to manage the audio system for simulating a real environment. Two omnidirectional speakers were connected to the laptop as the speech and noise sources, respectively, using an INTERM L-2400 power amplifier. The microphone array consisted of 6 MEMS microphones, distributed uniformly in a circle with a radius of 0.1 m. The printed circuit board (PCB) of the SCIEN company was connected to the microphones to realize the signal amplification function, the A/D conversion function, and transfer the digital signal to the laptop for recording. The signal was processed by the MATLAB software. To simplify the experiment, the speakers and the microphone array were deployed in the same horizontal plane. This means that the elevation angles of the incident signals (desired signal and noise signal) were 90°.
Based on the common far-field equation, expressed as in Equation (29), the minimum distance between the microphone array and the sound source was calculated to ensure that the position of the speakers satisfied the far-field condition. The sound speed c was assumed to be 343 m/s, the maximum frequency f was 4000 Hz, and the diameter of the microphone array L was 0.2 m. Hence, the minimum distance d was approximately 0.933 m. Thus, the speakers were deployed about 2.5 m away from the microphone array, considering the space of the anechoic chamber and the minimum distance d. The azimuth angles of the speech signal source and the noise signal source were 0° and 90°, respectively. The speech of a female speaker was selected randomly from the TIMIT database [33] as the desired signal. White noise from the NOISEX-92 database [34] was chosen as the noise signal. The power of the noise signal was adjusted to simulate different SNR conditions. The sampling rate of the microphone array was 16,000 Hz, and the received signal was filtered using a lowpass filter with a cutoff frequency of 4000 Hz. The high sampling rate could increase the time resolution of the system and exhibit the phase difference between microphones more clearly. Because most of the energy of speech is concentrated in the low and medium frequency bands, the received signal was filtered by a lowpass filter to reduce the calculation burden of the hardware, while most of the speech information was preserved.
Figure 8 compares the distortion-noise ratio of frames with different methods when the SNR is −5 dB. The distortion of the enhanced speech signal was calculated using Equation (30), where k represents the kth frame and l is the length of the frame. yout(n) and sspeech(n) represent the output signal of the method and the clean desired speech signal without noise, respectively. Less distortion means better denoising performance of the method.
Equation (31) calculates the distortion-noise ratio as the distortion divided by the pure noise energy. Figure 8b presents an enlarged view of Figure 8a from the 50th to the 100th frame, which included two peaks of the speech frame energy. The figure shows that peaks of the signal distortion were generated using the GSC method when the energy of the speech frame increased suddenly. The GSC-CC method could alleviate this problem, while the residual noise would increase at other frames where the speech energy is stable. The GSC-SN method could reduce the distortion of most frames caused by the residual noise, but the above-mentioned distortion caused by damage to the desired signal remained. The GSC-SN-MCC method could take advantage of the above methods to decrease both kinds of distortion of the enhanced signal. The maximum distortion occurred at the 83rd frame in this case, as shown in Figure 8. Figure 9 presents the spectrum of the signal distortion of the 83rd frame, obtained using the fast Fourier transform. For convenience, the amplitude of the spectrum was normalized. Figure 9a shows the spectrum of the signal distortion of the 83rd frame over the entire frequency band with different methods. The methods differed clearly in amplitude from 500 Hz to 2000 Hz, where the speech energy was concentrated. Figure 9b shows the enlarged view of this frequency band. The distortion spectrum with the GSC-SN-MCC method was smaller and flatter than that with the conventional GSC method. Hence, the distortion of the enhanced signal caused by damaging the desired signal was reduced successfully using the proposed method. Table 1 lists the average of the distortion-noise ratio, and Table 2 presents the distortion value of the 83rd frame (maximum distortion frame) under various conditions.
For comparison, the average value of the distortion-noise ratio and the distortion value of the 83rd frame were normalized to the result of the GSC method, which was therefore equal to one, as shown in Equations (32) and (33). Table 2 indicates that the GSC-CC method could reduce the peak value of the enhanced signal distortion effectively under most SNR conditions, except when the noise was so heavy that the speech energy barely affected the estimation of the residual noise. The GSC-SN method showed better performance under low SNR conditions (Table 1). The proposed GSC-SN-MCC method was a combination of the above two methods. The tables show that the proposed method could work effectively under both high and low SNR conditions. The best performance of the proposed method occurred when the energy of the speech was almost equal to the energy of the noise.
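The far-field distance quoted in this section can be reproduced with a short calculation. The sketch assumes that the common far-field equation takes the form d = 2L²/λ = 2L²f/c, which matches the stated value of about 0.933 m for L = 0.2 m, f = 4000 Hz, and c = 343 m/s. The function name is hypothetical.

```python
def far_field_min_distance(aperture_m, f_max_hz, c=343.0):
    """Minimum source distance d = 2 * L**2 * f / c (assumed far-field criterion)."""
    return 2.0 * aperture_m ** 2 * f_max_hz / c

if __name__ == "__main__":
    d = far_field_min_distance(0.2, 4000.0)
    print(f"minimum far-field distance: {d:.3f} m")
    print("2.5 m placement satisfies the condition:", 2.5 > d)
```

Because the criterion scales with frequency, the 4000 Hz lowpass cutoff also bounds the required source distance.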
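The evaluation metrics described above can be sketched as follows. The exact forms of Equations (30) to (33) are assumed here: the frame distortion is taken as the summed squared difference between the output and the clean speech over a frame, the distortion-noise ratio divides it by the noise energy of the same frame, and normalization divides by the corresponding GSC result. Non-overlapping frames and the function names are also assumptions made for the example.

```python
import numpy as np

def _frames(x, frame_len):
    """Split a signal into non-overlapping frames (assumed framing)."""
    n_frames = len(x) // frame_len
    return np.asarray(x[:n_frames * frame_len]).reshape(n_frames, frame_len)

def frame_distortion(y_out, s_speech, frame_len):
    """Per-frame distortion: sum of squared output-minus-clean-speech samples."""
    diff = _frames(y_out, frame_len) - _frames(s_speech, frame_len)
    return np.sum(diff ** 2, axis=1)

def distortion_noise_ratio(y_out, s_speech, noise, frame_len):
    """Per-frame distortion divided by the pure noise energy of that frame."""
    noise_energy = np.sum(_frames(noise, frame_len) ** 2, axis=1)
    return frame_distortion(y_out, s_speech, frame_len) / noise_energy

def normalize_to_gsc(values, gsc_value):
    """Express results relative to the conventional GSC result (GSC becomes 1)."""
    return np.asarray(values, dtype=float) / gsc_value
```

With these definitions, an output that retains half of the noise amplitude and leaves the speech untouched gives a distortion-noise ratio of 0.25 in every frame.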

Effect of the Inaccurate Estimated Noise Direction
The simulation calculation was conducted based on the previous experiment data to show the influence on the performance of the proposed method when the estimated noise direction was not accurate. Therefore, the incident direction of the desired signal was 0°, and the actual noise direction was 90° in this section. The SNR condition was performed at −5 dB. Twelve angles were chosen as the inaccurately estimated noise directions, which were distributed uniformly in a circle. The meaning of the coordinate axes in the figures and the calculation equations of the result were the same as in the previous subsection. Figure 10 presents the distortion-noise ratio of the frames with different methods when the estimated noise direction was 60°. Figure 10b is the enlarged view of Figure 10a from the 50th to 100th frame. Figure 11 shows the spectrum of signal distortion of the 83rd frame on the entire frequency band using different methods. Figure 11b displays the enlarged view of Figure 11a from 500 Hz to 2000 Hz.
The figures indicate that even with an error angle of 30° between the actual and the estimated noise direction, the proposed method effectively reduced both the peak value and the average value of the signal distortion in this case. Tables 3 and 4 compare the performance of the different methods for various estimated noise directions. Table 3 shows that the performance of the GSC-SN method decreased as the estimated noise direction moved away from the actual noise direction. The GSC-SN method reduced to the conventional GSC method when the estimated noise direction was identical to the desired signal direction, and it achieved its best denoising performance when the estimated noise direction was identical to the actual noise direction. The performance of the proposed GSC-SN-MCC method followed a trend similar to that of the GSC-SN method: the closer the estimated noise direction was to the actual noise direction, the better the performance. Even when the estimated noise direction was opposite to the actual noise direction, the performance of the proposed method was not worse than that of the conventional GSC method. Table 4 shows that the GSC-CC method reduced the peak value of the signal distortion independently of the estimated noise direction. The ability of the proposed method to decrease the peak of the enhanced signal distortion was similar to that of the GSC-CC method.
The tables show that the proposed method worked effectively when the estimated noise direction was near the actual noise direction. When the estimated noise direction was far from the actual noise direction, the proposed method still reduced the peak value of the enhanced signal distortion effectively, and the average of the enhanced signal distortion was similar to that of the conventional GSC method.

Conclusions
This paper proposed a modified GSC method to reduce the distortion of the enhanced signal using cross-correlation and sidelobe neutralization. The GSC-MCC method was proposed first by adding the cross-correlation coefficient between the cancelling signal and the error signal to the filter weight update path to control the distortion of the enhanced desired signal when the energy of the desired signal increased suddenly. The beamforming sidelobe pattern was then presented and combined with the conventional GSC method (referred to as the GSC-SN method) to improve the denoising performance by reducing the noise component in the beamforming output signal of the GSC method. Finally, the GSC-SN-MCC method was proposed by combining the two methods to take advantage of both.
The experiment was performed in an anechoic chamber, and the experimental data showed that the proposed method could reduce the enhanced signal distortion effectively at different noise levels (the SNR ranged from −20 to 20 dB). The proposed method performed best when the SNR was close to 0 dB. The simulated calculation was conducted to reveal the influence of inaccurately estimated noise directions on the denoising performance of the proposed method. The calculation results showed that the proposed method worked effectively when the estimated noise direction was near the actual noise direction. Even when the estimated noise direction was far from the actual noise direction, the peak of the enhanced signal distortion was still decreased, while the average of the enhanced signal distortion remained similar to that of the conventional GSC method, which indicates the feasibility of the proposed method in practice.
Author Contributions: C.-M.L. gave academic guidance to this research work and revised the manuscript. H.S. designed the core methodology of this study, programmed the algorithms, carried out the experiments, and drafted the manuscript. Both authors read and approved the final manuscript.
Funding: This research received no external funding.