A CFAR-Enhanced Spectral Whitening Method for Acoustic Sensing via UAVs

This paper addresses the problem of performing CFAR detection on signals with colored noise distributions, such as those encountered when performing acoustic sensing via UAVs. A CFAR-enhanced spectral whitening method is proposed that maintains detector functionality without inhibiting detection sensitivity. The performance of the method is demonstrated using acoustic data taken from experiments involving a fixed-wing UAV. The results show that the approach performs significantly better than standard techniques such as inverse spectral whitening, which tend to attenuate acquired target source components.


Introduction
In many real-world applications, signal noise does not follow a white Gaussian distribution but rather exhibits some colored form that is a function of frequency. For example, the transmission of acoustic energy in a viscoelastic medium results in amplitude attenuation that is proportional to the square of the component frequency, as described by Stokes' Law [1]. The resulting effect is a coloring of acoustic energy across the frequency band of interest, which includes both noise and desired signal components. For transmission in atmospheric conditions, semi-empirical models have been developed to predict attenuation levels based on source frequency and thermodynamic properties of the medium [2]. Thus, frequency-dependent attenuation or coloring of acoustic spectra is typical when performing acoustic sensing of aircraft such as unmanned aerial vehicles (UAVs). This can be observed in Figure 1, which provides power spectra for various propeller-driven aircraft during a fly-by. From the plots, it is evident that the self-generated noise consists of strong harmonic narrowband components superimposed on a frequency-dependent broadband base. Although the general downward power trend can be attributed to transmission effects, the specific shape is due to more complex features of the aircraft design [3].
The presence of spectral coloring may greatly influence the ability to perform operations such as target source detection. For example, common CFAR detection methods require noise to follow one of a small class of known distributions that must remain stationary with time [4]. For situations in which the underlying distribution is either unknown or does not follow a standard form (e.g., Gaussian or exponential), distribution-free methods may be used to achieve constant false alarm rates. However, distribution-free CFAR (DF-CFAR) methods require that noise samples be independent and identically distributed (IID) to maintain accurate functionality [5].
One particular application subject to these conditions is acoustic sensing via UAVs. Although generally considered unconventional, this technology is currently being investigated for the detection of various ground-based and airborne acoustic sources [6][7][8][9][10][11]. However, current studies have not employed CFAR detectors to establish fixed false alarm rates during flight operations. Robust detection techniques using DF-CFAR detectors have been previously presented to address the bandwidth limitations and nonstationary signals associated with this particular application [12]. Here we examine the issue of applying these detectors to acoustic signals with colored noise distributions. The purpose of this paper is therefore to present a spectral whitening technique that enables the use of DF-CFAR detectors for acoustic sensing via UAVs.
Drones 2018, 2, 1

Background Information
The two most common forms of spectral whitening are inverse filtering and frequency-band gain control [13]. The frequency-band method is a time domain approach, where multiple band-pass filters are applied in parallel to section the signal into various frequency bands. Each of the filtered sections is then equalized using an active scaling approach such as automatic gain control (AGC) or linear predictive coding (LPC). The benefit of this method is a continuously whitened output that does not require any block-based processing such as that inherent in FFT operations. The major downside is potential phase distortion, since these scaling operations are typically non-linear [13].
In contrast, inverse filtering is typically performed in the frequency domain and does not produce phase distortions. It involves dividing the spectrum of concern by the mean of its noise approximation, as given by Equation (1) [14], where |X̄(f)| is the approximated or smoothed magnitude spectrum of X(f), γ is a scaling or degree-of-flattening factor, and C is a constant to prevent division by zero. If desired, the division constant may be excluded by performing the operation in the log-decibel domain instead, as in Equation (2). To reconstruct the complex signal, the whitened spectrum is multiplied by the original phase response, as in Equation (3).
To obtain the noise approximation |X̄(f)|, multiple spectra are typically taken consecutively in time and averaged together. If the signal is continuously windowed and frequency transformed, a moving average function may be applied to obtain an accurate approximation. Common averaging methods include the cumulative mean, the recursive exponential mean, and the windowed mean, given by Equations (4)-(6), respectively, where w is the current windowed segment number, ξ is the recursive forgetting factor (0 < ξ < 1), and W is the total number of windows used for the mean estimate.
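The three averaging methods named above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, each spectrum is a plain list of magnitudes per frequency bin, and the recursive form assumes that ξ weights the previous estimate (the text only states that ξ is a forgetting factor with 0 < ξ < 1).

```python
from collections import deque


def cumulative_mean(spectra):
    """Yield the mean of all magnitude spectra seen so far, one per window."""
    mean = None
    for w, spectrum in enumerate(spectra, start=1):
        if mean is None:
            mean = list(spectrum)
        else:
            # Incremental update: mean_w = mean_{w-1} + (x_w - mean_{w-1}) / w
            mean = [m + (x - m) / w for m, x in zip(mean, spectrum)]
        yield list(mean)


def recursive_exponential_mean(spectra, xi=0.5):
    """Yield mean_w = xi * mean_{w-1} + (1 - xi) * x_w for each window.

    Assumption: xi multiplies the previous estimate; the source does not
    state which term the forgetting factor is applied to.
    """
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must satisfy 0 < xi < 1")
    mean = None
    for spectrum in spectra:
        if mean is None:
            mean = list(spectrum)  # seed the recursion with the first window
        else:
            mean = [xi * m + (1.0 - xi) * x for m, x in zip(mean, spectrum)]
        yield list(mean)


def windowed_mean(spectra, W=4):
    """Yield the mean of the most recent W spectra (fewer during start-up)."""
    recent = deque(maxlen=W)
    for spectrum in spectra:
        recent.append(list(spectrum))
        n = len(recent)
        yield [sum(column) / n for column in zip(*recent)]
```

The cumulative mean weights all history equally, whereas the recursive and windowed forms track a slowly changing noise floor, which is likely the more relevant behavior for nonstationary flight recordings.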
Although simple and often effective, the major drawback of the above approach is the potential attenuation of desired signal components caused by a contaminated noise estimate. If target signal components are present in the past windowed spectra that constitute the current noise estimate, the normalization process will act to remove them from the whitened spectra. The obvious solution is to remove these components from the spectra before taking the mean estimate. However, in many instances the desired signal components and their frequency locations are not known in advance, so they cannot be removed directly. For such cases, the above methods are clearly not optimal.
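The attenuation effect described above can be reproduced with a toy example. The sketch below is illustrative only: the spectra are made-up numbers, the helper name is hypothetical, and whitening is taken as plain division by the noise estimate (the fully flattened case), which is an assumption about the form of the inverse filter.

```python
def inverse_whiten(spectrum, noise_estimate, c=1e-12):
    """Divide each bin by its noise estimate; c guards against division by zero."""
    return [x / (n + c) for x, n in zip(spectrum, noise_estimate)]


# Toy colored noise floor: magnitude falls with frequency bin (hypothetical values).
noise_floor = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
tone_bin, tone_gain = 5, 10.0

# A persistent tone sits ten times above the floor in every window.
window = list(noise_floor)
window[tone_bin] *= tone_gain
past_windows = [list(window) for _ in range(20)]

# Noise estimate averaged over windows that all contain the tone.
contaminated = [sum(column) / len(past_windows) for column in zip(*past_windows)]
# Noise estimate with the tone excluded (only possible if its location is known).
clean = list(noise_floor)

whitened_contaminated = inverse_whiten(window, contaminated)
whitened_clean = inverse_whiten(window, clean)

print("tone bin, contaminated estimate:", round(whitened_contaminated[tone_bin], 3))
print("tone bin, clean estimate:       ", round(whitened_clean[tone_bin], 3))
```

With the contaminated estimate the tone bin is normalized to roughly the same level as every other bin, so the source disappears into the whitened floor. With the clean estimate the tone keeps its tenfold margin.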

Description
A proposed solution to the problem of attenuating target signal components through their inclusion in the mean noise estimate is to remove all peak components that may constitute a potential target signal. This can be achieved using a CFAR detector, such as the DF-CFAR previously mentioned. Using the detector, potential signals can be identified and effectively removed from the noise estimate by flooring them to some scaled value of the CFAR detection threshold used. To ensure all potential components are successfully located, a very high false alarm probability is used to maximize sensitivity. Because this value is much higher than that of the final target detection stage (performed after whitening), a source component that goes undetected here, and is subsequently included in the mean noise estimate, will not affect the final detection performance; a component too weak to be flagged at this stage would be unlikely to pass the stricter final stage in any case.
It is proposed that the OS-CFAR detector be utilized, since this form offers computational simplicity and superior performance in multi-target environments [15]. However, essentially any CFAR detector may be used instead. Some common forms include the cell-averaging CFAR (CA-CFAR) [16], the greatest-of cell averaging CFAR (GOCA-CFAR) [17], the smallest-of cell averaging CFAR (SOCA-CFAR) [17], the ordered statistic CFAR (OS-CFAR) [18], the censored mean level CFAR (CML-CFAR) [19], and the trimmed mean CFAR (TM-CFAR) [20]. Each of these detectors operates on the same principles, differing only in the method by which the reference noise level is determined. For the OS-CFAR detector, a binary testing function may be constructed in which the test cell |X(f, w)| is compared against the threshold factor η(f, w). This threshold is formed by scaling |X_k(f, w)|, the k-th largest spectral component contained in the noise sample bandwidth of size N taken about the test cell, by the order statistic scaling factor α_os.
Prior to calculating the mean approximation, potential signal components are effectively removed by flooring their value to some scaled fraction of the detection threshold used, where the fraction is set by the flooring scale factor δ. The mean approximation is then found by substituting the floored values into Equations (4)-(6). Finally, the spectrally whitened form is obtained via Equations (1) and (2) with γ = 0.
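The procedure described above (detect with OS-CFAR, floor the detections, average, then divide) can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes the threshold is the product of α_os and the k-th largest reference cell, that the test cell is excluded from its own reference set, that the recursive mean weights the previous estimate by ξ, and that whitening is plain division. All function names and the default values of half_width, k, and alpha_os are hypothetical; ξ = 0.5 and δ = 1 are the values quoted later in the paper.

```python
def os_cfar_threshold(spectrum, i, half_width, k, alpha_os):
    """Return the threshold for test cell i: alpha_os times the k-th largest neighbor."""
    lo = max(0, i - half_width)
    hi = min(len(spectrum), i + half_width + 1)
    # Reference cells taken about the test cell, excluding the test cell itself.
    reference = [spectrum[j] for j in range(lo, hi) if j != i]
    if not reference:
        return float("inf")  # no neighbors available, so never flag this cell
    ranked = sorted(reference, reverse=True)
    kth_largest = ranked[min(k, len(ranked)) - 1]
    return alpha_os * kth_largest


def floor_detections(spectrum, half_width=4, k=3, alpha_os=2.0, delta=1.0):
    """Replace every cell that exceeds its threshold with delta times that threshold."""
    floored = []
    for i, x in enumerate(spectrum):
        eta = os_cfar_threshold(spectrum, i, half_width, k, alpha_os)
        floored.append(delta * eta if x > eta else x)
    return floored


def cfar_enhanced_whiten(spectra, xi=0.5, c=1e-12, **cfar_kwargs):
    """Whiten each magnitude spectrum using a noise mean built from floored spectra."""
    mean = None
    whitened = []
    for spectrum in spectra:
        floored = floor_detections(spectrum, **cfar_kwargs)
        if mean is None:
            mean = floored
        else:
            # Recursive exponential mean of the floored (signal-suppressed) spectra.
            mean = [xi * m + (1.0 - xi) * f for m, f in zip(mean, floored)]
        # The original, unfloored spectrum is what gets normalized.
        whitened.append([x / (n + c) for x, n in zip(spectrum, mean)])
    return whitened


# Toy usage: a sloped 16-bin noise floor with a persistent tone in bin 8.
floor = [float(v) for v in range(16, 0, -1)]
window = list(floor)
window[8] *= 10.0
result = cfar_enhanced_whiten([window] * 5, half_width=4, k=3, alpha_os=2.0, delta=1.0)
print([round(v, 2) for v in result[-1]])
```

In this toy case the tone bin stays well above the whitened floor because the noise mean at that bin is capped at the detection threshold rather than tracking the tone itself.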

Validation
To confirm the validity of the proposed whitening approach, the method is applied to acoustic data taken from a fixed-wing Delta X-8 UAV fitted with four microphones during flight operations. Using a single channel recording, probability distributions were calculated from consecutive FFT spectra for the unwhitened and whitened signals, and compared to that of ideal white Gaussian noise. Since the purpose is to evaluate the broadband spectral noise distribution, narrowband self-noise components generated by the aircraft propulsion system were first removed via the referenceless adaptive IIR approach previously proposed by Tan [21][22][23][24]. Since the notch filter bandwidth is typically very narrow, the process has little to no effect on the underlying broadband features. To calculate the probability distributions, the FFT was applied to the 1150 s flight recording using 0.5 s rectangular windows with a 50% overlap, producing 4599 windowed points for each frequency bin. Using these observations, the probability density functions (PDFs) for each spectral form (whitened, unwhitened, etc.) were then calculated as a function of frequency. Figure 2 provides the results obtained using the magnitude spectra for the original, whitened, and Gaussian noise signals. From the plots, it is evident that the broadband noise components in the original notch filtered signal are not IID, since density values vary largely as a function of frequency. In contrast, the whitened signal PDF is nearly identical to the ideal response obtained from Gaussian noise, whose magnitude spectrum follows a Rayleigh distribution. Thus, we may conclude that the broadband noise components were effectively whitened to form a group of IID spectral components as desired.
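The window count quoted above follows directly from the recording length, window length, and overlap. A minimal check, assuming only full windows are kept (the function name is illustrative):

```python
def window_count(duration_s, window_s, overlap):
    """Number of full windows that fit in a recording for a given fractional overlap."""
    hop_s = window_s * (1.0 - overlap)  # time advance between successive windows
    return int((duration_s - window_s) // hop_s) + 1


# 1150 s recording, 0.5 s windows, 50% overlap gives a 0.25 s hop.
print(window_count(1150.0, 0.5, 0.5))  # prints 4599
```

Floor division on floating-point values is exact here because 0.25 and 1149.5 are exactly representable; for arbitrary durations it would be safer to count in integer samples.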

Experimental Results & Discussion
The performance of the proposed CFAR-enhanced whitening approach is now illustrated using experimental data. In brief, the experiment involved flybys of a Delta X-8 aircraft fitted with four microphones at approximately 20 knots over a ground-based loudspeaker emitting various pure tone frequencies. Here we examine the separate cases of a 200 Hz and a 500 Hz pure tone being emitted. Further details regarding the experimental setup can be found in [10]. Recorded signals were first decimated to reduce data processing requirements, since the recorded sampling rate was 48 kHz but only information up to approximately 1000 Hz was found to be useful. The sample-reduced signals were then notch filtered to remove narrowband self-noise components using the referenceless adaptive IIR approach previously proposed by Tan [21][22][23][24]. Figure 3 provides spectrograms of the original and notch filtered signals for the 200 Hz source case. The filtered signals were then windowed, frequency transformed using the FFT operation, and spectrally whitened via the CFAR-enhanced method. Finally, detection statistics were established using the selective cell distribution-free CFAR detector (SCDF-CFAR) in conjunction with the single trial (ST), binary integration (BI), and robust binary integration (RBI) detection schemes. Details regarding the SCDF-CFAR detector and the various detection schemes can be found in [12] and are not discussed here since they are outside the scope of this paper.
Tables 1-3 provide the notch filter, FFT, spectral whitening, and SCDF-CFAR parameters used. As previously discussed, the proposed whitening procedure can be effectively employed without reducing the probability of detection by using a threshold value which produces a much higher false alarm rate than that used in the final detection stage. Here, thresholding values are chosen using the values displayed in the tables such that a false alarm rate of P_FA = 0.1 is achieved. This is considerably higher than that offered by the SCDF-CFAR detector as indicated in the table (P_FA^SC = 0.001). Note that SC indicates testing a single cell in the acquired signal FFT spectrum, while ST indicates testing all cells across the frequency band of interest.
We first illustrate the effectiveness of the proposed whitening approach in avoiding attenuation of source signal components. Figure 4 provides a spectrogram of the pre-whitened signal with a 500 Hz source component clearly visible, while Figure 5 displays spectrograms for the standard inverse and CFAR-enhanced whitened signals, respectively. The noise approximation was calculated using the recursive mean as previously given by Equation (5) with ξ = 0.5, and using a flooring scale value of δ = 1. From observation of the three plots, it is evident that both methods whiten broadband noise components, since power levels remain relatively constant across the frequency band. However, the standard approach also greatly attenuates the target source component to near noise-floor levels. In contrast, the proposed CFAR method does not attenuate the source signal, but actually increases the SNR slightly while still maintaining an overall whitened response. This effect can be better visualized in Figure 6, which depicts the whitening process for a single windowed segment taken at 8.4 min into the flight. Here, |X(f)| is the original unwhitened signal, |X̄(f)|⁻¹ is the inverse noise approximation, |X_k(f)| is the CFAR detection threshold used to establish the inverse noise approximation, and |Y_w(f)| is the whitened signal spectrum.
We now provide quantitative results showing the effectiveness of the proposed whitening approach in increasing general detection capabilities. Table 4 provides the detection results for the 200 Hz and 500 Hz source signals with a passing altitude of 150 m. For each source frequency, results are provided for both the whitened and unwhitened signals. It should be noted that the SNR values quoted are not calculated in the manner typical of most signal processing applications. The "effective SNR" was used instead, which closely resembles the spurious free dynamic range. This method provides a more meaningful measure since it compares the peak signal value to the point at which it can no longer be detected (noise floor or detection threshold). It is depicted in the sample spectra displayed in Figure 7.
From a comparison of the results obtained, it is apparent that the whitened signals produce significantly better results compared to the standard unwhitened forms; SNR values and detection rates were generally much higher for all of the detection schemes used. In addition to SNR values and overall detection rates, initial detection times were also found to be lower for the whitened forms. From a visual inspection of the spectrograms displayed in Figure 8, it is apparent that a single harmonic component was present at the 400 Hz location. Although the emitted source contained only a pure 200 Hz tone, the harmonic component was generated by the presence of a reflecting boundary (ground) located directly behind the speaker. However, it is apparent from the power spectra displayed in Figure 8 that this component would not be detectable for the unwhitened signal, since the non-flat power distribution produces an inherently high detection threshold. For the whitened signal, this is not the case and the harmonic component would instead be detected.
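The "effective SNR" described above compares the peak signal value with the level at which it would no longer be detected. One way to express that ratio in decibels is sketched below. The function name is hypothetical, and the use of 20 log10 assumes that both inputs are magnitudes rather than powers; the exact definition illustrated in Figure 7 is not reproduced here.

```python
import math


def effective_snr_db(peak_magnitude, detection_threshold):
    """Margin, in dB, between a spectral peak and the level where detection is lost."""
    if peak_magnitude <= 0.0 or detection_threshold <= 0.0:
        raise ValueError("both magnitudes must be positive")
    # 20*log10 applies to magnitude ratios; use 10*log10 if the inputs are powers.
    return 20.0 * math.log10(peak_magnitude / detection_threshold)


# A peak ten times above the threshold has a 20 dB margin; a peak at the threshold has none.
print(effective_snr_db(10.0, 1.0))
print(effective_snr_db(1.0, 1.0))
```

Under this reading, a sloped unwhitened spectrum raises the threshold near the source frequency and shrinks the margin, which is consistent with the lower values reported for the unwhitened signals.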

Conclusions
Based on the results obtained from the analysis provided, it is evident that the proposed CFAR-enhanced spectral whitening method is an effective means of whitening signals in the frequency domain without attenuating potential source components. In addition, the method was found to transform colored frequency-dependent distributions to a frequency-independent form, thus producing IID variables. This property is considered significant since methods such as the DF-CFAR detector require it to accurately predict false alarm rates. The effectiveness of the approach was demonstrated using experimental data, with results confirming that the method provides increased detection of narrowband signals embedded in colored broadband noise.

Figure 1. Power spectra of various aircraft during fly-by.

Figure 3. Spectrograms of noise corrupted and notch filtered signals for the 200 Hz source.

Figure 5. Spectrogram of whitened signal using standard and CFAR-enhanced methods.

Figure 6. Comparison of standard and CFAR-enhanced whitening methods.

Figure 7. Power spectra illustrating signal detection for whitened and unwhitened signals.

Figure 8. Spectrograms of whitened and unwhitened signal segments for the 200 Hz source.

Table 1. Signal preprocessing and filter parameters.

Table 4. Detection results for 150 m passing altitude.