Article Radio Frequency Interference Detection and Mitigation Algorithms Based on Spectrogram Analysis

Radio Frequency Interference (RFI) detection and mitigation algorithms based on a signal’s spectrogram (frequency and time domain representation) are presented. The radiometric signal’s spectrogram is treated as an image, and therefore image processing techniques are applied to detect and mitigate RFI by two-dimensional filtering. A series of Monte-Carlo simulations have been performed to evaluate the performance of a simple thresholding algorithm and a modified two-dimensional Wiener filter.


Introduction
Microwave radiometry is today routinely used to remotely measure geophysical parameters. Microwave radiometers are highly sensitive instruments, but measurements may be corrupted by the presence of Radio-Frequency Interference (RFI). Sources of RFI include spurious signals and harmonics from lower frequency bands, spread-spectrum signals overlapping the "protected" band of operation, or out-of-band emissions not properly rejected by pre-detection filters.
The presence of RFI in the pre-detected voltage signal modifies the value of the measured power leading to an erroneous retrieval of the measured geophysical parameters [1][2][3]. In recent years, techniques to detect the presence of RFI in radiometric measurements have been developed. They include time-and/or frequency domain analyses [4], or statistical analysis of the received signal [5,6].

OPEN ACCESS
In addition, the so-called Spectral Kurtosis algorithm has been studied in [7] with good results, taking advantage of the effectiveness of the combination of different RFI detection algorithms.
Combining the time and frequency domains analysis jointly, the spectrogram stands as a powerful tool which has been previously used in RFI detection in radio-astronomy [8,9]. The spectrogram consists of an intensity plot (usually on a decibel scale) of the Short-Time Fourier Transform (STFT) magnitude [10]. The STFT consists of a set of Fourier Transforms (FT) of consecutive windowed data segments from a longer data set. Windows usually overlap in time, thus data segments may have redundant information, as sketched in Figure 1. The STFT provides time-localized spectral information of the frequency components of a signal varying over time, whereas the standard FT provides the frequency information averaged over the entire signal time interval [10]. As a spectrogram is a two-dimensional intensity plot, it can be analyzed as an image, thus having all the image processing tools at our hand. In this work, a simple thresholding algorithm to detect and eliminate interference patterns in radiometric signal spectrograms is developed first, and simulation results are presented. The simulated RFI signals are generically from the sinusoidal and chirp families. By adjusting the signals' parameters, the bandwidth and temporal duration, many different types of signals can be obtained. A two-dimensional (2D) Wiener filter is then applied to the spectrogram in order to try to improve the cancellation of RFI components in the radiometric signal. A brief description of the spectrogram calculation is given in Section 2. Section 3 describes the image processing algorithm proposed in this work, which is called "Smoothing Algorithm". In Section 4 the Wiener filtering process is described and compared to the "Smoothing Algorithm" proposed. Section 5 shows the results obtained with the application of both algorithms to a thermal noise signal contaminated with a set of RFI chirp signals. Finally, Section 6 summarizes the conclusions of this work.

Spectrogram Calculation
In this work, radiometric signals are assumed to be sampled at an adequate sampling rate satisfying the Nyquist criterion and the spectrograms are obtained from these discrete time signals.
The spectrogram calculation depends on several parameters which are: the total number of signal samples, the FFT size, the window used for the FFT calculation of the data segments, and the overlapping between these data segments, as shown in Figure 1.
The spectrogram used in this paper is calculated using discrete samples, each one corresponding to a measurement of the pre-detected voltage signal. The number of samples (N) which forms the spectrogram will determine the radiometric resolution obtained by each spectrogram.
The FFT size is the number of samples used to compute each FFT and determines the spectral and temporal resolutions. If the FFT size (L) increases, the spectral resolution increases, while the temporal resolution decreases, since the time lapse between consecutive data segments increases. In contrast, smaller FFT sizes lead to lower frequency resolution, and higher temporal resolution. Therefore, selection of the FFT size must be chosen carefully depending on the RFI present in the radiometric measurements.
The FFT calculation requires the use of an analysis window function in order to minimize unwanted "side lobes" and ringing in the FFT resulting from abrupt truncations at both ends of the data segment. This situation causes an oscillatory behavior in the FFT called the Gibbs phenomenon [11]. Several functions can be used to avoid this problem, though when the data source is unknown, the Hann window is one of the most common windows used in spectrum analysis because of its excellent roll-off rate at 18 dB/octave [12].
In order to avoid loss of data when using a window in the FFT calculation, some overlapping (O) must exist between consecutive data segments ( Figure 1). In addition, overlapping increases the temporal resolution as the beginning of the data segments is reduced. However, increasing the overlap leads to an increase of the total number of samples to be computed and managed, complicating the hardware design. Overlap factor is recommended to be 75% for a Hann window [13,14]; in this work a value of O = 75% is used.
The spectrograms calculated in this work have the following parameters: • Number of samples: this parameter must be as high as possible to maximize the radiometric resolution, as a higher number of samples means a longer integration time. On the other hand too large values increase the computation time required to perform the Monte-Carlo simulations. The value chosen in this work is N = 2 18 samples.
• Overlap: for the case of 2 18 samples, an overlap value of O = 75% is reasonable [13,14] to avoid information loss.
• FFT size: The FFT size determines the frequency resolution and the number of FFT segments, determines the temporal resolution. For the simulations performed in this work, and supposing that any information of the RFI is known, the FFT size was chosen to be equal to the number of FFT segments, to have similar spectral and temporal resolutions (both resolutions may not be the optimal ones); thus the FFT size should be the square root of the number of samples (N). Taking into account that there exists a 75% overlap between consecutive Hann windows, the number of FFT segments has to be multiplied by 4. Thus, to equilibrate both resolutions, maintaining the original compromise, the FFT size has been selected to be L = 2 10 , and now the total number of samples is 2 20 (accounting redundant points due to overlapping).
As it has been said, the spectrogram operation consists of the power (square of the absolute value) of the STFT magnitude. The spectrogram operation changes the probability density function of the received thermal noise which is a pair of Gaussian distributed random variables (one for the in-phase component and the other one for the quadrature component). The square of the amplitude is equal to the sum of the squares of both Gaussian distributions, which is the definition of a chi-square distribution with two degrees of freedom, which is equivalent to an exponential distribution [15]. Figure 2 sketches this process.

Smoothing Algorithm
The most obvious way to detect the presence of interference in a radiometric signal is by detecting power peaks in the received signal that are larger than the variance of the measured thermal noise in the absence of RFI. This detection can be performed either in both the time and frequency domains.
This technique can be straight-forwardly extended to the spectrogram. Considering that a power peak is produced by an RFI signal, the threshold value must be a function of the thermal noise variance (power), and must maximize the probability of RFI detection (P det ) and minimize the probability of false alarm (P fa ) detection. RFI probability of detection cannot be computed in advance since the presence of RFI is unknown, but the probability of false alarm is easy to obtain as the thermal noise follows a known Gaussian distribution.
All simulations performed in this work have been performed assuming an antenna temperature (T A ) of 300 K and a radiometer receiver noise temperature of 100 K. In order to make a relationship between the noise power and the RF interfering signal power, the Interference-to-Noise Ratio (INR) is defined as: where σ 2 RFI is the power of the interfering signal and σ 2 Noise is the thermal noise power. The threshold value of this algorithm must be selected to limit the error in the estimated antenna temperature produced by false alarms.
If an RFI-free spectrogram is observed ( Figure 1, panel 4) it is obvious that the noise power peaks are scattered (and usually isolated) in both frequency and time, while in an RFI contaminated spectrogram (e.g., Figure 4a), the RFI is usually clustered in determined regions in the frequency-time plane. Considering the spectrogram as an image, noise power peaks present higher spatial frequency contents (rapid variations) than the RFI, which are usually clustered. Hence, applying a 2D smoothing filter (low-pass filter) will attenuate these exponential "peaks", while the RFI contaminated regions will be preserved. RFI is preserved so it can be detected with a more restrictive threshold.
The algorithm proposed in this work can be summarized as follows with the aid of Figures 3 and 4: • Power spectrogram calculation of a data segment of 2 18 samples using data segments of 2 10 samples, with 75% overlap, and a Hann window (Figure 4a).
• Convolution of the power spectrogram with a 2D low pass filter of a determined size (in this work the Hann window has been chosen, although a Gaussian or a Hamming window would perform similarly (see Figure 4b)).
• Thresholding: a threshold is used in the smoothed power spectrogram, in order to detect clusters of RFI-contaminated pixels ( Figure 4c). The optimum threshold will be discussed in Section 5.
• Antenna noise power is calculated by averaging all spectrogram pixels below the threshold.
• Finally, T A is obtained from the antenna noise power.  (d) RFI-mitigated spectrogram using "blanking" of the RFI contaminated pixels.

Wiener Filtering
As already discussed, the spectrogram of a noise signal with sinusoidal interference signals can be considered as a noisy image, where the noise is the spectrogram of the radiometric signal (the one we want to measure), and the image to be detected is the spectrogram of the interference (the one to be cancelled). Therefore, designing a filter to eliminate the noise from the image is the way to estimate the RFI, for a later removal of the interference without loss of the radiometric data.
The Wiener filter is a well-known adaptive filter used in communications, which provides the best estimation of a signal, equalizes communications channel, and eliminates the noise present in the received signal. In this section the Wiener filter is used to estimate the RFI in the spectrogram image, for a later cancellation and its performance is compared to the smoothing algorithm. In our case of study, it is not necessary to know the effect of the communications channel in the spectrogram, thus, it will only be necessary to differentiate between the noise and the interfering signal. The way to perform this task is by using a locally adaptive linear minimum mean square error (LLMMSE) [16] to estimate the interfering signal's spectrogram.
The LLMMSE algorithm consists of an optimal linear estimator of our interfering signal ŷ(t u , f v ) in combination with additive noise. and is the spectrogram of the Gaussian noise (independent from the interfering signal spectrogram), t u is the u th time point of the spectrogram, f v is the v th frequency point and α and β are two parameters chosen to minimize the mean square estimation error criterion ε described in Equation (4): The minimum error (ε) is found for: where ∂ denotes partial derivative. From Equations (4), (5.1) and (5.2), and taking into account that the additive noise power of the spectrogram does not have a zero mean, Equation (6) can be derived: where μ s (t u , f v ) represents the local mean of s(t u , f v ) around a square of L × L pixels, described in Equation (7), σ s 2 (t u , f v ) represents the local variance of s(t u , f v ) around a square window of L × L pixels, described in Equation (8), μ n is the mean of the noise spectrogram, and σ n The use of different window sizes (L value, M 1 + M 2 + 1 = L) affects the resulting estimated interference ŷ(t u , f v ). If L is too small, the noise filter algorithm is not effective. On the other hand, if L is too large subtle details of the interference will be lost in the filtering process. Figure 6 shows the simulated error in the estimation of the noise power as a function of the RFI power present in the received signal; assuming that the thermal noise power (σ n 2 ) is perfectly known. It is observed that the value of L depends on the actual RFI level: if the RFI level is 17 dB lower than the thermal noise, the best choice is a 5 × 5 window, otherwise, a 6 × 6 window will outperform. In our case of study, the 6 × 6 window is chosen as the maximum value of the algorithm error is 0.74 K.  Interference to Noise Ratio (dB) 4x4 filter size 5x5 filter size 6x6 filter size 7x7 filter size 8x8 filter size

Results
In this section, results obtained with the two image processing algorithms presented (Smoothing and thresholding and 2D Wiener filtering) are shown and discussed.

Smoothing Algorithm Applied with Pulsed Sinusoidal and Chirp RFI
The use of the FFT implies that the Smoothing Algorithm is implicitly searching for sinusoidal interferences [5][6][7], therefore, in order to determine the correct performance of this algorithm, it has been tested with a set of linear chirp and sinusoidal interfering signals as defined in Equation (11) where P is the number of chirp signals that conform the chirped RFI, A p , d p , t p , f p , υ p , β p are the amplitude, effective duration, central time, initial frequency, initial phase, and chirp rate of the p th chirp signal respectively; R is the number of sinusoidal RFI signals, A r , f r , υ r , ν r , η r are the amplitude, frequency, phase, initial time and final time of the r th sinusoidal signal respectively; rect() is the rectangular function, T s is the sampling period, and N is the number of samples of the RFI signal. The retrieved error in T A as a function of the threshold's P fa , the size of the smoothing filter and the power of the interfering signal is presented in Figures 7-9. To test the performance of the Smoothing Algorithm, a 15 × 15 Hann window is selected. Retrieved T A and error in the retrieved T A as a function of the threshold's P fa are represented in Figures 8 and 9, respectively. Figure 7 shows the retrieved T A [K], for different interfering signal power (and even with no interference, black dotted line), with a fixed smoothing filter size (in this case a 15 × 15 Hann window), as a function of the P fa of the threshold used in our algorithm. The T A value is obtained by means of 1024 Monte Carlo simulations on a Pentium D at 3.4 GHz with 2 GB of RAM. Simulation time (16 P fa levels and 9 INR levels) was 3 h and 5 min. The actual T A value is 300 K which is asymptotically achieved by the black dotted line (in abscense of RFI) as P fa decreases. As it can be seen, the selection of the P fa threshold is crucial, as a low P fa values decreases the probability of detection of RFI-contaminated pixels, leading to a retrieved temperature higher than the real antenna temperature. On the other hand, a high P fa value will produce a high number of false alarms which will produce a clipping in the probability distribution function of the power spectrogram pixels, leading to a retrieved T A lower than the real antenna temperature.
In Figure 8 Combining different simulations with RFIs of different powers (INR from 5 dB to −30 dB compared to the noise power) it is observed that the threshold with optimum performance has a P fa ~ 7.24 ×· 10 −4 (P fa | Opt in Figure 8 Table 1. In this table, the values of the threshold with the lower T A RMS error independently of the RFI power are presented, in addition to the P fa associated with this threshold and the T A RMS error obtained with this threshold in the absence of RFI. For all cases, it is important to have a low T A RMS error value for the most suitable threshold in the black dotted curve, as absence of RFI should be the most probable case. In Table 1 it is observed that, as the filter size used in the Smoothing Algorithm increases, the maximum T A RMS error value decreases; in addition this value also decreases in absence of RFI.  The best performance is obtained with the largest window (35 × 35 Hann window). However, large smoothing windows exhibit a poorer radiometric resolution, as explained below. The radiometric resolution of an ideal total power radiometer is inversely proportional to the square root of the product of the noise bandwidth and the integrating time, where ΔT is the radiometric resolution, T rec is the receivers temperature, B is the noise bandwidth and τ is the integration time. Spectrogram pixels have a resolution of a determined bandwidth ΔB and a determined integration time Δτ, and the sum over all pixels of ΔB· Δτ corresponds exactly to the product of B and τ. Pixel elimination produced by the Smoothing Algorithm decreases the number of pixels available for radiometric measurements and the radiometric resolution degrades. In Equations (13) and (14), the relationship between the radiometric resolution after and before applying the Smoothing Algorithm is developed.
where ΔT| SA is the radiometric resolution after applying the Smoothing Algorithm, ΔB and Δτ are the frequency and time resolutions of the spectrogram pixels, N is the number of pixels of the spectrogram, N| SA is the number of pixels of the spectrogram that passes the Smoothing Algorithm filtering process, and N el is the number of eliminated pixels by the Smoothing Algorithm. Figure 10 shows the degradation of the radiometric resolution due to loss of radiometric data. It is observed that the degradation increases with the window size. When the window size increases, RFI power peaks are convolved over the spectrogram leading to an increase of the number of RFI-contaminated pixels. Degradation also increases with the RFI power as high powers are more likely to be detected by the algorithm even when they are smoothed, and contaminate a larger area of the time-frequency image. Therefore, as the filter size increases, the number of eliminated pixels increases, degrading the radiometric resolution. Figure 10. Radiometric sensitivity degradation due to pixel elimination as a function of the filter size. Eight RFI signals have been used, with total INR value labeled for each case. Thresholds used are the most suitable ones for each window size (Table 2). Radiometric sensitivity is normalized by the radiometric sensitivity value obtained when the Smoothing Algorithm is not applied; i.e., without the elimination of any pixel of the time-frequency image. 5    The retrieved error in T A as a function of the threshold's P fa , the size of the smoothing filter and the power of the interfering signal has been calculated in the same way that in the previous section, and it is presented in Figures 11-13.    In Figure 11, the error introduced by the Smoothing Algorithm in the retrieved antenna temperature ε A T [K] is represented for different interfering signal powers, with a fixed smoothing filter size: (15 × 15 Hann window) for two different broadband interfering signal. It can be observed in Figure 11a that a very high P fa must be used to detect a PRN interfering signal. In this case, the compromise of the P fa threshold selection leads to a minimum error of retrieved antenna temperature of 14.39 K, (P fa = 0.0384) for an INR value of 5 dB.
On the other hand, Figure 11b shows that an OFDM broadband signal can be detected with a threshold with a lower P fa , thus leading to a minimum error of retrieved antenna temperature of 9.07 K, (P fa = 0.02).
Following the RFI study in the same way as in the previous section, three additional Monte-Carlo simulations have been performed for different Hann window smoothing filter sizes (35 × 35, 25 × 25, and 5 × 5) for both broadband RFI cases; which are presented in Figures 12-13. Results of Figure 12 shows that it does not exist a P fa | Opt for all INR parameter values as in Figure 8 due to the fact that an increase of the INR parameter leads to an increase of the error in the retrieved T A ; therefore higher INR parameter values leads to higher error in the retrieved T A values.
In contrast, for the OFDM RFI case, there exists a P fa | Opt for all INR parameter values as in the sinusoidal RFI case (Figure 13). Table 2 summarizes the optimal P fa and threshold values and the maximum error in the retrieved T A for the OFDM RFI case; this table is similar to Table 1, but with lower thresholds values and higher error in the retrieved T A values. In contrast, error on the retrieved T A produced by the contamination of an OFDM interfering signal is lower than in the PRN signal's case, as this signal is based in a frequency modulation, and the FFT process can concentrate the energy of the interfering signal.
The main problem of the presence of broadband RFI is that, even if it is correctly detected, radiometric resolution of the measurements will be degraded due to the minimum introduced T A error and the loss of radiometric resolution due to the high number of eliminated pixels, as it can be seen in  Tables 3,4.  Table 4. Retrieved T A error for the threshold used in Figure 15 Table 3 and Figure 14 is the most suitable threshold to detect PRN RFI signals with an INR parameter value lower or equal to −5 dB.
On the other hand, threshold used in Table 4 and Figure 15 is the most suitable threshold to detect OFDM RFI signals with any INR parameter value. (h) (i)

Drawbacks of the 2D Wiener Filter
Results obtained with the LLMMSE filter show that this algorithm is suitable for denoising signals ( Figure 16), but it has an important drawback.
For an optimal performance, it is necessary to accurately estimate in advance the power of the thermal noise (T A ). Error in the estimation of the thermal noise power, introduces an error in the denoising process which leads to an error in the retrieved T A itself as it can be seen in Figure 17. Thus, thermal noise power must be first estimated for a proper extraction of the RFI from the radiometric signal. In addition, it is observed that the error introduced by the LLMMSE algorithm is almost equal to the error of the estimated noise power itself.
The reason for performing RFI extraction is to accurately estimate T A , which in fact is the thermal noise power. However, the LLMMSE algorithm does not improve the accuracy in the estimation of the thermal noise as it can be seen in Figure 17, and therefore this algorithm is not considered suitable for RFI mitigation.

Conclusions
Two new RFI detection and mitigation algorithms have been presented. Both are based on processing of the radiometric signal's spectrogram and thus operate in the time and frequency domains simultaneously.
The Smoothing (and thresholding) Algorithm is studied and its performance is estimated using Monte Carlo simulations. The threshold value in the Smoothing Algorithm is the most critical parameter, as it minimizes the retrieved T A error depending on the filter size and the RFI power, which is a priori unknown. For a determined filter size, the best threshold can be calculated varying the RFI signal power and keeping the noise power constant. In case that the interference is sinusoidal, it is found that there is an optimal threshold value which minimizes the retrieved T A error for any RFI power. This threshold value diminishes with the filter size used in the Smoothing Algorithm. For a sinusoidal RFI, with a threshold value of 1.37· σ 2 n , the retrieved T A error is 2 K for a filter size of 25 × 25 pixels, and any INR value.
In case that the RFI is broadband, two cases have been studied, which are a PRN RFI, and an OFDM RFI. The PRN RFI behaves like noise and it is found that there is not an optimal threshold value, as increasing the RFI power always increases the retrieved T A error. The OFDM RFI behaves like a sinusoidal RFI, so there also exists an optimal threshold value, although the maximum retrieved error in the T A is higher than in the sinusoidal case. In addition, broadband RFI's contaminate extense areas of the spectrogram, resulting in a poorer radiometric resolution as many more pixels of the spectrogram have been eliminated.
A 2D de noising filter based on the optimum Wiener filter (LLMMSE) is also studied to estimate the RFI in the signal's spectrogram and to substract it from the contaminated one. However, it has been found that the Wiener filter has an acceptable performance applied to the spectrogram to detect RFI signals for subsequent substraction only if the noise power of the received signal is precisely known, which is actually the magnitude to be determined. Therefore although the Wiener filter is optimal for signal denoising in signal processing, the accuracy required in the estimation of the noise power is much higher in microwave radiometry than in typical telecommunications applications, and therefore it does not prove to be suitable for RFI mitigation.
These two algorithms have been tested only with sinusoidal, chirp, PRN and OFDM like signals, testing other types of RFI signals with these algorithms will be performed in the future as well as processing measured data.