RFI Mitigation in Microwave Radiometry Using Wavelets

: The performance of microwave radiometers can be seriously degraded by the presence of radio-frequency interference (RFI). Spurious signals and harmonics from lower frequency bands, spread-spectrum signals overlapping the “protected” band of operation, or out-of-band emissions not properly rejected by the pre-detection filters due to the finite rejection modify the detected power and the estimated antenna temperature from which the geophysical parameters will be retrieved. In recent years, techniques to detect the presence of RFI have been developed. They include time-and/or frequency domain analyses, or statistical analysis of the received signal which, in the absence of RFI, must be a zero-mean Gaussian process. Current mitigation techniques are mostly based on blanking in the time and/or frequency domains where RFI has been detected. However, in some geographical areas, RFI is so persistent in time that is not possible to acquire RFI-free radiometric data. In other applications such as sea surface salinity retrieval, where the sensitivity of the brightness temperature to salinity is weak, small amounts of RFI are also very difficult to detect and mitigate. In this work a wavelet-based technique is proposed to mitigate RFI (cancel RFI as much as possible). The interfering


Introduction
The performance of microwave radiometers can be seriously degraded by the presence of radiofrequency interference (RFI).This is especially important in populated land areas [1][2][3].If RFI produces an increase in the power detected by the radiometer, which is larger than the natural variability of the geophysical phenomena, it can be flagged and the data discarded, but when the RFI is comparable or smaller than the natural variability, it is difficult to detect, and the retrieved geophysical parameters are erroneous [3].In the last decade, RFI detection and mitigation methods have been devised to address this problem.The three approaches that have been used are: time, frequency or statistical analyses of the received signal.Time analysis looks for anomalous behavior of the received signal in form of power peaks above the expected range of values.These values are blanked (discarded) and only the RFI-free data are left [4,5].However, since the detected power is a smoothed version of the instantaneous one, if the RFI peaks are shorter than the integration time they may not be detected.Frequency analysis looks for the spectrum of the received signal within the radiometer's band, and discards sub-bands with power higher than the rest [6].Finally, since for a zero-mean Gaussian signal, such as thermal noise, all the odd moments must be zero, and the even ones must satisfy (where  is the signal's standard deviation), statistical tests analyze some mathematical properties of the received signal and compare them to the theoretical values within some error margin, discarding the data series that do not satisfy some pre-defined criteria.Among these test the most widely used is the Kurtosis (ratio of the fourth and the square of the second moments) which must be equal to 3 in RFI-free conditions [7][8][9].The sixth order moment [10], and other algorithms [11,12] have also been studied.Most of the previous studies have focused on pulsed sinusoidal signals within the radiometer's band.In radio-astronomy some techniques have been proposed to deal with GLONASS RFI [13], but since the RFI signal is much lower than the noise, some a priori knowledge of the interferent signal must be known.These are called "physical modeling" in communications' terminology and a different model is required for each type of RFI.
Another type of analyses fall within the so-called "statistical-physical modeling" category, that provide universal model for natural and man-made RFI.The main models are the Middleton's class A (narrowband RFI within receiver's band), class B (broadband RFI wider than receiver's band), and class C (mixture of class A and B) canonical models for which the mathematical form is independent of the physical environment [14].The RFI mitigation approach is then based on the estimation of the model parameters and apply a linear optimal filtering (Wiener filter) or optimal detection rules [15].
In this work we propose a different approach to mitigate the effect of RFI in microwave radiometry.It is based on the use of the power of the wavelet transform to denoise (remove noise from a signal).The interfering signal ˆ( ) s t (RFI) is estimated without any a priori knowledge of it.This signal is then subtracted from the received signal ( ) x t , to obtain a quasi RFI-free noise signal ˆ( ) ( ) ( ) n t x t s t   from which the power is detected (Figure 1).x t s t n t   , so as to obtain a quasi RFI-free noise signal ˆ( ) n t .
In the following sections the principles of denoising are briefly reviewed.The optimum parameters to detect and mitigate four different types of RFI (sinusoidal, Doppler-like1 , chirp2 , and pseudorandom noise) are studied: type of thresholding, sequence length, decomposition level, and type of wavelet (among 75 different types).In order to the study completely general and a homogeneous intercomparison between these different signals:  the maximum instantaneous frequency has been set to be equal to 1 Hz for all of them, so that the sequence length corresponds to the number of samples per signal period, and  the amplitude of the different interfering signals has been properly scaled so that their power is also the same, and so the interference-to-noise (INR) ratio.The noise power is assumed to be equal to one, and the algorithm's performance is expressed in terms of the error of the detected output power as a function of the INR.

Principles of Denoising
Wavelet analysis is based in the multi-resolution analysis.Signals are decomposed into localized component waves of variable duration (wavelets).A discrete signal f is first decomposed into a first trend a 1 and a first fluctuation d 1 : . Then, in the second step (or level) the first trend a 1 is decomposed again into a second trend a 2 , and a second fluctuation d 2 : . The process is repeated a number of steps L (levels): . By ordering the wavelet transform coefficients by decreasing magnitude, signal compression and denoising can be performed by neglecting (or attenuating) the amplitude of the coefficients below a given threshold (see [16] for an introduction on the wavelet theory and its applications).
Consider the problem of denoising an unknown signal [17][18][19][20][21][22][23]: where X Wx  , Ŝ Ws  and N Wn  are the wavelet transforms of x, s and n.For a smooth function with additive Gaussian white noise, a theoretical threshold exists that completely removes the noise and successfully reproduces the original function: where [ ( 1), (2),..

., ( )]
One method of filtering is thresholding.A denoised signal is obtained by applying the inverse wavelet transform 1 W  (with the same wavelet basis) to a limited number of the highest coefficients of the DWT spectrum (Y).There are a number of ways to decide which coefficients should be retained.The two simplest methods are the hard and soft thresholding: • The hard threshold filter H hard removes the coefficients below a threshold value T, determined by the noise variance and keeps the others to the same value: • The soft threshold filter H soft shrinks the wavelet coefficients above, and below the threshold reduces them towards zero: Figure 2 shows the transfer function of the hard (red) and soft (blue) threshold filters (with T = 1).If the resulting signal has to be smooth, it has been shown that the soft threshold filter must be used [22].The selection of the threshold value can be difficult.In practice, if the noise-free signal   s t is unknown, a smooth approximation of the signal is looked for.Small threshold values lead to noisy results, while large threshold values introduce bias.Experimental studies have demonstrated that for some applications, the optimal threshold is simply computed as a constant c times the noise variance [23].Four approaches are used in this study [18]:  One approach utilizes a selection rule based on Stein's Unbiased Risk Estimate or SURE (quadratic loss function).If the signal to noise ratio is very small, the SURE estimate is very noisy.


The Universal method assigns a threshold level equal to the noise variance times where M is the sample size [21].


The heuristic approach is a mixture of the two previous ones, and if the signal to noise ratio is detected to be very small, the fixed threshold is used.


The fourth method uses a fixed threshold selected to yield the minimax performance for mean square error against an ideal procedure (minimum of the maximum mean square error).The choice of the proper wavelet scaling function is always the most important thing.Generally, for denoising, the wavelet scaling function must have mathematical properties (shape, continuity of the signal and its derivatives) similar to the original signal.For example, to denoise pulsed signals, the Haar wavelet (box scaling function) will perform well, but it will not perform as well to denoise sinusoidal signals.On the other hand, high order Daubechies wavelets (e.g.order 8) will perform well for sinusoidal signals.If the level of decomposition increases (higher fluctuation terms d L included), signals can be reproduced with increasingly accuracy.Therefore, a trade-off exists between the decomposition level and the complexity to evaluate the wavelet transform.The number of different wavelets is very large.See [25] for a quite detailed list of wavelet types and their properties.

PRN
The effect of the wavelet type (Haar wavelet) and the threshold selection method can be visually seen in Figure 3 for the four different types of signals been analyzed in this work.In this case, the pseudo-random noise signal (bottom panel) is better reconstructed than the other three continuous signals.

Optimum Parameters Selection
In this work 75 different wavelet types have been tested [25], but simulation results (average of 100 realizations each) are presented first only for the simplest wavelet: Haar wavelet or Daubechies 1. Once the trends are understood and the optimum parameters are found for the Haar wavelet, the optimum performance for the each signal type and each wavelet type is presented.

Interference Rejection
As it can be appreciated, at least 20 samples per period are required to reject RFI better than 20 dB, and as the sequence length increases, the estimated noise power error decreases (interference rejection increases) by the same factor, saturating around 3•10 -4 (INR > 55 dB) for the PRN signal at sequence lengths longer than 2 13 .In general, the fixed threshold estimation provides the worst performances, except for the SURE thresholding with the chirp signal for lengths between 64 and 1024.In general, for all signal types, the best performance is achieved using the heuristic SURE thresholding for any sequence length.Only for the PRN type of signal and below 256 samples, the minimax thresholding methods outperforms the heuristic SURE method.

Decomposition Level
Figure 5 shows the performance of the proposed RFI mitigation wavelet technique using the Haar wavelet, with 2 16 sequence length, as a function of the decomposition level.As intuitively expected, the larger the number of decomposition levels, the better the reconstruction (slope ~-1/3 decade per unit level), the smoother the function, and the smaller the number of terms in the decomposition that have to be used to reconstruct the signal properly.The quality of the reconstruction of the sinusoidal signal (and later cancellation) saturates above five levels, the Doppler signal saturates above six levels, the chirp signal saturates above seven levels, and only the PRN signals starts saturating at 11 levels.In all cases the heuristic SURE thresholding method is used, which is the one that achieves the best performance, only matched by the fixed thresholding for PRN signal.

RFI Mitigation Performance vs. Interference-to-Noise Ratio (INR)
Figures 6 and 7 show the RFI mitigation performance using the Haar transform or the optimum wavelet transform respectively, as a function of the threshold method and INR, for a sequence length of 2 16 and a decomposition level equal to 12.As it can be appreciated, using the soft heuristic SURE thresholding the rejection is very good (~40 dB) for high INR = 100, decreasing with decreasing INR.Below INR ~ 2•10 -4 the algorithm is no longer able to estimate the RFI signal (Figure 6).  1and decomposition level = 12.Threshold method: squares = fixed threshold with single level noise estimation, circles = soft SURE, diamonds = soft heuristics SURE, and triangles = minimax.RFI signal: solid line = sine, dashed line = Doppler, dotted = chirp, and dash-dot = pseudo-random noise.The optimum wavelets for each signal type have found to be the symlet 3 (sym 3) for the sinusoidal RFI, the reverse biorthogonal wavelets 1.5 (rbio1.5)for the Doppler signal, the discrete approximation of Meyer (dmey) wavelet for the chirp signal, and the reverse biorthogonal wavelets 1.3 (rbio1.3)for the PRN signals.When using the optimum wavelet types for each RFI signal the performance is significantly improved for high INRs (Figure 7), but remains stable as INR decreases so below INR ~ 2•10 -4 , as in the case of the Haar wavelet, no improvement is seen.

Noise Correlation Effects
As shown in Section 3.1, at least 20 samples are required per signal period to achieve a good performance (interference rejection  20 dB) of the RFI mitigation algorithm (or 300-400 samples per period for an interference rejection  30 dB), while the Nyquist sampling rate requires just two samples per period.It is worth to address the effect of the correlation between consecutive samples introduced by the receiver's frequency response.
To analyze this effect, the input white noise (zero-mean random Gaussian variable) is convolved with a constant FIR (Finite Impulse Response) filter of variable length (C) and amplitude such that the variance of the output remains the same ( 2 =1) 3 .The RFI is then added to the "colored" (filtered) noise and the wavelet-based RFI mitigation algorithm is applied.Figure 8 shows the average of 10 realizations using different correlation lengths from 1 (no correlation) up to 32. Results show that the impact of the correlation in the interference mitigation is negligible (< 0.1 dB).Results appear a bit noisy, but this is due to the small scale in the vertical axis.Figure 8. RFI mitigation performance using the Symlet 3 wavelet transform and a sinusoidal RFI for the heuristic SURE threshold method as a function of the correlation length (in samples) of the noise.Sequence length 2 16 , decomposition level = 12.

Quantization Effects
So far, the analysis has been performed using floating point representation.It is worth to evaluate the impact of signal quantization in this process, since analog signals are sampled using analog-todigital converters (ADC) with finite number of bits, and usually the higher the sampling frequency, the smaller the number of bits.To test the impact of quantization in the performance of the abovedescribed algorithms, signals (noise + RFI) have been quantized using a variable number of bits from 4 to 16, and the ADC window has been set to 3.5•.Enlarging the ADC window truncates less the input signal, better capturing the tails of the noise probability density function and -eventually-the RFI that may be present, but reduces the number of sampling levels per , which degrades the quality of the signal representation.Reducing the ADC window the input signal is truncated and the detected power is smaller than it actually is.With an ADC window of 3.5•, the detected power is 1% less (0.99• 2 ) the true input power.
Figure 9 shows the simulation results for a sinusoidal RFI and the Symlet 3 wavelet transform.For large INR, the interference signal is so large that is clipped by the ADC and noise power error saturates to 1 (recall that noise variance is normalized to  2 = 1).Only for INR < 10 -3 , the effect of the number of bits used in the quantization can be distinguished, with better performance when a larger number of bits are used in the ADC.The reason for obtaining good results with a moderate number of bits is due to the fact that noise power is estimated after integration of thousands or millions of samples (sampling frequency times the integration time).Bosch-Lluis et al. [26] demonstrated that for a 4.05•ADC window and 8 bits quantization, the estimated antenna temperature error was smaller than 0.01 K, suitable for sea surface salinity measurements, while coarser quantization schemes can be used for less demanding applications.

Computing Requirements
When considering the practical implementation of the described algorithms, it is critical to evaluate the number of floating operations (Flops) required.Figure 10 shows the number of floating operations required to perform the described RFI cancellation algorithm for INR = 100, as a function of the wavelet type (Haar and optimum ones in Figure 7) and the sequence length, both for decomposition levels equal to 6 (Figure 10a) and 12 (Figure 10b), for which the algorithm performance is very similar except for PRN like signals (Figure 5).Differences in the number of Flops with other INR values down to 0 (no RFI) are 4% and are not shown not to overload the figure.
As expected, the simplest wavelet type (Haar) is the one that requires the smallest number of Flops.On the other hand, the discrete Meyer (dmey) is the one that requires the largest number of Flops, about 10-20 times more than the Haar wavelet.For the other wavelet types (reverse biorthogonal 1.3 and 1.5 -rbio1.3 and rbio1.5-,and symlets 3 -sym3-) the number of required floating operations is about 2-3 times larger than the Haar one, which may justify its use in large RFI scenarios when the highest RFI rejection may be required (Figures 6 and 7).The improved performance of the discrete Meyer transform may be questioned when balanced with the much larger computation requirements.
Below 10-20 samples length there is a saturation effect due to the computational overhead, which suggests using sequences longer than 20 samples.The differences between Figures 10a and 10b are just a factor  2.5 for the shortest sequence lengths, being nearly identical for the sequences longer than ~10-100 samples per period.Assuming a system operating at L-band (1.400-1.427MHz) that acquires 20 samples per period to warrant enough RFI rejection (Figure 4), the sampling rate is 20 samples/period • 27•10 6 periods /s = 540 Msamples/s.Unfortunately, this high data rate cannot be processed with today's spaceborne processors (e.g. 4 MFlops/s with Leon 3 [27]) or even ground-based FPGAs [28].However, the proposed algorithm can be applied either at intermediate frequency or for the in-phase and quadrature components of the demodulated signal separately, to reduce the signal bandwidth by a factor of 2 and accordingly the computing requirements per processor, at the expense of doubling the number of processors.Further reductions can be achieved by sub-banding (dividing the original band into smaller ones) and applying the proposed algorithm to each of the sub-bands.

Conclusions
A RFI mitigation algorithm has been presented.It is based on the estimation of the RFI signal using wavelet denoising techniques, and its subtraction from the original signal so as to obtain an estimate of the noise-only signal.
It has been found that in general, the soft heuristic SURE thresholding method is the one that best performs for any type of signal, except for very weak RFI (INR ~ 10 -3 … 10 -4 ) in which the fixed thresholding performs slightly better.
The minimum number of samples per period is ~20, for which at least a 20 dB interference rejection is achieved using the Haar transform (depending on the signal type and the thresholding method).Higher rejections (40 dB) are achieved at the expense of longer sequences (2 16 samples) and increased computational load (~3•10 5 Flops for 20 samples/period to 7•10 8 Flops for 2 16  samples/period, Figure 10a).Increasing the number of decomposition levels above 6-7 does not improve the quality of the RFI mitigation, except for PRN (pulsed) signals.
The minimum inteference rejection (cancellation) is ~40 dB for high INRs (INR~100) when using the Haar wavelet transform, but this value may increase up to ~60-70 dB if the optimum wavelet type for each signal is selected: symlet 3 for the sinusoidal RFI, the reverse biorthogonal wavelets 1.5 for the Doppler signal, the discrete approximation of Meyer wavelet for the chirp signal, and the reverse biorthogonal wavelets 1.3 for the PRN signals.Therefore there is a trade-off between interference rejection and computational complexity which must be evaluated depending on the application.
Even in the case of using the simple Haar transform, the proposed RFI mitigation algorithm allows one to make useful radiometric measurements in areas heavily corrupted by RFI.Since it is unlikely that with today's computers and FPGAs this technique can be applied for real time processing in microwave radiometers, the RFI cancellation could be applied a posteriori to the recorded data.This can be of special interest in scenarios corrupted by human activity where the errors induced by RFI are equal or larger than the measurement uncertainty (T) and blanking is not an option due to the data loss.For example, if the antenna temperature is 100 K, the receivers noise temperature is 300 K, and the measurement uncertainty is T = 1 K, the minimum detectable power is 1 K/(100 K + 300 K) = 2.5•10 -3 (assuming  2 = 1), which determines the minimum useful INR for which it makes sense to apply the RFI mitigation algorithm.In other demanding applications, such as sea surface salinity retrieval, where the required T is much smaller (T = 0.05 K), RFI mitigation can be useful down to INR of ~0.05 K/(100 K +300 K)  10 -4 , which is at the limit of the performance of the RFI cancellation algorithm, regardless of the wavelet type used.

Figure 1 .
Figure 1.RFI mitigation technique: an estimate of the RFI signal ˆ( ) s t is subtracted from the received signal ( ) ( ) ( )x t s t n t   , so as to obtain a quasi RFI-free noise signal ˆ( ) n t .

Figure 2 .
Figure 2. H hard and H soft transfer functions of the threshold filters (with T = 1) currently used in denoising.
red dashed line) and H soft (blue solid line) threshold filters soft thresholding hard thresholding

Figure 4
Figure4shows the performance of the proposed RFI mitigation wavelet technique using the Haar wavelet, as a function of the threshold methods described in Section II and the sequence length, for the four different types of interferent signals (sine, Doppler, chirp, and pseudo-random noise), decomposition level = 12, and INR = 100.

Figure 4 .
Figure 4. RFI mitigation performance using the Haar transform as a function of the threshold method and sequence length (number of samples per signal period) for four different types of interferent signals, and decomposition level = 12.Threshold method: squares = fixed threshold with single level noise estimation, circles = soft SURE, diamonds = soft heuristics SURE, and triangles = minimax.RFI signal: solid line = sine, dashed line = Doppler, dotted = chirp, and dash-dot = pseudo-random noise.

Figure 5 .
Figure5.RFI mitigation performance using the Haar transform as a function of the threshold method and decomposition level for four different types of interferent signals, and sequence length 216 .Threshold method: squares = fixed threshold with single level noise estimation, circles = soft SURE, diamonds = soft heuristics SURE, and triangles = minimax.RFI signal: solid line = sine, dashed line = Doppler, dotted = chirp, and dash-dot = pseudo-random noise.

Figure 6 .
Figure 6.RFI mitigation performance using the Haar transform as a function of the threshold method and INR for four different types of interferent signals, sequence length 216 and decomposition level = 12.Threshold method: squares = fixed threshold with single level noise estimation, circles = soft SURE, diamonds = soft heuristics SURE, and triangles = minimax.RFI signal: solid line = sine, dashed line = Doppler, dotted = chirp, and dash-dot = pseudo-random noise.

Figure 7 .
Figure 7. RFI mitigation performance using the optimum wavelet type for each signal as a function of the INR for the heuristic SURE threshold method for four different types of interferent signals, sequence length 216 and decomposition level = 12.RFI signal: solid line = sine, dashed line = Doppler, dotted = chirp, and dash-dot = pseudo-random noise.

1 RFI
mitigation performance -Optimum wavelet for each signal including correlation between consecutive samples -Sym 3 transform & sinusoidal RFI Correlation length = 1 sample Correlation length = 2 samples Correlation length = 4 samples Correlation length = 8 samples Correlation length = 16 samples Correlation length = 32 samples

Figure 9 .
Figure9.RFI mitigation performance using the Symlet 3 wavelet type for a sinusoidal signal as a function of the INR for the heuristic SURE threshold method as a function of the number of bits of the analog-to-digital converter (ADC) from 4 to 16, assuming a sampling window of 3.5• to minimize signal clipping in the absence of RFI.Sequence length 216 , decomposition level = 12.At high INR values of RFI clipping occurs.For larger ADC sampling windows RFI is better sampled, but noise signals suffer more from discretization effects due to the smaller number of samples per .

4 N
-Noise Ratio [lin] Noise power error,  2 noise = 1 RFI mitigation performance vs. number of bits of ADC -Sym 3 transform & sinusoidal RFI N bits =

Figure 10 .
Figure 10.Number of floating operations required by the RFI cancellation algorithm (INR = 100) vs. sequence length, for different wavelet types and decomposition levels equal to a) 6, and b) 12.