RFI Suppression for SAR via a Dictionary-Based Nonconvex Low-Rank Minimization Framework and Its Adaptive Implementation

Abstract: Synthetic aperture radar (SAR) frequently suffers from radio frequency interference (RFI) due to the simultaneous presence of numerous wireless communication signals. Recently, narrowband RFI has been found to possess a low-rank property owing to its stable frequency occupancy, which has led to RFI suppression being reconsidered as a joint sparse and low-rank optimization problem. The existing methods either use the non-sparse useful signal itself as the sparse regularizer, or employ the nuclear norm to approximate the rank function, which punishes all singular values with the same penalty via singular value thresholding (SVT) and thus causes the improper punishment problem. Both choices limit performance. In this paper, a novel dictionary-based nonconvex low-rank minimization (DNLRM) framework is proposed for RFI suppression, which improves both the sparse regularizer and the low-rank regularizer. For the former, an over-completed dictionary is constructed, and the corresponding sparse coefficient acts as the sparse regularizer. For the latter, the rank function is approximated more accurately by a nonconvex function, whose supergradient is used to generate the weighted penalty, thus solving the improper punishment problem. The derivation of the closed-form solution and the convergence analysis are described in detail. Additionally, an adaptive selection scheme for the model parameter is proposed to further ensure the practicality of the DNLRM framework. The superiority of the proposed method is demonstrated not only on RFI-free real SAR data combined with measured RFI, but also on RFI-contaminated real SAR data.


Introduction
Synthetic aperture radar (SAR) is an active remote sensing instrument that has been widely used for Earth observation with the special capability of working all-time and all-weather [1,2]. Due to the increasingly complex electromagnetic environment, SAR frequently suffers from radio frequency interference (RFI). RFI can cause degradation to image quality, which appears as misty streaks in the imaging result. The RFI-contaminated SAR images can even affect subsequent operations, such as polarimetry, interferometry, and target detection [3][4][5][6], thus highlighting the importance of RFI suppression.
(RNN) method [20] first considers a scheme that calculates the weighted penalties from the reciprocal of the singular values. Nevertheless, the theoretical basis for the rank approximation is not explicitly presented, and the influence of the weighting function on the update of the low-rank matrix is not clearly analyzed. Furthermore, the non-sparse useful signal is directly used as the sparse regularizer, which restricts the performance. Moreover, the preceding studies barely investigated the combination of the over-completed dictionary with a more accurate approximation of the rank function, let alone analyzed the behavior of the weighting function in the punishment operation.
Considering the above problems, to mitigate the limitation of RFI suppression performance caused by inaccurate modeling, the dictionary-based nonconvex low-rank minimization (DNLRM) framework is innovatively proposed in this paper to more accurately model the entire RFI suppression problem, with the reason being twofold. Firstly, nonconvex functions are introduced to approximate the rank function more accurately. Note that this is the first attempt to suppress RFIs using nonconvex functions. The improper punishment problem in terms of singular values can be overcome by treating larger ones with smaller penalties. Noteworthily, the supergradient of the nonconvex function is directly related to the calculation of the weighted penalty [30][31][32][33]. Secondly, an over-completed dictionary constructed by the discrete time-shifted versions of the transmitted signal is considered. Therefore, the corresponding sparse coefficient, rather than the useful signal itself, is used as the sparse regularizer for better modeling and protection. The constructed dictionary is inspired by [17,21,29], but is also combined with nonconvex functions for the first time, then further applied to our new framework. Benefitting from the discrimination for different singular values, the obtained low-rank matrix can better represent RFIs, and the useful signal protected by the sparser regularization has better fidelity with less residual RFIs. In addition, after an in-depth analysis of the behavior of the weighting function in the punishment operation, the adaptive selection scheme for the model parameter in DNLRM (ADNLRM) is uniquely proposed, based on a modified boxplot. Adaptively searching the optimal parameter for weighting functions without human intervention, ADNLRM can further improve the practicality of the proposed framework.
In conclusion, the main contributions of this paper are embodied in the following three aspects: (1) The DNLRM framework is proposed to solve the inaccurate modeling problems and unify the preceding similar methods. The detailed derivation of the closed-form solution, based on the alternating direction method of multipliers (ADMM), is provided along with the convergence analysis. (2) Different nonconvex functions and the corresponding supergradients are introduced into RFI suppression to overcome the improper punishment problem. Additionally, ADNLRM is proposed for adaptively selecting the optimal parameter, which improves the applicability to varying RFI suppression missions. (3) The artificial combination of RFI-free real SAR data with measured RFI is considered alongside RFI-contaminated real SAR data, to further verify the practicability of the proposed methods in a realistic environment.
The remainder of this paper is organized as follows. Section 2 reviews the RFI-contaminated SAR signal model and basic optimization frameworks. The proposed DNLRM framework with the corresponding formula derivation, and the adaptive selection scheme (ADNLRM), are described in Section 3. Section 4 presents the optimization framework analysis. Section 5 reports the experiment results and discussions. Finally, Section 6 summarizes the conclusions and presents future prospects in the discipline.

Signal Model
In the stop-and-go model, a SAR mounted on a moving platform sends and receives signals at each azimuth position. Benefiting from its high two-dimensional matching gain, SAR can obtain high-resolution images of the target scene and clearly display the details of the objects of interest. On the other hand, as an active microwave system, SAR often shares the working frequency bandwidth with other forms of electromagnetic signals. The presence of RFI seriously affects the quality of SAR imaging, which is certainly detrimental to SAR image interpretation.
Remote Sens. 2022, 14, 678
In a crowded electromagnetic environment, SAR may receive mixed data consisting of the useful signal $x(t, \eta)$, RFI $l(t, \eta)$, and noise $n(t, \eta)$, which can be formulated as:
$$y(t, \eta) = x(t, \eta) + l(t, \eta) + n(t, \eta) \tag{1}$$
where $y(t, \eta)$ represents the received signal, and $t$ and $\eta$ are, respectively, the fast time and the slow time. The useful signal and RFI overlap in the time domain, thus they are difficult to separate directly. Therefore, it is necessary to find a suitable domain where their energy is separated from each other to effectively perform RFI separation and suppression. The signal form of $x(t, \eta)$ is determined by the transmitted signal. For a linear frequency modulation (LFM) SAR system, $x(t, \eta)$ is the corresponding LFM signal. Meanwhile, narrowband RFI can usually be modeled as the sum of $R$ sinusoid-like components as:
$$l(t, \eta) = \sum_{r=1}^{R} A_r(t, \eta) \exp\{j[2\pi f_r t + \varphi_r(t, \eta)]\} \tag{2}$$
where $A_r(t, \eta)$ is the complex envelope, $f_r$ is the carrier frequency, and $\varphi_r(t, \eta)$ is the initial phase of the $r$th sinusoid RFI component. This concise model proves to be suitable to describe narrowband RFI.
If the received signals are stacked along the azimuth, the corresponding two-dimensional SAR data matrix can be obtained. The received signal $Y \in \mathbb{C}^{N_r \times N_a}$, with $N_r$ and $N_a$ respectively denoting the number of samples along range and the number of pulses along azimuth, can thus be formulated as:
$$Y = X + L + N \tag{3}$$
where $X$, $L$, and $N$ are, respectively, the useful signals, RFI, and additive white Gaussian noise in matrix form. Considering the signal from the perspective of the two-dimensional matrix, we can further extract signal properties that cannot be revealed by the traditional one-dimensional analysis. Therefore, under the premise of accurately exploring the signal property and appropriately establishing the optimization model, the RFI suppression mission can be reconsidered as a matrix separation problem.

Optimization Frameworks
Once the two-dimensional SAR data matrix is transformed into the range-frequency and azimuth-time domains, thin and bright stripes appear along the azimuth time, as shown in Figure 1a. Such a high correlation in the azimuth time is the fundamental source of RFI being low-rank. The low-rank property of RFI, benefiting from its stable frequency occupancy, shows up as a precipitous drop of the singular values, as shown in Figure 1b. The amplitudes of the singular values were normalized based on the Frobenius norm of the RFI matrix. Additionally, the normalized RFI spectrum of a certain azimuth pulse marked in the red ellipse is shown in Figure 1c, which reveals its complexity and diversity. The extraction of the low-rank component can be obtained by a minimization of the rank function. Additionally, the sparsity of the useful signal $X$ should be regularized by an $l_0$ norm for protection purposes [22,23]. To sum up, the RFI separation is accomplished via the following optimization:
$$\min_{L, X} \ \operatorname{rank}(L) + \tau \|X\|_0 \quad \text{s.t.} \quad \|Y - L - X\|_F \le \delta \tag{4}$$
where $\operatorname{rank}(\cdot)$ means the rank function, $\tau$ acts as a weighting factor between the low-rank RFI and the useful signal, $\|\cdot\|_0$ denotes the $l_0$ norm, $\|\cdot\|_F$ denotes the Frobenius norm, and $\delta$ is a fairly small positive number. As mentioned in [17,21], the useful signal is noise-like and non-sparse, thus not suitable for sparse regularization directly. One alternative scheme considers the sparsity under the over-completed dictionary, $D_x \in \mathbb{C}^{N_r \times K}$, constructed from the time-shifted versions of the transmitted signal [29,34], where $K$ represents the number of atoms in the dictionary. Therefore, the corresponding optimization is modeled as:
$$\min_{L, A} \ \operatorname{rank}(L) + \tau \|A\|_0 \quad \text{s.t.} \quad \|Y - L - D_x A\|_F \le \delta \tag{5}$$
where $A$ is the corresponding sparse coefficient of the useful signal under dictionary $D_x$.
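The low-rank argument above can be checked numerically with a small simulation. The following is a minimal sketch, not the experimental setup of this paper: the function name `simulate_rfi`, the matrix sizes, the normalized frequencies, and the per-pulse amplitude and phase model are all illustrative assumptions. It stacks a few constant-frequency tones over azimuth pulses and compares the numerical rank with that of a noise-like matrix standing in for the useful signal.

```python
import numpy as np

def simulate_rfi(n_range=256, n_pulses=128, freqs=(0.12, 0.31), seed=0):
    """Stack R constant-frequency tones over azimuth pulses (hypothetical model).

    Each tone keeps its normalized frequency across pulses; only its amplitude
    and initial phase change from pulse to pulse.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_range)[:, None]            # fast-time sample index (column)
    rfi = np.zeros((n_range, n_pulses), dtype=complex)
    for f in freqs:
        amp = 1.0 + 0.1 * rng.standard_normal(n_pulses)   # per-pulse amplitude
        phase = rng.uniform(0.0, 2.0 * np.pi, n_pulses)   # per-pulse phase
        rfi += amp * np.exp(1j * (2.0 * np.pi * f * t + phase))
    return rfi

def numerical_rank(M, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rfi = simulate_rfi()
rng = np.random.default_rng(1)
noise_like = rng.standard_normal(rfi.shape) + 1j * rng.standard_normal(rfi.shape)

rank_rfi = numerical_rank(rfi)            # one rank-1 term per tone
rank_signal = numerical_rank(noise_like)  # noise-like data is full rank
```

Because a range FFT is a unitary transform up to scaling, the singular values of the simulated matrix are the same, up to that common factor, in the range-frequency domain.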
Because the above optimization is NP-hard, the rank function and the $l_0$ norm are usually relaxed to the nuclear norm $\|\cdot\|_*$ and the $l_1$ norm $\|\cdot\|_1$, respectively, thus resulting in the following optimization:
$$\min_{L, A} \ \|L\|_* + \tau \|A\|_1 \quad \text{s.t.} \quad \|Y - L - D_x A\|_F \le \delta \tag{6}$$
Such an optimization can be solved by the ADMM method [35,36], which successively decomposes and deals with the L subproblem and the A subproblem. For the L subproblem in the form of the nuclear norm, the SVT algorithm is used to generate a closed-form solution. However, different singular values are treated with the same penalty level, which can result in an incomplete RFI separation. As large singular values belong to the RFIs, they need less punishment. Therefore, a dictionary-based nonconvex low-rank minimization optimization framework is proposed and described in detail in Section 3.
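To make the uniform-penalty behaviour of SVT concrete, the following minimal sketch implements the standard singular value thresholding operator, which is the proximal operator of the nuclear norm. The function name `svt` and the toy matrix are illustrative assumptions.

```python
import numpy as np

def svt(M, thresh):
    """Singular value thresholding: shrink every singular value by `thresh`."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - thresh, 0.0)   # same penalty for all singular values
    return (U * s_shrunk) @ Vh               # scale the columns of U, then recombine

# Toy example: singular values 5, 3, 1 are all reduced by the same amount.
M = np.diag([5.0, 3.0, 1.0])
shrunk_values = np.linalg.svd(svt(M, 2.0), compute_uv=False)
```

In this toy case the largest singular value, which would correspond to strong RFI, loses as much as the smallest one, which is the improper punishment discussed above.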

Methods
In this section, the nonconvex regularizer and the corresponding supergradient are first analyzed in a comparative manner. Subsequently, the proposed DNLRM framework is described in detail, including the establishment of the framework and the ADMM-based derivation of the closed-form solution. Furthermore, after carefully considering the relation between the weighting function and singular values, the adaptive selection scheme ADNLRM for the model parameter is provided to further ensure the practicability of the proposed framework.

Nonconvex Regularizer
Inspired by [30], which suggests that nonconvex sparse optimization usually outperforms convex models and summarizes several common nonconvex functions, the nonconvex regularizer is introduced into the RFI suppression mission under our specific consideration. Figure 2a shows the nonconvex functions for approximating the rank function, with the corresponding supergradients illustrated in Figure 2c. The definitions of the considered nonconvex functions, Lp and Logarithm, are given in (7), and the corresponding supergradients in (8). In fact, any non-zero singular value, whatever its value is, should only increase the rank by 1. However, the nuclear norm approximates the rank function by the sum of all the singular values, so the contribution of a non-zero singular value directly depends on its value. As a result, the nuclear norm is often dominated by large singular values, and is thus far from the true rank (see the difference between the orange line and the red line shown in Figure 2a). In contrast, for nonconvex functions such as Lp and Logarithm, the value increases slowly as the argument grows. Their values are closer to 1, that is to say, they can approximate the true rank more accurately. An enlarged view of the densely curved area within the red rectangle of Figure 2a is illustrated in Figure 2b, which shows the trend of different curves more clearly. Furthermore, as revealed in Figure 2c, the supergradient of a nonconvex function has the monotonically nonincreasing property. It plays a central role in the subsequent algorithm analysis, because it guarantees that large singular values receive small penalties.
Notice the enlarged area in Figure 2d corresponding to the red rectangle in Figure 2c; the blue curve lies under the green one when the argument is greater than the intersection (see the red point), while the opposite behavior occurs when the argument is smaller than the intersection. This implies that the supergradient of Logarithm, which corresponds to the blue curve, provides even smaller penalties for the larger singular values than that of Lp. This also means that, for Logarithm, the penalties of the larger singular values differ more widely from those of the smaller ones, thus resulting in a more appropriate punishment of both RFI and the useful signal. Furthermore, note that the orange line stays constant in Figure 2c, which explains the invariant penalty of the nuclear norm. Additionally, the blue curve, representing Logarithm in Figure 2a, presents a smaller difference from the true rank than the green one representing Lp, which also predicts a better rank approximation performance.
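The qualitative behaviour described above can be reproduced with a short numerical sketch. The functional forms used here are assumptions made for illustration and are not taken from (7) and (8): an Lp-type surrogate with supergradient `lam * gamma * x**(gamma - 1)` and a logarithm-type surrogate with supergradient `lam / x`, both evaluated only for positive arguments.

```python
import numpy as np

def supergradient_nuclear(sigma):
    """Nuclear norm: constant weight, so every singular value is penalized equally."""
    return np.ones_like(sigma)

def supergradient_lp(sigma, lam=1.0, gamma=0.5):
    """Assumed Lp-type supergradient lam * gamma * sigma**(gamma - 1), sigma > 0."""
    return lam * gamma * np.power(sigma, gamma - 1.0)

def supergradient_log(sigma, lam=1.0):
    """Assumed logarithm-type supergradient lam / sigma, sigma > 0."""
    return lam / sigma

# Singular values sorted in descending order, as in the text.
sigma = np.array([100.0, 10.0, 1.0, 0.1])
w_nuc = supergradient_nuclear(sigma)
w_lp = supergradient_lp(sigma)
w_log = supergradient_log(sigma)
```

Under these assumed forms with `lam = 1` and `gamma = 0.5`, the two nonconvex curves cross at `sigma = 4`; above the crossing the logarithm-type weight is the smaller one, which matches the ordering of the blue and green curves described for Figure 2d.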

The Proposed DNLRM Framework
In order to provide the singular values with a more appropriate punishment, the DNLRM framework for RFI suppression is proposed as:
$$\min_{L, A} \ \sum_{i=1}^{N} W(\sigma_i(L)) + \tau \|A\|_1 \quad \text{s.t.} \quad \|Y - L - D_x A\|_F \le \delta \tag{9}$$
where $\sigma_i(\cdot)$ is the $i$th singular value, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N \ge 0$ and $N = \min(N_r, N_a)$. $W(\cdot)$ is a nonconvex function that operates on the singular values to achieve a more accurate approximation of the true rank. Note that if $W(\sigma_i(L)) = \sigma_i(L)$, then $\sum_i W(\sigma_i(L))$ degenerates to the nuclear norm $\|L\|_*$, and Formula (9) degenerates to Formula (6). As for the over-completed dictionary $D_x \in \mathbb{C}^{N_r \times K}$, shown in Figure 3, it is constructed from the time-shifted versions of the transmitted signal $s(t)$. In our LFM SAR situation, the transmitted signal $s(t)$ is the LFM signal. $K$ represents the number of atoms in the dictionary, and each atom is constructed as $s(t - k\Delta t)$. Generally, to ensure that the constructed dictionary can effectively represent the entire image scene, $K$ should be preset as the number of range bins. Moreover, $\Delta t$ denotes the time resolution, which affects the accuracy and computation burden during the optimization process.
Referring to the preceding literature [29,34], on the premise of not introducing too much computation burden and concurrently matching the system bandwidth, $\Delta t$ is chosen as $1/F_r$, where $F_r$ denotes the system sampling rate.
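The dictionary construction described above can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: the helper names `lfm_pulse` and `build_dictionary`, the baseband chirp parameterization, and the zero-padded (non-circular) shift with truncation at the end of the range window are all hypothetical choices. With a time step of one sampling interval, atom k is the transmitted pulse delayed by k samples.

```python
import numpy as np

def lfm_pulse(n_samples, fs, bandwidth):
    """Baseband LFM (chirp) pulse sampled at rate fs (hypothetical parameterization)."""
    duration = n_samples / fs
    t = np.arange(n_samples) / fs - duration / 2.0
    chirp_rate = bandwidth / duration
    return np.exp(1j * np.pi * chirp_rate * t**2)

def build_dictionary(pulse, n_range, n_atoms):
    """Column k holds the pulse delayed by k samples, i.e. s(t - k / fs)."""
    D = np.zeros((n_range, n_atoms), dtype=complex)
    for k in range(n_atoms):
        n_copy = min(len(pulse), n_range - k)   # truncate at the window edge
        if n_copy > 0:
            D[k:k + n_copy, k] = pulse[:n_copy]
    return D

pulse = lfm_pulse(n_samples=64, fs=100e6, bandwidth=50e6)
D_x = build_dictionary(pulse, n_range=256, n_atoms=256)   # K = number of range bins
```

Since the dictionary depends only on system parameters, it can be built once and reused for every data block.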
By introducing the Lagrange multiplier, $Z$, Formula (9) can be rewritten as:
$$\mathcal{L}(L, A, Z, \mu) = \sum_{i=1}^{N} W(\sigma_i(L)) + \tau \|A\|_1 + \langle Z, Y - L - D_x A \rangle + \frac{\mu}{2} \|Y - L - D_x A\|_F^2 \tag{10}$$
where $\langle \cdot, \cdot \rangle$ represents the inner product and $\mu$ is a hyperparameter in the Lagrange multiplier framework. To solve the above optimization problem, ADMM is used to split it into the following two subproblems:
$$L_{k+1} = \arg\min_{L} \ \sum_{i=1}^{N} W(\sigma_i(L)) + \frac{\mu_k}{2} \left\| L - \left( Y - D_x A_k + Z_k/\mu_k \right) \right\|_F^2 \tag{11}$$
$$A_{k+1} = \arg\min_{A} \ \tau \|A\|_1 + \frac{\mu_k}{2} \left\| Y - L_{k+1} - D_x A + Z_k/\mu_k \right\|_F^2 \tag{12}$$
followed by the update of the Lagrange multiplier:
$$Z_{k+1} = Z_k + \mu_k \left( Y - L_{k+1} - D_x A_{k+1} \right) \tag{13}$$
and the hyperparameter:
$$\mu_{k+1} = \min(\alpha \mu_k, \mu_{\max}) \tag{14}$$
where $\alpha$ is the step size for updating $\mu$, and $\mu_{\max}$ is the preset upper bound of $\mu$. The solutions of the L subproblem and the A subproblem will be described in detail in the following sections.

ADMM-Based Solution Derivation for the DNLRM Framework
To solve the L subproblem (11), we can substitute it with
$$L_{k+1} = \arg\min_{L} \ \sum_{i=1}^{N} \omega(\sigma_i)\, \sigma_i(L) + \frac{\mu_k}{2} \left\| L - \left( Y - D_x A_k + Z_k/\mu_k \right) \right\|_F^2 \tag{15}$$
where $\omega(\sigma_i) \in \partial W(\sigma_i)$, and $\partial W(\sigma_i)$ denotes the set of all the supergradients at $\sigma_i$. If $W(\cdot)$ is differentiable at $\sigma_i$, then $\nabla W(\sigma_i)$ is the unique supergradient. Since $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N \ge 0$, along with the monotonically nonincreasing property of the supergradient, the following relation can be obtained:
$$0 \le \omega(\sigma_1) \le \omega(\sigma_2) \le \cdots \le \omega(\sigma_N) \tag{16}$$
For the derivation of (15), as well as the definition and properties of the supergradient, see [30]. Note that Equation (16) is the primary condition required to solve the L subproblem.
Then, based on Von Neumann's trace inequality [20], a weighted SVT (WSVT) is considered to obtain a closed-form solution as [37]:
$$L_{k+1} = U_k \, S_{\omega(\cdot)/\mu_k}\!\left( \Sigma_{Y - D_x A_k + Z_k/\mu_k} \right) V_k^H \tag{17}$$
where $U_k$ and $V_k$ are, respectively, the left and right singular matrices of $Y - D_x A_k + Z_k/\mu_k$, and the diagonal elements of the matrix $\Sigma_{Y - D_x A_k + Z_k/\mu_k}$ are the corresponding singular values. $S_{\omega(\cdot)/\mu}(\cdot)$ means the generalized soft-thresholding operator
$$S_{\omega(\cdot)/\mu}(\Sigma) = \operatorname{diag}\left\{ \max\left( \Sigma_{ii} - \omega(\sigma_i)/\mu, \ 0 \right) \right\} \tag{18}$$
Notably, to make the solution (17) feasible, the weighting function $\omega(\cdot)$ should meet the condition required in WSVT, which is the same as Equation (16). This means that the larger singular values correspond to the smaller weights. As expected, when the supergradient of a nonconvex function is chosen as the weighting operator, the monotonically nonincreasing property shown in (16) guarantees that the solution (17) can be obtained.
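The weighted thresholding step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `weighted_svt` is hypothetical, the weights are evaluated at the singular values of the input matrix, and the logarithm-type weight `1 / sigma` used in the example is an assumed form rather than the paper's (8).

```python
import numpy as np

def weighted_svt(M, weight_fn, mu, floor=1e-12):
    """Weighted SVT: shrink singular value sigma_i by weight_fn(sigma_i) / mu.

    weight_fn should be nonincreasing in sigma so that larger singular
    values receive smaller penalties.
    """
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    w = weight_fn(np.maximum(s, floor))        # avoid evaluating the weight at zero
    s_shrunk = np.maximum(s - w / mu, 0.0)     # generalized soft thresholding
    return (U * s_shrunk) @ Vh

# Toy comparison with a constant weight (plain SVT).
M = np.diag([10.0, 1.0])
sv_weighted = np.linalg.svd(weighted_svt(M, lambda s: 1.0 / s, mu=1.0), compute_uv=False)
sv_uniform = np.linalg.svd(weighted_svt(M, np.ones_like, mu=1.0), compute_uv=False)
```

In this toy case the large singular value is reduced from 10 to 9.9 with the nonincreasing weight but to 9 with the constant weight, while the small one is removed in both cases.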
Taking the aforementioned factors into consideration, the nonconvex functions are introduced to approximate the rank function more accurately instead of the nuclear norm. In our RFI suppression mission, such a dramatic change of the supergradient-based weighting function allows the penalties of the larger singular values to vary more broadly from the smaller ones. This explains why we chose the nonconvex functions for the rank function approximation, and the corresponding supergradients for the weighted penalty calculation.
For the A subproblem (12), we first define an auxiliary function from its quadratic term and then approximate it via the second-order Taylor expansion, where $\beta$ is a parameter related to the spectral radius $\rho(D_x^H D_x)$ and satisfies $\beta > \rho(D_x^H D_x)$ [19]. Note that the dictionary, $D_x$, of the useful signal can be designed in advance according to the SAR system parameters, so both $D_x$ and $\beta$ can be calculated offline to reduce the computation time. With this approximation, Formula (12) can be rewritten in a form that is solved by the well-known soft-thresholding algorithm, which yields a closed-form solution [38,39], with $\operatorname{sign}(P_{i,j})$ taken in the modified shrinkage manner as $P_{i,j}/|P_{i,j}|$ for the complex matrix. Finally, as a termination condition of the optimization when a satisfactory solution is available, a stopping relationship needs to be satisfied with tolerance $\varepsilon$, where $\varepsilon$ is a fairly small positive number; $1 \times 10^{-4}$ is chosen in this paper. To sum up, the whole process is tabulated as the pseudocode in Algorithm 1 for an intuitive view. Once $L^*$ and $A^*$ are successively obtained, they are used for RFI suppression. Generally, there are two ways to suppress RFI. The first one directly uses the recovered useful signal $D_x A^*$ as the output SAR signal after RFI suppression. The second one subtracts the recovered RFI from the raw data, namely $Y - L^*$. The second way, analyzed in [21], shows a more robust RFI suppression performance, so it is also adopted in the present paper. With the separated useful signal matrix, SAR imaging is subsequently carried out. Finally, the imaging result after RFI suppression is evaluated to quantitatively describe the performance. The corresponding flowchart of the processing procedure is illustrated in Figure 4.
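The overall iteration can be sketched as a short ADMM loop. This is a sketch under stated assumptions, not a reproduction of Algorithm 1: the function names, the default parameter values, the specific linearized A update (`P = A + D^H R / beta` followed by complex soft thresholding with threshold `tau / (mu * beta)`), and the relative-residual stopping rule are all illustrative choices made here.

```python
import numpy as np

def soft_threshold_complex(P, thresh):
    """Shrink magnitudes by thresh while keeping the phase P / |P|."""
    mag = np.abs(P)
    scale = np.maximum(mag - thresh, 0.0) / np.maximum(mag, 1e-12)
    return P * scale

def dnlrm_sketch(Y, D, weight_fn, tau=0.1, mu=1.0, alpha=1.1, mu_max=1e6,
                 eps=1e-4, max_iter=200):
    """Hypothetical ADMM loop; Y is a complex data matrix, D the dictionary."""
    L = np.zeros_like(Y)
    A = np.zeros((D.shape[1], Y.shape[1]), dtype=Y.dtype)
    Z = np.zeros_like(Y)
    beta = 1.01 * np.linalg.norm(D, 2) ** 2    # beta > spectral radius of D^H D
    for _ in range(max_iter):
        # L step: weighted singular value thresholding
        U, s, Vh = np.linalg.svd(Y - D @ A + Z / mu, full_matrices=False)
        w = weight_fn(np.maximum(s, 1e-12))
        L = (U * np.maximum(s - w / mu, 0.0)) @ Vh
        # A step: one linearized step followed by complex soft thresholding
        R = Y - L - D @ A + Z / mu
        P = A + D.conj().T @ R / beta
        A = soft_threshold_complex(P, tau / (mu * beta))
        # Multiplier and penalty updates
        resid = Y - L - D @ A
        Z = Z + mu * resid
        mu = min(alpha * mu, mu_max)
        if np.linalg.norm(resid) <= eps * np.linalg.norm(Y):
            break
    return L, A
```

A possible call is `L_hat, A_hat = dnlrm_sketch(Y, D_x, lambda s: lam / s)`, with `lam` standing for an assumed weighting parameter, after which the second suppression route described above corresponds to `Y - L_hat`.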


The Proposed Adaptive Selection Scheme for Parameter λ
As previously described, WSVT is used for solving the L subproblem, and the corresponding closed-form solution is given in (17). In our DNLRM framework, the weighting function, ω(·), is substituted by the supergradients of different nonconvex functions. These supergradients have two influential parameters, λ and γ. The former scales the weighted values proportionally, whereas the latter determines the curvature of the weighting curve. In general, so that the singular values belonging to the RFIs receive a smaller penalty while those corresponding to the useful signal receive a larger one, γ is chosen as 0.5 in our experiments for a proper and balanced performance. Once γ is chosen, the parameter λ ultimately determines the scale of the weighting function. Recall the relationship shown in (18): the generalized soft-thresholding operator compares each singular value with the ω(·)-related threshold.
If ω(·) becomes too large, even the large singular values will be excessively punished, which leads to the over-punishment of RFIs. Conversely, if ω(·) is so small that none of the singular values receives enough of a penalty, part of the useful signal is absorbed into the low-rank component and removed with it, which causes severe degradation of the imaging result. Therefore, the appropriate value of ω(·), determined by parameter λ, is crucial to the implementation of the proposed DNLRM method.
Considering these facts, a modified boxplot-based adaptive selection scheme for parameter λ is proposed in the present paper, which treats the selection as an outlier detection problem. For outlier detection, the 3σ rule and the Z-score method are based on the assumption that the data obey the normal distribution, but the actual data do not always fit the distribution as expected. Their criteria for judging the outliers are directly based on the mean value and the standard deviation of the data. However, both statistics are not robust, so the outliers themselves can have a great impact on them. On the contrary, the boxplot merely depends on the actual data itself, without assuming that the data obey a particular distribution. The boxplot judges the outliers via the quartiles and the interquartile range (IQR), which are more robust [40] and thus more effective in identifying the outliers. According to [41], a boxplot consists of the maximum, minimum, lower quartile (Q1), median (Q2), and upper quartile (Q3). The location of the ith quartile Qi can be calculated as i(M + 1)/4, where M represents the sequence length, namely the number of singular values in our RFI suppression mission. For example, Q1 is located at (M + 1)/4, and Q2 and Q3 are obtained similarly. Subsequently, IQR is calculated by subtracting Q1 from Q3, which describes the main body of the sequence, whose elements have similar values. Therefore, a value is detected as an outlier once it is larger than the threshold Q3 + 3IQR.
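The traditional boxplot rule described above can be sketched as follows. The helper names `quartile` and `boxplot_threshold` are hypothetical, and linear interpolation between neighbouring samples is assumed for fractional quartile locations, which the text does not specify.

```python
import numpy as np

def quartile(sorted_vals, i):
    """i-th quartile at 1-based location i * (M + 1) / 4, linearly interpolated."""
    M = len(sorted_vals)
    pos = min(max(i * (M + 1) / 4.0, 1.0), float(M))   # clamp to a valid location
    lo = int(np.floor(pos))
    frac = pos - lo
    hi = min(lo + 1, M)
    return (1.0 - frac) * sorted_vals[lo - 1] + frac * sorted_vals[hi - 1]

def boxplot_threshold(values, k=3.0):
    """Upper outlier threshold Q3 + k * IQR of a sequence, e.g. singular values."""
    v = np.sort(np.asarray(values, dtype=float))        # ascending order
    q1, q3 = quartile(v, 1), quartile(v, 3)
    return q3 + k * (q3 - q1)

# Toy singular-value sequence with one dominant component.
sv = np.array([100.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0])
threshold = boxplot_threshold(sv)
is_strong = sv > threshold     # flags only the dominant singular value here
```

For this toy sequence, Q1 = 2.25 and Q3 = 6.75, so the threshold is 6.75 + 3 × 4.5 = 20.25.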
The above calculation is the traditional boxplot-based outlier detection, which is not appropriate if directly introduced into our adaptive selection scheme. Therefore, we improved it to fit our RFI suppression mission. Recall that the core idea of the proposed DNLRM method is to obtain the weighted penalty for singular values via the supergradient of the nonconvex function. In order to make the supergradient-weighted penalties work effectively, the penalty curve needs to reach the order of magnitude of the singular values. Fortunately, the parameter λ can be used to raise the weighted value proportionally, so the core problem becomes the determination of the appropriate proportion. As considered above, the threshold Q3 + 3IQR can be used to set the boundary that distinguishes the strong RFIs. Based on this, we suggest extending the boundary along the decreasing direction of the singular values for better discrimination, for two reasons. The first one is that the threshold Q3 + 3IQR only considers the extremely strong outliers, namely the strong RFIs, whereas the relatively weaker RFIs may fall into the acceptable interval and require further identification by the threshold extension. The second one is closely related to the proposed DNLRM method. As shown in Figure 5, the orange curve represents the supergradient-based penalty curve with the threshold Q3 + 3IQR, where the small singular values are protected but the larger ones are still over-punished. After the threshold extension, the intersection of the singular value curve and the orange curve moves down (see the yellow star), and the green penalty curve is obtained. Accordingly, the penalties for the large singular values are further reduced and sufficient punishment for the small ones is ensured simultaneously. Through the threshold extension, the phenomenon that singular values of relatively weaker RFIs diffuse into the ones belonging to the useful signal is also considered.
Figure 5. Relationship between the weighted penalty curves and singular values. The blue curve represents the singular values, and the orange and the green ones, respectively, represent the weighted penalty curves obtained by the boxplot without and with threshold extension. The curves in the red rectangle of (a) are enlarged in (b) for a clearer view.
Specifically, we recommend the quantitative threshold extension scheme (25), which is built from IQR and the ratio $\sigma_{\text{mean}}/\sigma_{\text{median}}$, where $\sigma_{\text{mean}}$ and $\sigma_{\text{median}}$ are the mean and the median of the singular values of $Y - D_x A + Z/\mu$. Both are calculated before performing WSVT (17). The rationale of the above scheme is twofold. First, IQR represents the interval where the sequence values remain stable, so it can be considered as the proper baseline for threshold extension. Second, the ratio of the mean to the median expresses the fluctuation of a sequence at a moderate level. Furthermore, as the signal-to-interference ratio (SIR) decreases, both the mean and the median become larger while the numerator changes more dramatically, hence the increasing ratio and the correspondingly more relaxed threshold shown in (25). Importantly, the relationship between SIR and the threshold is worth considering. When the RFI level increases, meaning a lower SIR, the singular values of relatively weaker RFIs become larger. This results in a more distinct diffusion, which demands a more relaxed threshold. Therefore, the statistics-based threshold extension (25) is needed to better adapt to the varying RFIs. Notice the enlarged area in Figure 5; the green curve obtained with a threshold extension, according to (25), is reduced appropriately compared with the orange one with no threshold extension. Correspondingly, the benefits are twofold. The larger singular values obtain an even smaller penalty, and thus a better separation of strong RFIs. The singular values slightly smaller than the intersection are reconsidered, so the corresponding relatively weaker RFIs can be extracted from what would otherwise be assigned to the useful signal.
Finally, a transferring function, T(·), is used to adjust the threshold, Γ, for specific supergradient-based weighting functions to obtain the target parameter, λ. Recalling (8) and assuming that γ = 0.5, we have T(Γ) = 2√Γ for Lp and T(Γ) = Γ for Logarithm. This can be generalized accordingly to other possible weighting functions.
The parameter obtained from the above modified boxplot is taken as the initial value λ_0 for the subsequent adaptive fine search, which finally yields the optimal λ*. Specifically, based on the obtained initial value λ_0, the step size for the adaptive fine search is chosen as one order of magnitude less than λ_0 (i.e., λ_0/10). First, one positive step is taken to obtain λ_0 + λ_0/10, and the corresponding metric is computed and compared with that of λ_0. The search continues in the current direction if the obtained metric is better than that of the initial λ_0; otherwise, the opposite direction is pursued. In this way, the search direction and the corresponding step size are determined. Afterwards, the metric corresponding to the current λ_k is compared with that of the previous choice λ_{k−1}. If a better result is obtained, the search continues with λ_{k+1}; if it is worse, the previous value λ_{k−1} is considered the better one. Then, based on λ_{k−1}, λ_left and λ_right are obtained with ±0.5 times the step size. Their corresponding metrics are calculated and compared with that of λ_{k−1} to determine the final optimal parameter λ*. Note that no human intervention is needed in the entire search process. Both the effectiveness and the convenience of the adaptive selection scheme are reflected in the subsequent experiments.
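The adaptive fine search described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `adaptive_fine_search`, the `metric` callable, and the `max_steps` safeguard are hypothetical, and the sketch assumes that a smaller metric value is better (as for NMSE or image entropy).

```python
from typing import Callable


def adaptive_fine_search(lam0: float,
                         metric: Callable[[float], float],
                         max_steps: int = 50) -> float:
    """Search around lam0 for the lambda with the smallest metric value.

    Assumption: `metric` is lower-is-better; `max_steps` is a hypothetical
    safeguard that the text does not mention.
    """
    step = lam0 / 10.0  # one order of magnitude below lam0
    best_lam, best_val = lam0, metric(lam0)

    # Probe one positive step to decide the search direction.
    trial = lam0 + step
    trial_val = metric(trial)
    if trial_val < best_val:
        direction = 1.0
        best_lam, best_val = trial, trial_val
    else:
        direction = -1.0

    # Walk in the chosen direction until the metric stops improving.
    for _ in range(max_steps):
        cand = best_lam + direction * step
        if cand <= 0.0:  # keep lambda positive
            break
        cand_val = metric(cand)
        if cand_val < best_val:
            best_lam, best_val = cand, cand_val
        else:
            break  # the previous lambda is the better one

    # Half-step refinement on both sides of the retained lambda.
    for cand in (best_lam - 0.5 * step, best_lam + 0.5 * step):
        if cand > 0.0:
            cand_val = metric(cand)
            if cand_val < best_val:
                best_lam, best_val = cand, cand_val
    return best_lam
```

In practice, `metric` would run the suppression with the candidate λ and score the result, so each evaluation is costly; the scheme keeps the number of evaluations small by fixing the step at λ_0/10 and refining only once.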

Results
In this section, a series of experiments were conducted to verify the effectiveness and practicability of the proposed framework. First, the measured RFI was added to RFI-free real SAR data, which originally did not contain interference, to analyze the influence of different SIRs on the methods under comparison. Both a sparse scene and a dense scene were considered so that the RFI suppression performance could be compared comprehensively. Sparse scenes refer to imaging scenes with fewer scattering points but obvious bright spots. In contrast, imaging scenes with relatively homogeneous and densely distributed scattering points were considered as dense scenes. To quantitatively evaluate the recovered images after RFI suppression, the normalized mean square error (NMSE) between S and Ŝ was used as the metric, where S and Ŝ are the imaging results obtained, respectively, from the original RFI-free real SAR data and from the RFI suppression result of the interfered data. The images are generated via the ωKA imaging algorithm. Since the NMSE measures the difference between the processed image and the reference one, a smaller value represents a better RFI suppression performance. In addition, considering that the purpose of RFI suppression is to improve the quality of the SAR image, both the image entropy and the image contrast were also used to quantitatively analyze the image quality after RFI suppression. The image entropy is computed from P, the normalized histogram of the SAR image, over the gray levels I. Additionally, referring to [42], the image contrast (29) is modeled from the deviation of |F(n_a, n_r)| from mean(|F(n_a, n_r)|) with a mean-based normalization, where F(n_a, n_r) denotes a point in the SAR image and mean(·) denotes the mean value over the analyzed image.
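The three image-quality metrics above can be sketched in code as follows. This is a hedged illustration under commonly used definitions, which may differ from the paper's exact formulas in normalization or scaling: NMSE is taken as the squared Frobenius error divided by the squared Frobenius norm of the reference, entropy as the Shannon entropy (base 2) of a 256-bin magnitude histogram, and contrast as the standard deviation of the magnitude divided by its mean. All function names are hypothetical.

```python
import numpy as np


def nmse(s_hat, s_ref):
    """Assumed form: ||S_hat - S||_F^2 / ||S||_F^2 (smaller is better)."""
    s_hat = np.asarray(s_hat)
    s_ref = np.asarray(s_ref)
    return float(np.linalg.norm(s_hat - s_ref) ** 2
                 / np.linalg.norm(s_ref) ** 2)


def image_entropy(image, levels=256):
    """Shannon entropy of the normalized magnitude histogram.

    Assumption: 256 gray levels and base-2 logarithm.
    """
    mag = np.abs(np.asarray(image)).ravel()
    hist, _ = np.histogram(mag, bins=levels)
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing
    return float(-np.sum(p * np.log2(p)))


def image_contrast(image):
    """Assumed form: std(|F|) / mean(|F|) (larger means sharper)."""
    mag = np.abs(np.asarray(image))
    return float(np.sqrt(np.mean((mag - mag.mean()) ** 2)) / mag.mean())
```

With these definitions, a blurred image tends to have a flatter histogram (higher entropy) and less spread about its mean (lower contrast), which matches the qualitative behavior the metrics are used for here.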
When an image becomes blurred due to the interference, the corresponding image entropy increases and the image contrast decreases. Accordingly, after RFI suppression, the image quality can be improved, which corresponds to a smaller image entropy and a larger image contrast. Subsequently, to further verify the performance of the proposed DNLRM method in the actual situation, experiments on RFI-contaminated real SAR data were also carried out. RFI-contaminated data means that RFIs were already contained in the received signal when the data were collected, rather than being an artificial combination of RFI-free data with the measured RFI. These two experimental configurations are realistic and representative, and they provide a strong basis for comparing the performance of different RFI suppression methods. Finally, the influence of the model parameter of the DNLRM method on the RFI suppression performance was analyzed.

RFI-Free Real SAR Data with Measured RFI
To verify the performance of the proposed DNLRM method, the RFI-free real SAR data was combined with the measured RFI to generate the interfered data for analysis. The power of the RFI was adjusted so that the SIR reached −20 dB, −15 dB, and −10 dB, respectively. Moreover, for a more comprehensive analysis of the RFI suppression performance, both the sparse scene and the dense scene, with an image size of 1536 × 2048, were considered. The specific parameters of the RFI-free real SAR data, which come from the RADARSAT-1 SAR system, are listed in Table 1.

RFI Suppression Analysis for the Sparse Scene
In this section, the influence of RFI on the sparse scene is analyzed, and the performance of the different RFI suppression methods is compared. Figure 6a illustrates the RFI-free SAR imaging result, with the range and azimuth lying along the horizontal and vertical directions, respectively. The scene of the port and several ships is clearly visible, and so are the enlarged parts. Once RFI with a certain SIR, −20 dB here, is added to the RFI-free data, the imaging result is shrouded by misty streaks, making the target scene completely unrecognizable, as shown in Figure 6b. To mitigate the effects of RFI, several methods, namely the NF method, the ESP method, the RNN method, the DLRM method, the proposed DNLRM method with the nonconvex function selected as Lp and Logarithm, and the corresponding ADNLRM method, are used for the performance comparison. Note that the hyperparameters τ and µ not only act as tradeoff factors, but also participate in calculating the penalty. Therefore, both play an important part in the implementation of the optimization-related methods, including the proposed method. In our experiments, τ and µ are, respectively, chosen as 1/max(N_a, N_r) and 200/‖Y‖_F² according to [20]. Moreover, α and µ_max are, respectively, 1.2 and 1 × 10^6.
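The SIR adjustment used to build the interfered data can be sketched as a simple power scaling. This is an illustrative sketch, assuming the SIR is defined as the ratio of the mean power of the RFI-free echo to the mean power of the RFI, in dB; the function name `scale_rfi_to_sir` is hypothetical and the text does not state the exact definition used.

```python
import numpy as np


def scale_rfi_to_sir(signal, rfi, sir_db):
    """Scale `rfi` so that 10*log10(P_signal / P_rfi) equals `sir_db`.

    Assumption: powers are mean squared magnitudes over all samples.
    """
    signal = np.asarray(signal)
    rfi = np.asarray(rfi)
    p_sig = np.mean(np.abs(signal) ** 2)
    p_rfi = np.mean(np.abs(rfi) ** 2)
    # Target RFI power is p_sig / 10^(SIR/10); the gain is an amplitude factor.
    gain = np.sqrt(p_sig / (p_rfi * 10.0 ** (sir_db / 10.0)))
    return gain * rfi
```

The interfered data would then be `signal + scale_rfi_to_sir(signal, rfi, -20.0)` for the −20 dB case, and similarly for −15 dB and −10 dB.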
Taking the case of −20 dB SIR as the example, detailed comparisons are given below. After processing by the NF method, there are not only spectrum fractures, but also a large amount of remaining RFI sidelobes, which leaves the imaging result blurred by severe RFI remnants, as shown in Figure 7c. The ESP method suppresses the RFI by constructing the RFI subspace, onto which the received signal is projected to separate the RFI from the useful signal. Because RFIs are a combination of many components with varying amplitudes, the subspace is often difficult to classify accurately. The orthogonality of the subspaces is consequently difficult to guarantee, which causes the distinct streaks shown in Figure 7d due to the residual RFI. Figure 7e, processed via the RNN method, illustrates a large performance improvement compared with the first two. The low-rank property of RFI is exploited, and the over-punishment problem is also considered by introducing a weighted penalty associated with the reciprocal of the singular values. However, in the RNN method, the non-sparse useful signal is directly used as the sparse regularization. The insufficient protection of the useful signal leads to serious trailing of the imaging targets, as clearly shown in the enlarged area in Figure 8e, which leaves room for performance improvement. In contrast, the DLRM method uses the over-completed dictionary, in which the useful signal has a sparser representation, and the corresponding sparse coefficients are employed for regularizing the useful signal. Unfortunately, the over-punishment problem is not considered, and the singular values are treated with the same penalty when updating the low-rank matrices. As a result, parts of the RFIs are decomposed into the useful signal matrix, leaving the imaging result shrouded in noise-like residual RFIs, as shown in Figure 7f.
In Figure 7g, the proposed DNLRM method shows a superior RFI suppression performance with few RFI remnants and higher signal fidelity. The reason is that the proposed DNLRM method simultaneously considers the appropriate sparse regularization, through the over-completed dictionary, and a more reasonable punishment scheme for the singular values. Here, larger singular values are treated with smaller penalties via the supergradient of the chosen nonconvex function. Furthermore, Figure 7g,h display a similar performance, and it is difficult to tell the difference with the naked eye. This is because both are based on the proposed DNLRM method and differ only in the choice of the nonconvex function. Specifically, Figure 7g is obtained with Lp, thus Lp-DNLRM, and Figure 7h is obtained with Logarithm, thus Log-DNLRM. The negative correlation between the supergradient and the singular values guarantees small penalties for the large singular values, which ensures fewer RFI remnants. Further, the results of the proposed adaptive selection scheme shown in Figure 7i,j are, respectively, based on Lp and Logarithm, thus Lp-ADNLRM and Log-ADNLRM. In addition to the superior RFI suppression performance, the trailing phenomenon is accordingly reduced, which can be clearly seen from the enlarged figures in Figure 8.
Moreover, compared with DNLRM using the corresponding nonconvex function, ADNLRM achieves comparable performance while requiring no human intervention in searching for the parameter λ. This facilitates the adaptation to varying RFI suppression tasks. To observe the difference more clearly, the area within the red rectangle in Figure 7 is enlarged at the corresponding position in Figure 8. The quantitative comparison of the RFI suppression performance is listed in Tables 2-4. After a comprehensive evaluation, the proposed method achieves a satisfactory performance improvement.
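The weighted penalty discussed above, in which larger singular values receive smaller penalties through the supergradient of the nonconvex function, can be sketched as one weighted singular value thresholding step. This is an illustrative sketch, not the paper's WSVT (17): the nonconvex functions are assumed to be g(x) = λx^p for Lp and g(x) = λ log(γx + 1)/log(γ + 1) for Logarithm, the weights are evaluated at the singular values of the input matrix itself, and all function names and default values are hypothetical.

```python
import numpy as np


def supergradient_lp(sigma, lam, p=0.5, eps=1e-12):
    """Supergradient of the assumed Lp penalty lam * x**p at x = sigma."""
    return lam * p * np.power(sigma + eps, p - 1.0)


def supergradient_log(sigma, lam, gamma=0.5):
    """Supergradient of the assumed penalty lam*log(gamma*x+1)/log(gamma+1)."""
    return lam * gamma / ((gamma * sigma + 1.0) * np.log(gamma + 1.0))


def weighted_svt(matrix, mu, supergradient):
    """Shrink each singular value by its own weight divided by mu.

    `supergradient` maps the singular values to per-value weights.
    Returns the low-rank estimate and the shrunk singular values.
    """
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    weights = supergradient(s)  # decreasing in s: large values, small penalty
    s_shrunk = np.maximum(s - weights / mu, 0.0)
    return (u * s_shrunk) @ vh, s_shrunk
```

For example, `weighted_svt(M, mu, lambda s: supergradient_lp(s, lam))` leaves a dominant singular value almost untouched while removing small ones, whereas plain SVT would subtract the same constant from all of them.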

RFI Suppression Analysis for the Dense Scene
Comparative experiments for the dense scene are considered here. The imaging result of the RFI-free data is shown in Figure 9a, and Figure 9b illustrates the image when it is contaminated by RFI with the SIR adjusted to −20 dB. The size of the figures is 1536 × 2048. Due to the presence of strong RFIs, the texture information of the target city area is completely invisible. The enlarged area, which is originally an area of water, only presents severe RFI stripes after being contaminated.

For RFI mitigation, the methods and parameter configurations used for the sparse scene are used here. The −20 dB case shown in Figure 10 is used as the example for a detailed analysis, and the area within the red rectangle is enlarged in Figure 11. By comparison, it is obvious that many streaks are still left in the imaging results after the use of the traditional NF and ESP methods. The NF method can suppress parts of the obvious RFI with greater power, but fails to deal with the complex interferences. As a result, as shown in Figure 10c, the imaging result is still covered by severe RFI. As for the ESP method, the subspace is first constructed based on the RFI-contaminated received signal. Then, based on the amplitude difference between the singular values corresponding to the RFI and to the useful signal, respectively, the separation of their subspaces is completed. However, the complexity and diversity of RFI make it difficult to achieve an accurate subspace separation, resulting in the inevitable RFI remnants shown in Figure 10d.
Furthermore, the optimization-based RNN method and DLRM method bring some improvements, but do not solve the problem of insufficient RFI suppression because of their respective inaccurate regularization. The former, the RNN method, considers the low-rank property of RFI for the signal description on the one hand; on the other hand, the weighted penalty is also employed by means of the reciprocal of the singular values. Nevertheless, the noise-like and non-sparse nature of the useful signal is not given proper consideration. As shown in Figure 10e, the direct usage of the non-sparse useful signal as the sparse regularization leaves the result in need of further optimization. For the latter, the DLRM method, the over-completed dictionary is used to give the useful signal a sparser expression. Unfortunately, the SVT algorithm with the constant penalty is used to solve the low-rank minimization problem. The neglected improper punishment problem leads to parts of the RFIs being left in the useful signal, which results in the image still being covered by noise-like RFIs, as illustrated in Figure 10f. Conversely, because both the appropriate selection of the sparse regularizer and the logical punishment of the singular values are taken into account concurrently, the proposed DNLRM method achieves a considerably improved performance. Obtained with the nonconvex functions Lp and Logarithm, respectively, Figure 10g,h show a more sufficient RFI suppression performance while properly protecting the useful signals. Moreover, benefiting from the proposed adaptive searching scheme for λ, a performance comparable to that of DNLRM can be obtained automatically by ADNLRM, where the parameter searching process requires no human participation. Notably, this is true for both nonconvex functions, Lp and Logarithm, which verifies the robustness of ADNLRM, as revealed in Figure 10i,j with Lp-ADNLRM and Log-ADNLRM, respectively.
The above performance comparisons are shown in greater detail in the enlarged view in Figure 11. As can be observed in Figure 11, the outline of the area of water gradually emerges with the improving performance, and the noise in the dark area accordingly decreases. Among all the mentioned methods, the Log-ADNLRM method achieves the most outstanding RFI suppression result, as shown in Figure 11j, which is closest to the ground truth. It is worth noting that the scatterers in the dense scene are more densely distributed than those in the sparse scene, thus presenting a more severe influence on each other in the optimization process. For a quantitative analysis, see Tables 5-7. In a comprehensive view, the proposed method is verified to be superior.
Figure 10. Dense scene: imaging results with SIR being −20 dB. Both (a) and (b) are the imaging results acquired by the RFI-free real SAR data before and after being interfered with by the measured RFI, respectively. The imaging results after RFI suppression are obtained with (c) the NF method, (d) the ESP method, (e) the RNN method, (f) the DLRM method, (g) the Lp-DNLRM method, (h) the Log-DNLRM method, (i) the Lp-ADNLRM method, and (j) the Log-ADNLRM method.
Figure 11. Dense scene: imaging results of the corresponding enlarged area in Figure 10 with SIR being −20 dB.

RFI Suppression Analysis against Different SIRs
In this section, the RFI suppression performance of the various methods under different SIRs is analyzed. For the sparse scene, the quantitative metrics are given in Tables 2-4. Similar comparisons for the dense scene are given in Tables 5-7. As the SIR decreases, the details of the imaging scene gradually deteriorate, and the SAR image is gradually dominated by RFI streaks, leaving little valuable information. After performing RFI suppression with the different methods, we obtain different levels of RFI suppression performance against the varying SIRs.
For the lateral comparison, comprehensively considering the evaluation results under the three metrics, namely NMSE, image entropy, and image contrast, the proposed method was verified to achieve a superior RFI suppression performance. Meanwhile, some details need to be stated. Compared with the other methods, the performance improvement of the DNLRM-related methods increased as the SIR decreased, whereas at a larger SIR the advantages of the proposed method were less obvious. Taking RNN as the example, the reasons can be explained in the following three aspects. Firstly, due to the direct usage of the non-sparse useful signal as the sparse regularization in the RNN method, distortion occurs in the recovered useful signal, and certain jitters appear in the spectrum. A larger SIR, for instance −10 dB, corresponds to moderate interference; the spectrum of the processed signal is thereby less affected when suppressing RFI, and the influence of RNN processing on the recovered signal is relatively small. As the SIR worsens, the received signal is covered by more severe RFI, and its overall spectrum is increasingly affected. When pursuing a more adequate suppression of severe RFI, the signal spectrum is also more seriously damaged due to the direct usage of the non-sparse useful signal in RNN, which brings nonnegligible errors to the recovered useful signal. Secondly, for moderate RFI, fewer large singular values diffuse into the ones that belong to the useful signal. The RFIs are therefore easier to distinguish from the useful signal, even though the useful signal is not in its sparser form in RNN. Finally, the core consideration in RFI suppression is to suppress the RFI as much as possible on the premise of protecting the useful signal.
However, the measured RFI signal was a mixture of interference signals and other noise signals obtained from the actual complex electromagnetic environment, which contained RFI with weaker low-rank properties and uncontrollable noise. These uncontrollable factors correspondingly influence the performance of RFI suppression. Additionally, although DLRM can achieve a lower image entropy at larger SIRs, its other two metrics are obviously worse than those of the proposed method, and the visual effect of the RFI suppression result using the DLRM method is also inferior. These results are due to the improper punishment problem caused by the usage of the constant penalty in DLRM. Furthermore, ADNLRM can achieve a performance comparable to that of DNLRM. Considering the additional advantage of not requiring human intervention, ADNLRM is more practical and better suited to the actual variable RFI environment.
For a longitudinal comparison, the three employed metrics showed a general trend of gradual deterioration as the SIR declined, and two reasons account for this phenomenon. On the one hand, the signal obtained in the listen-only mode was actually the combination of RFIs and the noise from the real-time working environment. As the SIR decreased, the noise accordingly increased, making the image after RFI suppression still different from the original RFI-free imaging result. On the other hand, the measured RFI was in practice a mixture of various forms of interference, which was possibly not strictly low-rank. This caused the larger singular values to diffuse along the decreasing direction, making the low-rank decomposition more challenging and thus leaving RFI remnants. Consequently, for the decreasing SIR, the energy of the residual RFIs and noise gradually increased. The difference between the original image and the RFI suppression result accordingly grew, hence the worse metrics. Notably, as shown by the threshold extension for the diffusion effect in Figure 5, the fact that the RFIs are not strictly low-rank was partly taken into account in the proposed ADNLRM method. This advantage in the implementation principle of the ADNLRM method ensured its RFI suppression performance. The above observations apply to both the sparse case and the dense case. Overall, the latter presented a worse performance because of the more serious interactions of the densely distributed scatterers.

RFI-Contaminated Real SAR Data
In order to further verify the practicability of the proposed method, experiments based on RFI-contaminated real SAR data are analyzed here. Unlike the previous case, RFI-contaminated data means that the SAR signal was already contaminated by the RFIs when the echoes were received, rather than being a combination of RFI-free data with the measured RFI. Such a situation represents the most realistic electromagnetic environment. The RFI-contaminated SAR data was acquired via a P-band airborne SAR system from the Aerospace Information Research Institute of the Chinese Academy of Sciences, with the system parameters listed in Table 8. In reality, many wireless communication devices operate at frequencies around the P-band. Once an SAR system is affected by the electromagnetic interference, severe RFI streaks overlay the imaging result, as shown in Figure 12a, making it difficult to distinguish the scene information, let alone perform the subsequent operations. The size of the imaging result is 1638 × 4096. Notably, there is a large difference between the range resolution and the azimuth resolution, and to make the height-to-width ratio of the imaging results closer to the actual scene, the figures have been reduced by 0.4 times in azimuth.
Note that there was no access to the original clean data for the RFI-contaminated case, so NMSE could not be considered for the performance evaluation in the present study. Instead, the signal-to-noise ratio (SNR) was used for an effective evaluation. Specifically, for the RFI suppression results, the energy of the dark background with less scattering information (marked by a yellow rectangle in Figure 12) is chosen as the measurement of the noise, while the useful signal is measured by the energy of the selected feature-rich area (marked by a blue rectangle in Figure 12). Calculating the energy in the form of variance, the SNR is defined in terms of the ratio of σ_s² to σ_n², where σ_s² and σ_n² respectively represent the variance of the selected useful signal region and that of the noise area. Clearly, a larger SNR means less residual interference energy, and thus a better RFI suppression result. Meanwhile, image entropy and image contrast were also considered to analyze the image quality. After RFI suppression, an improved image quality corresponds to a smaller image entropy and a larger image contrast.
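The region-based SNR described above can be sketched as follows. This is a hedged illustration: it assumes the SNR is expressed in decibels as 10·log10(σ_s²/σ_n²), computed on the image magnitude, and that each region is an axis-aligned rectangle given as (row_start, row_stop, col_start, col_stop); the function name and the box convention are hypothetical.

```python
import numpy as np


def region_snr_db(image, signal_box, noise_box):
    """SNR in dB from the variances of two rectangular image regions.

    Assumption: dB scaling and magnitude-domain variance; boxes are
    (row_start, row_stop, col_start, col_stop) with exclusive stops.
    """
    mag = np.abs(np.asarray(image))
    r0, r1, c0, c1 = signal_box
    var_signal = np.var(mag[r0:r1, c0:c1])  # feature-rich area
    r0, r1, c0, c1 = noise_box
    var_noise = np.var(mag[r0:r1, c0:c1])   # dark background area
    return float(10.0 * np.log10(var_signal / var_noise))
```

Because both regions are taken from the same processed image, this measure needs no clean reference, which is why it can replace NMSE for the RFI-contaminated data.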
To obtain the original appearance of the contaminated SAR image, a series of methods was used to suppress the RFI, with the corresponding results shown in Figure 12. For a clearer visual comparison, the area marked by the red rectangle is enlarged in Figure 13. After processing by the NF method and the ESP method, the imaging results present the outline of the target scene, but with non-negligible RFI remnants and noise. An improved imaging result is obtained via the RNN method, but residual RFI streaks remain, as can be seen in Figure 13d. The imaging result in Figure 13e, obtained by the DLRM method, which suffers from the improper punishment problem, appears to be shrouded in noise-like residual RFI. As shown in Figure 13f,g, the imaging results obtained via the proposed Lp-DNLRM and Log-DNLRM, respectively, show further improvement, with fewer RFI streaks and less noise. Finally, with the introduced adaptive scheme, the optimal performance was obtained automatically: Figure 13h,i illustrate the RFI suppression results via Lp-ADNLRM and Log-ADNLRM, respectively. These experiments demonstrate the effectiveness and superiority of the proposed method, even in a realistic environment. The quantitative performance analysis in Table 9 shows that the proposed method has the best overall performance.
Note that the original clean data were not available for the RFI-contaminated case, so NMSE could not be used for the performance evaluation in the present study. Instead, the signal-to-noise ratio (SNR) was used. Specifically, for the RFI suppression results, the energy of the dark background with less scattering information (marked by a yellow rectangle in Figure 12) is taken as the measurement of the noise, while the useful signal is measured by the energy of the selected feature-rich area (marked by a blue rectangle in Figure 12). The energy is calculated in the form of variance, so the SNR is defined from the ratio $\sigma_s^2/\sigma_n^2$, where $\sigma_s^2$ and $\sigma_n^2$ represent the variance of the selected useful signal region and of the noise area, respectively. Clearly, a larger SNR means less residual interference energy and thus a better RFI suppression result. Meanwhile, image entropy and image contrast were also considered to analyze the image quality. After RFI suppression, improved image quality corresponds to smaller image entropy and larger image contrast.
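The region-based SNR described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `region_snr_db`, the box layout, the use of the image magnitude, and the decibel scaling `10 * log10(...)` are all assumptions made here, since only the two variances and their roles are stated in the text.

```python
import numpy as np


def region_snr_db(image, signal_box, noise_box):
    """Variance-based SNR between two rectangular regions of an image.

    Each box is (row_start, row_stop, col_start, col_stop); this layout is
    a hypothetical convention chosen for the sketch.
    The dB scaling is an assumption, not taken from the source.
    """
    r0, r1, c0, c1 = signal_box
    signal = np.abs(image[r0:r1, c0:c1])  # magnitude of the feature-rich area
    r0, r1, c0, c1 = noise_box
    noise = np.abs(image[r0:r1, c0:c1])   # magnitude of the dark background

    var_s = np.var(signal)  # variance of the useful-signal region
    var_n = np.var(noise)   # variance of the noise region
    return 10.0 * np.log10(var_s / var_n)


# Toy image: left half plays the role of the signal area, right half the noise area.
toy = np.zeros((4, 8))
toy[:, :4] = np.tile([0.0, 10.0, 0.0, 10.0], (4, 1))  # variance 25
toy[:, 4:] = np.tile([0.0, 1.0, 0.0, 1.0], (4, 1))    # variance 0.25
toy_snr = region_snr_db(toy, (0, 4, 0, 4), (0, 4, 4, 8))
```

With the same two boxes applied to every suppression result, a larger returned value indicates less residual interference energy in the background region.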
Figure 13. Enlarged imaging results of the area marked by the red rectangle in Figure 12. The size of each panel is 285 × 285.
In addition, to illustrate the applicability of the proposed method to different RFI suppression tasks, another set of comparative experiments based on RFI-contaminated real SAR data was carried out. The imaging results are shown in Figure 14 and enlarged in Figure 15, and the quantitative comparison is listed in Table 10. The marked boxes in Figure 14 have the same meanings as those in Figure 12. The results are consistent with those of the previous experiment, indicating that the proposed method achieves better RFI suppression performance and adapts to different real RFI situations, which further verifies its practicability.

Model Parameter Analysis
This section examines how the parameter λ in the weighting function affects the RFI suppression performance. The relationship between λ and the NMSE at different SIRs for the sparse scene is simulated and illustrated in Figure 16, from which three observations can be made. First, the optimal NMSE worsens as the SIR deteriorates, for the reasons given earlier in the analysis of RFI suppression performance against different SIRs. Second, as the SIR decreases, the optimal λ (marked by red arrows in Figure 16) also decreases. A worse SIR means more severe RFI, which increases the singular values belonging to the RFI. To separate the RFI from the useful signal more thoroughly, the penalty threshold must be smaller so that less RFI remains, hence a smaller optimal λ. Finally, the optimal performance is achieved only at a single optimal λ, for two reasons. When λ is small, the penalty on the useful signal is insufficient (under-punishment of the useful signal), so part of the useful signal is assigned to the RFI. Once λ increases beyond a certain extent, over-punishment of large singular values occurs, leading to incomplete separation of the RFI. Both effects degrade the image quality and lead to higher NMSEs. Without loss of generality, these regularities hold for both the Lp and logarithm functions, as shown in Figure 16a,b, respectively. In addition, note that the selection of λ depends on the target scene and the corresponding RFI level. Therefore, to make the DNLRM method more intelligent and practical, the adaptive scheme for λ was proposed, as described earlier.

Convergence Analysis
To further verify the effectiveness of the proposed DNLRM framework, its convergence is analyzed theoretically here. The proof is inspired by [30,31] and modified to fit our optimization model, and its main idea is as follows. First, the sequence $\{L_k, A_k, Z_k\}$ generated by the proposed DNLRM framework is proved to be bounded. Second, the Karush-Kuhn-Tucker (KKT) conditions are shown to be satisfied. Finally, the accumulation point of the bounded sequence $\{L_k, A_k, Z_k\}$ is proved to be a stationary point of the original problem. The detailed derivation is given in Appendix A.

Computational Complexity Analysis
In this section, the computational complexity of the mentioned optimization-based approaches is analyzed. Suppose that $N_r > N_a$. For the RNN algorithm [20], one SVD operation with $O(N_r N_a^2)$, one vector inversion with $O(N_a)$, and one soft-thresholding operation with $O(2N_r N_a)$ are needed. Therefore, the total computational complexity of the RNN algorithm is $O(N_r N_a^2 + N_a + 2N_r N_a)$. The original dictionary-based low-rank minimization (DLRM) framework [17] takes one SVD operation with $O(N_r N_a^2)$, accompanied by one additional matrix multiplication related to the dictionary $D_x$ with $O(KN_r N_a)$, to update the low-rank matrix. In addition, one soft-thresholding operation with $O(2N_r N_a)$ is needed, together with two additional matrix multiplications with $O(2KN_r N_a)$, to update the sparse matrix. Therefore, the total computational complexity reaches $O(N_r N_a^2 + 3KN_r N_a + 2N_r N_a)$. The proposed DNLRM algorithm has the same computational complexity as the DLRM algorithm, except for one lightweight operation for calculating the weighted penalty with $O(N_a)$, which gives a total computational complexity of $O(N_r N_a^2 + 3KN_r N_a + 2N_r N_a + N_a)$. The adaptive selection scheme, ADNLRM, is implemented on top of DNLRM. The modified boxplot-based threshold extension is first conducted to obtain the initial value, which is needed only once outside the loop. Then, in the fine searching process, the number of searches determines the final computational work, where the complexity of each attempt is the same as that of DNLRM. The related computational complexities are listed in Table 11. In summary, the proposed DNLRM algorithm effectively improves the RFI suppression performance at the cost of a slight increase in computational complexity.
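To get a feel for how small the extra $O(N_a)$ term is, the operation counts quoted above can be evaluated directly. The sketch below simply transcribes the three expressions; the function names and the example sizes are hypothetical, `k` stands for the dictionary-related dimension $K$ in the text, and the results are order-of-magnitude counts rather than measured run times.

```python
def rnn_cost(n_r, n_a):
    """Operation count for RNN: SVD + vector inversion + soft-thresholding."""
    return n_r * n_a**2 + n_a + 2 * n_r * n_a


def dlrm_cost(n_r, n_a, k):
    """Operation count for DLRM: SVD + three dictionary products + soft-thresholding."""
    return n_r * n_a**2 + 3 * k * n_r * n_a + 2 * n_r * n_a


def dnlrm_cost(n_r, n_a, k):
    """DNLRM adds one weighted-penalty computation of order n_a to DLRM."""
    return dlrm_cost(n_r, n_a, k) + n_a


# Hypothetical sizes, chosen only for illustration (n_r > n_a as assumed in the text).
n_r, n_a, k = 4096, 2048, 8192
relative_overhead = (dnlrm_cost(n_r, n_a, k) - dlrm_cost(n_r, n_a, k)) / dlrm_cost(n_r, n_a, k)
```

For sizes of this order, `relative_overhead` is far below one part in a million, which is consistent with describing the weighted-penalty step as lightweight.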

Relationship with Preceding Optimization Frameworks
Exploiting the low-rank prior, a series of optimization frameworks has been proposed. In most cases, however, the low-rank RFI matrix is updated via an SVT operation that applies the same penalty to all singular values, which results in improper punishment. Moreover, the useful signal, which is noise-like and non-sparse, also degrades the performance when it is used directly as the sparse regularizer.
To overcome these problems, the DNLRM framework proposed in this paper introduces the supergradient of the nonconvex function to calculate the weighted penalty. As a result, larger singular values receive correspondingly smaller penalties and vice versa. Here, we further emphasize the relationship and differences between the DNLRM framework and some preceding ones.
On the one hand, when $D_x$ is ignored and the rank function is approximated by the logarithm function, namely $W(\sigma_i(L)) = \lambda\ln(\sigma_i(L) + \gamma)$ with supergradient $\omega(\sigma_i(L)) = \lambda/(\sigma_i(L) + \gamma)$, the DNLRM framework degenerates to the RNN algorithm. RNN considers the over-punishment problem, but the non-sparsity of the useful signal is not taken into account. The useful signal itself is used directly as the sparse regularizer, hence the restricted recovery performance. When the over-punishment problem is further ignored with $W(\sigma_i(L)) = \sigma_i(L)$, in other words, when the rank function is approximated by the nuclear norm, the DNLRM framework further degrades to $\min_{L,X} \|L\|_* + \tau\|X\|_1$, which is the classic RPCA form. When the low-rank matrix is updated, a constant penalty is applied via the SVT operation, which causes the improper punishment problem and the resulting incomplete RFI separation. On the other hand, if $D_x$ is considered but the rank function is still approximated by the nuclear norm, the DNLRM framework reverts to the DLRM framework, in which the improper punishment problem is likewise ignored.
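The difference between a constant SVT penalty and a supergradient-weighted penalty can be sketched numerically. The weight $\lambda/(\sigma_i + \gamma)$ is taken from the text; everything else is an assumption made for this illustration, in particular the function names, the penalty parameter `mu`, and the shrinkage rule `max(sigma - w / mu, 0)`, which is a common form of weighted singular value thresholding and is not claimed to be the exact DNLRM update.

```python
import numpy as np


def log_supergradient_weights(sigma, lam, gamma):
    """Supergradient of lam * ln(sigma + gamma): larger sigma gives a smaller weight."""
    return lam / (sigma + gamma)


def weighted_svt(matrix, lam, gamma, mu):
    """Shrink each singular value by its own weight (assumed rule: w_i / mu)."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    w = log_supergradient_weights(s, lam, gamma)
    s_shrunk = np.maximum(s - w / mu, 0.0)
    return (u * s_shrunk) @ vh, s, s_shrunk


def uniform_svt(matrix, threshold):
    """Classic SVT: every singular value is shrunk by the same threshold."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    return (u * s_shrunk) @ vh, s, s_shrunk


# Toy matrix with one dominant and one weak singular value.
toy = np.diag([10.0, 1.0])
_, s_w, s_w_shrunk = weighted_svt(toy, lam=1.0, gamma=1.0, mu=1.0)
_, s_u, s_u_shrunk = uniform_svt(toy, threshold=0.5)
weighted_penalty = s_w - s_w_shrunk  # differs per singular value
uniform_penalty = s_u - s_u_shrunk   # identical for all singular values
```

In this toy case the dominant singular value is reduced by about 0.09 under the weighted rule and by 0.5 under the uniform rule, while the weak one is reduced by 0.5 in both, which mirrors the statement that larger singular values receive smaller penalties.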
The above analysis shows that, by simultaneously considering both the improper punishment problem and the non-sparsity of the useful signal, the proposed DNLRM framework outperforms the frameworks used for comparison, especially in severe RFI situations. The proposed method unifies the preceding optimization methods mentioned in the manuscript and further develops their strengths. Notably, this paper mainly focuses on low-rank narrowband interference, and two issues are worth considering. First, although only narrowband interference was analyzed in this paper, the proposed method can also be used to suppress wideband interference effectively, provided that it satisfies the low-rank property. Second, the RFI suppression performance depends on the degree to which the interference satisfies the low-rank property, regardless of whether it is narrowband or wideband.

Conclusions
In this paper, a novel DNLRM method was proposed to suppress RFI more effectively while guaranteeing the fidelity of the useful signal. Both an appropriate sparse regularization of the useful signal and an accurate low-rank approximation of the RFI were considered simultaneously. The advantages of combining the over-complete dictionary with the nonconvex approximating function were demonstrated theoretically and experimentally. Additionally, the mechanisms of over-punishment and under-punishment were analyzed, and the corresponding solutions with appropriate penalties were provided. The role of the supergradient-based weighted penalty in singular value punishment was then described in detail. Afterwards, an adaptive selection scheme for the model parameter was proposed to adapt to different nonconvex functions and to varying RFI severity. The experimental results demonstrated the superiority and practicability of the proposed method, especially in the case of RFI-contaminated real SAR data, which best reflects the actual electromagnetic environment. Note that the proposed method relies on the low-rank assumption on the RFI, and the processing performance degrades when this assumption is weakened. Furthermore, as the dimension of the SAR data matrix increases, the computational efficiency of the algorithm may become limited, which will be studied in future work.

Acknowledgments: The authors would like to thank all reviewers and editors for their comments on this paper.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Convergence Analysis of the Proposed DNLRM Method
In Section 5.1, the main idea and detailed steps of convergence analysis for the proposed DNLRM method are presented, and the following is the specific proof process.
Then, taking the partial derivative of the augmented Lagrangian with respect to $A$ and considering that $A_{k+1}$ is the optimal solution of the $A$ subproblem, $A_{k+1}$ satisfies the first-order optimality condition. The remaining term in this condition is bounded, and $D_x$ is an over-complete dictionary constructed from the discrete time-shifted versions of the transmitted signal with finite values, so $\{Z_k\}$ is bounded.
Proof of Lemma A2. Based on the updating formulas shown in Section 3.2 and iterating the resulting relation, we obtain a bound in which $U$ denotes the upper bound of $\|Z_p - Z_{p-1}\|_F^2$, which exists since $\{Z_k\}$ is bounded by Lemma A1. Once $\sum_{p=1}^{\infty} (\mu_p + \mu_{p-1})/\mu_{p-1}^2 < \infty$, the right-hand side of (A5) is bounded, and thus $\mathcal{L}(L_{k+1}, A_{k+1}, Z_k; \mu_k)$ is bounded. Simultaneously, define $\Theta = \sum_i \omega_i^{k+1}\sigma_i(L_{k+1}) + \tau\|A_{k+1}\|_1$. Recalling that $\mathcal{L}(L_{k+1}, A_{k+1}, Z_k; \mu_k)$ and $\{Z_k\}$ are bounded and that $\mu_k$ is increasing, both $\{L_k\}$ and $\{A_k\}$ are bounded.
Theorem A1. Let $\{L_k, A_k, Z_k\}$ be the sequence generated by the proposed DNLRM framework, with $\{L^*, A^*, Z^*\}$ being a corresponding accumulation point. Then $\{L^*, A^*, Z^*\}$ is a stationary point of the original problem once $\sum_{p=1}^{\infty} (\mu_p + \mu_{p-1})/\mu_{p-1}^2 < \infty$.
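As a worked check of the summability condition in Theorem A1, consider a geometric penalty schedule $\mu_p = \rho\,\mu_{p-1}$ with $\rho > 1$ and $\mu_0 > 0$. This schedule is an assumption made here for illustration (it is a common choice in augmented Lagrangian methods); the text does not state which schedule is used. Under this assumption, $\mu_{p-1} = \mu_0\rho^{p-1}$ and the series reduces to a convergent geometric series:

```latex
\sum_{p=1}^{\infty} \frac{\mu_p + \mu_{p-1}}{\mu_{p-1}^{2}}
= \sum_{p=1}^{\infty} \frac{(\rho + 1)\,\mu_{p-1}}{\mu_{p-1}^{2}}
= \frac{\rho + 1}{\mu_0} \sum_{p=1}^{\infty} \rho^{-(p-1)}
= \frac{\rho\,(\rho + 1)}{\mu_0\,(\rho - 1)} < \infty .
```

Any such geometric schedule therefore satisfies the condition, whereas a constant $\mu_p$ would not, since every term of the series would equal $2/\mu_0$.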