High-Resolution ISAR Imaging and Autofocusing via 2D-ADMM-Net

Abstract: A deep-learning architecture, dubbed the 2D-ADMM-Net (2D-ADN), is proposed in this article. It provides effective high-resolution 2D inverse synthetic aperture radar (ISAR) imaging under scenarios of low SNRs and incomplete data by combining model-based sparse reconstruction and data-driven deep learning. Firstly, the mapping from ISAR images to their corresponding echoes in the wavenumber domain is derived. Then, a 2D alternating direction method of multipliers (ADMM) is unrolled and generalized to a deep network, where all adjustable parameters in the reconstruction layers, nonlinear transform layers, and multiplier update layers are learned by end-to-end training through back-propagation. Since the optimal parameters of each layer are learned separately, 2D-ADN exhibits more representation flexibility and preferable reconstruction performance compared with model-driven methods. Simultaneously, it facilitates ISAR imaging with limited training samples better than data-driven methods owing to its simple structure and small number of adjustable parameters. Additionally, benefiting from the good performance of 2D-ADN, a random phase error estimation method is proposed, through which well-focused imaging can be acquired. Experiments demonstrate that, although trained by only a few simulated images, the 2D-ADN shows good adaptability to measured data, and favorable imaging results with a clear background can be obtained in a short time.


Introduction
High-resolution inverse synthetic aperture radar (ISAR) imaging plays a significant role in space situation awareness and air target surveillance [1,2]. Under ideal observational environments with high signal-to-noise ratios (SNRs) and complete echo matrices, well-focused imaging can be acquired by classic techniques such as the range-Doppler (RD) algorithm and the polar formatting algorithm (PFA) [3]. For a target with a small radar cross section (RCS) or a long observation distance, however, the SNR of the received echoes is low due to limited transmitted power. In addition, the existence of strong jamming and the resource scheduling of cognitive radar may result in incomplete data along the range and/or azimuth direction(s). The complex observational environments discussed above, i.e., incomplete data and low SNRs, cause severe performance degradation or even invalidate the available imaging techniques. As ISAR images are generally sparse in the image domain, high-resolution ISAR imaging under complex observational environments based on the theory of sparse signal reconstruction has received intensive attention in the radar imaging community in recent years [4,5], in which the reconstruction of sparse images (i.e., the distribution of dominant scattering centers) from noisy or gapped echoes given the observation dictionary is sought.
In addition, the motion of the target can be decomposed into translational motion and rotational motion, where the former is not beneficial to imaging and needs to be compensated by range alignment and autofocusing. Traditional autofocusing algorithms can obtain satisfactory imaging results from complete radar echoes [6]. For complex observational environments, however, they cannot achieve good performance due to the deficiency of radar echoes and low SNRs. In recent years, parametric autofocusing techniques [7] have been applied to sparse imaging. Although they are superior to traditional methods under sparse aperture conditions, their performance depends on imaging quality. Therefore, in turn, a better sparse reconstruction method is needed.
The available sparse ISAR imaging methods can be divided into three categories: (1) model-driven methods; (2) data-driven methods; and (3) combined model-driven and data-driven methods. Among them, model-driven methods construct the sparse observation model and obtain high-resolution images by l0-norm or l1-norm optimization. The l0-norm optimization, e.g., orthogonal matching pursuit (OMP) [8] and the smoothed l0-norm method [9], cannot guarantee that the solution is the sparsest and may converge to local minima. The l1-norm optimization, e.g., the fast iterative shrinkage-thresholding algorithm (FISTA) [10,11] and the alternating direction method of multipliers (ADMM) [12,13], is the convex approximation of the l0-norm [14]. However, the regularization parameter directly affects the performance, and how to determine its optimum value remains an open problem [4]. Additionally, vectorized optimization requires long operating times and a large memory storage space. To improve the efficiency, methods based on matrix operations such as 2D-FISTA [15] and 2D-ADMM [16] have been proposed.
Data-driven methods solve the nonlinear mapping from echoes to the 2D image by designing and training a deep network [17]. In the training process, the target echoes and the corresponding ISAR images are adopted as the inputs and the labels, respectively, and the loss function is the normalized mean square error (NMSE) between the network output and the label. In order to minimize the loss function (i.e., to obtain the optimal network parameters), the network parameters are randomly initialized and updated by the gradient descent method iteratively until convergence [18]. Then, the trained network is applied to generate focused imaging of an unknown target. Facilitated by off-line network training, such methods achieve the reconstruction of multiple images rapidly, and typical networks include the complex-valued deep neural network (CV-DNN) [19]. Nevertheless, the subjective network design process lacks a unified criterion and theoretical support, which makes it difficult to analyze the influence of the network structure and parameter settings on reconstruction performance. In addition, the large number of unknown parameters requires massive training samples to avoid overfitting.
The combined model-driven and data-driven methods first expand a model-driven method into a deep network [20], and then utilize only a few training samples to learn the optimal values of the adjustable parameters [21]. Finally, they output the focused image of an unknown target by the trained network. Such methods effectively solve the difficulties in: (1) setting proper parameters for model-driven methods; (2) clearly explaining the physical meaning of the network; and (3) generating a large number of training samples to avoid overfitting for data-driven methods. A common technique to expand the model-driven methods is unrolling [22], which utilizes a finite-layer hierarchical architecture to implement the iterations. As a typical imaging network, the deep ADMM network [23] requires measured data for effective training, which is usually limited due to observation conditions, and autofocusing is not considered for sparse aperture ISAR imaging.
To tackle the above-mentioned problems, this article proposes the 2D-ADMM-Net (2D-ADN) to achieve well-focused 2D imaging under complex observational environments. Its key contributions mainly include the following: (a) The mapping from ISAR images to echoes in the wavenumber domain is established, which is formulated as a 2D sparse reconstruction problem. Then, the 2D-ADMM method is provided with phase error estimation for focused imaging. (b) Based on the 2D-ADMM, the 2D-ADN is designed to include the reconstruction layers, nonlinear transform layers, and multiplier update layers. Then, the adjustable parameters are estimated by minimizing the loss function through back-propagation in the complex domain. (c) Simulation results demonstrate that the 2D-ADN, which is trained by a small number of samples generated from the point-scattering model, obtains the best reconstruction performance. For both complete and incomplete data with low SNRs, the 2D-ADN combined with random phase error estimation obtains better-focused imaging of measured aircraft data with a clearer background than the available methods.
The remainder of this article is organized as follows. Section 2 establishes the sparse observation model for high-resolution 2D imaging and provides the iterative formulae of 2D-ADMM with random phase error estimation. Section 3 introduces the construction of 2D-ADN in detail. Section 4 gives the network loss function and derives the back-propagation formulae in the complex domain. Section 5 carries out various experiments to prove the effectiveness of 2D-ADN. In Section 6, we discuss the performance of 2D-ADN, and Section 7 concludes the article with suggestions for future work.

2D Modeling
After translational motion compensation [24,25], the echoes Y ∈ C^(P×Q) in the wavenumber domain satisfy:

Y = Φ1 X Φ2 E + N,  (1)

where Φ1 ∈ R^(P×U) is the over-complete range dictionary, X ∈ C^(U×V) is the 2D distribution of the scattering centers, Φ2 ∈ R^(V×Q) is the over-complete Doppler dictionary, N ∈ C^(P×Q) is the complex noise matrix, and E ∈ C^(Q×Q) is the diagonal random phase error matrix:

E = diag{ e^(jϕ_1), e^(jϕ_2), ..., e^(jϕ_Q) }.  (2)
In (2), ϕ_q denotes the phase error of the qth echo.
For convenience, (1) is vectorized as:

y = Φx + n,  (3)

where y ∈ C^(PQ) is the vector form of Y, ⊗ is the Kronecker product, Φ = (Φ2 E)^T ⊗ Φ1 ∈ C^(PQ×UV), x ∈ C^(UV) is the vector form of X, and n ∈ C^(PQ) is the vector form of N.
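The equivalence between the matrix model (1) and its vectorized form (3) can be checked numerically. The sketch below uses small random matrices as illustrative stand-ins for the radar dictionaries (the sizes and values are assumptions, not the actual radar parameters), with column-major (column-stacking) vectorization:

```python
import numpy as np

rng = np.random.default_rng(0)
P, U, V, Q = 6, 5, 4, 3

# Stand-ins for the range/Doppler dictionaries, scattering centers, and phase errors.
Phi1 = rng.normal(size=(P, U))        # over-complete range dictionary
Phi2 = rng.normal(size=(V, Q))        # over-complete Doppler dictionary
X = rng.normal(size=(U, V)) + 1j * rng.normal(size=(U, V))
E = np.diag(np.exp(1j * rng.uniform(-np.pi, np.pi, Q)))  # diagonal phase-error matrix

# Matrix model (1), noise-free: Y = Phi1 X Phi2 E
Y = Phi1 @ X @ Phi2 @ E

# Vectorized model (3): y = Phi x with Phi = (Phi2 E)^T kron Phi1,
# where vec(.) stacks columns, i.e., order="F" flattening.
Phi = np.kron((Phi2 @ E).T, Phi1)
y = Phi @ X.flatten(order="F")
```

The identity used is vec(A X B) = (B^T ⊗ A) vec(X), which is exactly why Φ = (Φ2 E)^T ⊗ Φ1 in (3).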

The 2D-ADMM Method
Finding the optimal solution to (3) is a linear inverse problem, which can be further converted into an unconstrained optimization by introducing a regularization term:

x̂ = argmin_x (1/2)||y − Φx||_2^2 + λ||x||_1,  (4)

where λ is the regularization parameter.
Remote Sens. 2021, 13, 2326

According to the variable splitting technique [26], (4) is equivalent to:

min_{x,z} (1/2)||y − Φx||_2^2 + λ||z||_1,  s.t. z = x,  (5)

and the augmented Lagrangian function is:

L_ρ(x, z, α) = (1/2)||y − Φx||_2^2 + λ||z||_1 + ⟨α, x − z⟩ + (ρ/2)||x − z||_2^2,  (6)

where ⟨·,·⟩ is the inner product, α is the vector of Lagrangian multipliers, and ρ is the penalty parameter. ADMM decomposes (6) into three sub-problems by minimizing L_ρ(x, z, α) with respect to x, z, and α, respectively:

x^(n) = argmin_x L_ρ(x, z^(n−1), α^(n−1)),
z^(n) = argmin_z L_ρ(x^(n), z, α^(n−1)),  (7)
α^(n) = α^(n−1) + ηρ(x^(n) − z^(n)),

where n is the iteration index. Let b = α/ρ; the solutions to (7) satisfy:

x^(n) = (Φ^H Φ + ρI)^(−1) (Φ^H y + ρ(z^(n−1) − b^(n−1))),
z^(n) = S(x^(n) + b^(n−1), λ/ρ),  (8)
b^(n) = b^(n−1) + η(x^(n) − z^(n)),

where S(·) is the shrinkage function [27] defined by S(w, τ) = sign(w)max{|w| − τ, 0}, τ is the threshold, and η is an update rate for the Lagrangian multiplier.
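The vectorized iterations (8) can be sketched on a small synthetic problem. The dictionary, problem sizes, and parameter values below are illustrative assumptions (a random real dictionary and a noise-free sparse signal), and the shrinkage is applied to complex magnitudes:

```python
import numpy as np

rng = np.random.default_rng(1)
P, U = 80, 40                      # illustrative sizes
Phi = rng.normal(size=(P, U)) / np.sqrt(P)   # stand-in dictionary
x_true = np.zeros(U, dtype=complex)
idx = rng.choice(U, 5, replace=False)
x_true[idx] = rng.normal(size=5) + 1j * rng.normal(size=5)
y = Phi @ x_true                   # noise-free echoes for clarity

lam, rho, eta = 1e-3, 1.0, 1.0

def soft(w, tau):
    """Complex soft threshold S(w, tau): shrink the magnitude, keep the phase."""
    mag = np.maximum(np.abs(w) - tau, 0.0)
    return mag * np.exp(1j * np.angle(w))

z = np.zeros(U, dtype=complex)
b = np.zeros(U, dtype=complex)
A = np.linalg.inv(Phi.conj().T @ Phi + rho * np.eye(U))  # (Phi^H Phi + rho I)^(-1)
for _ in range(100):
    x = A @ (Phi.conj().T @ y + rho * (z - b))   # x-update of (8)
    z = soft(x + b, lam / rho)                   # z-update (shrinkage)
    b = b + eta * (x - z)                        # multiplier update

nmse = np.linalg.norm(x - x_true) ** 2 / np.linalg.norm(x_true) ** 2
```

With a well-conditioned dictionary and small λ, the iterations recover the sparse vector to a small NMSE; in practice λ and ρ must be tuned, which is precisely the difficulty 2D-ADN addresses by learning them.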
The 2D-ADMM estimates the 2D image X^(n) in matrix form. Since Φ = (Φ2 E)^T ⊗ Φ1, the x-update of (8) is equivalent to solving the linear matrix equation:

Φ1^H Φ1 X^(n) (Φ2 E)(Φ2 E)^H + ρ X^(n) = Φ1^H Y (Φ2 E)^H + ρ(Z^(n−1) − B^(n−1)).  (9)

In addition, the matrix forms of z and b, i.e., Z and B, are calculated by:

Z^(n) = S(X^(n) + B^(n−1); λ/ρ),  (10)
B^(n) = B^(n−1) + η(X^(n) − Z^(n)).  (11)

Phase Error Estimation and Algorithm Summary
The phase error is estimated by optimizing the following objective function [28]:

ϕ̂_q = argmin_{ϕ_q} ||Y(·, q) − e^(jϕ_q) h_q||_2^2,  (12)

where Y(·, q) is the qth column of Y, and h_q = Φ1 X̂ Φ2(·, q) is the qth column of the echoes predicted from the estimated image X̂. Let the derivative of (12) with respect to ϕ_q be zero; then:

ϕ̂_q = arctan( Im{h_q^H Y(·, q)} / Re{h_q^H Y(·, q)} ),  (13)

where Re(·) and Im(·) represent the real and imaginary parts, respectively, and the phase error matrix Ê is constructed by (2). Algorithm 1 summarizes the high-resolution 2D ISAR imaging and autofocusing method based on 2D-ADMM and random phase error estimation, where N denotes the total number of iterations. Analysis and experiments have shown that the choices of λ and ρ have a great influence on the imaging quality, and improper initialization may generate a defocused image with a noisy background. In addition, λ and ρ cannot be adaptively adjusted in each iteration, demonstrating a lack of flexibility.
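Because E is diagonal, each column of Y is a rotated copy of the predicted column h_q, so the estimate (13) reduces to the angle of an inner product. A noise-free sketch (all sizes and values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
P, U, V, Q = 8, 6, 5, 4
Phi1 = rng.normal(size=(P, U))
Phi2 = rng.normal(size=(V, Q))
X = rng.normal(size=(U, V)) + 1j * rng.normal(size=(U, V))

phi_true = rng.uniform(-np.pi, np.pi, Q)       # per-echo phase errors
# Right-multiplying by diag(e^{j phi}) scales each column q by e^{j phi_q}.
Y = (Phi1 @ X @ Phi2) * np.exp(1j * phi_true)

H = Phi1 @ X @ Phi2                            # predicted echoes from the estimated image
# Estimate (13): phi_q = arctan(Im(h_q^H y_q) / Re(h_q^H y_q)) = angle(h_q^H y_q).
phi_hat = np.angle(np.sum(H.conj() * Y, axis=0))
```

In the noise-free case h_q^H y_q = e^(jϕ_q) ||h_q||^2, so the estimate is exact; with noise, the inner product averages the per-sample phase evidence, which is what makes (13) robust.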

Structure of 2D-ADN
To tackle the aforementioned problems, we modified 2D-ADMM and expanded it into 2D-ADN. There are similarities between deep networks and iterative algorithms [22] such as 2D-ADMM. In particular, the matrix multiplication is similar to the linear mapping of a deep network, the shrinkage function is similar to the nonlinear activation, and the adjustable parameters are similar to the network parameters. Therefore, the 2D-ADMM algorithm can be unfolded into 2D-ADN. As shown in Figure 1, the network has N stages, and stage n, n ∈ [1, N], represents the nth iteration described by Algorithm 1. Typically, one stage consists of three layers, i.e., the reconstruction layer, the nonlinear transform layer, and the multiplier update layer, which correspond to (9), (10), and (11), respectively. The inputs of 2D-ADN are echoes in the 2D wavenumber domain, and the output is the reconstructed 2D high-resolution image. Below, we derive the forward-propagation formulae of each layer.

Reconstruction Layer
As shown in Figure 2, the inputs of the reconstruction layer are Z^(n−1) and B^(n−1), and the output X^(n) is the solution to (9) with a stage-wise penalty parameter:

Φ1^H Φ1 X^(n) (Φ2 E)(Φ2 E)^H + ρ^(n) X^(n) = Φ1^H Y (Φ2 E)^H + ρ^(n)(Z^(n−1) − B^(n−1)),  (14)

where the penalty parameter ρ^(n) is the adjustable parameter. For n = 1, Z^(0) and B^(0) are initialized to zero matrices, and thus the output is the solution to:

Φ1^H Φ1 X^(1) (Φ2 E)(Φ2 E)^H + ρ^(1) X^(1) = Φ1^H Y (Φ2 E)^H.  (15)

For n ∈ [1, N], the output serves as the input of Z^(n) and B^(n). For n = N + 1, the output is adopted as the input of the loss function.

Nonlinear Transform Layer
As shown in Figure 3, the inputs of the nonlinear transform layer are X^(n) and B^(n−1), and the output is:

Z^(n) = S_PLF(X^(n) + B^(n−1)).  (16)

To learn a more flexible nonlinear activation function, we substituted a piecewise linear function S_PLF(·) for the shrinkage function [29]. Specifically, S_PLF(·) is determined by N_c control points {(p_i, q_i^(n))}, where p_i and q_i^(n) denote the predefined position and the adjustable value of the ith point, respectively. In particular, we performed S_PLF(·) on the real and imaginary parts of the complex signal, respectively.
For n = 1, B^(0) is initialized as a zero matrix and the output is:

Z^(1) = S_PLF(X^(1)).  (17)

The output of this layer serves as the input of X^(n+1) and B^(n).
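The piecewise linear activation can be realized with linear interpolation over the control points; initialized at the soft-threshold values it reproduces the shrinkage function, and the control values q_i then become learnable. The grid range, τ = 1/20, and N_c = 101 below follow the initialization described in the experiments section; everything else is an illustrative sketch:

```python
import numpy as np

Nc = 101
p = np.linspace(-1.0, 1.0, Nc)                     # predefined control-point positions
tau = 1.0 / 20.0
q = np.sign(p) * np.maximum(np.abs(p) - tau, 0.0)  # initialize as a soft threshold

def S_PLF(W, p, q):
    """Piecewise linear activation, applied to real and imaginary parts separately.
    Note: np.interp clamps inputs outside [p[0], p[-1]] to the endpoint values."""
    return np.interp(W.real, p, q) + 1j * np.interp(W.imag, p, q)

w = np.array([0.5 - 0.3j, -0.02 + 0.9j])
out = S_PLF(w, p, q)
```

At initialization, small magnitudes (e.g., the real part -0.02, inside the dead zone |w| < τ) map to zero, while larger values are shrunk by τ, matching the shrinkage function; training then adjusts the q_i per stage.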

Multiplier Update Layer
As shown in Figure 4, the inputs of the multiplier update layer are B^(n−1), X^(n), and Z^(n), and the output is:

B^(n) = B^(n−1) + η^(n)(X^(n) − Z^(n)),  (18)

where the learning rate η^(n) is an adjustable parameter. For n = 1, B^(0) is initialized as a zero matrix and the output is:

B^(1) = η^(1)(X^(1) − Z^(1)).  (19)

For n ∈ [1, N − 1], the output of this layer serves as the input of X^(n+1), Z^(n+1), and B^(n+1). For n = N, the output is adopted as the input of the reconstruction layer.

Loss Function
In this article, the loss function E(Θ) is defined as the normalized mean square error (NMSE) between the network output X̂ and the label image X_gt, i.e., the ground truth of the scattering center distribution:

E(Θ) = (1/γ) Σ_{(Y, X_gt) ∈ Γ} ||X̂(Y; Θ) − X_gt||_F^2 / ||X_gt||_F^2,

where Y denotes the input echoes in the wavenumber domain defined by (1), Θ = {ρ^(n), q_i^(n), η^(n)} is the set of the adjustable parameters, ||·||_F is the Frobenius norm, and Γ = {(Y, X_gt)} is the training set with card{Γ} = γ.

Back-Propagation
We optimized the parameters of 2D-ADN utilizing the gradient-based limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. To this end, we computed the gradients of the loss function with respect to Θ = {ρ^(n), q_i^(n), η^(n)} through back-propagation over the deep architecture. Following the structures and writing styles of the available literature dealing with deep networks [30] and the conjugate complex derivative of composite functions [31], we derived the back-propagation of the three layers, respectively. To be consistent with the previous definitions, the gradients of the matrices were expressed in matrix form for convenience, and they were also calculated in matrix form for efficiency.
As shown in Figure 5, for n ∈ [1, N], the gradient is transferred to the reconstruction layer through B^(n) and Z^(n); for n = N + 1, the gradient of the loss function is transferred to X^(n) directly. From these, the gradient of ρ^(n) and the gradients transferred back to Z^(n−1) and B^(n−1) are calculated in matrix form, where 1 represents the matrix with all ones.
As shown in Figure 6, the gradient is transferred to the nonlinear transform layer through B^(n) and X^(n+1). The gradient calculations of q_i^(n) and the gradients transferred to B^(n−1) and X^(n) are consistent with the derivation of the piecewise linear function defined in [29].
As shown in Figure 7, the gradient is transferred to the multiplier update layer through X^(n+1), Z^(n+1), and B^(n+1); for n = N, the gradient is transferred from the last reconstruction layer to B^(n). From these, the gradient of η^(n) and the gradients transferred to B^(n−1), X^(n), and Z^(n) are calculated.
The training stage can be summarized as follows:
Step 1: Define the NMSE between the network output and the label image as the loss function;
Step 2: Back-propagation. Calculate the gradients of the loss function with respect to the penalty parameter, the piecewise linear function, and the learning rate of each stage;
Step 3: Utilize the L-BFGS algorithm to update the network parameters according to their current values and the gradients;
Step 4: Repeat Steps 2 and 3 until the difference between the loss functions of two adjacent iterations is less than 10^−6.
For multiple training samples, we utilized the average gradient and loss function.
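The training loop above can be sketched on a toy unrolled network: a stack of soft-threshold stages with one learnable threshold per stage standing in for the full parameter set {ρ^(n), q_i^(n), η^(n)}, optimized with SciPy's L-BFGS implementation (which approximates gradients numerically here instead of the analytic back-propagation). All data and sizes are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy "labels" (sparse scattering maps) and noisy "echoes" in the image domain.
X_gt = np.where(rng.random((20, 32)) < 0.1, rng.normal(2.0, 0.5, (20, 32)), 0.0)
Y = X_gt + 0.3 * rng.normal(size=X_gt.shape)

def forward(Y, taus):
    """Unrolled network: each stage applies a soft threshold with its own parameter."""
    Z = Y
    for tau in taus:
        Z = np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)
    return Z

def loss(taus):
    """NMSE between the network output and the labels over the training set."""
    Z = forward(Y, taus)
    return np.sum((Z - X_gt) ** 2) / np.sum(X_gt ** 2)

taus0 = np.full(3, 0.05)                         # per-stage initialization
res = minimize(loss, taus0, method="L-BFGS-B")   # L-BFGS parameter update
```

The optimizer drives the per-stage parameters away from their shared initialization, which is the essential difference between an unrolled trained network and the fixed-parameter iterative algorithm.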

2D High-Resolution ISAR Imaging Based on 2D-ADN
According to the above discussions, high-resolution 2D ISAR imaging based on 2D-ADN includes the following steps:
Step 1: Training set generation. Initialize the phase error matrix E = I, construct Φ1 and Φ2 according to the radar parameters and the data missing pattern, generate randomly distributed scattering centers X_gt with Gaussian amplitudes, calculate Y according to (1), and obtain the data set Γ = {(Y, X_gt)}.
Step 2: Network training. Initialize the adjustable parameters Θ = {ρ^(n), q_i^(n), η^(n)}, and utilize {(Y, X_gt)} to train the network according to Section 4.2.
Step 3: Testing. For simulated data, feed echoes in the wavenumber domain into the trained 2D-ADN and obtain the high-resolution image. For measured data with random phase errors, estimate the high-resolution image and the random phase errors by the trained 2D-ADN and (13) iteratively until convergence.
As the distribution and the amplitudes of simulated scattering centers mimic true ISAR targets, optimal network parameters suitable for measured data imaging can be learned after network training.By this means, the issue of insufficient measured training data is effectively tackled.
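Step 1 of the procedure above can be sketched as follows. Partial-Fourier dictionaries are assumed here purely for illustration (the actual Φ1 and Φ2 depend on the radar parameters), and the azimuth missing-data pattern simply zeroes out a fraction of the echo columns:

```python
import numpy as np

rng = np.random.default_rng(3)
P, U, V, Q, K = 32, 48, 48, 32, 10   # illustrative sizes; K scattering centers

# Partial-Fourier dictionaries as illustrative stand-ins for Phi1 and Phi2.
Phi1 = np.exp(-2j * np.pi * np.outer(np.arange(P), np.arange(U)) / U) / np.sqrt(U)
Phi2 = np.exp(-2j * np.pi * np.outer(np.arange(V), np.arange(Q)) / V) / np.sqrt(V)

def make_sample(rng, loss_rate=0.5, snr_db=0.0):
    # Randomly distributed scattering centers with Gaussian amplitudes.
    X_gt = np.zeros((U, V), dtype=complex)
    X_gt[rng.integers(0, U, K), rng.integers(0, V, K)] = rng.normal(1.0, 0.3, K)

    Y = Phi1 @ X_gt @ Phi2               # E = I during training (no phase error)
    # Complex Gaussian noise at the requested SNR.
    p_sig = np.mean(np.abs(Y) ** 2)
    sigma = np.sqrt(p_sig / 10 ** (snr_db / 10))
    Y = Y + sigma * (rng.normal(size=Y.shape) + 1j * rng.normal(size=Y.shape)) / np.sqrt(2)
    # Azimuth missing-data pattern: discard a fraction of the echo columns.
    Y[:, rng.random(Q) < loss_rate] = 0.0
    return Y, X_gt

dataset = [make_sample(rng) for _ in range(40)]
```

Each sample pairs gapped, noisy wavenumber-domain echoes with the sparse ground-truth scattering map, which is exactly the (Y, X_gt) format the loss function expects.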
In 2D-ADN, the number of stages N determines the network depth. It is observed that the loss function first decreases rapidly and then tends to be stable with the increment of N. Therefore, we choose N according to the convergence condition:

|E_N(Θ) − E_{N−1}(Θ)| < ε,  (40)

where E_N(Θ) is the loss function of the trained network with N stages, E_{N−1}(Θ) is the loss function of the trained network with N − 1 stages, and ε is a threshold. For a single iteration, 2D-ADN and 2D-FISTA share the same computational complexity of O(UPQ). As the number of stages in 2D-ADN is much smaller than the number of iterations required for 2D-FISTA to converge, the computational time of 2D-ADN is shorter.

Experimental Results
In this section, we will demonstrate the effectiveness of 2D-ADN by high-resolution ISAR imaging of complete data, incomplete range data, incomplete azimuth data, and 2D incomplete data. The SNR of the range-compressed echoes is set to 0 dB by adding Gaussian noise, and the loss rate of the incomplete data is 50%.
For network training, 40 samples are generated following Step 1 in Section 4.3, and a typical label image is shown in Figure 8a. Specifically, the first 20 samples constitute the training set and the rest constitute the test set. Later experiments will demonstrate that a small training set is adequate, as unfolded deep networks have the potential to develop efficient high-performance architectures from reasonably sized training sets [22].
In the training stage, the adjustable parameters are initialized as ρ^(n) = 0.2 and η^(n) = 1. In addition, the piecewise linear function is initialized as a soft threshold function with τ = 1/20, and the control points are equally spaced with N_c = 101. Then, the 2D-ADN is trained following Section 4.2, and the number of stages is set to 7 according to (40).
For the simulated test data, we fed the test samples into the trained 2D-ADN and calculated the NMSE, the peak signal-to-noise ratio (PSNR), the structural similarity index measure (SSIM), and the entropy of the image (ENT) according to the output and the label for quantitative performance evaluation. In addition, we compared the imaging results of 2D-FISTA, the untrained 2D-ADN, the UNet [32], and the trained 2D-ADN. In particular, as a data-driven method, the UNet has many more trainable parameters than the 2D-ADN. To avoid overfitting, we generated 1000 samples as the simulated data set, where 800 samples constituted the training set and the rest constituted the test set. The UNet was trained for 28 epochs, and the training time was 22 min. For 2D-ADN, the training terminated when the relative error of the loss fell below 10^−6, and the training time was 19 min. For 2D-FISTA, the parameter was initialized as λ = 0.05, and the algorithm terminated when the image NMSE between adjacent iterations fell below 10^−6.
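The scalar metrics can be computed as below. These are standard definitions; the exact normalizations used in the paper are assumptions, and SSIM (commonly computed via an image-processing library) is omitted here:

```python
import numpy as np

def nmse(x, ref):
    """Normalized mean square error between a reconstruction and its reference."""
    return np.sum(np.abs(x - ref) ** 2) / np.sum(np.abs(ref) ** 2)

def psnr(x, ref):
    """Peak signal-to-noise ratio in dB, with the reference peak as the maximum."""
    mse = np.mean(np.abs(x - ref) ** 2)
    return 10 * np.log10(np.max(np.abs(ref)) ** 2 / mse)

def image_entropy(x):
    """Entropy of the normalized image energy: ENT = -sum p ln p,
    with p the per-pixel share of the total energy."""
    p = np.abs(x) ** 2 / np.sum(np.abs(x) ** 2)
    p = p[p > 0]
    return -np.sum(p * np.log(p))
```

Lower ENT indicates energy concentrated in fewer pixels, which is why image entropy is the standard focus measure for the measured-data experiments where no ground-truth label exists.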
Additionally, we fed the measured data of a Yak-42 aircraft with random phase errors into the trained 2D-ADN and obtained high-resolution imaging under various observation conditions. The original RD image with complete data and a high SNR is shown in Figure 8b. The iterative estimation terminated when the NMSE between adjacent iterations fell below 10^−4.
The imaging results were obtained with unoptimized MATLAB code on an Intel i9-10920X 3.50-GHz computer with a 12-core processor.

Complete Data
For the test sample illustrated in Figure 8a, the imaging results are shown in Figure 9 and the corresponding metrics are shown in Table 1. It was observed that the images obtained by 2D-FISTA and the untrained 2D-ADN are noisy with spurious scattering centers. On the contrary, the image obtained by the trained 2D-ADN has the smallest NMSE and ENT and the highest PSNR and SSIM, demonstrating its superior imaging performance. In addition, 2D-FISTA has the longest running time due to slow convergence. The UNet obtains satisfying denoising performance and has the shortest running time of only 0.01 s.

For the measured data of the Yak-42 aircraft, imaging results are shown in Figure 10 and the corresponding entropies and running times are shown in Table 2. Compared with the available methods, the trained 2D-ADN obtained better-focused images with a clearer background. In addition, its running time increased because 2D-ADN was implemented multiple times for phase error estimation. The untrained 2D-ADN had the longest running time since the strong background noise hindered fast convergence.

Incomplete Range Data
The data missing pattern of the incomplete range data is shown in Figure 8c, where the white bars denote the available echoes and the black ones denote the missing echoes.
For the same test sample, the reconstruction results are shown in Figure 11, and the metrics for quantitative comparison are shown in Table 4. Still, the trained 2D-ADN demonstrated the best reconstruction performance.


For the same measured data of the Yak-42 aircraft, the reconstruction results are shown in Figure 12, where the trained 2D-ADN generated better-focused images with a clearer background than the other methods. The corresponding entropies and running times are shown in Table 5, where 2D-FISTA has the longest time.

Incomplete Azimuth Data
The data missing pattern of the incomplete azimuth data is shown in Figure 8d. For the same test sample, the reconstruction results are shown in Figure 13, and the metrics for quantitative comparison are shown in Table 6.


For the same measured data, the imaging results are shown in Figure 14, and the corresponding entropies and running times are shown in Table 7. Similarly, 2D-ADN achieved well-focused imaging with the shortest running time.

2D Incomplete Data
The data missing pattern of the 2D incomplete data is shown in Figure 8e. For the same test sample, the reconstruction results are shown in Figure 15, and the metrics for the quantitative comparison are shown in Table 8.

To further analyze the reconstruction performance of the proposed method, we designed more experiments using only 25% and 10% of the available data, where the SNRs were set to 0 dB, 5 dB, and 10 dB, respectively. The imaging results are shown in Figures 17 and 18, and the corresponding metrics are shown in Tables 9 and 10. It was observed that the imaging quality degraded heavily with the decrease of the available data at an SNR of 0 dB. If the SNR was raised to 5 dB or 10 dB, however, the imaging performance improved rapidly. Therefore, the SNR has a greater impact on the imaging quality than the data loss rate. Furthermore, although it inherits the reconstruction performance of 2D-ADMM and utilizes a more flexible piecewise linear function as the denoiser, the 2D-ADN is still sensitive to low SNR.

Choice of the Optimal Regularization Parameter
In 2D-ADMM, it is necessary to perform multiple manual adjustments of the regularization parameter λ and the penalty parameter ρ to obtain the best results. In addition, the adjustable parameters of the 2D-ADMM are fixed during the iteration, which lacks flexibility. On the contrary, the 2D-ADN learns the optimal parameters of each layer separately, thus having more flexibility and a better reconstruction performance than the 2D-ADMM.
For a data loss rate of 50% and an SNR of 0 dB, we obtained the optimal parameters with the minimum NMSE by manual tuning, i.e., λ = 0.05 and ρ = 0.8. Imaging results of the optimal 2D-ADMM are shown in Figure 19 and Table 11. It was observed that the quality of the 2D-ADN image was better than that of the 2D-ADMM image obtained by parameter tuning.

Difference Between 2D-ADMM and 2D-ADN
Through the use of the unrolling method, 2D-ADN deviates from the original 2D-ADMM algorithm. Figure 20 shows the variations of the NMSE for 2D-ADMM and 2D-ADN, respectively. Figure 21 shows the outputs of each stage for 2D-ADMM and 2D-ADN.
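The NMSE curves in Figure 20 compare the reconstruction error per iteration/stage. The definition assumed in the sketch below is the standard normalized error against a reference image:

```python
import numpy as np

def nmse(x_hat, x_ref):
    """Normalized mean squared error of a reconstruction against a reference:
    NMSE = ||x_hat - x_ref||_F^2 / ||x_ref||_F^2."""
    return np.linalg.norm(x_hat - x_ref) ** 2 / np.linalg.norm(x_ref) ** 2

# illustrative values: a uniform 10% amplitude error gives NMSE ~= 0.01
x_ref = np.array([[3.0, 0.0], [0.0, 4.0]])
print(nmse(0.9 * x_ref, x_ref))
```

Recording this quantity after every 2D-ADMM iteration and after every 2D-ADN stage yields curves of the kind plotted in Figure 20.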

Conclusions
This article proposed 2D-ADN for high-resolution 2D ISAR imaging and autofocusing under complex observational environments. Firstly, the 2D mapping from the ISAR images to the echoes in the wavenumber domain was established. Then, iteration formulas based on the 2D-ADMM were derived for high-resolution ISAR imaging and, combined with the phase error estimation method, an imaging and autofocusing method was proposed. On this basis, the 2D-ADMM was generalized and unrolled into an N-stage 2D-ADN, which consisted of reconstruction layers, nonlinear transform layers, and multiplier update layers. The 2D-ADN effectively tackles the parameter adjustment problem of model-driven methods and possesses more interpretability than data-driven methods. Experiments have shown that after end-to-end training by randomly generated samples offline, the 2D-ADN achieves better-focused 2D imaging of measured data with random phase errors than the available methods while maintaining computational efficiency. Future work will focus on designing network architectures which incorporate residual translational motion compensation and 2D imaging, and on designing noise- and jamming-robust network architectures in a Bayesian framework.

Figure 5. Back-propagation of the reconstruction layer.

Figure 6. Back-propagation of the nonlinear transform layer.

Figure 7. Back-propagation of the multiplier update layer.
In addition, the piecewise linear function is initialized as a soft threshold function with a threshold of 1/20, and the control points are equally spaced with N_c = 101. Then, the 2D-ADN is trained following Section 4.2, and the number of stages is set to 7 according to (40).
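One way to realize such a learnable piecewise linear denoiser is sketched below. Only the initialization — the soft threshold function with a threshold of 1/20 and 101 equally spaced control points — comes from the text; the class name and the interpolation grid range [−1, 1] are assumptions:

```python
import numpy as np

def soft_threshold(x, t):
    """Soft threshold function with threshold t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

class PiecewiseLinear:
    """Piecewise linear function defined by values at fixed control points;
    the values act as trainable parameters. Initialized here to the
    soft threshold function with t = 1/20, as in the 2D-ADN setup."""

    def __init__(self, n_ctrl=101, lo=-1.0, hi=1.0, t=1 / 20):
        self.grid = np.linspace(lo, hi, n_ctrl)     # equally spaced control points
        self.values = soft_threshold(self.grid, t)  # trainable parameter vector

    def __call__(self, x):
        # linear interpolation between control points (clamped outside the grid)
        return np.interp(x, self.grid, self.values)

f = PiecewiseLinear()
```

During training, back-propagation through the nonlinear transform layer updates `self.values`, so each stage can learn a denoiser shape that deviates from the initial soft threshold.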

Figure 8. (a) Label image of a test sample. (b) RD image of the Yak-42 aircraft with complete data and high SNR. (c) Data missing pattern of the incomplete range data. (d) Data missing pattern of the incomplete azimuth data. (e) Data missing pattern of the 2D incomplete data.

The training time was 19 min. For 2D-FISTA, the parameter was initialized as 0.05, and the algorithm terminated when the image NMSE between adjacent iterations fell below 10^-6. Additionally, we fed the measured data of a Yak-42 aircraft with random phase errors into the trained 2D-ADN and obtained high-resolution imaging under various observation conditions. The original RD image with complete data and high SNR is shown in Figure 8b. The algorithm terminated when the NMSE between adjacent iterations fell below 10^-4.


Figure 11. Images of the incomplete range data with the data missing pattern shown in Figure 8c, obtained by (a) 2D-FISTA, (b) untrained 2D-ADN, (c) UNet, and (d) trained 2D-ADN.

Table 1. Quantitative performance evaluation for the complete simulation data.

Table 2. Quantitative performance evaluation for the complete measured data.

Table 3. Quantitative performance evaluation for the incomplete range simulation data.

Table 4. Quantitative performance evaluation for the incomplete range measured data.

Table 5. Quantitative performance evaluation for the incomplete azimuth simulation data.

Table 6. Quantitative performance evaluation for the incomplete azimuth measured data.

Table 7. Quantitative performance evaluation for the 2D incomplete simulation data.

Table 8. Quantitative performance evaluation for the 2D incomplete measured data.

Table 9. Quantitative performance evaluations using 25% available data.

Table 10. Quantitative performance evaluations using 10% available data.

Table 11. Quantitative performance evaluation for different methods.