Sparse Regularization with a Non-Convex Penalty for SAR Imaging and Autofocusing

Zi-Yao Zhang; Odysseas Pappas; Igor G. Rizaev; Alin Achim

doi:10.3390/rs14092190

Visual Information Laboratory, University of Bristol, Bristol BS1 5TE, UK

Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(9), 2190; https://doi.org/10.3390/rs14092190

This article belongs to the Special Issue Synthetic Aperture Radar (SAR) Imaging of the Sea Surface: Simulation, Modelling, and Processing

Abstract

In this paper, SAR image reconstruction with joint phase error estimation (autofocusing) is formulated as an inverse problem. An optimization model utilising a sparsity-enforcing Cauchy regularizer is proposed, and an alternating minimization framework is used to solve it, in which the desired image and the phase errors are estimated alternately. For the image reconstruction sub-problem (

f

-sub-problem), two methods are presented that are capable of handling the problem’s complex nature. Firstly, we design a complex version of the forward-backward splitting algorithm to solve the

f

-sub-problem iteratively, leading to a complex forward-backward autofocusing method (CFBA). For the second variant, techniques of Wirtinger calculus are utilized to minimize the cost function involving complex variables in the

f

-sub-problem in a direct fashion, leading to the Wirtinger alternating minimization autofocusing (WAMA) method. For both methods, the phase error estimation sub-problem is solved by simply expanding and observing its cost function. Moreover, the convergence of both algorithms is discussed in detail. Experiments are conducted on both simulated and real SAR images. In addition to the synthetic scene employed, the other SAR images focus on the sea surface, with two being real images with ship targets, and another two being simulations of the sea surface (one of them containing ship wakes). The proposed methods are demonstrated to give impressive autofocusing results on these datasets compared to state-of-the-art methods.

Keywords:

SAR autofocusing; Cauchy regularization; Wirtinger calculus; forward-backward splitting; KL property; sea surface

1. Introduction

Synthetic aperture radar (SAR) has become one of the most widely employed modalities in the field of remote sensing, largely due to its capability to collect data in all kinds of weather and lighting conditions. As a coherent imaging radar system often mounted on an airplane or satellite platform, it transmits frequency-modulated signals toward a scene on the ground and records the returned radar echoes in flight. As with other coherent radar systems, these raw radar returns must then be processed to form an image suitable for visual interpretation. A detailed introduction to the working mechanisms of SAR can be found in [1,2].

The SAR data acquisition process is frequently plagued by phase errors. Due to inaccuracies in the SAR platform trajectory measurement, as well as the possible presence of moving targets in the observed scene, the acquired data (radar returns) will contain phase errors. These phase errors, in turn, defocus the formed SAR images. Techniques that estimate these phase errors directly from the raw SAR data and remove them, so as to improve the quality of the reconstructed SAR images, are called autofocusing techniques. Extending autofocusing to the case of moving targets, such as ships in motion, is also of particular interest. It can address not only the additional blurring caused by target motion, but also the target return displacement effect, where a ship appears disjointed from its wake, making localisation and velocity/heading estimation problematic [3]. An example of such displacement can be seen in Figure 1.

Figure 1. Example of moving ship targets appearing displaced from their wakes in a Sentinel-1 image (IW intensity image, VV polarisation).

Among the earliest autofocusing techniques, phase gradient autofocus (PGA) [4] is a very well-known method. It first circularly shifts and windows the data, then uses the processed data to estimate the gradient of the phase errors, and finally integrates these estimates to obtain the phase errors themselves. Mapdrift autofocus is another classical technique for SAR autofocusing [5]. It partitions the whole aperture into several sub-apertures from which the images (maps) are reconstructed, and measures the drift between each pair of maps to obtain the phase errors. These classic methods often serve as the basis for newer methods, such as the variants of the mapdrift and phase gradient autofocus algorithms presented in [6], which employ fractional lower-order moments of the alpha-stable distribution for modelling the phase history data.

In contrast to these classical techniques, various methods based on optimization have been proposed in recent years. Many of these methods belong to one of two categories. The first involves the construction of a sharpness metric to be maximized. For instance, a power function is chosen as the sharpness metric in [7,8], and it is further demonstrated in [8] that when the image has multiple columns, and under certain statistical assumptions, the perfectly focused image can be closely approximated according to the strong law of large numbers. Image entropy is another popular alternative [9,10,11], though in this case it is minimized (rather than maximized) to enforce sharpness.

The second category of methods adopts an inverse problem approach. Based on a forward observation model associating the corrupted phase history with the underlying SAR image, SAR autofocusing is formulated as an inverse problem, and variational models with a variety of regularizers have been designed to obtain its solution. For instance, Onhon et al. use the pth power of the approximate

l_{p}

norm as the regularization term and an alternating minimization framework to solve the problem [12]. There are also various methods addressing the problem in a compressive sensing context, such as the majorization-minimization-based method [13,14], iteratively re-weighted augmented Lagrangian-based method [15,16] and conjugate gradient-based method with a cost function involving hybrid regularization terms (approximate

l_{1}

norm and approximate total variation regularization) [17].

Besides these, there are also SAR autofocusing approaches built by directly strengthening traditional SAR imaging methods. For example, an autofocusing method which maximizes a sharpness metric for each pulse in the imaging process of back-projection is proposed in [18] and further extended to the case of moving ship targets [19]. A polar format algorithm-based autofocusing approach [20] which combines [12] with a classical autofocusing method like PGA has been proposed recently as well.

Moreover, with the development of deep learning, deep neural networks have also recently been considered in SAR autofocusing. A recurrent auto-encoder-based SAR imaging network mimicking the behavior of an iterative shrinkage thresholding algorithm (ISTA) is proposed in [21], in which the removal of phase errors is achieved by learning the forward model (observation matrix). Another auto-encoder and decoder-based neural network is built in [22], where motion compensation is achieved by adding an updating step of the observation matrix to an alternating minimization framework.

In this paper, inspired by [12], we formulate the SAR autofocusing problem as an inverse problem. This enables us to jointly estimate the desired SAR image and the unknown phase errors instead of treating autofocusing as a post-processing step after image reconstruction, as traditional autofocusing methods do, thus leading to better performance. In the proposed cost function, we use a non-convex Cauchy regularization on the magnitude of the desired image (thus, we call it “magnitude Cauchy”), exploiting its sparsity-enforcing ability and differentiability. However, this inverse problem poses an interesting challenge in comparison to many other inverse problems in imaging in that it deals with real-valued functions of complex variables. To handle its complex nature, we present two methods for SAR image formation and autofocusing under a framework of alternating minimization. The first method is based on a complex-domain adaptation of the forward-backward splitting algorithm, and is called CFBA. The second method is based on Wirtinger calculus [23,24,25,26], which is designed to deal with differentiable (in the sense of [23]) real-valued functions of complex variables. The second method was introduced in our previous work [27], where it is referred to as the Wirtinger alternating minimization autofocusing (WAMA) method; here we additionally give a thorough discussion of its convergence and show that it can be extended to several other regularizers.

When validating the performance of our methods, we mainly focus on datasets representing the sea surface, i.e., real SAR images containing ships and simulated images of the sea surface containing sea waves and ship wakes. The autofocusing of this kind of SAR image can be beneficial for many practical applications, including ship detection [28], the characterization of ship wakes [29], and so on. The source code used to produce the presented results has been made available on GitHub at https://github.com/zy-zhangc/SAR-autofocusing-with-a-non-convex-penalty (accessed on 3 March 2022).

The rest of this paper is organized as follows. In Section 2, a brief introduction is given to the data acquisition model for SAR and the associated problem formulation. In Section 3, the proposed forward-backward splitting-based SAR autofocusing method is described in detail, then its convergence is analyzed. In Section 4, the WAMA method is reviewed, and some more discussion is added, including its extension to the cases of other regularizers and its convergence. In Section 5, experimental results on both simulated scenes and real SAR images are shown to demonstrate the effectiveness of the proposed method. Finally, conclusions are presented in Section 6.

2. SAR Data Acquisition Model

In this paper, a SAR platform operating in spotlight mode is considered, whose transmitted signal at each azimuth position can be expressed as:

s (t) = Re \{e^{j (ω_{0} t + α t^{2})}\},

(1)

where

ω_{0}

is the carrier frequency, 2

α

is the chirp rate, and t is fast time.
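As a small illustration of Equation (1), the transmitted chirp can be sampled numerically. This is only a sketch: the function name `chirp` and all numerical values below are made up for the example and are not taken from the paper.

```python
import numpy as np

def chirp(t, omega0, alpha):
    """Real transmitted signal s(t) = Re{exp(j(omega0*t + alpha*t^2))}, Eq. (1)."""
    return np.real(np.exp(1j * (omega0 * t + alpha * t**2)))

# Illustrative (hypothetical) parameters, not the paper's radar settings.
t = np.linspace(-1e-6, 1e-6, 2001)   # fast time [s]
omega0 = 2 * np.pi * 10e6            # carrier angular frequency [rad/s]
alpha = 2 * np.pi * 5e12             # half of the chirp rate [rad/s^2]

s = chirp(t, omega0, alpha)
# Instantaneous angular frequency is the derivative of the phase,
# omega0 + 2*alpha*t, so it changes at the chirp rate 2*alpha.
inst_freq = omega0 + 2 * alpha * t
```

The derivative of the phase makes explicit why 2α, rather than α, is called the chirp rate.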

The obtained data

r_{m} (t)

at the mth aperture position and the latent SAR image

F (x, y)

can be associated by:

r_{m} (t) = \int \int F (x, y) e^{- j U (x \cos θ + y \sin θ)} d x d y .

(2)

The region over which the integral is computed is

x^{2} + y^{2} \leq L^{2}

, with L being the radius of the circular patch on the ground to be imaged.

θ

is the look angle, and U is defined by

U = \frac{2}{c} (ω_{0} + 2 α (t - τ_{0})),

(3)

with

τ_{0}

being the demodulation time. Let M be the number of aperture positions, K be the number of range samples in each range line, and N be the number of pixels of the desired SAR image; then, the discretized version of this model is

r_{m} = C_{m} f, m = 1, \dots, M,

(4)

where

r_{m}

and

C_{m}

are the

K \times 1

column vector of the phase history and the

K \times N

observation matrix for the mth aperture position, respectively.

f

is the

N \times 1

column vector of the underlying SAR image. Stacking (4) with respect to all the aperture positions, and considering phase errors as well as possible noise, the model becomes

g = C (ϕ) f + n,

(5)

with

g

being the

M K \times 1

vector of the corrupted phase history,

ϕ

being the

M K \times 1

vector of phase errors, and

n

being the

M K \times 1

vector of Gaussian white noise.

C (ϕ)

is the

M K \times N

corrupted observation matrix. For more details on this signal model, please see [30,31]. In this paper, only the case of 1D phase errors varying in the azimuth direction is dealt with, which leads to [12]:

C_{m} (ϕ) = e^{j ϕ_{m}} C_{m},

(6)

where

C_{m} (ϕ)

and

ϕ_{m}

are the corrupted observation sub-matrix for the mth aperture position and the phase error for the mth aperture position, respectively. The case of 2D separable phase errors as well as the case of 2D non-separable phase errors can be formulated similarly; see [12]. For simplicity, we set

M K = N

in this paper.
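The discretized model (5) with the 1D phase errors of (6) can be simulated in a few lines. The sketch below is illustrative only: the random matrix `C` is a hypothetical stand-in for a real SAR observation matrix, and the helper name `corrupt`, the sizes, and the noise level are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 8, 4          # aperture positions, range samples per position
N = M * K            # number of pixels (the paper sets MK = N)

# Hypothetical stand-in for the observation matrix C (not a real SAR operator).
C = (rng.standard_normal((M * K, N)) + 1j * rng.standard_normal((M * K, N))) / np.sqrt(2 * N)

def corrupt(C, phi, K):
    """Eq. (6): multiply the K rows of block m by exp(j*phi[m])."""
    return np.repeat(np.exp(1j * phi), K)[:, None] * C

# Sparse complex scene with three non-zero pixels.
f = np.zeros(N, dtype=complex)
f[rng.choice(N, 3, replace=False)] = rng.standard_normal(3) + 1j * rng.standard_normal(3)

phi = rng.uniform(-np.pi, np.pi, M)    # one phase error per aperture position
noise = 0.01 * (rng.standard_normal(M * K) + 1j * rng.standard_normal(M * K))
g = corrupt(C, phi, K) @ f + noise     # Eq. (5): corrupted phase history
```

Storing one phase value per aperture position and repeating it over the K range samples reflects the assumption that the errors vary only in the azimuth direction.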

3. The Proposed CFBA Method

3.1. The Optimization Model

We formulate SAR autofocusing as an inverse problem and minimize the following cost function:

J (f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} - λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| f_{i} |}^{2}},

(7)

where

λ

is the regularization parameter and

γ

is the scale parameter for a Cauchy distribution.

The penalty term in (7) is a Cauchy regularization merely imposed on the magnitude of the latent SAR image. Consequently, we refer to it as “magnitude Cauchy regularization”. Like the

l_{p}

norm, it too is a regularization term enforcing statistical sparsity [32], whose effectiveness has already been validated in SAR autofocusing [27], as well as SAR imaging and other inverse problems [33,34].
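For concreteness, the cost function (7) can be evaluated as follows. This is a minimal sketch; the function names `cauchy_penalty` and `cost` are chosen here for illustration, and `C_phi` is assumed to be the already phase-corrupted matrix C(ϕ).

```python
import numpy as np

def cauchy_penalty(f, gamma):
    """-sum_i ln(gamma / (gamma^2 + |f_i|^2)): the penalty of Eq. (7) without lambda."""
    return np.sum(np.log(gamma**2 + np.abs(f)**2) - np.log(gamma))

def cost(f, g, C_phi, lam, gamma):
    """J(f, phi) of Eq. (7), with C_phi standing for the matrix C(phi)."""
    residual = g - C_phi @ f
    return np.vdot(residual, residual).real + lam * cauchy_penalty(f, gamma)
```

Because the penalty depends on f only through the magnitudes |f_i|, rotating the phase of any pixel leaves the regularization term unchanged, which is what the name "magnitude Cauchy" refers to.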

In (7), the desired SAR image

f

and the phase errors

ϕ

are both unknown. By jointly estimating them, SAR image reconstruction and the removal of phase errors are accomplished simultaneously. To do this, we adopt an alternating minimization autofocusing framework similar to [12]. Specifically,

f

and

ϕ

are updated alternately by fixing one of them while optimizing the other. This iterative process will terminate when the relative error between

f^{(n)}

and

f^{(n + 1)}

is smaller than

10^{- 3}

.

3.2. Complex Forward-Backward Splitting-Based Method

Overall, in our alternating minimization framework, we iteratively implement the following two steps:

(1) Image reconstruction step: solve the

f

-sub-problem by

f^{(n + 1)} = \arg min_{f \in C^{N}} J (f, ϕ^{(n)});

(8)

(2) Phase error estimation step: solve the

ϕ

-sub-problem by

ϕ^{(n + 1)} = \arg min_{ϕ \in R^{N}} J (f^{(n + 1)}, ϕ) .

(9)

The first step will be detailed in Section 3.2.1, while the second step will be introduced in Section 3.2.2.

3.2.1. Image Reconstruction Step

Under the framework of alternating minimization, each

f

-sub-problem to be solved is

f^{(n + 1)} = \arg min_{f \in C^{N}} {∥ g - C (ϕ^{(n)}) f ∥}_{2}^{2} - λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| f_{i} |}^{2}} .

(10)

Unlike many other inverse problem formulations in computational imaging, Equation (10) is an optimization problem involving a complex unknown vector, and it thus needs to be handled with appropriate mathematical tools. To this end, we design an iterative solution, which is a complex version of the well-known forward-backward splitting algorithm, and thus we call it complex forward-backward autofocusing (CFBA). Its benefit is that the computation of the complex proximity operators involved can be converted to the computation of real proximity operators acting on the magnitudes of the original complex vector’s components. Therefore, the techniques for real optimization regarding proximal operators can be leveraged. As a result, it is also possible to solve (7) with these techniques when the magnitude Cauchy regularization is replaced by certain non-smooth regularizers. Note that this is not always the case with the WAMA algorithm; the ability to be easily generalised to a number of non-smooth regularizers is a distinct advantage of the CFBA method.

Similar to the forward-backward splitting algorithm for the real case, we recast (10) as

f^{(n + 1)} = \arg min_{f \in C^{N}} H (f) + G (f),

(11)

where

H (f) = {∥ g - C (ϕ^{(n)}) f ∥}_{2}^{2}

and

G (f) = \sum_{i = 1}^{N} r (f_{i}) = - λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| f_{i} |}^{2}}

.

To minimize (11), with a given initial

o^{(0)}

, we iteratively implement the following step:

o^{(k + 1)} = {prox}_{μ G} (o^{(k)} - 2 μ \nabla_{f} H (o^{(k)})),

(12)

with the proximity operator defined by

{prox}_{μ G} (x) = \arg min_{y \in C^{N}} \frac{1}{2} {∥ x - y ∥}_{2}^{2} + μ G (y),

(13)

and the function

\frac{1}{2} {∥ x - y ∥}_{2}^{2} + μ G (y)

on its right-hand side is called a Moreau envelope [35].

f^{(n + 1)}

is given by the final output

o^{(K)}

of this inner iterative loop.

In (12),

H (f)

is a real-valued function of a complex vector variable, which is the setting Wirtinger calculus is designed for. Therefore, instead of the ordinary gradient operator ∇ defined in the real case, here we use the complex gradient operator

\nabla_{f}

defined by Wirtinger calculus. Its definition was first proposed in [23] and further extended in [25]:

\nabla_{f} h = Ω_{f}^{- 1} {(\frac{\partial h}{\partial f})}^{H},

(14)

where

Ω_{f}^{- 1}

is a metric tensor. Using Brandwood’s setting, i.e., letting it be equal to the identity matrix, we have

\nabla_{f} h = {(\frac{\partial h}{\partial f})}^{H} .

(15)

As in (14) and (15), in the rest of this paper, we will use

\nabla_{f}

to denote the complex gradient operator defined by Wirtinger calculus, and ∇ to denote the ordinary gradient operator defined in the real case.

Since the cost function is real-valued, according to [25], we have:

{(\frac{\partial h}{\partial f})}^{H} = {\bar{(\frac{\partial h}{\partial f})}}^{T} = {(\frac{\partial h}{\partial \bar{f}})}^{T} = {(\frac{\partial h}{\partial \bar{f_{1}}}, \dots, \frac{\partial h}{\partial \bar{f_{N}}})}^{T} .

(16)

The last term on the right-hand side of (16) can be computed using the chain rule and the definition of the conjugate

R

-derivative, i.e.,

\frac{\partial h}{\partial \bar{f_{i}}} = \frac{1}{2} (\frac{\partial h}{\partial x_{i}} + i \frac{\partial h}{\partial y_{i}})

, with

x_{i}

and

y_{i}

being the real and imaginary part of

f_{i}

, respectively [25].

It should also be noted that in (12), the step size is written as

2 μ

rather than the commonly used

μ

in the real-case forward-backward splitting algorithm. This choice is implied by the relationship between the complex gradient

\nabla_{f}

and real gradient ∇; see e.g., [23]. Furthermore,

μ

should satisfy

μ \leq \frac{1}{L}

, where L denotes the Lipschitz constant of the gradient of H; this condition is discussed further in Section 3.3.

As a result of Wirtinger calculus, it can be shown that

\nabla_{f} H (f) = C {(ϕ^{(n)})}^{H} (C (ϕ^{(n)}) f - g) .

(17)
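The closed form (17) and the factor 2 in the step size of (12) can be checked numerically. The following sketch compares (17) against central finite differences of H with respect to the real and imaginary parts of f; the random test data and all variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def H(f):
    """Data-fidelity term ||g - C f||_2^2."""
    r = g - C @ f
    return np.vdot(r, r).real

wirtinger_grad = C.conj().T @ (C @ f - g)       # closed form, Eq. (17)

# Central finite differences for dH/dx_i and dH/dy_i, where f_i = x_i + j*y_i.
eps = 1e-6
dH_dx = np.empty(n)
dH_dy = np.empty(n)
for i in range(n):
    e = np.zeros(n, dtype=complex)
    e[i] = eps
    dH_dx[i] = (H(f + e) - H(f - e)) / (2 * eps)
    dH_dy[i] = (H(f + 1j * e) - H(f - 1j * e)) / (2 * eps)

numeric_grad = 0.5 * (dH_dx + 1j * dH_dy)       # conjugate R-derivative of H
real_grad = dH_dx + 1j * dH_dy                  # real gradient packed as x + j*y
```

Here `real_grad` equals twice `wirtinger_grad`, so a real-case gradient step of size μ corresponds to a step of size 2μ along the Wirtinger gradient.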

As for the computation of (13), we first expand it as

{prox}_{μ G} (x) = \arg min_{y \in C^{N}} \frac{1}{2} \sum_{i = 1}^{N} {| x_{i} - y_{i} |}^{2} - μ λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| y_{i} |}^{2}} .

(18)

Then we solve (18) by independently solving

{prox}_{μ λ r} (x_{i}) = \arg min_{y_{i} \in C} \frac{1}{2} {| x_{i} - y_{i} |}^{2} - μ λ ln \frac{γ}{γ^{2} + {| y_{i} |}^{2}}

(19)

for each

i (i = 1, \dots, N)

.

For (19), observe that the logarithm term in the Moreau envelope merely depends on

| y_{i} |

; thus the solution

y_{i}^{*}

must lie on the line passing through the origin and the input

x_{i}

, as long as

x_{i}

is not 0. When searching among all the

y_{i}

for

y_{i}^{*}

, if

| y_{i} |

is fixed, then only the first term of (19) needs to be minimized, since the second term therein is a constant in this case. To minimize the former, we just need to find the point on a circle with radius

| y_{i} |

in the complex plane which is closest to the fixed point

x_{i}

. Obviously, this point would be the one which also lies on the line passing through the origin and

x_{i}

, if

x_{i}

is not 0. Therefore, the desired point will have the same argument as

x_{i}

. Since the choice of

| y_{i} |

is arbitrary, the argument of

y_{i}^{*}

must be the same as that of

x_{i}

.

However, if

x_{i} = 0

, every

y_{i}

on a certain circle of the complex plane will be the solution of (19). In this case, we set the argument of

y_{i}

as 0, as in [35].

Therefore, the solution of (19) can now be split into two steps. The first is to solve the corresponding real optimization problem which gives

| y_{i}^{*} |

:

| y_{i}^{*} | = \arg min_{y \in R} \frac{1}{2} {(| x_{i} | - y)}^{2} - μ λ ln \frac{γ}{γ^{2} + y^{2}} .

(20)

The second step is to let

y_{i}^{*} = \begin{cases} | y_{i}^{*} | e^{j ϕ_{x_{i}}}, & x_{i} \neq 0 \\ | y_{i}^{*} |, & x_{i} = 0 \end{cases}

(21)

where

e^{j ϕ_{x_{i}}} = x_{i} / | x_{i} |

. Similar techniques can also be seen in phase retrieval [35] and SAR imaging [36].

Note that (20) is a non-convex optimization problem. If we compute the derivative of the Moreau envelope on the right-hand side and set it to 0, we obtain a cubic equation. It may have three real roots, corresponding to three stationary points. Here, however, we can add a constraint to simplify the problem. By restricting

γ > \frac{\sqrt{μ λ}}{2}

, this Moreau envelope becomes strictly convex, which implies the existence of exactly one real stationary point [34]. In this case, the corresponding cubic equation must have a single real root and a pair of complex conjugate roots, and since the desired solution is a magnitude, it must be the real root. This real root is given by [33]:

| y_{i}^{*} | = \frac{| x_{i} |}{3} + s + t,

(22)

where

s = \sqrt[3]{\frac{q}{2} + \sqrt{\frac{p^{3}}{27} + \frac{q^{2}}{4}}},

(23)

t = \sqrt[3]{\frac{q}{2} - \sqrt{\frac{p^{3}}{27} + \frac{q^{2}}{4}}},

(24)

p = γ^{2} + 2 μ λ - \frac{| x_{i} |^{2}}{3},

(25)

q = γ^{2} | x_{i} | + \frac{2 | x_{i} |^{3}}{27} - \frac{γ^{2} + 2 μ λ}{3} | x_{i} | .

(26)
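Setting the derivative of the objective in (20) to zero and multiplying by γ² + y² gives the cubic y³ − |x_i| y² + (γ² + 2μλ) y − γ² |x_i| = 0, whose real root is (22)–(26). A minimal sketch of the resulting complex proximity operator is given below; the function names and the clipping of the discriminant against rounding errors are choices made for this example rather than details taken from the paper.

```python
import numpy as np

def cauchy_prox_magnitude(a, gamma, mu_lam):
    """Real root (22)-(26) of the cubic for input magnitudes a = |x_i| >= 0."""
    p = gamma**2 + 2 * mu_lam - a**2 / 3
    q = gamma**2 * a + 2 * a**3 / 27 - (gamma**2 + 2 * mu_lam) * a / 3
    # Clipping guards against tiny negative values caused by rounding.
    root = np.sqrt(np.maximum(p**3 / 27 + q**2 / 4, 0.0))
    s = np.cbrt(q / 2 + root)
    t = np.cbrt(q / 2 - root)
    return a / 3 + s + t

def cauchy_prox(x, gamma, mu, lam):
    """Complex prox of Eq. (19): shrink the magnitude, keep the phase (Eq. (21))."""
    if gamma <= np.sqrt(mu * lam) / 2:
        raise ValueError("need gamma > sqrt(mu*lam)/2 for a unique stationary point")
    x = np.asarray(x, dtype=complex)
    a = np.abs(x)
    mag = cauchy_prox_magnitude(a, gamma, mu * lam)
    # Unit phasor x/|x|; for x = 0 the argument is set to 0, i.e. phasor 1.
    phasor = np.ones_like(x)
    nz = a > 0
    phasor[nz] = x[nz] / a[nz]
    return mag * phasor
```

For example, `cauchy_prox(np.array([2.0, 0.0, -1 + 1j]), gamma=1.0, mu=0.5, lam=1.0)` returns values with the same arguments as the inputs and strictly smaller magnitudes for the non-zero entries.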

3.2.2. Optimization of the Phase Errors

After obtaining each

f^{(n + 1)}

, we use it to compute

ϕ^{(n + 1)}

. For 1D phase errors varying along the azimuth direction, the vector of phase errors can be updated by solving the following sequence of sub-problems concerning its components:

ϕ_{m}^{(n + 1)} = \arg min_{ϕ_{m}} {∥ g_{m} - e^{j ϕ_{m}} C_{m} f^{(n + 1)} ∥}_{2}^{2}, m = 1, \dots, M,

(27)

with

g_{m}

and

C_{m}

being the parts of

g

and

C

corresponding to the mth aperture position. According to [12], the solution is

ϕ_{m}^{(n + 1)} = arctan (\frac{Im {{[f^{(n + 1)}]}^{H} C_{m}^{H} g_{m}}}{Re {{[f^{(n + 1)}]}^{H} C_{m}^{H} g_{m}}}) .

(28)

The corrupted observation matrix can then be estimated by:

C_{m} (ϕ_{m}^{(n + 1)}) = e^{j ϕ_{m}^{(n + 1)}} C_{m} .

(29)
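Expanding the cost in (27) for a fixed image shows that the only term depending on ϕ_m is −2 Re{e^{−jϕ_m} z_m}, with z_m = [f^{(n+1)}]^H C_m^H g_m, so the cost is minimized when ϕ_m equals the argument of z_m. The sketch below implements this update with a four-quadrant angle; the function names and the row-block layout of `C` and `g` (M consecutive blocks of K rows) are assumptions made for the illustration.

```python
import numpy as np

def estimate_phase_errors(f, g, C, M, K):
    """Per-aperture phase minimizing ||g_m - exp(j*phi_m) C_m f||^2."""
    Cf = (C @ f).reshape(M, K)               # row m holds C_m f
    gm = g.reshape(M, K)                     # row m holds g_m
    z = np.sum(np.conj(Cf) * gm, axis=1)     # z_m = f^H C_m^H g_m
    return np.angle(z)                       # four-quadrant argument of z_m

def apply_phase(C, phi, K):
    """Eq. (29): scale block m of C by exp(j*phi[m])."""
    return np.repeat(np.exp(1j * phi), K)[:, None] * C
```

In the noise-free case with the true image, z_m = e^{jϕ_m} ∥C_m f∥², so the estimate returns the phase error exactly (within the interval (−π, π]).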

The cases of 2D phase errors varying in both range direction and cross-range direction can also be solved by similar methods; see [12] for more details. The whole process of the proposed CFBA method is summarized in Algorithm 1.

We point out here that CFBA differs from [33] mainly in two aspects. First, the two methods do not address the same problem. The method in [33] is applied to SAR imaging, while CFBA addresses a different and more challenging problem, i.e., SAR autofocusing, which removes phase errors and reconstructs the SAR image simultaneously. Second, only part of the solving process in CFBA is inspired by [33]. In each f-sub-problem of CFBA, techniques similar to [35,36] are used to tackle the complex nature of the problem, which makes it possible to handle the magnitudes and the arguments of the components of the complex solution separately. On this basis, Formulas (22)–(26), which follow [33], are used to compute the magnitudes.

Algorithm 1: CFBA

Initialize $n = 0$, $f^{(0)} = C^{H} g$, $ϕ^{(0)} = 0$, $C (ϕ^{(0)}) = C$, and set the values of $γ$, $λ$, and $μ$ according to $μ \in (0, \frac{1}{L})$ and $γ \geq \frac{\sqrt{μ λ}}{2}$
while $n < 300$ and $∥ f^{(n + 1)} - f^{(n)} ∥ / ∥ f^{(n)} ∥ > 0.001$ do
    1. Compute $f^{(n + 1)}$ by the complex forward-backward splitting method, i.e.,
    Set $k = 0$ and choose the initial $o^{(0)}$
    while $k < 500$ and $∥ o^{(k + 1)} - o^{(k)} ∥ / ∥ o^{(k)} ∥ > 0.001$ do
        Find $o^{(k + 1)} = {prox}_{μ G} (o^{(k)} - 2 μ C {(ϕ^{(n)})}^{H} (C (ϕ^{(n)}) o^{(k)} - g))$ by (20)–(26)
        $k = k + 1$
    end while
    Set $f^{(n + 1)} = o^{(k)}$
    2. Compute $ϕ_{m}^{(n + 1)}$ by (28) for $m = 1, \dots, M$
    3. Compute $C_{m} (ϕ_{m}^{(n + 1)})$ by (29) for $m = 1, \dots, M$
    4. $n = n + 1$
end while
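The overall loop of Algorithm 1 can be sketched in NumPy as follows. This is a hedged illustration, not the authors' released implementation: the function names, the warm start of the inner loop with the previous image, and the step size μ = 1/L with L = 2∥C∥₂² (the Lipschitz constant of the real gradient of H) are assumptions made here, and the phase update takes the argument of f^H C_m^H g_m, which minimizes the cost in (27).

```python
import numpy as np

def cauchy_prox(x, gamma, mu_lam):
    """Magnitude Cauchy prox: Cardano root (22)-(26) on |x|, phase of x kept."""
    a = np.abs(x)
    p = gamma**2 + 2 * mu_lam - a**2 / 3
    q = gamma**2 * a + 2 * a**3 / 27 - (gamma**2 + 2 * mu_lam) * a / 3
    root = np.sqrt(np.maximum(p**3 / 27 + q**2 / 4, 0.0))
    mag = a / 3 + np.cbrt(q / 2 + root) + np.cbrt(q / 2 - root)
    return mag * np.exp(1j * np.angle(x))    # np.angle(0) is 0

def cost(f, g, C_phi, lam, gamma):
    """J(f, phi) of Eq. (7) for a given corrupted matrix C_phi."""
    r = g - C_phi @ f
    return np.vdot(r, r).real + lam * np.sum(np.log(gamma**2 + np.abs(f)**2) - np.log(gamma))

def cfba(g, C, M, K, lam, gamma, outer_max=300, inner_max=500, tol=1e-3):
    """Alternate between the f-update (forward-backward) and the phase update."""
    # Assumed step size mu = 1/L with L = 2*||C||_2^2; scaling rows by
    # unit-modulus factors does not change the spectral norm of C.
    mu = 1.0 / (2 * np.linalg.norm(C, 2) ** 2)
    if gamma <= np.sqrt(mu * lam) / 2:
        raise ValueError("gamma too small for a convex proximal sub-problem")
    phi = np.zeros(M)
    C_phi = C.copy()
    f = C.conj().T @ g                                    # f^(0) = C^H g
    for _ in range(outer_max):
        o = f.copy()                                      # warm start (an assumption)
        for _ in range(inner_max):
            z = o - 2 * mu * C_phi.conj().T @ (C_phi @ o - g)   # Eq. (32)
            o_new = cauchy_prox(z, gamma, mu * lam)             # Eq. (31)
            change = np.linalg.norm(o_new - o) / max(np.linalg.norm(o), 1e-12)
            o = o_new
            if change <= tol:
                break
        # Phase update: argument of f^H C_m^H g_m for each aperture position.
        Cf = (C @ o).reshape(M, K)
        phi = np.angle(np.sum(np.conj(Cf) * g.reshape(M, K), axis=1))
        C_phi = np.repeat(np.exp(1j * phi), K)[:, None] * C     # Eq. (29)
        outer_change = np.linalg.norm(o - f) / max(np.linalg.norm(f), 1e-12)
        f = o
        if outer_change <= tol:
            break
    return f, phi
```

With this step size, each proximal step and each phase update can only decrease the cost (7), so the value of J at the returned pair is no larger than at the initialization.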

3.3. Convergence Analysis

For the proposed CFBA method, the issue of convergence is twofold. That is to say, the discussion needs to cover the convergence of the inner complex forward-backward splitting algorithm as well as the convergence of the outer alternating minimization algorithm.

3.3.1. Convergence of the Inner Complex Forward-Backward Splitting Algorithm

In the nth image reconstruction step, we find

f^{(n + 1)} = \arg min_{f \in C^{N}} {∥ g - C (ϕ^{(n)}) f ∥}_{2}^{2} - λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| f_{i} |}^{2}},

(30)

and in each step of the complex FB splitting algorithm which solves (30) iteratively, we compute

o^{(k + 1)} = \arg min_{o \in C^{N}} \frac{1}{2} {∥ o - z^{(k)} ∥}_{2}^{2} - μ λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| o_{i} |}^{2}},

(31)

where

z^{(k)} = o^{(k)} - 2 μ C {(ϕ^{(n)})}^{H} (C (ϕ^{(n)}) o^{(k)} - g) .

(32)

Since

ϕ^{(n)}

is fixed for the nth f-sub-problem, for simplicity of notation, we will denote

C (ϕ^{(n)})

by

C^{(n)}

in this subsection. Using the notations of (11), we denote the cost function in (30) as

J_{n} (f) = H (f) + G (f)

, with

H (f) = {∥ g - C^{(n)} f ∥}_{2}^{2}

and

G (f) = - λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {| f_{i} |}^{2}}

.

Now, note that for an arbitrary N-dimensional

f = {(f_{1}, \dots, f_{N})}^{T} = {(x_{1} + i y_{1}, \dots, x_{N} + i y_{N})}^{T} \in C^{N}

, we can obtain a

2 N

-dimensional real vector

\tilde{f} = {(x_{1}, \dots, x_{N}, y_{1}, \dots, y_{N})}^{T} \in R^{2 N}

. Conversely, for an arbitrary

2 N

-dimensional

\tilde{f} = {({\tilde{f}}_{1}, \dots, {\tilde{f}}_{N}, {\tilde{f}}_{N + 1}, \dots, {\tilde{f}}_{2 N})}^{T} \in R^{2 N}

, we can obtain an N-dimensional complex vector

f = {(f_{1}, \dots, f_{N})}^{T} = {({\tilde{f}}_{1} + i {\tilde{f}}_{N + 1}, \dots, {\tilde{f}}_{N} + i {\tilde{f}}_{2 N})}^{T} \in C^{N}

. For simplicity, we denote

{\tilde{f}}_{R} = {({\tilde{f}}_{1}, \dots, {\tilde{f}}_{N})}^{T}

and

{\tilde{f}}_{I} = {({\tilde{f}}_{N + 1}, \dots, {\tilde{f}}_{2 N})}^{T}

for

\tilde{f}

. With these notations, we give the following lemma.

Lemma 1.

J_{n} (f)

can be rewritten as

{\tilde{J}}_{n} (\tilde{f})

, which is a real analytic function.

The proof of this lemma is given in Appendix A.

Now, since

\tilde{J_{n}} (\tilde{f})

is real analytic, it satisfies the Kurdyka–Łojasiewicz (KL) property [37], which means that for every

\tilde{f}^{'} \in R^{2 N}

and every bounded neighborhood U of

\tilde{f}^{'}

, there exist

κ \in (0, + \infty)

,

η \in (0, + \infty)

and

θ \in [0, 1)

such that

∥ \nabla \tilde{J_{n}} (\tilde{f}) ∥ \geq κ | \tilde{J_{n}} (\tilde{f}) - \tilde{J_{n}} (\tilde{f}^{'}) |^{θ}

(33)

for every

\tilde{f} \in U \cap \{\tilde{f} : | \tilde{J_{n}} (\tilde{f}) - \tilde{J_{n}} (\tilde{f}^{'}) | \leq η\}

.

The proof of Lemma 1 also suggests how the convergence of the CFBA algorithm can be proved from the perspective of real vector variables, because

J_{n} (f)

(a function of the complex

f

) can now be viewed as

\tilde{J_{n}} (\tilde{f})

(a function of real vector variable

\tilde{f}

). If the real forward-backward splitting algorithm minimizing

\tilde{J_{n}} (\tilde{f})

can be connected with the proposed complex forward-backward splitting algorithm minimizing

J_{n} (f)

, then the convergence analysis of the latter will benefit from the convergence analysis of the former.

Theorem 1.

As

k \to + \infty

,

o^{(k)}

generated by (31) will converge to some critical point of

J_{n} (f)

. Moreover, denote by

f^{*}

the global minimizer of

J_{n} (f)

; then for each

r > 0

, there exist

u \in (0, r),

δ > 0

such that when the inequalities

min J_{n} (f) < J_{n} (o^{(0)}) < δ + min J_{n} (f)

and

∥ o^{(0)} - f^{*} ∥ < u

hold, the sequence

o^{(k)}

generated for each

J_{n} (f)

will converge to some

o^{*}

with

o^{(k)} \in B (f^{*}, r)

for an arbitrary k and

J_{n} (o^{*}) = min J_{n} (f)

.

Proof.

Let us first formulate the real forward-backward splitting algorithm minimizing

{\tilde{J}}_{n} (\tilde{f})

with respect to

\tilde{f}

. First denote

u^{(0)} = \tilde{o^{(0)}}

. Then in each step, this algorithm finds

\begin{matrix} u^{(k + 1)} = {prox}_{μ \tilde{G}} (w^{(k)}) = \arg min_{u \in R^{2 N}} \frac{1}{2} {∥ u - w^{(k)} ∥}_{2}^{2} - μ λ \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {∥ s_{(i)} u ∥}^{2}}, \end{matrix}

(34)

where

w^{(k)} = u^{(k)} - 2 μ {\tilde{C^{(n)}}}^{T} (\tilde{C^{(n)}} u^{(k)} - \tilde{g}) .

(35)

Observe that

{(C^{(n)})}^{H} f = {(\bar{c_{11}} f_{1} + \dots + \bar{c_{N 1}} f_{N}, \dots, \bar{c_{1 N}} f_{1} + \dots + \bar{c_{N N}} f_{N})}^{T},

(36)

and

\tilde{{(C^{(n)})}^{H} f} = (\begin{matrix} {(c_{11})}_{R} x_{1} + {(c_{11})}_{I} y_{1} + \dots + {(c_{N 1})}_{R} x_{N} + {(c_{N 1})}_{I} y_{N} \\ ⋮ \\ {(c_{1 N})}_{R} x_{1} + {(c_{1 N})}_{I} y_{1} + \dots + {(c_{N N})}_{R} x_{N} + {(c_{N N})}_{I} y_{N} \\ {(- c_{11})}_{I} x_{1} + {(c_{11})}_{R} y_{1} + \dots + {(- c_{N 1})}_{I} x_{N} + {(c_{N 1})}_{R} y_{N} \\ ⋮ \\ {(- c_{1 N})}_{I} x_{1} + {(c_{1 N})}_{R} y_{1} + \dots + {(- c_{N N})}_{I} x_{N} + {(c_{N N})}_{R} y_{N} \end{matrix}) .

(37)

This can be exactly decomposed as

\tilde{{(C^{(n)})}^{H} f} = {\tilde{C^{(n)}}}^{T} \tilde{f},

(38)

where

\tilde{C^{(n)}}

is as in (A5).
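The identity (38) can be verified numerically. Since (A5) is not reproduced in this section, the sketch below assumes the usual block form [[C_R, −C_I], [C_I, C_R]] for the real counterpart of C, which is consistent with the entries written out in (37); the helper names `to_real` and `real_matrix` are hypothetical.

```python
import numpy as np

def to_real(v):
    """Map f in C^N to (Re f, Im f) in R^{2N}."""
    return np.concatenate([v.real, v.imag])

def real_matrix(C):
    """Assumed block form [[C_R, -C_I], [C_I, C_R]] of the real counterpart of C."""
    return np.block([[C.real, -C.imag], [C.imag, C.real]])

rng = np.random.default_rng(2)
N = 4
C = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
C_tilde = real_matrix(C)

lhs = to_real(C.conj().T @ f)     # real form of C^H f, as in (37)
rhs = C_tilde.T @ to_real(f)      # transposed real matrix applied to the real form of f
```

The two vectors agree up to rounding, and the same block matrix also maps the real form of f to the real form of C f, which is why the real and complex forward steps coincide.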

Therefore, we have

w^{(0)} = \tilde{o^{(0)}} - 2 μ {\tilde{C^{(n)}}}^{T} (\tilde{C^{(n)}} \tilde{o^{(0)}} - \tilde{g}) = \tilde{o^{(0)}} - 2 μ \tilde{\nabla_{f} H (o^{(0)})} = \tilde{z^{(0)}},

(39)

with

z^{(0)}

defined by (32) for CFBA.

If now we rewrite the

u

in (34) by

u = \tilde{o}

, then the Moreau envelope therein becomes a function of

\tilde{o}

. Additionally, if this function is rewritten as a function of

o

, the result will exactly take the form of the Moreau envelope in (31). Moreover, we have

\nabla \tilde{J_{n}} (\tilde{f}) = {({(\frac{\partial \tilde{J_{n}}}{\partial {\tilde{f}}_{R}})}^{T}, {(\frac{\partial \tilde{J_{n}}}{\partial {\tilde{f}}_{I}})}^{T})}^{T}

by the definition of the real gradient and

\nabla_{f} J_{n} (f) = \frac{1}{2} (\frac{\partial \tilde{J_{n}}}{\partial {\tilde{f}}_{R}} + i \frac{\partial \tilde{J_{n}}}{\partial {\tilde{f}}_{I}})

by the definition in Wirtinger calculus. Therefore, if

f

is a stationary point of

J_{n} (f)

, then the corresponding

\tilde{f}

is a stationary point of

\tilde{J_{n}} (\tilde{f})

. Since we have forced convexity of the Moreau envelopes in (34) and (31) by restricting the range of parameters, they will each have only one stationary point. Due to these three conclusions,

u^{(1)} = \tilde{o^{(1)}}

holds.

By induction, following similar deduction, a sequence

o^{(k)}

complying with (31) in the CFBA algorithm and meanwhile satisfying

u^{(k)} = \tilde{o^{(k)}}

for all k can be obtained. Therefore, the convergence of the CFBA algorithm can be analyzed equivalently by discussing this real FB splitting algorithm.

The analysis above also implies that the nth f-sub-problem can be solved by finding the corresponding real solution for

\tilde{J_{n}} (\tilde{f})

and transforming it back to the desired complex solution. However, the proposed CFBA algorithm is more compact in form, since it deals with N dimensional vectors instead of

2 N

dimensional vectors, and it does not require the construction of

\tilde{C}

based on C.

Now,

{\tilde{J}}_{n} (\tilde{f}) = \tilde{G} (\tilde{f}) + \tilde{H} (\tilde{f})

is proper and lower semicontinuous, bounded from below if hyper-parameters are chosen appropriately (by selecting

λ > 0

and

γ < 1

, we have

{\tilde{J}}_{n} (\tilde{f}) > λ N ln γ

), and it satisfies the KL property.

\tilde{H} (\tilde{f})

is finite-valued, differentiable, and has a Lipschitz continuous gradient. Moreover,

\tilde{G} (\tilde{f})

is continuous on its domain. That is to say, all the conditions in Theorem 5.1 of [38] are satisfied. Therefore, according to that theorem, with the conditions aforementioned and the setting

μ \leq \frac{1}{L}

(which guarantees the monotonic decreasing nature of

{\tilde{J}}_{n} (u^{(k)})

), we come to the conclusion that the iterates

u^{(k)}

will converge to some critical point of

{\tilde{J}}_{n} (\tilde{f})

. Therefore, equivalently, the iterates

o^{(k)}

produced by the proposed CFBA method will converge to some critical point of

J_{n} (f)

.

Moreover, let us denote by

{\tilde{f}}^{*}

the global minimizer of

{\tilde{J}}_{n} (\tilde{f})

. According to Theorem 2.12 in [38], for each

r > 0

, there exist

u \in (0, r), δ > 0

such that when the inequalities

∥ u^{(0)} - {\tilde{f}}^{*} ∥ < u

and

min {\tilde{J}}_{n} (\tilde{f}) < {\tilde{J}}_{n} (u^{(0)}) < δ + min {\tilde{J}}_{n} (\tilde{f})

hold, the sequence

u^{(k)}

generated for each

{\tilde{J}}_{n} (\tilde{f})

will converge to some

u^{*}

with

u^{(k)} \in B ({\tilde{f}}^{*}, r)

for arbitrary k and

{\tilde{J}}_{n} (u^{*}) = min {\tilde{J}}_{n} (\tilde{f})

. That is to say, under some assumptions about the initial value

u^{(0)}

, convergence of the sequence

u^{(k)}

to a global minimizer of

{\tilde{J}}_{n} (\tilde{f})

can be obtained. Equivalently, for each

r > 0

, there exist

u \in (0, r), δ > 0

such that when the inequalities

∥ o^{(0)} - f^{*} ∥ < u

and

min J_{n} (f) < J_{n} (o^{(0)}) < δ + min J_{n} (f)

hold, the sequence

o^{(k)}

generated for each

J_{n} (f)

will converge to some

o^{*}

with

o^{(k)} \in B (f^{*}, r)

for arbitrary k and

J_{n} (o^{*}) = min J_{n} (f)

, i.e., the iterates

o^{(k)}

produced by the proposed CFBA method will converge to a global minimizer of

J_{n} (f)

. □

In fact, the above proof, which is elaborated from a real perspective, can also be done alternatively by working on the complex iterates

o^{(k)}

themselves. However, this process is more complicated (see Appendix B).

3.3.2. Convergence of the Outer Alternating Minimization Method

Based on Theorem 1, the following result can be obtained:

Theorem 2.

For each

f

-sub-problem, if the assumptions related to the initial value

o^{(0)}

stated in Theorem 1 are satisfied, with

f^{(n)}

and

ϕ^{(n)}

computed by CFBA,

J (f^{(n)}, ϕ^{(n)})

will converge to a certain value (not necessarily equal to

inf J (f, ϕ)

).

Proof.

For each

f

-sub-problem, if the assumptions related to the initial value

o^{(0)}

stated in the last section are satisfied, the sequence

o^{(k)}

will converge to a global minimizer of

J_{n} (f)

. In this case,

J (f^{(n + 1)}, ϕ^{(n)}) \leq J (f^{(n)}, ϕ^{(n)})

. Additionally, since each

ϕ

-sub-problem has a closed form solution, we have

J (f^{(n + 1)}, ϕ^{(n + 1)}) \leq J (f^{(n + 1)}, ϕ^{(n)})

. As a result,

J (f^{(n + 1)}, ϕ^{(n + 1)}) \leq J (f^{(n)}, ϕ^{(n)})

holds for every n, i.e.,

J (f^{(n)}, ϕ^{(n)})

is a monotonically decreasing sequence. Since it is also bounded below, it will converge to a certain value, though not necessarily equal to

inf J (f, ϕ)

. □

If stronger assumptions are satisfied, better results of convergence can be obtained. For instance, if the five-point property [39,40] holds, i.e., if

J (f, ϕ) + J (f, ϕ^{(n)}) \geq J (f, ϕ^{(n + 1)}) + J (f^{(n + 1)}, ϕ^{(n)})

(40)

holds for every

f, ϕ

, and n, then

lim_{n \to + \infty} J (f^{(n)}, ϕ^{(n)}) = inf J (f, ϕ) .

(41)

4. Wirtinger Alternating Minimization Autofocusing

4.1. The Original Method

In this section, we review the Wirtinger alternating minimization autofocusing (WAMA) method originally proposed in [27]. After that, we briefly explain how to extend this method to several other cost functions with the same fidelity term but with different regularizers. Finally, we discuss the convergence of this method, a topic not covered in previous publications.

The WAMA method also adopts the framework of alternating minimization, including two types of sub-problems to be solved. For each

f

-sub-problem formulated in (10), Wirtinger calculus is used to solve it. On the one hand, Wirtinger calculus is a powerful theory covering the analysis of real-valued functions of complex variables, and within which many real optimization problems can have their complex counterparts defined. On the other hand, it is also a rather elegant approach, due to its ability to address the problem in a concise way. Namely, there is no need to expand the complex variables as real vectors in the computational process.

Specifically, to solve (10), we compute the complex gradient of the cost function therein directly using Wirtinger calculus. Using the notation in Section 3.2.1, we have

{(\nabla_{f} G (f))}_{i} = \frac{λ f_{i}}{γ^{2} + {| f_{i} |}^{2}}, i = 1, \dots, N .

(42)

Since

\nabla_{f} H (f)

is already given in (17), the complex gradient of (10) can be written as:

\nabla_{f} J (f, ϕ) = C {(ϕ^{(n)})}^{H} (C (ϕ^{(n)}) f - g) + λ W (f) f,

(43)

where

W (f) = diag (s),

(44)

s_{i} = \frac{1}{γ^{2} + {| f_{i} |}^{2}}, i = 1, \dots, N .

(45)

Now we set (43) to zero, according to the necessary and sufficient condition for a stationary point of a real-valued complex function [23,25], which leads to

[C {(ϕ^{(n)})}^{H} C (ϕ^{(n)}) + λ W (f)] f = C {(ϕ^{(n)})}^{H} g .

(46)

It is worth pointing out that even though the exact solution of (46) can be obtained, that solution is not necessarily the global minimum of (10) due to the non-convexity of the Cauchy penalty.

Now we rewrite (46) as

A f = b

, where

A = λ W (f) + C {(ϕ^{(n)})}^{H} C (ϕ^{(n)})

and

b = C {(ϕ^{(n)})}^{H} g

. Since

W (f)

depends on

f

, so does

A

. This makes (46) nonlinear with respect to

f

, and it is difficult to find its closed-form solution. However, if we sacrifice some accuracy and approximate the

f

in

W (f)

with the

f

computed during the last iteration of the alternating minimization framework,

A

is converted into a constant matrix, and thus

A f = b

becomes a linear system of equations which can be solved efficiently.

That is to say, when computing an unknown

f^{(n + 1)}

, the actually solved equation is

[C {(ϕ^{(n)})}^{H} C (ϕ^{(n)}) + λ W (f^{(n)})] f^{(n + 1)} = C {(ϕ^{(n)})}^{H} g .

(47)

This equation can be viewed as a fixed-point algorithm with a single iterative step, and its solution can be efficiently obtained by using the conjugate gradient (CG) algorithm [41]. The experimental results in Section 5 imply that the obtained solution is sufficiently good.
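The single f-step in (47) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `cauchy_weights`, `conjugate_gradient`, and `wama_f_update` are hypothetical, `C` is assumed to be a dense matrix standing in for $C (ϕ^{(n)})$, and the CG routine is a plain textbook version for Hermitian positive-definite systems.

```python
import numpy as np

def cauchy_weights(f, gamma):
    """Diagonal of W(f) from (44)-(45): s_i = 1 / (gamma^2 + |f_i|^2)."""
    return 1.0 / (gamma**2 + np.abs(f)**2)

def conjugate_gradient(apply_A, b, x0, n_iter=200, tol=1e-12):
    """Plain CG for a Hermitian positive-definite operator (hypothetical helper)."""
    x = x0.copy()
    r = b - apply_A(x)
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(n_iter):
        # Stop once the residual is small relative to the right-hand side.
        if np.sqrt(rs_old) <= tol * np.linalg.norm(b):
            break
        Ap = apply_A(p)
        alpha = rs_old / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def wama_f_update(C, g, f_prev, lam, gamma):
    """One f-step of (47): [C^H C + lam * W(f_prev)] f = C^H g, solved by CG."""
    s = cauchy_weights(f_prev, gamma)  # weights frozen at the previous iterate
    apply_A = lambda x: C.conj().T @ (C @ x) + lam * s * x
    return conjugate_gradient(apply_A, C.conj().T @ g, f_prev)
```

Freezing the weights at `f_prev` is what turns the nonlinear equation (46) into the linear system (47); the matrix is never formed explicitly, only applied to vectors.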

As for the

ϕ

-sub-problems, the solutions are the same as those of CFBA introduced in Section 3.1, so they are not repeated here. The whole process of the WAMA method can be summarized as Algorithm 2 as follows:

Algorithm 2: WAMA

Initialize $n = 0$ , $f^{(0)} = C^{H} g$ , $ϕ^{(0)} = 0$ , $C (ϕ^{(0)}) = C$ , and set the values of $γ$ and $λ$
while $n < 300$ and $∥ f^{(n + 1)} - f^{(n)} ∥ / ∥ f^{(n)} ∥ > 0.001$ do
1. Compute $f^{(n + 1)}$ by finding the solution of (47) via CG
2. Compute $ϕ_{m}^{(n + 1)}$ by (28)
3. Compute $C (ϕ_{m}^{(n + 1)})$ by (29)
4. $n = n + 1$
end while

4.2. Extension to Several Other Regularizers

As an extension, the same computational processes can also be followed to handle the cases where the magnitude Cauchy regularization in (7) is replaced by some other

R

-differentiable regularizers [25]. For those cases, Equation (47) will also be obtained, despite the fact that the involved

s

is different. We give several examples as follows:

(1): The pth power of an approximate $l_{p}$ norm
In this case,

$G (f) = λ \sum_{i = 1}^{N} {(| f_{i} |^{2} + β)}^{\frac{p}{2}},$

(48)

and now

$s_{i} = \frac{p}{2 {(| f_{i} |^{2} + β)}^{1 - \frac{p}{2}}}, i = 1, \dots, N .$

(49)

This yields the same algorithm as the one given in [12] for this case, although no reference to the literature on Wirtinger calculus is made therein. Therefore, this derivation can be viewed as an alternative interpretation of SDA;
(2): Approximate total variation
For approximate total variation, the situation is more complicated. However, the result can still be incorporated in the form of (47). Let $F$ be the 2D $a \times a$ matrix form of the N-dimensional vector $f (N = a \times a)$ ; then

$G (f) = λ \sum_{i = 1}^{a} \sum_{j = 1}^{a} \sqrt{| {(\nabla_{i} F)}_{i, j} |^{2} + {| {(\nabla_{j} F)}_{i, j} |}^{2} + β},$

(50)

with

${(\nabla_{i} F)}_{i, j} = \begin{cases} F_{i, j} - F_{i - 1, j}, & i > 1 \\ 0, & i = 1 \end{cases}$

(51)

${(\nabla_{j} F)}_{i, j} = \begin{cases} F_{i, j} - F_{i, j - 1}, & j > 1 \\ 0, & j = 1 \end{cases}$

(52)

Now

$W (f) = W^{\prime} D^{\prime} + W^{\prime} D^{\prime\prime} + W^{\prime\prime} D^{\prime\prime\prime} + W^{\prime\prime\prime} D^{\prime\prime\prime\prime},$

(53)

$W^{\prime} = diag (vec (S^{\prime})), W^{\prime\prime} = diag (vec (S^{\prime\prime})), W^{\prime\prime\prime} = diag (vec (S^{\prime\prime\prime})),$

(54)

where vec is the operation which turns a matrix into a column vector by stacking its columns in order. Additionally,

${(S^{\prime})}_{i, j} = \frac{1}{2 \sqrt{| {(\nabla_{i} F)}_{i, j} |^{2} + {| {(\nabla_{j} F)}_{i, j} |}^{2} + β}},$

(55)

${(S^{\prime\prime})}_{i, j} = \frac{1}{2 \sqrt{| {(\nabla_{i} F)}_{i, j + 1} |^{2} + {| {(\nabla_{j} F)}_{i, j + 1} |}^{2} + β}},$

(56)

${(S^{\prime\prime\prime})}_{i, j} = \frac{1}{2 \sqrt{| {(\nabla_{i} F)}_{i + 1, j} |^{2} + {| {(\nabla_{j} F)}_{i + 1, j} |}^{2} + β}} .$

(57)

As for $D^{\prime}, D^{\prime\prime}, D^{\prime\prime\prime}$ , and $D^{\prime\prime\prime\prime}$ , they are matrices that contain only 0, 1, and −1 and are constructed so that they realize the following relations:

$\begin{matrix} D^{\prime} f = vec ({(\nabla_{i} F)}_{i, j}), D^{\prime\prime} f = vec ({(\nabla_{j} F)}_{i, j}), \\ D^{\prime\prime\prime} f = - vec ({(\nabla_{j} F)}_{i, j + 1}), D^{\prime\prime\prime\prime} f = - vec ({(\nabla_{i} F)}_{i + 1, j}); \end{matrix}$

(58)
(3): Welsh potential
In this case, an $l_{2} - l_{0}$ regularization [42] is imposed on the magnitude of f, and we have

$G (f) = λ \sum_{i = 1}^{N} (1 - e^{- \frac{| f_{i} |^{2}}{2 δ^{2}}}),$

(59)

and

$s_{i} = \frac{e^{- \frac{| f_{i} |^{2}}{2 δ^{2}}}}{2 δ^{2}}, i = 1, \dots, N;$

(60)
(4): Geman–McClure potential
In this case, another variant of $l_{2} - l_{0}$ regularization [42] is imposed on the magnitude of f, and we have

$G (f) = λ \sum_{i = 1}^{N} \frac{| f_{i} |^{2}}{2 δ^{2} + {| f_{i} |}^{2}}$

(61)

and

$s_{i} = \frac{2 δ^{2}}{{(2 δ^{2} + | f_{i} |^{2})}^{2}}, i = 1, \dots, N .$

(62)
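For the pointwise regularizers, the only thing that changes in (47) is the weight vector $s$. The following NumPy sketch collects the weights (45), (49), (60), and (62) in one place and pairs each with its penalty term divided by $λ$. The function names and default parameter values are hypothetical and chosen only for illustration; the Cauchy term is written as $- ln \frac{γ}{γ^{2} + | f_{i} |^{2}}$.

```python
import numpy as np

def weights(f, kind, gamma=1.0, beta=1e-6, p=1.0, delta=1.0):
    """Weight vector s used to build W = diag(s) in (47)."""
    m2 = np.abs(f)**2
    if kind == "cauchy":          # (45)
        return 1.0 / (gamma**2 + m2)
    if kind == "lp":              # (49)
        return p / (2.0 * (m2 + beta)**(1.0 - p / 2.0))
    if kind == "welsh":           # (60)
        return np.exp(-m2 / (2.0 * delta**2)) / (2.0 * delta**2)
    if kind == "geman_mcclure":   # (62)
        return 2.0 * delta**2 / (2.0 * delta**2 + m2)**2
    raise ValueError(f"unknown regularizer: {kind}")

def penalty_term(m2, kind, gamma=1.0, beta=1e-6, p=1.0, delta=1.0):
    """Per-pixel penalty (G divided by lambda) as a function of m2 = |f_i|^2."""
    if kind == "cauchy":
        return -np.log(gamma / (gamma**2 + m2))
    if kind == "lp":              # (48)
        return (m2 + beta)**(p / 2.0)
    if kind == "welsh":           # (59)
        return 1.0 - np.exp(-m2 / (2.0 * delta**2))
    if kind == "geman_mcclure":   # (61)
        return m2 / (2.0 * delta**2 + m2)
    raise ValueError(f"unknown regularizer: {kind}")
```

In every case the weight equals the derivative of the per-pixel penalty with respect to $| f_{i} |^{2}$, which is why the same linear system (47) appears for all of them.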

4.3. Convergence Analysis

The approximation (47) used for the solution of each image reconstruction step adds much difficulty to the discussion of the convergence of the WAMA method. However, the WAMA method can be analyzed from another perspective, which renders its convergence analysis tractable.

Theorem 3.

As

n \to \infty

, with

f^{(n)}

and

ϕ^{(n)}

computed by WAMA,

J (f^{(n)}, ϕ^{(n)})

will converge to a certain value (not necessarily equal to

inf J (f, ϕ)

).

Proof.

Similar to [43], the key point is the construction of a

K (b, f, ϕ)

such that

{inf}_{b} K (b, f, ϕ) = J (f, ϕ)

, with

J (f, ϕ)

given by (7). Following the theories introduced in [44], this

K (b, f, ϕ)

is constructed as

K (b, f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} [(| f_{i} |^{2} + γ^{2}) b_{i} - ln (γ b_{i}) - 1],

(63)

where

b

is an auxiliary vector.

Now, for verification, we let

\frac{\partial K}{\partial b} = 0

to find the

b^{*}

minimizing

K (b, f, ϕ)

for a fixed

f

and

ϕ

. Consequently,

b_{i}^{*} = \frac{1}{γ^{2} + {| f_{i} |}^{2}} .

(64)

Substituting (64) into (63),

K (b^{*}, f, ϕ) = J (f, ϕ)

is obtained, and therefore the equality

{inf}_{b} K (b, f, ϕ) = J (f, ϕ)

is verified.
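This verification can also be checked numerically for a single pixel. The sketch below assumes that the bracketed sum in (63) enters $K$ with weight $+ λ$ (which is what the stationarity condition (64) requires for a minimum) and that the Cauchy term is $- ln \frac{γ}{γ^{2} + | f_{i} |^{2}}$; the function names are hypothetical.

```python
import numpy as np

def cauchy_term(m2, gamma):
    """Per-pixel Cauchy penalty (divided by lambda), with m2 = |f_i|^2."""
    return -np.log(gamma / (gamma**2 + m2))

def k_term(b, m2, gamma):
    """Per-pixel bracketed term of the augmented cost K in (63)."""
    return (m2 + gamma**2) * b - np.log(gamma * b) - 1.0

def b_opt(m2, gamma):
    """Minimizer (64) of k_term with respect to b."""
    return 1.0 / (gamma**2 + m2)
```

Evaluating `k_term` at `b_opt` reproduces `cauchy_term`, and any other positive `b` gives a larger value, since the difference equals $t - 1 - ln t \geq 0$ with $t = b / b^{*}$.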

Therefore, minimizing the original cost function (7) with respect to

f

and

ϕ

is equivalent to minimizing (63) with respect to

b

,

f

, and

ϕ

. If an alternating minimization scheme is imposed directly on

K (b, f, ϕ)

, the procedure will consist of the repetition of the following three steps:

1.: Find $b^{(n + 1)}$ by

$b^{(n + 1)} = \arg min_{b} K (b, f^{(n)}, ϕ^{(n)}) .$

(65)

This leads to:

$b_{i}^{(n + 1)} = \frac{1}{γ^{2} + {| f_{i}^{(n)} |}^{2}};$

(66)
2.: Find $f^{(n + 1)}$ by

$f^{(n + 1)} = \arg min_{f} K (b^{(n + 1)}, f, ϕ^{(n)}) .$

(67)

This leads to:

$[C {(ϕ^{(n)})}^{H} C (ϕ^{(n)}) + λ W] f^{(n + 1)} = C {(ϕ^{(n)})}^{H} g,$

(68)

where

$W = diag (b^{(n + 1)});$

(69)
3.: Find $ϕ^{(n + 1)}$ by

$ϕ^{(n + 1)} = \arg min_{ϕ} K (b^{(n + 1)}, f^{(n + 1)}, ϕ) .$

(70)

This leads to:

$ϕ_{m}^{(n + 1)} = arctan (\frac{Re {{[f^{(n + 1)}]}^{H} C_{m} g_{m}}}{Im {{[f^{(n + 1)}]}^{H} C_{m} g_{m}}}) .$

(71)

Notice that if we combine (66) and (68) as one step, then the Formulas (66), (68) and (71) are exactly the same as (47) and (28). Therefore, the convergence of the WAMA method can be analyzed by discussing this equivalent alternating minimization process.

Since (66) and (71) give closed-form solutions and (68) is solved to sufficient accuracy (as suggested by the CG error bound; see (73)), we assert that:

K (b^{(n + 1)}, f^{(n + 1)}, ϕ^{(n + 1)}) \leq K (b^{(n)}, f^{(n)}, ϕ^{(n)}) .

(72)

That is to say,

K (b^{(n)}, f^{(n)}, ϕ^{(n)})

is a monotonically decreasing sequence. Since it is bounded below, it will converge to a certain value as n goes to infinity. □

Computationally, WAMA uses the conjugate gradient (CG) method to obtain the solution of (68). Denote by

f^{* (n + 1)}

the exact solution of (68), and by

q^{(j)}

(

j = 1, \dots, J

) the iterates in the CG loop, such that

f^{(n + 1)} = q^{(J)}

. According to [45], if the matrix

A = [C {(ϕ^{(n)})}^{H} C (ϕ^{(n)}) + λ W]

is non-singular, we can get

{∥ q^{(j)} - f^{* (n + 1)} ∥}_{A}^{2} \leq {∥ q^{(0)} - f^{* (n + 1)} ∥}_{A}^{2} {(\frac{\sqrt{c o n d (A)} - 1}{\sqrt{c o n d (A)} + 1})}^{2 j},

(73)

with

{∥ x ∥}_{A}^{2} = x^{H} A x

, and

c o n d (A)

being the condition number of

A

. That is to say, each set of iterates

q^{(j)}

generated by the conjugate gradient method will converge to its corresponding

f^{* (n + 1)}

as j goes to infinity. Therefore, for a sufficiently large number of CG iterations J,

f^{(n + 1)}

is of sufficient accuracy. This is also validated by the experimental results in Section 5.
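To give a feeling for what the bound (73) implies in practice, the following sketch computes the contraction factor on its right-hand side and the smallest number of CG iterations for which that factor drops below a given tolerance. It takes (73) exactly as written; textbook statements of this bound often carry an additional constant factor, which would change the count only slightly. The function names are hypothetical.

```python
import math

def cg_error_factor(cond, j):
    """Factor multiplying the initial squared A-norm error in (73) after j steps."""
    rho = (math.sqrt(cond) - 1.0) / (math.sqrt(cond) + 1.0)
    return rho ** (2 * j)

def cg_iterations_for(cond, tol):
    """Smallest j with cg_error_factor(cond, j) <= tol, for cond >= 1 and 0 < tol < 1."""
    rho = (math.sqrt(cond) - 1.0) / (math.sqrt(cond) + 1.0)
    if rho == 0.0:
        return 1  # cond == 1: a single step makes the factor vanish
    j = max(0, math.ceil(math.log(tol) / (2.0 * math.log(rho))))
    # Guard against floating-point rounding at the boundary.
    while cg_error_factor(cond, j) > tol:
        j += 1
    return j
```

The required number of iterations grows roughly with the square root of the condition number, so a larger $λ$ (which lifts the smallest eigenvalues of $A$) tends to make each f-step cheaper.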

Note that similar analysis can be carried out for the two variants using

l_{2} - l_{0}

regularization and approximate

l_{p}

regularization, as mentioned in Section 4.2.

When the overall cost function takes the form

J (f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} (1 - e^{- \frac{| f_{i} |^{2}}{2 δ^{2}}}),

(74)

the corresponding

K (b, f, ϕ)

is constructed as:

K (b, f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} [(| f_{i} |^{2} - 2 δ^{2}) b_{i} + 2 δ^{2} b_{i} ln (2 δ^{2} b_{i}) + 1] .

(75)

This leads to:

b_{i}^{*} = \frac{e^{- \frac{| f_{i} |^{2}}{2 δ^{2}}}}{2 δ^{2}} .

(76)

When the overall cost function takes the form

J (f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} \frac{| f_{i} |^{2}}{| f_{i} |^{2} + 2 δ^{2}},

(77)

the corresponding

K (b, f, ϕ)

can then be constructed as:

K (b, f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} [(| f_{i} |^{2} + 2 δ^{2}) b_{i} - 2 \sqrt{2} δ \sqrt{b_{i}} + 1] .

(78)

This leads to:

b_{i}^{*} = \frac{2 δ^{2}}{{(| f_{i} |^{2} + 2 δ^{2})}^{2}} .

(79)

Whereas when the overall cost function takes the form

J (f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} {(| f_{i} |^{2} + β)}^{\frac{p}{2}},

(80)

the corresponding

K (b, f, ϕ)

can be constructed as:

K (b, f, ϕ) = {∥ g - C (ϕ) f ∥}_{2}^{2} + λ \sum_{i = 1}^{N} [b_{i} (| f_{i} |^{2} + β) + \frac{2 - p}{2} {(\frac{2 b_{i}}{p})}^{\frac{p}{p - 2}}] .

(81)

This leads to:

b_{i}^{*} = \frac{p}{2 {(| f_{i} |^{2} + β)}^{1 - \frac{p}{2}}} .

(82)

5. Experimental Results

For the numerical experiments in this paper, the same radar system model as in [12] is used, whose parameters are listed in Table 1.

Table 1. Parameters of the radar system.

In each experiment, this radar system model is used to generate a simulated phase history from a given reflectivity scene. This phase history is then corrupted by adding 1D random phase errors along the azimuth direction. After that, complex white Gaussian noise is added so that the SNR of the data is 25 dB. This corrupted phase history is used to reconstruct a SAR image while correcting for the phase errors.
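The corruption step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the phase history is assumed to be stored as a 2D complex array whose rows correspond to azimuth positions, one phase error is applied per row, and the SNR is defined as the ratio of mean signal power to mean noise power. The function name `corrupt_phase_history` is hypothetical.

```python
import numpy as np

def corrupt_phase_history(Y, phase_err, snr_db, rng):
    """Apply 1D azimuth phase errors and add complex white Gaussian noise.

    Y         : (M, K) complex phase history, rows = azimuth positions (assumed layout)
    phase_err : (M,) phase error in radians, one value per azimuth position
    snr_db    : target signal-to-noise ratio in dB
    rng       : numpy.random.Generator
    """
    # Each azimuth row is multiplied by exp(j * phi_m).
    Y_err = Y * np.exp(1j * phase_err)[:, None]
    signal_power = np.mean(np.abs(Y_err)**2)
    noise_power = signal_power / 10.0**(snr_db / 10.0)
    # Circular complex Gaussian noise: half the power in each of real and imaginary parts.
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(Y.shape) + 1j * rng.standard_normal(Y.shape)
    )
    return Y_err + noise
```

A call such as `corrupt_phase_history(Y, phase_err, 25.0, np.random.default_rng(0))` would then produce data at the 25 dB SNR used in the experiments; how `phase_err` is drawn is left to the caller.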

We compare the performance of four methods in each experiment. The first method is the traditional polar format algorithm [46] which does not involve a process of autofocusing and is therefore expected to result in a blurry formed image as a result of the phase errors added into the simulated phase history. The second method is the sparsity-driven autofocus (SDA) method of [12] (we choose the approximate

l_{1}

norm as the regularizer of their cost function as an example), a state-of-the-art SAR autofocusing technique operating in an inverse problem framework similar to the one proposed in this paper. The remaining two methods are the WAMA method in [27] and the proposed CFBA method.

Apart from visual comparison, two numerical metrics are also computed to better assess the performance of each method. One is the mean square error (MSE) between the reconstructed SAR image (using the corrupted phase history) and the ground truth (the reconstructed SAR image from the uncorrupted phase history). The second metric we employ is the entropy of the reconstructed SAR image as an indicator of sharpness. For both of these two metrics, smaller values indicate better performance. For all the compared methods with tunable parameters, we present the result corresponding to the setting of the parameters which gives the best MSE value for that method.
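For concreteness, the two metrics can be computed along the following lines. The exact definitions are not spelled out in the text, so both functions are assumptions: the MSE is taken between image magnitudes, and the entropy is the Shannon entropy of the normalized intensity, one common sharpness measure in SAR autofocusing. The function names are hypothetical.

```python
import numpy as np

def mse(recon, reference):
    """Mean square error between the magnitudes of two images (assumed definition)."""
    return float(np.mean((np.abs(recon) - np.abs(reference))**2))

def image_entropy(img, eps=1e-12):
    """Shannon entropy of the normalized intensity |img|^2 / sum(|img|^2) (assumed definition)."""
    power = np.abs(img)**2
    p = power / power.sum()
    # eps avoids log(0) for empty pixels; those pixels contribute zero.
    return float(-np.sum(p * np.log(p + eps)))
```

Under this definition, an image whose energy is concentrated in a few pixels has low entropy, while a smeared image with energy spread over many pixels has high entropy, which matches the use of entropy as an indicator of sharpness.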

In the first experiment, we use a simulated scene measuring

32 \times 32

pixels. The visual results are presented in Figure 2, and the numerical results are listed in Table 2. The polar format result suffers from severe defocusing, making it impossible to discern the targets. In contrast, the images reconstructed by SDA, WAMA, and the proposed CFBA method are all well focused and closely resemble the original scene. Since the MSE and entropy values for WAMA and the proposed CFBA method are lower than those for SDA, their results are sharper and closer to the original scene.

Table 2. Numerical evaluation of the experimental results for all the methods.

In the second and third experiments, a real SAR image from TerraSAR-X and another real SAR image from Sentinel-1 are used. These two images originally contain multiple ships scattered on the sea surface. For both experiments, however, because of the high computational burden of CFBA and WAMA even for scenes of moderate size (mainly due to the need to construct the observation matrix

C

), a

64 \times 64

patch is cut from the original SAR image and regarded as an input scene such that one ship target is still discernible in it. The corrupted pseudo-phase history is then generated as described above.

Figure 3 shows the images reconstructed by all four methods for the second experiment (in the first row) and the third experiment (in the second row). Table 2 again contains the corresponding values of the two numerical metrics for these scenes. According to Figure 2 and Figure 3, the polar format algorithm once again gives reconstructed results with seriously smeared ship targets, which is especially notable in Figure 2. In contrast, SDA, WAMA, and the proposed CFBA method remove the phase errors effectively and present focused ship targets, a significant improvement over the result of the polar format algorithm. Nevertheless, the numerical metrics in Table 2 show that WAMA and the proposed CFBA method both give slightly better results than SDA.

Figure 2. Visual results for Scene 1, a simulated

32 \times 32

scene obtained by various methods. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA.

Figure 3. Visual results for Scene 2 (a

64 \times 64

patch from TerraSAR-X) and Scene 3 (a

64 \times 64

patch from Sentinel-1) obtained by various methods. First row for Scene 2, the second row for Scene 3. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA. (f) Original. (g) Polar format. (h) SDA. (i) WAMA. (j) CFBA.

In the fourth and fifth experiments, two simulated images of the sea surface were used as the original scenes, with the first one including only sea waves, and the second also including a travelling ship and its wake. Although simulated, these scenes are not as simple as Scene 1, which is a mere combination of black and white regions resembling point reflectors; they are instead based on a detailed model that takes the most important SAR imaging effects into account [29]. The scenes are based on a model of the sea surface using the Pierson–Moskowitz spectrum and cosine-squared spreading function with wind speed

V_{w} = 8

m/s for the first image and

V_{w} = 4

m/s for the second image, with waves traveling at

45^{\circ}

relative to the SAR flight direction. For the second image, the size of the ship is 55 m, with 8 m beam and 3 m draft, moving at a velocity of 8 m/s at

45^{\circ}

relative to the SAR flight direction. The original size of both SAR images is

1 \times 1

km with a spatial resolution of 1.25 m, while SAR platform parameters are as follows: the platform altitude is 2.5 km, platform velocity is 125 m/s, the incidence angle is

θ_{r}

=

35^{\circ}

, and signal parameters are X-band (9.65 GHz) and VV polarization. Both scenes shown here are of

128 \times 128

pixels; they are patches cropped from the original images because of the heavy computational burden, and their corresponding corrupted pseudo-phase histories are generated in the same way as described above.

The visual results of all four methods for the fourth and fifth experiments are shown in the first and second rows of Figure 4, respectively. Since the original pixel values in the images are rather small, the “imadjust” function in Matlab was applied before display for visual convenience. For the fourth experiment, the results of SDA, WAMA, and CFBA have better contrast, i.e., the bright regions in (c), (d), (e) of Figure 4 are brighter than those in (b), and their dark regions are darker. For the fifth experiment, (h), (i), (j) are sharper than (g) and display much more concentrated ship wakes. Meanwhile, the numerical results in Table 2 also demonstrate that CFBA, SDA, and WAMA give comparable performance. In terms of MSE, SDA outperforms WAMA and is outperformed by CFBA. As for entropy values, WAMA is the best.

Figure 4. Visual results for Scene 4 and Scene 5, two

128 \times 128

simulated sea surfaces, obtained by various methods. First row for Scene 4, the second row for Scene 5. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA. (f) Original. (g) Polar format. (h) SDA. (i) WAMA. (j) CFBA.

As for the convergence of CFBA and WAMA, Figure 5 displays how

J (f^{(n)}, ϕ^{(n)})

changes with increasing n until the stopping criterion is satisfied for both the WAMA method and the proposed CFBA method, taking the first three experiments as examples. In each sub-figure, the vertical axis represents

J (f^{(n)}, ϕ^{(n)})

, the value of the cost function (7) computed for

f^{(n)}

and

ϕ^{(n)}

, while the horizontal axis represents the iterative numbers n in the loop of the alternating minimization. The red line represents the case of WAMA, and the blue line represents the case of CFBA. It can be seen that in all three experiments,

J (f^{(n)}, ϕ^{(n)})

decreases monotonically for both CFBA and WAMA. This is consistent with the conclusions of our convergence analysis, and gives an experimental validation for the convergence of CFBA and WAMA in a sense. Moreover, it can be observed that CFBA converges more rapidly, since it reaches the stopping criterion with fewer iterations.

Figure 5. The values of

J (f^{(n)}, ϕ^{(n)})

for WAMA (red lines) and CFBA (blue lines) until the stopping criterion was satisfied. (a) Scene 1. (b) Scene 2. (c) Scene 3.

6. Conclusions

In this paper, an optimization solution regularized by the magnitude Cauchy penalty is proposed for the problem of simultaneously reconstructing and autofocusing a SAR image. An alternating minimization framework named CFBA is proposed to solve this inverse problem, in which the sub-problems related to the desired SAR image are solved by a complex forward-backward splitting method, and its convergence is analyzed. Additionally, the WAMA method based on Wirtinger calculus is revisited and further discussed, in particular with respect to its convergence. Experimental results on simulated phase history data derived from several simulated as well as real SAR images of the sea surface demonstrate that the proposed CFBA method can reconstruct highly focused SAR images and effectively remove phase errors, showing performance competitive with that of WAMA. Our current work focuses on the design of SAR autofocusing approaches in the context of compressive data acquisition, and we will aim to address the problem by combining our current model-based techniques with data-driven methods.

Author Contributions

Methodology, Z.-Y.Z., O.P., A.A.; software, Z.-Y.Z.; resources, O.P., A.A., I.G.R.; writing—original draft preparation, Z.-Y.Z.; writing—review and editing, O.P., A.A., I.G.R.; supervision, A.A., O.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by a Chinese Scholarship Council PhD studentship (to Zhang) and in part by the Engineering and Physical Sciences Research Council (EPSRC) under grant EP/R009260/1 (AssenSAR).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. The Proof of Lemma 1

Proof.

A plain but important fact is that

{∥ f ∥}_{2}^{2} = f^{H} f = {(\tilde{f})}^{T} \tilde{f} = {∥ \tilde{f} ∥}_{2}^{2}

. Therefore, for

H (f)

, we have

H (f) = {∥ g - C^{(n)} f ∥}_{2}^{2} = {∥ \tilde{g} - \tilde{C^{(n)} f} ∥}_{2}^{2} .

(A1)

Since

C^{(n)} f = {(c_{11} f_{1} + \dots + c_{1 N} f_{N}, \dots, c_{N 1} f_{1} + \dots + c_{N N} f_{N})}^{T},

(A2)

we have

\tilde{C^{(n)} f} = (\begin{matrix} {(c_{11})}_{R} x_{1} - {(c_{11})}_{I} y_{1} + \dots + {(c_{1 N})}_{R} x_{N} - {(c_{1 N})}_{I} y_{N} \\ ⋮ \\ {(c_{N 1})}_{R} x_{1} - {(c_{N 1})}_{I} y_{1} + \dots + {(c_{N N})}_{R} x_{N} - {(c_{N N})}_{I} y_{N} \\ {(c_{11})}_{I} x_{1} + {(c_{11})}_{R} y_{1} + \dots + {(c_{1 N})}_{I} x_{N} + {(c_{1 N})}_{R} y_{N} \\ ⋮ \\ {(c_{N 1})}_{I} x_{1} + {(c_{N 1})}_{R} y_{1} + \dots + {(c_{N N})}_{I} x_{N} + {(c_{N N})}_{R} y_{N} \end{matrix}),

(A3)

where

(c_{i j}), i = 1, \dots, N, j = 1, \dots, N

are elements of

C^{(n)}

, with

{(c_{i j})}_{R}, i = 1, \dots, N,

j = 1, \dots, N

and

{(c_{i j})}_{I}, i = 1, \dots, N, j = 1, \dots, N

being the real and imaginary part of

(c_{i j})

, respectively.

Furthermore, (A3) can be rewritten as

\tilde{C^{(n)} f} = \tilde{C^{(n)}} \tilde{f},

(A4)

where

\tilde{C^{(n)}} = (\begin{matrix} {(c_{11})}_{R} & \dots & {(c_{1 N})}_{R} & {(- c_{11})}_{I} & \dots & {(- c_{1 N})}_{I} \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ {(c_{N 1})}_{R} & \dots & {(c_{N N})}_{R} & {(- c_{N 1})}_{I} & \dots & {(- c_{N N})}_{I} \\ {(c_{11})}_{I} & \dots & {(c_{1 N})}_{I} & {(c_{11})}_{R} & \dots & {(c_{1 N})}_{R} \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ {(c_{N 1})}_{I} & \dots & {(c_{N N})}_{I} & {(c_{N 1})}_{R} & \dots & {(c_{N N})}_{R} \end{matrix}) .

(A5)

As a result,

H (f) = ∥ \tilde{g} - \tilde{C^{(n)}} \tilde{f} ∥_{2}^{2} = \tilde{H} (\tilde{f})

.
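The block structure in (A5) is easy to check numerically. The sketch below assumes the stacking convention $\tilde{f} = {(x_{1}, \dots, x_{N}, y_{1}, \dots, y_{N})}^{T}$ with $f_{i} = x_{i} + i y_{i}$, which is the ordering implied by (A3); the helper names `realify_vector` and `realify_matrix` are hypothetical.

```python
import numpy as np

def realify_vector(f):
    """Map a complex N-vector to the real 2N-vector (real parts, then imaginary parts)."""
    return np.concatenate([f.real, f.imag])

def realify_matrix(C):
    """Map a complex N x N matrix to the real 2N x 2N block matrix of (A5)."""
    return np.block([[C.real, -C.imag],
                     [C.imag,  C.real]])

def H_complex(C, g, f):
    """Data fidelity term evaluated with complex arithmetic."""
    return np.linalg.norm(g - C @ f)**2

def H_real(C, g, f):
    """The same term evaluated entirely with the real embedding."""
    return np.linalg.norm(realify_vector(g) - realify_matrix(C) @ realify_vector(f))**2
```

Both evaluations agree to rounding error, which is the content of (A1) and (A4); the real version simply works with twice as many unknowns and a matrix four times as large.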

Now,

\tilde{H} (\tilde{f})

can be expressed as

\tilde{H} (\tilde{f}) = l (H_{(1)} (\tilde{f}), \dots, H_{(2 N)} (\tilde{f}))

, i.e., the composition of real analytic functions

l (x) = x^{T} x

and

H_{(i)} (\tilde{f}) = {(\tilde{g})}_{i} - {(\tilde{C^{(n)}} \tilde{f})}_{i}, i = 1, \dots, 2 N

. According to [47],

\tilde{H} (\tilde{f})

is a real analytic function of

\tilde{f}

.

On the other hand, for

G (f)

, we have

\begin{matrix} G (f) = - \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {∥ \tilde{f_{i}} ∥}^{2}} = - \sum_{i = 1}^{N} ln \frac{γ}{γ^{2} + {∥ s_{(i)} \tilde{f} ∥}^{2}} = \tilde{G} (\tilde{f}) \end{matrix}

(A6)

where

s_{(i)}

is a

2 \times 2 N

matrix whose first row and second row are the ith row and the

(i + N)

th row of a

2 N \times 2 N

identity matrix, respectively. Therefore, each summed term of

\tilde{G} (\tilde{f})

, i.e.,

- ln \frac{γ}{γ^{2} + {∥ s_{(i)} \tilde{f} ∥}^{2}}

, can be written as

- ln \frac{γ}{γ^{2} + l (G_{(1)} (\tilde{f}), G_{(2)} (\tilde{f}))}

, where

l (x) = x^{T} x

and

G_{(j)} (\tilde{f}) = {(s_{(i)} \tilde{f})}_{j}, j = 1, 2

. That is to say, it is a composition of real analytic functions and thus is itself real analytic. As a result,

\tilde{G} (\tilde{f})

is a real analytic function of

\tilde{f}

.

Based on the above two conclusions,

J_{n} (f) = H (f) + G (f) = \tilde{H} (\tilde{f}) + \tilde{G} (\tilde{f}) = \tilde{J_{n}} (\tilde{f})

is a real analytic function of

\tilde{f}

. □

Appendix B. Another Proof of Theorem 1

It has already been shown in the main text that

{\tilde{J}}_{n} (\tilde{f})

satisfies the KL property. In order to apply this conclusion to

J_{n} (f)

, just notice that the definitions of

\nabla_{f} J_{n} (f)

and

\nabla {\tilde{J}}_{n} (\tilde{f})

imply

∥ \nabla_{f} J_{n} (f) ∥ = \frac{1}{2} ∥ \nabla {\tilde{J}}_{n} (\tilde{f}) ∥

.
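The relation between the two gradient norms can be confirmed numerically. The sketch below compares the Wirtinger gradient of $J_{n}$, assembled from $\nabla H (f) = {(C^{(n)})}^{H} (C^{(n)} f - g)$ and (42), with a finite-difference gradient of the same cost taken with respect to the real and imaginary parts. The function names are hypothetical, and the Cauchy term is assumed to carry the weight $λ$ as in (42).

```python
import numpy as np

def cost(C, g, f, lam, gamma):
    """J_n(f) = ||g - C f||^2 - lam * sum ln(gamma / (gamma^2 + |f_i|^2))."""
    r = g - C @ f
    return np.vdot(r, r).real - lam * np.sum(np.log(gamma / (gamma**2 + np.abs(f)**2)))

def wirtinger_grad(C, g, f, lam, gamma):
    """Complex gradient: C^H (C f - g) + lam * f / (gamma^2 + |f|^2)."""
    return C.conj().T @ (C @ f - g) + lam * f / (gamma**2 + np.abs(f)**2)

def real_grad_fd(C, g, f, lam, gamma, h=1e-6):
    """Central finite differences with respect to (Re f, Im f), stacked as a 2N-vector."""
    n = f.size
    grad = np.zeros(2 * n)
    for k in range(2 * n):
        step = np.zeros(n, dtype=complex)
        step[k % n] = h if k < n else 1j * h  # perturb real part first, then imaginary part
        grad[k] = (cost(C, g, f + step, lam, gamma)
                   - cost(C, g, f - step, lam, gamma)) / (2.0 * h)
    return grad
```

The real gradient equals twice the stacked real and imaginary parts of the Wirtinger gradient, so a zero of one is a zero of the other and the norms differ by exactly the factor of one half.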

Therefore,

J_{n} (f)

is a KL function. It can also be shown that

J_{n} (f) = G (f) + H (f)

is proper, continuous, and bounded from below;

H (f)

is finite-valued, differentiable, and has a Lipschitz continuous gradient;

G (f)

is continuous on its domain. Apart from differentiability, which is defined by Wirtinger calculus in a special way, the rest of these mentioned properties can be easily established in the complex case by directly replacing the real variable in the original definitions with a complex variable.

Now, we continue to show that all the three assumptions for Theorem 4.2 in [38] are satisfied. We point out that this is not a trivial task, because the inner product used in the original proof [38] takes only real values, which does not hold in our complex case.

First, by computing the optimality condition of the Moreau envelope for

J_{n} (f)

, and letting

v^{(k + 1)} \in \partial G (o^{(k + 1)})

, we have:

\begin{matrix} 2 μ v^{(k + 1)} + 2 μ \nabla H (o^{(k)}) + o^{(k + 1)} - o^{(k)} = 0, \end{matrix}

(A7)

and therefore

\begin{matrix} ∥ v^{(k + 1)} + \nabla H (o^{(k)}) ∥ = \frac{1}{2 μ} ∥ o^{(k + 1)} - o^{(k)} ∥ . \end{matrix}

(A8)

For

H (f) = {∥ g - C^{(n)} f ∥}_{2}^{2}

, according to the convexity of

H (f)

[48] and the property of

\nabla H (f)

[49], we have for any

f_{1}

and

f_{2}

:

H (f_{1}) - H (f_{2}) \leq 2 Re {{(f_{1} - f_{2})}^{H} \nabla H (f_{1})} .

(A9)

Therefore,

H (f_{1}) - H (f_{2}) - 2 Re {{(f_{1} - f_{2})}^{H} \nabla H (f_{2})} \leq 2 Re {{(f_{1} - f_{2})}^{H} (\nabla H (f_{1}) - \nabla H (f_{2}))} .

(A10)

Since

\nabla H (f) = {(C^{(n)})}^{H} (C^{(n)} f - g)

, we have

Re {{(f_{1} - f_{2})}^{H} (\nabla H (f_{1}) - \nabla H (f_{2}))} = ∥ C^{(n)} (f_{1} - f_{2}) ∥_{2}^{2} .

(A11)
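Identity (A11) follows from $\nabla H (f_{1}) - \nabla H (f_{2}) = {(C^{(n)})}^{H} C^{(n)} (f_{1} - f_{2})$ and can be checked with a few lines of NumPy. The function names below are hypothetical, and `C` is a dense stand-in for $C^{(n)}$.

```python
import numpy as np

def grad_H(C, g, f):
    """Wirtinger gradient of H(f) = ||g - C f||^2, i.e. C^H (C f - g)."""
    return C.conj().T @ (C @ f - g)

def a11_sides(C, g, f1, f2):
    """Return the left-hand and right-hand sides of (A11)."""
    d = f1 - f2
    # np.vdot conjugates its first argument, giving d^H (grad difference).
    inner = np.vdot(d, grad_H(C, g, f1) - grad_H(C, g, f2))
    return inner.real, np.linalg.norm(C @ d)**2, inner.imag
```

The third returned value is the imaginary part of the inner product, which vanishes up to rounding; this is the sense in which the right side of (A10) is real.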

Therefore, the right side of (A10) is real, and thus

\begin{matrix} H (f_{1}) - H (f_{2}) - 2 Re {{(f_{1} - f_{2})}^{H} \nabla H (f_{2})} \leq 2 {(f_{1} - f_{2})}^{H} (\nabla H (f_{1}) - \nabla H (f_{2})) \\ \leq 2 ∥ f_{1} - f_{2} ∥ ∥ \nabla H (f_{1}) - \nabla H (f_{2}) ∥ \leq 2 L ∥ f_{1} - f_{2} ∥_{2}^{2} . \end{matrix}

(A12)

As a result, we have

\begin{matrix} H (o^{(k + 1)}) \leq H (o^{(k)}) + 2 Re {{(o^{(k + 1)} - o^{(k)})}^{H} \nabla H (o^{(k)})} + 2 L {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} . \end{matrix}

(A13)

On the other hand, from the definition of a proximal operator,

μ G (o^{(k + 1)}) + \frac{1}{2} ∥ o^{(k + 1)} - o^{(k)} + 2 μ \nabla H (o^{(k)}) ∥_{2}^{2} \leq μ G (o^{(k)}) + \frac{1}{2} {∥ 2 μ \nabla H (o^{(k)}) ∥}_{2}^{2} .

(A14)

Expanding (A14) yields

μ G (o^{(k + 1)}) + \frac{1}{2} {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} + 2 μ Re {{(o^{(k + 1)} - o^{(k)})}^{H} \nabla H (o^{(k)})} \leq μ G (o^{(k)}),

(A15)

and therefore

G (o^{(k + 1)}) + \frac{1}{2 μ} {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} + 2 Re {{(o^{(k + 1)} - o^{(k)})}^{H} \nabla H (o^{(k)})} \leq G (o^{(k)}) .

(A16)
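The expansion from (A14) to (A15) relies on the following identity for complex vectors, written here as a worked step with x and y as generic placeholders (not symbols from the original proof):

∥ x + y ∥_{2}^{2} = ∥ x ∥_{2}^{2} + 2 Re {x^{H} y} + ∥ y ∥_{2}^{2} .

Applying it to the quadratic term on the left side of (A14) gives

\frac{1}{2} ∥ o^{(k + 1)} - o^{(k)} + 2 μ \nabla H (o^{(k)}) ∥_{2}^{2} = \frac{1}{2} ∥ o^{(k + 1)} - o^{(k)} ∥_{2}^{2} + 2 μ Re {{(o^{(k + 1)} - o^{(k)})}^{H} \nabla H (o^{(k)})} + \frac{1}{2} ∥ 2 μ \nabla H (o^{(k)}) ∥_{2}^{2} ,

and the last term cancels against the identical term on the right side of (A14).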

Combining (A13) with (A16), we have

\begin{matrix} G (o^{(k + 1)}) + H (o^{(k + 1)}) + \frac{a - 4 L}{2} {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} \\ \leq G (o^{(k + 1)}) + H (o^{(k)}) + 2 Re {{(o^{(k + 1)} - o^{(k)})}^{H} \nabla H (o^{(k)})} + \frac{a}{2} {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} \\ \leq G (o^{(k)}) + H (o^{(k)}), \end{matrix}

(A17)

with

a = \frac{1}{μ}

. This is precisely the sufficient decrease condition

J_{n} (o^{(k + 1)}) + \frac{a - 4 L}{2} {∥ o^{(k + 1)} - o^{(k)} ∥}_{2}^{2} \leq J_{n} (o^{(k)})

, as long as

a > 4 L

. In contrast, in the proof from a real perspective, the corresponding requirement is just

a > L

. Therefore, with other appropriate techniques, it may be possible to derive an inequality sharper than (A13) and relax the requirement to

a > L

.
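The sufficient decrease property above can be illustrated with a small simulation. This is a hedged sketch under stated assumptions, not the authors' implementation: the penalty is assumed to be G(f) = λ Σ log(γ² + |f_i|²) (the Cauchy penalty up to an additive constant), the helper `cauchy_prox`, the values of `lam` and `gamma`, and the random matrix `C` are all hypothetical, and the proximal step is computed elementwise by solving a cubic for the magnitude while keeping the phase.

```python
import numpy as np


def cauchy_prox(z, t, gamma):
    """Elementwise argmin_x t*log(gamma^2 + |x|^2) + 0.5*|x - z|^2 for complex z.

    The minimizer keeps the phase of z; its magnitude r lies in [0, |z|] and
    solves r^3 - |z| r^2 + (gamma^2 + 2 t) r - |z| gamma^2 = 0.
    """
    out = np.zeros(z.shape, dtype=complex)
    for i, zi in enumerate(z):
        mag = abs(zi)
        if mag == 0.0:
            continue
        roots = np.roots([1.0, -mag, gamma**2 + 2.0 * t, -mag * gamma**2])
        # Real parts clipped to [0, |z|] as candidates; keep the lowest cost.
        cand = np.clip(roots.real, 0.0, mag)
        cost = t * np.log(gamma**2 + cand**2) + 0.5 * (cand - mag) ** 2
        out[i] = cand[np.argmin(cost)] * zi / mag
    return out


rng = np.random.default_rng(1)
m, n = 48, 32  # hypothetical sizes
C = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(m)
f_true = np.zeros(n, dtype=complex)
f_true[[3, 11, 20]] = [2.0, -1.5j, 1.0 + 1.0j]  # sparse synthetic scene
g = C @ f_true

lam, gamma = 0.05, 0.5                  # assumed penalty parameters
L = np.linalg.norm(C.conj().T @ C, 2)   # Lipschitz constant of grad H
mu = 1.0 / (8.0 * L)                    # so that a = 1/mu = 8L > 4L
a = 1.0 / mu


def J(f):
    """Cost J = G + H with the assumed Cauchy-type penalty G."""
    G = lam * np.sum(np.log(gamma**2 + np.abs(f) ** 2))
    return G + np.linalg.norm(g - C @ f) ** 2


o = np.zeros(n, dtype=complex)
costs, slack = [J(o)], []
for k in range(60):
    grad = C.conj().T @ (C @ o - g)                            # forward step
    o_new = cauchy_prox(o - 2.0 * mu * grad, mu * lam, gamma)  # backward step
    step2 = np.linalg.norm(o_new - o) ** 2
    costs.append(J(o_new))
    # Slack of J(o_new) + (a - 4L)/2 * ||o_new - o||^2 <= J(o)
    slack.append(costs[-2] - costs[-1] - 0.5 * (a - 4.0 * L) * step2)
    o = o_new

print("sufficient decrease at every iteration:", min(slack) >= -1e-9)
```

The step size is chosen conservatively so that a > 4L holds, matching the requirement derived above; a larger step may still decrease the cost in practice, but it is not covered by this bound.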

Second, by the sum rule for subdifferentials, we have

v^{(k + 1)} + \nabla H (o^{(k + 1)}) \in \partial J_{n} (o^{(k + 1)})

.

Finally, with (A8), it can be deduced that

\begin{matrix} ∥ v^{(k + 1)} + \nabla H (o^{(k + 1)}) ∥ \leq ∥ v^{(k + 1)} + \nabla H (o^{(k)}) ∥ + ∥ \nabla H (o^{(k + 1)}) - \nabla H (o^{(k)}) ∥ \\ \leq \frac{a}{2} ∥ o^{(k + 1)} - o^{(k)} ∥ + L ∥ o^{(k + 1)} - o^{(k)} ∥ . \end{matrix}

(A18)

Now we are exactly in the case of Theorem 4.2 in [38] and the rest of the proof is similar (just formally substitute the real vectors therein by complex vectors). In conclusion, we can get the same result as Theorem 5.1 in [38], i.e., the convergence of the sequence

o^{(k)}

to a critical point of

J_{n} (f)

.

Moreover, denote by

f^{*}

the global minimizer of

J_{n} (f)

. According to Theorem 2.12 in [38], for each

r > 0

, there exist

u \in (0, r), δ > 0

such that the inequalities

∥ o^{(0)} - f^{*} ∥ < u

and

min J_{n} (f) < J_{n} (o^{(0)}) < δ + min J_{n} (f)

imply that the sequence

o^{(k)}

will converge to some

o^{*}

with

o^{(k)} \in B (f^{*}, r)

for arbitrary k and

J_{n} (o^{*}) = min J_{n} (f)

.

References

1. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A Tutorial on Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43.
2. Ouchi, K. Recent Trend and Advance of Synthetic Aperture Radar with Selected Topics. Remote Sens. 2013, 5, 716–807.
3. Graziano, M.D.; D’Errico, M.; Rufino, G. Wake Component Detection in X-Band SAR Images for Ship Heading and Velocity Estimation. Remote Sens. 2016, 8, 498.
4. Wahl, D.E.; Eichel, P.; Ghiglia, D.; Jakowatz, C. Phase Gradient Autofocus - a Robust Tool for High Resolution SAR Phase Correction. IEEE Trans. Aerosp. Electron. Syst. 1994, 30, 827–835.
5. Calloway, T.M.; Donohoe, G.W. Subaperture Autofocus for Synthetic Aperture Radar. IEEE Trans. Aerosp. Electron. Syst. 1994, 30, 617–621.
6. Tsakalides, P.; Nikias, C. High-resolution Autofocus Techniques for SAR Imaging Based on Fractional Lower-order Statistics. IEE Proc.-Radar Sonar Navig. 2001, 148, 267–276.
7. Fienup, J.; Miller, J. Aberration Correction by Maximizing Generalized Sharpness Metrics. J. Opt. Soc. Am. A 2003, 20, 609–620.
8. Morrison, R.L.; Do, M.N.; Munson, D.C. SAR Image Autofocus by Sharpness Optimization: A Theoretical Study. IEEE Trans. Image Process. 2007, 16, 2309–2321.
9. Kragh, T.J.; Kharbouch, A.A. Monotonic Iterative Algorithm for Minimum-entropy Autofocus. Adapt. Sens. Array Process. Workshop 2006, 40, 1147–1159.
10. Zeng, T.; Wang, R.; Li, F. SAR Image Autofocus Utilizing Minimum-entropy Criterion. IEEE Geosci. Remote Sens. Lett. 2013, 10, 1552–1556.
11. Kantor, J.M. Minimum Entropy Autofocus Correction of Residual Range Cell Migration. In Proceedings of the 2017 IEEE Radar Conference (RadarConf), Seattle, WA, USA, 8–12 May 2017; pp. 0011–0016.
12. Onhon, N.Ö.; Cetin, M. A Sparsity-driven Approach for Joint SAR Imaging and Phase Error Correction. IEEE Trans. Image Process. 2011, 21, 2075–2088.
13. Kelly, S.I.; Yaghoobi, M.; Davies, M.E. Auto-focus for Under-sampled Synthetic Aperture Radar. In Proceedings of the Sensor Signal Processing for Defence (SSPD 2012), London, UK, 25–27 September 2012.
14. Kelly, S.; Yaghoobi, M.; Davies, M. Sparsity-based Autofocus for Undersampled Synthetic Aperture Radar. IEEE Trans. Aerosp. Electron. Syst. 2014, 50, 972–986.
15. Güngör, A.; Cetin, M.; Güven, H.E. An Augmented Lagrangian Method for Autofocused Compressed SAR Imaging. In Proceedings of the 2015 3rd International Workshop on Compressed Sensing Theory and Its Applications to Radar, Sonar and Remote Sensing (CoSeRa), Pisa, Italy, 17–19 June 2015; pp. 1–5.
16. Güngör, A.; Çetin, M.; Güven, H.E. Autofocused Compressive SAR Imaging Based on the Alternating Direction Method of Multipliers. In Proceedings of the 2017 IEEE Radar Conference (RadarConf), Seattle, WA, USA, 8–12 May 2017; pp. 1573–1576.
17. Uḡur, S.; Arıkan, O. SAR Image Reconstruction and Autofocus by Compressed Sensing. Digit. Signal Process. 2012, 22, 923–932.
18. Ash, J.N. An Autofocus Method for Backprojection Imagery in Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Lett. 2011, 9, 104–108.
19. Sommer, A.; Ostermann, J. Backprojection Subimage Autofocus of Moving Ships for Synthetic Aperture Radar. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8383–8393.
20. Kantor, J.M. Polar Format-Based Compressive SAR Image Reconstruction With Integrated Autofocus. IEEE Trans. Geosci. Remote Sens. 2019, 58, 3458–3468.
21. Mason, E.; Yonel, B.; Yazici, B. Deep learning for SAR image formation. In Algorithms for Synthetic Aperture Radar Imagery XXIV; International Society for Optics and Photonics: San Diego, CA, USA, 2017; Volume 10201, p. 1020104.
22. Pu, W. Deep SAR Imaging and Motion Compensation. IEEE Trans. Image Process. 2021, 30, 2232–2247.
23. Brandwood, D. A Complex Gradient Operator and its Application in Adaptive Array Theory. IEE Proc. H (Microw. Opt. Antennas) 1983, 130, 11–16.
24. Van Den Bos, A. Complex Gradient and Hessian. IEE Proc.-Vis. Image Signal Process. 1994, 141, 380–382.
25. Kreutz-Delgado, K. The Complex Gradient Operator and the CR-calculus. arXiv 2009, arXiv:0906.4835.
26. Bouboulis, P. Wirtinger’s Calculus in General Hilbert Spaces. arXiv 2010, arXiv:1005.5170.
27. Zhang, Z.Y.; Pappas, O.; Achim, A. SAR Image Autofocusing using Wirtinger calculus and Cauchy regularization. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1455–1459.
28. Pappas, O.; Achim, A.; Bull, D. Superpixel-level CFAR detectors for ship detection in SAR imagery. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1397–1401.
29. Rizaev, I.G.; Karakuş, O.; Hogan, S.J.; Achim, A. Modeling and SAR Imaging of the Sea Surface: A Review of the State-of-the-Art with Simulations. ISPRS J. Photogramm. Remote Sens. 2022, 187, 120–140.
30. Çetin, M.; Karl, W.C. Feature-enhanced synthetic aperture radar image formation based on nonquadratic regularization. IEEE Trans. Image Process. 2001, 10, 623–631.
31. Carrara, W.; Goodman, R.S.; Majewski, R.M. Spotlight Synthetic Aperture Radar: Signal Processing Algorithms; Artech House: Norwood, MA, USA, 1995.
32. McCullagh, P.; Polson, N.G. Statistical Sparsity. Biometrika 2018, 105, 797–814.
33. Karakuş, O.; Achim, A. On Solving SAR Imaging Inverse Problems Using Non-convex Regularization with a Cauchy-Based Penalty. IEEE Trans. Geosci. Remote Sens. 2021, 59, 5828–5840.
34. Karakuş, O.; Mayo, P.; Achim, A. Convergence Guarantees for Non-convex Optimisation with Cauchy-based Penalties. IEEE Trans. Signal Process. 2020, 68, 6159–6170.
35. Soulez, F.; Thiébaut, É.; Schutz, A.; Ferrari, A.; Courbin, F.; Unser, M. Proximity Operators for Phase Retrieval. Appl. Opt. 2016, 55, 7412–7421.
36. Güven, H.E.; Güngör, A.; Cetin, M. An Augmented Lagrangian Method for Complex-valued Compressed SAR Imaging. IEEE Trans. Comput. Imaging 2016, 2, 235–250.
37. Attouch, H.; Bolte, J.; Redont, P.; Soubeyran, A. Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Łojasiewicz Inequality. Math. Oper. Res. 2010, 35, 438–457.
38. Attouch, H.; Bolte, J.; Svaiter, B.F. Convergence of Descent Methods for Semi-algebraic and Tame Problems: Proximal Algorithms, Forward–backward Splitting, and Regularized Gauss–Seidel Methods. Math. Program. 2013, 137, 91–129.
39. Csiszár, I. Information Geometry and Alternating Minimization Procedures. Stat. Decis. 1984, 1, 205–237.
40. Byrne, C.L. Alternating Minimization as Sequential Unconstrained Minimization: A Survey. J. Optim. Theory Appl. 2013, 156, 554–566.
41. Barrett, R.; Berry, M.; Chan, T.F.; Demmel, J.; Donato, J.; Dongarra, J.; Eijkhout, V.; Pozo, R.; Romine, C.; Van der Vorst, H. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods; SIAM Press: Philadelphia, PA, USA, 1994.
42. Florescu, A.; Chouzenoux, E.; Pesquet, J.C.; Ciuciu, P.; Ciochina, S. A Majorize-minimize Memory Gradient Method for Complex-valued Inverse Problems. Signal Process. 2014, 103, 285–295.
43. Çetin, M.; Karl, W.C.; Willsky, A.S. Feature-preserving Regularization Method for Complex-valued Inverse Problems with Application to Coherent Imaging. Opt. Eng. 2006, 45, 017003.
44. Geman, D.; Reynolds, G. Constrained Restoration and the Recovery of Discontinuities. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 367–383.
45. Joly, P.; Meurant, G. Complex Conjugate Gradient Methods. Numer. Algorithms 1993, 4, 379–406.
46. Walker, J.L. Range-Doppler Imaging of Rotating Objects. IEEE Trans. Aerosp. Electron. Syst. 1980, AES-16, 23–52.
47. Krantz, S.G.; Parks, H.R. A Primer of Real Analytic Functions; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2002.
48. Zhang, S.; Xia, Y.; Zheng, W. A Complex-valued Neural Dynamical Optimization Approach and its Stability Analysis. Neural Netw. 2015, 61, 59–67.
49. Liu, S.; Jiang, H.; Zhang, L.; Mei, X. A Neurodynamic Optimization Approach for Complex-variables Programming Problem. Neural Netw. 2020, 129, 280–287.

Figure 1. Example of moving ship targets appearing displaced from their wakes in a Sentinel-1 image (IW intensity image, VV polarisation).

Figure 2. Visual results for Scene 1, a simulated

32 \times 32

scene obtained by various methods. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA.

Figure 3. Visual results for Scene 2 (a

64 \times 64

patch from TerraSAR-X) and Scene 3 (a

64 \times 64

patch from Sentinel-1) obtained by various methods. The first row shows Scene 2 and the second row shows Scene 3. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA. (f) Original. (g) Polar format. (h) SDA. (i) WAMA. (j) CFBA.

Figure 4. Visual results for Scene 4 and Scene 5, two

128 \times 128

simulated sea surfaces, obtained by various methods. The first row shows Scene 4 and the second row shows Scene 5. (a) Original. (b) Polar format. (c) SDA. (d) WAMA. (e) CFBA. (f) Original. (g) Polar format. (h) SDA. (i) WAMA. (j) CFBA.

Figure 5. The values of

J (f^{(n)}, ϕ^{(n)})

for WAMA (red lines) and CFBA (blue lines) until the stopping criterion was satisfied. (a) Scene 1. (b) Scene 2. (c) Scene 3.

Table 1. Parameters of the radar system.

Carrier Frequency	$2 π \times 10^{10}$ rad/s
Chirp Rate	$2 π \times 10^{12}$ rad/s$^{2}$
Pulse Duration	$4 \times 10^{- 4}$ s
Angular Range	$2.3^{\circ}$

Table 2. Numerical evaluation of the experimental results for all the methods.

MSE
Method	Scene 1	Scene 2	Scene 3	Scene 4	Scene 5
SDA	5.4310 $\times 10^{- 6}$	6.4964 $\times 10^{- 5}$	6.3576 $\times 10^{- 5}$	1.3997 $\times 10^{- 5}$	6.8909 $\times 10^{- 6}$
WAMA	1.2227 $\times 10^{- 6}$	6.3029 $\times 10^{- 5}$	5.3663 $\times 10^{- 5}$	2.2250 $\times 10^{- 5}$	7.8785 $\times 10^{- 6}$
CFBA	1.1836 $\times 10^{- 6}$	6.2940 $\times 10^{- 5}$	5.4803 $\times 10^{- 5}$	1.3483 $\times 10^{- 5}$	6.5628 $\times 10^{- 6}$
Entropy
Method	Scene 1	Scene 2	Scene 3	Scene 4	Scene 5
SDA	1.4621	5.4410	5.6918	4.5720	4.2847
WAMA	0.3327	5.4333	5.6641	4.5230	4.2782
CFBA	0.3430	5.4228	5.6602	4.5617	4.2916

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
