Three-Dimensional Microwave Imaging: Fast and Accurate Computations with Block Resolution Algorithms

This paper considers the microwave imaging reconstruction problem, based on additive penalization and gradient-based optimization. Each evaluation of the cost function and of its gradient requires the resolution of as many high-dimensional linear systems as the number of incident fields, which represents a large amount of computations. Since all such systems involve the same matrix, we propose a block inversion strategy, based on the block-biconjugate gradient stabilized (BiCGStab) algorithm, with efficient implementations specific to the microwave imaging context. Numerical experiments performed on synthetic data and on real measurements show that savings in computing time can reach a factor of two compared to the standard, sequential, BiCGStab implementation. Improvements brought by the block approach are even more important for the most difficult reconstruction problems, that is, with high-frequency illuminations and/or highly contrasted objects. The proposed reconstruction strategy is shown to achieve satisfactory estimates for objects of the Fresnel database, even on the most contrasted ones.


Introduction
Microwave imaging aims at estimating the dielectric properties (permittivity and conductivity) of objects illuminated by incident electromagnetic fields [1]. This technique has been used in a wide range of applications such as biomedical imaging [2,3], geophysics [4,5] and nondestructive testing [6,7]. Quantitative inverse scattering methods estimate the complex permittivity map inside the object from a set of measured scattered electromagnetic fields around it. When the object size is comparable to the wavelength, diffraction occurs and the forward scattering model, which defines the scattered field as a function of the dielectric properties at any point of the object, becomes non-linear. The resulting reconstruction problem is ill-posed [8,9], and its resolution becomes particularly difficult with high-frequency illuminations and/or for highly contrasted objects-that is, if the permittivity inside the object strongly differs from that of the background-where the Born approximation is not valid anymore.
Imaging methods usually rely on solving an optimization problem, whose complexity is strongly impacted by the non-linearity of scattering. In order to circumvent this difficulty, several works introduced additional variables, corresponding to the total fields inside the object, which are jointly estimated together Specific tunings of the block-BiCGStab algorithm are discussed, and the overall efficiency of the inversion method is evaluated as a function of the problem difficulty (illuminating frequency and contrast). We show in particular that improvements are even greater as the problem complexity increases. We provide reconstruction results obtained on the Fresnel database [54], where the proposed implementation allows the accurate reconstruction of the most difficult objects.
In Section 2, the 3-D forward scattering model is built using an integral formulation. A discretization procedure based on the method of moments is used, and our inversion formulation based on additive regularization is detailed. In Section 3, we propose efficient adaptations of block-BiCGStab for forward scattering computations, and for their implementation within the reconstruction procedure. In Section 4, algorithmic performance is evaluated with simulated data, as a function of the object complexity, with different illuminating frequencies and contrast levels. In Section 5, our method is applied to real measurements extracted from the Fresnel database [54], where reconstruction quality and computation times are discussed. Concluding comments are given in Section 6.

Forward Scattering Problem
The forward scattering model defines the total electric field in a given volume V containing the scattering object, as a function of the spatial distribution of the object permittivity. We adopt a standard approach based on the integral formulation of the domain equation [1], followed by a discretization procedure based on the method of moments [55]-see for example [14,36,37,43,44] for similar choices. We consider a background medium with constant complex permittivity b , containing the object under study with complex permittivity (r) where r denotes the vector of spatial coordinates. The object is illuminated by an incident wave with angular frequency ω. In the sequel, the time dependency exp(−jωt) of all quantities is omitted. The electric field integral equation gives an implicit relation between the unknown total electric field at any point inside the domain under study and the object permittivity distribution [1]: where E tot (r) and E inc (r) respectively denote the total and the known incident 3-D electric field vectors at point r ∈ V, µ 0 is the vacuum permeability, k b = ω 2 µ 0 b is the wavenumber in the background medium, g(r) = exp(jk b r )/4π r is the Green function and χ(r) = ( (r) − b )/ b is the contrast function. The term ∇∇· represents the gradient of the divergence with respect to r.
Using the discretization procedure based on the method of moments (see [14] for the 3-D case), the volume V is divided into N cubic voxels and the integral Equation (1) turns into: The size-3N vectors e inc and e tot respectively contain the three spatial components of the 3-D incident and total electric field vectors, discretized on the N voxels that are sorted according to the lexicographic order. The size-N vector x corresponds to the contrast function discretized on the N voxels, put in vector form, and X is the 3N × 3N diagonal matrix whose diagonal contains three replicas of x. The 3N × 3N matrix G D corresponds to the discretization of the 3 × 3 Green tensor k 2 b I 3 + ∇∇ g(r − r ) on the N voxels, where I 3 denotes the 3 × 3 identity tensor (see for example [53] for details). From (2), the discretized total field is the solution to the following linear system: where I 3N denotes the 3N × 3N identity matrix. In microwave imaging, the object is successively illuminated with a set of N S sources at different spatial locations, creating incident fields e inc i , i = 1, . . . , N S . Then, computing the corresponding total fields e tot i amounts to solving N S linear systems L x e tot i = e inc i , that is: where all linear systems involve the same system matrix L x . Such a property is the key point for our block resolution strategy.

Observation Model
The scattered field E scat at any point r outside V is defined by the volume integral observation equation: Using the same discretization procedure as in Section 2.1, considering a set of N M measurement points, it becomes: where the size-N M vector e scat collects the measurements of the scattered fields at the receiver locations, and the N M × 3N matrix G O corresponds to the discretization of the Green tensor k 2 b I 3 + ∇∇ g(r − r ), for r at the N voxels in V and r at the measurement points [56]. Combining both domain (3) and observation (6) equations finally yields the observation model:

Inverse Problem Formulation
The microwave imaging inverse problem consists of estimating the contrast function from the measured scattered fields at the receivers. This problem is ill-posed [8,9], notably because the relation between the contrast function and the data, obtained from the two coupled Equations (1) and (5) in the continuous formulation (Equations (2) and (6) in the discrete case), is highly nonlinear.
Here, we adopt a classical approach to inverse problems [13,57], which consists of minimizing the misfit between the data e scat i and the simulated scattered fields e scat i , penalized by an additive regularization term. More precisely, we define the reconstructed contrast function x by: where the regularization term operates on differences of the contrast function at neighboring voxels: where notation i ∼ j selects indices of neighbor voxels and ϕ is the " 2 1 " function ϕ(u) = √ δ 2 + u 2 , where δ is a small positive constant ensuring the differentiability of the cost function, which was set to 10 −2 in all our experiments. Such a penalization favors edge-preserving solutions and leads to a smooth cost function F . Therefore, optimization can be addressed with gradient-based iterative algorithms. The penalization weight γ trades off between fidelity to data and regularization, which is necessary since the problem is ill-posed [8,9]. Its value may depend on the noise level and on the problem size.
Note that alternate regularization strategies, such as multiplicative regularization [33,58], have also been proposed. Here, we chose additive regularization because it was previously shown to be particularly efficient in the case of highly contrasted objects [13]. However, the techniques presented in this paper may also be applied to reconstruction methods based on multiplicative regularization, since both the corresponding objective function and its gradient present similar algebraic characteristics.

Optimization and Computational Issues
We propose to solve the optimization problem (9) with the L-BFGS algorithm, which is an acknowledged method for solving high-dimensional convex optimization problems ( [59], see for example [13] in application to microwave imaging). Each iteration of L-BFGS consists of two steps: computing a descent direction, based on the computation of the gradient at the current point, and finding a convenient step size, for which several evaluations of the cost function may be necessary. The gradient of F reads: where Q = [I N , I N , I N ], the symbol † denotes the conjugate transpose and J i is the Jacobian matrix associated to the i-th data misfit term in (8). It can easily be shown that: where diag{u} stands for the diagonal matrix with diagonal u, such that the gradient reads: where L x denotes the conjugate of L x . Inspection of (8) and (12) reveals that the bulk of the computational effort required for evaluating F and ∇F lies in the evaluation of matrix-vector products involving matrices X and G O , and in the resolution of linear systems with matrices L x and L x . Note that G O is an N M × 3N matrix and this manageable size makes it possible to explicitly store it for direct computations; similarly, X is a 3N × 3N diagonal matrix, which results in small memory and computational requirements. Indeed, the major difficulty lies in the resolution of the N S systems of size 3N × 3N involving matrix L x (computation of L −1 x e inc i ), and in the resolution of another N S similar systems involving matrix ). Note that, in the frequent case where the number of sources exceeds the number of receivers, by virtue of the Lorentz reciprocity theorem (see for example [60]), one can equivalently switch the roles of sources and receivers, then reducing the number of systems to be solved.
In Section 3, we propose a dedicated block resolution strategy in order to jointly solve such multiple systems. This iterative method requires the computation of high-dimensional matrix-vector products involving matrix L x (resp. L x ), that is, involving the 3N × 3N matrix G D (resp. G D ). Note that these products are convolution products, which can be efficiently computed with 3-D Fast Fourier Transform (FFT) algorithms [36,47].

Simultaneous Resolution of Multiple Forward Scattering Problems
We now address the joint resolution of the multiple forward scattering problems (4) by the block-BiCGStab algorithm proposed in [52]. We present the algorithm and we focus on implementation issues for solving one set of multiple scattering problems.

Description of the Block-BiCGStab Algorithm
The pseudo-code implementing the block-BiCGStab algorithm for the joint resolution of (4) is given in Algorithm 1. The notation T, S F represents the Frobenius product of two matrices. For comparison purposes, the sequential implementation with the BiCGStab algorithm, commonly used in microwave imaging, is recalled in Algorithm 2. For an initial guess e tot Choose an arbitrary vectorr 0 3: : 14: k ← k + 1 15: until r (k) / e inc i < tolerance, 16: end for In both algorithms, the most time-consuming operations concern the computation of matrix-vector products of the form L x p = p − G D Xp in steps 5 and 8. Note that the same number (2N S ) of such products are performed by one iteration of block-BiCGStab and N S iterations of BiCGStab. In the remaining steps, a set of N S dot products and scalar-vector products (one per system inversion) in BiCGStab are replaced by one N S -dimensional matrix-vector product in block-BiCGStab. Similarly, a set of N S scalar divisions are replaced by one N S × N S system inversion (steps 6 and 12). Therefore, such block operations generate a slight extra cost in block-BiCGStab compared to BiCGStab. Because of the small size of the involved systems, this remains negligible compared to the computation time required by multiplications with L x . Most of all, block operations introduce some coupling in the auxiliary variables and, subsequently, in the directions that are computed in order to update the solution at step 10. The rationale behind block versions is that coupling the descent directions yields more efficient update steps, and therefore results in a decrease in the total number of iterations and computing time.

Efficient Implementation for Solving Multiple Forward Problems
In its original description [52], the block-BiCGStab algorithm was evaluated on problems involving only a small number (5 to 10) of RHS vectors which were drawn independently, in moderate dimensions (up to 2500 × 2500). In microwave imaging, the number of RHS vectors-the number of measurements-is much higher, and vectors are highly correlated, since they correspond to the different incident illuminating fields, which are spatially close to each other. Moreover, matrices involved in 3-D problems are usually much bigger than those used in [52].
Numerical experiments, whose salient results are reported in Section 4.2, revealed that the conditioning of matrix R † 0 V, which enters steps 6 and 12 of Algorithm 1, has a critical impact on the behavior and convergence of the block-BiCGStab algorithm. Inspection of steps 1 and 5 clearly indicates that the conditioning of V strongly depends on that of R (0) . With the standard initialization E tot (0) = E inc , when two sources i and j are close, the corresponding incident fields e inc i and e inc j which make up the corresponding columns of E tot (0) , become highly correlated; hence, R (0) becomes poorly conditioned and R † 0 V is near-singular, which may lead to numerical errors propagating in the computations of matrices A and B (steps 6 and 12). This pathological behavior does not occur in the simulated experiments reported in [52], since only statistically independent RHS vectors are considered.
Here, we propose to add a small random quantity to E tot (0) in the initialization step 1. This reduces the correlation between columns of E tot (0) , and thus improves the conditioning of matrix R † 0 V. Empirically, we have found that an initial perturbation with a signal-to-noise ratio ranging from 40 to 100 dB yielded roughly the same favorable effect, while a higher level of perturbation would slow down the convergence, and a smaller one would amount to no perturbation. In all experiments in this paper including such noisy initialization, the signal-to-noise ratio was set to 50 dB. The results reported in Section 4.2 show that the algorithm behaves in a satisfactory manner with such an initialization. Regarding the choice of matrix R 0 , various options were studied; the best results were obtained by using the choice proposed in [52], that is,

Implementation within the Reconstruction Procedure
We now consider the use of the block-BiCGStab algorithm for solving the two multiple linear systems involved in the evaluation of the cost function (8) and of its gradient (12) at each iteration of the optimization procedure. The first system reads S (t) : E tot (t) = L −1 x (t) E inc , and the second one is of the form S (t) : where notation (t) indexes the iteration number. According to the expression of the gradient (12), the RHS term U (t) in S (t) depends on the solution of S (t) , so the two systems can only be solved sequentially and not jointly. In the following, details are given considering S (t) , but similar arguments hold for S (t) .
As discussed in Section 3.2, initialization of block-BiCGStab plays an important role in the algorithm efficiency. Our implementation is based on the fact that two consecutive iterates of the optimization algorithm are close to each other, especially when the reconstruction algorithm is near to convergence.
Therefore, the solution E tot (t+1) should be close to that obtained at the former iteration E tot (t) , and E tot (t) could be used as an initialization of block-BiCGStab for solving S (t+1) . In particular, E tot (t) should be much closer to E tot (t+1) than the matrix formed by the incident fields E inc , that was used for initializing block-BiCGStab for the resolution of a single set of forward problems in Section 3.2. We noticed, however, that implementing this strategy led to poor convergence rates, because the total fields E tot (t) are highly correlated with each other, which generated conditioning issues similar to that explained in Section 3.2.
Therefore, we propose to initialize iteration t + 1 using a random perturbation added to E tot (t) .

Performance of Block-BiCGStab on Synthetic Data
The results presented in this section were obtained on synthetic data representing a typical experimental setup. First, we illustrate the impact of the implementation choices discussed in Section 3.2. Then, we investigate the performance of the proposed block-BiCGStab implementation for computing multiple forward scattering problems as a function of the problem difficulty, by varying the object contrast and the illuminating frequency. Finally, the computational efficiency of block-BiCGStab is evaluated on the full reconstruction process.

Description of the Numerical Example and Implementation Details
The object used in our experiments was composed of two nested cubes embedded in air. This example was taken from [33] and the contrast was increased by a factor of two in order to make the reconstruction problem more challenging. The object domain V was a cube with a 30-cm edge size, which was discretized onto a 30 × 30 × 30 regular Cartesian grid (1 cm 3 voxels). It is usually acknowledged that the discretization grid must have at least four voxel sides per wavelength for accurate estimation of the scattered field. Here, the highest illuminating frequency is 3 GHz (see Section 4.3.1), that is, λ = 10 cm, so that the voxel side is λ/10. The external part of the object was a cube with a 20-cm edge size, the contrast of which was set to χ 1 = 0.6 + j0.8. The internal part was a smaller cube with a 10-cm edge size and contrast χ 2 = 1.2 + j0.4. The object was illuminated by N S = 160 sources (vertical electric dipoles) that were regularly spaced around the object on five circles with a 60 cm diameter, at heights z = −20 cm, −10 cm, 0, 10 cm and 20 cm. Measurements were simulated by considering 160 receivers at the same locations as the sources. The object and the acquisition setup are represented in Figure 1.
Data were generated according to model (7) using a grid twice as fine as the one used in our numerical experiments (60 × 60 × 60 voxels) for more accurate computations, and circular Gaussian white noise was added in order to obtain a 20-dB SNR. Only the z component of the scattered field was used in our tests. Then, the reconstruction problem had 30 3 = 27,000 complex unknowns and 160 2 = 25,600 complex-valued data points.
All computations were performed with an Intel Core i7-5960X (eight cores) clocked at 3 GHz using Matlab, with multithreaded operations. For BiCGStab, a marching-on-in-source procedure [37] was used when it improved convergence. This procedure allows for better initialization of the solutions by using an interpolation based on the total fields previously computed for the nearest sources. The tolerance on the relative residuals imposed for convergence of both BiCGStab and Block-BiCGStab (line 15 in Algorithms 1 and 2) was set to 10 −6 . Therefore, the solutions obtained by the two algorithms are the same up to such numerical precision.

Impact of Initialization
The solution to the 3D forward problem corresponding to the experimental setup presented in  Figure 2a,b. BiCGStab converges in between 18 and 21 iterations for all problems, whereas block-BiCGStab residuals first slowly decrease, and then increase, leading to divergence.
Then, as described in Section 3.2, noise with 50 dB SNR was added to the initial solution of block-BiCGStab. The corresponding iterates are shown in Figure 2c-BiCGStab iterates, which are not represented, behave similarly to those represented in Figure 2a. On the contrary, block-BiCGStab iterates now converge much faster: in this example, block-BiCGStab converges in 40% less iterations than BiCGStab, corresponding to a similar saving in terms of computation time.
In all remaining experiments in the paper, the block-BiCGStab algorithm was initialized by adding such a noise to the incident fields.

Performance for Different Forward Scattering Configurations
We now compare BiCGStab and block-BiCGStab for several configurations of the 3-D forward problem based on the previous setup. The tests consisted of solving the N S systems (I 3N − G D X)e tot i = e inc i , and we focused on the influence of the illuminating frequency and of the contrast values in X, since both factors impact the problem difficulty-the Born approximation, for example, is only valid for low-contrast objects with small electric size.

Impact of the Frequency
We first study the algorithmic performance as a function of the illuminating frequency. The object defined in Section 4.1 was used, with frequencies equal to 1, 2 and 3 GHz. The corresponding wavelengths are respectively λ = 30 cm, λ = 15 cm and λ = 10 cm, and the size of the reconstructed volume respectively corresponds to λ, 2λ and 3λ: the higher the frequency, the bigger the electric size of the object. Table 1 gives the corresponding central processing unit (CPU) time and the number of iterations required for BiCGStab and block-BiCGStab to converge. First, in accordance with the discussion in Section 3.1, the CPU time is almost proportional to the number of iterations performed by both algorithms. Second, block-BiCGStab always yields a shorter computation time than BiCGStab, and the improvement becomes more significant as the frequency increases: at 3 GHz, block-BiCGStab is 33% faster than BiCGStab for solving the 160 systems.

Impact of the Contrast
We now evaluate the computation time as a function of the contrast. A scaling factor ranging from 0.25 to 2.5 was applied to the synthetic object defined in Section 4.1. The illuminating frequency was set to 3 GHz. Table 2 gives the corresponding CPU time and the number of iterations required for BiCGStab and block-BiCGStab. Here again, the CPU time is almost proportional to the number of iterations. In all cases, block-BiCGStab is faster than BiCGStab and, more importantly, the relative saving in CPU time increases with the contrast of the object: for a contrast factor of 2.5, block-BiCGStab runs approximately twice as fast as BiCGStab.
The results reported in Sections 4.3.1 and 4.3.2 therefore suggest that block-BiCGStab is best suited for the resolution of difficult problems, i.e., involving large and/or highly contrasted objects.

Inversion Procedure and Object Reconstruction
We now consider the reconstruction problem with the simulation setup described in Section 4.1. Since the object was highly contrasted, illuminations at frequencies 1, 2 and 3 GHz were used and the following frequency hopping technique was implemented (see [28,61,62] for example): • the initial solution was set to zero, and N it iterations of the inversion algorithm were performed with the 1 GHz data; • another N it iterations were then performed with the 2 GHz data, with an initial solution set to the output of the previous step; • the reconstruction algorithm was finally applied to 3 GHz data, with an initial solution set to the output of the previous step, and iterations are run until the ∞ -norm of the gradient becomes lower than 10 −6 .
The number N it of iterations for each intermediate frequency is a trade-off between reconstruction quality and computation time. In this experiment, we found that N it = 20 iterations were sufficient to achieve satisfactory results (that is, more iterations at intermediate frequencies would not improve the final reconstruction and would unnecessarily increase the computation time). The regularization weight in (9) was set to γ = 10 −6 for the three frequencies.
Reconstruction was performed using BiCGStab and block-BiCGStab, following the procedure proposed in Section 3.3. Since both algorithms used the same convergence thresholds, evaluations of the objective function and of its gradient at any point by the two algorithms were nearly identical; therefore, the minimization process followed roughly the same sequence, and the reconstructed objects obtained with both approaches are comparable. Reconstruction results are shown in Figure 3, together with the true object. The shapes of the two cubes are well retrieved and the contrast of the outer cube (χ 1 = 0.6 + j0.8) is very close to that of the true object. The boundaries of the reconstructed cubes slightly exceed the true ones, especially for the inner cube. The contrast value in the inner cube is also slightly misestimated (underestimation of the real part and overestimation of the imaginary part). Note that the inner cube is highly contrasted (χ 2 = 1.2 + j0.4), which makes it very difficult to reconstruct. For this problem, the reconstruction lasted about 21.3 h when computations were performed with BiCGStab. The reconstruction time dropped to 16.8 hours when block-BiCGStab was used, which represents a saving of more than 20%.

Reconstruction of the Fresnel Database Objects
We finally consider the reconstruction of the five objects of the Fresnel database [54]. Such objects have already been reconstructed in many papers [27][28][29]61,63,64]-see also a summary of reconstruction results in [65]. Due to space limitations, we restrict the presentation to two objects, which can be considered as the two most difficult ones: the TwoSpheres and the Cylinder objects.
The 3-D experimental setup for the Fresnel data set is detailed in [54]. Using the reciprocity theorem and the two polarizations of the experimental data [29], we consider that the setup was equivalently composed of 36 sources placed on a circle at z = 0 with a 1.796-m radius. Data were equivalently acquired with 81 receivers placed on a sphere with a 1.796-m radius [54]. The incident fields were plane waves polarized along the z-axis and acquisitions were performed at frequencies ranging from 3 to 8 GHz. The two polarizations were used (co-and cross-polarization). Since the measurements were performed in the far-field zone of the scattering object, the electric field at the receivers is transverse (that is, its longitudinal component is zero). When projected into the Cartesian coordinate system, the three spatial components of the measured field are generally non-zero. Thus, each data set is made up of 36 × 81 × 3 = 8748 complex-valued measurement points. For all objects, the volume of interest V was defined as a cube with a 10 cm edge size divided into 30 × 30 × 30 voxels, so that the object is fully enclosed in V in each case. The maximum frequency is 8 GHz, that is, λ = 37.4 mm, so that the coarser discretization scheme corresponds to 11.3 voxel sides per wavelength. We then have 27,000 complex-valued unknowns. In all experiments, the regularization parameter γ in (9) was set to γ = 10 −6 .
Here also, as discussed in Section 4.4, a frequency hopping procedure was used to carry out the reconstructions as follows: • Perform N it iterations of the reconstruction algorithm with the 3 GHz data and initial contrast function set to zero.

•
Perform N it iterations of the reconstruction algorithm with the 4 GHz data and initial contrast function set to the final iterate of the previous step.

•
Repeat the previous step with the 5 GHz, 6 GHz and 7 GHz data. • Perform reconstruction of the object with the 8 GHz data, initial contrast function set to the final iterate of the previous step and a stopping rule on the ∞ -norm of the gradient set to 10 −6 .
For all objects except Cylinder, computing N it = 10 iterations at each frequency empirically led to a satisfactory compromise between reconstruction quality and time. For Cylinder, which was bigger and more contrasted, N it = 20 intermediate iterations were necessary in order to achieve the best reconstruction.
The TwoSpheres object was composed of two identical spheres in contact embedded in air, 50 mm in diameter and with uniform, real-valued, relative permittivity equal to 2.6, that is, contrast χ = 1.6. Figure 4 (top) shows 30 slices of the true object and of our corresponding reconstruction, and Figure 4 (bottom) represents the isosurface of the reconstruction at the contrast value χ = 1. Note that we only present the real part of the estimated contrast function because the estimated imaginary part was very small. The latter is presented in the supplementary data files. It can be observed that the shape of the object was well reconstructed, even at the contact point. The contrast value was slightly overestimated at the contact point (the estimated contrast was about 1.8), and slightly underestimated around it (the estimated contrast was about 1.3).
The Cylinder object was composed of a cylinder of 80 mm in length and 80 mm in diameter, with uniform relative permittivity equal to 3.05 (contrast χ = 2.05). It was the most difficult object of the database because of its large size and high contrast, and several methods failed to reconstruct this object [28,33,63,66], while many others produced poor quality results [27,29,61,62,64]. Figure 5 shows our reconstruction results. The shape and the contrast were adequately reconstructed: the contrast function was homogeneous inside the object and the estimated value at the center voxels was close to the true one, since the estimated permittivity is about 1.9. To the best of our knowledge, this reconstruction of Cylinder outperformed all other results previously reported in the literature.  On a broader perspective, Table 3 provides a heuristic assessment of the reconstruction quality for all objects in the Fresnel database. Each cell in the table contains a subjective comparison of our result with previously obtained ones for each object, based on the figures in the corresponding papers. We consider that the proposed method provided results at least similar to previous ones, and often better for the most difficult objects. The reader is referred to supplementary files associated with the paper, that depict graphical representations of all reconstructed objects and provide all reconstruction results as Matlab files. Table 3. Subjective comparison of the obtained reconstruction results with previously published works on Fresnel database objects: "=", "+" and "++" respectively mean that our reconstruction is of comparable, better, and much better quality. N/A means that reconstruction of the object is not presented in the corresponding paper.  Table 4 finally gives the reconstruction time for the five objects of the Fresnel database using either BiCGStab or block-BiCGStab. The time saved by block-BiCGStab becomes more important as the difficulty of the reconstruction problem increases (i.e., as objects become larger and/or more contrasted). For the most difficult Cylinder object, the reconstruction time drops from 60 h to 28.3 h, which means that block-BiCGStab is more than twice faster than BiCGStab.

Discussion
We proposed a reconstruction strategy based on an additive regularization framework, where optimization is performed via first-order optimization. Computations of the cost function and of its gradient were performed through a dedicated implementation of the block-BiCGStab algorithm for multiple RHS systems inversion. In all our experiments, such an implementation led to a significant reduction of the computation time (up to one half) compared to BiCGStab, the gain being all the more significant that the reconstruction problem is difficult and the computation time is high. The procedure proposed in this paper is therefore particularly attractive for problems dealing with high-frequency illuminations and/or highly contrasted objects, which are of great interest: high-frequency acquisitions contain more detailed spatial information about the object under study, yielding higher resolution, and highly contrasted objects may be encountered in many application areas, such as biomedical imaging. Consequently, the proposed procedure may contribute to reducing the computational burden of microwave imaging, which still represents a crucial issue in the development of practical applications.
Improved convergence properties of block-BiCGStab over its sequential counterpart also enable block-BiCGStab to reach a low residual level for each iterative system inversion. In our experiments, the tolerance threshold on the norm of the residuals was set to 10 −6 , whereas higher values are commonly used. More accurate solutions to the forward problem yield more efficient optimization steps; this may explain why the proposed procedure achieved satisfactory reconstruction results for the most difficult objects of the Fresnel database, for which no result of equivalent quality has been reported in the literature.
Exploiting the proposed block resolution strategy may also improve the efficiency of any other microwave imaging inversion method requiring the computation of similar quantities (i.e., the total fields from the incident fields), either based on additive [13] or multiplicative [33] regularization. This strategy can also be advantageously used whatever the choice of the optimization method, such as non-linear conjugate gradient [29,64], regularized Gauss-Newton [33], Levenberg-Marquardt [67], or distorted Born iterative method [27]. It would also be worth studying the numerical performance of the block-BiCGStab algorithm in different geometrical configurations concerning, e.g., data acquisition setups or discretization schemes. Finally, it may also be efficiently used in other applications involving the computation of the total electromagnetic field accounting for scattering, e.g., radar cross-section [48,68,69], eddy-current [26,70] or magnetotelluric [71] computations. Depending on the complexity of the problem, specific tuning of the block-BiCGStab algorithm could be required, for which the strategies proposed in this paper may help.
Further improvements may be brought by parallel implementation of the proposed procedure. In our experiments, multicore architecture was taken advantage of only by performing multithreaded operations at a low level within the block-BiCGStab loop. However, massive parallelization of the resolution of independent systems (with BiCGStab) reduces the computation time by exploiting parallelization at a higher level [33]. Indeed, the two parallelization levels could be combined into a block resolution procedure, where the multiple systems would be split into several subsystems to be solved in parallel. In [53], such an idea was successfully tested on an early version of block-BiCGStab. This remains a promising possibility to accelerate microwave imaging in the context of High Performance Computing, as well as parallelization capacities brought by GPU-or multi-GPU-based implementations.