Application of a Deep Neural Network to Phase Retrievalin Inverse Medium Scattering Problems

We address the inverse medium scattering problem with phaseless data motivated by nondestructive testing for optical fibers. As the phase information of the data is unknown, this problem may be regarded as a standard phase retrieval problem that consists of identifying the phase from the amplitude of data and the structure of the related operator. This problem has been studied intensively due to its wide applications in physics and engineering. However, the uniqueness of the inverse problem with phaseless data is still open and the problem itself is severely ill-posed. In this work, we construct a model to approximate the solution operator in finite-dimensional spaces by a deep neural network assuming that the refractive index is radially symmetric. We are then able to recover the refractive index from the phaseless data. Numerical experiments are presented to illustrate the effectiveness of the proposed model.


Introduction
Consider an operator T 0 from a certain vector space X to another vector space Y: The direct problem may be regarded as finding y for a given x. On the other hand, in the inverse problem, one seeks x, which represents the parameters characterizing the system, from the measurement y. In many important applications, such as signal analysis, imaging, crystallography, and optics, the space Y is defined in the complex field, i.e., y is a complex valued function such as y = |y|e iΦ . However, it is very computationally expensive or even impossible to measure the phase Φ. For this reason, the phase retrieval problem, written as y = T (x) := |T 0 (x)|, has been studied intensively and has a long and rich history in various aspects. One of the main difficulties in the phase retrieval problems lies in nonuniqueness. Consider a classical signal recovery problem that reconstructs the signal f (t) from the amplitude of its Fourier transform F ( f ) := f (t)e −iξt dt. Even taking into account trivial ambiguities such as translation and reflection invariance, it is known that there is no uniqueness in phase retrieval. For example, for any function g such that F (g) = e iΦ(ξ) , the amplitudes of F ( f * g) and F ( f ) are identical. Here, f * g denotes the convolution of f and g. This nonuniqueness in phase retrieval can be removed by restricting the domain or property of the operator T . We refer the reader to references [1][2][3][4] for comprehensive studies on phase retrieval problems.
In this manuscript, we address the phase retrieval arising in the 2-dimensional inverse medium scattering problem, which is governed by the following Helmholtz equation: where k > 0 is the wave number and q > 0 is the refractive index. Throughout this paper, it is assumed that 1 − q has compact support. That is, for a circle B R 0 centered at 0 with radius R 0 , The total field u comprises the incident field u i and the scattered field u s : The incident field u i (x; d) = exp(ikx · d) with incident direction d ∈ S 1 solves the homogeneous equation The scattered field u s satisfies the Sommerfeld radiation condition The goal of the inverse medium scattering problem is to reconstruct the refractive index q from the measurement of scattered fields. This problem has been studied analytically as well as numerically in many fields, such as medical imaging, nondestructive testing, optics, radar, and seismology. See [5][6][7] for comprehensive surveys of this problem.
Our work here is especially motivated by the nondestructive testing of optical fibers. Generally, an optical fiber consists of a core surrounded by cladding material with the desired index of refraction. However, the index of refraction may be inaccurate due to manufacturing error, which would be problematic in optical communications. One testing method is to identify the refractive index profile q from the measurement of near-field data, i.e., u(x; d)| |x|=R . However, it is very expensive to measure the scattered field, especially for high frequencies. For this reason, we seek the refractive index from the modulus of the full field data at |x| = R, where R ≥ R 0 , i.e., we want to solve Note that from the motivation, we assume that q is radially symmetric throughout this paper, i.e., q = q(r), r = |x|.
This problem can be categorized into the class of inverse problems with phaseless data, which have also been widely studied over the past decades (see, e.g., [8][9][10][11][12][13]). Although several results have been provided for the uniqueness of the inverse problem without phase information under certain restrictions (e.g., [14][15][16][17][18]), to the authors' best knowledge, identification of the refractive index from the measured data set {|u(x, d)| : |x| = R, d ∈ S 1 } remains open. Furthermore, it is well known that the inverse problem is severely ill-posed in the sense that the solution q is highly sensitive to changes in the data |u(x, d)| |x|=R .
To overcome these difficulties, we invoke a deep neural network (DNN) [19] to approximate T −1 . Several works have described solving inverse problems and performing phase retrievals using DNNs. Indeed, DNNs have been shown to successfully solve various inverse problems in imaging in the last few years. We refer the reader to [20][21][22][23][24] and the references therein. In particular, there are several works that have set out to solve phase retrieval in imaging. See, for example, [25,26]. It is worth mentioning that Xu et al. proposed deep learning methods for inversion of the electromagnetic inverse scattering problem without phase information in [27]. They reconstructed piecewise constant relative permittivity profiles using the U-net convolutional neural network. In our work, we use a multilayer perceptron to train T −1 (or T ) on the subspace X M of X by considering the inverse problem as a prediction of a set of points that are Fourier coefficients of 1 − q. Then, we are able to successfully construct models to approximate T −1 for various wavenumbers k. As ∪ ∞ M=1 X M = X, it is expected that the resolution of the reconstructed q ∈ X can be increased by taking a large enough M with the proposed approach.
The rest of this paper is organized as follows: In Section 2, we discuss the direct problem for the system (1) and define the operator T , both of which are essential for obtaining the data set. In Section 3, we discretize the function spaces X and Y and redefine operator T accordingly. Then, we illustrate the numerical results with examples. Some concluding remarks are drawn in the last section.

Mathematical Model
In this section, we discuss the direct problem of the system (1) under assumption (2). For a general refractive index q, the solution u to the system (1) depends on the incident direction d, and they are related highly nonlinearly. However, the symmetric assumption (2) gives where Q is the rotation operator such that due to the rotational invariance of the Laplacian. The uniqueness of the direct problem (1) That is, for a given u(x; d 1 ), one can determine u(x; d) for any d ∈ S 1 as long as the refractive index q is radially symmetric. Thus, without loss of generality, we set d = (1, 0) and we omit d from u(x; d) and u i (x; d).
It is well known that the system of Equations (1) is equivalent to the Lippmann-Schwinger equation (e.g., [6]), where H (1) n is the Hankel function of the first kind of order n. Let Then, we deduce that s n solves n (kr > )(1 − q(r))s n (r)dr, 0 ≤ r < ∞ from substitutions of the Jacobi-Anger formula [28] and the addition theorem [29] H (1) Here, (r, θ) and (r,θ) are the polar coordinates for x and y, respectively. J n is the Bessel function of the first kind of order n, r < represents the lesser of r andr, and r > represents the greater. By taking into account the support of 1 − q and the change of variables ρ = kr,ρ = kr, Here, we denote It is worth noting that (5) can be derived by the method of separation of variables, which also justifies the representation of (4) (e.g., [30]). Furthermore, one can show that the integral equation (5) is uniquely solvable in L ∞ for p ∈ L 2 [6,30]. Together with the asymptotic behavior of the Bessel functions |J n (ρ)| ≤ C/n (0 ≤ ρ ≤ n/2) [31], it follows that for some constant We also remark that as J −n = (−1) n J n and H −n = (−1) n H n , the uniqueness of (5) yields S −n (ρ) = S n (ρ), 0 ≤ ρ ≤ ρ 0 . Now, we formulate the inverse problem considered here. Let X = L 2 (0, R) and Y = L 2 (0, 2π), and define For a given S, we are interested in solving

Deep Neural Network for the Inverse Problem
Given a set of data pairs {(T (q), q)}, we seek to approximate T −1 with the neural network to solve (6). To this end, it is necessary to discretize q(r) and S(θ) and approximate X and Y in finite-dimensional spaces. Since it is assumed that 1 − q(|x|) is compactly supported in B(0, R) (1b), we restrict X to square integrable functions such that q(R) = 1. We further assume that q(0) = 1 to simplify the problem. That is, 1 − q belongs to L 2 0 (0, R), a set of square integrable functions vanishing at the boundary. As {sin(mπr/R)} ∞ m=1 is an orthogonal basis for L 2 0 (0, R), we approximate q(r) by a projectile onto X M , where In this manner, we are able to discretize q(r) as a vector C = (c 1 , c 2 , · · · , c M ) T . Additionally, S(θ) is vectorized by uniform discretization {θ l } 128 l=1 of θ in [0, 2π] for the finite sum | ∑ N n=−N S n (kR)e inθ |. Here, S n (kR) is obtained from the numerical solution to the integral Equation (5). We use the trapezoidal rule for numerical integration to convert (5) to a linear system Here, ρ 1 = 0, ρ K = ρ 0 and ω β are weighting factors for the trapezoidal method. We take ∆ρ = 0.005. All the computations to generate the data were performed using MATLAB R2020a. Table 1 shows that the approximation | ∑ N n=−N S n (kR)e inθ | converges to S(θ) quickly as N increases for various q. With an abuse of notation, S denotes a column vector whose lth component is | ∑ N n=−N S n (kR)e inθ l | (l = 1, 2, · · · , 128) to represent S(θ). Now, we are in a position to approximate T −1 by DNNs have various methods such as Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and transfer learning. However, these methods are all suitable for solving classification or time series problems. As our problem can be regarded as predicting a set of points-coefficients, we use a multilayer perceptron as our DNN. The Feed-Forward Neural Network (FFNN) is used since there is no correlation between the coefficients c i , and the network was configured according to the number of coefficients to be predicted. The architecture of a multilayer perceptron is illustrated in Figure 1. We refer the reader to references [32,33] for various methods for the DNNs. A total of 12,000 numerical data sets with R = 1, N = 8, M = 5 were separated into 8000 training data sets, 2000 development data sets, and 2000 evaluation data sets. All training was performed using an NVIDIA Quadro P4000. Figure 2 shows the distribution of relative errors of the 2000 evaluation data sets for different wave numbers, i.e., { q j −q j 2 / q j 2 } 2000 j=1 . Here, q is the actual refractive index andq is the recovered one by the trained model. To compare the learning performance for each k, we configured the network with the same conditions each time. The network consists of 128 nodes for the input layer of sources and 12 hidden layers. The rectified linear unit (ReLU) is used for the activation function, and dropout is not adopted because it reduces the performance. We take the mean squared error for the loss function and Adam for the optimizer with a learning rate 0.00001. We note that in the simulations with the indicated settings, the performance of the model is low or the model is not trained for wavenumbers less than 5. This is related to the nonlinearity of the operator. Indeed, for small values of k, the scattered data are more widely distributed on {S n } than for larger k; see [11,30]. On the other hand, at high frequencies, the nonlinear Equation (6) becomes extremely oscillatory and possesses many more local minima [34]. Furthermore, the scattering data is accumulated near S 0 , i.e., |S 0 | >> |S n | for large n. This reduces the ill-posedness of the inverse problem but it also reduces the rank of {S j (θ l )}. Recall that S(θ) = ∑ N n=−N S n (kR)e inθ . Then, the model to be trained tends to be an underdetermined problem. For these reasons, the performance of the model at high frequency is low, as shown in the case of k = 20. Table 2 shows the coefficients of the recovered 1 − q when q / ∈ X 5 from the trained model for X 5 . The recovered coefficient c m (m > M) is very small in X M (M = 2, 4 < 5), and in the case of X M (M = 6, 8 > 5), our model acts as the projection of X M onto X 5 by ignoring the actual c m (m > 5)-i.e., our model is indeed an approximation of the projection of T −1 onto X 5 . This can be verified from Figure 3 as well. In Figure 3, we show how the model can recover the refractive indexes when they are general functions that do not belong to X M . We also provide the relative errors defined as P 5 q −q 2 / P 5 q 2 . Here, P 5 is the projection operator onto X 5 . Table 2. Actual coefficients c m of 1 − q(r) for q ∈ X M (M = 2, 4, 6, 8) and recovered coefficients from the trained model for X 5 with k = 7. The trained model for X 5 can recover c m only for 1 ≤ m ≤ 5. The recovered coefficients c 3 , c 4 , c 5 in X 2 and c 5 in X 4 are very small. In the case of X 6 and X 8 , the trained model is able to recover c m (1 ≤ m ≤ 5) successfully and ignores the other coefficients c m (m > 5).  One of the difficulties in solving inverse problems and phase retrieval problems is their ill-posedness; a small error in measurement may result in a much larger error in the numerical solutions. Thus, a certain regularization technique is required. The classical regularization method converts the ill-posed problem to a well-posed problem using single fidelity. On the other hand, the DNN uses group data fidelity to learn the inverse map T −1 from the training data [35]. This may also convert the ill-posed problem to a well-posed problem. Figure 4 shows how our trained model handles noise data. Indeed, we illustrate the distribution of { q j −q j 2 / q j 2 } 2000 j=1 in Figure 4, whereq is the reconstructed refractive index from noisy data S δ = S + δ with Gaussian noise δ added. We notice that the relative errors increase according to the Gaussian noise level without any sudden change. Specific examples with different signal-to-noise ratios (SNRs) are shown in Figure 5. The SNR is computed as SNR = 20 log 10 S 2 δ 2 .

Discussion and Conclusions
In this work, we solve the phase retrieval problem arising from the inverse medium scattering problem using the DNN. From the set of data pairs {(S j , q j )}, we train the model to approximate T −1 . More precisely, our model is an approximation of where P M is the projection operator onto the M-dimensional Fourier sine space. Since the Fourier sine space is dense in L 2 0 , we are able to obtain an approximate solution to T (q) = S for any 1 − q ∈ L 2 0 . In the machine learning approach, it is crucial to design the input and output arguments. In this work, we take Fourier coefficients to represent the refractive index. Then, we can reconstruct the shape well with a relatively small dimension M. Furthermore, the error may be reduced by increasing M. However, for large M, the model is not easily trained since the nonlinearity of q and S in T and the sensitivity dramatically increase as M increases. For this reason, a new network is required for larger M; this is the focus of our ongoing work. We notice that the nonlinearity of T is also related to the wavenumber k. For small k, the scattered data are more widely distributed on {S n } than for larger k. This increases the nonlinearity of the operator and yields low performance of the model. At high frequencies, the nonlinear equation we considered becomes extremely oscillatory and possesses many more local minima. Furthermore, the scattering data is accumulated, which reduces the rank of the data set. This increases the variance of performance.
Although the uniqueness of the problem considered here remains open, we can successfully reconstruct the refractive index from phaseless scattering data. In particular, our model overcomes the ill-posedness of the problems using group data fidelity. This converts an ill-posed problem to a well-posed problem. In addition, the proposed method does not require any additional computation once the model is trained by learning, and the solution can be obtained directly from the measured data, making it suitable for industrial applications such as the nondestructive testing of optical fibers.