Article

Improved Generalized Cross-Validation and Unbiased Predictive Risk Estimator Methods Using the RGSVD: Application to Inversion of Potential Field Data

School of Geophysics and Information Technology, China University of Geosciences, Beijing 100083, China
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2021, 11(14), 6326; https://doi.org/10.3390/app11146326
Submission received: 14 May 2021 / Revised: 22 June 2021 / Accepted: 6 July 2021 / Published: 8 July 2021

Abstract

The generalized cross-validation (GCV) and unbiased predictive risk estimator (UPRE) methods are widely used to determine the regularization parameter in the inversion of potential field data. However, both methods are time-consuming, and it is difficult to choose a linear search range that contains the optimal regularization parameter. To solve these problems, this article improves the GCV and UPRE methods using the randomized generalized singular value decomposition (RGSVD) algorithm. The improved methods first use the randomized algorithm to compute an approximate generalized singular value decomposition (GSVD) in less computational time. Then, the linear search range is determined from the generalized singular values. Finally, the GCV and UPRE functions are computed efficiently from the results of the RGSVD algorithm. In this way, the GCV and UPRE methods using the RGSVD algorithm determine the optimal regularization parameter quickly and effectively. A comparative test demonstrates the effectiveness and efficiency of both methods.

1. Introduction

Various geological and geophysical problems can be solved by the inversion of potential field data [1,2]. Generally, the inversion of potential field data is an ill-posed problem, which means that the solution is usually non-unique and unstable [3]. Tikhonov regularization [4] can stabilize ill-posed problems, and its success depends on estimating a suitable regularization parameter, e.g., [5]. Many methods have been introduced in the literature to determine an optimal regularization parameter, such as the Morozov discrepancy principle (MDP) method [6], the generalized cross-validation (GCV) method [7], the L-curve (LC) method [8], and the unbiased predictive risk estimator (UPRE) method [9].
Here, we focus on the GCV and the UPRE methods. The conjugate gradient (CG) method and generalized singular value decomposition (GSVD) can be used to compute the GCV and the UPRE functions in a linear search range to find the optimal regularization parameter. However, when using the CG method, the process is time-consuming, and it is difficult to determine the optimal linear search range including the optimal regularization parameter. When using the GSVD, the optimal linear search range can be determined by analyzing the spectrum (generalized singular values) of the kernel matrix [9], but computational costs and memory requirements limit the application of this method.
In this paper, the RGSVD algorithm [10] is adopted in the GCV and UPRE methods to determine the optimal regularization parameter. The RGSVD algorithm uses a randomized algorithm to compute an approximation of the GSVD with lower memory requirements and less computing time [10], and the linear search range can then be determined from the approximate generalized singular values. The result of the RGSVD also allows the GCV and UPRE functions to be computed efficiently. Therefore, the improved GCV and UPRE methods using the RGSVD algorithm determine the regularization parameter quickly and effectively. A comparative test demonstrates the performance of both methods.

2. Inversion Methodology

Generally, the model domain is discretized into many cells, whose physical properties are the model parameters. Because the measured data depend linearly on the model parameters, the inverse problem of potential field data can be written in matrix form as
$$\mathbf{d}_{obs} = \mathbf{G}\mathbf{m}, \tag{1}$$
where $\mathbf{d}_{obs} \in \mathbb{R}^{n}$ is the measured data vector, $\mathbf{m} \in \mathbb{R}^{m}$ is the vector of unknown model parameters, and $\mathbf{G} \in \mathbb{R}^{n \times m}$ is the kernel matrix. Here, $n$ is the number of measured data and $m$ is the number of unknown parameters. In general, $m$ is much larger than $n$, so the inversion of potential field data is an under-determined problem. Equation (1) can be solved directly through the singular value decomposition (SVD) of $\mathbf{G}$ or a least-squares solution, but the inversion result obtained in this way may not conform to reality. This ill-posedness can be addressed by introducing a regularization term [4]. The objective function of the inversion then has the form
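$$\phi = \left\| \mathbf{W}_d \left( \mathbf{G}\mathbf{m} - \mathbf{d}_{obs} \right) \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{Z} \mathbf{m} \right\|_2^2. \tag{2}$$
In Equation (2), $\left\| \mathbf{W}_d (\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}) \right\|_2^2$ is the data misfit, $\left\| \mathbf{W}_m \mathbf{Z} \mathbf{m} \right\|_2^2$ is the regularization term, and $\mu$ is the regularization parameter, which balances these two terms. In the data misfit, $\mathbf{W}_d$ is a data weighting matrix. In the regularization term, $\mathbf{W}_m \in \mathbb{R}^{4m \times m}$ is a smoothness constraint matrix of the form $\mathbf{W}_m = [\mathbf{W}_s;\ \mathbf{W}_x;\ \mathbf{W}_y;\ \mathbf{W}_z]$, where $\mathbf{W}_s$, $\mathbf{W}_x$, $\mathbf{W}_y$, and $\mathbf{W}_z$, each in $\mathbb{R}^{m \times m}$, are its component matrices [11,12], and $\mathbf{Z}$ is a depth-weighting matrix [11,12].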
Generally, the physical bound constraint is incorporated into this inversion for obtaining a geologically plausible inversion result. Then, the inverse problem of potential field data becomes the following constrained minimization problem
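$$\begin{aligned} \text{minimize:} \quad & \phi = \left\| \mathbf{W}_d \left( \mathbf{G}\mathbf{m} - \mathbf{d}_{obs} \right) \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{Z} \mathbf{m} \right\|_2^2 \\ \text{subject to:} \quad & \mathbf{m}_{min} \le \mathbf{m} \le \mathbf{m}_{max}, \end{aligned} \tag{3}$$
where $\mathbf{m}_{min}$ and $\mathbf{m}_{max}$ are vectors consisting of the lower and upper physical bounds on the unknown model values. Because the matrix $\mathbf{Z}$ is diagonal, the following transformations can be performed easily:
$$\mathbf{h} = \mathbf{W}_d \mathbf{G} \mathbf{Z}^{-1}, \quad \mathbf{r} = \mathbf{W}_d \mathbf{d}_{obs}, \quad \text{and} \quad \mathbf{y} = \mathbf{Z}\mathbf{m}. \tag{4}$$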
Then, Equation (2) is rewritten as
$$\phi = \left\| \mathbf{h}\mathbf{y} - \mathbf{r} \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{y} \right\|_2^2. \tag{5}$$
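The change of variables in Equation (4) can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function names, the random test matrices, and the problem sizes are all hypothetical and chosen only to show that Equations (2) and (5) give the same value when $\mathbf{y} = \mathbf{Z}\mathbf{m}$.

```python
import numpy as np

def objective_m(G, W_d, W_m, Z, m, d_obs, mu):
    """Equation (2): data misfit plus mu^2 times the regularization term."""
    misfit = np.linalg.norm(W_d @ (G @ m - d_obs)) ** 2
    reg = np.linalg.norm(W_m @ (Z @ m)) ** 2
    return misfit + mu**2 * reg

def to_standard_form(G, W_d, Z, d_obs):
    """Equation (4): h = W_d G Z^{-1} and r = W_d d_obs, with Z diagonal."""
    z = np.diag(Z)
    h = (W_d @ G) / z      # dividing column j by z_j equals right-multiplying by Z^{-1}
    r = W_d @ d_obs
    return h, r

def objective_y(h, W_m, y, r, mu):
    """Equation (5): the same objective expressed in the variable y = Z m."""
    return np.linalg.norm(h @ y - r) ** 2 + mu**2 * np.linalg.norm(W_m @ y) ** 2

# Hypothetical small test problem (random matrices, not a gravity kernel).
rng = np.random.default_rng(0)
n, m_cells = 5, 12
G = rng.normal(size=(n, m_cells))
W_d = np.diag(rng.uniform(0.5, 2.0, size=n))
Z = np.diag(rng.uniform(0.1, 1.0, size=m_cells))
W_m = rng.normal(size=(4 * m_cells, m_cells))
m_vec = rng.normal(size=m_cells)
d_obs = rng.normal(size=n)
mu = 0.7

h, r = to_standard_form(G, W_d, Z, d_obs)
phi_m = objective_m(G, W_d, W_m, Z, m_vec, d_obs, mu)
phi_y = objective_y(h, W_m, Z @ m_vec, r, mu)
```

Because $\mathbf{Z}$ is diagonal, its inverse is applied as a column scaling, so no matrix inversion is needed.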
The constrained minimization problem (3) becomes the following new problem
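$$\begin{aligned} \text{minimize:} \quad & \phi = \left\| \mathbf{h}\mathbf{y} - \mathbf{r} \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{y} \right\|_2^2 \\ \text{subject to:} \quad & \mathbf{y}_{min} \le \mathbf{y} \le \mathbf{y}_{max}, \end{aligned} \tag{6}$$
where $\mathbf{y}_{min} = \mathbf{Z}\mathbf{m}_{min}$ and $\mathbf{y}_{max} = \mathbf{Z}\mathbf{m}_{max}$ are vectors of the lower and upper bounds for $\mathbf{y}$. The logarithmic barrier method is adopted to incorporate the physical bound constraint into the objective function [13,14,15]. Then, the new objective function has the form
$$\phi = \left\| \mathbf{h}\mathbf{y} - \mathbf{r} \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{y} \right\|_2^2 - 2\lambda \sum_{i=1}^{n} \left[ \ln\left( y_i - y_{min,i} \right) + \ln\left( y_{max,i} - y_i \right) \right], \tag{7}$$
where $-2\lambda \sum_{i=1}^{n} \left[ \ln\left( y_i - y_{min,i} \right) + \ln\left( y_{max,i} - y_i \right) \right]$ is the barrier function, and $\lambda$ is the barrier parameter.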
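To illustrate how the barrier function keeps the solution inside the bounds, the sketch below evaluates it at three points of a hypothetical three-cell model with bounds 0 and 1. The function name `log_barrier` and the convention of returning infinity outside the bounds are assumptions made for this example.

```python
import numpy as np

def log_barrier(y, y_min, y_max, lam):
    """Barrier term of Equation (7); +inf outside the open box (y_min, y_max)."""
    if np.any(y <= y_min) or np.any(y >= y_max):
        return np.inf
    return -2.0 * lam * np.sum(np.log(y - y_min) + np.log(y_max - y))

# Hypothetical bounds for a three-cell model.
y_min = np.zeros(3)
y_max = np.ones(3)

centre = log_barrier(np.full(3, 0.5), y_min, y_max, lam=1.0)
near_edge = log_barrier(np.array([0.5, 0.5, 1e-6]), y_min, y_max, lam=1.0)
outside = log_barrier(np.array([0.5, 0.5, 1.2]), y_min, y_max, lam=1.0)
```

The value grows without bound as any component approaches a bound, which is why a decreasing $\lambda$ lets the iterates approach the bounds gradually without crossing them.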
The Newton method [13,14,15] is used to solve the minimization of (7) iteratively. The regularization parameter μ is kept fixed during the iterations, and the barrier parameter λ decreases with iteration [13]. At the kth iteration, one step of the Newton method is applied for (7) to yield
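$$\left[ \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m + \lambda^{k} \left( \mathbf{X}_k^{-2} + \mathbf{Y}_k^{-2} \right) \right] \Delta \mathbf{y}^{k} = -\mathbf{h}^T \left( \mathbf{h}\mathbf{y}^{k-1} - \mathbf{r} \right) - \mu^2 \mathbf{W}_m^T \mathbf{W}_m \left( \mathbf{y}^{k-1} - \mathbf{y}^{0} \right) + \lambda^{k} \left( \mathbf{X}_k^{-1} - \mathbf{Y}_k^{-1} \right) \mathbf{e}, \tag{8}$$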
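where $\mathbf{X}_k = \mathrm{diag}\left( \mathbf{y}^{k-1} - \mathbf{y}_{min} \right)$, $\mathbf{Y}_k = \mathrm{diag}\left( \mathbf{y}_{max} - \mathbf{y}^{k-1} \right)$, $\mathbf{e} \in \mathbb{R}^{m \times 1}$ is the vector with all entries equal to one, and $\mathbf{y}^{0}$ is an initial model. The solution $\Delta \mathbf{y}^{k}$ is the search direction at the $k$th iteration. The strategy of Li and Oldenburg [13] is used to obtain the final inversion result, and Algorithm 1 shows the detailed steps. In step 9, the solution $\Delta \mathbf{y}^{k}$ can be obtained by the preconditioned conjugate gradient (PCG) method, and the preconditioner $\mathbf{P}_k$ has the form
$$\mathbf{P}_k = \mathrm{diag}\left( \mathrm{diag}\left( \mathbf{A}_k \right) \right), \tag{9}$$
where $\mathbf{A}_k$ has the form
$$\mathbf{A}_k = \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m + \lambda^{k} \left( \mathbf{X}_k^{-2} + \mathbf{Y}_k^{-2} \right). \tag{10}$$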
When the number of iterations $k$ reaches the preset maximum number of iterations $K_{max}$ or the change of the objective function is less than 1%, the iterative process ends. The method of estimating the regularization parameter is discussed in Section 3.
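One Newton step with the diagonal preconditioner of Equation (9) might be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the system is formed densely, which is only practical for small problems; the right-hand side is obtained by differentiating Equation (7) and halving; the optional argument `y_ref` (a hypothetical name) plays the role of $\mathbf{y}^{0}$ in Equation (8) and defaults to zero; and `jacobi_pcg` is a plain textbook PCG loop.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=500):
    """Conjugate gradients preconditioned with P = diag(diag(A)), as in Equation (9)."""
    p_inv = 1.0 / np.diag(A)
    x = np.zeros_like(b)
    res = b - A @ x
    z = p_inv * res
    d = z.copy()
    rz = res @ z
    for _ in range(max_iter):
        Ad = A @ d
        alpha = rz / (d @ Ad)
        x += alpha * d
        res -= alpha * Ad
        if np.linalg.norm(res) <= tol * np.linalg.norm(b):
            break
        z = p_inv * res
        rz_new = res @ z
        d = z + (rz_new / rz) * d
        rz = rz_new
    return x

def newton_direction(h, W_m, r, y, y_min, y_max, mu, lam, y_ref=None):
    """Search direction from half the gradient and half the Hessian of Equation (7)."""
    if y_ref is None:
        y_ref = np.zeros_like(y)
    x_inv = 1.0 / (y - y_min)        # diagonal of X_k^{-1}
    y_inv = 1.0 / (y_max - y)        # diagonal of Y_k^{-1}
    WtW = W_m.T @ W_m
    A = h.T @ h + mu**2 * WtW + lam * np.diag(x_inv**2 + y_inv**2)   # Equation (10)
    rhs = -h.T @ (h @ y - r) - mu**2 * (WtW @ (y - y_ref)) + lam * (x_inv - y_inv)
    return jacobi_pcg(A, rhs), A, rhs

# Hypothetical small test problem with bounds 0 and 1.
rng = np.random.default_rng(1)
n, m_cells = 6, 10
h = rng.normal(size=(n, m_cells))
W_m = np.eye(m_cells)
r = rng.normal(size=n)
y = np.full(m_cells, 0.5)
dy, A, rhs = newton_direction(h, W_m, r, y, np.zeros(m_cells), np.ones(m_cells),
                              mu=1.0, lam=0.1)
```

Since $\mathbf{A}_k$ is symmetric positive definite, the computed direction is a descent direction for the barrier objective.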
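Algorithm 1. Inversion of potential field data
Preparation: $\mathbf{d}_{obs}$, $\mathbf{G}$, $\mathbf{W}_d$, $\mathbf{m}_{min}$, $\mathbf{m}_{max}$, and $K_{max}$.
  1. Calculate $\mathbf{W}_m$ and $\mathbf{Z}$.
  2. Initialize $\mathbf{m}^{0} = 0.001$ and $k = 0$.
  3. Calculate $\mathbf{h} = \mathbf{W}_d \mathbf{G} \mathbf{Z}^{-1}$, $\mathbf{r} = \mathbf{W}_d \mathbf{d}_{obs}$, $\mathbf{y}^{0} = \mathbf{Z}\mathbf{m}^{0}$, $\mathbf{y}_{min} = \mathbf{Z}\mathbf{m}_{min}$, and $\mathbf{y}_{max} = \mathbf{Z}\mathbf{m}_{max}$.
  4. Estimate the regularization parameter $\mu$.
  5. Calculate $\lambda^{1} = \dfrac{\left\| \mathbf{h}\mathbf{y}^{0} - \mathbf{r} \right\|_2^2 + \mu^2 \left\| \mathbf{W}_m \mathbf{y}^{0} \right\|_2^2}{-2 \sum_{i=1}^{n} \left[ \ln\left( y_i^{0} - y_{min,i} \right) + \ln\left( y_{max,i} - y_i^{0} \right) \right]}$.
  6. while $k < K_{max}$
  7. $k = k + 1$.
  8. Update $\mathbf{X}_k = \mathrm{diag}\left( \mathbf{y}^{k-1} - \mathbf{y}_{min} \right)$ and $\mathbf{Y}_k = \mathrm{diag}\left( \mathbf{y}_{max} - \mathbf{y}^{k-1} \right)$.
  9. Solve Equation (8) for $\Delta \mathbf{y}^{k}$.
  10. Update $\mathbf{y}^{k} = \mathbf{y}^{k-1} + \delta \beta^{k} \Delta \mathbf{y}^{k}$, where $\delta = 0.925$ and
      $$\beta^{k} = \begin{cases} 1, & \text{if } \mathbf{y}_{min} < \mathbf{y}^{k-1} + \Delta \mathbf{y}^{k} < \mathbf{y}_{max} \\ \min\left( \min\limits_{y_i^{k-1} + \Delta y_i^{k} < y_{min,i}} \dfrac{y_i^{k-1} - y_{min,i}}{\left| \Delta y_i^{k} \right|},\ \min\limits_{y_i^{k-1} + \Delta y_i^{k} > y_{max,i}} \dfrac{y_{max,i} - y_i^{k-1}}{\left| \Delta y_i^{k} \right|} \right), & \text{otherwise} \end{cases}$$
  11. Exit the loop if the termination criterion is satisfied.
  12. Update $\lambda^{k+1} = \left[ 1 - \min\left( \beta^{k}, \delta \right) \right] \lambda^{k}$.
  13. End
  14. Output: Solution $\mathbf{m} = \mathbf{Z}^{-1} \mathbf{y}^{k}$.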
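The model update and the barrier-parameter update of Algorithm 1 can be sketched as below. The function names are hypothetical, and the sketch treats a trial point that lands exactly on a bound as a violation, which matches the strict inequalities in the condition for $\beta^{k} = 1$.

```python
import numpy as np

def step_length(y, dy, y_min, y_max):
    """beta^k: 1 if the full Newton step stays strictly inside the bounds,
    otherwise the fraction of the step that reaches the nearest violated bound."""
    trial = y + dy
    below = trial <= y_min
    above = trial >= y_max
    if not (below.any() or above.any()):
        return 1.0
    ratios = np.concatenate([
        (y[below] - y_min[below]) / np.abs(dy[below]),
        (y_max[above] - y[above]) / np.abs(dy[above]),
    ])
    return float(ratios.min())

def update(y, dy, lam, y_min, y_max, delta=0.925):
    """Model update and barrier-parameter update with delta = 0.925."""
    beta = step_length(y, dy, y_min, y_max)
    y_new = y + delta * beta * dy
    lam_new = (1.0 - min(beta, delta)) * lam
    return y_new, lam_new, beta

# Hypothetical two-cell example with bounds 0 and 1.
y_min, y_max = np.zeros(2), np.ones(2)
y = np.array([0.5, 0.5])

# The first component would overshoot the lower bound, so the step is cut.
y_cut, lam_cut, beta_cut = update(y, np.array([-1.0, 0.2]), 1.0, y_min, y_max)

# A small step stays inside the bounds, so beta = 1.
y_full, lam_full, beta_full = update(y, np.array([0.1, -0.1]), 1.0, y_min, y_max)
```

The factor $\delta < 1$ keeps the new model strictly inside the bounds even when the step is limited by a bound, so the logarithms in Equation (7) remain defined at the next iteration.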

3. Estimation of Regularization Parameter

When estimating the regularization parameter, we considered only Equation (5) and did not include the barrier function. The regularization parameter estimated in this way remains suitable for the minimization of (7). Here, the GCV [7] and the UPRE [9] methods are used to estimate the regularization parameter.

3.1. Generalized Cross-Validation Method

The randomized trace estimation was introduced into the GCV function to solve the difficulty of a trace calculation in the GCV function [13]. Then, this function for Equation (5) has the form
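$$GCV(\mu) \approx \frac{\left\| \mathbf{r} - \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \mathbf{r} \right\|_2^2}{\left[ n - \mathbf{R}^T \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \mathbf{R} \right]^2}, \tag{11}$$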
where $\mathbf{R}$ is a random vector whose entries are −1 or 1, each with a probability of 0.5. The parameter $\mu$ that minimizes (11) is the optimal regularization parameter $\mu_{opt}$. This parameter can be determined by a line search within a range of regularization parameters, with each function evaluation solved by the CG method [13]. However, when using the CG method, a suitable search range is difficult to determine in advance, and sometimes the optimal regularization parameter $\mu_{opt}$ is not within the selected range. Moreover, the GCV method using the CG method is time-consuming.
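The randomized trace estimate in Equation (11) can be compared with the exact trace on a small problem. The sketch below is illustrative only: it replaces the CG iterations with a dense solve, takes $\mathbf{W}_m$ as the identity, and uses hypothetical random data; the function names are not from the paper.

```python
import numpy as np

def gcv_randomized(h, W_m, r, mu, R):
    """Equation (11), with a dense solve standing in for the CG iterations."""
    n = h.shape[0]
    A = h.T @ h + mu**2 * (W_m.T @ W_m)
    fit_r = h @ np.linalg.solve(A, h.T @ r)    # h A^{-1} h^T r
    fit_R = h @ np.linalg.solve(A, h.T @ R)    # h A^{-1} h^T R
    return np.linalg.norm(r - fit_r) ** 2 / (n - R @ fit_R) ** 2

def gcv_exact_trace(h, W_m, r, mu):
    """Same function with the trace of the influence matrix computed exactly."""
    n = h.shape[0]
    A = h.T @ h + mu**2 * (W_m.T @ W_m)
    influence = h @ np.linalg.solve(A, h.T)
    return np.linalg.norm(r - influence @ r) ** 2 / (n - np.trace(influence)) ** 2

# Hypothetical test problem: random kernel, identity regularization matrix.
rng = np.random.default_rng(2)
n, m_cells = 100, 150
h = rng.normal(size=(n, m_cells)) / np.sqrt(m_cells)
W_m = np.eye(m_cells)
r = rng.normal(size=n)
R = rng.choice([-1.0, 1.0], size=n)            # entries -1 or 1 with probability 0.5

gcv_est = gcv_randomized(h, W_m, r, 2.0, R)
gcv_ref = gcv_exact_trace(h, W_m, r, 2.0)
```

Because every entry of $\mathbf{R}$ is ±1, $\mathbf{R}^T\mathbf{R} = n$, which is why $n$ appears in the denominator in place of the trace of the identity matrix.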
The GSVD of the matrix pair $(\mathbf{h}, \mathbf{W}_m)$ can be used to compute the GCV function [5]. The GSVD yields the spectrum of $\mathbf{h}$, and the optimal linear search range can be determined by analyzing this spectrum [9]. However, computing the GSVD is time-consuming and has a large memory requirement, and it may even become impractical when the data set is large. Instead, the RGSVD algorithm is adopted. The RGSVD algorithm uses a randomized algorithm to provide a low-rank approximation of the GSVD [10]. Compared with the GSVD, the RGSVD reduces computational costs and memory demands while still giving results with good accuracy. Algorithm 2 shows the RGSVD algorithm [10] used in this study. The parameter $q$ controls the accuracy and efficiency of the RGSVD algorithm: as $q$ increases, the result becomes more accurate but the computation time grows.
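Algorithm 2. Randomized generalized singular value decomposition (RGSVD) algorithm. Given $\mathbf{h} \in \mathbb{R}^{n \times m}$ ($n \le m$) and $\mathbf{W}_m \in \mathbb{R}^{4m \times m}$, and a target matrix rank $q$ ($q \le n$), calculate an approximate GSVD of the matrix pair $(\mathbf{h}, \mathbf{W}_m)$: $\mathbf{h} \approx \mathbf{U}\mathbf{C}\mathbf{X}$, $\mathbf{W}_m \approx \mathbf{V}\mathbf{S}\mathbf{X}$ with $\mathbf{U} \in \mathbb{R}^{n \times q}$, $\mathbf{V} \in \mathbb{R}^{4m \times q}$, $\mathbf{C} \in \mathbb{R}^{q \times q}$, $\mathbf{S} \in \mathbb{R}^{q \times q}$, and $\mathbf{X} \in \mathbb{R}^{q \times m}$.
  1. Generate a $q \times n$ Gaussian random matrix $\mathbf{A}$.
  2. Calculate the $q \times m$ matrix $\mathbf{Y} = \mathbf{A}\mathbf{h}$.
  3. Calculate the $m \times q$ orthonormal matrix $\mathbf{Q}$ via the QR factorization $\mathbf{Y}^T = \mathbf{Q}\mathbf{R}$.
  4. Form the $n \times q$ matrix $\mathbf{B}_1 = \mathbf{h}\mathbf{Q}$ and the $4m \times q$ matrix $\mathbf{B}_2 = \mathbf{W}_m \mathbf{Q}$.
  5. Use $[\mathbf{U}, \mathbf{V}, \mathbf{W}, \mathbf{C}, \mathbf{S}] = \mathrm{gsvd}(\mathbf{B}_1, \mathbf{B}_2, 0)$ to calculate the economy-sized GSVD of the matrix pair $(\mathbf{B}_1, \mathbf{B}_2)$: $\begin{bmatrix} \mathbf{B}_1 \\ \mathbf{B}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{U} & \\ & \mathbf{V} \end{bmatrix} \begin{bmatrix} \mathbf{C} \\ \mathbf{S} \end{bmatrix} \mathbf{W}^T$.
  6. Form the $q \times m$ matrix $\mathbf{X} = \mathbf{W}^T \mathbf{Q}^T$.
  Note: $\mathbf{h} \approx \mathbf{U}\mathbf{C}\mathbf{X}$ and $\mathbf{W}_m \approx \mathbf{V}\mathbf{S}\mathbf{X}$.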
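The steps of Algorithm 2 can be sketched in NumPy as follows. Since NumPy provides no direct counterpart of the `gsvd` call in step 5, the helper `gsvd_economy` below is a stand-in built from a thin QR factorization of the stacked matrix followed by an SVD of its upper block; it assumes that $\mathbf{B}_2$ has full column rank and that $\mathbf{B}_1$ has at least as many rows as columns. All names, sizes, and test matrices are hypothetical.

```python
import numpy as np

def gsvd_economy(B1, B2):
    """Economy GSVD of (B1, B2): B1 = U diag(c) Wt and B2 = V diag(s) Wt,
    with c^2 + s^2 = 1. Assumes B2 has full column rank (every s_i > 0)
    and that B1 has at least as many rows as columns."""
    n = B1.shape[0]
    Qs, Rs = np.linalg.qr(np.vstack([B1, B2]))          # stacked matrix = Qs Rs
    Q1, Q2 = Qs[:n], Qs[n:]
    U, c, Zt = np.linalg.svd(Q1, full_matrices=False)   # Q1 = U diag(c) Zt
    U, c, Zt = U[:, ::-1], c[::-1], Zt[::-1]            # reorder so that c ascends
    VS = Q2 @ Zt.T                  # columns are mutually orthogonal with norms s_i
    s = np.linalg.norm(VS, axis=0)
    V = VS / s
    Wt = Zt @ Rs
    return U, V, Wt, c, s

def rgsvd(h, W_m, q, rng):
    """Algorithm 2: h ~ U diag(c) X and W_m ~ V diag(s) X, with X of size q x m."""
    n = h.shape[0]
    A = rng.normal(size=(q, n))              # step 1: Gaussian random matrix
    Y = A @ h                                # step 2: q x m sketch of the rows of h
    Q, _ = np.linalg.qr(Y.T)                 # step 3: m x q orthonormal basis
    B1, B2 = h @ Q, W_m @ Q                  # step 4: projected matrix pair
    U, V, Wt, c, s = gsvd_economy(B1, B2)    # step 5: economy-sized GSVD
    X = Wt @ Q.T                             # step 6
    return U, V, c, s, X

# Hypothetical test matrices (random, not a gravity kernel).
rng = np.random.default_rng(0)
n, m_cells = 20, 60
h = rng.normal(size=(n, m_cells))
W_m = rng.normal(size=(4 * m_cells, m_cells))
U, V, c, s, X = rgsvd(h, W_m, q=n, rng=rng)
gamma = c / s                                # generalized singular values
```

With $q = n$, the columns of $\mathbf{Q}$ span the whole row space of $\mathbf{h}$, so $\mathbf{h} = \mathbf{U}\mathbf{C}\mathbf{X}$ holds up to rounding error in this example, while $\mathbf{W}_m$ is only approximated by its projection onto that subspace.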
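In the RGSVD algorithm, $\mathbf{C} = \mathrm{diag}\left( c_1, c_2, \ldots, c_q \right) \in \mathbb{R}^{q \times q}$ with
$$0 < c_1 \le c_2 \le \cdots \le c_q < 1, \tag{12}$$
and $\mathbf{S} = \mathrm{diag}\left( s_1, s_2, \ldots, s_q \right) \in \mathbb{R}^{q \times q}$ with
$$1 > s_1 \ge s_2 \ge \cdots \ge s_q > 0, \tag{13}$$
$c_i^2 + s_i^2 = 1$, $i = 1:q$, that is, $\mathbf{C}^T \mathbf{C} + \mathbf{S}^T \mathbf{S} = \mathbf{I}_q$. According to the result of the RGSVD, Equation (11) has the form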
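$$GCV(\mu) \approx \frac{\left\| \mathbf{r} - \mathbf{h} \sum_{i=1}^{q} \dfrac{\gamma_i^2}{\gamma_i^2 + \mu^2} \dfrac{\mathbf{u}_i^T \mathbf{r}}{c_i} \left( \mathbf{X}^{\dagger} \right)_i \right\|_2^2}{\left[ n - \mathbf{R}^T \mathbf{h} \sum_{i=1}^{q} \dfrac{\gamma_i^2}{\gamma_i^2 + \mu^2} \dfrac{\mathbf{u}_i^T \mathbf{R}}{c_i} \left( \mathbf{X}^{\dagger} \right)_i \right]^2}, \tag{14}$$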
where $\gamma_i = c_i / s_i$ denotes the $i$th generalized singular value, $\mathbf{u}_i$ represents the $i$th column of matrix $\mathbf{U}$, and $\left( \mathbf{X}^{\dagger} \right)_i$ denotes the $i$th column of the Moore-Penrose inverse of matrix $\mathbf{X}$. The parameter $\mu_{opt}$ can be found between the minimum and maximum of the generalized singular values $\gamma_i$. Here, the parameter $q$ is set to $n$, so the full set of generalized singular values is obtained. Although $n$ is fairly large, the computation time of the RGSVD algorithm with $q = n$ is still much shorter than that of the GSVD.
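Once the decomposition is available, Equation (14) can be evaluated cheaply for many values of $\mu$. The sketch below relies on the identity $\mathbf{h} \left( \mathbf{X}^{\dagger} \right)_i / c_i = \mathbf{u}_i$, which follows from $\mathbf{h} = \mathbf{U}\mathbf{C}\mathbf{X}$ when $\mathbf{X}$ has full row rank. To stay self-contained it uses the special case $\mathbf{W}_m = \mathbf{I}$, in which the generalized singular values reduce to the ordinary singular values of $\mathbf{h}$. The logarithmically spaced grid, the function names, and the test data are assumptions made for this example.

```python
import numpy as np

def gcv_curve(U, gamma, r, R, mus):
    """Equation (14) on a grid of mu, using h (X^+)_i / c_i = u_i."""
    n = r.size
    Ur, UR = U.T @ r, U.T @ R
    values = []
    for mu in mus:
        f = gamma**2 / (gamma**2 + mu**2)      # filter factors
        resid = r - U @ (f * Ur)
        denom = n - R @ (U @ (f * UR))
        values.append(resid @ resid / denom**2)
    return np.array(values)

def search_grid(gamma, num=50):
    """Candidate values of mu between the smallest and largest gamma_i."""
    return np.logspace(np.log10(gamma.min()), np.log10(gamma.max()), num)

# Hypothetical test problem with W_m = I, so gamma_i are singular values of h.
rng = np.random.default_rng(3)
n, m_cells = 15, 40
h = rng.normal(size=(n, m_cells))
r = rng.normal(size=n)
R = rng.choice([-1.0, 1.0], size=n)
U, gamma, _ = np.linalg.svd(h, full_matrices=False)

mus = search_grid(gamma)
gcv = gcv_curve(U, gamma, r, R, mus)
mu_opt = mus[np.argmin(gcv)]
```

Each evaluation costs only a few matrix-vector products with $\mathbf{U}$, which is what makes a dense search over the whole range affordable compared with solving a linear system by CG for every candidate $\mu$.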

3.2. Unbiased Predictive Risk Estimator Method

The UPRE function for Equation (5) has the form [16]
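$$UPRE(\mu) = \left\| \mathbf{r} - \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \mathbf{r} \right\|_2^2 + 2\, \mathrm{trace}\left( \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \right) - n, \tag{15}$$
where $\mathrm{trace}(\cdot)$ is the trace of the term in the brackets. The randomized trace estimation [9] was also introduced in Equation (15) to avoid the difficulty of calculating the trace. Then, the UPRE function is approximated by
$$UPRE(\mu) \approx \left\| \mathbf{r} - \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \mathbf{r} \right\|_2^2 + 2\, \mathbf{R}^T \mathbf{h} \left( \mathbf{h}^T \mathbf{h} + \mu^2 \mathbf{W}_m^T \mathbf{W}_m \right)^{-1} \mathbf{h}^T \mathbf{R} - n. \tag{16}$$
The result of the RGSVD is introduced into Equation (16), and Equation (16) has the form
$$UPRE(\mu) \approx \left\| \mathbf{r} - \mathbf{h} \sum_{i=1}^{q} \frac{\gamma_i^2}{\gamma_i^2 + \mu^2} \frac{\mathbf{u}_i^T \mathbf{r}}{c_i} \left( \mathbf{X}^{\dagger} \right)_i \right\|_2^2 + 2\, \mathbf{R}^T \mathbf{h} \sum_{i=1}^{q} \frac{\gamma_i^2}{\gamma_i^2 + \mu^2} \frac{\mathbf{u}_i^T \mathbf{R}}{c_i} \left( \mathbf{X}^{\dagger} \right)_i - n. \tag{17}$$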
Similarly, the parameter $\mu$ that minimizes Equation (17) is the optimal regularization parameter $\mu_{opt}$, which can also be found between the minimum and maximum of the generalized singular values $\gamma_i$. Here, the parameter $q$ is also set to $n$.
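The UPRE function can be evaluated from the same quantities. The sketch below computes Equation (17) with the randomized trace and, for comparison, Equation (15) with the exact trace, which equals the sum of the filter factors when $\mathbf{U}$ has orthonormal columns. As an assumption made to keep the example self-contained, $\mathbf{W}_m = \mathbf{I}$ is used so that the $\gamma_i$ are the singular values of $\mathbf{h}$; the function name and the data are hypothetical.

```python
import numpy as np

def upre_curves(U, gamma, r, R, mus):
    """Equation (17) (randomized trace) and Equation (15) (exact trace) on a grid of mu."""
    n = r.size
    Ur, UR = U.T @ r, U.T @ R
    randomized, exact = [], []
    for mu in mus:
        f = gamma**2 / (gamma**2 + mu**2)      # filter factors
        resid = r - U @ (f * Ur)
        misfit = resid @ resid
        randomized.append(misfit + 2.0 * (UR @ (f * UR)) - n)
        exact.append(misfit + 2.0 * f.sum() - n)
    return np.array(randomized), np.array(exact)

# Hypothetical test problem with W_m = I, so gamma_i are singular values of h.
rng = np.random.default_rng(4)
n, m_cells = 15, 40
h = rng.normal(size=(n, m_cells))
r = rng.normal(size=n)
R = rng.choice([-1.0, 1.0], size=n)
U, gamma, _ = np.linalg.svd(h, full_matrices=False)

mus = np.logspace(np.log10(gamma.min()), np.log10(gamma.max()), 50)
upre_rand, upre_exact = upre_curves(U, gamma, r, R, mus)
mu_opt = mus[np.argmin(upre_rand)]
```

The form of Equations (15) to (17) presumes that the weighted data $\mathbf{r}$ carry noise of unit variance, which is the usual role of the data weighting matrix $\mathbf{W}_d$.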

4. Synthetic Example Tests

In this study, one example was used to demonstrate the performances of the GCV and the UPRE methods using the RGSVD algorithm. For comparison, the CG method and the GSVD method were also used in the GCV and UPRE methods. The following tests were run on a computer with a 3.00 GHz processor and 32 GB RAM.
The model consists of two cuboids, and Figure 1a shows its 3D perspective view. The dimensions of both cuboids are 300 m × 300 m × 200 m, and the depth to the top of each is 50 m. Both have a density contrast of 1 g/cm³. The gravity data were generated on a grid of 31 × 21 = 651 points with 50 m × 50 m spacing. Gaussian random noise with a standard deviation of 0.02 mGal plus 2% of each datum was added to the gravity data (Figure 1b). The subsurface was divided into 30 × 20 × 10 = 6000 cells, each measuring 50 m × 50 m × 50 m. In the following inversions, the density range was 0–1 g/cm³ and $K_{max}$ was set to 50.
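The noise model and the problem dimensions described above might be set up as in the sketch below. Several choices here are assumptions: the noise standard deviation is read as $\sigma_i = 0.02 + 0.02\,|d_i|$ mGal, the data weighting matrix is taken as $\mathbf{W}_d = \mathrm{diag}(1/\sigma_i)$, and the noise-free anomaly is a hypothetical smooth bump that stands in for the response of the two-cuboid model.

```python
import numpy as np

def add_noise(d_true, rng, floor=0.02, fraction=0.02):
    """Gaussian noise with sigma_i = floor + fraction * |d_i| (mGal); assumed reading."""
    sigma = floor + fraction * np.abs(d_true)
    return d_true + sigma * rng.normal(size=d_true.shape), sigma

# Observation grid: 31 x 21 points with 50 m spacing.
nx, ny = 31, 21
gx, gy = np.meshgrid(np.arange(nx) * 50.0, np.arange(ny) * 50.0, indexing="ij")
n_data = gx.size

# Model mesh: 30 x 20 x 10 cells of 50 m x 50 m x 50 m.
n_cells = 30 * 20 * 10

# Hypothetical placeholder anomaly, NOT the response of the two-cuboid model.
d_true = 1.5 * np.exp(-((gx - 750.0) ** 2 + (gy - 500.0) ** 2) / (2.0 * 300.0**2)).ravel()

d_obs, sigma = add_noise(d_true, np.random.default_rng(0))
W_d = np.diag(1.0 / sigma)        # assumed form of the data weighting matrix
```

With this weighting, the weighted residuals have approximately unit variance, which is the setting in which the UPRE function is normally applied.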
First, we used the GCV method to determine the regularization parameter in the inversion, computing the GCV function with the CG method, the GSVD, and the RGSVD algorithm in turn. Figure 2a–c show the corresponding inversion results. These three inversion results are very similar: two cuboid-shaped source bodies are recovered at the actual position of the true model. The corresponding GCV function curves are shown in Figure 3a–c, and these three curves have similar trends. Table 1 records the corresponding optimal regularization parameters $\mu_{opt}$ and the elapsed times for choosing the regularization parameter. The GCV methods using the GSVD and the RGSVD algorithm give very similar values of $\mu_{opt}$ (45.3 and 44.4), whereas the GCV method using the CG method gives a larger value (100.0). The GCV method using the RGSVD algorithm has the shortest elapsed time, at only 1.3 s.
Then, the UPRE method was applied in the inversion, again using the CG method, the GSVD, and the RGSVD algorithm in turn. The inversion results are shown in Figure 2d–f, respectively. These inversion results are similar to those obtained using the GCV method. The curves of the UPRE function are shown in Figure 3d–f, and they also have similar trends. Table 1 also lists the optimal regularization parameters and the elapsed times for the UPRE method. The optimal regularization parameters are the same as those from the GCV method, and the elapsed times are close to those from the GCV method. The UPRE method using the RGSVD algorithm also has the shortest elapsed time.
Through the above comparative test, it was concluded that the GCV and the UPRE methods using the RGSVD algorithm can quickly provide an optimal regularization parameter that can generate a satisfying inversion result.

5. Conclusions

We introduced the GCV and the UPRE methods using the RGSVD algorithm, with which the optimal regularization parameter can be determined quickly in the inversion of potential field data. We demonstrated the effectiveness of these two methods through a comparative test.

Author Contributions

Conceptualization, J.W. and X.M.; Investigation, H.T.; Methodology, Y.F.; Software, Y.F.; Visualization, H.T.; Writing—original draft, Y.F.; Writing—review & editing, J.W. and X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by (1) The National Natural Science Foundation of China (Grant numbers: 41804099 and 41974161), and (2) The Fundamental Research Funds for the Central Universities (Grant number: 2-9-2019-040).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yan, J.; Lü, Q.; Chen, X.; Qi, G.; Liu, Y.; Guo, D.; Chen, Y. 3D lithologic mapping test based on 3D inversion of gravity and magnetic data: A case study in Lu-Zong ore concentration district, Anhui Province. Acta Petrol. Sin. 2014, 30, 1041–1053.
  2. Yan, J.; Chen, X.; Meng, G.; Lü, Q.; Deng, Z.; Qi, G.; Tang, H. Concealed faults and intrusions identification based on multi-scale edge detection and 3D inversion of gravity and magnetic data: A case study in Qiongheba area, Xinjiang, Northwest China. Interpretation 2019, 7, T331–T345.
  3. Wang, J.; Meng, X.; Li, F. Fast Nonlinear Generalized Inversion of Gravity Data with Application to the Three-Dimensional Crustal Density Structure of Sichuan Basin, Southwest China. Pure Appl. Geophys. 2017, 174, 4101–4117.
  4. Tikhonov, A.N.; Arsenin, V.Y. Solutions of Ill-Posed Problems. Math. Comput. 1977, 32, 491.
  5. Vatankhah, S.; Ardestani, V.E.; Renaut, R.A. Automatic estimation of the regularization parameter in 2D focusing gravity inversion: Application of the method to the Safo manganese mine in the northwest of Iran. J. Geophys. Eng. 2014, 11, 045001.
  6. Morozov, V.A. On the solution of functional equations by the method of regularization. Sov. Math. Dokl. 1966, 7, 414–417.
  7. Golub, G.H.; Heath, M.; Wahba, G. Generalized cross-validation as a method for choosing a good ridge regression parameter. Technometrics 1979, 21, 215–223.
  8. Hansen, P.C. Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 1992, 34, 561–580.
  9. Vogel, C.R. Computational Methods for Inverse Problems. In SIAM Frontiers in Applied Mathematics; SIAM: Philadelphia, PA, USA, 2002.
  10. Wei, Y.; Xie, P.; Zhang, L. Tikhonov Regularization and Randomized GSVD. SIAM J. Matrix Anal. Appl. 2016, 37, 649–675.
  11. Li, Y.; Oldenburg, D.W. 3-D inversion of magnetic data. Geophysics 1996, 61, 394–408.
  12. Li, Y.; Oldenburg, D.W. 3-D inversion of gravity data. Geophysics 1998, 63, 109–119.
  13. Li, Y.; Oldenburg, D.W. Fast inversion of large-scale magnetic data using wavelet transforms and a logarithmic barrier method. Geophys. J. Int. 2003, 152, 251–265.
  14. Li, Z.; Yao, C.; Zheng, Y.; Wang, J.; Zhang, Y. 3D magnetic sparse inversion using an interior-point method. Geophysics 2018, 83, J15–J32.
  15. Li, Z.; Yao, C. 3D sparse inversion of magnetic amplitude data when strong remanence exists. Acta Geophys. 2020, 68, 365–375.
  16. Vatankhah, S.; Renaut, R.A.; Ardestani, V.E. Total variation regularization of the 3-D gravity inverse problem using a randomized generalized singular value decomposition. Geophys. J. Int. 2018, 213, 695–705.
Figure 1. (a) 3D perspective view of the model consisting of two cuboids; (b) gravity data produced by this model.
Figure 2. Inversion results from different regularization methods. The generalized cross-validation (GCV) methods using (a) the conjugate gradient (CG) method, (b) the generalized singular value decomposition (GSVD), and (c) the randomized generalized singular value decomposition (RGSVD) algorithm; the unbiased predictive risk estimator (UPRE) methods using (d) the CG method, (e) the GSVD, and (f) the RGSVD algorithm. In all these panels, the black lines indicate the true position of the model.
Figure 3. The GCV functions using (a) the CG method, (b) the GSVD, and (c) the RGSVD algorithm; the UPRE functions using (d) the CG method, (e) the GSVD, and (f) the RGSVD algorithm. The blue curve is the GCV or UPRE function, and the red ○ marks the optimal regularization parameter.
Table 1. The optimal regularization parameters and the elapsed times for different regularization methods.
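| | GCV: CG | GCV: GSVD | GCV: RGSVD | UPRE: CG | UPRE: GSVD | UPRE: RGSVD |
|---|---|---|---|---|---|---|
| $\mu_{opt}$ | 100.0 | 45.3 | 44.4 | 100.0 | 45.3 | 44.4 |
| Time (s) | 117.5 | 113.0 | 1.3 | 111.2 | 114.3 | 1.4 |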
Note: $\mu_{opt}$ is the optimal regularization parameter, and Time denotes the elapsed time for choosing the regularization parameter with each method.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Fang, Y.; Wang, J.; Meng, X.; Tang, H. Improved Generalized Cross-Validation and Unbiased Predictive Risk Estimator Methods Using the RGSVD: Application to Inversion of Potential Field Data. Appl. Sci. 2021, 11, 6326. https://doi.org/10.3390/app11146326
