1. Introduction
Faults are key structures in the Earth’s crust. Most gravity anomalies are caused by faults, and faults play a crucial role in tectonic evolution. Determining fault parameters is therefore of great significance for analyzing the origin of anomalous subsurface structures and for interpreting geological tectonics [1]. Seismic methods have long been the mainstream approach for fault identification [2,3]. They offer high accuracy but have notable drawbacks, such as reliance on seismic stations and high cost. In contrast, gravity inversion has the advantages of low cost and large-scale applicability. However, gravity inversion results are ambiguous [4], which imposes significant limitations on gravity inversion methods. To overcome these limitations, optimization algorithms are commonly used, including least-squares minimization [5,6], analysis of variance [7], simulated annealing [8], and Particle Swarm Optimization (PSO) [9,10,11]. Among these, PSO stands out as more competitive than many other optimization algorithms: it is easy to implement, converges quickly, and achieves good convergence results.
The Particle Swarm Optimization (PSO) algorithm is an optimization method that simulates the foraging behavior of a swarm. It tracks individual and global optimal solutions by updating particle positions. Its performance relies heavily on three key parameters: the inertia weight ω and the learning factors c₁ and c₂. The traditional PSO algorithm has two major limitations: first, although there are empirical ranges for the parameter values, they lack universality; and second, with fixed parameter settings, the algorithm struggles to adapt to complex optimization processes. These limitations affect the algorithm’s flexibility and optimization effectiveness.
The aforementioned issues have received widespread attention and research. Y. Shi and R. C. Eberhart (1998) proposed a PSO algorithm based on a Linearly Decreasing Inertia Weight (PSO-LDIW) [12], in which the inertia weight ω decreases linearly with the number of iterations. Besides the decreasing-inertia-weight strategy, A. Ratnaweera et al. (2004) proposed the PSO algorithm with Time-Varying Acceleration Coefficients (PSO-TVAC) [13], in which the learning factors c₁ and c₂ change dynamically with each iteration. Some PSO algorithms have introduced topological structures to avoid getting trapped in local optima [14,15]. Although these PSO variants improve the search for the global optimum by refining parameter selection, this often comes at the cost of reduced convergence rates. We therefore need a PSO algorithm that can both search for the global optimal solution by adaptively updating its parameters and maintain an excellent convergence rate. Most PSO variants adjust the learning factors in a time-varying manner. However, according to the principles of the PSO algorithm, the speed of particle updates largely depends on the distance between a particle and its individual historical best solution (pbest) as well as the historical global best solution (gbest). Building on this, Liu et al. (2021) proposed an Adaptive Weighted PSO (AWPSO) algorithm [16]. Unlike time-varying update strategies, in AWPSO the learning factors change based on the distances of particles to pbest and gbest. These improvements enhance the convergence efficiency and accuracy of the PSO algorithm to varying degrees. However, since fault models inherently involve multiple unknown parameters, applying optimization algorithms to fault parameter inversion typically means calculating the residuals between observed and predicted gravity anomalies and deciding the optimization direction from these residuals alone. This is not rigorous enough; additional constraint terms need to be included.
In summary, we propose a Constrained Adaptive Weighted PSO (C-AWPSO) algorithm, which adds regularization constraints and penalty-term constraints (based on prior conditions) to the AWPSO algorithm. We conducted simulation experiments and a field application to verify the algorithm’s accuracy. Additionally, experiments on inertia weight decrease methods, the contribution of the constraint terms, and parameter sensitivity demonstrate the feasibility of applying this algorithm to actual fault parameter inversion.
2. Methods
The relationship between a forward-dipping fault layer profile and gravity anomalies is as follows [5,7,17]:

$$g(x) = A\left[\pi + \tan^{-1}\!\left(\frac{x - x_{0}}{w} - \cot\theta\right) - \tan^{-1}\!\left(\frac{x - x_{0}}{z} - \cot\theta\right)\right] \tag{1}$$
where x represents the horizontal position, z is the depth of the upper center of the fault, w is the depth of the lower center of the fault, θ is the dip angle of the fault, x₀ is the origin, A is the amplitude factor related to the thickness t, G is the gravitational constant, and Δρ is the density contrast.
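As an illustration, a minimal Python sketch of this forward model follows, assuming the form of Equation (1) reconstructed above (the signs of the arctangent terms depend on the dip-direction convention, so treat them as an assumption):

```python
import numpy as np

def fault_gravity_anomaly(x, z, w, theta, x0, A):
    """Gravity anomaly of a dipping faulted layer, following Equation (1).

    x     : array of horizontal positions
    z, w  : depths of the upper and lower centers of the faulted layer (w > z)
    theta : fault dip angle (radians)
    x0    : horizontal position of the fault origin
    A     : amplitude factor related to the layer thickness t
    """
    cot = 1.0 / np.tan(theta)
    return A * (np.pi
                + np.arctan((x - x0) / w - cot)
                - np.arctan((x - x0) / z - cot))

# Example: anomaly profile for a synthetic fault (hypothetical parameter values)
x = np.linspace(-50.0, 50.0, 201)
g = fault_gravity_anomaly(x, z=2.0, w=5.0, theta=np.radians(60.0), x0=0.0, A=10.0)
```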
2.1. Traditional Particle Swarm Optimization Algorithm
The PSO algorithm [18] is an optimization method that mimics the foraging behavior of bird flocks. Each particle has a position vector representing a candidate solution, and a group of particles forms a swarm. During the iterative process, the position vectors of particles are changed by updating their velocity vectors. This allows the particles to track their individual historical best solutions (pbest) and the swarm’s historical global best solution (gbest). The traditional PSO algorithm consists of a velocity update equation (Equation (2)) and a position update equation (Equation (3)), with the specific expressions as follows:

$$v_{j}^{k+1} = \omega v_{j}^{k} + c_{1} r_{1}\left(pbest_{j}^{k} - x_{j}^{k}\right) + c_{2} r_{2}\left(gbest^{k} - x_{j}^{k}\right) \tag{2}$$

$$x_{j}^{k+1} = x_{j}^{k} + v_{j}^{k+1} \tag{3}$$
where v_j^k represents the velocity of the j-th particle during the k-th iteration and x_j^k denotes the position of the j-th particle for each of the model parameters (z, w, θ, x₀, and A) during the k-th iteration. ω is the inertia weight, which controls the trend of velocity change. c₁ and c₂ are learning factors (also known as acceleration coefficients), governing the step size for particles to move towards the best positions. r₁ and r₂ are uniformly distributed random numbers within the range [0, 1]. pbest_j^k is the individual historical best solution, while gbest^k is the swarm’s historical best solution.
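For reference, a minimal Python sketch of one traditional PSO iteration (Equations (2) and (3)) might look as follows; the swarm layout and the default parameter values are illustrative assumptions:

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, omega=0.7, c1=1.8, c2=1.8):
    """One velocity/position update for the whole swarm.

    pos, vel : (n_particles, n_params) arrays; here n_params = 5 for (z, w, theta, x0, A)
    pbest    : (n_particles, n_params) individual historical best positions
    gbest    : (n_params,) swarm historical best position
    """
    r1 = np.random.rand(*pos.shape)        # uniform random numbers in [0, 1]
    r2 = np.random.rand(*pos.shape)
    vel = (omega * vel
           + c1 * r1 * (pbest - pos)       # cognitive term of Eq. (2)
           + c2 * r2 * (gbest - pos))      # social term of Eq. (2)
    pos = pos + vel                        # position update, Eq. (3)
    return pos, vel
```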
2.2. AWPSO Algorithm
It is widely known that selecting parameters for optimization algorithms is a significant challenge. The PSO algorithm involves three parameters: the inertia weight ω and the learning factors c₁ and c₂. If these parameters are chosen unreasonably, the impact on the results will undoubtedly be substantial. For the traditional PSO algorithm, previous researchers have provided some prior information, such as recommended value ranges of [1.5, 2] for c₁ and c₂ and [0.4, 0.9] for ω [19,20], but these ranges still lack universality. For the adaptive strategy of the inertia weight ω, we select the linearly decreasing method proposed by Y. Shi and R. C. Eberhart (1998) [12]. The specific improvement is shown in Equation (4):

$$\omega^{k} = \omega_{max} - \left(\omega_{max} - \omega_{min}\right)\frac{k}{k_{max}} \tag{4}$$

where ω^k represents the inertia weight during the k-th iteration, ω_max and ω_min are the maximum and minimum inertia weights (typically set to 0.9 and 0.4, respectively), and k_max is the maximum number of iterations.
For the adaptive weighted update strategy of the learning factors c₁ and c₂, the key step is to calculate particle distances from pbest and gbest and then perform updates based on these distances. The update rules are as follows:

$$c_{1,j}^{k} = F\left(d_{p,j}^{k}\right), \qquad c_{2,j}^{k} = F\left(d_{g,j}^{k}\right)$$

where the function F(·) represents the adaptive weighted update function, and d_{p,j}^k and d_{g,j}^k are, respectively, the distances of particle j to pbest and gbest at the k-th iteration. Their expressions are as follows:

$$d_{p,j}^{k} = \left\|pbest_{j}^{k} - x_{j}^{k}\right\|, \qquad d_{g,j}^{k} = \left\|gbest^{k} - x_{j}^{k}\right\| \tag{5}$$
Since the learning factors are weighting terms designed to guide particles towards pbest and gbest, and the search space of constrained optimization problems is usually bounded, the adaptive weighted update function needs two characteristics: it must be monotonically increasing and bounded. The activation functions used in neural networks meet this requirement well. We choose the sigmoid function as the adaptive weighted update function: in addition to being monotonically increasing and bounded, it has an S-shaped curve, which helps avoid abrupt changes in the control parameters. Moreover, the sigmoid function is smooth and differentiable, reflecting the adaptive nature of updating the weights iteratively. The specific expression is as follows:

$$F(d) = \frac{L}{1 + e^{-\alpha\left(d - d_{0}\right)}} \tag{6}$$

where e is the natural logarithmic base, α is a constant representing the steepness of the curve, L represents the peak value of the curve, d₀ is a positive constant representing the abscissa value of the center of the curve, and d is the distance from pbest or gbest calculated by Equation (5). Combining Equations (4)–(6), Equation (2) is updated to the following:

$$v_{j}^{k+1} = \omega^{k} v_{j}^{k} + F\left(d_{p,j}^{k}\right) r_{1}\left(pbest_{j}^{k} - x_{j}^{k}\right) + F\left(d_{g,j}^{k}\right) r_{2}\left(gbest^{k} - x_{j}^{k}\right) \tag{7}$$
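As a concrete illustration, a minimal Python sketch of one adaptive update step combining Equations (4)–(7) is given below; the sigmoid parameters L, α, and d₀ are illustrative assumptions, not values taken from the text:

```python
import numpy as np

def inertia_weight(k, k_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight, Eq. (4)."""
    return w_max - (w_max - w_min) * k / k_max

def adaptive_weight(d, L=2.0, alpha=1.0, d0=1.0):
    """Sigmoid adaptive weighted update function, Eq. (6); parameter values are illustrative."""
    return L / (1.0 + np.exp(-alpha * (d - d0)))

def awpso_step(pos, vel, pbest, gbest, k, k_max):
    """One AWPSO velocity/position update, Eqs. (7) and (3)."""
    omega = inertia_weight(k, k_max)
    # Per-particle Euclidean distances to pbest and gbest, Eq. (5)
    d_p = np.linalg.norm(pbest - pos, axis=1, keepdims=True)
    d_g = np.linalg.norm(gbest - pos, axis=1, keepdims=True)
    r1 = np.random.rand(*pos.shape)
    r2 = np.random.rand(*pos.shape)
    vel = (omega * vel
           + adaptive_weight(d_p) * r1 * (pbest - pos)
           + adaptive_weight(d_g) * r2 * (gbest - pos))
    return pos + vel, vel
```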
In the PSO algorithm, a fitness value must be calculated as the indicator for selecting the optimal solution. In gravity inversion problems, the Root Mean Square (RMS) of the difference between observed and forward-modeled gravity is typically used as the fitness value. The expression is as follows:

$$RMS = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[g_{obs}(x_{i}) - g_{cal}(x_{i})\right]^{2}} \tag{8}$$

where N represents the number of measured data points, g_obs(x_i) is the observed gravity field, and g_cal(x_i) is the computed gravity field at the point x_i.
2.3. Constraint Method
During the computation process of optimization algorithms, precision (i.e., the objective function) is given priority. However, if only Equation (8) is used as the objective function, the optimization direction focuses solely on producing a forward-modeled anomaly curve similar to the observed anomaly, and the resulting precision can be misleading: the physical significance and reasonableness of the parameters are easily overlooked, making it difficult to obtain accurate fault layer parameters. This is far from ideal because, in gravity inversion, the non-uniqueness of solutions is a common and major challenge; different combinations of fault parameters can produce quite similar forward-modeled gravity anomalies. Therefore, we considered adding constraint terms to the objective function.
First, we considered adding an L2 regularization term, applying a different regularization weight λᵢ to each parameter mᵢ (z, w, θ, x₀, and A). The expression for the regularization term is as follows:

$$R = \sum_{i=1}^{5}\lambda_{i}\, m_{i}^{2} \tag{9}$$

Additionally, it can be observed that the fault layer model itself carries a prior condition, namely

$$w > z \tag{10}$$

Based on this prior condition, we added a penalty term to the objective function:

$$P = \lambda_{p}\left[\max\left(0,\; z - w\right)\right]^{2} \tag{11}$$

where λ_p is the penalty coefficient; its value is selected by the test described in Section 4.4. Combining Equations (9)–(11), the objective function of this method is obtained:

$$Q = RMS + R + P \tag{12}$$

where Q is the fitness value, which is the target value for optimization.
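Putting these pieces together, a minimal sketch of the constrained objective follows; it reuses the fault_gravity_anomaly sketch from Section 2, and the exact forms of the regularization and penalty terms are the assumed Equations (9) and (11):

```python
import numpy as np

def objective(params, x, g_obs, reg_weights, penalty_coeff):
    """Fitness Q = RMS + R + P (Equations (8), (9), (11), and (12)).

    params      : array (z, w, theta, x0, A) of trial fault parameters
    reg_weights : per-parameter L2 regularization weights lambda_i, Eq. (9)
    """
    z, w, theta, x0, A = params
    g_cal = fault_gravity_anomaly(x, z, w, theta, x0, A)  # forward model, Eq. (1)
    rms = np.sqrt(np.mean((g_obs - g_cal) ** 2))          # Eq. (8)
    reg = np.sum(reg_weights * np.asarray(params) ** 2)   # Eq. (9)
    pen = penalty_coeff * max(0.0, z - w) ** 2            # Eq. (11): enforces w > z
    return rms + reg + pen                                # Eq. (12)
```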
4. Discussion
Optimization algorithms are developed based on mathematical logic. When applying optimization algorithms to gravity inversion, besides considering issues like convergence efficiency and accuracy inherent to the algorithms themselves, we also need to take into account the physical significance of their application, such as the non-uniqueness of gravity inversion solutions and the coupling relationships among various parameters. Next, we will conduct experiments on the C-AWPSO algorithm to address these issues and then discuss its feasibility in practical applications.
4.1. Comparison of Different Inertia Weight Decreasing Methods
From the principle of the C-AWPSO algorithm, its parameter updates consist of two parts: the decreasing inertia weight, and the update of the learning factors based on the distance between particles and their historical best positions. There are generally several approaches to handling the inertia weight: linear decrease, Gaussian decrease, exponential decrease, and logarithmic decrease [12,21,22,23]. We used these four inertia weight decrease methods to invert simulated data with 5%, 10%, and 15% noise. The corresponding fitness convergence curves are shown in Figure 5 (in the noise-free case, all four methods exhibit excellent convergence speed with zero RMS error, so that case is not displayed).
As shown in Figure 5, the four inertia weight decrease methods yield noticeable differences in both RMS error and convergence efficiency. This phenomenon is likely related to the inherent characteristics of each method. The Gaussian decrease method offers fast convergence but is sensitive to parameters and prone to getting stuck in local optima, resulting in a higher RMS error than the other three methods. The logarithmic decrease method is insensitive to parameters, has strong noise resistance, and can balance convergence efficiency with the ability to search for global optimal solutions [24]. The linear decrease method outperforms the nonlinear methods in stability and is the most universally applicable; when the characteristics of the inversion problem are not distinct, it is the safer choice [21]. Accordingly, we observe that, although the linear decrease method does not excel in convergence efficiency, its performance is relatively stable.
To sum up, although the Gaussian decrease method converges quickly, it is prone to getting trapped in local optima; the exponential decrease method has similar issues and performs unstably. The logarithmic decrease method maintains search capability over the long term but converges more slowly than the linear method and is better suited to complex optimization problems. In fault inversion problems, the shape and type of the fault are the unknowns to be solved, and their characteristics are not obvious; the linear decrease method is precisely suitable for such situations. Therefore, we choose the stable linearly decreasing method to control the change in inertia weight.
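For illustration, the four schedules compared above can be written compactly as follows; the precise Gaussian, exponential, and logarithmic forms differ across the cited works, so the shapes below are representative assumptions:

```python
import numpy as np

def inertia_schedules(k, k_max, w_max=0.9, w_min=0.4):
    """Representative inertia weight decrease schedules at iteration k."""
    t = k / k_max  # normalized iteration progress in [0, 1]
    return {
        "linear":      w_max - (w_max - w_min) * t,                        # Eq. (4)
        "gaussian":    w_min + (w_max - w_min) * np.exp(-(2.0 * t) ** 2),
        "exponential": w_min + (w_max - w_min) * np.exp(-3.0 * t),
        "logarithmic": w_max - (w_max - w_min) * np.log1p(9.0 * t) / np.log(10.0),
    }
```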
4.2. Sensitivity Analysis
To analyze parameter sensitivity, we added a 10% perturbation to each parameter of an ideal model (z, w, θ, x₀, and A). We then calculated the fitness value of the perturbed model and used the central difference method to compute its sensitivity; this method assumes a locally smooth fitness surface and a symmetric response to perturbations. The expression is as follows:

$$S_{i} = \frac{Q\left(m_{i} + \Delta m_{i}\right) - Q\left(m_{i} - \Delta m_{i}\right)}{2\,\Delta m_{i}} \tag{13}$$

where m_i represents the fault parameters, Q is the objective function, and Δm_i denotes the perturbation step size. Since the dimensions and magnitudes of the parameters vary, normalization is also required (using Equation (14)). After these calculations, the sensitivities of the five fault parameters are obtained, as shown in Figure 6.
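A minimal sketch of this sensitivity computation is given below; the normalization (dividing by the sum of absolute sensitivities) is an assumed stand-in for Equation (14), which is not reproduced here:

```python
import numpy as np

def central_difference_sensitivity(objective_fn, model, rel_step=0.10):
    """Central-difference sensitivity of each fault parameter, Eq. (13)."""
    model = np.asarray(model, dtype=float)
    sens = np.zeros_like(model)
    for i in range(model.size):
        # 10% relative perturbation; fall back to an absolute step if the
        # parameter is zero (e.g., x0 = 0), an implementation assumption.
        dm = rel_step * abs(model[i]) if model[i] != 0 else rel_step
        up, down = model.copy(), model.copy()
        up[i] += dm
        down[i] -= dm
        sens[i] = (objective_fn(up) - objective_fn(down)) / (2.0 * dm)
    # Normalize across parameters (assumed stand-in for Eq. (14))
    return np.abs(sens) / np.sum(np.abs(sens))
```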
From Figure 6, we see that the top-layer center depth z and the bottom-layer center depth w of the fault have relatively high sensitivity, while the fault dip angle θ has the lowest sensitivity. This means that z and w have a significant impact on the objective function, and adjusting them is key to optimizing the algorithm. It also indirectly indicates that building the penalty term from z and w to constrain the algorithm is reasonable. However, in gravity inversion, there is a high chance of ambiguous solutions [4]. Even though z and w have high sensitivity, they are inherently highly correlated: when one changes abruptly, the other tends to change with it. This is the parameter coupling issue mentioned in Section 3.1.
4.3. Selection of Regularization Parameters
In the regularization term, we need to introduce a different regularization factor for each of the five parameters to be solved. Different combinations of regularization factors have a significant impact on the inversion results, and trying each combination one by one would be extremely inefficient. We therefore introduced the Bayesian optimization method for selecting the regularization parameters. Bayesian optimization is a hyperparameter optimization approach based on sequential decision-making. Its core has two aspects: first, it uses a surrogate model to predict the expected value of the objective function, guiding the search direction; and second, it selects the next evaluation point through an acquisition function, with each new evaluation updating the surrogate model, enabling iterative optimization.
Our surrogate model employs the commonly used Gaussian Process Regression (GPR). This model assumes that the objective function follows a Gaussian distribution, so the approximation of the objective function can be expressed as follows:

$$f(\lambda) \sim \mathcal{N}\left(\mu(\lambda), \sigma^{2}(\lambda)\right) \tag{15}$$

where f is the objective function, 𝒩 denotes a Gaussian distribution, μ(λ) represents the predicted mean (expected performance), and σ(λ) stands for the predicted standard deviation (uncertainty). Based on the output of the surrogate model, we calculate the value of the acquisition function a(λ) at all candidate points λ and select the point that maximizes a(λ) as the next sampling point:

$$\lambda_{next} = \arg\max_{\lambda} a(\lambda) \tag{16}$$
We used the commonly adopted Expected Improvement (EI) method as the acquisition function:

$$EI(\lambda) = \left(f^{*} - \mu(\lambda) - \xi\right)\Phi(Z) + \sigma(\lambda)\,\phi(Z), \qquad Z = \frac{f^{*} - \mu(\lambda) - \xi}{\sigma(\lambda)} \tag{17}$$

where f* is the best objective value observed so far, ξ is the exploration parameter, Φ is the Cumulative Distribution Function (CDF) of the standard normal distribution, and φ is the Probability Density Function (PDF) of the standard normal distribution.
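For illustration, a minimal sketch of the EI acquisition for minimization follows, assuming a fitted GPR surrogate with a scikit-learn-style predict(X, return_std=True) interface:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(candidates, surrogate, f_best, xi=0.01):
    """Expected Improvement (Eq. (17)) at candidate regularization-factor points.

    candidates : (n, 5) array of candidate regularization-factor combinations
    surrogate  : fitted GPR model exposing predict(X, return_std=True)
    f_best     : best (lowest) objective value observed so far
    xi         : exploration parameter
    """
    mu, sigma = surrogate.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-12)             # guard against division by zero
    z = (f_best - mu - xi) / sigma
    ei = (f_best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)
    return ei                                    # next sample: candidates[np.argmax(ei)]
```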
The process of conducting Bayesian optimization using the above method is shown in Figure 7. A total of 200 iterations were carried out; according to the figure, convergence was essentially achieved after the 100th iteration, yielding the optimal combination of regularization factors.
4.4. Selection of Penalty Coefficient
As part of the constraint term, the selection of the penalty coefficient in the penalty term is also crucial. A larger penalty coefficient is not necessarily better: if the constraint is too strong, the algorithm is prone to getting stuck in local optimal solutions. Since the penalty term involves only one parameter, we chose a simple loop iteration to test the impact of different penalty coefficients on inversion accuracy, increasing the penalty coefficient linearly over a fixed range with each loop iteration. The details are shown in Figure 8, from which the most suitable penalty coefficient can be identified. The RMS error calculated with this penalty coefficient is 0.67 mGal. Meanwhile, we found that, when an unsuitable penalty coefficient is chosen, the RMS error can reach up to 1.6 mGal, roughly 140% above the best result, and in 100 loop iterations unsuitable penalty coefficients were chosen quite often. This indicates that testing the selection of the penalty coefficient is very necessary.
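A minimal sketch of this sweep is given below, where invert_once is a hypothetical wrapper that runs one constrained inversion for a given penalty coefficient and returns its RMS error:

```python
import numpy as np

def sweep_penalty_coefficients(invert_once, lo, hi, n_steps=100):
    """Test linearly increasing penalty coefficients and keep the best one.

    invert_once : callable mapping a penalty coefficient to an RMS error (mGal)
    lo, hi      : bounds of the tested coefficient range
    """
    coeffs = np.linspace(lo, hi, n_steps)          # linear increase per loop iteration
    errors = np.array([invert_once(c) for c in coeffs])
    best = np.argmin(errors)                       # coefficient with the lowest RMS error
    return coeffs[best], errors[best]
```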
4.5. The Effect of Constraint Terms on Inversion
To highlight the contribution of the constraint terms during the inversion process, we used the Bouguer gravity anomaly data of the Gazelle Fault as observational data and conducted 100 inversions under both unconstrained and constrained conditions. A comparison of the inversion results with and without constraint terms is shown in Figure 9.
From Figure 9, it is evident that the inversion results with constraint terms are significantly more stable than those without; the constrained results oscillate only within a small range. The amplitude factor A, fault dip angle θ, and fault origin x₀ each exhibit one “abrupt change” phenomenon, but its magnitude is small and does not affect the overall stability of the inversion. In the results without constraint terms, the oscillations of the amplitude factor A, fault dip angle θ, and fault origin x₀ are more pronounced, indicating that these three parameters are less stable during the inversion process due to the non-uniqueness of gravity inversion. The top center depth z and bottom center depth w of the fault are relatively stable and show a certain correlation: when z undergoes an abrupt change, w also changes abruptly. An interesting observation is that, when z and w change abruptly, the RMS error increases, suggesting that z and w have a greater impact on accuracy, which aligns with the results of the parameter sensitivity test in Section 4.2.
In this section, we first compared the effects of different inertia weight decrease methods in the C-AWPSO algorithm and chose the linearly decreasing method, which strikes a good balance between convergence efficiency and accuracy. Next, we analyzed the sensitivity of the five fault parameters; the results showed that z and w have the highest sensitivity and that their inversion results are relatively stable. Given the number of parameters involved in the constraint terms, we also conducted tests to select appropriate regularization factors and penalty coefficients. Finally, we verified the impact of the constraint terms added to the objective function, finding that inversion results with constraint terms are significantly more stable than those without. These experiments demonstrate that the C-AWPSO algorithm can yield accurate fault depth information when applied to practical fault inversion, and that introducing the constraint terms notably reduces the instability of the fault dip inversion results. Therefore, the C-AWPSO algorithm is feasible and robust for fault parameter inversion.