Tsallis Entropy, Likelihood, and the Robust Seismic Inversion

The nonextensive statistical mechanics proposed by Tsallis have been successfully used to model and analyze many complex phenomena. Here, we study the role of the generalized Tsallis statistics on the inverse problem theory. Most inverse problems are formulated as an optimisation problem that aims to estimate the physical parameters of a system from indirect and partial observations. In the conventional approach, the misfit function that is to be minimized is based on the least-squares distance between the observed data and the modelled data (residuals or errors), in which the residuals are assumed to follow a Gaussian distribution. However, in many real situations, the error is typically non-Gaussian, and therefore this technique tends to fail. This problem has motivated us to study misfit functions based on non-Gaussian statistics. In this work, we derive a misfit function based on the q-Gaussian distribution associated with the maximum entropy principle in the Tsallis formalism. We tested our method in a typical geophysical data inverse problem, called post-stack inversion (PSI), in which the physical parameters to be estimated are the Earth’s reflectivity. Our results show that the PSI based on Tsallis statistics outperforms the conventional PSI, especially in the non-Gaussian noisy-data case.


Introduction
Recently, Tsallis nonextensive entropy [1] was applied to describe and analyze many complex phenomena that the standard statistical mechanics seem inadequate to address [2][3][4]. The reason for the success of the Tsallis theory comes from the fact that the observational data are better described within the Tsallis framework [5]. In this scenario, the non-Gaussian distributions that arise from the Maximum Entropy Principle (MEP) [6,7] associated with Tsallis entropy [1,8] help to explain many characteristics of complex systems, such as long-range correlations [9] and the scale-free phenomena [10,11].
The Tsallis statistics are an extension of the standard statistical mechanics and they are based on the Boltzmann-Gibbs-Shannon (BGS) entropy measure. The successful applications of Tsallis entropy [12] motivates us to explore the mathematical tool behind the Tsallis framework. In this study, we included the Tsallis statistics in the inverse problems theory, which is an important field in applied physics [13], engineering [14], seismology [15], biomedical imaging [16], machine learning [17], and geophysics [18], among many others.
The essence of an inverse problem consists of obtaining information about physical parameters from indirect observations [18,19]. To achieve this task, the inverse problem is formulated as an optimisation problem that aims to minimize the difference between the modelled data and the observed data (residuals or errors), in which the usual misfit function is based on the least-squares distance of the residuals [19]. In addition, inverse problems are inherently ill-posed in the sense of Hadamard [20], meaning that the solution is not unique and a little perturbation in the observed data can severely impact the search for the global minimum of the misfit function [21].
In this paper, we explore a novel methodology in the geophysical data inverse problem using Tsallis entropy and a probabilistic maximum-likelihood approach. This methodology allows us to perform seismic data inversion to estimate the physical parameters of subsurface structures, known as post-stack inversion (PSI) [22]. The main objective of the PSI methodology is to estimate the reflectivity of subsurface structures by minimizing the difference between the observed seismic data and the theoretically modelled data using the misfit function [22,23]. We construct the misfit function by applying the probabilistic maximum-likelihood [19,24] method for the q-Gaussian probability distribution, which is obtained through the MEP for Tsallis entropy [8,25,26]. From a statistical viewpoint, this is equivalent to assuming that the residuals obey the q-generalized Gauss' error law [27]. The results show that the PSI based on Tsallis statistics is a powerful tool for the reconstruction of seismic images, especially in situations where the data is noisy. Our proposal outperforms the conventional PSI approach because the noise present in the data is seldom Gaussian [28].
In the next section, we briefly review the conventional PSI formulation. In Section 3, we review the Tsallis framework and derive the misfit function associated with the nonextensive entropy, as proposed in the Tsallis work [1]. This section is the core of this work and, therefore, it is dedicated to PSI formulation on the Tsallis framework. In Section 4, we describe the parameters that we used in the numerical simulations to test our methodology to a geophysical problem from oil reservoir imaging. In addition, we compare the conventional Gaussian-oriented PSI with the PSI based on q-statistics. Finally, in Section 5, we give our final remarks and we place this manuscript in a broader context.

Conventional PSI Formulation
The PSI is used to quantitatively infer rock-properties. In particular, it describes the subsurface structures from the reflection data (or post-stack data) recorded in a seismic survey [22]. The typical seismic experiment consists of an explosive source and a set of mechanical sensors that capture the waves reflected from the geological structures. Using a conventional approach, the PSI is formulated as a least-squares optimisation problem [22,23], whose main goal is to minimize the difference between the observed data (d obs ) and the modelled data (d mod ) using the misfit function: In this equation, r represents the reflectivity of the medium under study (model parameters) and t denotes the time. In addition, the superscript T refers to the transpose operator and n represents the size of the total reflectivity series recorded in the seismic survey. The modelled data relies on the convolution between the seismic source, s(t), and the medium reflectivity series, r(t), given by [29]: where * represents the convolution operator and G represents a kernel matrix computed from the seismic source, in which its action on any vector r(t) will result in the convolution between s(t) and r(t).
The least-squares PSI (hereafter conventional PSI) determines the maximum a posteriori solution that can be used to estimate the model parameters. From a probabilistic maximum-likelihood viewpoint, the errors are assumed to be independent and identically distributed by a Gaussian probability distribution [18,19,24]. Consequently, the minimization of φ G (r) is equivalent to the maximization of the standard Gaussian likelihood, given by: in which the maximum a posteriori of r is obtained by minimizing the negative log of L G (r). It is well-known that the MEP [6,7] for the BGS entropy and straightforwardly yields a standard Gaussian distribution, where p(x) is a probability density function. The constraint (5) is the normalization condition, while the constraint (6) restricts the second moment to unity. In summary, the conventional PSI maximizes the BGS entropy under the constraints (5) and (6).

Tsallis Framework and Seismic Inversion
Constantino Tsallis postulated a generalization of the BGS entropy [1] that describes non-equilibrium states and also describes the equilibrium thermostatistics [30]. Formally, the Tsallis framework (or q-framework) is based on the q-generalized logarithm function [1,8] and its inverse function, the q-generalized exponential: where q ∈ R is the nonextensive parameter of the Tsallis theoretical framework and [a] + = max{0, a}.
In the q → 1 case, the expressions in (7) and (8) reduce to the usual exponential and logarithmic functions, respectively. For the continuous case, the q-entropy associated with the q-framework is given by: where k B is the Boltzmann constant. In the limit q → 1, the q-entropy in (10) recovers BGS entropy [1] Furthermore, Tsallis entropy (10) is usually written as [1]: (12) which is the mathematical definition that we will use in this paper.

Maximum Tsallis Entropy and the q-Gaussian Distribution
The MEP [6,7] for Tsallis entropy (12) under the constrains of normalization (5) and the q-normalized expectation is associated with the following functional entropy to be maximized: where α 1 and α 2 are Lagrange multipliers. The optimisation of Equation (14) consists of calculating the functional entropy saddle point (stationary point), δS δp(x) = 0: To satisfy Equation (15) the following equation should be satisfied: which straightforwardly yields the q-probability function: To rewrite the probability function in (17) using q-generalized functions, consider the expression where A q and B q depend on q and the Lagrange multipliers α 1 and α 2 . After some algebra, we derive the following expression for the q-probability distribution: with In this context, Equation (19) corresponds to the q-Gaussian probability distribution, which is used in the geophysical inversion in this manuscript. We notice that no q-Gaussian exists in the q ≥ 3 case because it does not satisfy the normalization condition, Equation (5).

The q-misfit Function
Assuming that all residual data (x = x 1 , x 2 , ..., x n ) are independent and follow a q-Gaussian distribution (20), the q-misfit function is obtained by the log-likelihood (we take the logarithm for convenience): This formula can be written as: We notice that minimize Equation (22) is equivalent to minimizing the function: which in our geophysical inversion problem is formulated as the following optimisation problem: In addition, we consider that B q = 1/(3 − q), as proposed in [25] for 1 < q < 3. We notice that the conventional misfit function is recovered from Equation (24) for the limit q → 1. Using the L'Hôpital's rule: quod erat demonstrandum. Therefore, the optimisation problem formulated in (24) becomes: We call the function φ q (r) a q-misfit function and we call q-PSI the optimisation problem employing the q-misfit function. It is worth noting that the term in brackets is always positive in the 1 < q < 3 case. However, for q < 1, φ q (r) is given by Equation (26) [25] and zero otherwise. To simplify the notation, henceforth operation [a] + will be implicit in the equations.

PSI as a Local Optimisation Problem
As mentioned earlier, the PSI is usually solved using local optimisation methods. Starting from a reflectivity series r 0 (initial model), the optimisation problem is solved iteratively by updating the model according to: where α k is the steplength [31] of the k-th iteration, H is the Hessian matrix, and ∇ r φ(r k ) = ∂φ(r k )/∂r is the gradient of the misfit function φ(r k ). Therefore, in addition to the misfit function, we need to determine their corresponding gradient and Hessian.
In the q-PSI case, the gradient is given by: For comparison, the gradient of the conventional misfit function is given by: wherein it is easy to see that at the limit q → 1, the q-misfit function gradient-Equation (28)-becomes the conventional misfit gradient function-Equation (29). By comparing Equations (28) and (29), we can see that the gradient of the q-misfit function is weighted by the factor 2/[3 − q + (q − 1)(Gr i (t) − d obs i (t)) T (Gr i (t) − d obs i (t))]. Consequently, the large residual data are dampened in the model update process, Equation (27), which makes the q-PSI less sensitive to large errors than the conventional PSI approach.
In our numerical study, because of the fast convergence, we perform the optimisation with the quasi-Newton method limited-memory Broyden-Fletcher-Goldfarb-Shanno (l-BFGS) [31], in which the inverse of the Hessian is obtained through an approximation computed from previous gradients [32]. Thus, we do not explicitly calculate the second derivative of the misfit function, which saves a lot of memory and reduces the computational cost. This trick is valuable because geophysics problems typically have a large number of variables (number of model parameters).

Numerical Results
We illustrate our method using a portion of the Marmousi2 model, which is based on the geology of the Kwanza Basin region (Angola) [33,34]. The acoustic impedance distribution of this model is shown in Figure 1a. The impedance model consists of 2000 and 400 grid cells in the vertical and horizontal directions, respectively (800,470 total grid points). From this impedance model, we obtained the reflectivity model that will be used as the true model in this study, which is depicted in Figure 1b. The mathematical relationship between acoustic impedance (Z) and reflectivity (r) is given by [29,35]: (b) Reflectivity model (true model), which is extracted from the impedance model in Figure 1a using Equation (30).
A Ricker wavelet [36] is considered to be the source signature, which is defined by: where ν p is the most energetic frequency. In this study, we considered ν p = 55 Hz.
For the optimisation solver, we use the l-BFGS algorithm [32] to minimize the misfit function for both methods, in which the stop criteria is a tolerance in the gradient norm or if the step length does not satisfy the Wolfe conditions [31]. The gradient norm is ||∇ r φ(r)|| < = 10 −12 . If one of these criteria is satisfied, then the inversion process stops. Figure 2b shows the initial model that was used for all of the numerical simulations. We considered two scenarios: in the first scenario, an ideal case with noiseless data is considered to demonstrate the good working order of the algorithms. Then, in the second scenario, we considered a dataset with spikes, to simulate a non-Gaussian noise. The spikes were added over 1% of the samples (chosen randomly using a uniform distribution) by rescaling the signal amplitudes by a factor of 15β, in which β follows a standard Gaussian distribution. For each scenario, we performed 16 inversions, the first follows the conventional approach, Equation (1) and for the others we used the q-PSI with q = 0.1, 0.3, 0.5, ..., 2.9, Equation (26). The observed data for these two numerical experiments are depicted in Figure 3. We emphasize that the initial model is the same for all numerical experiments. The inversion results for the first scenario using conventional PSI and q-PSI with q = 0.1, 0.3, 0.5, ..., 2.9 are shown in Figure 4. The results for both methods show a good image resolution of the subsurface, we then compare the results with the true reflectivity model (Figure 1b).
In the second scenario in which non-Gaussian noise is considered, the conventional PSI fails to obtain a good reconstruction of the reflectivity model, as depicted in Figure 5e. In contrast, the q-PSI seems to obtain results that are very close to the true reflectivity model as the q value increases in the 1 < q < 3 case; that is, as the deviation from Gaussian behaviour is larger, as depicted in Figures 5f-o for q = 1.1, 1.3, 1.5, 1.7, ..., 2.9, respectively. Although the q-PSI results for q ≤ 1.3 show artefacts (see Figures 4a-d,f,g), they are still superior to the conventional PSI result (Figure 4e).
In addition, we compute three statistical measures to quantitatively compare the PSI results with the true reflectivity model: in the first measure, we computed the normalized root-mean-square (NRMS), which is defined as: where r true corresponds to the true reflectivity model and r inv is the inversion result. A NRMS close to 0 means low error. The other two measures are similarity measures: Pearson's coefficient (R) [37] and the structural similarity (SSIM) index [38]. Pearson's coefficient measures the linear relationship between the reconstructed model and the true model, while the SSIM is a quality index of similarity between images. Both R and SSIM measures vary between −1 (anti correlation/similarity) to 1 (perfect correlation/similarity). The statistical measures for the two scenarios are summarized in Table 1.
In the analysis of the results presented in Table 1, we observe that the q-PSI inversion for the q-value around 2.1 shows the best result with the lowest NRMS and the highest similarity (highest R and SSIM) for all scenarios. However, we emphasize that our proposal with q > 1.3 also produces good results, is robust to non-Gaussian noise, and shows low sensitivity to outliers in the dataset, which are represented here by the spikes.
Furthermore, the q-PSI requires the same or fewer number of iterations than the conventional PSI to converge, as illustrated in Figure 6. In the first scenario, the convergence of both methods is quite similar, as depicted in Figures 6a and 7a. However, in the second scenario, the q-PSI with q > 1.3 and q < 1 converges more quickly than the others, in which the decay of the convergence curve slopes more as the q-value increases, as depicted in Figures 6b and 7b.

Conclusions
We investigate the portability of Tsallis nonextensive entropy to compute physical parameters for a geophysical data inverse problem. In this context, we have presented a robust misfit function to mitigate the seismic inversion sensitivity to noise in the reconstruction of subsurface reflectivity models. Given that we employ a PSI method for the geophysical inversion, we refer to our proposal with the abbreviation q-PSI-in reference to Tsallis statistics (or q-statistics).
To illustrate the stability and robustness of our proposal, we considered two numerical experiments: the first experiment is an ideal noiseless data and in the second experiment we added spikes in 1% of the data simulating a non-Gaussian noise. The numerical experiments demonstrate that the q-PSI outperforms the conventional PSI and is a promising tool for seismic exploration, especially for use with non-Gaussian noisy-data. Furthermore, the inclusion of Tsallis statistics in solving the inverse problem under study accelerates the algorithmic convergence of the inversion method, especially for the 1.3 < q < 3.0 cases.
The choice of the q-parameter is an important aspect of our methodology and, after a statistical analysis of the PSI results, we conclude that q-values around 2.1 produce more reliable estimates of the physical parameters. To conclude, the q-PSI is a valuable tool in exploration geophysics. Furthermore, we believe that our proposal can be extended to other inverse problems that require robust methods.