Article Estimation of Seismic Wavelets Based on the Multivariate Scale Mixture of Gaussians Model

This paper proposes a new method for estimating seismic wavelets. Suppose a seismic wavelet can be modeled by a formula with three free parameters (scale, frequency and phase). We can transform the estimation of the wavelet into determining these three parameters. The phase of the wavelet is estimated by constant-phase rotation to the seismic signal, while the other two parameters are obtained by the Higher-order Statistics (HOS) (fourth-order cumulant) matching method. In order to derive the estimator of the Higher-order Statistics (HOS), the multivariate scale mixture of Gaussians (MSMG) model is applied to formulating the multivariate joint probability density function (PDF) of the seismic signal. By this way, we can represent HOS as a polynomial function of second-order statistics to improve the anti-noise performance and accuracy. In addition, the proposed method can work well for short time series.


Introduction
In seismic exploration, a seismic trace y(n) can be often represented as the convolution of a wavelet w(n) with the reflectivity series r(n) plus a superposed noise n(n):

OPEN ACCESS
where * denotes the convolution, s(n) represents the clean signal.It is generally assumed that reflectivity is a white super-Gaussian process; the noise is a white Gaussian process, and the wavelet is a band-limited filter.
In equation (1), the reflectivity series, the wavelet, and noise are unknown, except for the seismic trace, which can be obtained by a seismic receiver, so estimation of the wavelet and the reflectivity series from the seismic trace is a typical blind deconvolution problem.
Many techniques have previously been developed for this purpose, and these have been reviewed and described in detail in the literature.One early technique, which assumes that the reflectivity sequence is a Gaussian white noise, supplies a minimum phase wavelet from the second-order statistical content of the trace signal [1,2].Unfortunately, this approach is not satisfactory since the source wavelet is usually non-minimum phase, and the reflectivity sequence is not well described by a Gaussian model.
Another way of applying the whiteness hypothesis to estimating the wavelet consists in general purpose information criteria [3][4][5][6].It is related to minimizing the mutual information rate, which measures the independence degree of the deconvolved data sequence.Though there are some shortcomings to be overcome in the method, such as sensitivity to noise, finding global optimum, and so on, it is still a very important approach due to the absence of limitations for the wavelet phase.
In the blind deconvolution context, the problem is ill posed, and extra assumptions about the reflectivity or about the wavelet are necessary.For instance, one can model the reflectivity sequence by using a Bernoulli-Gaussian (BG) model.In the literature, a few approaches based on spike detection/estimation and BG models have been proposed using a maximum-likelihood approach or in a Bayesian framework adding some priors [7][8][9].For seismic field data, the reflectivity sequence changes do not appear as clearly as predicted by the BG model, therefore, the sparse nature of the input signal is not a sufficiently robust hypothesis for good system estimation [3].
The cumulant matching (CM) method has been developed for estimating a mix-phase wavelet from a convolutional process [10].Matching between the trace cumulant and the wavelet moment is done in a minimum mean-squared error sense under the assumption of a non-Gaussian, stationary and statistically independent reflectivity sequence.The reliability of the Higher-order Statistics (HOS) matching method depends strongly on the amount of data available.Thus, in this paper, we are interested in the problem of how we can estimate the HOS from the field recordings accurately and efficiently without having a large amount of data available.A computationally efficient method is proposed for computing the HOS (fourth-order and sixth-order cumulants) based on the joint probability distribution function (PDF) of the output seismic signal, which is described by the multivariate scale mixture of Gaussians (MSMG) model.Using this PDF model, we can represent HOS as a polynomial function of Second-order Statistics (SOS); hence we work just with SOS.Since SOS is phaseless, we apply constant-phase rotations technique to estimating the wavelet phase.Simulation result is provided showing the performance of the method, and the proposed method is seen to work satisfactorily in a real data example.
This paper is organized as follows.In Section 2, we describe the wavelet model and introduce the general CM methods for estimating the wavelet parameters.After introducing MSMG PDF model briefly in Section 3, and in Section 4, we derive the cumulants of the field recordings from second-order to sixth-order based on MSMG model, and use the one-dimensional cumulant rate (ODCR) to match the one-dimensional rate of the wavelet moment in order to obtain the scale and frequency parameters of the wavelet; we then illustrate performances of different orders ODCR for estimating the wavelet.In Section 5, we describe our wavelet estimation procedure.Examples illustrating the effectiveness of our approach are given in Section 6.Finally, we summarize our work in Section 7.

Wavelet Model and the General CM Methods
In reservoir geophysics, the seismic wavelet usually can be approximated by a formula as follows [11]: where the scale, σ, controls the width of the seismic wavelet, φ is its initial phase and f is its dominant frequency.Without loss of generality, we let the amplitude of the wavelet be a unit.

Estimating the Phase
The optimum phase could be estimated by applying a series of constant-phase rotations to the analyzed seismic trace [12].The rotated trace y rot (n) can be obtained from the original trace y(n) by where φ denotes the phase rotation angle, and H[•] is the Hilbert transform.Since a zero-phase wavelet contains the majority of its energy in relatively narrow time duration, its kurtosis is larger than that of any non-zero-phase wavelet.So, when the kurtosis of y rot (n) reaches its maximum, the related angle, φ is considered as the most likely negative wavelet phase of y(n).This method for estimating the phase of a seismic wavelet is called the constant-phase rotation method, which is very important to the blind deconvolution of seismic traces.
The constant-phase rotation method is based on the theory of entropy.The operation of rotating the seismic trace results in the increase of its non-Gaussianity.Negentropy is used to measure the non-Gaussianity of a random variable, which is defined as: where H(X) denotes the entropy of random variable X, X Gauss is a Gaussian variable with the same variance as that of X. Negentropy is nonnegative, and will become larger as a random variable goes off the Gaussianity.
In practice, the negentropy of a random variable with zero mean and unit variance can approximately be obtained as follows [13]: + ≈ (5) where: is defined as kurtosis of X.If X has a symmetrical probability distribution function, the equation above reduces to: Hence, the constant-phase rotation method is identical to the minimum entropy deconvolution proposed by R.A. Wiggins.To guarantee scale-invariance, Wiggins adopted the standardized 4th cumulant as a criterion to performing optimization, for a random variable with zero mean, which is: The kurtosis is usually calculated by (7) for the constant-phase rotation method.

Estimating the Scale and Frequency of the Seismic Wavelet
To estimate the scale and frequency parameters, one can use various cumulants to match the wavelet moments based on the above models (1) and (2).
In model (1), based on the assumption that reflectivity is a zero-mean, non-Gaussian, stationary and statistically independent sequence, and the wavelet represents a linear, time-invariant, causal and stable filter, the resulting seismic trace is thus a zero-mean discrete stationary random process, we further assume that {y(n)} is kth-order stationary, then, we have the following formula: where: ( ) ( ) denotes the kth-order cumulant of reflectivity sequence.The above equation establishes the relation between the kth-order cumulant of {y(n)} and the wavelet, which was proposed in 1955 by Bartlett [14] and in 1967 by Brillinger and Rosenblatt [15].
Since the reflectivity sequence distribution is unknown, γ r (k) cannot be obtained, we have to match the wavelet by means of the ratio of c y where: Hence, following from (8), we have: So, the cumulant matching method can be performed in a minimum mean-squared error sense by defining the cost function: (10) where rˆy (k) (τ 1 , τ 2 , …, τ k-1 ) denotes the estimate of r y (k) (τ 1 , τ 2 , …, τ k-1 ).

MSMG (Multivariate Scale Mixture of Gaussians) Model
In many real world data sets involving multivariate observations, the data have an empirical distribution which is highly peaked at zero (or the mean vector), and which asymptotically falls off more slowly than the Gaussian distribution as the distance from zero increases.We deal these distributions with sparse distributions, which can be frequently encountered in various areas, like blind source separation and independent component analysis (ICA), etc. Multivariate observations, which are mutually correlated and have higher-order dependencies, have frequently been represented using mixture of Gaussians models.In the above blind deconvolution context, the reflectivity sequence is said a white sparse distribution.As a convolution of a wavelet with the reflectivity series, the clean signal presents sparse and correlated nature, and can also be represented using mixture of Gaussians models.A random vector Y with zero mean called Multivariate scale mixture of Gaussians distribution can be described as follows [16][17][18].
Let X be an N-dimensional, zero mean Gaussian variable with covariance matrix equal to the identity matrix.And furthermore let Γ∈R N×N (N × N real matrix) be a positive definite symmetric matrix with det(Γ) = 1, and let Z be a scalar random variable with PDF g z (z), which can attain only positive values.We denote variable Y a multivariate scale mixture of Gaussians according to: Given a PDF g z (z) for the scale parameter Z, the marginal PDF for Y can be obtained by averaging the PDF of Y|Z over the density of Z: The characteristic function of Y can be written as: where: Using (13), we have: Hence, we have: where µ Y is mean vector of Y, R Y is covariance matrix of Y.

Calculating the HOS of Recorded Seismic Trace
Assuming that the clean signal s(n) is independent of additive noise n(n), using (1), we can write the characteristic function of the seismic trace as: where y = [y 1 , y 2 , …, y N ] T is the N-dimensional vector formed by the seismic trace, Φ s (ω) is the characteristic function of clean signal which can be given by ( 13), and Φ n (ω) is the characteristic function of additive noise.
Noting that the noise is a white Gaussian process, we have: (17) where σ n 2 is noise variance.
The cumulant-generating function of the seismic trace is: It is well known that there is a relationship between the cumulant-generating function and the cumulant given by [19]: where j 2 = −1.Representation ( 19) plays a fundamental role in calculating the cumulant, using ( 18) and ( 19), we can calculate the cumulant of the seismic trace.The calculated cumulants of the recorded seismic trace from second-order to sixth-order are shown in Table1, and the details are given in Appendix A. ) ) where Γ ij denotes the ith row and the jth column element of Γ.
Similarly, based on above result and using (9), we obtain the ratios of c y fourth-order and sixth-order, which are in Table 2, where Table 2. Cumulant rate.

One-dimensional Cumulant Rate
In practice, using one-dimensional cumulant rate (ODCR) is sufficient for us to estimate the wavelet parameters.To express succinctly, we denote different orders ODRC in convenient form, for instance: The different orders ODRC and their relation to the corresponding one-dimensional rate of the wavelet moments, following from (10), are shown in Table 3, where m denotes time lag.

Estimating ODCR
Noting that ρ s (0) is equal to one, and following from Table 2, we can obtain the ODCR, which take the form: To obtain ODCR, we need to estimate ρ s (m).It follows from (1) that: where σ r 2 denotes reflectivity variance, and δ(m) is Dirac function, R y (m) and ρ y (m) denote the autocorrelation function and the correlation coefficient of the seismic trace respectively.Hence, we have: The term σ r 2 R s (0)/σ n 2 denotes Signal-to-Noise Ratio (SNR).In practice, usually we do not possess the clean data; we have to estimate ρ y (m) as a substitution for ρ s (m).It follows from ( 22) that the estimation error increases as the SNR drops.
Using above results, we can establish now the estimators of ODCR, which are given by: where ρ ˆy(m) is the estimation of ρ y (m).

The Behavior of ODCR with Different Orders
To obtain a good estimation, it is necessary for us to exploit the differences between the different order ODCRa for matching waveleta.In this section, we thus define the different error to evaluate each estimated ODCR.
It follows from (9) and Table 3, that the value of that the true ODCR, derived from seismic data, divided by the corresponding one-dimensional rate of the wavelet moment, is equal to one.As for the estimated ODCR, the above value cannot be one, thus, we use the absolute value of the deviation from one to define error, which is shown in Table 4, where M denotes the maximal time lag.
We have simulated the seismic trace by convolving a super-Gaussian reflectivity with a seismic wavelet with zero phase, plus additive white noise, to select which one can match the wavelet best.Figure 1 illustrates the error behavior of ODCR with different orders, in terms of the summation of error, versus the number of simulation times, where the error following from correlation coefficient (the first row of Table 4) is compared with other errors from ODCR.We obtain the numerical results presented below by performing independent simulation trial 100 times.The simulation setting is shown in Table 5.
The reflectivity sequence samples are distributed according to a generalized Gaussian distribution (GGD).The PDF of a zero-mean generalized Gaussian random variable R with deviation λ is: where: where λ > 0 and Γ ~(•) defines the complete Gamma function given by: 1 0 ( ) A(ν, λ) is a generalized measure of the variance and defines the dispersion of the distribution, while the parameter ν describes the exponential rate decay and, in general, the shape of the peak at the center of the distribution.Well-known special cases of the GGD function include a Laplacian distribution (ν = 1) and a normal distribution (ν = 2).In effect, smaller values of the shape λ correspond to heavier tails and therefore to more peaked distributions.We perform 512 Monte-Carlo runs to obtain the reflectivity sequence samples; a typical reflectivity sequence is presented in Figure 4.As can be seen from Figure 1, rˆy (4) (y n (2) , y n+m (2) ) gains the best performance to match the wavelet, the next is rˆy (6) (y n (4) , y n+m (2) ).Moreover, in our simulation trials, the error variances of using rˆy (4) (y n (2) , y n+m (2) ) and rˆy (6) (y n (4) , y n+m (2) ) are very small, about 3 × 10 −4 and 5 × 10 −4 respectively, others are about 0.01, which implies the numerical robustness.Figure 2 shows the change of the performance of rˆy (4) (y n , y n+m ( ) with the wavelet frequency.We can see that there is no distinct change when the frequency varies.Like the other methods, the smaller the value of the scale of the wavelet is, the better estimation values our method can obtain.In order to test our method, we choose a comparatively large scale value for the wavelet in the simulation.To complete our discussion, we finally compare the performance of rˆy (4) (y n , y n+m ) derived from MSMG model with that of the same one estimated directly from the recorded seismic trace.rˆy (4) (y n (2) , y n+m ), estimated directly from the recorded seismic trace, is denoted as rˆy 0 (4) (y n (2) , y n+m (2) ).
According to: ) given by: ) , y n+m ), given by: ( ) ( ) ( ) We also perform independent simulation trial 100 times in the same setting which is shown in Table 5, the result illustrated in Figure 3 shows that using rˆy 0 (4) (y n (2) , y n+m (2) ) to match the wavelet is very poor, the drastically different performance with that of using rˆy (4) (y n (2) , y n+m (2) ) consist in that the outliers of seismic trace pose a serious influence on the HOS estimation.).

Estimating the Wavelet
We can solve our problem in a two-step approach.First, we estimate the wavelet phase using the constant-phase rotation method.Then, we use rˆy (4) (y n (2) , y n+m (2) ) to match the rate of the wavelet moment in order to obtain the scale and frequency parameters.Since the estimator of ODCR is a polynomial function of the correlation coefficient, and the correlation coefficient is phaseless, we can perform these two steps.
As for computing the rate of the wavelet moment, since correlation coefficient is phaseless, we use the wavelet by simply taking the form: Its discrete moment rate corresponding to rˆy (4) (y n , y n+m ) is: Minimizing cost function: we can obtain the scale and frequency parameters: This is done by simply searching for different values of the scale and frequency among their reasonable scopes, which can be obtained from the spectrum of the seismic trace.Based on above results, we summarize the seismic wavelet estimate algorithm as follows: 1. Use constant-phase rotation method to obtain the wavelet phase.2. Estimate the correlation coefficient of the recorded seismic trace, and compute rˆ4(y n (2) , y n+m ).

Results
In this section, we evaluate performance of the proposed methods for the wavelet estimation through simulations and real data experiments.

Simulation Results
We build a synthetic trace of length over 500 by convolving a super-Gaussian reflectivity with a wavelet defined by equation ( 2), plus additive white noise.Figure 4 plots the result of a simulation experiment.The reflectivity is simulated by a white process with GGD distribution, the true and estimated parameters are shown in table 6 respectively.Table 6 shows that the estimated quantities σ, f and φ are very close to their true values with small errors.This demonstrates that: (1) our method can work well over short time series, but the general CM method cannot; (2) our method has both strong anti-noise performance and high accuracy.Moreover, our method has low computational costs.

Real-data Experiment
Here, we consider real seismic trace with about 800 samples acquired from the Changqing oil field in China.It is noteworthy that the Wiener filter can achieve an optimum trade-off solution between deconvolution quality and noise amplification [6], so we find the seismic wavelet by our method and then improve the resolution of the seismic trace with the Wiener filter.The results are shown in Figure 5 where: (a) is original seismic trace, (b) denotes the deconvolved one, (c) is the estimated seismic wavelet, and (d) are amplitude spectrum of (a) and (b) respectively.
Let us compare (a) and (b).The seismic reflectors, labeled A, B in (b), are identified, but this cannot be done in (a) because of its poor resolution.Now let us compare them in frequency domain.Trace (d) shows that the amplitude spectrum of (b) is broader than (a).These show that the resolution of (b) is obviously high than (a).

Conclusions
In this paper, we have proposed a new approach based on the MSMG model for estimating seismic wavelets.We compared the performance using HOS derived from the MSMG model with that of the same one estimated directly from the recorded seismic trace.The method has been applied to synthetic and real traces.Simulation and real-data experiment have shown the ability of our approach in the presence of low Signal-to-Noise Ratio and the traces without a large amount of data available.We only give an application of our method for enhancing the resolution of seismic trace.In fact, it can be used any field where seismic wavelets must be estimated from seismic data, such as seismic trace inversion, Q estimated with WEPIF method, etc.

Figure 1 .
Figure 1.The error behavior of ODCR with different orders.

Figure 2 .
Figure 2. The change of the performance in variation with wavelet frequency.
define the error and its summation in term of rˆy 0

M
(m) corresponding to the different values of the scale, frequency and lags.

4 .
Minimize cost function to get the scale and frequency parameters.

Figure 4 .
Figure 4. (a) Super-Gaussian reflectivity sequence.(b) Reflectivity histogram.(c) Noisy seismic trace.(d) The kurtosis value of the rotated trace versus the rotation angle (the angle is represented as the multiple of π).(e) Cost surface.(f) Wavelet.
To calculate the cumulants, we need compute the following statistics: the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland.This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Table 3 .
ODCR and the rate of wavelet moment.