3-D Data Interpolation and Denoising by an Adaptive Weighting Rank-Reduction Method Using Multichannel Singular Spectrum Analysis Algorithm

Addressing insufficient and irregular sampling is a difficult challenge in seismic processing and imaging. Recently, rank-reduction methods have become popular in seismic processing algorithms for simultaneous denoising and interpolation. These methods are based on rank reduction of the trajectory matrices using truncated singular value decomposition (TSVD). Estimation of the ranks of these trajectory matrices depends on the number of plane waves in the processing window; however, for more complicated data, rank-reduction methods may fail or give poor results. In this paper, we propose an adaptive weighted rank-reduction (AWRR) method that selects the optimum rank in each window automatically. The method finds the maximum ratio of the energy between two singular values. The AWRR method selects a large rank for highly curved, complex events, which leaves residual errors in the result. To overcome these residual errors, a weighting operator applied to the selected singular values minimizes the effect of the noise projection on the signal projection. We tested the efficiency of the proposed method by applying it to both synthetic and real seismic data.


Introduction
Seismic surveys are designed to keep a consistent grid of sources and receivers. In practice, however, regularly sampled seismic surveys are rare because of logistic obstacles or economic constraints. These limitations lead to large shot and receiver sampling intervals, which produce poorly and irregularly sampled seismic data along the spatial coordinates, with gaps containing no recorded traces. Seismic reconstruction methods can be divided into four main classes: signal processing-based methods, wave equation-based methods, machine learning-based methods, and rank reduction-based methods.
Most of the methods in the signal processing-based category are multidimensional and use prediction filters [1][2][3] and transform domains through means such as the Fourier transform [4][5][6], Radon transform [7,8], or Curvelet transform [9]. There are some hybrid techniques that combine the Fourier transform with prediction error filters [10] as a way to improve interpolation beyond aliasing. Vassallo et al. [11] proposed multichannel interpolation by matching pursuit (MIMAP), which interpolates high-order aliased data without requiring any prior information about the linearity of seismic events or the seismic wavefield model. Ghaderpour [12] proposed a multichannel anti-leakage least-squares spectral analysis (MALLSSA) method that includes the spatial gradients of seismic data in the anti-leakage least-squares spectral analysis (ALLSSA) to regularize seismic data beyond aliasing.
Wave-equation-based algorithms execute an implicit migration de-migration pair. Stolt [13] introduced a reconstruction algorithm that regularized a dataset by mapping it to physical space using migration-demigration. A finite-difference offset continuation filter was proposed by Fomel [14].
In recent years, machine learning has attracted much attention in geophysical studies, and deep learning (DL) is having a great influence on signal and image processing. In this paper, we propose an adaptive weighted rank-reduction method (AWRR) that selects the rank of the block Hankel matrices automatically to reconstruct and denoise 3-D seismic data simultaneously. The strategy is to choose the second cutoff in the singular-value spectrum of the block Hankel matrix. We tested the efficiency of the proposed method by applying it to both synthetic and real seismic data.

Materials and Methods
Consider S(t, x, y) a block of 3-D seismic data in the t − x − y domain with N_t × N_x × N_y samples, where t = 1, . . . , N_t, x = 1, . . . , N_x, and y = 1, . . . , N_y. In the frequency domain, the data are represented as S(ω, x, y) with ω = 1, . . . , N_ω. Each frequency slice of the data at a given frequency ω_m can be represented by the following matrix:

$$
\mathbf{S}(\omega_m) =
\begin{bmatrix}
S(1,1) & S(1,2) & \cdots & S(1,N_y) \\
S(2,1) & S(2,2) & \cdots & S(2,N_y) \\
\vdots & \vdots & \ddots & \vdots \\
S(N_x,1) & S(N_x,2) & \cdots & S(N_x,N_y)
\end{bmatrix}. \qquad (1)
$$

To avoid notational confusion, let us drop the argument ω_m. Then, a Hankel matrix is constructed from each inline of S; for the ith inline, the Hankel matrix in the mth frequency slice is

$$
\mathbf{H}_i =
\begin{bmatrix}
S(i,1) & S(i,2) & \cdots & S(i,n) \\
S(i,2) & S(i,3) & \cdots & S(i,n+1) \\
\vdots & \vdots & \ddots & \vdots \\
S(i,N_y-n+1) & S(i,N_y-n+2) & \cdots & S(i,N_y)
\end{bmatrix}, \qquad (2)
$$

and the Hankel matrices are arranged into the block Hankel matrix

$$
\mathbf{M} =
\begin{bmatrix}
\mathbf{H}_1 & \mathbf{H}_2 & \cdots & \mathbf{H}_m \\
\mathbf{H}_2 & \mathbf{H}_3 & \cdots & \mathbf{H}_{m+1} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{H}_{N_x-m+1} & \mathbf{H}_{N_x-m+2} & \cdots & \mathbf{H}_{N_x}
\end{bmatrix}. \qquad (3)
$$

The size of M is I × J, where I = (N_x − m + 1)(N_y − n + 1) and J = mn. The integers m and n are chosen to make the Hankel matrices H_i and the block Hankel matrix M square, or as close to square as possible. As we can see, including the two spatial dimensions of the 3-D cube for each frequency can make the block Hankel matrix very large.
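The Hankelization above can be sketched numerically. The following is a minimal illustration of building the block Hankel matrix M from one frequency slice S; the function name and the use of NumPy are our own, not from the paper:

```python
import numpy as np

def hankelize_slice(S, m, n):
    """Build the block Hankel matrix M from one frequency slice S (Nx x Ny).

    Each inline (row) of S is Hankelized into an (Ny-n+1) x n matrix H_i,
    and the H_i are arranged into an (Nx-m+1) x m block Hankel structure,
    giving M of size (Nx-m+1)(Ny-n+1) x mn.
    """
    Nx, Ny = S.shape
    # Hankel matrix of the i-th inline: H_i[j, l] = S[i, j + l]
    H = [np.array([S[i, j:j + n] for j in range(Ny - n + 1)]) for i in range(Nx)]
    # Block Hankel arrangement: block row b, block column c holds H_{b+c}
    M = np.block([[H[b + c] for c in range(m)] for b in range(Nx - m + 1)])
    return M
```

Choosing m ≈ N_x/2 and n ≈ N_y/2 yields a roughly square M, as the text recommends.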
In seismic data, the observed data can be expressed as

$$
\mathbf{S}_{obs} = \mathbf{R}\,\mathbf{S}_0 + \boldsymbol{\eta}, \qquad (4)
$$

where S_obs represents the observed seismic data, S_0 the full noiseless data, η the random noise and residuals, and R the sampling matrix formed of zeros and ones. Using the MSSA algorithm, we can write the Hankel matrix of the observed data as

$$
\mathbf{M} = \mathbf{P} + \mathbf{N}, \qquad (5)
$$

where P represents the low-rank Hankel matrix of the desired signal and N denotes the noise component, which includes the gaps as well. In MSSA, the decomposition of the Hankel matrix in the rank-reduction step using TSVD is

$$
\mathbf{M} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{H} = \sum_{i} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^{H}. \qquad (6)
$$

If one knows the desired rank of the Hankel matrix, the desired signal is recovered by

$$
\hat{\mathbf{M}} = \sum_{i=1}^{k} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^{H}, \qquad (7)
$$

where M̂ is the estimated signal and k is the predefined rank of the Hankel matrix, equal to the number of linear events in each local window.
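The TSVD rank-reduction step can be sketched in a few lines of NumPy; the helper name is our own, and the rank k is assumed to be given:

```python
import numpy as np

def tsvd_rank_reduce(M, k):
    """Truncated SVD rank reduction: keep only the k largest singular triplets."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    # Reconstruct from the leading k singular values and vectors
    return (U[:, :k] * s[:k]) @ Vh[:k, :]
```

If M is truly of rank k plus noise, this projection removes the noise subspace spanned by the trailing singular vectors.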
Let us investigate the effect of linear events on the singular values at each frequency. Figure 1a shows the cube of 3-D data, Figure 1b shows an inline of the data, and Figure 1c represents a slice of the data in the crossline direction. We investigate the singular-value distribution of the data by converting it to the ( f , x, y) domain and then generating the block Hankel matrix for each frequency slice. Figure 2a shows the block Hankel matrix for the frequency slice of 20 Hz. Figure 2b displays the block Hankel matrix generated for the frequency slice of 50 Hz. According to Figure 2a,b, it is clear that the block Hankel matrix for the frequency of 20 Hz is smoother than the one for the frequency of 50 Hz. This signifies that the block Hankel matrices at higher frequencies require a higher rank than those at lower frequencies to be recovered. The singular-value spectrum of the data's block Hankel matrices over its frequency range is presented in Figure 3, which shows the spectrum of the first 15 singular values. This figure illustrates how the number of nonzero singular values of the block Hankel matrix at low frequencies is equal to the number of linear events but increases with frequency. Similarly, Figure 4a represents the bar plot of the normalized singular values for the frequency slice of 20 Hz, and Figure 4b shows the zoomed image of the first 20 singular values. Figure 4c shows the bar plot of the singular values for the frequency slice of 50 Hz, and Figure 4d is the zoomed image of the first 20 singular values. From the figures, one can see that for the frequency of 20 Hz, the maximum difference in energy between two adjacent singular values occurs at the third singular value, which equals the number of linear events. However, for the frequency slice of 50 Hz, there is a second group of nonzero singular values. This means that we need to keep more singular values to recover all the frequencies completely.
To analyze the effect of each singular value on the energy of each frequency for the data presented in Figure 1, the data were decomposed into their singular matrices and recovered with each singular value from 1 to 15 separately. Then, the energy per frequency, or power spectrum, of the data recovered with each singular value was calculated. In Figure 5, the red line represents the energy per frequency, and the blue line shows the estimated mean frequency of the normalized power spectrum. Figure 5a is the result of recovering the data with the first singular value. Figure 5b shows the energy per frequency for the data recovered with the second singular value. Figure 5c is the energy per frequency for the recovered data including just the third singular value. The estimated mean frequency for these first three singular values is 77 Hz. Figure 5d-f represent the energy per frequency for the recovered data including just the 4th, 5th, and 6th singular values, respectively. The estimated mean frequency for these singular values is 90 Hz. Figure 5g-i indicate the energy per frequency for the recovered data including just the 7th, 8th, and 9th singular values, respectively. The estimated mean frequency for these singular values is 88 Hz. Figure 5j-l show the energy per frequency for the recovered data including just the 10th, 11th, and 12th singular values, respectively. It is clear that most of the energy of the useful signal is recovered with the first three singular values. Nevertheless, the shift in the mean frequency from 77 to 90 Hz indicates that there is leakage toward the higher frequencies in the data. We conclude from this test that, despite the residual errors remaining in the presence of additive noise, we need to choose the rank of the block Hankel matrix to be greater than the number of linear events in each processing window of frequency and space.
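The single-component energy test described above can be sketched as follows; `component_energy` and `mean_frequency` are illustrative helpers of our own, not the paper's code. Repeating `component_energy` over all frequency slices gives the energy-per-frequency curve, which `mean_frequency` then summarizes:

```python
import numpy as np

def component_energy(M, i):
    """Energy contributed by the i-th singular component (0-based) of M:
    reconstruct M keeping only sigma_i and return its squared Frobenius norm."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    Mi = s[i] * np.outer(U[:, i], Vh[i, :])
    return np.sum(np.abs(Mi) ** 2)  # equals s[i]**2 for a single component

def mean_frequency(freqs, energy):
    """Energy-weighted mean frequency of a power spectrum."""
    return np.sum(freqs * energy) / np.sum(energy)
```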
To find the rank that best minimizes the residual errors, we tested different kinds of data containing various numbers of events with different slopes and amplitudes. The best results are obtained when the rank is chosen from the second cutoff of the singular-value spectrum instead of the first cutoff.
In all tested data, it is possible to see the first cutoff as a distinct change in the energy of the singular values. However, sometimes there is no abrupt change in the amplitude of the singular values at the second cutoff, especially in the presence of a high level of random noise; finding the second cutoff can be a challenge when the quality of the signal is poor. From a careful look at where the first and second cutoffs occur in clean and complete data, we conclude that there is a linear relationship between them. The second cutoff can be estimated as follows:

$$
k = T(\boldsymbol{\Sigma}) = \underset{i}{\arg\max}\; \frac{\sigma_i^2}{\sigma_{i+1}^2}, \qquad
\hat{k} = B_{\Sigma,k} = T(\boldsymbol{\Sigma}, k) = \underset{i>k}{\arg\max}\; \frac{\sigma_i^2}{\sigma_{i+1}^2}, \qquad (8)
$$

where σ_i is the ith singular value of the Hankel matrix in each frequency. T(Σ, k) indicates the operator that finds the rank at the point where the two following singular values become more scattered, and B_{Σ,k} indicates the second cutoff of the singular-value spectrum of the block Hankel matrix in each frequency slice. k̂ can be interpreted as the optimal rank of the block Hankel matrix that minimizes the Frobenius-norm difference between the approximated and the exact signal components.
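One possible reading of the two-cutoff criterion is to apply the maximum energy-ratio test twice, the second time restricted to indices beyond the first cutoff. The sketch below follows that interpretation; the paper's exact operator T may differ, and the function names are our own:

```python
import numpy as np

def max_ratio_cutoff(s, start=0):
    """Rank (1-based) where the energy ratio s_i^2 / s_{i+1}^2 peaks,
    searching only indices beyond `start`."""
    r = (s[start:-1] ** 2) / (s[start + 1:] ** 2 + 1e-30)  # guard tiny values
    return start + int(np.argmax(r)) + 1

def adaptive_rank(s):
    """Second cutoff: apply the max-ratio criterion again past the first cutoff."""
    k1 = max_ratio_cutoff(s)
    return max_ratio_cutoff(s, start=k1)
```

For a spectrum with two clear energy drops, `max_ratio_cutoff` returns the first drop and `adaptive_rank` the second, which is the rank the AWRR method keeps.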
Substituting Equation (8) into the rank-reduction step gives

$$
\hat{\mathbf{M}} = \sum_{i=1}^{\hat{k}} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^{H}. \qquad (9)
$$

By applying Equation (9) in the rank-reduction step of MSSA, the rank-reduced block Hankel matrix is reconstructed. The next step is averaging the anti-diagonals of the recovered block Hankel matrix, which recovers the signal in the Fourier domain for each frequency slice. The rank-reduction step gradually reconstructs the missing traces and can be used together with an iterative algorithm that replaces the missing traces with the reconstructed traces. Such an iterative algorithm is a practical approach for random noise attenuation and seismic data amplitude reconstruction. The algorithm is written as follows:

$$
\mathbf{S}_n = \alpha_n \mathbf{S}_{obs} + (\mathbf{I} - \alpha_n \mathbf{R}) \circ \mathcal{F}^{-1} \mathcal{A}\, \mathcal{R}\, \mathcal{B}\, \mathcal{F}\, \mathbf{S}_{n-1}, \qquad n = 1, 2, \ldots, \qquad (10)
$$

where the operators act on each frequency slice ω_m, I is a matrix of ones, and α_n ∈ (0, 1] is a weight factor that decreases linearly with iterations. F indicates the Fourier transform along time and F^{-1} its inverse, B denotes the Hankelization operator that generates the block Hankel matrix, R is the rank-reduction operator, and A is the averaging anti-diagonal operator. Selecting the rank using Equation (9) recovers all the signal components. One of the advantages of this rank-reduction method is that it is adaptive and data-driven: there is no need to set any parameter for the rank-reduction step in each processing window. Moreover, choosing the second cutoff instead of the first leads to a better reconstruction of the high frequencies. However, selecting large ranks leads to more residual errors in the recovered data. That is why we need to apply a weighting operator to reduce the effect of noise on the projected signal components when recovering the higher frequencies.
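The averaging anti-diagonal operator A has a simple 1-D analogue, which the 3-D case applies block-wise; a minimal sketch (our own helper, not the paper's implementation):

```python
import numpy as np

def average_antidiagonals(H):
    """Recover a length-(I+J-1) series from an I x J Hankel-like matrix by
    averaging each anti-diagonal (the inverse of 1-D Hankelization)."""
    I, J = H.shape
    out = np.zeros(I + J - 1, dtype=H.dtype)
    cnt = np.zeros(I + J - 1)
    for i in range(I):
        for j in range(J):
            out[i + j] += H[i, j]   # entries with the same i+j map to one sample
            cnt[i + j] += 1
    return out / cnt
```

Applied to an exact Hankel matrix this returns the original series; applied to a rank-reduced matrix (no longer exactly Hankel), the averaging projects it back onto the set of Hankel matrices.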
We are looking for a weighting operator Ŵ to adjust the singular values of M̂ and obtain the best estimate of the desired signal. Nadakuditi [34] proposed an algorithm for low-rank matrix denoising that can be summarized as follows:

1. M_{n×m} is the signal-plus-noise Hankel matrix.
2. k is the best effective rank that can represent the signal.
3. For i = 1 : k, compute the weight

$$
w_i = -2\,\frac{D(\sigma_i; \boldsymbol{\Sigma})}{D'(\sigma_i; \boldsymbol{\Sigma})}. \qquad (11)
$$

4. End for loop.
5. Compute the weighted low-rank estimate $\hat{\mathbf{M}} = \sum_{i=1}^{k} w_i\, \mathbf{u}_i \mathbf{v}_i^{H}$.

In this algorithm, D(σ; Σ) is computed as

$$
D(\sigma; \boldsymbol{\Sigma}) = \left[\frac{1}{n-k}\,\mathrm{Tr}\!\left(\sigma\left(\sigma^2\mathbf{I} - \boldsymbol{\Sigma}\boldsymbol{\Sigma}^{T}\right)^{-1}\right)\right] \left[\frac{1}{m-k}\,\mathrm{Tr}\!\left(\sigma\left(\sigma^2\mathbf{I} - \boldsymbol{\Sigma}^{T}\boldsymbol{\Sigma}\right)^{-1}\right)\right], \qquad (12)
$$

where D represents the D-transform and Tr(·) denotes the trace operator of the input. D' represents the derivative of D with respect to σ:

$$
D'(\sigma; \boldsymbol{\Sigma}) = \frac{\mathrm{d}}{\mathrm{d}\sigma}\, D(\sigma; \boldsymbol{\Sigma}). \qquad (13)
$$

The D-transform describes how the distribution of the singular values of the sum of independent matrices is related to the distributions of the singular values of the individual matrices [37]. Benaych-Georges and Nadakuditi [38] showed that the principal singular values and vectors of a large matrix can be separated into the singular values of the signal matrix and the D-transform of the limiting noise-only singular-value distribution. From Equation (11), the weighting operator Ŵ can be written as

$$
\hat{\mathbf{W}} = \mathrm{diag}\left(w_1, w_2, \ldots, w_{\hat{k}}\right). \qquad (14)
$$

We can substitute the weighting operator obtained from Equation (14) into Equation (9) to enhance the result of the rank-reduction step:

$$
\hat{\mathbf{M}} = \sum_{i=1}^{\hat{k}} w_i\, \mathbf{u}_i \mathbf{v}_i^{H} = \mathbf{U}_{\hat{k}}\, \hat{\mathbf{W}}\, \mathbf{V}_{\hat{k}}^{H}, \qquad (15)
$$

where M̂ indicates the reduced-rank block Hankel matrix. Missing traces can then be interpolated completely by applying the iterative algorithm. This adaptive-weighting rank-reduction (AWRR) method selects the rank of the block Hankel matrix automatically while reducing the effect of the residual errors. The weighting operator satisfies the equation below [34]:

$$
\hat{\mathbf{W}} = \underset{w_1, \ldots, w_{\hat{k}}}{\arg\min}\; \left\| \mathbf{P} - \sum_{i=1}^{\hat{k}} w_i\, \mathbf{u}_i \mathbf{v}_i^{H} \right\|_F, \qquad (16)
$$

where P is the low-rank Hankel matrix of the desired signal.

To understand the effect of the weighting operator on the singular values of the block Hankel matrices, the predefined rank-reduction method (TRR) and the weighting rank-reduction method (WRR) with predefined rank were applied to a block Hankel matrix. Figure 6a shows clean and complete 3-D synthetic data containing four linear events. Figure 6b shows the same 3-D data with SNR = 2 and 51% missing traces. Figure 6c,d show slices of the data in the inline direction for the cubes of Figure 6a and Figure 6b, respectively. Figure 6e,f are slices of the cubes of Figure 6a and Figure 6b in the crossline direction, respectively. Figure 7 shows the Hankel matrices of the data in Figure 6 for the frequency slice of 20 Hz.
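A minimal numerical sketch of the weight computation follows, under the simplifying assumption of a square matrix (so both trace factors of the D-transform coincide) and using a numerical derivative for D'; the trailing singular values stand in for the noise-only distribution, and the function names are our own:

```python
import numpy as np

def optshrink_weights(s, k):
    """Data-driven weights w_i = -2 D(sigma_i) / D'(sigma_i) for the k leading
    singular values, with D estimated from the trailing (noise-only) singular
    values of s. Square-matrix simplification of the trace form."""
    noise = s[k:]  # empirical noise-only singular values

    def phi(sig):
        # (1/(n-k)) Tr( sig (sig^2 I - noise^2)^-1 ) for the square case;
        # requires sig > max(noise), which holds for the leading singular values
        return np.mean(sig / (sig ** 2 - noise ** 2))

    def D(sig):
        return phi(sig) ** 2  # both factors of the D-transform coincide

    def Dprime(sig, h=1e-6):
        # central-difference approximation of dD/dsigma
        return (D(sig + h) - D(sig - h)) / (2 * h)

    return np.array([-2.0 * D(si) / Dprime(si) for si in s[:k]])
```

The weight w_i acts as a shrunk replacement for σ_i: for a singular value far above the noise level, w_i stays close to σ_i, while values near the noise edge are shrunk strongly toward zero.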
Figure 7a is the block Hankel matrix of the clean and complete data. Figure 7b corresponds to the block Hankel matrix of the data with SNR = 2 and 51% missing traces. Figure 7c shows the block Hankel matrix after applying the predefined rank = 12 for ten iterations. Figure 7d shows the block Hankel matrix after applying the weighting operator to adjust the singular values of the data using TSVD with predefined rank = 12 after ten iterations. We can see from Figure 7 that the block Hankel matrix of the WRR method is smoother than the result of applying the predefined rank = 12 alone after ten iterations. Figure 8 corresponds to the spectrum of the first 25 singular values of the block Hankel matrices of Figure 6. Figure 8a represents the singular-value spectrum of the block Hankel matrix of the clean and complete data for the frequency of 20 Hz; the first three singular values indicate the useful signal that relates to the coherent events in the data. Figure 8b shows the singular-value spectrum of the noisy and incomplete data. Figure 8c shows the singular-value spectrum of the noisy and incomplete data after applying TSVD with rank = 12. Figure 8d shows the singular-value spectrum of the noisy and incomplete data after applying TSVD with rank = 12 and the weighting operator. We can see that the singular spectrum of the data after applying the weighting operator is much closer to the spectrum of the desired signal in Figure 8a. Figure 9 shows the corresponding block Hankel matrices for the frequency slice of 50 Hz. Figure 9a shows the block Hankel matrix of the clean and complete data, and Figure 9b is the block Hankel matrix of the data with SNR = 2 where 51% of the traces are removed. Figure 9c displays the recovered block Hankel matrix after applying TRR with rank = 4 for ten iterations. Figure 9d shows the block Hankel matrix after applying WRR with predefined rank = 4. Figure 9e represents the recovered block Hankel matrix of the adaptive rank-reduction (ARR) method, and Figure 9f is the block Hankel matrix recovered by the weighting ARR (AWRR) method.
Generally, the results of AWRR and WRR are smoother than the outputs of TRR and ARR, and the results of ARR and AWRR retain more detail than those of TRR and WRR. Figure 10 corresponds to the spectrum of the first 25 singular values of the block Hankel matrices of Figure 9. Figure 10a represents the singular-value spectrum of the block Hankel matrix of the clean and complete data for the frequency of 50 Hz. The abrupt drop in energy between the third and fourth singular values relates to the useful signal; however, there are still further nonzero singular values related to the useful signal. Figure 10b is the singular-value spectrum of the block Hankel matrix of the noisy and incomplete data. Figure 10c shows the singular-value spectrum of the noisy and incomplete data after TRR with rank = 4. Figure 10d displays the singular-value spectrum of the noisy and incomplete data after applying WRR with rank = 4. Figure 10e depicts the singular-value spectrum after applying ARR. Figure 10f shows the singular-value spectrum of the noisy and incomplete data after applying the AWRR method. From the figures, one can see that the result of AWRR is the closest to the corresponding spectrum of the desired signal in Figure 10a.
Figure 9. Block Hankel matrices for the frequency of 50 Hz for (a) clean and complete data and (b) data with SNR = 2 and 50% missing traces; (c) block Hankel matrix recovered by TRR with rank = 4 after ten iterations; (d) data recovered by WRR after ten iterations; (e) data recovered by applying the ARR method for ten iterations; and (f) data recovered after applying AWRR for ten iterations.

Results and Discussion
Several evaluations are presented in this section to assess the performance of the different rank-reduction methods for the MSSA algorithm. First, the methods were tested on a synthetic shot gather containing nine hyperbolic events with different curvatures. Then, they were tested on a shot gather from a 3-D field dataset. To judge the reconstruction results numerically, we use an interpolation quality factor (QF) defined by

$$
\mathrm{QF} = 10 \log_{10} \frac{\lVert \mathbf{d}_0 \rVert_F^2}{\lVert \mathbf{d}_0 - \mathbf{d}_f \rVert_F^2}, \qquad (17)
$$

where d_0 is the clean and complete data, and d_f is the result after applying an interpolation algorithm. Figure 11 shows the result of applying the previously discussed rank-reduction methods to the synthetic shot gather with nine hyperbolic events of different curvatures. The first test is a cube of 100 inlines and 11 crosslines with SNR = 2 and 60% decimated traces. We chose a local window of 23 traces in the inline direction and 11 traces in the crossline direction for each method, with half-window overlap in each direction. We set the number of iterations to be constant for all of the methods and the frequency range for the application of the MSSA algorithm to 1 to 100 Hz. The rank of TRR and WRR was set to nine. Figure 11a is the desired data arranged into a 2-D matrix. Figure 11b shows the input data with SNR = 2 and 60% missing traces. The results of applying the TRR, ARR, WRR, and AWRR methods are shown in Figure 11c,e,g,i, respectively. Figure 11d,f,h,j are the residual errors of Figure 11c,e,g,i, respectively. All four methods recovered the signals with correct amplitudes. However, in the TRR and ARR results, we can see significant residual errors. The AWRR result is much cleaner than those of TRR, ARR, and WRR. The input data quality factor is QF = −1.44 dB. The output quality factors of the TRR, ARR, WRR, and AWRR methods were QF = 5.23, 6.58, 7.56, and 8.12 dB, respectively. Figure 12 compares the f − k spectra of the discussed rank-reduction methods. Figure 12a shows the f − k spectrum of the clean data.
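The QF metric is straightforward to compute; a short NumPy sketch (our own helper):

```python
import numpy as np

def quality_factor(d0, df):
    """Interpolation quality factor in dB:
    QF = 10 log10( ||d0||_F^2 / ||d0 - df||_F^2 )."""
    return 10.0 * np.log10(np.sum(np.abs(d0) ** 2) / np.sum(np.abs(d0 - df) ** 2))
```

A perfect reconstruction gives QF → +inf, while a residual as energetic as the data itself gives QF = 0 dB, so higher values mean better recovery.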
Figure 12b represents the f − k spectrum of the data with SNR = 2 and 50% missing traces. Figure 12c-f are the f − k spectra of the TRR, WRR, ARR, and AWRR methods, respectively. To evaluate the stability of each algorithm, we ran the methods with different percentages of gaps and different noise realizations for an SNR of two, keeping the other parameters constant for each run. Figure 13a shows the interpolation quality factor of each method versus the gap ratio. The length of the error bars is the standard deviation of the interpolation quality factor, and each colored line connects the mean values of the interpolation quality factor for each method at each gap ratio. As we can see in Figure 13a, both the input and output quality factors decrease with increasing gap ratio. Moreover, the output quality-factor curve of the AWRR method is always above the other curves.

Synthetic Data
Figure 13. (a) The mean and standard error of the quality of reconstruction versus gap ratio for various methods. These results were obtained by running each method on 20 noise realizations of the dataset with different gap ratios and rank = 9 for the methods requiring a predefined rank.
(b) The mean and standard error of the reconstruction quality versus SNR for various methods. These results were obtained by running each algorithm on 20 noise realizations of the dataset with a gap ratio = 50%, and rank = 9 for the methods requiring a predefined rank.
In the next experiment, the stability analysis was repeated to test the sensitivity of each method to additive noise. The test was performed by changing the signal-to-noise ratio, with 20 different noise realizations per SNR, for a gap ratio of 50%. We set the other parameters to be constant for each run. In Figure 13b, we can see that the proposed method (AWRR) outperforms the other methods even when the signal quality is poor. Table 1 lists the mean and standard deviation of the interpolation quality factor of each method for each gap percentage. We can see that the average AWRR quality factor is higher than for the other methods. Among the discussed methods, AWRR also has lower standard deviation values, which indicates that the values tend to be close to the mean, especially when the data are sparser. Table 2 lists the mean and standard deviation of the output quality factor of each method for each input signal-to-noise ratio.

Real Field Data
This experiment tested the efficiency of the methods on a shot gather from a 3-D field dataset. It is easy to inspect traces in shot/receiver gather displays for poor receivers or bad shots, which result from logistic constraints during the seismic survey. Regarding the input data, the best results are obtained with NMO-corrected data [6]. However, in this experiment, we applied the MSSA interpolation before NMO correction to see the effect of the algorithm on the curvature. Figure 14 shows the acquisition coordinates of the shot gather. The red star represents the location of the shot, and the black dots indicate the receiver locations. Figure 14a shows the initial distribution of the traces in the shot gather. For this experiment, 41% of the traces were removed. Figure 14b illustrates the geometry of the input traces after removing 41% of them. This test applied the MSSA algorithm using the AWRR and TRR methods in the rank-reduction step. Figure 15a shows the input cube of data with the missing traces. Figure 15b shows the result of the interpolation using TRR in the rank-reduction step, and Figure 15c shows the result of using AWRR. For both tests, the number of iterations and the processing window remained unchanged. For the TRR approach, the predefined rank was set to rank = 10. Figure 16 represents the cube of data arranged into a 2-D matrix. Figure 16a shows the input data. Figure 16b shows the result of applying TRR. Figure 16c shows the result of applying AWRR. Figure 17 corresponds to a patch of data from time 1.35 s to 1.75 s and trace numbers 160 to 190, rearranged into a 2-D matrix. Figure 17a shows the input data. Figure 17b shows the result of applying TRR. Figure 17c shows the result of applying AWRR.

Conclusions
We proposed an adaptive weighting rank-reduction (AWRR) method based on exploring the singular values of the block Hankel matrix in each frequency slice. The method chooses the maximum energy ratio between two consecutive singular values as the criterion to define the optimal rank. For 3-D data, the block Hankel matrices at higher frequencies require higher ranks to be recovered; consequently, the method selects the second cutoff in the singular-value spectrum. AWRR combines an adaptive rank-reduction method with a cascaded weighting operator. It is adaptive in selecting the rank of the block Hankel matrices, and the weighting operator reduces the effect of the additive random-noise projection on the data, resulting in smoother block Hankel matrices. Furthermore, for different types of data, both linear and curved, extensive simulations with different SNRs and gap percentages showed that the proposed algorithm yields a higher output SNR and a better reconstruction of the data than the predefined rank-reduction methods. Most rank-reduction interpolation methods, such as MSSA, are SVD-based in their rank-reduction step. The SVD of a Hankel matrix is a very time-consuming operation, and the iterative approach in the interpolation part of MSSA makes the algorithm more time-consuming still. To make AWRR in the MSSA algorithm applicable to higher dimensions, we recommend moving from the SVD-based method to an SVD-free rank-minimization method.