Reliable Denoising Strategy to Enhance the Accuracy of Arrival Time Picking of Noisy Microseismic Recordings

We propose a method to enhance the accuracy of arrival time picking of noisy microseismic recordings. A series of intrinsic mode functions (IMFs) of the microseismic signal are initially decomposed by employing the ensemble empirical mode decomposition. Subsequently, the sample entropy values of the obtained IMFs are calculated and applied to set an appropriate threshold for selecting IMFs. These are then reconstructed to distinguish between noise and useful signals. Ultimately, the Akaike information criterion picker is used to determine the arrival time of the denoised signal. Test results using synthetic noisy microseismic recordings demonstrate that the proposed approach can significantly reduce picking errors, with errors within the range of 1–3 sample intervals. The proposed method can also give a more stable picking result when applied to different microseismic recordings with different signal-to-noise ratios. Further application in real microseismic recordings confirms that the developed method can estimate an accurate arrival time of noisy microseismic recordings.


Introduction
As a reliable and effective monitoring tool, the microseismic method has been extensively used in various domains such as rock bursts, hydraulic fracturing monitoring, or mining safety [1][2][3].Precise arrival time identification is crucial to obtaining an accurate microseismic location result.High-quality estimates of arrival times can significantly enhance the efficiency of the microseismic monitoring process [4,5].Currently, numerous well-establish pickers are available for evaluating the arrival times of high-quality microseismic recordings, including the autoregressive Akaike information criterion (AIC) and various improved pickers [6][7][8][9], short/long time averaging (STA/LTA) [10], modified energy ratio (MER) [11,12], modified Coppens' method (MCM) [13,14], and higher-order statistics methods which involving skewness and kurtosis algorithms [15,16].Additionally, various reliable approaches are designed to suppress the noise in noisy microseismic recordings and develop joint picking methods.The filter algorithms are primarily based on the popular linear digital filter, polarization filter, wavelet transform, etc. [17][18][19][20].Recently, some new approaches based on deep learning have been proposed to achieve reliable arrival time estimations, which benefit from the modern computer platform with highperformance [21][22][23].All the aforementioned picking approaches have made significant contributions to enhancing the precision of arrival time identification in different historical periods, and several algorithms remain the most widely used nowadays.
However, digital filters usually need to set different parameters based on the different principal frequencies of the microseismic data [24,25].Polarization filters need to set a reasonable sliding time window combining with the principal frequency of microseismic signals [26,27].Moreover, predicting the corresponding principal frequency in microseismic recordings with high noise is difficult.Similarly, wavelet transforms also need to set different parameters according to the different characteristics of microseismic data, such as wavelet basis, wavelet decomposition level, etc. [28,29].Traditional machine learning methods depend strongly on some hand-engineered features of the sample dataset, requiring training predictive models through analyzing these crucial features [30,31].Nevertheless, obtaining some apparent quantitative features from microseismic data might be challenging.As the most representative branch of machine learning algorithms, deep learning can construct reliable prediction models using some crucial features extracted through self-learning.However, deep learning relies heavily on well-designed training datasets, and each sample trajectory should contain one or more effective microseismic events in the training dataset [32,33].Moreover, deep learning involves substantial computational complexity, requiring corresponding high-performance computing platforms.
Motivated by the aforementioned facts and taking into account the noisy environments encountered during in-situ monitoring as well as the intrinsic non-linearity and nonstationary nature of microseismic signals, the extraction of undistorted, useful signals from high-noise microseismic data, followed by the application of a robust picker, emerges as a reasonable approach.In this study, we introduce a reliable denoising algorithm employing sample entropy (SampEn) and ensemble empirical mode decomposition (EEMD), capable of obtaining high-quality undistorted signals.Furthermore, this denoising method can effectively overcome the influence of external monitoring environments, further ensuring that microseismic monitoring technology can be applied across various domains, such as rock bursts, hydraulic fracturing monitoring, or mining safety.

Ensemble Empirical Mode Decomposition
Applying the empirical mode decomposition (EMD) to the microseismic recording adaptively produces a family of intrinsic mode functions (IMFs) in descending.The decomposed IMFs correspond to the different frequency compositions of the original signal [34].However, this process involves a drawback of mode mixing, resulting in inaccurate IMF components, depriving the IMFs of their intrinsic single-scale features, and forming an oscillation in mixed scale [35].To address this, Wu and Huang [36] developed the ensemble EMD (EEMD) method, which can effectively avoid mode aliasing using the noise-assisted approach.The establishment process of EEMD is as follows: Step 1: Obtain the target signal X i (t) by mixing the original signal x(t) with the white Gaussian noise w i (t): where M represents the total of the mixed noise, and ε is its amplitude.
Step 2: Decompose the target signal X i (t) into a family of IMFs (C ji (t)) employing EMD: where r ni (t) is the residual of X i (t), which is the residual of that the n IMFs are gained.
Step 3: Repeat the above two steps M times by mixing various noise sequences to the original data.
Step 4: The actual individual IMFs are obtained using the (ensemble) average values of the decomposed IMFs: As an effective signal decomposition method, EEMD is a typical application combing with the noise-assisted preprocessing method [37,38].Considering the specialized procedure of EEMD, both a reasonable loop time and an appropriate amplitude of the added white noise are essential during the execution of EMD.Generally, to alleviate the impact of the added noise on the IMFs, the amplitude of the added noise should be minimized.However, if the amplitude of the added noise is too low, it will not produce extremal changes in the mixed data during critical steps of the EEMD computation [36].Moreover, a proper ensemble counts yields reliable results, maintaining the error induced by the added noise at a sufficiently diminutive or even insignificant level [35].Accordingly, adhering to Wu and Huang [36]'s recommendation, we selected an amplitude for the added white noise ε, approximately equal to 0.2 times the standard deviation of the decomposed data, and its ensemble counts N is 100.To verify the decomposition efficacy of the EEMD method in decomposing microseismic data, we established a synthetic microseismic recording with a signal-to-noise ratio (SNR) of −15 dB and adaptively decomposed it into 11 components.As shown in Figure 1, the decomposition result demonstrated that the decomposed IMFs sequentially correspond to different frequency compositions of the original signal in descending order.The distinguishable IMFs facilitate the extraction of useful signals from complex microseismic recordings.After obtaining a series of IMFs employing the EEMD method, we need to determine those IMFs that are substantively useful for reconstructing new microseismic signals.This necessitates the delineation of a valid and reliable boundary to discriminate between noise and useful signals within noisy microseismic recordings.Our study employs Sample Entropy (SamEn) to facilitate this process.

Sample Entropy
The SamEn represents an improved measure method for time series complexity based on approximate entropy (ApEn) [39,40], and was widely applied to evaluate the complexity of time sequences, as well as in diverse domains such as medicine and economics.Notably, the SamEn algorithm is particularly suitable for judging the complexity of time-series signals, which means that the smaller the SamEn value, the higher self-similarity and the lower the noise [41].In this investigation, the SamEn value is used to determine the randomness of microseismic signals, thereby facilitating the further distinguishing noise signals and microseismic events.
The procedures for calculating the SamEn value of the aforementioned IMFs are as follows: Step 1: Define C f (i) as the obtained IMFs sequences using the EEMD algorithm, (i = 1, 2, . . ., N), f represents the total of IMFs.
Step 3: Calculate the distance between each sequence of the X(i) and all the k groups of X(j).This distance is defined as: Step 4: Define T = r × SD, where r is set as 0.1-0.25, and SD represents the standard deviation of the calculated sequences.Then, count the number that the d ij is higher than T by row, noted as B m i .Define the A m i as: The mean of all the A m i is defined as: Step 5: m = m + 1, repeat the steps 1-4.
Step 6: Define the sample entropy as: From the definition of SampEn, it is evident that the m and T are two important factors in its computation.Guided by [42], established values of m are 1 or 2, while T is conventionally set between 0.1 and 0.25 times the standard deviation of the signal under consideration.Consequently, to distinguish the different characteristics of the decomposed IMFs, we computed the SampEn of the IMFs utilized in Figure 1, the corresponding parameters m is set as 1, and T is 0.15 × std(I MF).The SampEn values for different IMFs are illustrated in Figure 2, revealing a distinguished decrease in SampEn values with ascending IMFs.In other words, this further reveals that the higher-order IMF components exhibit lower complexity and potentially contain less noise.These typical characteristics in Figure 2 may assist in extracting useful microseismic signals.

Microseismic Signal Extraction
Upon applying EEMD to microseismic recordings to obtain a series of IMFs, these IMFs are adaptively compiled in descending order based on the different frequency components of the decomposed signal.Subsequently, we compute the SampEn values of the different IMFs, illustrating their respective complexities.In order to extract the useful IMF components, a reasonable threshold needs to be determined to eliminate redundant IMF components.Empirically, IMF components with SampEn values exceeding 0.2 are discarded.Finally, by accumulating the residual IMF components, the ultimate microseismic signal is reconstructed.
To further validate the capability of the developed signal extraction algorithm, we conducted spectral and time-frequency analyses.Figure 3 demonstrates that the principal frequency of the reconstructed data are consistent with the corresponding original data, and the principal frequency of the original data are preset with a stationary frequency while being synthesized.Moreover, the high-frequency information can be significantly eliminated while the low-frequency composition is reliably preserved.Additionally, the reconstructed data can avoid phase distortion (the P phase arrival point of the synthetic data are preset at 813).Consequently, the extracted data can be reasonably applied to verify onset times.

The Improved Picking Algorithm
In this study, we apply the AIC picker to estimate the arrival time of the extracted microseismic data.A robust AIC picker named M-AIC is employed to perform the AIC calculation, an approach developed by Maeda [7].Furthermore, considering the picking speed, the AIC calculation time window (TW) should be determined to conserve the picking consumption.The specific picking procedures are as follows: Step 1: Data extraction.Utilizing the EEMD and SamEn to decompose and reconstruct the noisy microseismic recording, thus extracting high-quality, useful microseismic signals.
Step 2: Segment selection of the M-AIC Picker.For clean microseismic recordings, the arrival time should be located prior to the peak signal amplitude.In this sense, and considering the computing consumption, the point of maximum signal amplitude is taken as the cutoff of the calculated data segment.
Step 3: Arrival time identification.The M-AIC picker is employed to the selected data segment of the extracted microseismic data using the algorithm as mentioned above, subsequently determining the first arrival at the minimum value within the obtained M-AIC sequences.
To exhibit the efficacy of the developed picking method, the improved algorithm is applied to synthetic microseismic recordings that were integrated with different kinds of noise.Figure 4 illustrates the results obtained by executing the conventional M-AIC picker on the constrained calculation segment of the reconstructed signal, the entire denoised signal, and the original noisy signal, respectively.Evident from Figure 4, the reconstructed denoised signal can reliably substitute the original signal for the preliminary picking calculation.In other words, the proposed method can achieve accurate arrival time picking of microseismic recordings with high noise.

Picking Results Comparison of Different Approaches Using Synthetic Microseismic Recordings
In order to evaluate the validity of the improved method in this study, comparative tests were conducted utilizing different picking approaches, including the proposed algorithm, the popular M-AIC picker, and the conventional STA/LTA method.Akram and Eaton [43] provided a reasonable time window (TW) to choose the criterion for the STA/LTA method.In addition, the aforementioned approaches are applied to different synthetic microseismic recordings mixed with various noise, and the accumulated noise is given with different SNRs (SNRs are set between 10 dB and −15 dB).We design 100 distinct microseismic recordings using diverse noise at each SNR to derive average picking results of the abovementioned methods.Figure 5 demonstrates that the proposed algorithm can yield accurate and reliable picking results across varying SNRs for white and colored (blue) noise.However, SNR when the SNR is below −1 dB, traditional M-AIC, and STA/LTA pickers, despite de-noising the original signal, fail to obtain accurate arrival times.Specifically, the M-AIC picker exhibits more minor picking errors while being applied to the microseismic data, which has been de-noised utilizing the developed de-noising strategy.Furthermore, the STA/LTA picker is easier to involve in uncertain picking results since the TW length relies heavily on the frequency features of the microseismic data, which necessitates dynamic adjustments for different microseismic recordings [44].In addition, stability represents an important indicator in evaluating the efficacy of the proposed picking method.Figure 6 is a three-dimension plot of statistical curves, derived from the computation results of 100 picking results using different picking approaches, based upon synthetic microseismic recordings at five distinct SNRs (with the first arrival point preset at 813).These results reveal that, compared to other popular pickers, the new picker can achieve more stable and accurate picking results at the designated SNRs of 10 dB, −1 dB, −5 dB, −8 dB, and −10 dB.Moreover, the new method has several random mutational picking results at −10 dB, possibly due to the effective microseismic signal and the added random noise having an approximate principal frequency.The overall trend of Figure 6 indicates that the picking results from the microseismic data with added colored (blue) noise fluctuate more than those from the microseismic data with added white Gaussian noise.In addition, when the SNR is set to 10 dB, the proposed picker does not fluctuate when the synthetic data are mixed with white Gaussian noise, but with the colored (blue) noise, the new picker has an error of 3 sampling intervals.This issue might be attributed to the frequency approximation between the useful microseismic event and the colored (blue) noise.Furthermore, while the STA/LTA algorithm can maintain better stability compared to the M-AIC algorithm, its picking accuracy is not as good as the latter.

Application on Real Data
We conducted comparative tests on real microseismic data, thereby further confirming the effectiveness of the new method in estimating the arrival time of microseismic recordings.The real microseismic recordings were collected from a deeply buried tunnel excavation monitoring project.The configuration of microseismic monitoring consists of two rows spaced about 10 m apart with eight accelerometers, which were deployed on the sidewalls of the tunnel.The sample rate of the microseismic monitoring system was set to 6000 Hz.
The comparison process involves applying the proposed algorithm and the M-AIC method directly to the original and denoised microseismic data.As shown in Figure 7, several intuitive microseismic events are displayed in the original Z component, which can serve as useful location events.However, the denoising results from Z-denoising show several random pseudo-microseismic events, which might originate from field environmental noise.In summary, the proposed approach can achieve reliable first arrival identification.In contrast, we cannot determine a reasonable arrival time when the M-AIC method is applied to the original microseismic recordings.To evaluate the computational time consumption of the new method, we apply different picking approaches to process the same microseismic recording.The recording contains 1600 sampling points, and the test was conducted on Windows 10 with 8 GB RAM and a 3.5 GHz processor.The results show that the time required for the new method, M-AIC (A), STA/LTA (A), M-AIC (O), STA/LTA (O) are 12.81 s,12.79 s, 12.83 s, 0.046 s, and 0.02 s, respectively.Clearly, picking the first arrival after denoising consumes more time than the picking process of original microseismic data because the denoising process requires a significant amount of computational time.Notably, sacrificing time consumption to enhance the accuracy of first arrival estimation might be an acceptable method, especially for the effective microseismic signal extraction using EEMD and SamEn.Furthermore, considering the performance of advanced computational terminals, time cost may not restrict the application of the proposed automatic arrival time estimation method.
In addition, to validate the effectiveness of the proposed method, we conducted a comparative test using the abovementioned real microseismic recording, comparing traditional finite impulse response (FIR) filter (based on the Gauss window with 21 orders, and the cut-off frequency is set as 100 Hz), wavelet hard threshold denoising (using the heuristic thresholding rule 'heursure' rule, db4 wavelet basis, and the decomposition level is 3), and the proposed strategy.Comparison results demonstrate that the proposed approach can achieve a better denoising result, a higher signal-to-noise ratio (SNR), and a clear first arrival.The SNR for the original microseismic recording, the denoised signal of the traditional FIR filter, the denoised signal of the wavelet hard threshold denoising method, and the denoised signal of the proposed strategy are 0.0013782 dB, 0.0093141 dB, 0.0054686 dB, and 0.016775 dB, respectively.When calculating the SNR, the mean square values are used to estimate signal and noise power.

Conclusions
This paper presents an effective method for estimating the arrival time of high-noise microseismic data, aiming to alleviate the influences of strong noise and data segment selection.Combined with the characteristics of microseismic recordings, the microseismic signal is subjected to EEMD, yielding a series of IMFs.Subsequently, we utilize the threshold determined by the SamEn values of different IMFs to select several useful IMF components.IMF components with SamEn values greater than 0.2 are removed.The selected IMF components are then reconstructed into the final microseismic signal, which is used to construct the picking target signal.Ultimately, the M-AIC algorithm is employed to estimate the arrival time of the constrained computation data segment, which is determined by setting the maximum point of the reconstructed signal amplitude.

Figure 1 .
Figure 1.The decomposition result of the synthetic microseismic data using the EEMD method.The principal frequency of the synthetic P-wave is 100 Hz, and a sampling rate of 4 kHz using Rick wavelet, with an SNR of −15 dB.

Figure 2 .
Figure 2. The SamEn value of the aforementioned decomposed IMFs.

Figure 3 .
Figure 3.Comparison of Time-frequency analysis and spectrum analysis.(a) Synthetic noisy microseismic data with a principal frequency of 100 Hz and a sampling rate of 4 kHz using Rick wavelet, added with white Gaussian noise at an SNR of −15 dB.(b) The reconstructed microseismic data using the selected IMFs.

Figure 4 .
Figure 4. Arrival time estimation of different picking regulations employing the M-AIC picker for synthetic data.(Ori) represents the noisy synthetic signal, (Pre) is the denoised signal, the red dotted line represents the picking result of the M-AIC picker for the noisy signal, the red solid line represents the picking result of the proposed method, and the blue dotted line is the picking result of the M-AIC picker for the denoised signal.(a) Synthetic signal with white Gaussian noise (SNR = −15 dB).(b) Synthetic signal with colored (blue) noise (SNR = −15 dB).

Figure 5 .
Figure 5. Arrival time evaluation results using different picking approaches for the noisy data.(O) means the original noisy signal, and (A) is the denoised signal.(a) Synthetic data added with white Gaussian noise.(b) Synthetic data added with colored (blue) noise.

Figure 6 .
Figure 6.Stability comparison using different picking approaches for synthetic microseismic data with different SNRs, A and B under each subgraph are the corresponding enlarged picking results of the red dotted box.(O) means the original noisy signal, and (A) is the denoised signal.(a) Synthetic data added with white Gaussian noise.(b) Synthetic data added with colored (blue) noise.

Figure 7 .
Figure 7. Arrival time identification results of different picking regulations for the real microseismic recording.The red solid line represents the picking result of the proposed method.