A Waveform Image Method for Discriminating Micro-Seismic Events and Blasts in Underground Mines

The discrimination of micro-seismic events (events) and blasts is significant for monitoring and analyzing micro-seismicity in underground mines. To overcome the drawbacks of conventional discrimination methods, a waveform image discriminant method is proposed. Principal component analysis (PCA) was applied to extract the raw features of events and blasts from their waveform images, which were established from the recorded field data, and to transform them into new uncorrelated features. The amount of initial information retained in the derived features is determined quantitatively by the contribution rate. Binary classification models were established using the support vector machine (SVM) algorithm and the PCA-derived waveform image features. Results of four groups of cross validation show that the optimal values of the event accuracy, the blast accuracy, the total accuracy, and the quality evaluation parameter MCC are 97.1%, 93.8%, 93.60%, and 0.8723, respectively. Moreover, the computation efficiency per accuracy (CEA) was introduced to quantitatively evaluate the effects of the contribution rate on classification accuracy and computation efficiency. The optimal contribution rate was determined to be 0.90. The waveform image discriminant method can automatically classify events and blasts in underground mines, ensuring the efficient establishment of high-quality micro-seismic databases and providing adequate data for the subsequent seismicity analysis.


Introduction
Micro-seismic monitoring has proven effective in underground mines worldwide for providing information about the local stress state of rock [1]. It also provides information that can be used to understand the behavior of rock mass [2,3], to prevent rock mass instability and rock burst [4,5], and to assess the potential hazards induced by intensive deep mining activities [6,7]. The discrimination of micro-seismic events and blasts in underground mines is one of the most important issues for robust and efficient micro-seismic monitoring [8], as it directly affects the quality of the micro-seismic database [9,10]. A reasonable and reliable interpretation of the underground mining process can be obtained only from pure micro-seismic event data rather than from mixed data of micro-seismic events and blasts [11]. Otherwise, the location accuracy of micro-seismic events will be seriously reduced [12]. Moreover, the resolution of passive tomography will suffer because it relies on the accurate localization of micro-seismic events. Subsequently, the related analysis of rock deformation and stress evolution may be ineffective. Therefore, the false discrimination of micro-seismic events and blasts may result in an unreasonable assessment of seismic hazards, such as a fictitious region of high seismic stress [13]. The discrimination of micro-seismic events and blasts in underground mines is thus a fundamental and significant problem to be solved.
Manual discrimination has been widely used to classify micro-seismic events and blasts based on the blast time and the visual judgment of waveforms. However, a large amount of monitoring data usually has to be processed routinely, which results in long discrimination times, low work efficiency, and delayed classification results. In addition, data analysts need professional knowledge and practical experience, and their technical level affects the classification results. Manual discrimination is therefore inefficient and hinders the fast analysis of seismic sources, the determination of rock mass conditions, and the assessment of seismic hazards in underground mines.
Different seismic source parameters have been selected to establish discrimination models for micro-seismic events and blasts by using statistical methods. Malovichko [33] applied the maximum-likelihood Gaussian classifier to discriminate micro-seismic events and blasts, for which the source parameters including time of occurrence, radiation pattern, ratio of high- and low-frequency radiation, and source repetition to the neighboring waveforms were selected as indicators. Vallejos and McKinnon [34] selected 13 seismic source parameters provided by the full-waveform system. Through the comparison of classification accuracy, they found that the neural network models outperformed the current approach, indicating good classification performance of machine learning methods. Based on the seismic source parameters proposed in Dong et al. [13], Dong et al. [35] applied the logistic and log-logistic distributions to establish probability density functions for the origin time of blasts and the origin time difference (OTD) of neighboring blasts in the time domain. Then, the Fisher classifier, naive Bayesian classifier, and logistic regression were used to establish discrimination models with explicit functions. However, the values of seismic source parameters may be unstable with the change of source coordinates, wave velocity, and location error. In summary, the discrimination methods using seismic source parameters have four prominent disadvantages that may lead to poor discrimination results. Firstly, there are dozens of initial seismic source parameters in mine seismicity, which mainly include location error ($E_{rr}$), origin time of seismic records ($t_0$), source radius ($R$), number of triggered sensors ($N_s$), moment magnitude ($M_m$), seismic moment ($M_0$), total radiated energy ($E_0$), corner frequency ($f_c$), maximum displacement (MD), peak velocity parameter (PV), etc. Many more derived parameters can be obtained by applying various functions and by combining and transforming different variables. Because statistical analysis is the most common method for selecting parameters with good discrimination performance, the workload for analyzing each initial and derived parameter is heavy. Secondly, the combinations of acceptable parameters corresponding to different discrimination models are various and complex. Thirdly, the importance of each parameter is usually not quantified, and the established classification models lack a process for eliminating the correlations between different parameters. Fourthly, the selection of seismic source parameters and distribution functions sometimes relies on experience and subjective judgment.
As for the waveform spectrum analysis, its advantage is that the different characteristics of micro-seismic events and blasts can be analyzed intuitively [15]. Nevertheless, its analysis object is the waveform recorded by each sensor, so the workload is heavy because numerous sensors are triggered by a micro-seismic event or a blast. Zhao et al. [36] selected the repetition of waveforms, tail decreasing, dominant frequency, and occurrence time as the discrimination indicators. Besides, for micro-seismic events and blasts, by considering the differences in the time needed to reach the first and the main peak and in the amplitude distribution, the slope values of two regression lines (one corresponding to the first peak and the other to the main peak) were extracted as the characteristic parameters for waveforms. Ma et al. [37] proposed two discrimination approaches, where one extracted features from seismic sources (Approach I) and the other utilized waveform characteristics (Approach II). The results showed that 97.1% of cases were correctly classified by Approach II, while the accuracy of Approach I was only 83.5%. Both studies suggest that waveform characteristics can provide useful information for effective discrimination. Three main disadvantages of waveform spectrum analysis can be concluded from the above review. Firstly, the workload of waveform spectrum analysis is heavy, because the analysis object is the waveform recorded by each sensor and many sensors will be triggered by a single micro-seismic event or a blast. Secondly, in general, only the characteristics of the P-wave are considered, instead of the full waveform. Thirdly, the importance of each parameter is usually not quantified, and the established classification models lack a process for eliminating the correlations between different parameters, a disadvantage shared with many discrimination methods using seismic source parameters.
In this paper, we developed an effective waveform image method for discriminating micro-seismic events and blasts in underground mines. Firstly, we established the waveform image databases of micro-seismic events and blasts from the full waveform data. Secondly, we used PCA [38][39][40] to extract the original image features and obtained new uncorrelated features with quantitative importance and lower dimensions. Thirdly, we developed the discrimination models by utilizing the support vector machine (SVM) algorithm [41][42][43] and the PCA-derived features. Finally, we analyzed the discrimination results of cross validations, quantitatively evaluated the effects of the contribution rate on classification accuracy and computation efficiency, and discussed the further application, advantages, and disadvantages of the proposed discriminant method. Figure 1 illustrates the proposed waveform image method for discriminating micro-seismic events and blasts in underground mines, which is mainly divided into four steps.

Materials and Methods
The four steps of the proposed method (Figure 1) are as follows.

1. Establishment of waveform image databases: The waveform image databases of micro-seismic events and blasts are established from the full waveform data.
2. Principal component analysis: PCA is applied to extract the original waveform image features and to transform them into new uncorrelated features with quantitative importance and lower dimension, where the amount of initial information retained in the derived features is determined by the contribution rate. Thus, PCA can reduce the number of input features and improve the classification efficiency.
3. Establishment of discrimination models: The SVM algorithm is selected to establish discrimination models for micro-seismic events and blasts in underground mines by utilizing the PCA-derived waveform image features. Then, the discrimination models are used to classify the test sets.
4. Application of the discrimination results: The micro-seismic data are applied to locate micro-seismic events, to analyze the local stress state of rock, and to assess potential hazards in the underground mining area.

Establishment of Waveform Image Databases
The generation of waveforms, the determination of a reasonable and unified signal duration for all the waveforms, as well as the generation and definition of waveform images are the key issues for establishing waveform image databases of micro-seismic events and blasts.
Firstly, the micro-seismic monitoring system installed in the Yongshaba underground mine is composed of 28 sensors that measure the ground velocity at a sampling frequency of 6000 Hz, i.e., the ground velocity is measured 6000 times per second. Hence, the waveforms of micro-seismic events and blasts can be produced from the time data and the corresponding ground velocity, where the x-axis and y-axis represent waveform time (s) and velocity amplitude (m/s), respectively.
Secondly, the signal durations of a micro-seismic event and a blast are expected to differ because of the difference in their energy release [36]. Here, we quantify the distributions and percentages of different signal durations for micro-seismic events and blasts, which can subsequently be used to determine a reasonable and unified signal duration for all the waveforms. Figure 2 shows the distributions and percentages of lg(t) for micro-seismic events and blasts, where t denotes the signal duration.

It can be seen that the waveforms of micro-seismic events and blasts take numerous different signal durations, each accounting for a different percentage. Therefore, it is necessary to determine a reasonable and unified signal duration for all waveforms, which avoids the changes in image features and the inaccuracy of subsequent discrimination models caused by the different signal durations of micro-seismic events and blasts. Let $t_{ei}$ ($i = 1, 2, \ldots, n_e$) and $t_{bj}$ ($j = 1, 2, \ldots, n_b$) denote the signal duration of the i-th micro-seismic event and the j-th blast, respectively, where $n_e$ and $n_b$ indicate the total numbers of micro-seismic events and blasts. The percentages of each signal duration of micro-seismic events and blasts are calculated as

$$\eta_p = \frac{n_{t_{ei}}}{n_e} \times 100\%, \qquad \xi_q = \frac{n_{t_{bj}}}{n_b} \times 100\% \qquad (1)$$

where $n_{t_{ei}}$ denotes the number of micro-seismic waveforms whose signal duration is equal to $t_{ei}$ and $n_{t_{bj}}$ denotes the number of blast waveforms whose signal duration is equal to $t_{bj}$. $\eta_p$ ($p = 1, 2, \ldots, l, \ldots, x$) indicates the percentages of different micro-seismic signal durations (i.e., $t_{ei}$) in all micro-seismic events, and $\xi_q$ ($q = 1, 2, \ldots, k, \ldots, y$) indicates the percentages of different blast signal durations (i.e., $t_{bj}$) in all blasts. After sorting the percentages in descending order, $\eta_1$ and $\eta_x$ correspond to the micro-seismic signal durations that have the maximum number (percentage) and the minimum number (percentage), respectively. Then, by setting a threshold of 80% for the sum of percentages of micro-seismic signal durations (or blast signal durations), the reasonable and unified signal duration of micro-seismic events and blasts can be solved through Equations (2)-(4).

Figure 3. Examples of waveform images of micro-seismic events and blasts. (a) Waveform image of a micro-seismic event whose signal duration is greater than 1.8 s; (b) Waveform image of another micro-seismic event whose signal duration is less than 1.8 s; (c) Waveform image of a blast whose signal duration is greater than 1.8 s; (d) Waveform image of another blast whose signal duration is less than 1.8 s; (e) A part of pixels distributed in the red rectangle of (d), which are surrounded by the blue rectangle, and the corresponding gray values. The pure black and pure white correspond to the gray values of 0 and 255, respectively.
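The percentage calculation and the 80% threshold described above can be illustrated with a short Python sketch. The selection rule used here is an assumption, since the exact form of Equations (2)-(4) is not restated in this section: the most frequent durations are accumulated until their summed percentage reaches the threshold, and the longest of those durations is taken as the unified duration. The function names and the sample durations are hypothetical and are not field data.

```python
from collections import Counter


def duration_percentages(durations):
    """Return (duration, percentage) pairs sorted by descending percentage."""
    counts = Counter(durations)
    total = len(durations)
    pairs = [(t, 100.0 * n / total) for t, n in counts.items()]
    # Descending percentage; ties are broken by the shorter duration first.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


def unified_duration(durations, threshold=80.0):
    """Assumed rule: accumulate the most frequent durations until their
    summed percentage reaches the threshold, then return the longest one."""
    covered = []
    cumulative = 0.0
    for t, pct in duration_percentages(durations):
        covered.append(t)
        cumulative += pct
        if cumulative >= threshold:
            break
    return max(covered)


# Hypothetical signal durations in seconds (illustration only).
event_durations = [0.6] * 5 + [1.2] * 2 + [1.8] * 2 + [3.0] * 1
print(duration_percentages(event_durations))
print(unified_duration(event_durations))
```

In this toy example the durations 0.6 s, 1.2 s, and 1.8 s together cover 90% of the waveforms, so the sketch returns 1.8 s. The same function could be applied separately to the blast durations.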

Principal Component Analysis
The database of micro-seismic events is taken as an example to clarify the main theory of PCA [38][39][40]. The original features extracted from the micro-seismic database are represented by a 2D matrix $X_{mn}$, which consists of m row vectors $(x_{1n}, x_{2n}, \cdots, x_{mn})$, where m and n denote the number of waveform images and the number of original features extracted from each waveform image, respectively. To eliminate the errors caused by original features with different scales, the Min-Max method is selected to normalize the original features:

$$x'_{ij} = \frac{x_{ij} - \min}{\max - \min}$$

where $x_{ij}$, $x'_{ij}$, min, and max are the initial jth feature value of the ith waveform image, the normalized jth feature value of the ith waveform image, the minimum feature value, and the maximum feature value, respectively. Then, the normalized features of the seismic database are represented by the normalized matrix $X'_{mn}$, which consists of m normalized row vectors $(x'_{1n}, x'_{2n}, \cdots, x'_{mn})$. Furthermore, the difference matrix $X_d$, which retains the differences between waveform images, is formed, and the eigen-decomposition of its covariance matrix $C_{nn}$ yields the dimension reduction matrix $X_{DR}$:

$$X_d = X'_{mn} - X_{ave}, \qquad w_{\lambda_j} = \frac{\lambda_j}{\sum_{i=1}^{n} \lambda_i}, \qquad \sum_{j=1}^{k} w_{\lambda_j} \ge \sigma, \qquad X_{DR} = (e_1, e_2, \cdots, e_k)$$

where $X_{ave}$ consists of m identical row vectors that are composed of the average value of each column of the matrix $X'_{mn}$. $\lambda_j$ are the eigenvalues of $C_{nn}$ sorted in descending order. $e_j$ and $w_{\lambda_j}$ are the eigenvector and the importance that correspond to $\lambda_j$, respectively. $\sigma$ ($0 \le \sigma \le 1$) is the contribution rate that quantitatively determines the amount of initial information retained in the PCA-derived features. $k$ ($k \le n$) is the smallest integer that satisfies the preset contribution rate $\sigma$.

Finally, the new uncorrelated features with quantitative importance and lower dimension, named principal components (PCs), are represented by a reconstruction matrix $X_R$, which is obtained by projecting the waveform image features onto the k eigenvectors in $X_{DR}$. Therefore, PCA can be utilized to objectively extract the original waveform image features from the databases of micro-seismic events and blasts, and to transform them into new uncorrelated features that can be used for establishing discrimination models.
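The role of the contribution rate can be made concrete with a small Python sketch. It covers only two pieces of the procedure above: the Min-Max normalization of a list of feature values, and the choice of the smallest k whose cumulative importance reaches the preset contribution rate. The eigenvalues are assumed to be already available from an eigen-decomposition of the covariance matrix; the function names and the sample eigenvalues are hypothetical.

```python
def min_max_normalize(values):
    """Scale feature values into [0, 1] with the Min-Max method."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to normalize.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def importance(eigenvalues):
    """Importance of each eigenvalue: its share of the eigenvalue sum,
    listed in descending order of eigenvalue."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [lam / total for lam in ordered]


def components_for_rate(eigenvalues, sigma):
    """Smallest k whose cumulative importance reaches the contribution rate."""
    ordered = sorted(eigenvalues, reverse=True)
    target = sigma * sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running >= target:
            return k
    return len(ordered)


# Hypothetical eigenvalues (illustration only, not taken from Table 2).
eigenvalues = [50.0, 25.0, 15.0, 7.0, 3.0]
print(importance(eigenvalues))
print(components_for_rate(eigenvalues, 0.95))
print(min_max_normalize([0, 128, 255]))
```

With these toy eigenvalues, a contribution rate of 0.95 keeps four of the five components, while a rate of 0.85 keeps three, which shows how a lower contribution rate trades retained information for fewer input features.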

Classification Algorithm
The SVM algorithm, proposed by Cortes and Vapnik [41], has shown excellent performance in the fields of regression, classification, and pattern recognition. For binary classification problems, the basic idea of the SVM algorithm is to search for an optimal hyperplane between two classes that maximizes the margin while ensuring the classification accuracy. Its flexibility allows the SVM algorithm to be modified and improved, and to be conveniently applied to different situations according to specific requirements. In addition, the number of initial input parameters can be decreased by providing more default parameters, which reduces the work of parameter adjustment and accelerates the computation process.
Therefore, the SVM algorithm is used to establish discrimination models for micro-seismic events and blasts by using the training samples. Then, the effectiveness of the classification models is examined through the test samples.
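The idea of a maximum-margin separating hyperplane can be illustrated with a minimal linear SVM in Python. This is a sketch under stated assumptions, not the solver used in this study: it minimizes the L2-regularized hinge loss by full-batch subgradient descent, the labels +1 (event) and -1 (blast) are a hypothetical encoding, and the two-dimensional toy samples stand in for the PCA-derived features.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss.
    Labels must be +1 or -1. Returns (weights, bias)."""
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # sample lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify one sample by the sign of the linear decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable data (illustration only).
samples = [(2.0, 2.0), (-2.0, -2.0), (3.0, 3.0), (-3.0, -3.0), (2.0, 3.0), (-2.0, -3.0)]
labels = [1, -1, 1, -1, 1, -1]
w, b = train_linear_svm(samples, labels)
print([predict(w, b, x) for x in samples])
```

A production workflow would normally rely on an established SVM library rather than this hand-written loop; the sketch only shows how the training samples determine a hyperplane that is then applied to unseen samples.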

Evaluation of Classification Quality
The Matthews correlation coefficient (MCC), proposed by Matthews [44], is a commonly used index in machine learning for evaluating binary classification quality, which simultaneously considers the classification accuracy of micro-seismic events and blasts. Essentially, MCC is a correlation coefficient between the observed and the predicted binary classifications. The value interval of MCC is [−1, 1], where −1 represents total disagreement between observation and prediction, 0 denotes a result no better than random prediction, and 1 indicates a perfect prediction. MCC is defined and calculated as

$$\mathrm{MCC} = \frac{TE \times TB - FE \times FB}{\sqrt{(TE + FE)(TE + FB)(TB + FE)(TB + FB)}}$$

where a true micro-seismic event (TE) means that a micro-seismic event is identified as a micro-seismic event, a true blast (TB) means that a blast is identified as a blast, a false micro-seismic event (FE) means that a blast is incorrectly tagged as a micro-seismic event, and a false blast (FB) means that a micro-seismic event is incorrectly tagged as a blast.
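As a worked example, the MCC definition can be evaluated in a few lines of Python. The counts below are not copied from a table; they are derived here from the accuracies reported for test 1 (95.0% for events and 92.2% for blasts), under the assumption that the test set contains 1000 micro-seismic events and 1000 blasts. The function name is hypothetical.

```python
import math


def matthews_corrcoef(te, tb, fe, fb):
    """MCC from the four confusion counts:
    te: events classified as events, tb: blasts classified as blasts,
    fe: blasts tagged as events,     fb: events tagged as blasts."""
    denominator = math.sqrt((te + fe) * (te + fb) * (tb + fe) * (tb + fb))
    if denominator == 0:
        return 0.0  # common convention when a row or column of counts is empty
    return (te * tb - fe * fb) / denominator


# Counts derived from the test 1 accuracies, assuming 1000 events and 1000 blasts.
te, fb = 950, 50   # 95.0% of events correct, 5.0% tagged as blasts
tb, fe = 922, 78   # 92.2% of blasts correct, 7.8% tagged as events
print(round(matthews_corrcoef(te, tb, fe, fb), 4))
```

Under these assumptions the result rounds to 0.8723, which agrees with the optimal MCC reported in this study.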

Data Description and Preparation
The full waveform data of micro-seismic events and blasts recorded from 2013 to 2015 by the Institute of Mine Seismology (IMS) system installed in the Yongshaba deposit, an underground mine in Guizhou Province, China, was used to establish databases and discrimination models. Twenty-six uniaxial sensors and two triaxial sensors, measuring the ground velocity with a sampling frequency of 6000 Hz, were deployed across the major stopes at the 930 m level, 1080 m level and 1120 m level, which can cover the main mining area and record the data of mining-induced seismicity and production blasts as much as possible. Figure 4 shows the geographic location of the Yongshaba underground mine, the locations of micro-seismic events and blasts [45], the layout of the sensors, and the examples of waveforms recorded by different sensors.
To ensure the generality of the proposed discriminant method for different micro-seismic data, 2000 micro-seismic events and 2000 blasts are randomly selected from the established waveform image databases. Cross validation, an effective method for evaluating discrimination models, is used in this study. Its basic idea is to establish discrimination models with one part of the data (the training sets), to evaluate them with the remaining part (the test sets), and then to exchange the roles of the two parts. Therefore, the 2000 micro-seismic events are equally divided into E1 and E2, and the 2000 blasts are equally divided into B1 and B2, where E1 and E2 indicate the first and the second micro-seismic dataset, each containing 1000 micro-seismic events, and B1 and B2 represent the first and the second blasting dataset, each consisting of 1000 blasts. Thus, four groups of cross validation can be carried out through the combinations of these four datasets (E1, E2, B1, and B2), which are shown in Table 1.
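The combination of the four datasets can be sketched in Python. The pairing below is an assumption made for illustration: each group trains on one event set and one blast set and tests on the two remaining sets, which gives four groups. The actual assignment and ordering of test 1 to test 4 are defined in Table 1 and may differ; the function and variable names are hypothetical.

```python
from itertools import product

EVENT_SETS = ["E1", "E2"]
BLAST_SETS = ["B1", "B2"]


def cross_validation_groups():
    """Enumerate train/test pairings: train on one event set and one blast set,
    test on the complementary event set and blast set (assumed scheme)."""
    groups = []
    for e_train, b_train in product(EVENT_SETS, BLAST_SETS):
        e_test = next(s for s in EVENT_SETS if s != e_train)
        b_test = next(s for s in BLAST_SETS if s != b_train)
        groups.append({"train": (e_train, b_train), "test": (e_test, b_test)})
    return groups


for number, group in enumerate(cross_validation_groups(), start=1):
    print(number, group)
```

Each group uses 1000 events and 1000 blasts for training and the other 1000 events and 1000 blasts for testing, so every dataset serves in both roles across the four groups.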

PCA Application and Analysis
PCA is applied to the four groups of cross validation, where the contribution rate is first set to 95%, a commonly used value that has shown good classification performance in many fields such as facial recognition. Hence, 95% of the information contained in the original waveform image features is retained in the PCA-derived waveform image features. The eigenvalues, importance, and cumulative importance corresponding to different PCs for test 1 to test 4 are listed in Table 2. The numbers of features for test 1 to test 4 are reduced to 1157, 1166, 1195, and 1205, respectively. Additionally, the average importance of PC1 for test 1 to test 4 is about 8.68% [(9.20% + 9.67% + 7.73% + 8.11%)/4 = 8.68%], which is 173.6 times as large as the average importance of an original feature (1/2000 = 0.05%).
Figure 5 shows the distributions and the logistic probability density distributions of PC1 of micro-seismic events and blasts for test 1 to test 4. It can be seen from the left figures that the differences between micro-seismic events and blasts are evident. Also, the overlapped areas between micro-seismic events and blasts under the logistic probability density distributions in the right figures are small. Therefore, the effectiveness of PC1 for discriminating micro-seismic events and blasts is confirmed, and the PCA-derived waveform image features can be considered effective and efficient for the further discrimination of micro-seismic events and blasts.
Figure 6 shows the importance and cumulative importance of the PCA-derived eigenvalues for test 1 to test 4. It can be seen that the importance decreases and the cumulative importance increases as the eigenvalues decrease. In Figure 6, the eigenvalues with smaller importance are distributed at the lower left corner. For the cumulative importance curves of test 1 to test 4, the upper left parts consist of numerous eigenvalues with relatively small importance. The PCA-derived waveform image features that contribute the first 95% of cumulative importance are the input features for the establishment of discrimination models.

Figure 6 shows the importance and cumulative importance of the PCA derived eigenvalues for test 1 to test 4. The importance decreases and the cumulative importance increases as the eigenvalues decrease. In Figure 6, the eigenvalues with smaller importance are distributed at the lower left corner. For the cumulative importance curves of test 1 to test 4, the upper left parts consist of numerous eigenvalues with relatively small importance. The PCA derived waveform image features that contribute the first 95% of the cumulative importance are the input features for the establishment of the discrimination models.
Figure 6. Importance and cumulative importance of the PCA derived eigenvalues for test 1 to test 4. (a) Importance of the PCA derived eigenvalues for test 1 to test 4. The red triangles, yellow circles, green stars, and blue crosses indicate the eigenvalues with specific importance for test 1 to test 4, respectively; (b) Cumulative importance of the PCA derived eigenvalues for test 1 to test 4. The red line with triangles, yellow line with circles, green line with stars, and blue line with crosses represent the cumulative importance curves for test 1 to test 4, respectively. The zoom view shows the cumulative importance curves when the eigenvalue is between $1 \times 10^{4}$ and $10 \times 10^{4}$.
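The selection of features by cumulative importance can be illustrated with a minimal sketch. It assumes that the importance of a component is its eigenvalue divided by the sum of all eigenvalues, which is the usual PCA convention and not a definition given explicitly here. The function name `select_components` and the eigenvalues are hypothetical and serve only as an illustration.

```python
from itertools import accumulate


def select_components(eigenvalues, contribution_rate=0.95):
    """Return (k, importance, cumulative), where k is the smallest number of
    leading components whose cumulative importance reaches contribution_rate.

    Assumption: importance = eigenvalue / sum of all eigenvalues.
    """
    if not 0.0 < contribution_rate <= 1.0:
        raise ValueError("contribution_rate must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    total = sum(ordered)
    importance = [value / total for value in ordered]
    cumulative = list(accumulate(importance))
    for k, reached in enumerate(cumulative, start=1):
        # Small tolerance guards against floating-point rounding at the threshold.
        if reached >= contribution_rate - 1e-12:
            return k, importance, cumulative
    return len(ordered), importance, cumulative


# Hypothetical eigenvalues, chosen only to make the arithmetic easy to follow.
example_eigenvalues = [50.0, 25.0, 15.0, 6.0, 3.0, 1.0]
k, importance, cumulative = select_components(example_eigenvalues, 0.95)
print(k)  # 4 components are needed to reach 0.95 for these values
```

Lowering the threshold to 0.90 keeps three components for the same hypothetical eigenvalues, which shows how the contribution rate controls the number of input features.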

Classification Results
Usually, the radial basis function (RBF) is used as the kernel function of the SVM algorithm because of its good performance in common classification problems. However, RBF is not suitable when the dimension of the input features is very large, whereas the linear kernel function is an advisable choice considering the greater dimension of the input features in further engineering applications. In addition, the linear kernel function does not require numerous parameters to be set or adjusted, so the classification process can be simplified. Therefore, the linear kernel function is selected for the SVM algorithm, which is given by $K(E, B) = E \cdot B$, where E and B indicate the training sets of micro-seismic events and blasts, respectively. By inputting the PCA derived waveform image features, the discrimination models can be established through the SVM algorithm.
The classification results for test 1 to test 4, including the classification accuracy and the quality evaluation parameter MCC, are shown in Table 3, where TE and TB are the numbers of correctly classified micro-seismic events and blasts. The classification accuracies of micro-seismic events for test 1 to test 4 are 95.0%, 92.3%, 97.1%, and 94.7%, respectively. Similarly, the classification accuracies of blasts for test 1 to test 4 are 92.2%, 93.8%, 89.9%, and 89.0%, respectively. The average classification accuracies of micro-seismic events and blasts over the four tests are 94.78% and 91.23%, respectively, and the optimal values are 97.1% and 93.8%, respectively. All four tests show high total accuracy, with maximum, minimum, and average values of 93.60%, 91.85%, and 93.00%, respectively. The average value and the optimal value of MCC are 0.8607 and 0.8723, respectively, which are close to the upper limit of its value interval [−1, 1].
The classification results indicate that the proposed waveform image method has excellent discriminating performance in underground mines.

Contribution Rate
The contribution rate is a key parameter that quantitatively determines the amount of initial information retained in the input features derived from PCA. An appropriate value of the contribution rate not only reduces the computation time by decreasing the number of input features, but also ensures good classification accuracy. Therefore, it is important to discuss the classification accuracy for test 1 to test 4 under different contribution rates. Figure 7 shows the classification accuracy of micro-seismic events and blasts, the total classification accuracy, and the quality evaluation parameter MCC for test 1 to test 4 under different contribution rates.
The effects of the contribution rate can be analyzed by dividing it into three value intervals: [0.50, 0.90], [0.90, 0.95], and [0.95, 1.00]. Firstly, the total classification accuracy of the four tests shows an overall increasing trend when the contribution rate ranges from 0.50 to 0.90. Secondly, the total classification accuracy for test 1 to test 4 is relatively stable when the contribution rate ranges from 0.90 to 0.95. When the contribution rates for test 1, test 2, test 3, and test 4 are equal to 0.92, 0.95, 0.90, and 0.90, the four tests reach their optimal total accuracies, which are 93.65%, 93%, 94.5%, and 93.45%, respectively. The total accuracy for test 1 to test 4 then begins to decline when the contribution rate exceeds its optimal value. Thirdly, when the contribution rate ranges between 0.95 and 1.00, the total accuracy of the four tests fluctuates, first decreasing and then increasing slightly. Therefore, the optimal values of the contribution rate for test 1 to test 4 are distributed in the interval [0.90, 0.95].

Table 4 lists the computation time of test 1 to test 4 under different contribution rates, where the computation process includes the reading of the waveform image databases, the PCA procedure, the establishment of the discrimination model, and the prediction of the classification results.
As shown in Table 4, the computation time increases gradually when the contribution rate increases from 0.90 to 0.95, whereas it increases rapidly when the contribution rate reaches 1.00. Specifically, the average computation time of the four tests is 627.38 s when the contribution rate is equal to 1.00, which is approximately 2.15 times and 1.72 times the average computation time at contribution rates of 0.90 and 0.95, respectively, indicating that the computation efficiency is seriously affected. Another disadvantage of a contribution rate of 1.00 is that numerous waveform image features with little classification effect are generated, which may even represent noise in the complex underground mining environment. Hence, it can be inferred that excellent classification results for micro-seismic events and blasts in underground mines cannot be obtained by simply increasing the contribution rate.
Based on the above analysis, we can preliminarily determine that the optimal value of the contribution rate lies between 0.90 and 0.95. To further determine the optimal value, a new variable named computation efficiency per accuracy (CEA) is introduced to quantitatively evaluate the effects of the contribution rate on classification accuracy and computation efficiency, which is defined as
$$\mathrm{CEA} = \frac{t}{A}$$
where t is the computation time and A is the total classification accuracy. The contribution rate that corresponds to the minimum CEA is optimal, because less computation time is consumed to reach the same total classification accuracy. Therefore, the values of CEA for test 1 to test 4 under contribution rates ranging from 0.90 to 0.95 are calculated and listed in Table 5. Evidently, the values of CEA for test 1 to test 4 are minimal when the contribution rate is equal to 0.90. It can be concluded that the optimal contribution rate is 0.90, which simultaneously ensures excellent classification accuracy and computation efficiency.
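The selection of the contribution rate by minimum CEA can be sketched as follows. The sketch assumes that CEA is the computation time divided by the total classification accuracy, which is what the name and the minimum-CEA criterion suggest. The function names and the time and accuracy pairs are hypothetical and are not the values of Table 4 or Table 5.

```python
def cea(computation_time_s, total_accuracy):
    """Computation efficiency per accuracy.

    Assumption: CEA = computation time / total classification accuracy.
    """
    if total_accuracy <= 0:
        raise ValueError("total_accuracy must be positive")
    return computation_time_s / total_accuracy


def optimal_contribution_rate(results):
    """Return the contribution rate with the minimum CEA.

    results maps each contribution rate to (computation_time_s, total_accuracy).
    """
    return min(results, key=lambda rate: cea(*results[rate]))


# Hypothetical measurements for a single test, for illustration only.
example_results = {
    0.90: (290.0, 0.930),
    0.92: (320.0, 0.932),
    0.95: (365.0, 0.930),
}
best_rate = optimal_contribution_rate(example_results)
print(best_rate)  # 0.9 for these hypothetical values
```

With nearly constant accuracy over the interval [0.90, 0.95], the criterion is driven mainly by the computation time, so the smallest contribution rate in the stable interval tends to be selected.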

Computation Efficiency
PCA is used to transform the original features into new uncorrelated features with lower dimension. Therefore, PCA improves the computation efficiency because the number of input features is decreased according to the contribution rate. Table 4 lists the computation time of test 1 to test 4 under different contribution rates. The computation time clearly increases with the contribution rate, where a contribution rate of 1.00 means that the input features are the original features without dimension reduction. Additionally, for test 1 to test 4, the average computation time difference between σ = 0.90 and σ = 0.95 ($\Delta t_1$), that between σ = 0.95 and σ = 1.00 ($\Delta t_2$), and that between σ = 0.90 and σ = 1.00 ($\Delta t_3$) are 72.31 s, 263.31 s, and 335.62 s, respectively. Evidently, $\Delta t_1$ is small, whereas $\Delta t_2$ and $\Delta t_3$ are particularly large. Moreover, the average computation time corresponding to σ = 1.00 is approximately 2.15 times and 1.72 times that corresponding to σ = 0.90 and σ = 0.95, respectively, indicating that PCA can greatly improve the computation efficiency.
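The quoted differences and ratios are mutually consistent, as a short worked calculation shows. It takes the average computation time of 627.38 s reported for σ = 1.00 and writes $\bar{t}_{\sigma}$ for the average computation time at contribution rate σ, a notation introduced here only for illustration.

```latex
\begin{aligned}
\bar{t}_{0.90} &= \bar{t}_{1.00} - \Delta t_3 = 627.38 - 335.62 = 291.76\ \mathrm{s},\\
\bar{t}_{0.95} &= \bar{t}_{1.00} - \Delta t_2 = 627.38 - 263.31 = 364.07\ \mathrm{s},\\
\Delta t_1 &= \bar{t}_{0.95} - \bar{t}_{0.90} = 364.07 - 291.76 = 72.31\ \mathrm{s},\\
\bar{t}_{1.00} / \bar{t}_{0.90} &= 627.38 / 291.76 \approx 2.15,\\
\bar{t}_{1.00} / \bar{t}_{0.95} &= 627.38 / 364.07 \approx 1.72.
\end{aligned}
```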
Based on the calculated results, we can estimate that the proposed discriminant method reaches an average total classification accuracy of 93%, where a computation time of about 320 s (less than 6 min) is needed to correctly identify approximately 1860 micro-seismic events or blasts. This computation efficiency is able to satisfy the requirements of data processing for seismicity analysis in underground mines. By contrast, manual discrimination would take approximately 1860 min (31 h) to finish the same workload if we assume that an experienced analyst needs about 1 min to discriminate one micro-seismic event or blast. Compared with manual discrimination and with discrimination methods without PCA, the proposed method for discriminating micro-seismic events and blasts in underground mines significantly improves the computation efficiency.

Further Applications
The proposed discriminant method is effective as long as the waveform image databases are established. For a seismic monitoring system that works normally, it is easy to collect sufficient data over a period of time (e.g., several weeks) to establish the waveform image databases of micro-seismic events and blasts. As the PCA derived waveform image features are effective and the SVM algorithm performs well for binary classification, two databases containing at least 100 records in total (e.g., 50 micro-seismic events and 50 blasts) are acceptable for applying the proposed method. In addition, the waveform image databases can be updated by adding the correctly classified micro-seismic events and blasts. Therefore, the proposed waveform image method can be effective in different underground mines once the waveform image databases are established and kept up to date.
To sum up, the advantages of the proposed discriminant method are prominent. Specifically, it is effective, with high classification accuracy, and automatic, with superior computation efficiency. In addition, the proposed method is shown to be robust, as the results of the four cross validations differ only slightly. Nevertheless, in underground mining, the mining methods may change as the mining depth increases and the mining circumstances and conditions change. As the waveform image databases are continuously updated, it may be necessary to update the previously determined unified signal duration in order to adapt to the new databases and guarantee the classification accuracy. This could be a limitation of the proposed method, as the unified signal duration needs to be updated periodically and the time for updating it needs to be judged by professional mining technical staff.

Conclusions
Currently, the discrimination of micro-seismic events and blasts is a significant problem in underground mine seismicity. Focusing on the disadvantages of the discrimination methods that use seismic source parameters and waveform spectrum analysis, a novel waveform image method was proposed. The waveform image databases of micro-seismic events and blasts were established by using the full waveform data collected from the Yongshaba underground mine in China. PCA was applied to extract the original features from the two waveform image databases, which could remove the similarities between micro-seismic events and blasts while retaining the differences. Then, the original image features were transformed into new uncorrelated image features with quantitative importance and lower dimension through PCA, where the contribution rate was utilized to quantitatively determine the amount of initial information contained in the derived image features. Furthermore, the PCA derived waveform image features were coupled with the SVM algorithm to establish discrimination models and perform the cross-validation tests. With a contribution rate of 0.95, the results of four groups of cross validation show that the optimal values for the classification accuracy of micro-seismic events and blasts, the total classification accuracy, and the quality evaluation parameter MCC are 97.1%, 93.8%, 93.60%, and 0.8723, respectively. In addition, the effects of the contribution rate on classification accuracy and computation efficiency were discussed quantitatively. The optimal contribution rate was determined to be 0.90. It is concluded that the proposed waveform image method for discriminating micro-seismic events and blasts is accurate and automatic, and can provide high-quality seismic data for seismicity analysis in underground mines.
In future work, we intend to investigate the possibility of using deep learning algorithms to replace the SVM algorithm in order to further improve classification accuracy and computation efficiency. In addition, it would be interesting to explore the possibility of incorporating the magnitudes of micro-seismic events into the discrimination process, which could provide insights for the real-time identification of micro-seismic events with large magnitudes and help prevent potential mining-induced seismic hazards.