Review Reports - An Algorithm for the Shape-Based Distance of Microseismic Time Series Waveforms and Its Application in Clustering Mining Events

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

I've included my full report below. Thank you for allowing me to read and review this interesting article on time series in one part of the industry/economy. I found the article interesting, but it must undergo significant revision before proceeding. My concerns are:
1. Equation 1, what is the meaning of the symbol d?
2. In the intro section, what is the aim and motivation of the study? Also, a better introduction to time series would be important for the readers.
3. Figure 1, when printed on A4 paper, is unreadable. Make a more comprehensive figure and explain it in full below the figure. The text at the moment is missing.
4. There is a lack of explanation as to why this particular form of formula 2 was chosen: why the difference aqr − iqr is essential, and why the absolute value of kurtosis (|sk|) is included. Mathematically, this expression is unusual (the difference between AQR and IQR can have a negative value), which may compromise the stability of the result despite the logarithm. The authors should clearly explain the motivation and stability of this equation, especially regarding possible numerical sensitivity.
5. Algorithm 1 could be presented in the Appendices
6. Figure 2, printed on A4 paper, is unreadable. Improve the quality of the Figure.
7. Equation 5. What are the meanings of both x symbols?
8. Data frequency is?, Data Logger is?, Which data were isolated? No detailed information about measurement location, sensor types, possible errors, or structured capture.
No mention of whether the data were manually marked or automatically sorted, which is crucial for the objectivity of the evaluation.
9. Figure 3 is again unreadable, and all the figures do not have an explanation below them.
10. All symbols in the equation must have their meaning. Revise!
11. Tables 1-end. What is e-04? Use SI symbols or expressions!
12. What are MCAE, DCAE and CAE?
13. Figures 7 and 8 must have a much better explanation for the reader, since this is what you want to present.
14. What are the limitations and delimitations of the study? Add to the conclusions.
15. Also, add some proposals for further research.
16. What are the scientific contributions to your field? Add to conclusions.
17. And cite seven most recent articles on your theme, e.g., from 2024 or 2025. This is now missing.

Author Response

Comments 1: Equation 1, what is the meaning of the symbol d?

Response 1: We define a distance metric function d(x_i, y_j) between two time series X and Y, where d(x_i, y_j) denotes the distance between the i-th data point of time series X and the j-th data point of time series Y.
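
As an illustration of this definition, the pointwise distances can be collected in a matrix whose entry (i, j) is d(x_i, y_j). The sketch below assumes scalar samples and takes d to be the absolute difference; the response does not state which metric the manuscript uses, and the function name `pairwise_distance` is hypothetical.

```python
def pairwise_distance(x, y, d=lambda a, b: abs(a - b)):
    """Return the matrix D with D[i][j] = d(x[i], y[j]).

    The default d (absolute difference) is an assumption made for
    illustration only; any other pointwise metric can be passed in.
    """
    return [[d(xi, yj) for yj in y] for xi in x]


# Two short, made-up series of different lengths.
X = [0.0, 1.0, 3.0]
Y = [1.0, 2.0]
D = pairwise_distance(X, Y)  # 3 rows (points of X) by 2 columns (points of Y)
```

The series may have different lengths, since every point of X is compared with every point of Y.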

Comments 2: In the intro section, what is the aim and motivation of the study? Also, a better introduction to time series would be important for the readers.

Response 2: We have revised the introduction to clarify both the aim and motivation of the study, and to provide a more comprehensive background on time series analysis.

Comments 3: Figure 1, when printed on A4 paper, is unreadable. Make a more comprehensive figure and explain it in full below the figure. The text at the moment is missing.

Response 3: We have redrawn Figure 1 using higher resolution and a modular layout to improve its readability on standard A4 prints. The updated figure provides a more comprehensive depiction of the MDCAE-CSBD-Vol model architecture, with clear module separation (multi-scale fusion block, dilated convolution block, feature extraction, and decoder). Each key operation (convolution, pooling, activation, regularization) is now explicitly labeled.

Furthermore, a detailed and informative caption has been added below the figure, explaining each component and its role in the network. These modifications aim to enhance both the visual quality and conceptual clarity of the figure.

Comments 4: There is a lack of explanation as to why this particular form of the formula 2 was chosen — why the difference aqr − iqr is essential, and why the absolute value of kurtosis (|sk|) is included. Mathematically, this expression is unusual (the difference between AQR and IQR can have a negative value), which may compromise the stability of the result despite the logarithm. The authors should clearly explain the motivation and stability of this equation, especially regarding possible numerical sensitivity.

Response 4: The seismic waveforms generated by different microseismic events often exhibit highly similar signal characteristics, making waveform classification a particularly challenging task. To address this issue, we explored a wide range of feature extraction and representation techniques. Among dozens of methods evaluated, only the one described in this manuscript demonstrated a satisfactory ability to distinguish between waveform types. It is possible that more effective approaches exist; however, they have yet to be identified.

Comments 5: Algorithm 1 could be presented in the Appendices

Response 5: Algorithm 1 has been relocated to the Appendices for improved readability and structural clarity.

Comments 6: Figure 2, printed on A4 paper, is unreadable. Improve the quality of the Figure.

Response 6: We have redrawn Figure 2 using higher resolution and a modular layout to improve its readability on standard A4 prints.

Comments 7: Equation 5. What are the meanings of both x symbols?

Response 7: In Equation 5, the two variables are not both denoted as x; the variable on the left-hand side of the equation is x', whereas the one on the right-hand side is x.

Comments 8: Data frequency is?, Data Logger is?, Which data were isolated? No detailed information about measurement location, sensor types, possible errors, or structured capture. No mention of whether the data were manually marked or automatically sorted, which is crucial for the objectivity of the evaluation.

Response 8: The waveform data were sampled at a frequency of 500 Hz and were collected using microseismic monitoring stations installed at a coal mine in China. The dataset was subsequently organized and annotated manually.

Comments 9: Figure 3 is again unreadable, and all the figures do not have an explanation below them.

Response 9: Figure 3 is intended solely to illustrate the field equipment used for microseismic waveform acquisition and is not directly related to the proposed algorithm. If deemed unnecessary, it may be removed without affecting the technical content of the paper.

Comments 10: All symbols in the equation must have their meaning. Revise!

Response 10: We have carefully revised the manuscript to ensure that every mathematical symbol introduced in each equation is explicitly defined either directly following the equation or within its immediate context.

Comments 11: Tables 1-end. What is e-04? Use SI symbols or expressions!

Response 11: The relevant data have been reformatted using scientific notation for clarity and consistency.

Comments 12: What are MCAE, DCAE and CAE?

Response 12: MCAE refers to a model that incorporates multi-scale fusion convolution blocks but excludes dilated convolution blocks; DCAE refers to a model that employs dilated convolution blocks without incorporating multi-scale fusion convolution blocks; CAE denotes an autoencoder that utilizes neither multi-scale fusion nor dilated convolution blocks. Detailed explanations of these model configurations are provided in the main text of the paper.

Comments 13: Figures 7 and 8 must have a much better explanation for the reader, since this is what you want to present.

Response 13: We have significantly improved the explanations accompanying Figures 7 and 8. We enhanced the figure captions to provide a more comprehensive description of the visualization and its implications.

Comments 14: What are the limitations and delimitations of the study? Add to the conclusions.

Response 14: We have added a paragraph to the conclusion section discussing the limitations and delimitations of the study, including data scope, dependency on statistical volatility, and unsupervised assumption constraints.

Comments 15: Also, add some proposals for further research.

Response 15: We have included several proposals for future research, such as testing across diverse mining environments, extending to semi-supervised frameworks, and integrating with real-time monitoring systems.

Comments 16: What are the scientific contributions to your field? Add to conclusions.

Response 16: We have added a clear summary of the scientific contributions in the conclusion section, highlighting the integration of volatility and time-window constraints in deep unsupervised clustering for seismic waveform analysis.

Comments 17: And cite seven most recent articles on your theme, e.g., from 2024 or 2025. This is now missing.

Response 17: We have supplemented seven recent references (2024-2025) relevant to our work, addressing the latest advancements in time-series clustering, deep feature extraction, and microseismic analysis. These references are now integrated throughout the manuscript to strengthen the literature review and contextualize our contributions.

Reviewer 2 Report

Comments and Suggestions for Authors

Comments to the Author

The authors propose an unsupervised learning method based on the classic Shape-Based Distance (SBD) algorithm, with a multi-scale fusion spatial convolutional autoencoder for feature extraction from the original waveform and a temporally constrained window to improve SBD. The manuscript is well organized, presents impressive research on seismic and non-seismic event detection, is full of detail, and is technically convincing. I would suggest it for acceptance. Before doing so, however, I would ask the authors to address these few comments.

In section 4.3. Max-min normalization does not account for variance or mean, which can distort the relative distances between data points, and features with inherently small ranges may become insignificant compared to others after normalization. What method have you implemented in your model to avoid such a scenario?
In section 4.3. The method of stopping early to determine the best fit of the training model is not clearly explained. It would be more realistic to support this statement with a graphical representation that uses the “elbow method”, where the within-cluster sum of squares (WCSS) is plotted as a function of the number of clusters K.
In Section 5.1, the authors highlighted the key indicators for evaluating clustering algorithms. In particular, the representation of the silhouette score as a function of the number of unsupervised clusters k is, in my opinion, necessary to explain whether clustering is useful or is affected by noise due to the normalization method employed.

Author Response

Comments 1: In section 4.3. Max-min normalization does not account for variance or mean, which can distort the relative distances between data points, and features with inherently small ranges may become insignificant compared to others after normalization. What method have you implemented in your model to avoid such a scenario?
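
The concern raised in this comment can be illustrated with a small sketch. It assumes the usual definition of max-min normalization, x' = (x - min) / (max - min), and contrasts it with z-score standardization; the sample values and function names are made up for illustration and are not taken from the manuscript.

```python
import statistics


def min_max(x):
    """Scale x to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0 for _ in x]
    return [(v - lo) / (hi - lo) for v in x]


def z_score(x):
    """Centre x on its mean and divide by its population standard deviation."""
    mu = statistics.mean(x)
    sigma = statistics.pstdev(x)
    if sigma == 0:
        return [0.0 for _ in x]
    return [(v - mu) / sigma for v in x]


quiet = [0.0, 0.1, -0.1, 0.05, 0.0]  # small-amplitude trace (made up)
spiky = [0.0, 0.1, -0.1, 0.05, 5.0]  # same trace with one large spike

quiet_mm = min_max(quiet)  # the small oscillation fills the whole [0, 1] range
spiky_mm = min_max(spiky)  # the same oscillation is squeezed close to 0
spiky_z = z_score(spiky)   # zero mean and unit variance by construction
```

In this example the single spike sets the range, so the first four samples of `spiky_mm` occupy only a few percent of the unit interval, whereas the same samples span the full interval in `quiet_mm`.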

Response 1: We fully acknowledge that max-min normalization may, in some cases, fail to preserve the variance structure of the data, potentially leading to distorted distance relationships among features. To mitigate this issue, our proposed MDCAE-CSBD-Vol model incorporates several strategies:

Robust Feature Extraction via MDCAE:

Although we initially apply max-min normalization to bring different time series to a comparable scale, the subsequent feature extraction process is performed by the MDCAE network, which leverages multi-scale fusion convolutions and dilated convolutions. This design allows the network to capture both local and global features, effectively reducing the reliance on raw input scale and preserving essential variance and shape information in the learned representations.

Spectral Regularization:

As detailed in Section 3.2, we introduce spectral regularization to enhance the stability and generalization of the network. This regularization constrains the model’s capacity to overly adapt to specific value ranges, thus alleviating the effect of compressed features after normalization.

Volatility-Based Similarity Measure:

More importantly, in the similarity measurement phase (Section 4.2), we compute a shape-based distance fused with a volatility descriptor that captures higher-order statistics including variance, kurtosis, and interquartile range. This allows the clustering process to account for the distributional characteristics of the time series, even if such differences are not directly visible in the normalized values.

Ablation Experiments for Justification:

The effectiveness of these enhancements is validated through comparative experiments and ablation studies (Section 5.2, Table 2 and Table 4), which demonstrate that our model significantly outperforms baseline methods even under uniform normalization settings.
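
For readers unfamiliar with the shape-based distance mentioned in this response, the classic SBD is one minus the maximum normalized cross-correlation over all alignments of two equal-length series. The sketch below restricts the alignment to a window of shifts as a stand-in for the temporally constrained window; the `max_shift` parameter and the function names are hypothetical, the direct summation replaces the FFT normally used for speed, and the volatility term of the manuscript is not reproduced.

```python
import math


def cross_correlation(x, y, shift):
    """Sum of x[i + shift] * y[i] over the indices where both samples exist."""
    n = len(x)
    return sum(x[i + shift] * y[i] for i in range(n) if 0 <= i + shift < n)


def constrained_sbd(x, y, max_shift=None):
    """Shape-based distance with the shift limited to [-max_shift, max_shift].

    max_shift=None searches all shifts, which gives the classic SBD.
    The result lies in [0, 2]; 0 means identical shape after alignment.
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    if max_shift is None or max_shift > n - 1:
        max_shift = n - 1
    norm = math.sqrt(sum(v * v for v in x)) * math.sqrt(sum(v * v for v in y))
    if norm == 0:
        return 1.0  # assumed convention for an all-zero series
    best = max(cross_correlation(x, y, s) for s in range(-max_shift, max_shift + 1))
    return 1.0 - best / norm


pulse = [0, 0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 1, 0]  # the same pulse, two samples later

free = constrained_sbd(pulse, delayed)                   # full search realigns the pulse
narrow = constrained_sbd(pulse, delayed, max_shift=1)    # window too small to realign it
```

With an unrestricted search the two pulses are judged identical in shape, while a one-sample window cannot fully realign them and so reports a larger distance. This shows how a shift limit makes the arrival time part of the comparison.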

Comments 2: In section 4.3. The method of stopping early to determine the best fit of the training model is not clearly explained. It would be more realistic to support this statement with a graphical representation that uses the “elbow method”, where the within-cluster sum of squares (WCSS) is plotted as a function of the number of clusters K.

Response 2: We have clarified the early stopping strategy used in model training and provided additional explanation for determining the number of clusters K. The early stopping method in our model training is based on the monitoring of clustering evaluation metrics, including Silhouette Coefficient (SC), Rand Index (RI), Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI). Training is terminated when these metrics show no significant improvement over a predefined number of iterations (patience = 10). We have updated Section 4.3 to clearly state this stopping criterion. The objective of our study is to classify seismic waveforms into three categories: microseisms, blasts, and noise.
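
The stopping rule described in this response can be sketched as a patience counter. The sketch assumes a single scalar score per iteration (for example one of the listed metrics, or some combination of them) where higher is better; the response does not say how the four metrics are combined, and the `min_delta` threshold for "significant improvement" and the function name are hypothetical.

```python
def early_stopping_epoch(scores, patience=10, min_delta=1e-4):
    """Return (stop_epoch, best_epoch, best_score) for a list of scores.

    Training stops once `patience` consecutive iterations fail to improve
    the best score by more than `min_delta`. If that never happens, the
    last iteration is returned as the stopping point.
    """
    best_score = float("-inf")
    best_epoch = -1
    stale = 0  # iterations since the last significant improvement
    for epoch, score in enumerate(scores):
        if score > best_score + min_delta:
            best_score, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch, best_score
    return len(scores) - 1, best_epoch, best_score


# Made-up metric history: three improving iterations, then a plateau.
history = [0.1, 0.2, 0.3] + [0.3] * 12
stop_epoch, best_epoch, best_score = early_stopping_epoch(history, patience=10)
```

In practice the model weights from `best_epoch` would typically be the ones kept, since later iterations brought no measurable gain.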

Comments 3: In Section 5.1, the authors highlighted the key indicators for evaluating clustering algorithms. In particular, the representation of the silhouette score as a function of the number of unsupervised clusters k is, in my opinion, necessary to explain whether clustering is useful or is affected by noise due to the normalization method employed.

Response 3: The primary objective of this study is to classify seismic waveforms into three distinct types—microseisms, blasts, and noise—to facilitate subsequent research and analysis.
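
For reference, the silhouette score requested in this comment can be computed from a labelling alone, so a curve over k only requires repeating the clustering for each candidate k and scoring the result. The sketch below is a minimal implementation for one-dimensional points with absolute difference as the distance; the data, the distance choice, and the function name are assumptions for illustration and do not reproduce the manuscript's features or its distance measure.

```python
from statistics import mean


def silhouette_score_1d(points, labels):
    """Mean silhouette over all points, using |p - q| as the distance.

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    point's silhouette is (b - a) / max(a, b). Singleton clusters score 0.
    """
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    pairs = list(zip(points, labels))
    scores = []
    for i, (p, lab) in enumerate(pairs):
        own = [abs(p - q) for j, (q, l) in enumerate(pairs) if l == lab and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(
            mean([abs(p - q) for q, l in pairs if l == c])
            for c in clusters
            if c != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


data = [0.0, 1.0, 10.0, 11.0]  # two obvious groups (made up)
good = silhouette_score_1d(data, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score_1d(data, [0, 1, 0, 1])   # labels cut across the groups
```

A score near 1 indicates compact, well-separated clusters, while a score near or below 0 indicates that points sit closer to another cluster than to their own.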

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors addressed all my comments. I have no further open issues. One concern left:

Please ensure that all information you added to your responses during my first review is also included in the manuscript, specifically in the text.