Review Reports - Research on CNC Machine Tool Spindle Fault Diagnosis Method Based on DRSN–GCE Model

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for allowing me to review your manuscript on the DRSN-GCE model for spindle fault diagnosis in CNC machine tools. The application area is relevant, and the proposed approach is useful in denoising data, which is appreciated and welcomed in an industrial setup. However, I feel that some areas of the manuscript require improvements before publication. Here are some comments:
1—The method is well structured, and all the components of the proposed architecture are sufficiently described. However, the authors used four gated convolutional residual shrinkage modules in the proposed architecture. The choice of four modules remains unclear and must be justified.

2—The dataset description in the experimental setup lacks some details. For instance, information on the signal acquisition setup, sensor type, and noise conditions is required.

3—During model evaluation, the authors used accuracy and t-SNE visualizations. You should include more statistical validation with other performance metrics. Cross-validation or the use of independent test sets is required, along with additional metrics such as precision, recall, and F1-score.

4—The baseline models used during model comparison seem relatively outdated. Consider adding comparisons with more recent architectures (e.g., Transformer-based models and attention-enhanced CNNs). Additionally, the model should be evaluated from other aspects (parameter counts, training time, and performance under noisy conditions).

5—The manuscript does not clarify which noise types are used. In the results section, the authors use Laplace, salt-and-pepper, Gaussian noise, etc. The authors should justify the seemingly arbitrary choice of noise type or select one set of noise types for all accuracy evaluations. In Figure 9, we can only find the accuracy of the proposed method under Laplace and salt-and-pepper noise. The inconsistency persists in the confusion matrix, which is computed under Gaussian noise. Why not use one noise type or one set of noise types for all results? This would make the results easier to understand.

6- A conclusion section is needed to discuss the industrial implications, limitations, and further research.

Minor: In Figure 10, remove non-English characters.

Comments on the Quality of English Language

Awkward sentence structure and redundancy impact technical clarity. Consider revising the language used in the manuscript.

Author Response

Comments 1: The method is well structured, and all the components of the proposed architecture are sufficiently described. However, the authors used four gated convolutional residual shrinkage modules in the proposed architecture. The choice of four modules remains unclear and must be justified.

Response 1: We conducted ablation experiments to assess the choice of four gated convolutional residual shrinkage modules, and from these experiments we conclude that four modules is a reasonable choice. The experiments are reported in Section 4.3, Ablation Experiment.
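For readers unfamiliar with the building block being counted, the following is a minimal sketch of what one gated residual shrinkage module might compute, and of stacking n such modules as an ablation would vary. It assumes the usual formulations: a gated convolution multiplies a feature branch by a sigmoid gate, and shrinkage is soft thresholding with a threshold proportional to the mean absolute activation. The kernels, the fixed alpha, and all function names are hypothetical; this is not the authors' implementation, in which these quantities would be learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_threshold(x, tau):
    """Shrink values toward zero by tau; values within [-tau, tau] become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def gated_conv1d(x, feature_kernel, gate_kernel):
    """Feature branch multiplied elementwise by a sigmoid gate in (0, 1)."""
    features = np.convolve(x, feature_kernel, mode="same")
    gate = sigmoid(np.convolve(x, gate_kernel, mode="same"))
    return features * gate

def gated_shrinkage_module(x, feature_kernel, gate_kernel, alpha=0.5):
    """One illustrative module: gated convolution, shrinkage, identity shortcut."""
    h = gated_conv1d(x, feature_kernel, gate_kernel)
    # Assumption: alpha is fixed here; in a trained network it would be learned.
    tau = alpha * np.mean(np.abs(h))
    return x + soft_threshold(h, tau)

def stack_modules(x, n_modules, feature_kernel, gate_kernel):
    """Apply the same illustrative module n_modules times in sequence."""
    for _ in range(n_modules):
        x = gated_shrinkage_module(x, feature_kernel, gate_kernel)
    return x

# Hypothetical input and kernels, for illustration only.
x = np.sin(np.linspace(0.0, 20.0, 256))
feature_kernel = np.array([0.25, 0.5, 0.25])
gate_kernel = np.array([1.0, 0.0, -1.0])
out = stack_modules(x, 4, feature_kernel, gate_kernel)
print(out.shape)
```

An ablation over module count would repeat training and evaluation for several values of n_modules and compare the resulting metrics; that loop is omitted here because it depends on the authors' training setup.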

Comments 2: The dataset description in the experimental setup lacks some details. For instance, information on the signal acquisition setup, sensor type, and noise conditions is required.

Response 2: We have added a more detailed description of the dataset to the experimental setup, covering the signal acquisition setup, sensor types, and noise conditions. This material appears in Section 4.2, Introduction to the Experimental Dataset.

The dataset was collected using SKF 6205-2RS JEM bearings at sampling frequencies of 12 kHz and 48 kHz, with single-point faults introduced by electrical discharge machining (EDM). For signal acquisition, the experimental platform was fitted with piezoelectric accelerometers (mainly PCB Piezotronics type 352C33), which were mounted on the drive-end (DE) bearing cover and the fan end (FE) of the motor and held in place by magnetic bases. The signals were captured by a National Instruments (NI) data acquisition card and transferred to a computer for processing.

To study the effect of noise on bearing diagnosis, we add different types of noise to the original vibration signal to simulate noisy vibration signals. The noise in rolling bearing vibration signals is dominated by inherent noise similar to Gaussian noise, so we add Gaussian noise in the experiments. Following other research on bearing diagnosis in noisy environments, we also add non-Gaussian noise to bring the signals closer to real industrial scenarios: Laplace noise, a more complex and random non-Gaussian noise found in real industrial settings; salt-and-pepper noise, a common non-ideal noise model in rolling bearing fault diagnosis; and Poisson noise, a typical non-Gaussian noise. For each noise type we set several signal-to-noise ratio (SNR) levels to simulate different noise intensities, as shown in Table 3.
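As an illustration of how noise at a prescribed SNR can be added to a vibration signal, here is a minimal sketch. It assumes the standard definition SNR(dB) = 10 log10(P_signal / P_noise). The function names, the synthetic 160 Hz test tone, and the density parameter for salt-and-pepper noise are hypothetical choices made for this example, not the authors' procedure; Poisson noise is left out because the response does not say how it is constructed for a signed signal.

```python
import numpy as np

def noise_power_for_snr(signal, snr_db):
    """Noise power that yields the requested SNR (in dB) for this signal."""
    signal_power = np.mean(signal ** 2)
    return signal_power / (10.0 ** (snr_db / 10.0))

def add_gaussian_noise(signal, snr_db, rng):
    p = noise_power_for_snr(signal, snr_db)
    return signal + rng.normal(0.0, np.sqrt(p), size=signal.shape)

def add_laplace_noise(signal, snr_db, rng):
    # A Laplace(0, b) variable has variance 2 * b**2, so b = sqrt(p / 2).
    p = noise_power_for_snr(signal, snr_db)
    return signal + rng.laplace(0.0, np.sqrt(p / 2.0), size=signal.shape)

def add_salt_and_pepper_noise(signal, density, rng):
    # Hypothetical parameterisation: a fraction `density` of samples is
    # replaced by the signal maximum ("salt") or minimum ("pepper").
    noisy = signal.copy()
    mask = rng.random(signal.shape) < density
    salt = rng.random(signal.shape) < 0.5
    noisy[mask & salt] = signal.max()
    noisy[mask & ~salt] = signal.min()
    return noisy

def measured_snr_db(clean, noisy):
    """SNR actually realised by a noisy copy of a clean signal."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# Synthetic stand-in for one second of vibration data sampled at 12 kHz.
rng = np.random.default_rng(0)
t = np.arange(12_000) / 12_000.0
clean = np.sin(2.0 * np.pi * 160.0 * t)

for name, noisy in [
    ("gaussian", add_gaussian_noise(clean, 5.0, rng)),
    ("laplace", add_laplace_noise(clean, 5.0, rng)),
]:
    print(name, round(measured_snr_db(clean, noisy), 2))
```

Because the noise is random, the measured SNR only approximates the target; the gap shrinks as the signal gets longer.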

Comments 3: During model evaluation, the authors used accuracy and t-SNE visualizations. You should include more statistical validation with other performance metrics. Cross-validation or the use of independent test sets is required, along with additional metrics such as precision, recall, and F1-score.

Response 3: In response to this comment, we have added further performance metrics for statistical validation. Using cross-validation, we now report precision, recall, and the mean and standard deviation of the F1-score.
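The kind of validation described here can be sketched as follows: split the data into k folds, compute macro-averaged precision, recall, and F1 on each held-out fold, then report the mean and standard deviation across folds. Everything in this example is an assumption for illustration: k = 5, the synthetic three-class data, and the nearest-centroid classifier, which is only a stand-in and not the DRSN-GCE model.

```python
import numpy as np

def macro_scores(y_true, y_pred, n_classes):
    """Macro-averaged precision, recall and F1 from integer label arrays."""
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f = 2.0 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return np.mean(precisions), np.mean(recalls), np.mean(f1s)

def nearest_centroid_predict(x_train, y_train, x_test, n_classes):
    # Stand-in classifier for illustration only.
    centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in range(n_classes)])
    dists = np.linalg.norm(x_test[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def cross_validate(x, y, n_classes, k=5, seed=0):
    """Return (mean, std) over folds of [precision, recall, F1]."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        y_pred = nearest_centroid_predict(x[train_idx], y[train_idx], x[test_idx], n_classes)
        scores.append(macro_scores(y[test_idx], y_pred, n_classes))
    scores = np.array(scores)  # shape (k, 3)
    return scores.mean(axis=0), scores.std(axis=0)

# Synthetic, well-separated data: 3 classes, 60 samples each, 4 features.
rng = np.random.default_rng(1)
n_classes = 3
x = np.concatenate([rng.normal(5.0 * c, 1.0, size=(60, 4)) for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), 60)

mean_scores, std_scores = cross_validate(x, y, n_classes, k=5)
for name, m, s in zip(("precision", "recall", "F1"), mean_scores, std_scores):
    print(f"{name}: {m:.3f} +/- {s:.3f}")
```

Reporting the standard deviation alongside the mean shows how much the result depends on the particular split, which a single accuracy figure cannot convey.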

Comments 4: The baseline models used during model comparison seem relatively outdated. Consider adding comparisons with more recent architectures (e.g., Transformer-based models and attention-enhanced CNNs). Additionally, the model should be evaluated from other aspects (parameter counts, training time, and performance under noisy conditions).

Response 4: We have added a comparison with a more recent architecture, the Swin Transformer. In the comparison experiments, the proposed model outperforms the other models in accuracy, precision, recall, and F1-score. This paper focuses on bearing fault diagnosis performance in noisy environments, where the proposed model shows better noise immunity and diagnostic accuracy than the compared models and offers an effective method for diagnosing rolling bearing faults under complex noise. We consider accuracy, precision, recall, and F1-score sufficient to demonstrate the advantage of the proposed model.

Comments 5: The manuscript does not clarify which noise types are used. In the results section, the authors use Laplace, salt-and-pepper, Gaussian noise, etc. The authors should justify the seemingly arbitrary choice of noise type or select one set of noise types for all accuracy evaluations. In Figure 9, we can only find the accuracy of the proposed method under Laplace and salt-and-pepper noise. The inconsistency persists in the confusion matrix, which is computed under Gaussian noise. Why not use one noise type or one set of noise types for all results? This would make the results easier to understand.

Response 5: In Section 4.4, Comparison Test, the first experiment now adds a mixed set of Gaussian and Laplace noise to the vibration signals, and the second experiment adds a set of Gaussian, salt-and-pepper, and Poisson noise. In Section 4.5, Single Noise Experiments, we also run experiments with each noise type on its own and discuss the results.

Comments 6: A conclusion section is needed to discuss the industrial implications, limitations, and further research.

Response 6: We have added this discussion at the end of the conclusion.

Reviewer 2 Report

Comments and Suggestions for Authors

This paper presents a robust approach to bearing fault diagnosis using the DRSN-GCE model, specifically designed to address the challenges posed by noise in electromechanical systems. By converting one-dimensional time-series signals into two-dimensional time-frequency images via Continuous Wavelet Transform (CWT), the model enhances input data richness and maintains diagnostic performance even in noisy environments. The introduction of gated convolutional layers and a Gated Convolutional Residual Shrinkage Module allows the model to suppress irrelevant information and extract critical features effectively. Experimental results on the CWRU dataset demonstrate the model’s superior noise immunity and diagnostic accuracy across various noise types, outperforming existing deep learning models in noise resistance and fault detection reliability.

The paper is well structured, with a strong methodology and results section. The paper would be strengthened by explaining why this method is scientifically of interest and why these methods are adequate for studying this phenomenon (an extension of the methodology/introduction section covering the problem statement and solution description).

Author Response

Comments 1: This paper presents a robust approach to bearing fault diagnosis using the DRSN-GCE model, specifically designed to address the challenges posed by noise in electromechanical systems. By converting one-dimensional time-series signals into two-dimensional time-frequency images via Continuous Wavelet Transform (CWT), the model enhances input data richness and maintains diagnostic performance even in noisy environments. The introduction of gated convolutional layers and a Gated Convolutional Residual Shrinkage Module allows the model to suppress irrelevant information and extract critical features effectively. Experimental results on the CWRU dataset demonstrate the model’s superior noise immunity and diagnostic accuracy across various noise types, outperforming existing deep learning models in noise resistance and fault detection reliability.

Response 1: Thank you for your recognition. Existing deep learning models generally take Gaussian noise into account and have achieved good diagnostic performance, while other kinds of noise encountered in the field need further systematic research. In this paper, the DRSN-GCE algorithm is therefore developed to classify rolling bearing faults under the influence of various types of noise. First, noise at different signal-to-noise ratios and frequencies is added to the vibration signal to simulate the noise environment under real working conditions. Second, the one-dimensional vibration signal is transformed into a two-dimensional time-frequency diagram using the continuous wavelet transform to fully extract the noisy fault features; a gated convolution layer is then introduced into the DRSN to suppress irrelevant noise interference, and a gated convolutional residual shrinkage module is proposed to enhance feature extraction and noise suppression in the fault signal. Finally, validation on the Case Western Reserve University bearing dataset shows that the proposed model achieves high accuracy in all of the noise environments tested.
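The second step, turning a one-dimensional signal into a two-dimensional time-frequency array, can be sketched with a hand-written Morlet continuous wavelet transform. The wavelet choice, the centre-frequency parameter w0 = 6, the four analysis frequencies, and the 100 Hz test tone are all assumptions made for this illustration; the response does not state which mother wavelet or scales the authors used.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Magnitude of a Morlet CWT, shape (len(scales), len(signal))."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Sample the wavelet over roughly +/- 4 scale units, capped so the
        # kernel never exceeds the signal length.
        half = int(min(4 * s, (n - 1) // 2))
        tau = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * tau) * np.exp(-0.5 * tau ** 2) / np.sqrt(s)
        # For this wavelet conj(psi(-t)) == psi(t), so the CWT correlation
        # can be computed as an ordinary convolution.
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

def scales_for_frequencies(freqs_hz, fs_hz, w0=6.0):
    """Scale (in samples) whose Morlet centre frequency matches each frequency."""
    return w0 * fs_hz / (2.0 * np.pi * np.asarray(freqs_hz, dtype=float))

# Hypothetical test tone: 100 Hz cosine sampled at 12 kHz.
fs = 12_000
t = np.arange(2048) / fs
signal = np.cos(2.0 * np.pi * 100.0 * t)

freqs = [50.0, 100.0, 200.0, 400.0]
image = morlet_cwt(signal, scales_for_frequencies(freqs, fs))
print(image.shape)  # one row per analysis frequency, one column per sample
print(freqs[int(np.argmax(image.mean(axis=1)))])  # row with the most energy
```

In practice the array would be computed over many more scales and resized to the network's input resolution; those details are outside what the response specifies.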

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript proposed an integrated neural network model for CNC spindle fault diagnosis. The authors combined several neural network techniques, and the results seem great, achieving over 99% accuracy. Overall the work is interesting and useful for fault detection, but the manuscript would probably need revisions to make it more concise and clear for readers.

My comments are as follows:

There are several typos and formatting errors throughout the manuscript; authors should really pay more attention to the details.
1. Line 21: a period appears in the middle of the sentence; the following word is not capitalized and the sentence is incomplete. The same error appears in lines 77 and 331.
2. line 65 and following citations, it should be "et al."
3. line 70 and other places, there should be a space after punctuation mark and between words (line 313).
4. variables should be italicized, for example, line 190, i, j, m, n
5. line 502 and other places, please indicate clearly which Fig. (b).
6. Typo, "Mould" in Table 5
7. Chinese characters in Figure 10
8. line 439, two "damage"
Some text and figures/tables are redundant; the authors should consider removing them from the manuscript.
1. For example, Table 4 and Figure 9 actually convey the same information. The same applies to Table 5 and Figure 10, and to Tables 6, 7 and Figure 13. If there is a specific point to be made by these figures and tables, the authors should state it clearly.
2. I think Sections 2.1 to 2.3 are not suitable in a research paper. The authors should just add citations and refer readers to appropriate references. The corresponding figures should be removed too, for example, Figures 1 to 6. If the authors think it is necessary to explain the model in detail, it could be done in Figure 7, which is really the model that the authors should focus on.
3. Line 310: it is really not necessary to have an index and a step label at the same time, 1) and Step 1.
The authors set up experiments with four types of noise, two frequencies, and five different signal-to-noise ratios. It seems complete, but only one setup is introduced in each experiment. I wonder whether this reflects the real situation. Is it possible, for example, to introduce two types of noise in one experiment? If this is not suitable, please discuss.
Please add a reference for the statement in lines 432-434.
What are the noise sources? Where do they come from? Which noise environment corresponds to which noise source? Please add references.
Why is 4 Gated CRSM the best? Is there any experiment or ablation study to support this decision?

Author Response

Comments 1: There are several typos and formatting errors throughout the manuscript; authors should really pay more attention to the details.

Response 1: Thank you for pointing out the typos and formatting errors in the manuscript. We have corrected them, and we acknowledge that we should have paid closer attention to detail.

Comments 2: Some text and figures/tables are redundant; the authors should consider removing them from the manuscript.

Comments 2-1: For example, Table 4 and Figure 9 actually convey the same information. The same applies to Table 5 and Figure 10, and to Tables 6, 7 and Figure 13. If there is a specific point to be made by these figures and tables, the authors should state it clearly.

Response 2-1: Thank you for pointing out that some of the text and figures are redundant. We have deleted Figures 9, 10, and 13 and removed the tables and figures that conveyed the same information, while keeping the experimental findings clearly stated.

Comments 2-2: I think Sections 2.1 to 2.3 are not suitable in a research paper. The authors should just add citations and refer readers to appropriate references. The corresponding figures should be removed too, for example, Figures 1 to 6. If the authors think it is necessary to explain the model in detail, it could be done in Figure 7, which is really the model that the authors should focus on.

Response 2-2: Sections 2.1 to 2.3 present the relevant theoretical foundations. We have revised and shortened them; deleting them entirely would, in our view, leave the paper without a necessary description of its theoretical basis. The corresponding figures, Figures 1 to 6, have been deleted.

Comments 2-3: Line 310: it is really not necessary to have an index and a step label at the same time, 1) and Step 1.

Response 2-3: We have corrected this and now describe each step in detail.

Comments 3: The authors set up experiments with four types of noise, two frequencies, and five different signal-to-noise ratios. It seems complete, but only one setup is introduced in each experiment. I wonder whether this reflects the real situation. Is it possible, for example, to introduce two types of noise in one experiment? If this is not suitable, please discuss.

Response 3: In Section 4.4, Comparative Tests, the first experiment now adds a mixed set of Gaussian and Laplace noise to the vibration signals, and the second experiment adds a set of Gaussian, salt-and-pepper, and Poisson noise. In Section 4.5, Single Noise Experiments, we also run experiments with each noise type on its own and discuss the results.
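One way to combine two noise types in a single experiment is sketched below. It assumes that the sources are independent and zero-mean and that the target noise power is split equally between them, so that the combined noise meets one overall SNR. The equal split, the function name, and the synthetic test tone are assumptions for illustration; the response does not state how the authors weight the components of a mixed set.

```python
import numpy as np

def mixed_noise(signal, snr_db, rng, kinds=("gaussian", "laplace")):
    """Sum of independent zero-mean noise sources meeting one overall SNR."""
    total_power = np.mean(signal ** 2) / (10.0 ** (snr_db / 10.0))
    share = total_power / len(kinds)  # assumption: equal power per source
    noise = np.zeros(signal.shape, dtype=float)
    for kind in kinds:
        if kind == "gaussian":
            noise += rng.normal(0.0, np.sqrt(share), size=signal.shape)
        elif kind == "laplace":
            # Laplace(0, b) has variance 2 * b**2, so b = sqrt(share / 2).
            noise += rng.laplace(0.0, np.sqrt(share / 2.0), size=signal.shape)
        else:
            raise ValueError(f"unsupported noise kind: {kind}")
    return noise

# Synthetic stand-in for one second of vibration data sampled at 12 kHz.
rng = np.random.default_rng(0)
t = np.arange(12_000) / 12_000.0
clean = np.sin(2.0 * np.pi * 160.0 * t)

noise = mixed_noise(clean, snr_db=0.0, rng=rng)
noisy = clean + noise
achieved = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
print(round(achieved, 2))
```

Because the variances of independent sources add, the mixture reaches the target SNR on average; a different weighting would only change how `share` is computed for each source.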

Comments 4: Please add a reference for the statement in lines 432-434.

Response 4: We have added references for these statements, which concern the time-frequency diagrams obtained by applying a continuous wavelet transform to the one-dimensional vibration signal.

From Section 4.2, Introduction to the Experimental Dataset, of the paper:

The dataset covers rolling body failures with failure diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm (labeled BA_Ⅰ, BA_Ⅱ, and BA_Ⅲ, respectively), inner ring failures with failure diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm (labeled IR_Ⅰ, IR_Ⅱ, and IR_Ⅲ, respectively), and outer ring failures with failure diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm (labeled OR_Ⅰ, OR_Ⅱ, and OR_Ⅲ, respectively). In this study, we constructed the dataset from drive-end (DE) bearing vibration signals sampled at 12 kHz at a rotational speed of 1730 rpm, with noise artificially added to simulate noisy signals in real industrial environments.
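The figures quoted above imply some simple arithmetic that is useful when cutting the recordings into samples. The sketch below derives the number of samples per shaft revolution from the stated sampling rate and speed, converts the fault diameters to inches, enumerates the nine fault labels, and segments a signal with a sliding window. The window length of 1024 and stride of 512 are hypothetical values chosen for illustration, and ASCII I/II/III stand in for the Roman numeral characters in the labels.

```python
import numpy as np

FS_HZ = 12_000                      # sampling rate stated in the text
SPEED_RPM = 1730                    # rotational speed stated in the text
FAULT_DIAMETERS_MM = (0.1778, 0.3556, 0.5334)

# Samples recorded during one shaft revolution.
samples_per_rev = FS_HZ * 60.0 / SPEED_RPM

# The same diameters expressed in inches (25.4 mm per inch).
diameters_inch = [d / 25.4 for d in FAULT_DIAMETERS_MM]

# Nine fault classes: three locations times three fault sizes.
labels = [f"{loc}_{size}" for loc in ("BA", "IR", "OR") for size in ("I", "II", "III")]

def segment(signal, window, stride):
    """Cut a 1-D signal into overlapping windows, shape (n_windows, window)."""
    n_windows = (len(signal) - window) // stride + 1
    return np.stack([signal[i * stride : i * stride + window] for i in range(n_windows)])

# Hypothetical one-second recording and segmentation settings.
recording = np.random.default_rng(0).normal(size=FS_HZ)
segments = segment(recording, window=1024, stride=512)

print(round(samples_per_rev, 2))
print([round(d, 3) for d in diameters_inch])
print(labels)
print(segments.shape)
```

With these assumed settings each window spans a little under two and a half shaft revolutions, so every sample should contain at least two passes of a localized fault.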

Comments 5: What are the noise sources? Where do they come from? Which noise environment corresponds to which noise source? Please add references.

Response 5: In the introduction, we now discuss the noise encountered when diagnosing bearings in noisy environments and have added relevant references.

Vibration, acoustic, temperature, and current signals [25][26] are commonly used as signal sources in engineering practice. However, the signals used for rolling bearing fault diagnosis are often mixed with noise, which may mask the characteristics of the signal. The noise in the vibration signal of rolling bearings consists mainly of inherent noise similar to Gaussian noise [27] and other non-Gaussian noise such as Laplace noise [28], salt-and-pepper noise [29], and Poisson noise [30].

Comments 6: Why is 4 Gated CRSM the best? Is there any experiment or ablation study to support this decision?

Response 6: We conducted ablation experiments to assess the choice of four gated convolutional residual shrinkage modules, and from these experiments we conclude that four modules is a reasonable choice. The experiments are reported in Section 4.3, Ablation Experiment.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The responses to the previous comments are satisfactory. Thank you for your effort in updating the manuscript.

Author Response

Comments: The responses to the previous comments are satisfactory. Thank you for your effort in updating the manuscript.

Response: Thank you for your recognition of our responses. We also sincerely appreciate your valuable comments during the revision process, which have played an important role in improving our manuscript.

Reviewer 3 Report

Comments and Suggestions for Authors

The revised version has reduced redundancy significantly and added mix-model simulation results as suggested. However, a few minor issues remain; the authors should go through the manuscript rigorously and check it again.

For example, lines 38, 47, 53, and other places, there should be a single space after a concluding punctuation mark.
Lines 106, 107, controlling marks in the middle of the sentences.
Figure 3 is referred to before Figures 1 and 2.
Line 277, in the middle of the sentence, typo
Line 338, "K-fold" should be removed
Please add citation for each of the methods in Table 5
It is very difficult to read the numbers in Figures 5 and 7. Please make the table more clear if possible.

Author Response

Comments 1: For example, lines 38, 47, 53, and other places, there should be a single space after a concluding punctuation mark.

Response 1: Thank you for your feedback. We have made the necessary revisions and ensured that a single space is added after the concluding punctuation marks on lines 38, 47, 53, and other places as you suggested. In addition, we have carefully reviewed the full text to ensure that such problems do not recur and that the manuscript is accurate and standardized.

Comments 2: Lines 106, 107, controlling marks in the middle of the sentences.

Response 2: Thank you for your careful review. We have thoroughly examined the manuscript and confirmed the presence of unintended control characters in lines 106 and 107, which were likely introduced during formatting or document encoding conversion. These control characters have now been removed, and we have ensured that no similar issues remain in the rest of the manuscript.

Comments 3: Figure 3 is referred to before Figures 1 and 2.

Response 3: Thank you for your comment. We have corrected this issue and have now removed the redundant reference.

Comments 4: Line 277, in the middle of the sentence, typo.

Response 4: Thank you for pointing this out. The typo in line 277 has been identified and corrected. We have also checked the spelling throughout the manuscript.

Comments 5: Line 338, "K-fold" should be removed.

Response 5: Thank you for your suggestion. As per your recommendation, we have removed "K-fold" from line 338.

Comments 6: Please add citation for each of the methods in Table 5.

Response 6: Thank you for your suggestion. We appreciate the importance of proper citation. However, the models compared in Table 5 were re-implemented under our own experimental settings, which differ from those in the original papers. It is therefore not appropriate to cite those papers directly, as the performance results are not taken from them but reproduced in our study. Moreover, after reviewing related literature, we found that similar studies also did not cite the original papers for comparison models unless the exact experimental conditions were replicated. Examples include the following references, which are also cited in this paper.

[1] Li W, Zhong X, Shao H, et al. Multi-mode data augmentation and fault diagnosis of rotating machinery using modified ACGAN designed with new framework[J]. Advanced Engineering Informatics, 2022, 52: 101552.

[20] Zhao M, Zhong S, Fu X, et al. Deep residual shrinkage networks for fault diagnosis[J]. IEEE Transactions on Industrial Informatics, 2019, 16(7): 4681-4690.

Comments 7: It is very difficult to read the numbers in Figures 5 and 7. Please make the table more clear if possible.

Response 7: Thank you for your comments. We have remade Figures 5 and 7 to improve their clarity and make the figures and details easier to read.