Article
Peer-Review Record

Artificial Intelligence for Fault Detection of Automotive Electric Motors

Machines 2025, 13(6), 457; https://doi.org/10.3390/machines13060457
by Federico Soresini, Dario Barri, Ivan Cazzaniga, Federico Maria Ballo, Gianpiero Mastinu and Massimiliano Gobbi *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 13 April 2025 / Revised: 16 May 2025 / Accepted: 21 May 2025 / Published: 26 May 2025
(This article belongs to the Special Issue Fault Diagnostics and Fault Tolerance of Synchronous Electric Drives)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study proposes a fault detection system for automotive electric motors (PMSM) using autoencoder networks, evaluating six architectures (AE, VAE, CNN AE/VAE, LSTM AE/VAE) with vibration data. The 1D CNN AE model demonstrated the best performance in accuracy and efficiency, and was integrated into a semi-automatic monitoring algorithm that reduces testing times and improves detection compared to traditional methods.

  • The work focuses on the use of deep neural networks but does not compare it with machine learning methods such as SVM, Random Forest, or supervised MLP networks.

 

  • The presented metrics, such as R², MAE, and the reconstruction error visualization, are correct, but no unified classification metric, such as AUC, F1 score, or sensitivity/specificity, is provided to validate fault detections. Please add a classification metrics table to compare the model's efficiency in classifying "OK" and "KO."

 

  • Please explain how the IQR-based classification thresholds (Q3 + 1.5×IQR and Q3 + 3×IQR) were chosen, how they were optimized, and what their statistical justification is.

 

  • The paper shows that 1D CNN AE performs better, but doesn't go into enough detail about why it's better, or what specific types of faults it improves performance on.

 

  • In other publications on fault detection with AE or CNN in vibroacoustic signals, R² values close to or greater than 0.9 have been reported using similar techniques. The authors should improve the performance or justify the results obtained.

 

  • The model is both trained and evaluated on data from the same motor, without employing cross-validation or hold-out testing using data from other motors or from subsequent healthy-state cycles of the same motor. This limits the assessment of the model’s generalization capability.

 

  • The use of preprocessing techniques such as order tracking, extraction of mechanical and bearing orders, time averaging via logarithmic windows, and Min-Max normalization based on the entire dataset significantly limits the applicability of the model for real-time fault detection. It remains unclear how the authors intend to overcome these constraints. Is the model specifically designed for offline post-processing only, or are there plans for adapting it to true online or streaming scenarios?
Comments on the Quality of English Language

I found a few style errors. Please do a complete review of the work.

Author Response

Comments and Suggestions for Authors

This study proposes a fault detection system for automotive electric motors (PMSM) using autoencoder networks, evaluating six architectures (AE, VAE, CNN AE/VAE, LSTM AE/VAE) with vibration data. The 1D CNN AE model demonstrated the best performance in accuracy and efficiency, and was integrated into a semi-automatic monitoring algorithm that reduces testing times and improves detection compared to traditional methods. 

  • The work focuses on the use of deep neural networks but does not compare it with machine learning methods such as SVM, Random Forest, or supervised MLP networks.

Thank you for your valuable comment. Regarding the comparison with traditional supervised machine learning methods such as SVMs, Random Forests, supervised MLP networks and k-Nearest Neighbors (kNN), we would like to clarify that, based on the literature review conducted (e.g., [15], [21], [22], [23]), unsupervised learning techniques based on deep neural networks have shown superior performance in similar application domains. Therefore, our study focused specifically on these approaches, as they were deemed more suitable for achieving the research objectives. We have also clarified this point in the contribution section of the revised manuscript.

  • The presented metrics, such as R², MAE, and the reconstruction error visualization, are correct, but no unified classification metric, such as AUC, F1 score, or sensitivity/specificity, is provided to validate fault detections. Please add a classification metrics table to compare the model's efficiency in classifying "OK" and "KO."

Thank you for the comment. As suggested, we have added a classification metrics table (Table 6 in the new Section 4.4) to evaluate the models' performance in distinguishing between the "OK" and "KO" classes. The table compares the six networks using accuracy, precision, recall and F1-score, providing a clearer assessment of their classification capabilities. 
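As a rough illustration of the scores reported in a table such as Table 6, accuracy, precision, recall, and F1 can all be derived from the confusion-matrix counts of an OK/KO classification. The labels below are invented for the example (1 = KO, 0 = OK); this is a sketch, not the authors' evaluation code:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary OK(0)/KO(1) labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Illustrative per-cycle labels: 1 = KO (faulty), 0 = OK (healthy)
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
```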

  • Please explain how the IQR-based classification thresholds (Q3 + 1.5×IQR and Q3 + 3×IQR) were chosen, how they were optimized, and what their statistical justification is.

We thank the reviewer for this relevant observation. In our method, two thresholds are used, as proposed by Givnan et al. [17]: a lower threshold to detect early signs of anomalies (warning), and a higher threshold to flag more severe faults. Both thresholds are calculated using a statistical approach based on the interquartile range (IQR), as proposed by Lee et al. [8], which measures the spread of the healthy data. Specifically, the warning threshold is set at Q3 + 1.5 × IQR, and the fault threshold, with twice the IQR multiplier, at Q3 + 3 × IQR. This method is commonly used in statistics to identify moderate and extreme outliers, and allows us to define thresholds in a consistent and robust way without requiring faulty data during training. This concept has been clarified in Section 4.3.
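The two IQR-based thresholds described in this response can be sketched in a few lines. The healthy reconstruction-error values below are invented for the example; this is a minimal illustration of the statistical rule, not the authors' implementation:

```python
import statistics

def iqr_thresholds(healthy_errors):
    """Warning and fault thresholds from the IQR of healthy reconstruction errors."""
    q1, _, q3 = statistics.quantiles(healthy_errors, n=4)  # quartiles (exclusive method)
    iqr = q3 - q1
    return q3 + 1.5 * iqr, q3 + 3.0 * iqr  # moderate / extreme outlier cut-offs

# Illustrative per-cycle MAE values from healthy operation
errors = [0.10, 0.12, 0.11, 0.13, 0.12, 0.14, 0.11, 0.13]
warn, fault = iqr_thresholds(errors)
```

A cycle whose reconstruction error exceeds `warn` would raise an early warning; exceeding `fault` would flag a severe anomaly.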

 

  • The paper shows that 1D CNN AE performs better, but doesn't go into enough detail about why it's better, or what specific types of faults it improves performance on.

Thank you for your valuable feedback. To address your comment, we have added F1-score and accuracy to Table 6 in Section 4.4, specifically to provide a more complete evaluation of each model's classification performance. Additionally, Table 7 in Section 4.6 summarizes the key reasons why the selected 1D CNN autoencoder outperforms the other networks in this specific fault detection scenario. We would also like to clarify that the main objective of this work was to develop a generic fault detection approach using AI, rather than focusing on optimising the detection of specific fault types. The goal was to improve the overall detection rate compared to traditional metrics, regardless of the fault nature. This was particularly challenging given the harsh starting conditions, characterised by large variations in speed and torque of the electric motors.

 

  • In other publications on fault detection with AE or CNN in vibroacoustic signals, R² values close to or greater than 0.9 have been reported using similar techniques. The authors should improve the performance or justify the results obtained.

We thank the reviewer for the insightful comment regarding the R² values and for pointing out the need for either performance improvement or clearer justification of the results. After a thorough re-examination of the calculations, we identified an inconsistency in the way the R² index was computed across the six tested autoencoder architectures. Specifically, the R² had initially been calculated on a dataset that included both healthy and defective cycles, instead of considering only the healthy cycles in the training set to properly assess reconstruction quality. We have corrected this aspect by recalculating the R² values exclusively on the training set, ensuring that only healthy cycles are considered. This adjustment allowed for a more consistent and comparable evaluation across all models.

As a result of this correction:

  • The 1D CNN autoencoder, which was already the top-performing architecture, has now achieved an R² of 0.91, aligning well with the values reported in the literature (≥0.9).
  • The other architectures also showed an improvement in R², with values now closer to 0.9. Nevertheless, they still remain below this value, confirming the superior performance of the 1D CNN-based model.

We have updated Table 5 accordingly to reflect these revised values. 
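For reference, recalculating R² on healthy cycles only follows the standard definition of the coefficient of determination between a signal and its reconstruction. The signal values below are invented for the example; this is a sketch of the metric, not the authors' pipeline:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination between a signal and its reconstruction."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative healthy-cycle signal and a near-perfect reconstruction
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
reconstruction = [1.1, 1.9, 3.0, 4.1, 4.9]
r2 = r_squared(signal, reconstruction)
```

Restricting this computation to healthy cycles, as the response describes, prevents defective cycles (which the autoencoder is *expected* to reconstruct poorly) from deflating the score.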

  • The model is both trained and evaluated on data from the same motor, without employing cross-validation or hold-out testing using data from other motors or from subsequent healthy-state cycles of the same motor. This limits the assessment of the model’s generalization capability.

We acknowledge the reviewer’s observation regarding the lack of cross-validation across different motors. However, the model is intentionally trained and evaluated on each individual motor. The objective of this study is not to develop a generalized model across motors, but rather to monitor each motor independently and detect early signs of fault onset. This motor-specific approach aligns with realistic applications in predictive maintenance, where models are typically tailored to each asset to capture deviations from its own healthy behaviour.         
Section 4.1 describes the training procedure and results for all six architectures, each trained independently on a single motor. The networks are trained and tested on all scenarios listed in Table 1, with a separate re-training performed for each case. In addition, Table 8 summarizes the fault detection performance of the 1D-CNN for the fault-related cases only. Although no cross-motor validation is performed, the repeated training and evaluation across different motors demonstrate the robustness and consistency of the proposed approach. Data for verifying subsequent healthy-state cycles of the same motor is not available.

  • The use of preprocessing techniques such as order tracking, extraction of mechanical and bearing orders, time averaging via logarithmic windows, and Min-Max normalization based on the entire dataset significantly limits the applicability of the model for real-time fault detection. It remains unclear how the authors intend to overcome these constraints. Is the model specifically designed for offline post-processing only, or are there plans for adapting it to true online or streaming scenarios?

We thank the reviewer for this valuable comment. As correctly pointed out, the employed preprocessing techniques, such as order tracking, extraction of mechanical and bearing orders, time averaging via logarithmic windows and Min-Max normalization, require access to the entire cycle, and therefore the model is evaluated at the end of each operating cycle. Since the motor is expected to operate over hundreds of such cycles, the duration of a single cycle is relatively short compared to the total durability test. This makes the per-cycle evaluation a temporally adequate discretization for effective fault monitoring. For this reason, we refer to our AIFD algorithm as “semi-real-time”. This concept has been clarified in Section 4.8.
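The semi-real-time, end-of-cycle evaluation described in this response can be sketched as follows. The per-cycle Min-Max scaling, threshold values, and function names are illustrative assumptions for the sketch, not the authors' implementation:

```python
def min_max_normalize(cycle):
    """Scale one completed cycle's features to [0, 1]."""
    lo, hi = min(cycle), max(cycle)
    span = (hi - lo) or 1.0  # guard against a constant cycle
    return [(x - lo) / span for x in cycle]

def evaluate_cycle(cycle_mae, warn, fault):
    """Classify one completed operating cycle from its reconstruction error."""
    if cycle_mae > fault:
        return "KO"
    if cycle_mae > warn:
        return "warning"
    return "OK"

# Preprocessing runs once the cycle is complete, then the cycle is scored
features = min_max_normalize([3.0, 5.0, 4.0, 7.0])
status = evaluate_cycle(cycle_mae=0.18, warn=0.16, fault=0.19)
```

Because each cycle is short relative to the full durability test, scoring at cycle boundaries still gives a fine-grained monitoring signal.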

Comments on the Quality of English Language

I found a few style errors. Please do a complete review of the work.

Thank you for your comment. The entire paper has been carefully reviewed to ensure clarity and consistency in language and style.

 

 

Reviewer 2 Report

Comments and Suggestions for Authors

Hi,

The work done in this paper, I find very interesting and well-prepared. On the other hand, I do have some comments, as there is a part of the work that should be improved.

Comments:

1 - The bibliographical references need to be improved, because most of them are out of date. Try to include recent references, as there are too many that are more than 5 years old.

2- To highlight your work, I see that you should add a section containing comparisons with other work in the field of real-time fault detection.

3- In relation to your method, how is it better than using a simple neural network or support vector machine (SVM) or kNN for fault detection in an electric motor?

4- How does your method perform? Does it perform well for load torque or speed variation?

5- To quantify the reconstruction error, you used MAE. Why MAE and not MSE? How is it better than the other?

Author Response

The work done in this paper, I find very interesting and well-prepared. On the other hand, I do have some comments, as there is a part of the work that should be improved.

Comments:

1 - The bibliographical references need to be improved, because most of them are out of date. Try to include recent references, as there are too many that are more than 5 years old.

Thank you for your comment. The entire bibliography has been carefully reviewed, and several older references have been removed or replaced. In addition, recent and more relevant works have been included to ensure the reference list reflects the current state of the field.

2- To highlight your work, I see that you should add a section containing comparisons with other work in the field of real-time fault detection.

Thank you for your valuable suggestion. We have added a dedicated section on real-time monitoring at the end of the Introduction to highlight the importance of this topic in the context of fault detection. In this additional part, we have included references to relevant works in the field. However, we did not provide a direct comparison with our approach, as it can be considered a semi-real-time monitoring method, as discussed in detail in Section 4.8.

3- In relation to your method, how is it better than using a simple neural network or support vector machine (SVM) or kNN for fault detection in an electric motor?

Thank you for your valuable comment. Regarding the comparison with traditional supervised machine learning methods such as SVMs, Random Forests, supervised MLP networks and k-Nearest Neighbors (kNN), we would like to clarify that, based on the literature review conducted (e.g., [15], [21], [22], [23]), unsupervised learning techniques based on deep neural networks have shown superior performance in similar application domains. Therefore, our study focused specifically on these approaches, as they were deemed more suitable for achieving the research objectives. We have also clarified this point in the contribution section of the revised manuscript.

4- How does your method perform? Does it perform well for load torque or speed variation?

We thank the reviewer for this question. The method has been specifically developed and validated under heavy load cycling conditions, characterised by strong variations in both speed and torque, as clearly illustrated in Figure 3. This is, in fact, one of the key novelties of the paper. As stated in Section 1.1 (Contribution), our study addresses a gap in the literature by applying Autoencoders to fault detection in electric motors operating under non-stationary conditions, which are significantly more challenging than the steady-state scenarios typically investigated in previous works. The algorithm's ability to detect anomalies in such a highly dynamic context demonstrates its robustness and practical applicability. All results reported in the paper pertain to this challenging scenario characterized by load torque and speed variations.

5- To quantify the reconstruction error, you used MAE. Why MAE and not MSE? How is it better than the other?

The choice of MAE over MSE was motivated by the excessive variability observed in the reconstruction errors on the test set when using MSE. Due to its squared nature, MSE tends to amplify the effect of large deviations, making the metric highly sensitive to outliers or signal peaks, which are common in vibration data. In contrast, MAE provides a more stable and interpretable evaluation by treating all deviations linearly. This makes it better suited for defining robust thresholds in anomaly detection, where consistency and resilience to noise are crucial.

A dedicated paragraph has been added in Section 4.3.
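The outlier sensitivity argued in this response is easy to demonstrate numerically: a single large residual, such as a vibration peak, inflates MSE quadratically but MAE only linearly. The residual values below are invented for the example:

```python
def mae(errors):
    """Mean absolute error: treats all deviations linearly."""
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error: amplifies large deviations quadratically."""
    return sum(e ** 2 for e in errors) / len(errors)

# Identical small residuals except for one outlier peak, common in vibration data
baseline = [0.1, 0.1, 0.1, 0.1]
with_peak = [0.1, 0.1, 0.1, 2.0]

mae_ratio = mae(with_peak) / mae(baseline)  # grows linearly with the peak
mse_ratio = mse(with_peak) / mse(baseline)  # grows quadratically with the peak
```

The much larger jump in `mse_ratio` is what makes MSE-based thresholds unstable on peaky test-set errors, motivating the choice of MAE.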

 

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

no more comments

Author Response

The paper has been revised according to the Editor's comments.
