UnderFSL: Boundary-Preserving Undersampling with Few-Shot Relation Networks for Cross-Machine CNC Fault Diagnosis
Abstract
1. Introduction
1.1. Industrial Motivation and Background
1.2. Challenges and Limitations of Existing Research
1.3. Main Contributions
1.4. Organization of the Paper
2. Preliminaries and Problem Formulation
2.1. System Description
2.2. Problem Formulation: Imbalanced Few-Shot Fault Diagnosis
3. Proposed Methodology
3.1. Time–Frequency Feature Representation via CWT
3.2. Undersampling Techniques for Data Balancing
- Random Undersampling: This method randomly removes samples from the majority class until a target class ratio is reached. While widely used for its simplicity, it risks discarding potentially informative data [15]. Mathematically, it is equivalent to randomly selecting a subset $D'_{maj} \subset D_{maj}$ that satisfies $|D'_{maj}| = r \cdot |D_{min}|$ for a target ratio $r$.
- Instance Hardness Thresholding: This technique removes 'hard' majority-class samples, which are consistently misclassified by multiple classifiers, treating them as noise or class-overlapping data [16]. The instance hardness of each majority-class sample $x$ is given by Equation (2): $IH(x) = \frac{1}{M} \sum_{m=1}^{M} \mathbb{1}[h_m(x) \neq y_x]$. Here, $y_x$ is the true label of sample $x$, $M$ is the number of classifiers, $h_m$ is the $m$-th classifier, and $\mathbb{1}[\cdot]$ is the indicator function. Samples with $IH(x)$ higher than a predefined threshold $t$ are removed, ultimately producing the reduced set $D'_{maj} = \{x \in D_{maj} : IH(x) \leq t\}$.
- Condensed Nearest Neighbor Undersampling: U-CNN aims to find a minimal subset of the majority class that retains the classification performance for the minority class [17]. As described in Algorithm 1, the procedure begins by initializing a store $S$ containing all minority-class samples ($D_{min}$). It then iteratively reviews the majority-class samples ($D_{maj}$) and adds a sample to the store $S$ only if a 1-NN classifier, using only the current contents of $S$, misclassifies that sample. This process is repeated until a full pass adds no new samples, and the final set $S$ consists of high-value, informative samples located primarily near the decision boundaries between classes.
Algorithm 1 Condensed Nearest Neighbor
1:  Input: D_min (set of minority class samples), D_maj (set of majority class samples)
2:  Output: S (the condensed training set)
3:  S ← D_min
4:  C ← D_maj                       ▹ Create a candidate set from D_maj
5:  changed ← true
6:  while changed do
7:      changed ← false
8:      for each sample x in C do
9:          Find x*, the nearest neighbor of x in S
10:         if label(x) ≠ label(x*) then
11:             S ← S ∪ {x}         ▹ Add misclassified sample to S
12:             C ← C \ {x}         ▹ Remove x from candidates
13:             changed ← true
14:         end if
15:     end for
16: end while
17: return S
3.3. Relation Network for Fault Diagnosis
- Episodic Training Paradigm: FSL models are trained in an episodic manner [20]. In each training episode, a small support set and a query set are constructed from the (undersampled) training data. We adopted a 2-way 3-shot setting: 2-way corresponds to our binary classification problem (Normal vs. Abnormal), and the 3-shot setting was chosen to reflect the extreme scarcity of fault data in real-world scenarios. The support set therefore consists of three examples from the normal class and three from the abnormal class.
- Relation Network Architecture: As shown in Figure 2, the Relation Network consists of two main modules. The detailed architecture, including specific layer configurations and output shapes, is provided in Table 1.
- Embedding Module ($f_\varphi$): This module is a CNN-based backbone that maps the input scalogram images into high-dimensional feature embeddings, extracting features from the 64 × 64 inputs.
- Relation Module ($g_\phi$): This module computes a similarity score, termed the relation score. For a given class $c$, the support features are aggregated (e.g., by element-wise summation). The aggregated representation is then concatenated with the query feature $f_\varphi(x_q)$, and the concatenated feature block is fed into the relation module (another neural network), which outputs a relation score $r_{c,q}$ between 0 and 1, indicating the similarity between the query and class $c$.
- Optimization: The model is trained end-to-end to optimize the parameters $\varphi$ and $\phi$. The objective is to drive the relation score toward 1 for matching (query, class) pairs and toward 0 for mismatching pairs. The Mean Squared Error (MSE) loss function is used: $\mathcal{L} = \sum_{i} \sum_{c} \left( r_{c,i} - \mathbb{1}[y_i = c] \right)^2$, where $r_{c,i}$ is the relation score between query $i$ and class $c$, and $y_i$ is the true label of query $i$.
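As a concrete illustration of the episodic paradigm and the MSE objective, the sketch below builds a 2-way 3-shot episode over pre-computed embedding vectors and evaluates the episode loss. The names `sample_episode`, `episode_mse_loss`, and the `relation_fn` stand-in are illustrative assumptions, not the authors' implementation, which uses the CNN modules detailed in Table 1.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_episode(pool_normal, pool_abnormal, n_shot=3, n_query=5):
    """Draw one 2-way n-shot episode (support + query sets) from two class pools."""
    support, queries = {}, []
    for label, pool in ((0, pool_normal), (1, pool_abnormal)):
        idx = rng.choice(len(pool), size=n_shot + n_query, replace=False)
        support[label] = pool[idx[:n_shot]]                 # n_shot support embeddings
        queries += [(pool[i], label) for i in idx[n_shot:]] # labeled query embeddings
    return support, queries

def episode_mse_loss(support, queries, relation_fn):
    """MSE objective: relation score -> 1 for matching (query, class) pairs, 0 otherwise."""
    loss, n = 0.0, 0
    for q, y in queries:
        for c, s in support.items():
            proto = s.sum(axis=0)                            # aggregate support features
            score = relation_fn(np.concatenate([proto, q]))  # stand-in for g_phi
            loss += (score - (1.0 if c == y else 0.0)) ** 2
            n += 1
    return loss / n
```

In training, the loss would be backpropagated through both the relation function and the embedding network; here a fixed `relation_fn` merely demonstrates the score/target bookkeeping.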
4. Experimental Validation and Results
4.1. Dataset Description: Bosch CNC Machining Benchmark
4.2. Experimental Setup and Implementation Details
4.3. Evaluation Metrics
4.4. Analysis of Undersampling Effects
4.5. Baseline Comparisons
4.6. Performance Analysis and Discussion
4.6.1. Impact of Undersampling on Baseline Models
4.6.2. Synergy of Undersampling and Few-Shot Learning
4.6.3. Overall Performance Comparison and Stability
4.7. Feature Space and Error Analysis Visualization
5. Industrial Implications and Discussion
5.1. Generalization Capability and Scalability
5.2. Addressing Data Scarcity in Practice
5.3. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
CAE | Convolutional Autoencoder |
CNC | Computer Numerical Control |
CNN | Convolutional Neural Network |
CWT | Continuous Wavelet Transform |
FSL | Few-Shot Learning |
GAP | Global Average Pooling |
IHT | Instance Hardness Threshold |
IQR | Interquartile Range |
MSE | Mean Squared Error |
OP | Operation |
RMS | Root Mean Square |
StatAE | Statistical Autoencoder |
SVM | Support Vector Machine |
U-CNN | Condensed Nearest Neighbor |
UnderFSL | Undersampling-based Few-shot Learning |
References
- Nath, C. Integrated tool condition monitoring systems and their applications: A comprehensive review. Procedia Manuf. 2020, 48, 852–863. [Google Scholar] [CrossRef]
- Zhao, R.; Yan, R.; Chen, Z.; Mao, K.; Wang, P.; Gao, R.X. Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 2019, 115, 213–237. [Google Scholar] [CrossRef]
- He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284. [Google Scholar] [CrossRef]
- Tnani, M.A.; Feil, M.; Diepold, K. Smart data collection system for brownfield CNC milling machines: A new benchmark dataset for data-driven machine monitoring. Procedia CIRP 2022, 107, 131–136. [Google Scholar] [CrossRef]
- Lei, Y.; Li, N.; Guo, L.; Li, N.; Yan, T.; Lin, J. Machinery health prognostics: A systematic review from data acquisition to RUL prediction. Mech. Syst. Signal Process. 2018, 104, 799–834. [Google Scholar] [CrossRef]
- Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. (CSUR) 2020, 53, 1–34. [Google Scholar] [CrossRef]
- Liang, X.; Zhang, M.; Feng, G.; Wang, D.; Xu, Y.; Gu, F. Few-shot learning approaches for fault diagnosis using vibration data: A comprehensive review. Sustainability 2023, 15, 14975. [Google Scholar] [CrossRef]
- Ochal, M.; Patacchiola, M.; Vazquez, J.; Storkey, A.; Wang, S. Few-shot learning with class imbalance. IEEE Trans. Artif. Intell. 2023, 4, 1348–1358. [Google Scholar] [CrossRef]
- Feng, Z.; Liang, M.; Chu, F. Recent advances in time–frequency analysis methods for machinery fault diagnosis: A review with application examples. Mech. Syst. Signal Process. 2013, 38, 165–205. [Google Scholar] [CrossRef]
- Randall, R.B. Vibration-Based Condition Monitoring: Industrial, Automotive and Aerospace Applications; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar]
- Gao, R.X.; Yan, R. Wavelets: Theory and Applications for Manufacturing; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
- Rafiee, J.; Arvani, F.; Harifi, A.; Sadeghi, M. Intelligent condition monitoring of a gearbox using artificial neural network. Mech. Syst. Signal Process. 2007, 21, 1746–1754. [Google Scholar] [CrossRef]
- Peng, Z.K.; Chu, F. Application of the wavelet transform in machine condition monitoring and fault diagnostics: A review with bibliography. Mech. Syst. Signal Process. 2004, 18, 199–221. [Google Scholar] [CrossRef]
- Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2013, arXiv:1312.4400. [Google Scholar]
- Japkowicz, N.; Stephen, S. The class imbalance problem: A systematic study. Intell. Data Anal. 2002, 6, 429–449. [Google Scholar] [CrossRef]
- Smith, M.R.; Martinez, T.; Giraud-Carrier, C. An instance level analysis of data complexity. Mach. Learn. 2014, 95, 225–256. [Google Scholar] [CrossRef]
- Hart, P. The condensed nearest neighbor rule (corresp.). IEEE Trans. Inf. Theory 1968, 14, 515–516. [Google Scholar] [CrossRef]
- Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1199–1208. [Google Scholar]
- Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4080–4090. [Google Scholar]
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D. Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3630–3638. [Google Scholar]
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Chalapathy, R.; Chawla, S. Deep learning for anomaly detection: A survey. arXiv 2019, arXiv:1901.03407. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Maaten, L.v.d.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
Table 1. Detailed architecture of the Relation Network.

| Module | Layer/Block | Configuration | Output Shape |
|---|---|---|---|
| Embedding Module ($f_\varphi$) | Input | CWT Scalogram (3 channels) | 64 × 64 × 3 |
| | Block 1 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 32 × 32 × 64 |
| | Block 2 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 16 × 16 × 64 |
| | Block 3 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 8 × 8 × 64 |
| | Block 4 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 4 × 4 × 64 |
| Relation Module ($g_\phi$) | Input | Concatenated Features (Block 4 × 2) | 4 × 4 × 128 |
| | Block 5 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 2 × 2 × 64 |
| | Block 6 | Conv(3 × 3, 64), BN, ReLU, MaxPool(2 × 2) | 1 × 1 × 64 |
| | Flatten | - | 64 |
| | FC 1 | 8 nodes, ReLU | 8 |
| | FC 2 (Output) | 1 node, Sigmoid | 1 |
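The layer specifications in Table 1 can be transcribed directly into a deep learning framework. The following is a minimal PyTorch sketch under that table's configuration; the module and function names (`conv_block`, `EmbeddingModule`, `RelationModule`) are ours, not the authors' released code.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch=64):
    """Conv(3 x 3, 64) + BatchNorm + ReLU + MaxPool(2 x 2), as in Table 1."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

class EmbeddingModule(nn.Module):
    """f_phi: 3 x 64 x 64 CWT scalogram -> 64 x 4 x 4 feature map (Blocks 1-4)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(3), conv_block(64),
                                 conv_block(64), conv_block(64))

    def forward(self, x):
        return self.net(x)

class RelationModule(nn.Module):
    """g_phi: concatenated 128 x 4 x 4 features -> relation score in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(conv_block(128), conv_block(64))  # Blocks 5-6
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(64, 8), nn.ReLU(),
                                nn.Linear(8, 1), nn.Sigmoid())      # FC 1-2

    def forward(self, pair):
        return self.fc(self.conv(pair))
```

A query embedding and a summed support embedding are concatenated along the channel dimension (64 + 64 = 128) before entering the relation module, reproducing the 4 × 4 × 128 input row of Table 1.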
| Model | Undersampling | Accuracy | Recall | Precision | F1-Score |
|---|---|---|---|---|---|
| SVM | None | | | | |
| | Random | | | | |
| | IHT | | | | |
| | U-CNN | | | | |
| StatAE | None | | | | |
| VGGNet | None | | | | |
| | Random | | | | |
| | U-CNN | | | | |
| | IHT | | | | |
| UnderFSL | None | | | | |
| | Random | | | | |
| | IHT | | | | |
| | U-CNN | | | | |
Kim, J.; Kim, J.; Lee, H.-U.; Choi, O.; Kim, S. UnderFSL: Boundary-Preserving Undersampling with Few-Shot Relation Networks for Cross-Machine CNC Fault Diagnosis. Electronics 2025, 14, 3699. https://doi.org/10.3390/electronics14183699