Robust Industrial Surface Defect Detection Using Statistical Feature Extraction and Capsule Network Architectures
Abstract
1. Introduction
1.1. Objective of the Work
1.2. Main Contributions
- A hybrid framework for detecting industrial surface defects is proposed, combining statistical feature extraction with classical ML models and advanced capsule network architectures.
- A comparative evaluation of five widely used classifiers (RF, KNN, LR, GB, and SVM) against four capsule network variants (Capsule3D, AttnCaps, SpectralCaps, and GraphCaps) is conducted.
- Feature extraction and selection techniques are applied to identify the most relevant parameters for model input and enhance the robustness of the classification process.
- Experimental results show that the combination of statistical descriptors with capsule networks can match or outperform traditional ML models, even in scenarios with limited or imbalanced data.
- A scalable and reliable methodology applicable to real industrial environments is presented, contributing to the advancement of intelligent quality control systems.
2. Foundations of ML and DL
2.1. ML
- RF: An ensemble of decision trees trained on random subsets of the data and features. Each tree produces a prediction for an input sample x, and the final output is obtained by majority voting: \( \hat{y} = \operatorname{mode}\{h_1(x), \ldots, h_T(x)\} \), where \(h_t(x)\) is the prediction of the \(t\)-th tree.
- SVM: Seeks a hyperplane defined by w and b that maximizes the margin between classes. The optimization problem is: \( \min_{w,b} \tfrac{1}{2}\lVert w \rVert^2 \) subject to \( y_i(w^\top x_i + b) \ge 1 \) for all training samples \((x_i, y_i)\).
- LR: Models the probability of belonging to the positive class as: \( P(y = 1 \mid x) = \dfrac{1}{1 + e^{-(w^\top x + b)}} \).
- KNN: Classifies a sample based on the majority label among its k closest neighbors, according to a distance metric \(d\), typically Euclidean: \( d(x, x_i) = \lVert x - x_i \rVert_2 \) (a minimal NumPy illustration of this rule follows the list).
- GB: Builds an additive model: \( F_M(x) = \sum_{m=1}^{M} \gamma_m h_m(x) \), where each weak learner \(h_m\) is fitted to the negative gradient of the loss of the current ensemble.
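To make the KNN decision rule above concrete, the following minimal NumPy sketch (an illustration only, not the authors' implementation; the toy data and the value of k are assumptions) classifies a sample by majority vote among its k nearest neighbors under the Euclidean distance:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=5):
    """Classify x by majority vote among its k nearest neighbors (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)   # d(x, x_i) = ||x - x_i||_2
    nearest = np.argsort(dists)[:k]               # indices of the k closest training samples
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy usage with two illustrative classes (values are hypothetical)
X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.85, 0.75]), k=3))  # -> 1
```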
2.2. DL
- Input Layer: Receives raw or preprocessed image data.
- Convolutional Layer: Extracts low-level features such as edges and textures, similar to CNNs.
- Primary Capsules: Groups of neurons organized into capsules that encode simple features as vectors representing their presence and pose (orientation, position, scale).
- Digit Capsules: Higher-level capsules that receive input from primary capsules and encode more complex features representing specific classes (e.g., defect categories).
- Squash Function: A specialized activation function applied to normalize the length of capsule output vectors between 0 and 1, preserving pose information. The function is defined as: \( v_j = \dfrac{\lVert s_j \rVert^2}{1 + \lVert s_j \rVert^2} \dfrac{s_j}{\lVert s_j \rVert} \), where \(s_j\) is the total input to capsule \(j\) and \(v_j\) its output (see the sketch after this list).
- Decision Layer: Computes the length (norm) of each class capsule’s output vector, interpreting it as the probability of the corresponding class, with the final prediction corresponding to the capsule with the greatest length.
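As a concrete reference for the squash nonlinearity and the length-based decision layer described above, the following NumPy sketch (a generic illustration with an assumed capsule dimensionality, not the exact network used in this work) normalizes capsule vectors and selects the class whose capsule output is longest:

```python
import numpy as np

def squash(s, eps=1e-8):
    """Scale vector s so its length lies in (0, 1) while preserving its direction."""
    norm2 = np.sum(s ** 2, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

# Two class capsules (normal vs. defective) with 8-dimensional pose vectors (dimension assumed)
class_capsules = np.random.randn(2, 8)
v = squash(class_capsules)                       # squashed capsule outputs v_j
class_probs = np.linalg.norm(v, axis=-1)         # capsule length interpreted as class probability
predicted_class = int(np.argmax(class_probs))    # decision layer: longest capsule wins
```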
Advanced Capsule Network Variants
- Capsule3D: Extends the capsule concept to three-dimensional data, where each capsule is a vector representing instantiation parameters. The transformation between capsules is done through weight matrices \(W_{ij}\): \( \hat{u}_{j|i} = W_{ij} u_i \), where \(u_i\) is the output vector of lower-level capsule \(i\).
- AttnCaps: Introduces an attention mechanism assigning weights to connections between capsules to emphasize the most relevant relationships. The output of an upper-level capsule is computed as: \( s_j = \sum_i \alpha_{ij}\, \hat{u}_{j|i} \), where the attention coefficients \(\alpha_{ij}\) are normalized over the lower-level capsules (illustrated in the sketch after this list).
- SpectralCaps: Applies spectral transformations (e.g., Fourier or Wavelet transforms) to capsule outputs to capture frequency-domain information: \( \tilde{v}_j = \mathcal{T}(v_j) \), where \(\mathcal{T}\) denotes the chosen spectral transform.
- GraphCaps: Combines capsule networks with graph structures to model non-Euclidean relationships among capsules. Given a graph with nodes V representing capsules and edges E their connections, the capsule states update as: \( h_i^{(t+1)} = \sigma\!\left( \sum_{j \in \mathcal{N}(i)} W h_j^{(t)} \right) \), where \(\mathcal{N}(i)\) is the set of neighbors of capsule \(i\).
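The sketch below illustrates in NumPy the attention-weighted aggregation used in AttnCaps-style routing, as referenced above; the capsule counts, pose dimension, scoring function, and random inputs are assumptions made purely for illustration:

```python
import numpy as np

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

num_lower, num_upper, dim = 32, 2, 8                  # assumed capsule counts and pose dimension
u = np.random.randn(num_lower, dim)                   # lower-level capsule outputs u_i
W = np.random.randn(num_lower, num_upper, dim, dim)   # transformation matrices W_ij

u_hat = np.einsum('ijkl,il->ijk', W, u)               # predictions u_hat_{j|i} = W_ij u_i
scores = np.linalg.norm(u_hat, axis=-1)               # simple relevance score (an assumption)
alpha = softmax(scores, axis=0)                       # attention coefficients, normalized over i
s = np.einsum('ij,ijk->jk', alpha, u_hat)             # s_j = sum_i alpha_ij * u_hat_{j|i}
```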
3. Materials and Methodology
3.1. Training Phase
3.1.1. Dataset Preparation
3.1.2. Preprocessing
3.1.3. Feature Extraction
3.1.4. Model Training and Optimization
3.2. Evaluation Phase
3.2.1. Test Set Preprocessing
3.2.2. Model Evaluation
3.2.3. Performance Measurement
3.3. Dataset
- Dataset Original → 1300 grayscale images of 512 × 512 pixels without augmentation, divided into two classes: normal (parts without visible surface anomalies, OK_FRONT) and defective (parts with flaws such as porosity, open holes, flashing, cracks, and stains, DEF_FRONT).
- Dataset Expanded → 7348 grayscale images of 300 × 300 pixels with augmentation applied, organized into training (train) and testing (test) folders and divided into the same two classes (a hedged loading sketch is shown after this list).
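A hedged loading sketch for the expanded dataset is shown below; the directory layout (a root folder with train/ and test/ subfolders, each holding one subfolder per class) follows the description above, but the exact paths and batch size are assumptions rather than the authors' settings:

```python
import tensorflow as tf

# Illustrative loader; "casting_data/train" and "casting_data/test" are assumed paths,
# each containing ok_front/ and def_front/ class subfolders.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "casting_data/train",
    labels="inferred",
    label_mode="binary",
    color_mode="grayscale",
    image_size=(300, 300),   # expanded dataset resolution
    batch_size=32,
    seed=42,
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "casting_data/test",
    labels="inferred",
    label_mode="binary",
    color_mode="grayscale",
    image_size=(300, 300),
    batch_size=32,
)
```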
3.4. Feature Extraction and Selection
3.5. 3D Convolutional Neural Network Approach with Simulated HSI
3.5.1. HSI Simulation
3.5.2. Visualizations of Simulated HSI
3.6. Model Training and Evaluation
3.6.1. CNN3D Architecture
3.6.2. Evaluation
3.7. Evaluation Metrics
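The per-class metrics reported in the tables below (precision, sensitivity, specificity, F1, and accuracy) can all be derived from a binary confusion matrix, as in the following generic sketch (not the authors' evaluation script):

```python
from sklearn.metrics import confusion_matrix

def binary_metrics(y_true, y_pred):
    """Precision, sensitivity, specificity, F1, and accuracy for the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    precision   = tp / (tp + fp)
    sensitivity = tp / (tp + fn)                  # recall of the positive class
    specificity = tn / (tn + fp)                  # recall of the negative class
    f1          = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    return precision, sensitivity, specificity, f1, accuracy
```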
4. Results
- Path 1: Raw images are directly input into DL models such as AttnCapsNet, 3D CapsNet, Capsule3D, SpectralCaps, PrimaryCaps, ConvCaps, GraphCaps, and ResNet50.
- Path 2: DL models are employed to extract abstract features from images, which are subsequently used for classification within DL architectures.
- Path 3: Raw images are directly used as input to traditional ML classifiers, including RF, LR, KNN, SVM, and GB.
- Path 4: Statistical features are extracted from images and then used as input for the same set of ML classifiers.
4.1. Feature Extraction and Correlation-Based Selection
4.2. Statistical Significance Evaluation
4.3. Feature Importance Analysis via PCA
4.4. Exploratory Visual Analysis
4.5. Dataset Overview for Modeling
- Number of samples: 1300
- Number of selected parameters: 10
4.6. ML
4.7. DL
4.8. Comparison with Previous Work
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rojas Santelices, I.; Cano, S.; Moreira, F.; Peña Fritz, Á. Artificial Vision Systems for Fruit Inspection and Classification: Systematic Literature Review. Sensors 2025, 25, 1524. [Google Scholar] [CrossRef]
- Li, Y.; Zhang, D. Toward Efficient Edge Detection: A Novel Optimization Method Based on Integral Image Technology and Canny Edge Detection. Processes 2025, 13, 293. [Google Scholar] [CrossRef]
- He, Y.; Liu, Z.; Guo, Y.; Zhu, Q.; Fang, Y.; Yin, Y.; Wang, Y.; Zhang, B.; Liu, Z. UAV based sensing and imaging technologies for power system detection, monitoring and inspection: A review. Nondestruct. Test. Eval. 2024, 1–68. [Google Scholar] [CrossRef]
- Yang, J.; Lee, C.H. Real-Time Data-Driven Method for Bolt Defect Detection and Size Measurement in Industrial Production. Actuators 2025, 14, 185. [Google Scholar] [CrossRef]
- Shubham; Banerjee, D. Application of CNN and KNN Algorithms for Casting Defect Classification. In Proceedings of the 2024 First International Conference on Innovations in Communications, Electrical and Computer Engineering (ICICEC), Davangere, India, 24–25 October 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Gaaloul, Y.; Bel Hadj Brahim Kechiche, O.; Oudira, H.; Chouder, A.; Hamouda, M.; Silvestre, S.; Kichou, S. Faults Detection and Diagnosis of a Large-Scale PV System by Analyzing Power Losses and Electric Indicators Computed Using Random Forest and KNN-Based Prediction Models. Energies 2025, 18, 2482. [Google Scholar] [CrossRef]
- Ghosh, K.; Bellinger, C.; Corizzo, R.; Branco, P.; Krawczyk, B.; Japkowicz, N. The class imbalance problem in deep learning. Mach. Learn. 2024, 113, 4845–4901. [Google Scholar] [CrossRef]
- Ebrahimi, N.; Kim, H.S.; Blaauw, D. Physical Layer Secret Key Generation Using Joint Interference and Phase Shift Keying Modulation. IEEE Trans. Microw. Theory Tech. 2021, 69, 2673–2685. [Google Scholar] [CrossRef]
- Trinidad-Fernández, M.; Beckwée, D.; Cuesta-Vargas, A.; González-Sánchez, M.; Moreno, F.A.; González-Jiménez, J.; Joos, E.; Vaes, P. Validation, Reliability, and Responsiveness Outcomes of Kinematic Assessment with an RGB-D Camera to Analyze Movement in Subacute and Chronic Low Back Pain. Sensors 2020, 20, 689. [Google Scholar] [CrossRef] [PubMed]
- Barghikar, F.; Tabataba, F.S.; Soorki, M.N. Resource Allocation for mmWave-NOMA Communication Through Multiple Access Points Considering Human Blockages. IEEE Trans. Commun. 2021, 69, 1679–1692. [Google Scholar] [CrossRef]
- Zhang, Z.; Cheng, Q.; Qi, B.; Tao, Z. A general approach for the machining quality evaluation of S-shaped specimen based on POS-SQP algorithm and Monte Carlo method. J. Manuf. Syst. 2021, 60, 553–568. [Google Scholar] [CrossRef]
- Kergus, P. Data-Driven Control of Infinite Dimensional Systems: Application to a Continuous Crystallizer. IEEE Control Syst. Lett. 2021, 5, 2120–2125. [Google Scholar] [CrossRef]
- Li, P.; Fei, Q.; Chen, Z.; Liu, X. Interpretable Multi-Channel Capsule Network for Human Motion Recognition. Electronics 2023, 12, 4313. [Google Scholar] [CrossRef]
- Sekiyama, K.; Kikuma, N.; Sakakibara, K.; Sugimoto, Y. Blind Signal Separation Using Array Antenna with Modified Optimal-Stepsize CMA. In Proceedings of the 2020 International Symposium on Antennas and Propagation (ISAP), Taipei, Taiwan, 19–22 October 2021; pp. 799–800. [Google Scholar] [CrossRef]
- Roscia, F.; Cumerlotti, A.; Del Prete, A.; Semini, C.; Focchi, M. Orientation Control System: Enhancing Aerial Maneuvers for Quadruped Robots. Sensors 2023, 23, 1234. [Google Scholar] [CrossRef]
- Adhinata, F.D.; Wahyono; Sumiharto, R. A comprehensive survey on weed and crop classification using machine learning and deep learning. Artif. Intell. Agric. 2024, 13, 45–63. [Google Scholar] [CrossRef]
- Silva, J.L.d.S.; Paula, M.V.d.; Barros, J.d.S.G.; Barros, T.A.D.S. Anomaly Detection Workflow Using Random Forest Regressor in Large-Scale Photovoltaic Power Plants. IEEE Access 2025, 13, 54168–54176. [Google Scholar] [CrossRef]
- Thango, B.A. Winding Fault Detection in Power Transformers Based on Support Vector Machine and Discrete Wavelet Transform Approach. Technologies 2025, 13, 200. [Google Scholar] [CrossRef]
- Amaral, A.M.R. Enhancing Power Converter Reliability Through a Logistic Regression-Based Non-Invasive Fault Diagnosis Technique. Appl. Sci. 2025, 15, 6971. [Google Scholar] [CrossRef]
- Hsiao, C.H.; Su, H.C.; Wang, Y.T.; Hsu, M.J.; Hsu, C.C. ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification. Sensors 2025, 25, 4233. [Google Scholar] [CrossRef] [PubMed]
- Liu, H.; Meng, X. Explainable Ensemble Learning Model for Residual Strength Forecasting of Defective Pipelines. Appl. Sci. 2025, 15, 4031. [Google Scholar] [CrossRef]
- Saleh, M.A.; Darwish, A.; Ghrayeb, A.; Refaat, S.S.; Abu-Rub, H.; Khatri, S.P.; El-Hag, A.H.; Kumru, C.F. CapsPDNet: Optimized Capsule Network for Predicting Insulator Discharges Using UHF Signals. IEEE Trans. Instrum. Meas. 2025, 74, 1–17. [Google Scholar] [CrossRef]
- Zhao, Y.; Birdal, T.; Deng, H.; Tombari, F. 3D Point Capsule Networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 1009–1018. [Google Scholar] [CrossRef]
- Hoogi, A.; Wilcox, B.; Gupta, Y.; Rubin, D.L. Self-Attention Capsule Networks for Image Classification. arXiv 2019, arXiv:1904.12483. Available online: http://arxiv.org/abs/1904.12483 (accessed on 28 September 2025).
- Zhu, K.; Chen, Y.; Ghamisi, P.; Jia, X.; Benediktsson, J.A. Deep Convolutional Capsule Network for Hyperspectral Image Spectral and Spectral-Spatial Classification. Remote Sens. 2019, 11, 223. [Google Scholar] [CrossRef]
- Verma, S.; Zhang, Z.L. Graph Capsule Convolutional Neural Networks. arXiv 2018, arXiv:1805.08090. [Google Scholar] [CrossRef]
- Mjahad, A.; Rosado-Muñoz, A. Optimizing Tumor Detection in Brain MRI with One-Class SVM and Convolutional Neural Network-Based Feature Extraction. J. Imaging 2025, 11, 207. [Google Scholar] [CrossRef] [PubMed]
- Hridoy, M.; Rahman, M.; Sakib, S. A Framework for Industrial Inspection System using Deep Learning. Ann. Data Sci. 2024, 11, 445–478. [Google Scholar] [CrossRef]
- Stephen, O.; Madanian, S.; Nguyen, M. A Hard Voting Policy-Driven Deep Learning Architectural Ensemble Strategy for Industrial Products Defect Recognition and Classification. Sensors 2022, 22, 7846. [Google Scholar] [CrossRef]
- Tsiktsiris, D.; Sanida, T.; Sideris, A.; Dasygenis, M. Accelerated defective product inspection on the edge using deep learning. In Recent Advances in Manufacturing Modelling and Optimization: Select Proceedings of RAM 2021; Lecture Notes in Mechanical Engineering (LNME); Springer: Berlin/Heidelberg, Germany, 2022; pp. 185–191. [Google Scholar]
- Nguyen, H.T.; Yu, G.H.; Shin, N.R.; Kwon, G.J.; Kwak, W.Y.; Kim, J.Y. Defective Product Classification System for Smart Factory Based on Deep Learning. Electronics 2021, 10, 826. [Google Scholar] [CrossRef]
- Apostolopoulos, I.D.; Tzani, M.A. Industrial object and defect recognition utilizing multilevel feature extraction from industrial scenes with Deep Learning approach. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 10263–10276. [Google Scholar] [CrossRef]
- Dabhi, R. Casting Product Image Data for Quality Inspection. Kaggle Dataset. 2020. Available online: https://www.kaggle.com/datasets/ravirajsinh45/real-life-industrial-dataset-of-casting-product (accessed on 10 January 2023).
Parameter | Description |
---|---|
Statistical, Texture, and Shape | |
Mean, Max, Min, Range, Std. Dev. | Basic intensity statistics |
LBP, LBP Histogram | Local Binary Patterns and their histogram |
GLCM (contrast, correlation, energy, homogeneity) | Gray-Level Co-occurrence Matrix features |
HOG (mean) | Mean of Histogram of Oriented Gradients descriptor |
Hu Moments | Scale, rotation, translation invariant moments |
Edge Density | Ratio of detected edge pixels |
Entropy | |
Image Entropy | Measure of intensity disorder |
Histogram Entropy | Entropy after various histogram transforms |
FFT Entropy | Entropy of Fourier magnitude spectrum |
Frequency | |
Fourier Transform (mean, max, power) | Magnitude spectrum of image |
Wavelet Coefficients | Multiscale decomposition coefficients |
Bispectrum (mean and max values) | Higher-order spectral features |
Other Domains: Fractal, Statistical, Complexity | |
Fractal Dimension (Katz) | Measure of spatial roughness |
Poincaré Features | Variability based on lagged differences |
Lempel-Ziv Complexity | Algorithmic complexity of binary sequence |
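To illustrate how descriptors of this kind can be obtained, the sketch below computes a few of the listed features (basic intensity statistics, an LBP histogram, GLCM contrast, FFT entropy, and edge density) with NumPy and scikit-image; parameter choices such as the LBP radius, GLCM distance, and histogram binning are assumptions, not the exact settings used in this work:

```python
import numpy as np
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops, canny

def extract_features(img):
    """img: 2-D uint8 grayscale image -> dict with a few illustrative descriptors."""
    feats = {"mean": img.mean(), "std": img.std(),
             "range": float(img.max()) - float(img.min())}

    # Local Binary Patterns (uniform, 8 neighbors, radius 1 -- assumed settings)
    lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, density=True)
    feats.update({f"lbp_hist_{i}": h for i, h in enumerate(hist)})

    # GLCM contrast at distance 1, angle 0 (assumed settings)
    glcm = graycomatrix(img, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)
    feats["glcm_contrast"] = graycoprops(glcm, "contrast")[0, 0]

    # Entropy of the normalized Fourier magnitude spectrum
    mag = np.abs(np.fft.fft2(img)).ravel()
    p = mag / mag.sum()
    feats["fft_entropy"] = float(-np.sum(p * np.log2(p + 1e-12)))

    # Edge density: fraction of pixels marked as edges by a Canny detector
    feats["edge_density"] = float(canny(img).mean())
    return feats
```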
Layer (Type) | Output Shape | Parameters |
---|---|---|
Conv3D | (None, 126, 126, 8, 32) | 896 |
MaxPooling3D | (None, 63, 63, 4, 32) | 0 |
Conv3D_1 | (None, 61, 61, 2, 64) | 55,360 |
MaxPooling3D_1 | (None, 30, 30, 1, 64) | 0 |
Flatten | (None, 57600) | 0 |
Dense | (None, 128) | 7,372,928 |
Dense_1 | (None, 2) | 258 |
Total parameters | 7,429,442 (28.34 MB) | |
Trainable parameters | 7,429,442 (28.34 MB) | |
Non-trainable parameters | 0 |
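A Keras reconstruction consistent with the layer shapes and parameter counts in the table above is sketched below; the input shape of (128, 128, 10, 1), the pooling sizes, and the activation and optimizer choices are inferred or assumed rather than taken from the original code:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 10, 1)),                        # 10 simulated spectral bands (inferred)
    layers.Conv3D(32, kernel_size=(3, 3, 3), activation="relu"),  # -> (126, 126, 8, 32), 896 params
    layers.MaxPooling3D(pool_size=(2, 2, 2)),                     # -> (63, 63, 4, 32)
    layers.Conv3D(64, kernel_size=(3, 3, 3), activation="relu"),  # -> (61, 61, 2, 64), 55,360 params
    layers.MaxPooling3D(pool_size=(2, 2, 2)),                     # -> (30, 30, 1, 64)
    layers.Flatten(),                                             # -> 57,600 features
    layers.Dense(128, activation="relu"),                         # 7,372,928 params
    layers.Dense(2, activation="softmax"),                        # 258 params
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()   # total parameters: 7,429,442, matching the table
```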
No. | Parameter Name |
---|---|
1 | bordes_array_0 |
2 | FT_max_magnitud_FT |
3 | FT_array_0 |
4 | Zero_Crossing |
5 | fft_entropy.1 |
6 | fft_entropy |
7 | Hsqrt |
8 | lbp_hist_8 |
9 | sbp_0 |
10 | sbp_4 |
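The correlation-based selection referred to in Sections 3.4 and 4.1 can be sketched with pandas as follows: descriptors that are highly inter-correlated are dropped and only the remaining columns are retained. The 0.95 threshold and the DataFrame layout (one row per image, one column per extracted parameter) are assumptions for illustration:

```python
import numpy as np
import pandas as pd

def drop_correlated(features: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Remove one column from every pair whose absolute Pearson correlation exceeds the threshold."""
    corr = features.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))  # keep upper triangle only
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop)

# Hypothetical usage:
# selected = drop_correlated(feature_table, threshold=0.95)
```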
Model | Default Parameters |
---|---|
RF | n_estimators = 100; max_depth = None; min_samples_split = 2; min_samples_leaf = 1; max_features = auto; random_state = 42 |
LR | penalty = l2; C = 1.0; solver = lbfgs; max_iter = 1000; random_state = 42 |
K-Nearest Neighbors | n_neighbors = 5; weights = uniform; metric = minkowski; p = 2 |
SVM (RBF Kernel) | C = 1.0; kernel = rbf; gamma = scale; probability = True; random_state = 42 |
Gradient Boosting | n_estimators = 100; learning_rate = 0.1; max_depth = 3; min_samples_split = 2; min_samples_leaf = 1; subsample = 1.0; random_state = 42 |
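The default configurations listed above map directly onto scikit-learn estimators, as in the minimal sketch below (note that recent scikit-learn releases no longer accept max_features = auto for Random Forest; its classifier equivalent, "sqrt", is used here):

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

models = {
    "RF": RandomForestClassifier(n_estimators=100, max_depth=None, min_samples_split=2,
                                 min_samples_leaf=1, max_features="sqrt", random_state=42),
    "LR": LogisticRegression(penalty="l2", C=1.0, solver="lbfgs", max_iter=1000, random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5, weights="uniform", metric="minkowski", p=2),
    "SVM": SVC(C=1.0, kernel="rbf", gamma="scale", probability=True, random_state=42),
    "GB": GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3,
                                     min_samples_split=2, min_samples_leaf=1, subsample=1.0,
                                     random_state=42),
}

# Usage (X_*: selected feature matrices, y_*: labels):
# for name, clf in models.items():
#     clf.fit(X_train, y_train)
#     print(name, clf.score(X_test, y_test))
```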
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
RF | 0 | 90.4 ± 1.1 | 90.4 ± 1.1 | 88.3 ± 0.5 | 90.4 ± 1.1 | 89.2 ± 0.6 |
RF | 1 | 93.2 ± 0.7 | 88.3 ± 0.5 | 90.4 ± 1.1 | 90.7 ± 0.5 | 89.2 ± 0.6
LR | 0 | 74.4 ± 0.5 | 74.4 ± 0.5 | 80.1 ± 0.0 | 74.4 ± 0.5 | 77.8 ± 0.2 |
LR | 1 | 82.5 ± 0.3 | 80.1 ± 0.0 | 74.4 ± 0.5 | 81.3 ± 0.1 | 77.8 ± 0.2
K-Nearest Neighbors | 0 | 87.5 ± 0.0 | 87.5 ± 0.0 | 89.1 ± 0.0 | 87.5 ± 0.0 | 88.5 ± 0.0 |
K-Nearest Neighbors | 1 | 91.4 ± 0.0 | 89.1 ± 0.0 | 87.5 ± 0.0 | 90.3 ± 0.0 | 88.5 ± 0.0
SVM (RBF Kernel) | 0 | 82.7 ± 0.0 | 82.7 ± 0.0 | 81.4 ± 0.0 | 82.7 ± 0.0 | 81.9 ± 0.0 |
SVM (RBF Kernel) | 1 | 87.6 ± 0.0 | 81.4 ± 0.0 | 82.7 ± 0.0 | 84.4 ± 0.0 | 81.9 ± 0.0
Gradient Boosting | 0 | 86.9 ± 0.5 | 86.9 ± 0.5 | 85.1 ± 0.9 | 86.9 ± 0.5 | 85.8 ± 0.4 |
Gradient Boosting | 1 | 90.7 ± 0.2 | 85.1 ± 0.9 | 86.9 ± 0.5 | 87.8 ± 0.4 | 85.8 ± 0.4
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
RF | 0 | 96.7 ± 1.6 | 96.7 ± 1.6 | 93.2 ± 2.4 | 96.7 ± 1.6 | 94.6 ± 1.4 |
RF | 1 | 97.7 ± 1.1 | 93.2 ± 2.4 | 96.7 ± 1.6 | 95.4 ± 1.2 | 94.6 ± 1.4
LR | 0 | 65.8 ± 5.4 | 65.8 ± 5.4 | 78.6 ± 2.9 | 65.8 ± 5.4 | 73.5 ± 3.4 |
LR | 1 | 77.5 ± 3.1 | 78.6 ± 2.9 | 65.8 ± 5.4 | 78.0 ± 2.8 | 73.5 ± 3.4
K-Nearest Neighbors | 0 | 66.2 ± 5.6 | 66.2 ± 5.6 | 78.3 ± 2.1 | 66.2 ± 5.6 | 73.5 ± 2.5 |
K-Nearest Neighbors | 1 | 77.7 ± 2.9 | 78.3 ± 2.1 | 66.2 ± 5.6 | 78.0 ± 1.9 | 73.5 ± 2.5
SVM (RBF Kernel) | 0 | 67.3 ± 3.8 | 67.3 ± 3.8 | 80.5 ± 2.3 | 67.3 ± 3.8 | 75.2 ± 1.9 |
SVM (RBF Kernel) | 1 | 78.7 ± 1.9 | 80.5 ± 2.3 | 67.3 ± 3.8 | 79.6 ± 1.6 | 75.2 ± 1.9
Gradient Boosting | 0 | 97.3 ± 1.4 | 97.3 ± 1.4 | 92.9 ± 2.0 | 97.3 ± 1.4 | 94.7 ± 1.3 |
Gradient Boosting | 1 | 98.1 ± 1.0 | 92.9 ± 2.0 | 97.3 ± 1.4 | 95.4 ± 1.2 | 94.7 ± 1.3
Method | AttnCapsNet | ConvCaps | 3D Capsule Networks | Primary | Capsule 3D | Spectral |
---|---|---|---|---|---|---|
Time (s/image) |
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
ResNet | 0 | 64.23 ± 3.96 | 42.31 ± 42.31 | 81.41 ± 15.38 | 49.72 ± 32.91 | 65.77 ± 9.23 |
ResNet | 1 | 60.00 ± 0.00 | 100.00 ± 0.00 | 0.00 ± 0.00 | 75.00 ± 0.00 | 60.00 ± 0.00
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
AttnCapsNet | 0 | 90.6 ± 0.9 | 90.6 ± 0.9 | 90.4 ± 2.3 | 90.6 ± 0.9 | 90.5 ± 1.1 |
AttnCapsNet | 1 | 93.5 ± 0.5 | 90.4 ± 2.3 | 90.6 ± 0.9 | 91.9 ± 1.0 | 90.5 ± 1.1
ConvCaps | 0 | 89.8 ± 1.0 | 89.8 ± 1.0 | 90.5 ± 1.1 | 89.8 ± 1.0 | 90.2 ± 0.9 |
ConvCaps | 1 | 93.0 ± 0.7 | 90.5 ± 1.1 | 89.8 ± 1.0 | 91.7 ± 0.8 | 90.2 ± 0.9
3D CapsNet | 0 | 91.0 ± 1.4 | 91.0 ± 1.4 | 91.5 ± 1.2 | 91.0 ± 1.4 | 91.3 ± 1.1 |
3D CapsNet | 1 | 93.9 ± 1.0 | 91.5 ± 1.2 | 91.0 ± 1.4 | 92.6 ± 0.9 | 91.3 ± 1.1
Primary | 0 | 89.0 ± 2.2 | 89.0 ± 2.2 | 87.8 ± 1.6 | 89.0 ± 2.2 | 88.3 ± 1.3 |
Primary | 1 | 92.3 ± 1.5 | 87.8 ± 1.6 | 89.0 ± 2.2 | 90.0 ± 1.2 | 88.3 ± 1.3
Capsule 3D | 0 | 91.3 ± 1.2 | 91.3 ± 1.2 | 91.3 ± 0.3 | 91.3 ± 1.2 | 91.3 ± 0.3 |
Capsule 3D | 1 | 94.1 ± 0.8 | 91.3 ± 0.3 | 91.3 ± 1.2 | 92.7 ± 0.2 | 91.3 ± 0.3
Spectral | 0 | 76.7 ± 2.5 | 76.7 ± 2.5 | 80.3 ± 3.6 | 76.7 ± 2.5 | 78.8 ± 2.9 |
Spectral | 1 | 83.8 ± 2.0 | 80.3 ± 3.6 | 76.7 ± 2.5 | 82.0 ± 2.7 | 78.8 ± 2.9
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
ResNet50 | 0 | 98.0 ± 2.0 | 98.5 ± 1.5 | 97.0 ± 2.0 | 98.2 ± 1.7 | 98.0 ± 1.5 |
ResNet50 | 1 | 97.5 ± 1.8 | 97.0 ± 2.0 | 98.5 ± 1.5 | 97.2 ± 1.6 | 98.0 ± 1.5
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
AttnCapsNet | 0 | 99.6 ± 0.5 | 99.6 ± 0.5 | 97.3 ± 0.9 | 99.6 ± 0.5 | 98.2 ± 0.6 |
AttnCapsNet | 1 | 99.7 ± 0.3 | 97.3 ± 0.9 | 99.6 ± 0.5 | 98.5 ± 0.5 | 98.2 ± 0.6
ConvCaps | 0 | 100.0 ± 0.0 | 100.0 ± 0.0 | 97.8 ± 0.3 | 100.0 ± 0.0 | 98.7 ± 0.2 |
ConvCaps | 1 | 100.0 ± 0.0 | 97.8 ± 0.3 | 100.0 ± 0.0 | 98.9 ± 0.2 | 98.7 ± 0.2
3D CapsNet | 0 | 99.7 ± 0.5 | 99.7 ± 0.5 | 97.9 ± 0.3 | 99.7 ± 0.5 | 98.6 ± 0.2 |
3D CapsNet | 1 | 99.8 ± 0.3 | 97.9 ± 0.3 | 99.7 ± 0.5 | 98.8 ± 0.2 | 98.6 ± 0.2
Primary | 0 | 100.0 ± 0.0 | 100.0 ± 0.0 | 97.0 ± 0.3 | 100.0 ± 0.0 | 98.2 ± 0.2 |
Primary | 1 | 100.0 ± 0.0 | 97.0 ± 0.3 | 100.0 ± 0.0 | 98.5 ± 0.1 | 98.2 ± 0.2
Capsule 3D | 0 | 100.0 ± 0.0 | 100.0 ± 0.0 | 97.8 ± 0.3 | 100.0 ± 0.0 | 98.7 ± 0.2 |
Capsule 3D | 1 | 100.0 ± 0.0 | 97.8 ± 0.3 | 100.0 ± 0.0 | 98.9 ± 0.2 | 98.7 ± 0.2
Spectral | 0 | 85.4 ± 1.1 | 85.4 ± 1.1 | 85.0 ± 1.4 | 85.4 ± 1.1 | 85.2 ± 0.7 |
Spectral | 1 | 89.7 ± 0.6 | 85.0 ± 1.4 | 85.4 ± 1.1 | 87.3 ± 0.7 | 85.2 ± 0.7
Graph | 0 | 99.6 ± 0.5 | 99.6 ± 0.5 | 97.3 ± 0.9 | 99.6 ± 0.5 | 98.2 ± 0.4 |
Graph | 1 | 99.7 ± 0.3 | 97.3 ± 0.9 | 99.6 ± 0.5 | 98.5 ± 0.3 | 98.2 ± 0.4
Method | AttnCapsNet | ConvCaps | 3D Capsule Net. | Primary | Capsule 3D | Spectral |
---|---|---|---|---|---|---|
Time (s/image) |
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
CNN3D | 0 | 81.87 ± 2.20 | 91.75 ± 0.41 | 86.54 ± 1.92 | 86.51 ± 1.05 | 88.61 ± 1.01 |
CNN3D | 1 | 94.08 ± 0.20 | 86.54 ± 1.92 | 91.75 ± 0.41 | 90.14 ± 0.95 | 88.61 ± 1.01
Model | Class | Precision (%) | Sensitivity (%) | Specificity (%) | F1 (%) | Accuracy (%) |
---|---|---|---|---|---|---|
RF | 0 | 99.4 ± 0.2 | 99.4 ± 0.2 | 99.3 ± 0.2 | 99.4 ± 0.2 | 99.3 ± 0.2 |
RF | 1 | 99.7 ± 0.1 | 99.3 ± 0.2 | 99.4 ± 0.2 | 99.5 ± 0.1 | 99.3 ± 0.2
K-Nearest Neighbors | 0 | 99.2 ± 0.0 | 99.2 ± 0.0 | 98.5 ± 0.0 | 99.2 ± 0.0 | 98.7 ± 0.0 |
K-Nearest Neighbors | 1 | 99.6 ± 0.0 | 98.5 ± 0.0 | 99.2 ± 0.0 | 99.0 ± 0.0 | 98.7 ± 0.0
LR | 0 | 83.6 ± 0.3 | 83.6 ± 0.3 | 92.7 ± 0.3 | 83.6 ± 0.3 | 89.4 ± 0.2 |
LR | 1 | 90.7 ± 0.2 | 92.7 ± 0.3 | 83.6 ± 0.3 | 91.7 ± 0.2 | 89.4 ± 0.2
Gradient Boosting | 0 | 96.0 ± 0.2 | 96.0 ± 0.2 | 96.2 ± 0.2 | 96.0 ± 0.2 | 96.2 ± 0.1 |
Gradient Boosting | 1 | 97.6 ± 0.1 | 96.2 ± 0.2 | 96.0 ± 0.2 | 96.9 ± 0.1 | 96.2 ± 0.1
SVM | 0 | 99.2 ± 0.0 | 99.2 ± 0.0 | 99.1 ± 0.0 | 99.2 ± 0.0 | 99.2 ± 0.0 |
SVM | 1 | 99.6 ± 0.0 | 99.1 ± 0.0 | 99.2 ± 0.0 | 99.3 ± 0.0 | 99.2 ± 0.0
Model | Recall (%) | Precision (%) | F1 (%) | Accuracy (%) | Dataset |
---|---|---|---|---|---|
Proposed research (ConvCaps) | 97.90 | 98.0 | 97.60 | 98.0 | Dataset Original |
Proposed research (RF) | 99.60 | 99.60 | 99.60 | 99.50 | Dataset Expanded |
Xception-CNN [28] | 99.62 | 99.62 | 99.62 | 99.72 | Dataset Expanded |
Voting Policy Model [29] | – | – | – | 99.90 | Dataset Expanded |
CNN [30] | 99.72 | 99.72 | 99.72 | – | Dataset Expanded |
ResNet [31] | 98.47 | 97.36 | 97.91 | 98.46 | Dataset Expanded |
Multipath VGG19 [32] | 97.72 | 78.01 | 87.37 | 87.39 | Dataset Expanded |
CNN [30] | 81.71 | 98.12 | 89.14 | 88.52 | Dataset Expanded |
DenseNet [31] | 99.24 | 99.62 | 99.43 | 99.58 | Dataset Expanded |
VGGNet [31] | 97.71 | 96.24 | 96.97 | 97.76 | Dataset Expanded |
GoogLeNet [31] | 99.62 | 98.49 | 99.05 | 99.30 | Dataset Expanded |