Search Results (1,315)

Search Parameters:
Keywords = Vgg-19 net

19 pages, 2136 KB  
Article
Transformer-Based Multi-Class Classification of Bangladeshi Rice Varieties Using Image Data
by Israt Tabassum and Vimala Nunavath
Appl. Sci. 2026, 16(3), 1279; https://doi.org/10.3390/app16031279 - 27 Jan 2026
Abstract
Rice (Oryza sativa L.) is a staple food for over half of the global population, with significant economic, agricultural, and cultural importance, particularly in Asia. Thousands of rice varieties exist worldwide, differing in size, shape, color, and texture, making accurate classification essential for quality control, breeding programs, and authenticity verification in trade and research. Traditional manual identification of rice varieties is time-consuming, error-prone, and heavily reliant on expert knowledge. Deep learning provides an efficient alternative by automatically extracting discriminative features from rice grain images for precise classification. While prior studies have primarily employed deep learning models such as CNNs, VGG, InceptionV3, MobileNet, and DenseNet201, transformer-based models remain underexplored for rice variety classification. This study addresses this gap by applying two transformer-based models, the Swin Transformer and the Vision Transformer (ViT), to multi-class classification of rice varieties using the publicly available PRBD dataset from Bangladesh. Experimental results demonstrate that the ViT model achieved an accuracy of 99.86% with precision, recall, and F1-score all at 0.9986, while the Swin Transformer model obtained an accuracy of 99.44% with a precision of 0.9944, a recall of 0.9944, and an F1-score of 0.9943. These results highlight the effectiveness of transformer-based models for high-accuracy rice variety classification.
(This article belongs to the Section Computing and Artificial Intelligence)
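
The abstract gives no implementation details, so the following is a minimal sketch of the kind of ViT fine-tuning it describes, using torchvision; the class count, the synthetic dataset, and the hyperparameters are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: fine-tuning a pretrained Vision Transformer for multi-class
# classification. FakeData stands in for the PRBD rice images; a real run
# would use datasets.ImageFolder on the actual dataset.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_CLASSES = 5  # assumption; the PRBD dataset defines the real number

# Standard ImageNet preprocessing expected by the pretrained weights.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.FakeData(size=64, image_size=(3, 224, 224),
                              num_classes=NUM_CLASSES, transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
# Replace the 1000-way ImageNet head with a head for our classes.
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```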

18 pages, 14590 KB  
Article
VTC-Net: A Semantic Segmentation Network for Ore Particles Integrating Transformer and Convolutional Block Attention Module (CBAM)
by Yijing Wu, Weinong Liang, Jiandong Fang, Chunxia Zhou and Xiaolu Sun
Sensors 2026, 26(3), 787; https://doi.org/10.3390/s26030787 - 24 Jan 2026
Abstract
In mineral processing, vision-based online particle-size analysis systems depend on high-precision image segmentation to accurately quantify ore particle size distribution, thereby optimizing crushing and sorting operations. However, due to multi-scale variations, severe adhesion, and occlusion within ore particle clusters, existing segmentation models often exhibit undersegmentation and misclassification, leading to blurred boundaries and limited generalization. To address these challenges, this paper proposes a novel semantic segmentation model named VTC-Net. The model employs VGG16 as the backbone encoder, integrates Transformer modules in the deeper layers to capture global contextual dependencies, and incorporates a Convolutional Block Attention Module (CBAM) at the fourth stage to enhance focus on critical regions such as adhesion edges. BatchNorm layers are used to stabilize training. Experiments on ore image datasets show that VTC-Net outperforms mainstream models such as UNet and DeepLabV3 in key metrics, including MIoU (89.90%) and pixel accuracy (96.80%). Ablation studies confirm the effectiveness and complementary role of each module. Visual analysis further demonstrates that the model identifies ore contours and adhesion areas more accurately, significantly improving segmentation robustness and precision under complex operational conditions.
(This article belongs to the Section Sensing and Imaging)
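
As background for the CBAM component named in the title, here is a hedged sketch of a standard CBAM block (channel attention followed by spatial attention, after Woo et al., 2018); VTC-Net's exact configuration may differ.

```python
# Hedged sketch of a Convolutional Block Attention Module (CBAM).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: 7x7 conv over stacked avg/max channel maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(nn.functional.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(nn.functional.adaptive_max_pool2d(x, 1))
        x = x * torch.sigmoid(avg + mx)                      # channel gate
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.spatial(s))            # spatial gate

out = CBAM(64)(torch.randn(2, 64, 32, 32))  # shape preserved: (2, 64, 32, 32)
```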

20 pages, 1978 KB  
Article
UAV-Based Forest Fire Early Warning and Intervention Simulation System with High-Accuracy Hybrid AI Model
by Muhammet Sinan Başarslan and Hikmet Canlı
Appl. Sci. 2026, 16(3), 1201; https://doi.org/10.3390/app16031201 - 23 Jan 2026
Abstract
In this study, a hybrid deep learning model that combines the VGG16 and ResNet101V2 architectures is proposed for image-based fire detection. In addition, a balanced drone guidance algorithm is developed to efficiently assign tasks to available UAVs. In the fire detection phase, the hybrid model is optimized with Global Average Pooling and layer-merging techniques to improve classification performance. The DeepFire dataset was used throughout the training process, achieving an accuracy of 99.72% and a precision of 100%. After fire detection, a task assignment algorithm assigns available drones to fire points at minimum cost and with balanced load distribution. This algorithm performs task assignments using the Hungarian (Kuhn–Munkres) method with cost optimization, and is adapted to direct approximately equal numbers of drones to each fire when the number of fires is less than the number of drones. The developed system was tested in a Python-based simulation environment and evaluated using performance metrics such as total intervention time, energy consumption, and task balance. The results demonstrate that the proposed hybrid model provides highly accurate fire detection and that the task assignment system creates balanced and efficient intervention scenarios.
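
The assignment step is concrete enough to illustrate: the sketch below solves a drone-to-fire assignment with the Hungarian method via SciPy, replicating fires so that roughly equal numbers of drones reach each one. The Euclidean cost and the replication rule are assumptions standing in for the paper's cost model.

```python
# Hedged sketch: balanced drone-to-fire assignment with the Hungarian
# (Kuhn-Munkres) method. Positions and costs are illustrative only.
import numpy as np
from scipy.optimize import linear_sum_assignment

drones = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
fires = np.array([[2.0, 3.0], [8.0, 9.0]])

n_drones, n_fires = len(drones), len(fires)
# Replicate each fire so roughly equal numbers of drones are assigned to it,
# mirroring the paper's balanced-load adaptation (an assumption here).
reps = int(np.ceil(n_drones / n_fires))
fire_idx = np.repeat(np.arange(n_fires), reps)[:n_drones]

# Cost matrix: Euclidean distance from each drone to each (replicated) fire.
cost = np.linalg.norm(drones[:, None, :] - fires[fire_idx][None, :, :], axis=2)
rows, cols = linear_sum_assignment(cost)   # minimum-total-cost assignment
for d, c in zip(rows, cols):
    print(f"drone {d} -> fire {fire_idx[c]} (cost {cost[d, c]:.2f})")
```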

22 pages, 2774 KB  
Article
Uncovering Neural Learning Dynamics Through Latent Mutual Information
by Arianna Issitt, Alex Merino, Lamine Deen, Ryan T. White and Mackenzie J. Meni
Entropy 2026, 28(1), 118; https://doi.org/10.3390/e28010118 - 19 Jan 2026
Abstract
We study how convolutional neural networks reorganize information during learning in natural image classification tasks by tracking mutual information (MI) between inputs, intermediate representations, and labels. Across VGG-16, ResNet-18, and ResNet-50, we find that label-relevant MI grows reliably with depth while input MI depends strongly on architecture and activation, indicating that "compression" is not a universal phenomenon. Within convolutional layers, label information becomes increasingly concentrated in a small subset of channels; inference-time knockouts, shuffles, and perturbations confirm that these high-MI channels are functionally necessary for accuracy. This behavior suggests a view of representation learning driven by selective concentration and decorrelation rather than global information reduction. Finally, we show that a simple dependence-aware regularizer based on the Hilbert–Schmidt Independence Criterion can encourage these same patterns during training, yielding small accuracy gains and consistently faster convergence.
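
A biased empirical HSIC estimate of the kind such a dependence-aware regularizer could build on can be written in a few lines; the sketch below is a generic illustration (the RBF kernels and fixed bandwidth are assumptions, not the paper's choices).

```python
# Hedged sketch: a biased empirical Hilbert-Schmidt Independence Criterion
# (HSIC) estimate between two batches of representations.
import torch

def rbf_gram(x: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Pairwise squared distances -> RBF kernel Gram matrix.
    d2 = torch.cdist(x, x).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    n = x.shape[0]
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    H = torch.eye(n) - torch.ones(n, n) / n   # centering matrix
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2

# Example: dependence between a layer's activations and one-hot labels.
feats = torch.randn(64, 128)                  # hypothetical activations
labels = torch.nn.functional.one_hot(
    torch.randint(0, 10, (64,)), num_classes=10).float()
print(hsic(feats, labels).item())
```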

21 pages, 2749 KB  
Article
A Lightweight Model of Learning Common Features in Different Domains for Classification Tasks
by Dong-Hyun Kang, Kyeong-Taek Kim, Erkinov Habibilloh and Won-Du Chang
Mathematics 2026, 14(2), 326; https://doi.org/10.3390/math14020326 - 18 Jan 2026
Abstract
The increasing size of recent deep neural networks, particularly when applied to learning across multiple domains, limits their deployment in resource-constrained environments. To address this issue, this study proposes a lightweight neural architecture with a parallel structure of convolutional layers to enable efficient and scalable multi-domain learning. The proposed network includes an individual feature extractor for domain-specific features and a common feature extractor for shared features. This design minimizes redundancy and significantly reduces the number of parameters while preserving classification performance. To evaluate the proposed method, experiments were conducted using four image classification datasets: MNIST, FMNIST, CIFAR10, and SVHN. These experiments focused on classification settings where each image contained a single dominant object, without relying on large pretrained models. The proposed model achieved high accuracy while significantly reducing the number of parameters: it required only 3.9 M parameters for learning across the four datasets, compared to 33.6 M for VGG16. The model achieved an accuracy of 98.87% on MNIST and 85.83% on SVHN, outperforming other lightweight models, including MobileNetV2 and EfficientNetV2-B0, and was comparable to ResNet50. These findings indicate that the proposed architecture has the potential to support multi-domain learning while minimizing model complexity, which may benefit applications in resource-constrained environments.
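
A minimal sketch of the shared-plus-individual extractor idea, assuming small illustrative layer sizes rather than the paper's actual architecture:

```python
# Hedged sketch: one common convolutional trunk shared across domains, a
# small per-domain branch, and per-domain heads. All sizes are assumptions.
import torch
import torch.nn as nn

def conv_block(cin: int, cout: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
                         nn.MaxPool2d(2))

class MultiDomainNet(nn.Module):
    def __init__(self, num_domains: int, num_classes: int):
        super().__init__()
        self.common = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.individual = nn.ModuleList(
            [conv_block(3, 64) for _ in range(num_domains)])
        self.heads = nn.ModuleList(
            [nn.Linear(128, num_classes) for _ in range(num_domains)])
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        shared = self.pool(self.common(x)).flatten(1)           # common features
        specific = self.pool(self.individual[domain](x)).flatten(1)
        return self.heads[domain](torch.cat([shared, specific], dim=1))

logits = MultiDomainNet(num_domains=4, num_classes=10)(
    torch.randn(8, 3, 32, 32), domain=2)   # (8, 10)
```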

28 pages, 4099 KB  
Article
Fatigue Crack Length Estimation Using Acoustic Emissions Technique-Based Convolutional Neural Networks
by Asaad Migot, Ahmed Saaudi, Roshan Joseph and Victor Giurgiutiu
Sensors 2026, 26(2), 650; https://doi.org/10.3390/s26020650 - 18 Jan 2026
Abstract
Fatigue crack propagation is a critical failure mechanism in engineering structures, requiring meticulous monitoring for timely maintenance. This research introduces a deep learning framework for estimating fatigue crack length in metallic plates from acoustic emission (AE) signals. AE waveforms recorded during crack growth are transformed into time-frequency images using the Choi–Williams distribution. First, a clustering system is developed to analyze the distribution of the AE image-based dataset. This system employs a convolutional neural network (CNN) to extract features from the input images. The AE dataset is then divided into three categories according to crack length using the K-means algorithm. Principal Component Analysis (PCA) reduces the feature vectors to two dimensions for visualization, and the resulting projection shows the compactness of the clusters. Second, CNN models are trained on the AE dataset to categorize crack lengths into three separate ranges. We compare a custom CNN against transfer learning with the pre-trained models ResNet50V2 and VGG16. The transfer learning models outperform the custom CNN by a wide margin, with an accuracy of approximately 99% compared to 93%. This research confirms that CNNs, particularly when trained with transfer learning, are highly effective at interpreting AE data for data-driven structural health monitoring.
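
The clustering stage maps onto standard tooling; below is a hedged sketch with scikit-learn, where the random matrix stands in for CNN-extracted feature vectors.

```python
# Hedged sketch of the clustering stage: feature vectors are grouped into
# three crack-length ranges with K-means and projected to 2-D with PCA.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 512))   # stand-in for CNN feature vectors

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(features)

coords = PCA(n_components=2).fit_transform(features)  # 2-D view of clusters
for k in range(3):
    pts = coords[cluster_ids == k]
    print(f"cluster {k}: {len(pts)} samples, centroid {pts.mean(axis=0)}")
```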

21 pages, 11032 KB  
Article
Scale Calibration and Pressure-Driven Knowledge Distillation for Image Classification
by Jing Xie, Penghui Guan, Han Li, Chunhua Tang, Li Wang and Yingcheng Lin
Symmetry 2026, 18(1), 177; https://doi.org/10.3390/sym18010177 - 18 Jan 2026
Abstract
Knowledge distillation achieves model compression by training a lightweight student network to mimic the output distribution of a larger teacher network. However, when the teacher becomes overconfident, its sharply peaked logits break the scale symmetry of supervision and induce high-variance gradients, leading to unstable optimization. Meanwhile, research that focuses only on final-logit alignment often fails to utilize intermediate semantic structure effectively, which weakens the discrimination of student representations, especially under class imbalance. To address these issues, we propose Scale Calibration and Pressure-Driven Knowledge Distillation (SPKD), a one-stage framework comprising two lightweight, complementary mechanisms. First, a dynamic scale calibration module normalizes the teacher's logits to a consistent magnitude, reducing gradient variance. Second, an adaptive pressure-driven mechanism refines student learning by preventing feature collapse and promoting intra-class compactness and inter-class separability. Extensive experiments on CIFAR-100 and ImageNet demonstrate that SPKD outperforms distillation baselines across various teacher–student combinations; for example, SPKD achieves 74.84% accuracy on CIFAR-100 for the homogeneous VGG13–VGG8 teacher–student pair. Additional evidence from logit-norm and gradient-variance statistics, as well as representation analyses, confirms that SPKD stabilizes optimization while learning more discriminative and well-structured features.
(This article belongs to the Section Computer)
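
The scale-calibration idea can be illustrated as a distillation loss that rescales teacher logits to a fixed norm before the usual temperature-softened KL term; this is a generic sketch of the principle, not SPKD's actual module.

```python
# Hedged sketch: logit-norm calibration ahead of standard KD. The fixed
# target norm and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F

def calibrated_kd_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       temperature: float = 4.0,
                       target_norm: float = 10.0) -> torch.Tensor:
    # Rescale each teacher logit vector to a consistent magnitude so
    # overconfident (sharply peaked) samples do not dominate gradients.
    scale = target_norm / teacher_logits.norm(dim=1, keepdim=True).clamp_min(1e-6)
    t = teacher_logits * scale
    p_teacher = F.softmax(t / temperature, dim=1)
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    return (F.kl_div(log_p_student, p_teacher, reduction="batchmean")
            * temperature ** 2)

loss = calibrated_kd_loss(torch.randn(16, 100), torch.randn(16, 100) * 5)
```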

25 pages, 2452 KB  
Article
Predicting GPU Training Energy Consumption in Data Centers Using Task Metadata via Symbolic Regression
by Xiao Liao, Yiqian Li, Shaofeng Zhang, Xianzheng Wei and Jinlong Hu
Energies 2026, 19(2), 448; https://doi.org/10.3390/en19020448 - 16 Jan 2026
Abstract
With the rapid advancement of artificial intelligence (AI) technology, training deep neural networks has become a core computational task that consumes significant energy in data centers. Researchers often employ various methods to estimate the energy usage of data center clusters or servers to enhance energy management and conservation efforts. However, accurately predicting the energy consumption and carbon footprint of a specific AI task throughout its entire lifecycle, before execution, remains challenging. In this paper, we explore the energy consumption characteristics of AI model training tasks and propose a simple yet effective method for predicting neural network training energy consumption. This approach leverages training task metadata and applies genetic programming-based symbolic regression to forecast energy consumption prior to executing training tasks, distinguishing it from time series forecasting of data center energy consumption. We developed an AI training energy measurement environment using an A800 GPU and models from the ResNet-{18, 34, 50, 101}, VGG16, MobileNet, ViT, and BERT families to collect data for experimentation and analysis. The experimental analysis reveals that the energy consumption curve exhibits waveform characteristics resembling square waves, with distinct peaks and valleys. The prediction experiments demonstrate that the proposed method performs well, achieving mean relative errors (MRE) of 2.67% for valley energy, 8.42% for valley duration, 5.16% for peak power, and 3.64% for peak duration. Our findings indicate that, within a specific data center, the energy consumption of AI training tasks follows a predictable pattern. Furthermore, our proposed method enables accurate prediction and calculation of power load before model training begins, without requiring extensive historical energy consumption data. This capability facilitates optimized energy-saving scheduling in data centers in advance, thereby advancing the vision of green AI.
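
Genetic-programming symbolic regression of this kind is available in libraries such as gplearn; the sketch below fits a closed-form expression from synthetic task metadata (the features and target are fabricated for illustration only).

```python
# Hedged sketch: symbolic regression from task metadata to an energy
# quantity, in the spirit of the paper. Data below are synthetic.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
# Hypothetical metadata columns: [batch size, parameter count (M), epochs]
X = rng.uniform([8, 5, 1], [256, 150, 90], size=(200, 3))
y = 0.02 * X[:, 1] * X[:, 2] + 0.001 * X[:, 0]   # synthetic energy (kWh)

model = SymbolicRegressor(population_size=1000, generations=20,
                          function_set=("add", "sub", "mul", "div"),
                          parsimony_coefficient=0.001, random_state=0)
model.fit(X, y)
print(model._program)                             # evolved expression
print("MRE:", np.mean(np.abs(model.predict(X) - y) / y))
```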

28 pages, 13960 KB  
Article
Deep Learning Approaches for Brain Tumor Classification in MRI Scans: An Analysis of Model Interpretability
by Emanuela F. Gomes and Ramiro S. Barbosa
Appl. Sci. 2026, 16(2), 831; https://doi.org/10.3390/app16020831 - 14 Jan 2026
Abstract
This work presents the development and evaluation of Artificial Intelligence (AI) models for the automatic classification of brain tumors in Magnetic Resonance Imaging (MRI) scans. Several deep learning architectures were implemented and compared, including VGG-19, ResNet50, EfficientNetB3, Xception, MobileNetV2, DenseNet201, InceptionV3, Vision Transformer (ViT), and an Ensemble model. The models were developed in Python (version 3.12.4) using the Keras and TensorFlow frameworks and trained on a public Brain Tumor MRI dataset containing 7023 images. Data augmentation and hyperparameter optimization techniques were applied to improve model generalization. The results showed high classification performance, with accuracies ranging from 89.47% to 98.17%. The Vision Transformer achieved the best performance, reaching 98.17% accuracy and outperforming the traditional Convolutional Neural Network (CNN) architectures. Explainable AI (XAI) methods (Grad-CAM, LIME, and Occlusion Sensitivity) were employed to assess model interpretability, showing that the models predominantly focused on tumor regions. The proposed approach demonstrates the effectiveness of AI-based systems in supporting early diagnosis of brain tumors, reducing analysis time and assisting healthcare professionals.
(This article belongs to the Special Issue Advanced Intelligent Technologies in Bioinformatics and Biomedicine)
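
Of the XAI methods mentioned, Grad-CAM is straightforward to sketch in the Keras/TensorFlow stack the paper uses; the untrained model, layer name, and random input below are placeholders for the trained tumor classifier and a real MRI slice.

```python
# Hedged sketch of Grad-CAM for a Keras CNN: gradients of the top class
# score w.r.t. the last conv layer weight a coarse localization heatmap.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.VGG19(weights=None)   # random weights; a
# trained classifier would be loaded here in practice.
last_conv = model.get_layer("block5_conv4")
grad_model = tf.keras.Model(model.inputs, [last_conv.output, model.output])

image = tf.random.uniform((1, 224, 224, 3))         # stand-in for an MRI slice
with tf.GradientTape() as tape:
    conv_out, preds = grad_model(image)
    class_score = preds[:, tf.argmax(preds[0])]     # top predicted class

grads = tape.gradient(class_score, conv_out)
weights = tf.reduce_mean(grads, axis=(1, 2))        # global-average gradients
cam = tf.nn.relu(tf.einsum("bhwc,bc->bhw", conv_out, weights))
cam = cam / (tf.reduce_max(cam) + 1e-8)             # normalized heatmap
print(cam.shape)                                    # (1, 14, 14)
```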

24 pages, 5571 KB  
Article
Bearing Fault Diagnosis Based on a Depthwise Separable Atrous Convolution and ASPP Hybrid Network
by Xiaojiao Gu, Chuanyu Liu, Jinghua Li, Xiaolin Yu and Yang Tian
Machines 2026, 14(1), 93; https://doi.org/10.3390/machines14010093 - 13 Jan 2026
Abstract
To address the computational redundancy, inadequate multi-scale feature capture, and poor noise robustness of traditional deep networks used for bearing vibration and acoustic signal feature extraction, this paper proposes a fault diagnosis method based on Depthwise Separable Atrous Convolution (DSAC) and Atrous Spatial Pyramid Pooling (ASPP). First, the Continuous Wavelet Transform (CWT) is applied to the vibration and acoustic signals to convert them into time–frequency representations. The vibration CWT is then fed into a multi-scale feature extraction module to obtain preliminary vibration features, whereas the acoustic CWT is processed by a Deep Residual Shrinkage Network (DRSN). The two feature streams are concatenated in a feature fusion module and subsequently fed into the DSAC and ASPP modules, which together expand the effective receptive field and aggregate multi-scale contextual information. Finally, global pooling followed by a classifier outputs the bearing fault category, enabling high-precision bearing fault identification. Experimental results show that, under both clean data and multiple low signal-to-noise ratio (SNR) noise conditions, the proposed DSAC-ASPP method achieves higher accuracy and lower variance than baselines such as ResNet, VGG, and MobileNet, while requiring fewer parameters and FLOPs and exhibiting superior robustness and deployability.
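
A minimal sketch of a DSAC block, assuming the common depthwise-then-pointwise formulation with a dilated depthwise convolution; channel counts and dilation rates are illustrative.

```python
# Hedged sketch of a Depthwise Separable Atrous Convolution (DSAC) block.
import torch
import torch.nn as nn

class DSAC(nn.Module):
    def __init__(self, cin: int, cout: int, dilation: int = 2):
        super().__init__()
        # Depthwise: one dilated 3x3 filter per input channel (groups=cin).
        self.depthwise = nn.Conv2d(cin, cin, 3, padding=dilation,
                                   dilation=dilation, groups=cin, bias=False)
        # Pointwise: 1x1 convolution mixes channels cheaply.
        self.pointwise = nn.Conv2d(cin, cout, 1, bias=False)
        self.bn = nn.BatchNorm2d(cout)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

out = DSAC(64, 128, dilation=4)(torch.randn(2, 64, 56, 56))
print(out.shape)  # torch.Size([2, 128, 56, 56]); receptive field enlarged
```

The dilation enlarges the receptive field without extra parameters, while the depthwise/pointwise split keeps the parameter and FLOP counts low, which matches the efficiency claims above.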

15 pages, 665 KB  
Article
Comparative Evaluation of Deep Learning Models for the Classification of Impacted Maxillary Canines on Panoramic Radiographs
by Nazlı Tokatlı, Buket Erdem, Mustafa Özcan, Begüm Turan Maviş, Çağla Şar and Fulya Özdemir
Diagnostics 2026, 16(2), 219; https://doi.org/10.3390/diagnostics16020219 - 9 Jan 2026
Abstract
Background/Objectives: The early and accurate identification of impacted teeth in the maxilla is critical for effective dental treatment planning. Traditional diagnostic methods relying on manual interpretation of radiographic images are often time-consuming and subject to variability. Methods: This study presents a deep learning-based approach for automated classification of impacted maxillary canines using panoramic radiographs. A comparative evaluation of four pre-trained convolutional neural network (CNN) architectures—ResNet50, Xception, InceptionV3, and VGG16—was conducted through transfer learning techniques. In this retrospective single-center study, the dataset comprised 694 annotated panoramic radiographs sourced from the archives of a university dental hospital, with a mildly imbalanced representation of impacted and non-impacted cases. Models were assessed using accuracy, precision, recall, specificity, and F1-score. Results: Among the tested architectures, VGG16 demonstrated superior performance, achieving an accuracy of 99.28% and an F1-score of 99.43%. Additionally, a prototype diagnostic interface was developed to demonstrate the potential for clinical application. Conclusions: The findings underscore the potential of deep learning models, particularly VGG16, in enhancing diagnostic workflows; however, further validation on diverse, multi-center datasets is required to confirm clinical generalizability.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
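
A hedged sketch of the transfer-learning recipe for one of the four backbones (VGG16), assuming frozen convolutional features and a new two-class head; the paper's actual preprocessing and training schedule are not reproduced.

```python
# Hedged sketch: VGG16 transfer learning for a two-class task (impacted vs.
# non-impacted). The random batch stands in for panoramic radiographs.
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                       # keep pretrained filters

# Replace the final classifier layer with a two-class head.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step.
x, y = torch.randn(4, 3, 224, 224), torch.randint(0, 2, (4,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```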

33 pages, 24811 KB  
Article
Demystifying Deep Learning Decisions in Leukemia Diagnostics Using Explainable AI
by Shahd H. Altalhi and Salha M. Alzahrani
Diagnostics 2026, 16(2), 212; https://doi.org/10.3390/diagnostics16020212 - 9 Jan 2026
Abstract
Background/Objectives: Conventional diagnostic workflows (peripheral blood smears and bone marrow assessment supplemented by LDI-PCR, molecular cytogenetics, and array-CGH) are expert-driven and must contend with substantial biological and imaging variability. Methods: We propose an AI pipeline that integrates convolutional neural networks (CNNs) and transfer learning-based models with two explainable AI (XAI) approaches, LIME and Grad-CAM, to deliver both high diagnostic accuracy and a transparent rationale. Seven public sources were curated into a unified benchmark (66,550 images) covering ALL, AML, CLL, CML, and healthy controls; images were standardized, ROI-cropped, and split with stratification (80/10/10). We fine-tuned multiple backbones (DenseNet-121, MobileNetV2, VGG16, InceptionV3, ResNet50, Xception, and a custom CNN) and evaluated accuracy and F1-score, benchmarking against the recent literature. Results: On the five-class task (ALL/AML/CLL/CML/Healthy), MobileNetV2 achieved 97.9% accuracy and F1-score, with DenseNet-121 reaching a 97.66% F1-score. On ALL subtypes (Benign, Early, Pre, Pro) and across tasks, DenseNet-121 and MobileNetV2 were the most reliable, achieving state-of-the-art accuracy with the strongest, nucleus-centric explanations. Conclusions: XAI analyses (LIME, Grad-CAM) consistently localized leukemic nuclei and other cell-intrinsic morphology, aligning saliency with clinical cues and model performance. Compared with baselines, our approach matched or exceeded accuracy while providing stronger, corroborated interpretability on a substantially larger and more diverse dataset.
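
The LIME step can be sketched with the lime package's image explainer; the classifier below is a random placeholder for the paper's fine-tuned backbones.

```python
# Hedged sketch: LIME superpixel explanations for an image classifier.
import numpy as np
from lime import lime_image

def classifier_fn(images: np.ndarray) -> np.ndarray:
    # Placeholder: random 5-class probabilities (ALL/AML/CLL/CML/Healthy).
    logits = np.random.default_rng(0).normal(size=(len(images), 5))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

image = np.random.default_rng(1).uniform(size=(224, 224, 3))  # stand-in cell
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, classifier_fn, top_labels=1, num_samples=200)

label = explanation.top_labels[0]
_, mask = explanation.get_image_and_mask(label, positive_only=True,
                                         num_features=5)
print("superpixels marked as evidence:", np.unique(mask))
```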

18 pages, 9641 KB  
Article
KT-NAS: Knowledge Transfer for Efficient Neural Architecture Search
by Linh-Tam Tran, A. F. M. Shahab Uddin, Younho Jang and Sung-Ho Bae
Appl. Sci. 2026, 16(2), 623; https://doi.org/10.3390/app16020623 - 7 Jan 2026
Abstract
Pre-trained models have played important roles in many tasks, such as domain adaptation and out-of-distribution generalization, by transferring matured knowledge. In this paper, we study Neural Architecture Search (NAS) at the feature-space level and observe that the low-level features of NAS-based networks (networks generated from a NAS search space) become stable in the early stage of training. In addition, these low-level features are similar to those of hand-crafted networks such as VGG, ResNet, and DenseNet. This phenomenon is consistent across different search spaces and datasets. Motivated by these observations, we propose a new architectural method for NAS, called Knowledge-Transfer NAS (KT-NAS), which utilizes the features of a pre-trained hand-crafted network. Specifically, we replace the first few cells of NAS-based networks with pre-trained manually designed blocks and freeze them, then train only the remaining cells. We perform extensive experiments using various NAS algorithms and search spaces, and show that KT-NAS achieves higher or comparable performance while requiring a smaller memory footprint and less search time, offering a new perspective on the applicability of pre-trained models for improving NAS algorithms.
(This article belongs to the Section Computing and Artificial Intelligence)
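
A minimal sketch of the replace-and-freeze idea, using a pretrained ResNet-18 stem in place of the first cells; the "searched" tail is a plain convolutional stand-in, not an actual NAS cell.

```python
# Hedged sketch: frozen pretrained low-level blocks feeding a trainable
# tail, mirroring KT-NAS's replace-and-freeze strategy in spirit.
import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
# Pretrained low-level feature extractor: stem + first residual stage.
stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu,
                     resnet.maxpool, resnet.layer1)
for p in stem.parameters():
    p.requires_grad = False      # freeze the transferred low-level features

searched_tail = nn.Sequential(   # placeholder for the remaining NAS cells
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 10))

model = nn.Sequential(stem, searched_tail)
print(model(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 10])
```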

23 pages, 30920 KB  
Article
A Surface Defect Detection System for Industrial Conveyor Belt Inspection Using Apple’s TrueDepth Camera Technology
by Mohammad Siami, Przemysław Dąbek, Hamid Shiri, Tomasz Barszcz and Radosław Zimroz
Appl. Sci. 2026, 16(2), 609; https://doi.org/10.3390/app16020609 - 7 Jan 2026
Abstract
Maintaining the structural integrity of conveyor belts is essential for safe and reliable mining operations. However, these belts are susceptible to longitudinal tearing and surface degradation from material impact, fatigue, and deformation. Many computer vision-based inspection methods are inefficient and unreliable in harsh mining environments characterized by dust and variable lighting. This study introduces a smartphone-driven defect detection system for the cost-effective, geometric inspection of conveyor belt surfaces. Using Apple's iPhone 12 Pro Max (Apple Inc., Cupertino, CA, USA), the system captures 3D point cloud data from a moving belt with induced damage via the integrated TrueDepth camera. A key innovation is a 3D-to-2D projection pipeline that converts point cloud data into structured representations compatible with standard 2D Convolutional Neural Networks (CNNs). We then propose a hybrid deep learning and machine learning model, in which features extracted by pre-trained CNNs (VGG16, ResNet50, InceptionV3, Xception) are classified by ensemble methods (Random Forest, XGBoost, LightGBM). The proposed system achieves high detection accuracy, with an F1 score exceeding 0.97 for all proposed model implementations and with the TrueDepth F1 score more than 0.05 higher than that of the RGB approach. The cost-effective smartphone-based sensing platform proved capable of supporting near-real-time maintenance decisions. Laboratory results demonstrate the method's reliability, with measurement errors for defect dimensions within 3 mm. This approach shows significant potential to improve conveyor belt management, reduce maintenance costs, and enhance operational safety.
(This article belongs to the Special Issue Mining Engineering: Present and Future Prospectives)
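
One simple way to realize a 3D-to-2D projection of this kind is to bin point-cloud depths onto a regular grid; the sketch below is an assumed, generic pipeline, not the paper's calibrated projection.

```python
# Hedged sketch: point-cloud depths binned onto an x-y grid to form an
# image a 2D CNN can consume. Resolution and the nearest-point rule are
# assumptions.
import numpy as np

def project_to_depth_image(points: np.ndarray, res: int = 128) -> np.ndarray:
    """points: (N, 3) array of x, y, z, e.g. from a TrueDepth capture."""
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    ij = ((xy - lo) / (hi - lo + 1e-9) * (res - 1)).astype(int)
    depth = np.full((res, res), np.nan)
    for (i, j), z in zip(ij, points[:, 2]):
        # Keep the closest point per cell (smallest z).
        if np.isnan(depth[j, i]) or z < depth[j, i]:
            depth[j, i] = z
    # Fill empty cells with the farthest observed depth.
    return np.nan_to_num(depth, nan=float(np.nanmax(depth)))

cloud = np.random.default_rng(0).uniform(size=(5000, 3))  # synthetic points
image = project_to_depth_image(cloud)
print(image.shape, image.min(), image.max())
```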

24 pages, 4670 KB  
Article
X-HEM: An Explainable and Trustworthy AI-Based Framework for Intelligent Healthcare Diagnostics
by Mohammad F. Al-Hammouri, Bandi Vamsi, Islam T. Almalkawi and Ali Al Bataineh
Computers 2026, 15(1), 33; https://doi.org/10.3390/computers15010033 - 7 Jan 2026
Abstract
Intracranial Hemorrhage (ICH) remains a critical, life-threatening condition in which timely and accurate diagnosis from non-contrast Computed Tomography (CT) scans is vital to reduce mortality and long-term disability. Deep learning methods have shown strong potential for automated hemorrhage detection, yet most existing approaches lack confidence quantification and clinical interpretability, which limits their adoption in high-stakes care. This study presents X-HEM, an explainable hemorrhage ensemble model for reliable ICH detection on non-contrast head CT scans. The aim is to improve diagnostic accuracy, interpretability, and confidence for real-time clinical decision support. X-HEM integrates three convolutional backbones (VGG16, ResNet50, DenseNet121) through soft voting. Bayesian uncertainty is estimated using Monte Carlo Dropout, while Grad-CAM++ and SHAP provide spatial and global interpretability. Training and validation were conducted on the RSNA ICH dataset, with external testing on CQ500. The model achieved AUCs of 0.96 (RSNA) and 0.94 (CQ500), demonstrated well-calibrated confidence (low Brier score/ECE), and provided explanations that aligned with radiologist-marked regions. The integration of ensemble learning, Bayesian uncertainty, and dual explainability enables X-HEM to deliver confidence-aware, interpretable ICH predictions suitable for clinical use.
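
Monte Carlo Dropout over a soft-voting ensemble can be sketched generically: dropout stays active at inference and softmax outputs are averaged over repeated passes, with the spread serving as an uncertainty signal. The tiny heads below are placeholders for the three backbones.

```python
# Hedged sketch: MC Dropout uncertainty on top of soft voting. Models and
# the number of passes T are illustrative assumptions.
import torch
import torch.nn as nn

def make_head() -> nn.Sequential:    # placeholder backbone + dropout head
    return nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(),
                         nn.Dropout(0.5), nn.Linear(32, 2))

ensemble = [make_head() for _ in range(3)]
x = torch.randn(8, 64)

T = 30
probs = []
for _ in range(T):
    for m in ensemble:
        m.train()                    # keep dropout active (MC Dropout)
    with torch.no_grad():
        # Soft voting: average the softmax outputs of the three members.
        p = torch.stack([torch.softmax(m(x), dim=1) for m in ensemble]).mean(0)
    probs.append(p)

probs = torch.stack(probs)           # (T, batch, classes)
mean_pred = probs.mean(0)            # soft-vote probability estimate
uncertainty = probs.std(0).mean(1)   # per-sample predictive spread
print(mean_pred.argmax(1), uncertainty)
```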
