Search Results (34)

Search Parameters:
Keywords = VanillaNet

22 pages, 4079 KB  
Article
Breast Cancer Classification with Various Optimized Deep Learning Methods
by Mustafa Güler, Gamze Sart, Ömer Algorabi, Ayse Nur Adıguzel Tuylu and Yusuf Sait Türkan
Diagnostics 2025, 15(14), 1751; https://doi.org/10.3390/diagnostics15141751 - 10 Jul 2025
Viewed by 642
Abstract
Background/Objectives: In recent years, there has been a significant increase in the number of women with breast cancer. Breast cancer prediction is defined as a medical data analysis and image processing problem. Experts may need artificial intelligence technologies to distinguish between benign and malignant tumors in order to make decisions. When the studies in the literature are examined, it can be seen that applications of deep learning algorithms in the field of medicine have achieved very successful results. Methods: In this study, 11 different deep learning algorithms (Vanilla, ResNet50, ResNet152, VGG16, DenseNet152, MobileNetv2, EfficientB1, NasNet, DenseNet201, ensemble, and Tuned Model) were used. Images of pathological specimens from breast biopsies consisting of two classes, benign and malignant, were used for classification analysis. To limit the computational time and speed up the analysis process, 10,000 images, 6172 IDC-negative and 3828 IDC-positive, were selected. Of the images, 80% were used for training, 10% were used for validation, and 10% were used for testing the trained model. Results: The results demonstrate that DenseNet201 achieved the highest classification accuracy of 89.4%, with a precision of 88.2%, a recall of 84.1%, an F1 score of 86.1%, and an AUC score of 95.8%. Conclusions: In conclusion, this study highlights the potential of deep learning algorithms in breast cancer classification. Future research should focus on integrating multi-modal imaging data, refining ensemble learning methodologies, and expanding dataset diversity to further improve the classification accuracy and real-world clinical applicability. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
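The accuracy, precision, recall, and F1 figures reported above are standard confusion-matrix metrics. As a hedged illustration (the counts below are invented, not the paper's data), they derive from true/false positive and negative counts as follows:

```python
# Illustrative only: how accuracy, precision, recall, and F1 derive from
# confusion-matrix counts. The counts are invented, not the paper's data.
def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=84, fp=11, fn=16, tn=89)
```

With these invented counts, accuracy works out to 0.865 and recall to 0.84, numbers of the same shape as those the abstract reports.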

19 pages, 5919 KB  
Article
Evaluation of the Effectiveness of the UNet Model with Different Backbones in the Semantic Segmentation of Tomato Leaves and Fruits
by Juan Pablo Guerra Ibarra, Francisco Javier Cuevas de la Rosa and Julieta Raquel Hernandez Vidales
Horticulturae 2025, 11(5), 514; https://doi.org/10.3390/horticulturae11050514 - 9 May 2025
Viewed by 683
Abstract
Timely identification of crop conditions is relevant for informed decision-making in precision agriculture. The initial step in determining the conditions that crops require involves isolating the components that constitute them, including the leaves and fruits of the plants. One method for performing this separation is intelligent digital image processing, wherein plant elements are labeled for subsequent analysis. Deep Learning algorithms offer an alternative approach for segmentation tasks on images obtained from complex environments with intricate patterns that pose challenges for separation. One such application is semantic segmentation, which involves assigning a label to each pixel in the processed image; this task is accomplished by training various Convolutional Neural Network models. This paper presents a comparative analysis of semantic segmentation performance using a convolutional neural network model with different backbone architectures. The task focuses on pixel-wise classification into three categories: leaves, fruits, and background, based on images of semi-hydroponic tomato crops captured in greenhouse settings. The main contribution lies in identifying the most efficient backbone-UNet combination for segmenting tomato plant leaves and fruits under uncontrolled lighting and background conditions during image acquisition. The Convolutional Neural Network model UNet is implemented with different backbones (MobileNet, VanillaNet, MVanillaNet, ResNet, and VGGNet, trained on the ImageNet dataset) to exploit transfer learning when segmenting the leaves and fruits of tomato plants. The highest performance across five metrics was achieved by MVanillaNet-UNet for fruit segmentation (0.88089) and VGGNet-UNet for leaf segmentation (0.89078). A comparison of the best semantic segmentation results with those obtained using a color-dominant segmentation method optimized with a greedy algorithm is also presented. Full article
(This article belongs to the Section Vegetable Production Systems)
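Pixel-wise three-class segmentation of this kind is typically scored per class. A minimal sketch of per-class intersection-over-union, using tiny made-up masks flattened to 1-D (0 = background, 1 = leaf, 2 = fruit):

```python
# Illustrative only: per-class IoU for a 3-class segmentation task.
# The tiny flattened masks below are made up, not the paper's data.
def class_iou(pred, target, cls):
    inter = sum(p == cls and t == cls for p, t in zip(pred, target))
    union = sum(p == cls or t == cls for p, t in zip(pred, target))
    return inter / union if union else 1.0

pred   = [0, 1, 1, 2, 2, 0, 1, 0]
target = [0, 1, 2, 2, 2, 0, 1, 1]
ious = [class_iou(pred, target, c) for c in (0, 1, 2)]
```

Averaging the per-class values (mean IoU) is one of the usual ways such five-metric comparisons are built.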

19 pages, 7498 KB  
Article
An Efficient Explainability of Deep Models on Medical Images
by Salim Khiat, Sidi Ahmed Mahmoudi, Sédrick Stassin, Lillia Boukerroui, Besma Senaï and Saïd Mahmoudi
Algorithms 2025, 18(4), 210; https://doi.org/10.3390/a18040210 - 9 Apr 2025
Viewed by 660
Abstract
Nowadays, Artificial Intelligence (AI) has revolutionized many fields, and the medical field is no exception. Thanks to technological advancements and the emergence of Deep Learning (DL) techniques, AI has brought new possibilities and significant improvements to medical practice. Despite the excellent results of DL models in terms of accuracy and performance, they remain black boxes, as they do not provide meaningful insights into their internal functioning. This is where the field of Explainable AI (XAI) comes in, aiming to provide insights into the underlying workings of these black-box models. In this paper, the visual explainability of deep models on chest radiography images is addressed. This research uses two datasets: the first on COVID-19, viral pneumonia, and normality (healthy patients), and the second on pulmonary opacities. Initially, the pretrained CNN models (VGG16, VGG19, ResNet50, MobileNetV2, Mixnet, and EfficientNetB7) are used to classify chest radiography images. Then, the visual explainability methods (GradCAM, LIME, Vanilla Gradient, Integrated Gradients, and SmoothGrad) are applied to understand and explain the decisions made by these models. The obtained results show that MobileNetV2 and VGG16 are the best models for the first and second datasets, respectively. As for the explainability methods, the results were submitted to doctors and validated by calculating the mean opinion score. The doctors deemed GradCAM, LIME, and Vanilla Gradient the most effective methods, providing understandable and accurate explanations. Full article
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))
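Vanilla Gradient, one of the methods compared above, scores each input feature by the magnitude of the gradient of the class score with respect to that feature. A hedged sketch on a toy differentiable scorer (not the paper's CNNs), cross-checked against finite differences:

```python
# Toy scorer f(x) = sum(w_i * x_i^2); its Vanilla Gradient saliency is
# |df/dx_i| = |2 * w_i * x_i|. Model and values are illustrative only.
def score(x, w):
    return sum(wi * xi ** 2 for wi, xi in zip(w, x))

def saliency(x, w):
    return [abs(2 * wi * xi) for wi, xi in zip(w, x)]

def numeric_saliency(x, w, eps=1e-6):
    # central finite differences, used here only to sanity-check saliency()
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        out.append(abs((score(xp, w) - score(xm, w)) / (2 * eps)))
    return out

x, w = [1.0, 2.0, 0.5], [0.3, -0.2, 0.8]
analytic, numeric = saliency(x, w), numeric_saliency(x, w)
```

In a real CNN the same gradient is obtained by one backward pass from the class logit to the input pixels.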

20 pages, 2239 KB  
Article
A Novel Lightweight Deep Learning Approach for Drivers’ Facial Expression Detection
by Jia Uddin
Designs 2025, 9(2), 45; https://doi.org/10.3390/designs9020045 - 3 Apr 2025
Cited by 1 | Viewed by 1030
Abstract
Drivers’ facial expression recognition systems play a pivotal role in Advanced Driver Assistance Systems (ADASs) by monitoring emotional states and detecting fatigue or distractions in real time. However, deploying such systems in resource-constrained environments like vehicles requires lightweight architectures to ensure real-time performance, efficient model updates, and compatibility with embedded hardware. Smaller models significantly reduce communication overhead in distributed training. For autonomous vehicles, lightweight architectures also minimize the data transfer required for over-the-air updates. Moreover, they are crucial for their deployability on hardware with limited on-chip memory. In this work, we propose a novel Dual Attention Lightweight Deep Learning (DALDL) approach for drivers’ facial expression recognition. The proposed approach combines the SqueezeNext architecture with a Dual Attention Convolution (DAC) block. Our DAC block integrates Hybrid Channel Attention (HCA) and Coordinate Space Attention (CSA) to enhance feature extraction efficiency while maintaining minimal parameter overhead. To evaluate the effectiveness of our architecture, we compare it against two baselines: (a) Vanilla SqueezeNet and (b) AlexNet. Compared with SqueezeNet, DALDL improves accuracy by 7.96% and F1-score by 7.95% on the KMU-FED dataset. On the CK+ dataset, it achieves 8.51% higher accuracy and 8.40% higher F1-score. Against AlexNet, DALDL improves accuracy by 4.34% and F1-score by 4.17% on KMU-FED. Lastly, on CK+, it provides a 5.36% boost in accuracy and a 7.24% increase in F1-score. These results demonstrate that DALDL is a promising solution for efficient and accurate emotion recognition in real-world automotive applications. Full article
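The abstract does not spell out the internals of the HCA and CSA blocks, so the following is only a generic channel-attention sketch (squeeze, gate, rescale) of the family such blocks build on, with a made-up two-channel feature map:

```python
import math

# Generic channel attention, illustrative only: the paper's HCA/CSA
# blocks are not specified here. Feature maps are a made-up example,
# each channel flattened to a list of activations.
def channel_attention(feature_maps):
    # squeeze: one descriptor per channel via global average pooling
    desc = [sum(fm) / len(fm) for fm in feature_maps]
    # gate: a softmax over descriptors stands in for the learned gating MLP
    exps = [math.exp(d) for d in desc]
    gates = [e / sum(exps) for e in exps]
    # rescale: weight each channel's activations by its gate
    return gates, [[g * v for v in fm] for g, fm in zip(gates, feature_maps)]

gates, rescaled = channel_attention([[1.0, 3.0], [0.5, 0.5]])
```

The point of such gating is that channels with more informative responses are amplified at negligible parameter cost, which is why attention blocks suit lightweight models.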

12 pages, 1100 KB  
Article
Lightweight U-Net for Blood Vessels Segmentation in X-Ray Coronary Angiography
by Jesus Salvador Ramos-Cortez, Dora E. Alvarado-Carrillo, Emmanuel Ovalle-Magallanes and Juan Gabriel Avina-Cervantes
J. Imaging 2025, 11(4), 106; https://doi.org/10.3390/jimaging11040106 - 30 Mar 2025
Viewed by 836
Abstract
Blood vessel segmentation in X-ray coronary angiography (XCA) plays a crucial role in diagnosing cardiovascular diseases, enabling a precise assessment of arterial structures. However, segmentation is challenging due to a low signal-to-noise ratio, interfering background structures, and vessel bifurcations, which hinder the accuracy of deep learning models. Additionally, deep learning models for this task often require high computational resources, limiting their practical application in real-time clinical settings. This study proposes a lightweight variant of the U-Net architecture using a structured kernel pruning strategy inspired by the Lottery Ticket Hypothesis. The pruning method systematically removes entire convolutional filters from each layer based on a global reduction factor, generating compact subnetworks that retain key representational capacity. This results in a significantly smaller model without compromising the segmentation performance. This approach is evaluated on two benchmark datasets, demonstrating consistent improvements in segmentation accuracy compared to the vanilla U-Net. Additionally, model complexity is significantly reduced from 31 M to 1.9 M parameters, improving efficiency while maintaining high segmentation quality. Full article
(This article belongs to the Section Medical Imaging)
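The drop from 31 M to 1.9 M parameters is plausible because a conv layer holds out_ch × in_ch × k × k weights, so shrinking both channel counts by a reduction factor r cuts its parameters roughly by r². A hedged sketch with illustrative channel widths (not the paper's exact U-Net):

```python
# Illustrative only: why pruning whole filters shrinks a model so fast.
# Channel widths below are a made-up encoder stack, not the paper's U-Net.
def conv_params(in_ch, out_ch, k=3):
    # weights in one k x k convolution layer (biases ignored)
    return out_ch * in_ch * k * k

def total_params(widths, r=1.0):
    # apply a global reduction factor r to every channel width
    chans = [max(1, int(c * r)) for c in widths]
    return sum(conv_params(a, b) for a, b in zip(chans, chans[1:]))

full = total_params([3, 64, 128, 256, 512])
pruned = total_params([3, 64, 128, 256, 512], r=0.25)
```

With r = 0.25 the toy stack shrinks from 1,550,016 to 96,912 weights, a roughly 16× cut of the same order as the paper's 31 M → 1.9 M.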

26 pages, 4394 KB  
Article
Neural Network Models for Prostate Zones Segmentation in Magnetic Resonance Imaging
by Saman Fouladi, Luca Di Palma, Fatemeh Darvizeh, Deborah Fazzini, Alessandro Maiocchi, Sergio Papa, Gabriele Gianini and Marco Alì
Information 2025, 16(3), 186; https://doi.org/10.3390/info16030186 - 28 Feb 2025
Viewed by 1211
Abstract
Prostate cancer (PCa) is one of the most common tumors diagnosed in men worldwide, with approximately 1.7 million new cases expected by 2030. Most cancerous lesions in PCa are located in the peripheral zone (PZ); therefore, accurate identification of the location of the lesion is essential for effective diagnosis and treatment. Zonal segmentation in magnetic resonance imaging (MRI) scans is critical and plays a key role in pinpointing cancerous regions and treatment strategies. In this work, we report on the development of three advanced neural network-based models: one based on ensemble learning, one on Meta-Net, and one on YOLO-V8. They were tailored for the segmentation of the central gland (CG) and PZ using a small dataset of 90 MRI scans for training, 25 MRIs for validation, and 24 scans for testing. The ensemble learning method, combining U-Net-based models (Attention-Res-U-Net, Vanilla-Net, and V-Net), achieved an IoU of 79.3% and DSC of 88.4% for CG and an IoU of 54.5% and DSC of 70.5% for PZ on the test set. Meta-Net, used for the first time in segmentation, demonstrated an IoU of 78% and DSC of 88% for CG, while YOLO-V8 outperformed both models with an IoU of 80% and DSC of 89% for CG and an IoU of 58% and DSC of 73% for PZ. Full article
(This article belongs to the Special Issue Detection and Modelling of Biosignals)
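The IoU and DSC pairs reported above are linked by the identity DSC = 2·IoU/(1 + IoU) when computed on the same masks. A minimal sketch with made-up binary masks:

```python
# Illustrative only: IoU and Dice (DSC) on the same flattened binary
# masks, and the identity DSC = 2*IoU / (1 + IoU). Masks are made up.
def iou(pred, target):
    inter = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    return inter / union

def dsc(pred, target):
    inter = sum(p and t for p, t in zip(pred, target))
    return 2 * inter / (sum(pred) + sum(target))

pred   = [1, 1, 0, 1, 0]
target = [1, 0, 0, 1, 1]
i, d = iou(pred, target), dsc(pred, target)
```

As a cross-check on the reported numbers: 2 × 0.793 / 1.793 ≈ 0.884 and 2 × 0.545 / 1.545 ≈ 0.705, matching the CG and PZ pairs above.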

20 pages, 742 KB  
Article
FedSeq: Personalized Federated Learning via Sequential Layer Expansion in Representation Learning
by Jae Won Jang and Bong Jun Choi
Appl. Sci. 2024, 14(24), 12024; https://doi.org/10.3390/app142412024 - 23 Dec 2024
Viewed by 2089
Abstract
Federated learning ensures the privacy of clients by conducting distributed training on individual client devices and sharing only the model weights with a central server. However, in real-world scenarios, especially IoT scenarios where devices have varying capabilities and data heterogeneity exists among clients, appropriate personalization methods are necessary. This work addresses that heterogeneity using a form of parameter decoupling known as representation learning, which divides deep learning models into ‘base’ and ‘head’ components. The base component, capturing common features across all clients, is shared with the server, while the head component, capturing features unique to individual clients, remains local. We propose a new representation learning-based approach, named FedSeq, which integrates sequential layer expansion and dynamic scheduling: it decouples the entire deep learning model into more densely divided parts and applies suitable scheduling methods, benefiting not only data heterogeneity but also class heterogeneity. FedSeq has two layer-scheduling approaches, forward (Vanilla) and backward (Anti), for handling data and class heterogeneity among clients. Experimental results show that FedSeq, compared to existing personalized federated learning algorithms, achieves higher accuracy, especially under challenging conditions, while reducing computation costs: it improves classification accuracy by 7.31% on the CIFAR-100 dataset and 4.1% on the Tiny-ImageNet dataset, while reducing computation costs by up to 15%. Furthermore, Anti scheduling achieves a computational efficiency improvement of 3.91% over FedAvg and 3.06% over FedBABU, while Vanilla scheduling achieves efficiency improvements of 63.93% over FedAvg and 63.61% over FedBABU. Full article
(This article belongs to the Special Issue The Internet of Things (IoT) and Its Application in Monitoring)
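The base/head decoupling described above can be pictured as partitioning a client's parameter dictionary: only the base part is sent to the server, the head stays local. A minimal sketch with hypothetical layer names (not FedSeq's actual architecture):

```python
# Illustrative only: splitting a client model's parameters into a shared
# 'base' and a local 'head'. Layer names and values are hypothetical.
def split_base_head(params, head_layers):
    base = {k: v for k, v in params.items() if k not in head_layers}
    head = {k: v for k, v in params.items() if k in head_layers}
    return base, head

client_params = {"conv1": [0.1], "conv2": [0.2], "fc": [0.3]}
base, head = split_base_head(client_params, head_layers={"fc"})
# in a federated round, only `base` would be uploaded for aggregation
```

FedSeq's contribution, per the abstract, is to slice the model more finely than this single base/head cut and to schedule which slices are shared when.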

16 pages, 9523 KB  
Article
Method for Recognizing Disordered Sugarcane Stacking Based on Improved YOLOv8n
by Jiaodi Liu, Bang Zhang, Hongzhen Xu, Lichang Zhang and Xiaolong Zhang
Appl. Sci. 2024, 14(24), 11765; https://doi.org/10.3390/app142411765 - 17 Dec 2024
Cited by 2 | Viewed by 843
Abstract
In order to enhance the efficiency and precision of grab-type planting operations for disordered stacked sugarcane, and to achieve rapid deployment of the visual detection model on automatic sugarcane seed-cane planters, this study proposes a sugarcane detection algorithm based on an improved YOLOv8n model. Firstly, the backbone network of YOLOv8n is replaced with VanillaNet to optimize feature-extraction capability and computational efficiency, and the InceptionNeXt deep convolutional structure is integrated, using its multi-scale processing to enhance the model’s ability to recognize sugarcane of different shapes and sizes. Secondly, the ECA attention mechanism is incorporated into the feature fusion module C2F to further enhance the model’s capability to capture key features of sugarcane. The MPDIoU loss function is employed to improve the resolution of overlapping sugarcane, reducing misidentification and missed detections. Experimental results show that the improved YOLOv8n model achieves 96% and 71.5% in mAP@0.5 and mAP@0.5:0.95, respectively, increases of 5.1 and 6.4 percentage points over the original YOLOv8n model; moreover, compared to the currently popular Faster-RCNN, SSD, and other YOLO-series object detection models, it not only improves detection accuracy but also significantly reduces the number of model parameters. The research results provide technical support for subsequent sugarcane grab-type planting recognition and mobile deployment. Full article

31 pages, 1620 KB  
Article
DB-Net and DVR-Net: Optimized New Deep Learning Models for Efficient Cardiovascular Disease Prediction
by Aymin Javed, Nadeem Javaid, Nabil Alrajeh and Muhammad Aslam
Appl. Sci. 2024, 14(22), 10516; https://doi.org/10.3390/app142210516 - 15 Nov 2024
Cited by 2 | Viewed by 1412
Abstract
Cardiovascular Disease (CVD) is one of the main causes of death in recent years. To overcome the challenges faced in diagnosing CVD at an early stage, deep learning has been used. With advancements in technology, clinical practice in the health care industry is likely to transform significantly. To predict CVD, we constructed two models: Dense Belief Network (DB-Net) and Deep Vanilla Recurrent Network (DVR-Net). The Proximity Weighted Random Affine Shadow sampling balancing technique is used to balance the highly imbalanced Heart Disease Health Indicator dataset. SHapley Additive exPlanations is used to visualize each feature’s contribution to the output of DB-Net and DVR-Net in CVD prediction. Furthermore, 10-fold cross-validation is performed to evaluate the proposed models’ performance. Cross-dataset evaluation is also conducted to assess how well the proposed models generalize to unseen data. Various evaluation measures are used for the assessment of the models. The proposed DB-Net outperforms all the base models by achieving an accuracy of 91%, F1-score of 91%, precision of 93%, recall of 89%, and execution time of 1883 s over 30 epochs with batch size 32. The DVR-Net beats the state-of-the-art models with an accuracy of 90%, F1-score of 90%, precision of 90%, recall of 90%, and execution time of 2853 s over 30 epochs with batch size 32. Full article

16 pages, 6921 KB  
Article
V-YOLO: A Lightweight and Efficient Detection Model for Guava in Complex Orchard Environments
by Zhen Liu, Juntao Xiong, Mingrui Cai, Xiaoxin Li and Xinjie Tan
Agronomy 2024, 14(9), 1988; https://doi.org/10.3390/agronomy14091988 - 2 Sep 2024
Cited by 13 | Viewed by 2670
Abstract
The global agriculture industry is encountering challenges due to labor shortages and the demand for increased efficiency. Currently, fruit yield estimation in guava orchards primarily depends on manual counting. Machine vision is an essential technology for enabling automatic yield estimation in guava production. To address the detection of guava in complex natural environments, this paper proposes an improved lightweight and efficient detection model, V-YOLO (VanillaNet-YOLO). By utilizing the more lightweight and efficient VanillaNet as the backbone network and modifying the head part of the model, we enhance detection accuracy, reduce the number of model parameters, and improve detection speed. Experimental results demonstrate that V-YOLO and YOLOv10n achieve the same mean average precision (mAP) of 95.0%, but V-YOLO uses only 43.2% of the parameters required by YOLOv10n, performs calculations at 41.4% of the computational cost, and exhibits a detection speed that is 2.67 times that of YOLOv10n. These findings indicate that V-YOLO can be employed for rapid detection and counting of guava, providing an effective method for visually estimating fruit yield in guava orchards. Full article

15 pages, 4366 KB  
Article
Field-Based Soybean Flower and Pod Detection Using an Improved YOLOv8-VEW Method
by Kunpeng Zhao, Jinyang Li, Wenqiang Shi, Liqiang Qi, Chuntao Yu and Wei Zhang
Agriculture 2024, 14(8), 1423; https://doi.org/10.3390/agriculture14081423 - 22 Aug 2024
Cited by 5 | Viewed by 1525
Abstract
Changes in soybean flower and pod numbers are important factors affecting soybean yields. Obtaining the numbers of flowers and pods, as well as fallen flowers and pods, quickly and accurately is crucial for soybean variety breeding and high-quality, high-yield production, and it is especially challenging in the natural field environment. Therefore, this study proposed a field soybean flower- and pod-detection method based on an improved network model (YOLOv8-VEW). VanillaNet is used as the backbone feature-extraction network for YOLOv8, the EMA attention mechanism module is added to C2f, and the CIoU function is replaced with the WIoU position loss function. The results showed that the F1, mAP, and FPS (frames per second) of the YOLOv8-VEW model were 0.95, 96.9%, and 90 FPS, respectively, which were 0.05, 2.4%, and 24 FPS better than those of the YOLOv8 model. The model was used to compare soybean flower and pod counts with manual counts, and its R2 values for flowers and pods were 0.98311 and 0.98926, respectively, achieving rapid detection of soybean flowers and pods in the field. This study can provide reliable technical support for detecting soybean flower and pod numbers in the field and selecting high-yielding varieties. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

22 pages, 4511 KB  
Article
Automatic Foreign Matter Segmentation System for Superabsorbent Polymer Powder: Application of Diffusion Adversarial Representation Learning
by Ssu-Han Chen, Meng-Jey Youh, Yan-Ru Chen, Jer-Huan Jang, Hung-Yi Chen, Hoang-Giang Cao, Yang-Shen Hsueh, Chuan-Fu Liu and Kevin Fong-Rey Liu
Mathematics 2024, 12(16), 2473; https://doi.org/10.3390/math12162473 - 10 Aug 2024
Viewed by 1136
Abstract
In current industries, sampling inspections of the quality of powders, such as superabsorbent polymers (SAPs), are still conducted via visual inspection. The sizes of samples and foreign matter are around 500 μm, making them difficult for humans to identify. An automatic foreign matter detection system for powder has been developed in the present study. The powder samples can be automatically delivered, distributed, and recycled, and images of them are captured through the hardware of the system, while the identification software of the system was developed based on diffusion adversarial representation learning (DARL). The background image is a foreign-matter-free powder image with an input image size of 1024 × 1024 × 3. Since DARL includes adversarial segmentation, a diffusion process, and synthetic image generation, the DARL model was trained using a diffusion block with a U-Net attention mechanism and a spatially-adaptive denormalization (SPADE) layer, adopting the loss function of a vanilla generative adversarial network (GAN). This model was then compared with supervised models such as a fully convolutional network (FCN), U-Net, and DeepLabV3+, as well as with unsupervised Otsu threshold segmentation. It should be noted that only 10% of the training samples were utilized for the DARL to learn, yet its intersection over union (IoU) can reach up to 80.15%, much higher than the 59.00%, 53.47%, 49.39%, and 30.08% of the Otsu threshold segmentation, FCN, U-Net, and DeepLabV3+ models. Therefore, the performance of the model developed in the present study is not degraded by an insufficient number of samples containing foreign matter. In practical applications, there is no need to collect, label, and design features for a large number of foreign matter samples before using the developed system. Full article
(This article belongs to the Section E2: Control Theory and Mechanics)

17 pages, 3521 KB  
Article
Underwater Dam Crack Image Classification Algorithm Based on Improved VanillaNet
by Sisi Zhu, Xinyu Li, Gang Wan, Hanren Wang, Shen Shao and Pengfei Shi
Symmetry 2024, 16(7), 845; https://doi.org/10.3390/sym16070845 - 4 Jul 2024
Cited by 4 | Viewed by 1714
Abstract
In the task of classifying images of cracks in underwater dams, symmetry serves as a crucial geometric feature that aids in distinguishing cracks from other structural elements. Nevertheless, the asymmetry in the distribution of positive and negative samples within the underwater dam crack image dataset results in a long-tail problem. This asymmetry, coupled with the subtle nature of crack features, leads to inadequate feature extraction by existing convolutional neural networks, thereby reducing classification accuracy. To address these issues, this paper improves VanillaNet. First, the Seesaw Loss function is introduced to tackle the long-tail problem in classifying underwater dam crack images, enhancing the model’s ability to recognize tail categories. Second, the Adaptive Frequency Filtering Token Mixer (AFF Token Mixer) is implemented to improve the model’s capability to capture crack image features and enhance classification accuracy. Finally, label smoothing is applied to prevent overfitting to the training data and improve the model’s generalization performance. The experimental results demonstrate that the proposed improvements significantly enhance the model’s classification accuracy for underwater dam crack images. The optimized algorithm achieves superior average accuracy, with improvements of 1.29% and 0.64% over the relatively more accurate models ConvNeXtV2 and RepVGG, respectively, and an increase of 2.66% over VanillaNet. The improved model also achieves higher accuracy than other mainstream networks. Full article
(This article belongs to the Section Computer)

10 pages, 386 KB  
Article
DE-MKD: Decoupled Multi-Teacher Knowledge Distillation Based on Entropy
by Xin Cheng, Zhiqiang Zhang, Wei Weng, Wenxin Yu and Jinjia Zhou
Mathematics 2024, 12(11), 1672; https://doi.org/10.3390/math12111672 - 27 May 2024
Cited by 2 | Viewed by 2601
Abstract
The complexity of deep neural network models (DNNs) severely limits their application on devices with limited computing and storage resources. Knowledge distillation (KD) is an attractive model compression technology that can effectively alleviate this problem. Multi-teacher knowledge distillation (MKD) aims to leverage the valuable and diverse knowledge distilled by multiple teacher networks to improve the performance of the student network. Existing approaches typically rely on simple methods such as averaging the prediction logits or using sub-optimal weighting strategies to fuse distilled knowledge from multiple teachers. However, employing these techniques cannot fully reflect the importance of teachers and may even mislead student’s learning. To address this issue, we propose a novel Decoupled Multi-Teacher Knowledge Distillation based on Entropy (DE-MKD). DE-MKD decouples the vanilla knowledge distillation loss and assigns adaptive weights to each teacher to reflect its importance based on the entropy of their predictions. Furthermore, we extend the proposed approach to distill the intermediate features from multiple powerful but cumbersome teachers to improve the performance of the lightweight student network. Extensive experiments on the publicly available CIFAR-100 image classification benchmark dataset with various teacher-student network pairs demonstrated the effectiveness and flexibility of our approach. For instance, the VGG8|ShuffleNetV2 model trained by DE-MKD reached 75.25%|78.86% top-one accuracy when choosing VGG13|WRN40-2 as the teacher, setting new performance records. In addition, surprisingly, the distilled student model outperformed the teacher in both teacher-student network pairs. Full article
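The abstract says DE-MKD's teacher weights are adaptive and entropy-based but does not give the exact rule. One natural reading, sketched below with made-up teacher distributions, is a softmax over negative prediction entropies, so that confident (low-entropy) teachers receive larger weights:

```python
import math

# Hypothetical weighting rule, illustrative only: the abstract states the
# weights depend on prediction entropy but not the exact formula, so this
# uses a softmax over negative entropies. Distributions are made up.
def entropy(p):
    # Shannon entropy of one teacher's predictive distribution
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def teacher_weights(predictions):
    neg_h = [-entropy(p) for p in predictions]
    exps = [math.exp(v) for v in neg_h]
    return [e / sum(exps) for e in exps]

# teacher 1 is sharply peaked (confident); teacher 2 is nearly uniform
w = teacher_weights([[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]])
```

Under this rule the sharper first teacher receives roughly twice the weight of the flatter second one, which matches the abstract's intent of letting teacher importance vary rather than averaging logits uniformly.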

21 pages, 5915 KB  
Article
YOLOv8-LMG: An Improved Bearing Defect Detection Algorithm Based on YOLOv8
by Minggao Liu, Ming Zhang, Xinlan Chen, Chunting Zheng and Haifeng Wang
Processes 2024, 12(5), 930; https://doi.org/10.3390/pr12050930 - 2 May 2024
Cited by 13 | Viewed by 3721
Abstract
In industrial manufacturing, bearings are crucial for machinery stability and safety. Undetected wear or cracks can lead to severe operational and financial setbacks. Thus, accurately identifying bearing defects is essential for maintaining production safety and equipment reliability. This research introduces an improved bearing defect detection model, YOLOv8-LMG, which is based on the YOLOv8n framework and incorporates four innovative technologies: the VanillaNet backbone network, the Lion optimizer, the CFP-EVC module, and the Shape-IoU loss function. These enhancements significantly increase detection efficiency and accuracy. YOLOv8-LMG achieves a mAP@0.5 of 86.5% and a mAP@0.5–0.95 of 57.0% on the test dataset, surpassing the original YOLOv8n model while maintaining low computational complexity. Experimental results reveal that the YOLOv8-LMG model boosts accuracy and efficiency in bearing defect detection, showcasing its significant potential and practical value in advancing industrial inspection technologies. Full article
(This article belongs to the Special Issue Fault Diagnosis Process and Evaluation in Systems Engineering)
