Search Results (519)

Search Parameters:
Keywords = mobilenet

34 pages, 3535 KiB  
Article
Hybrid Optimization and Explainable Deep Learning for Breast Cancer Detection
by Maral A. Mustafa, Osman Ayhan Erdem and Esra Söğüt
Appl. Sci. 2025, 15(15), 8448; https://doi.org/10.3390/app15158448 - 30 Jul 2025
Viewed by 65
Abstract
Breast cancer continues to be one of the leading causes of women’s deaths around the world, which has emphasized the need for novel and interpretable diagnostic models. This work offers an explainable deep learning model that integrates the lightweight MobileNet architecture with two bio-inspired optimization algorithms, the Firefly Algorithm (FLA) and the Dingo Optimization Algorithm (DOA), to boost classification accuracy and model convergence. The DOA-optimized MobileNet achieved the highest performance, 98.96% accuracy on the fusion test set, while the FLA-optimized MobileNet reached 98.06% and 95.44% accuracy on the mammographic and ultrasound test sets, respectively. Beyond the strong quantitative results, Grad-CAM visualizations showed clinically consistent localization of the lesions, strengthening the interpretability and diagnostic reliability of the model. These results show that lightweight, compact CNNs can support high-performance, multimodal breast cancer diagnosis.
(This article belongs to the Section Computing and Artificial Intelligence)
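
Grad-CAM, which the abstract credits for the clinically consistent lesion localization, can be sketched in a few lines of Keras. This is a minimal, generic sketch, not the paper's implementation; the conv layer name, placeholder input, and class index are assumptions.

```python
# Minimal Grad-CAM sketch for a Keras MobileNet-style classifier.
# The conv layer name and class index below are assumptions.
import numpy as np
import tensorflow as tf

def grad_cam(model, image, conv_layer_name, class_index):
    # Map the input to both the target conv activations and the predictions.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)
    # Channel weights = global-average-pooled gradients (the Grad-CAM rule).
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # heatmap in [0, 1]

# Hypothetical usage with Keras' stock MobileNet:
model = tf.keras.applications.MobileNet(weights=None)
image = np.random.rand(224, 224, 3).astype("float32")  # placeholder input
heatmap = grad_cam(model, image, "conv_pw_13_relu", class_index=0)
```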

35 pages, 4940 KiB  
Article
A Novel Lightweight Facial Expression Recognition Network Based on Deep Shallow Network Fusion and Attention Mechanism
by Qiaohe Yang, Yueshun He, Hongmao Chen, Youyong Wu and Zhihua Rao
Algorithms 2025, 18(8), 473; https://doi.org/10.3390/a18080473 - 30 Jul 2025
Viewed by 137
Abstract
Facial expression recognition (FER) is a critical research direction in artificial intelligence that is widely used in intelligent interaction, medical diagnosis, security monitoring, and other domains. These applications highlight its considerable practical value and social significance. FER models often need to run efficiently on mobile or edge devices, so research on lightweight facial expression recognition is particularly important. However, the feature extraction and classification methods of most current lightweight convolutional neural network FER algorithms are not specifically optimized for the characteristics of facial expression images and fail to make full use of the feature information these images contain. To address the lack of FER models that are both lightweight and effectively optimized for expression-specific feature extraction, this study proposes a novel network design tailored to the characteristics of facial expressions. Taking the backbone architecture of the MobileNet V2 network as a reference, we design LightExNet, a lightweight convolutional neural network based on deep-shallow feature fusion, an attention mechanism, and a joint loss function. In LightExNet, deep and shallow features are first fused to fully extract the shallow features of the original image, reduce information loss, alleviate vanishing gradients as the number of convolutional layers increases, and achieve multi-scale feature fusion; the MobileNet V2 architecture is also streamlined to integrate the deep and shallow networks seamlessly. Second, a new channel and spatial attention mechanism, designed around the characteristics of expression features, encodes as much feature information from the different expression regions as possible, effectively improving recognition accuracy. Finally, an improved center loss function is added to further improve classification accuracy, with measures taken to significantly reduce the computational cost of the joint loss function. LightExNet is tested on three mainstream facial expression datasets, FER2013, CK+, and RAF-DB. The experimental results show that it has 3.27 M parameters and 298.27 M FLOPs and reaches accuracies of 69.17%, 97.37%, and 85.97%, respectively. Its comprehensive performance is better than that of current mainstream lightweight expression recognition algorithms such as MobileNet V2, IE-DBN, Self-Cure Net, Improved MobileViT, MFN, Ada-CM, and Parallel CNN (Convolutional Neural Network). The experiments confirm that LightExNet improves recognition accuracy and computational efficiency while reducing energy consumption and enhancing deployment flexibility, underscoring its strong potential for real-world lightweight FER applications.
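
As a concrete illustration of the joint loss idea (softmax cross-entropy plus an improved center loss), here is a hedged PyTorch sketch; the seven-class setup matches common FER datasets, but the feature size, weighting factor, and the paper's specific improvements to the center loss are assumptions.

```python
# Sketch of a joint softmax + center loss objective; 7 classes, a 128-d
# embedding, and the weighting factor are assumptions.
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    def __init__(self, num_classes=7, feat_dim=128):
        super().__init__()
        # One learnable center per expression class.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # Mean squared distance between features and their class centers.
        return ((features - self.centers[labels]) ** 2).sum(dim=1).mean()

ce = nn.CrossEntropyLoss()
center = CenterLoss()
lam = 0.01  # assumed weight of the center term

def joint_loss(logits, features, labels):
    return ce(logits, labels) + lam * center(features, labels)

# Hypothetical usage:
logits = torch.randn(8, 7)       # classifier output
features = torch.randn(8, 128)   # embedding before the classifier
labels = torch.randint(0, 7, (8,))
loss = joint_loss(logits, features, labels)
```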

28 pages, 4007 KiB  
Article
Voting-Based Classification Approach for Date Palm Health Detection Using UAV Camera Images: Vision and Learning
by Abdallah Guettaf Temam, Mohamed Nadour, Lakhmissi Cherroun, Ahmed Hafaifa, Giovanni Angiulli and Fabio La Foresta
Drones 2025, 9(8), 534; https://doi.org/10.3390/drones9080534 - 29 Jul 2025
Viewed by 172
Abstract
In this study, we introduce the application of deep learning (DL) models, specifically convolutional neural networks (CNNs), for detecting the health status of date palm leaves using images captured by an unmanned aerial vehicle (UAV). The UAV dynamics are modeled using the Newton–Euler method to ensure stability and accurate image acquisition. The deep learning models are implemented within a voting-based classification (VBC) system that combines multiple CNN architectures, including MobileNet, a handcrafted CNN, VGG16, and VGG19, to enhance classification accuracy and robustness. The classifiers independently generate predictions, and a voting mechanism determines the final classification. This hybridization of image-based visual servoing (IBVS) and classifiers adapts immediately to changing conditions, providing straightforward, smooth flight as well as vision-based classification. The dataset used in this study was collected with a dual-camera UAV, which captures high-resolution images to detect pests in date palm leaves. After applying the proposed classification strategy, the implemented voting method achieved an impressive accuracy of 99.16% on the test set for detecting health conditions in date palm leaves, surpassing the individual classifiers. The obtained results are discussed and compared to show the effectiveness of this classification technique.
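
The voting mechanism at the heart of the VBC system is easy to sketch; the example below shows both hard (majority) and soft (probability-averaging) voting over per-model softmax outputs, since the abstract does not specify which variant is used.

```python
# Hard (majority) and soft (probability-averaging) voting over the
# softmax outputs of several CNN classifiers.
import numpy as np

def hard_vote(prob_list):
    """prob_list: one (n_samples, n_classes) array per CNN."""
    votes = np.stack([p.argmax(axis=1) for p in prob_list])  # (n_models, n)
    # Majority label per sample; ties resolve to the lowest class index.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

def soft_vote(prob_list):
    # Average class probabilities across models, then pick the best class.
    return np.mean(prob_list, axis=0).argmax(axis=1)

# Hypothetical usage with four models, 10 samples, 2 classes:
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(2), size=10) for _ in range(4)]
print(hard_vote(probs), soft_vote(probs))
```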

23 pages, 3741 KiB  
Article
Multi-Corpus Benchmarking of CNN and LSTM Models for Speaker Gender and Age Profiling
by Jorge Jorrin-Coz, Mariko Nakano, Hector Perez-Meana and Leobardo Hernandez-Gonzalez
Computation 2025, 13(8), 177; https://doi.org/10.3390/computation13080177 - 23 Jul 2025
Viewed by 247
Abstract
Speaker profiling systems are often evaluated on a single corpus, which complicates reliable comparison. We present a fully reproducible evaluation pipeline that trains Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models independently on three speech corpora representing distinct recording conditions—studio-quality TIMIT, crowdsourced Mozilla Common Voice, and in-the-wild VoxCeleb1. All models share the same architecture, optimizer, and data preprocessing; no corpus-specific hyperparameter tuning is applied. We perform a detailed preprocessing and feature extraction procedure, evaluating multiple configurations and validating their applicability and effectiveness in improving the obtained results. A feature analysis shows that Mel spectrograms benefit CNNs, whereas Mel Frequency Cepstral Coefficients (MFCCs) suit LSTMs, and that the optimal Mel-bin count grows with the corpus signal-to-noise ratio (SNR). With this fixed recipe, EfficientNet achieves 99.82% gender accuracy on Common Voice (+1.25 pp over the previous best) and 98.86% on VoxCeleb1 (+0.57 pp). MobileNet attains 99.86% age-group accuracy on Common Voice (+2.86 pp) and a 5.35-year MAE for age estimation on TIMIT using a lightweight configuration. The consistent, near-state-of-the-art results across three acoustically diverse datasets substantiate the robustness and versatility of the proposed pipeline. Code and pre-trained weights are released to facilitate downstream research.
(This article belongs to the Section Computational Engineering)
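
The two front-ends compared (Mel spectrograms for CNNs, MFCCs for LSTMs) map onto standard librosa calls; the file name, FFT size, hop length, and bin counts below are illustrative assumptions rather than the paper's configuration.

```python
# Log-Mel spectrogram (CNN front-end) vs. MFCCs (LSTM front-end).
# File name, FFT size, hop length, and bin counts are illustrative.
import librosa
import numpy as np

y, sr = librosa.load("speech.wav", sr=16000)  # placeholder file

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=512,
                                     hop_length=160, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)  # (n_mels, frames) for CNNs

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)  # (n_mfcc, frames)
mfcc_seq = mfcc.T  # time-major sequence for an LSTM
```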

24 pages, 8015 KiB  
Article
Innovative Multi-View Strategies for AI-Assisted Breast Cancer Detection in Mammography
by Beibit Abdikenov, Tomiris Zhaksylyk, Aruzhan Imasheva, Yerzhan Orazayev and Temirlan Karibekov
J. Imaging 2025, 11(8), 247; https://doi.org/10.3390/jimaging11080247 - 22 Jul 2025
Viewed by 419
Abstract
Mammography is the main method for early detection of breast cancer, which is still a major global health concern. However, inter-reader variability and the inherent difficulty of interpreting subtle radiographic features frequently limit the accuracy of diagnosis. A thorough assessment of deep convolutional neural networks (CNNs) for automated mammogram classification is presented in this work, along with the introduction of two innovative multi-view integration techniques: Dual-Branch Ensemble (DBE) and Merged Dual-View (MDV). By setting aside two datasets for out-of-sample testing, we evaluate the generalizability of the model using six different mammography datasets that represent various populations and imaging systems. We compare a number of cutting-edge architectures on both individual and combined datasets, including ResNet, DenseNet, EfficientNet, MobileNet, Vision Transformers, and VGG19. Experimental results show that both the MDV and DBE strategies improve classification performance. Under the MDV approach, VGG19 and DenseNet obtained ROC AUC scores of 0.9051 and 0.7960, respectively. In the DBE setting, DenseNet demonstrated strong performance with a ROC AUC of 0.8033, while ResNet50 recorded a ROC AUC of 0.8042. These enhancements demonstrate how beneficial multi-view fusion is for boosting model robustness. The impact of domain shift is further highlighted by the generalization tests, which emphasize the need for diverse datasets in training. These results offer practical advice for improving CNN architectures and integration tactics, which will aid in the creation of trustworthy, broadly applicable AI-assisted breast cancer screening tools.
(This article belongs to the Section Medical Imaging)
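
As a rough sketch of a dual-branch multi-view design in the spirit of DBE, the Keras snippet below feeds two mammographic views through a DenseNet backbone and fuses the pooled features before the classifier head. Weight sharing between branches, the input size, and the backbone choice are simplifying assumptions; the paper's DBE may train independent branches.

```python
# Dual-view model: two inputs, one shared backbone, fused features.
import tensorflow as tf
from tensorflow.keras import layers

def dual_branch(input_shape=(224, 224, 3), n_classes=2):
    view_cc = tf.keras.Input(input_shape, name="view_cc")
    view_mlo = tf.keras.Input(input_shape, name="view_mlo")
    # One backbone applied to both views (weights shared for simplicity).
    backbone = tf.keras.applications.DenseNet121(
        include_top=False, weights=None, pooling="avg",
        input_shape=input_shape)
    fused = layers.Concatenate()([backbone(view_cc), backbone(view_mlo)])
    out = layers.Dense(n_classes, activation="softmax")(fused)
    return tf.keras.Model([view_cc, view_mlo], out)

model = dual_branch()
model.summary()
```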

15 pages, 3364 KiB  
Article
Potential Benefits of Polar Transformation of Time–Frequency Electrocardiogram (ECG) Signals for Evaluation of Cardiac Arrhythmia
by Hanbit Kang, Daehyun Kwon and Yoon-Chul Kim
Appl. Sci. 2025, 15(14), 7980; https://doi.org/10.3390/app15147980 - 17 Jul 2025
Viewed by 217
Abstract
There is a lack of studies on the effectiveness of polar-transformed spectrograms in the visualization and prediction of cardiac arrhythmias from electrocardiogram (ECG) data. In this study, single-lead ECG waveforms were converted into two-dimensional rectangular time–frequency spectrograms and polar time–frequency spectrograms. Three pre-trained convolutional neural network (CNN) models (ResNet50, MobileNet, and DenseNet121) served as baseline networks for model development and testing. Prediction performance and visualization quality were evaluated across various image resolutions. The trade-offs between image resolution and model capacity were quantitatively analyzed. Polar-transformed spectrograms demonstrated superior delineation of R-R intervals at lower image resolutions (e.g., 96 × 96 pixels) compared to conventional spectrograms. For deep-learning-based classification of cardiac arrhythmias, polar-transformed spectrograms achieved comparable accuracy to conventional spectrograms across all evaluated resolutions. The results suggest that polar-transformed spectrograms are particularly advantageous for deep CNN predictions at lower resolutions, making them suitable for edge computing applications where the reduced use of computing resources, such as memory and power consumption, is desirable.
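
A polar view of a rectangular time–frequency spectrogram can be produced with OpenCV's warpPolar; the sketch below uses the 96 × 96 resolution highlighted in the abstract, while the axis-to-angle convention and normalization are assumptions about the paper's transform.

```python
# Map a rectangular (freq x time) spectrogram onto a polar disk image.
import cv2
import numpy as np

def to_polar(spectrogram, size=96):
    # Scale to 8-bit so the result behaves like a regular grayscale image.
    img = cv2.normalize(spectrogram, None, 0, 255, cv2.NORM_MINMAX)
    img = cv2.resize(img.astype(np.uint8), (size, size))
    # Transpose so time runs along rows; with WARP_INVERSE_MAP, source rows
    # are interpreted as angle and columns as radius (assumed convention).
    polar_src = img.T.copy()
    center = (size / 2, size / 2)
    return cv2.warpPolar(polar_src, (size, size), center, size / 2,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR
                         + cv2.WARP_INVERSE_MAP)

disk = to_polar(np.random.rand(128, 400))  # placeholder spectrogram
```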

21 pages, 3250 KiB  
Article
Deploying Optimized Deep Vision Models for Eyeglasses Detection on Low-Power Platforms
by Henrikas Giedra, Tomyslav Sledevič and Dalius Matuzevičius
Electronics 2025, 14(14), 2796; https://doi.org/10.3390/electronics14142796 - 11 Jul 2025
Viewed by 464
Abstract
This research addresses the optimization and deployment of convolutional neural networks for eyeglasses detection on low-power edge devices. Multiple convolutional neural network architectures were trained and evaluated using the FFHQ dataset, which contains annotated eyeglasses in the context of faces with diverse facial features and eyewear styles. Several post-training quantization techniques, including Float16, dynamic range, and full integer quantization, were applied to reduce model size and computational demand while preserving detection accuracy. The impact of model architecture and quantization methods on detection accuracy and inference latency was systematically evaluated. The optimized models were deployed and benchmarked on Raspberry Pi 5 and NVIDIA Jetson Orin Nano platforms. Experimental results show that full integer quantization reduces model size by up to 75% while maintaining competitive detection accuracy. Among the evaluated models, MobileNet architectures achieved the most favorable balance between inference speed and accuracy, demonstrating their suitability for real-time eyeglasses detection in resource-constrained environments. These findings enable efficient on-device eyeglasses detection, supporting applications such as virtual try-ons and IoT-based facial analysis systems.
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 4th Edition)
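
The three post-training quantization modes evaluated (dynamic range, Float16, and full integer) correspond directly to settings of the standard TensorFlow Lite converter. The sketch below uses a stock MobileNetV2 and random calibration data as placeholders.

```python
# Post-training quantization with the TensorFlow Lite converter.
# MobileNetV2 and the random calibration set are placeholders.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)
calibration = np.random.rand(100, 224, 224, 3).astype("float32")

# 1) Dynamic-range quantization: int8 weights, float activations.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_tflite = converter.convert()

# 2) Float16 quantization: roughly halves model size.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
fp16_tflite = converter.convert()

# 3) Full integer quantization: calibrates activation ranges on
# representative inputs (the mode the paper ties to ~75% size reduction).
def representative_data():
    for img in calibration:
        yield [img[None]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
int8_tflite = converter.convert()
```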

24 pages, 9593 KiB  
Article
Deep Learning Approaches for Skin Lesion Detection
by Jonathan Vieira, Fábio Mendonça and Fernando Morgado-Dias
Electronics 2025, 14(14), 2785; https://doi.org/10.3390/electronics14142785 - 10 Jul 2025
Viewed by 312
Abstract
Recently, there has been a rise in skin cancer cases, for which early detection is highly relevant, as it increases the likelihood of a cure. In this context, this work presents a benchmarking study of standard Convolutional Neural Network (CNN) architectures for automated skin lesion classification. A total of 38 CNN architectures from ten families (ConvNeXt, DenseNet, EfficientNet, Inception, InceptionResNet, MobileNet, NASNet, ResNet, VGG, and Xception) were evaluated using transfer learning on the HAM10000 dataset for seven-class skin lesion classification, namely, actinic keratoses, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, melanocytic nevi, and vascular lesions. The comparative analysis used standardized training conditions, with all models utilizing frozen pre-trained weights. Cross-database validation was then conducted using the ISIC 2019 dataset to assess generalizability across different data distributions. The ConvNeXtXLarge architecture achieved the best performance, despite having one of the lowest performance-to-number-of-parameters ratios, with 87.62% overall accuracy and 76.15% F1 score on the test set, demonstrating competitive results within the established performance range of existing HAM10000-based studies. A proof-of-concept multiplatform mobile application was also implemented using a client–server architecture with encrypted image transmission, demonstrating the viability of integrating high-performing models into healthcare screening tools.
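
The frozen-weights transfer learning setup described can be sketched in a few Keras lines: a pre-trained backbone with all layers frozen plus a seven-class classification head. The ConvNeXtXLarge choice and class count follow the abstract; the input size, head design, and optimizer are assumptions.

```python
# Frozen-backbone transfer learning for seven-class lesion classification.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.ConvNeXtXLarge(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # freeze all pre-trained weights

inputs = tf.keras.Input((224, 224, 3))
x = base(inputs, training=False)  # keep normalization stats frozen
outputs = layers.Dense(7, activation="softmax")(x)  # 7 lesion classes
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```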

28 pages, 2676 KiB  
Article
Improved Filter Designs Using Image Processing Techniques for Color Vision Deficiency (CVD) Types
by Fatma Akalın, Nilgün Özkan Aksoy, Dilara Top and Esma Kara
Symmetry 2025, 17(7), 1046; https://doi.org/10.3390/sym17071046 - 2 Jul 2025
Viewed by 432
Abstract
The eye is one of our five sense organs, where optical and neural structures are integrated. It works in synchrony with the brain, enabling the formation of meaningful images. However, loss of function, complete absence, or structural abnormalities of the cone cells in the retina cause the various types of Color Vision Deficiency (CVD). This deficiency is characterized by an inability to clearly distinguish colors from the same region of the spectrum, and it greatly affects the patient's quality of life. Therefore, it is important to develop filters that enable such colors to be distinguished successfully. In this study, an original filter design was developed, built on a five-stage systematic structure whose stages complement and support one another. However, its performance needs to be tested with objective methods independent of human judgment. Therefore, in order to provide performance analyses based on objective evaluation criteria, original and enhanced images, as simulated for patients with seven different Color Vision Deficiency (CVD) types, were classified with the MobileNet transfer learning model. The classification results show that the developed final filter greatly reduces the differences in color perception levels between the two eyes. Thus, color stimulation between the two eyes is more balanced, and perceptual symmetry is created. With perceptual symmetry, environmental colors are perceived more consistently and distinguishably, and the visual difficulties encountered by color-blind individuals in daily life are reduced.
(This article belongs to the Special Issue Symmetry in Computational Intelligence and Applications)

39 pages, 2612 KiB  
Article
A Deep Learning-Driven CAD for Breast Cancer Detection via Thermograms: A Compact Multi-Architecture Feature Strategy
by Omneya Attallah
Appl. Sci. 2025, 15(13), 7181; https://doi.org/10.3390/app15137181 - 26 Jun 2025
Viewed by 450
Abstract
Breast cancer continues to be the most common malignancy among women worldwide, presenting a considerable public health issue. Mammography, though the gold standard for screening, has limitations that catalyzed the advancement of non-invasive, radiation-free alternatives, such as thermal imaging (thermography). This research introduces a novel computer-aided diagnosis (CAD) framework aimed at improving breast cancer detection via thermal imaging. The suggested framework mitigates the limitations of current CAD systems, which frequently utilize intricate convolutional neural network (CNN) structures and resource-intensive preprocessing, by incorporating streamlined CNN designs, transfer learning strategies, and multi-architecture ensemble methods. Features are primarily obtained from various layers of MobileNet, EfficientNetB0, and ShuffleNet architectures to assess the impact of individual layers on classification performance. Following that, feature transformation methods, such as discrete wavelet transform (DWT) and non-negative matrix factorization (NNMF), are employed to diminish feature dimensionality and enhance computational efficiency. Features from all layers of the three CNNs are subsequently incorporated, and the Minimum Redundancy Maximum Relevance (MRMR) algorithm is utilized to determine the most prominent features. Ultimately, support vector machine (SVM) classifiers are employed for classification purposes. The results indicate that integrating features from various CNNs and layers markedly improves performance, attaining a maximum accuracy of 99.4%. Furthermore, the combination of attributes from all three layers of the CNNs, in conjunction with NNMF, attained a maximum accuracy of 99.9% with merely 350 features. This CAD system demonstrates the efficacy of thermal imaging and multi-layer feature amalgamation to enhance non-invasive breast cancer diagnosis by reducing computational requirements through multi-layer feature integration and dimensionality reduction techniques.
(This article belongs to the Special Issue Application of Decision Support Systems in Biomedical Engineering)
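
The backbone-features-then-reduce-then-classify pipeline can be sketched with Keras and scikit-learn; below, pooled MobileNet features are factorized with NNMF and fed to an SVM. The layer, component count, and data are placeholder assumptions, and the MRMR selection step is omitted for brevity.

```python
# Deep features -> NNMF dimensionality reduction -> SVM classifier.
# Backbone layer, component count, and data are placeholder assumptions.
import numpy as np
import tensorflow as tf
from sklearn.decomposition import NMF
from sklearn.svm import SVC

backbone = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, pooling="avg")

X_img = np.random.rand(64, 224, 224, 3).astype("float32")  # stand-in thermograms
y = np.random.randint(0, 2, 64)                             # stand-in labels

feats = backbone.predict(X_img, verbose=0)  # (64, 1024) pooled deep features
feats = np.maximum(feats, 0)                # NMF requires non-negative input
reduced = NMF(n_components=32, max_iter=500).fit_transform(feats)

clf = SVC(kernel="rbf").fit(reduced, y)     # final classification stage
```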

23 pages, 5128 KiB  
Article
Transfer Learning Fusion Approaches for Colorectal Cancer Histopathological Image Analysis
by Houda Saif ALGhafri and Chia S. Lim
J. Imaging 2025, 11(7), 210; https://doi.org/10.3390/jimaging11070210 - 26 Jun 2025
Viewed by 502
Abstract
It is well-known that accurate classification of histopathological images is essential for effective diagnosis of colorectal cancer. Our study presents three attention-based decision fusion models that combine pre-trained CNNs (Inception V3, Xception, and MobileNet) with a spatial attention mechanism to enhance feature extraction and focus on critical image regions. A key innovation is the attention-driven fusion strategy at the decision level, where model predictions are weighted by relevance and confidence to improve classification performance. The proposed models were tested on diverse datasets, including 17,531 colorectal cancer histopathological images collected from the Royal Hospital in the Sultanate of Oman and a publicly accessible repository, to assess their generalizability. The models achieved high accuracy (98–100%), strong MCC and Kappa scores, and low misclassification rates, highlighting their robustness. They outperformed individual transfer learning approaches (p = 0.009), with performance differences attributed to the characteristics of the datasets. Gradient-weighted class activation mapping highlighted key predictive regions, enhancing interpretability. Our findings suggest that the proposed models can accurately classify CRC images, highlighting their value for research and future exploration in diagnostic support.
(This article belongs to the Section Medical Imaging)
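
Decision-level fusion with confidence weighting can be illustrated in a few lines of numpy; the sketch below weights each model's softmax output by its maximum class probability, an illustrative stand-in for the paper's attention-driven weighting.

```python
# Confidence-weighted decision fusion over per-model softmax outputs.
import numpy as np

def weighted_fusion(prob_list):
    """prob_list: one (n_samples, n_classes) softmax array per model."""
    probs = np.stack(prob_list)              # (n_models, n, c)
    conf = probs.max(axis=2, keepdims=True)  # per-sample confidence
    weights = conf / conf.sum(axis=0, keepdims=True)
    # Confident models contribute more to the fused prediction.
    return (weights * probs).sum(axis=0).argmax(axis=1)
```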

27 pages, 2079 KiB  
Article
Deep Learning-Based Draw-a-Person Intelligence Quotient Screening
by Shafaat Hussain, Toqeer Ehsan, Hassan Alhuzali and Ali Al-Laith
Big Data Cogn. Comput. 2025, 9(7), 164; https://doi.org/10.3390/bdcc9070164 - 24 Jun 2025
Viewed by 742
Abstract
The Draw-A-Person Intellectual Ability test for children, adolescents, and adults is a widely used tool in psychology for assessing intellectual ability. This test relies on human drawings for initial raw scoring, with the subsequent conversion of data into IQ ranges through manual procedures. However, this manual scoring and IQ assessment process can be time-consuming, particularly for busy psychologists dealing with a high caseload of children and adolescents. Presently, DAP-IQ screening continues to be a manual endeavor conducted by psychologists. The primary objective of our research is to streamline the IQ screening process for psychologists by leveraging deep learning algorithms. In this study, we utilized the DAP-IQ manual to derive IQ measurements and categorized the entire dataset into seven distinct classes: Very Superior, Superior, High Average, Average, Below Average, Significantly Impaired, and Mildly Impaired. The dataset for IQ screening was sourced from primary to high school students aged 8 to 17, comprising over 1100 sketches, which were manually classified according to the DAP-IQ manual. The manually classified dataset was then converted into digital images. To develop the artificial intelligence-based models, various deep learning algorithms were employed, including a Convolutional Neural Network (CNN) and state-of-the-art CNN transfer learning models such as MobileNet, Xception, InceptionResNetV2, and InceptionV3. The MobileNet model demonstrated remarkable performance, achieving a classification accuracy of 98.68% and surpassing existing methodologies. This research represents a significant step towards expediting and enhancing IQ screening for psychologists working with diverse age groups.

28 pages, 3267 KiB  
Article
Alzheimer’s Disease Detection in Various Brain Anatomies Based on Optimized Vision Transformer
by Faisal Mehmood, Asif Mehmood and Taeg Keun Whangbo
Mathematics 2025, 13(12), 1927; https://doi.org/10.3390/math13121927 - 10 Jun 2025
Viewed by 527
Abstract
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder and a growing public health concern. Despite significant advances in deep learning for medical image analysis, early and accurate diagnosis of AD remains challenging. In this study, we focused on optimizing the training process of deep learning models by proposing an enhanced version of the Adam optimizer. The proposed optimizer introduces adaptive learning rate scaling, momentum correction, and decay modulation to improve convergence speed, training stability, and classification accuracy. We integrated the enhanced optimizer with Vision Transformer (ViT) and Convolutional Neural Network (CNN) architectures. The ViT-based model comprises a linear projection of image patches, positional encoding, a transformer encoder, and a Multi-Layer Perceptron (MLP) head with a Softmax classifier for multiclass AD classification. Experiments on publicly available Alzheimer’s disease datasets (ADNI-1 and ADNI-2) showed that the enhanced optimizer enabled the ViT model to achieve a 99.84% classification accuracy on Dataset-1 and 95.75% on Dataset-2, outperforming Adam, RMSProp, and SGD. Moreover, the optimizer reduced entropy loss and improved convergence stability by 0.8–2.1% across various architectures, including ResNet, RegNet, and MobileNet. This work contributes a robust optimizer-centric framework that enhances training efficiency and diagnostic accuracy for automated Alzheimer’s disease detection.
(This article belongs to the Special Issue The Application of Deep Neural Networks in Image Processing)
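
The abstract names three modifications to Adam (adaptive learning-rate scaling, momentum correction, and decay modulation) without giving their form. The numpy sketch below is a plain Adam update annotated with where such modulation would plug in; the decay term shown is a placeholder, not the paper's method.

```python
# Plain Adam update, annotated with where the described modifications
# would intervene. The lr_t decay term is a placeholder assumption.
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad        # first moment (momentum correction
                                        # would modify this update)
    v = b2 * v + (1 - b2) * grad ** 2   # second moment (adaptive scaling
                                        # would act on this term)
    m_hat = m / (1 - b1 ** t)           # standard bias corrections
    v_hat = v / (1 - b2 ** t)
    lr_t = lr / (1.0 + 1e-4 * t)        # placeholder decay modulation
    param = param - lr_t * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```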

33 pages, 10095 KiB  
Article
Enhanced Brain Tumor Classification Using MobileNetV2: A Comprehensive Preprocessing and Fine-Tuning Approach
by Md Atiqur Rahman, Mohammad Badrul Alam Miah, Md. Abir Hossain and A. S. M. Sanwar Hosen
BioMedInformatics 2025, 5(2), 30; https://doi.org/10.3390/biomedinformatics5020030 - 5 Jun 2025
Viewed by 1820
Abstract
Background: Brain tumors are among the most difficult diseases to deal with in modern medicine due to the uncontrolled cell proliferation, which causes grave damage to the nervous system. Brain tumors can be broadly classified into two categories: primary tumors, which originate within the brain, and secondary tumors, which are metastatic in nature. Effective diagnosis and treatment of gliomas, meningiomas, and pituitary tumors require the precise differentiation of these tumors, as well as non-tumors, for improved clinical outcomes. Methods: Here, we present a new method to classify brain tumors based on the MobileNetV2 architecture with advanced preprocessing for high accuracy. We used an MRI image dataset from Kaggle that contained 1311 images in the test set, splitting the data into 80% training and 20% testing. All images underwent extensive preprocessing, including grayscale conversion, noise removal, and contrast-limited adaptive histogram equalization (CLAHE), and were resized to 224 × 224 pixels. Using transfer learning, the frozen baseline layers were kept intact while the top layers were trained with a learning rate of 0.0001, with early stopping used to avoid overfitting. Results: With the outlined methodology, we obtained an accuracy of 99.16%, including strong performance in the no-tumor category, where recall rates approached 100% and false positive rates were minimized. Conclusions: These findings strongly indicate that the application of lightweight convolutional neural networks in diagnostic imaging can considerably expedite accurate brain tumor identification by radiologists.
(This article belongs to the Section Applied Biomedical Data Science)
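
The preprocessing chain described (grayscale conversion, noise removal, CLAHE, resizing to 224 × 224) maps onto standard OpenCV calls; the clip limit, tile size, and denoising strength below are assumptions.

```python
# Grayscale -> denoise -> CLAHE -> resize, mirroring the stated pipeline.
# Clip limit, tile size, and denoising strength are assumptions.
import cv2

def preprocess(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    denoised = cv2.fastNlMeansDenoising(gray, None, 10)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(denoised)
    resized = cv2.resize(enhanced, (224, 224))
    # MobileNetV2 expects three channels; replicate the grayscale plane.
    return cv2.cvtColor(resized, cv2.COLOR_GRAY2BGR)
```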

19 pages, 5574 KiB  
Article
Low-Damage Grasp Method for Plug Seedlings Based on Machine Vision and Deep Learning
by Fengwei Yuan, Gengzhen Ren, Zhang Xiao, Erjie Sun, Guoning Ma, Shuaiyin Chen, Zhenlong Li, Zhenhong Zou and Xiangjiang Wang
Agronomy 2025, 15(6), 1376; https://doi.org/10.3390/agronomy15061376 - 4 Jun 2025
Viewed by 381
Abstract
In the process of plug seedling transplantation, cracking and dropping of the seedling substrate or damage to the seedling stems and leaves will affect the survival rate of seedlings after transplantation. Most current research focuses on reducing substrate loss while ignoring damage to the plug seedling itself. To address the high damage rate during transplantation of plug seedlings, we propose an adaptive grasp method based on machine vision and deep learning and design a lightweight real-time grasp detection network (LRGN). The lightweight network MobileNet is used as the feature extraction network to reduce the number of network parameters. Meanwhile, a dilated refinement module (DRM) is designed to effectively enlarge the receptive field and capture more contextual information. Further, a pixel-attention-guided fusion module (PAG) and a depth-guided fusion module (DGFM) are proposed to effectively fuse deep and shallow features and extract multi-scale information. Lastly, a mixed attention module (MAM) is proposed to enhance the network's attention to important grasp features. The experimental results show that the proposed network reaches 98.96% and 98.30% grasp detection accuracy on the image-wise and object-wise splits of the Cornell dataset, respectively. Its grasp detection accuracy on the plug seedling grasp dataset is up to 98.83%, with a detection speed of up to 113 images/s and only 12.67 M parameters. Compared with the networks evaluated against it, the proposed network not only has a smaller computational volume and fewer parameters, but also significantly improves grasp detection accuracy and speed, and the generated grasps effectively avoid the seedlings, reducing the damage rate in the grasping phase and realizing low-damage grasping. This provides a theoretical basis and method for low-damage transplanting equipment.
(This article belongs to the Section Precision and Digital Agriculture)
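
Of the modules listed, the dilated refinement module is the most self-contained to illustrate. The PyTorch sketch below shows a generic dilated-convolution refinement block with a residual connection; the channel count, dilation rates, and fusion layout are assumptions, not the paper's exact DRM.

```python
# Generic dilated refinement block: parallel dilated convolutions enlarge
# the receptive field and a 1x1 convolution fuses the branches.
import torch
import torch.nn as nn

class DilatedRefinement(nn.Module):
    def __init__(self, channels=256, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        ])
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # padding == dilation keeps the same spatial size in every branch.
        feats = torch.cat([self.act(b(x)) for b in self.branches], dim=1)
        return self.act(self.fuse(feats)) + x  # residual refinement

block = DilatedRefinement()
out = block(torch.randn(1, 256, 28, 28))  # placeholder feature map
```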
