Search Results (766)

Search Parameters:
Keywords = EfficientNet-B08

30 pages, 4298 KB  
Article
Integrating Convolutional, Transformer, and Graph Neural Networks for Precision Agriculture and Food Security
by Esraa A. Mahareek, Mehmet Akif Cifci and Abeer S. Desuky
AgriEngineering 2025, 7(10), 353; https://doi.org/10.3390/agriengineering7100353 (registering DOI) - 19 Oct 2025
Abstract
Ensuring global food security requires accurate and robust solutions for crop health monitoring, weed detection, and large-scale land-cover classification. To this end, we propose AgroVisionNet, a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs) for local feature extraction, Vision Transformers (ViTs) for capturing long-range global dependencies, and Graph Neural Networks (GNNs) for modeling spatial relationships between image regions. The framework was evaluated on five diverse benchmark datasets—PlantVillage (leaf-level disease detection), Agriculture-Vision (field-scale anomaly segmentation), BigEarthNet (satellite-based land-cover classification), UAV Crop and Weed (weed segmentation), and EuroSAT (multi-class land-cover recognition). Across these datasets, AgroVisionNet consistently outperformed strong baselines including ResNet-50, EfficientNet-B0, ViT, and Mask R-CNN. For example, it achieved 97.8% accuracy and 95.6% IoU on PlantVillage, 94.5% accuracy on Agriculture-Vision, 92.3% accuracy on BigEarthNet, 91.5% accuracy on UAV Crop and Weed, and 96.4% accuracy on EuroSAT. These results demonstrate the framework’s robustness across tasks ranging from fine-grained disease detection to large-scale anomaly mapping. The proposed hybrid approach addresses persistent challenges in agricultural imaging, including class imbalance, image quality variability, and the need for multi-scale feature integration. By combining complementary architectural strengths, AgroVisionNet establishes a new benchmark for deep learning applications in precision agriculture. Full article
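The IoU figure quoted above is the standard intersection-over-union overlap metric for segmentation. A minimal sketch of how it is computed on binary masks (illustrative only, with invented toy masks; not code from the paper):

```python
def iou(pred, target):
    """Intersection-over-Union for two binary masks given as flat 0/1 lists."""
    inter = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    return inter / union if union else 1.0  # two empty masks agree perfectly

# toy 1-D "masks"
pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(iou(pred, target))  # 2 / 4 = 0.5
```

In practice the same computation runs per class over full 2-D masks and is averaged across the dataset.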

27 pages, 3749 KB  
Article
A Lightweight Deep Learning Model for Tea Leaf Disease Identification
by Bo-Yu Lien and Chih-Chin Lai
Mach. Learn. Knowl. Extr. 2025, 7(4), 123; https://doi.org/10.3390/make7040123 (registering DOI) - 19 Oct 2025
Abstract
Tea is a globally important economic crop, and the ability to quickly and accurately identify tea leaf diseases can significantly improve both the yield and quality of tea production. With advances in deep learning, many recent studies have demonstrated that convolutional neural networks are both feasible and effective for identifying tea leaf diseases. In this paper, we propose a modified EfficientNetB0 lightweight convolutional neural network, enhanced with the ECA module, to reliably identify various tea leaf diseases. We used two tea leaf disease datasets from the Kaggle platform: the Tea_Leaf_Disease dataset, which contains six categories, and the teaLeafBD dataset, which includes seven categories. Experimental results show that our method substantially reduces computational costs, the number of parameters, and overall model size. Additionally, it achieves accuracies of 99.49% and 90.73% on these widely used datasets, making it highly suitable for practical deployment on resource-constrained edge devices. Full article
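The ECA module mentioned above gates each channel with a weight derived from a small 1-D convolution over globally pooled channel statistics. A rough numpy sketch of that idea, with the learned convolution weights replaced by a simple averaging kernel for illustration (an assumption; the real module learns these weights):

```python
import numpy as np

def eca(x, k=3):
    """Efficient Channel Attention (sketch): x has shape (C, H, W)."""
    c = x.shape[0]
    y = x.mean(axis=(1, 2))                      # global average pool -> (C,)
    yp = np.pad(y, k // 2, mode="edge")
    kernel = np.full(k, 1.0 / k)                 # stand-in for learned 1-D conv weights
    attn = np.array([np.dot(yp[i:i + k], kernel) for i in range(c)])
    attn = 1.0 / (1.0 + np.exp(-attn))           # sigmoid gate per channel
    return x * attn[:, None, None]               # rescale each channel

x = np.random.rand(8, 4, 4)
out = eca(x)
print(out.shape)  # (8, 4, 4)
```

Because the 1-D convolution only mixes a few neighboring channels, ECA adds far fewer parameters than a squeeze-and-excitation block, which fits the lightweight goal described above.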

15 pages, 1727 KB  
Article
Artificial Intelligence for Diagnosing Cranial Nerve III, IV, and VI Palsies Using Nine-Directional Ocular Photographs
by Hyun Jin Shin, Seok Jin Kim, Sung Hyun Park, Min Seok Kim and Hyunkyoo Kang
Appl. Sci. 2025, 15(20), 11174; https://doi.org/10.3390/app152011174 (registering DOI) - 18 Oct 2025
Abstract
Eye movements are regulated by the ocular motor nerves (cranial nerves [CNs] III, IV, and VI), which control the six extraocular muscles of each eye. Palsies of CNs III, IV, and VI can restrict eye movements, resulting in strabismus and diplopia, and so clinical evaluations of eye movements are crucial for diagnosing CN palsies. This study aimed to develop an accurate artificial intelligence (AI) system for classifying CN III, IV, and VI palsies using nine-gaze ocular photographs. We analyzed 478 nine-gaze photographs comprising 70, 29, and 58 cases of CN III, IV, and VI palsies, respectively. The images were processed using MATLAB. For model training, each photograph of eye movements in the nine directions was numerically coded. A multinetwork model was employed to ensure precise analyses of paralytic strabismus. The AI system operates by referring data on minor abnormalities in the nine-gaze image to a network designed to detect CN IV abnormalities, which re-examines downward and lateral gazes to detect distinctions. Data on major abnormalities are directed to a different network trained to differentiate between CN III and VI abnormalities. EfficientNet-B0 was adopted as the neural network architecture to reduce overfitting and improve learning efficiency when training with limited medical imaging data. The diagnostic accuracies of the proposed network for CN III, IV, and VI palsies were 99.31%, 97.7%, and 98.22%, respectively. This study has demonstrated the design of an AI model using a relatively small dataset and a multinetwork training system for analyzing nine-gaze photographs in strabismus patients with CN III, IV, and VI palsies, achieving an overall accuracy of 98.77%. Full article

30 pages, 3661 KB  
Article
Bio-Inspired Optimization of Transfer Learning Models for Diabetic Macular Edema Classification
by A. M. Mutawa, Khalid Sabti, Bibin Shalini Sundaram Thankaleela and Seemant Raizada
AI 2025, 6(10), 269; https://doi.org/10.3390/ai6100269 - 17 Oct 2025
Abstract
Diabetic Macular Edema (DME) poses a significant threat to vision, often leading to permanent blindness if not detected and addressed swiftly. Existing manual diagnostic methods are arduous and inconsistent, highlighting the pressing necessity for automated, accurate, and personalized solutions. This study presents a novel methodology for diagnosing DME and categorizing choroidal neovascularization (CNV), drusen, and normal conditions from fundus images through the application of transfer learning models and bio-inspired optimization methodologies. The methodology utilizes advanced transfer learning architectures, including VGG16, VGG19, ResNet50, EfficientNetB7, EfficientNetV2-S, InceptionV3, and InceptionResNetV2, for analyzing both binary and multi-class Optical Coherence Tomography (OCT) datasets. We combined the OCT datasets OCT2017 and OCTC8 to create a new dataset for our study. The parameters, including the learning rate, batch size, and dropout rate of the fully connected network, are further adjusted using the bio-inspired Particle Swarm Optimization (PSO) method, in conjunction with thorough preprocessing. Explainable AI approaches, especially Shapley additive explanations (SHAP), provide transparent insights into the model’s decision-making processes. Experimental findings demonstrate that our bio-inspired optimized transfer learning InceptionV3 model significantly surpasses conventional deep learning techniques for DME classification, as evidenced by enhanced metrics including the accuracy, precision, recall, F1-score, misclassification rate, Matthew’s correlation coefficient, intersection over union, and kappa coefficient for both binary and multi-class scenarios. The accuracy achieved is approximately 98% in binary classification and roughly 90% in multi-class classification with the InceptionV3 model. The integration of contemporary transfer learning architectures with nature-inspired PSO enhances diagnostic precision to approximately 95% in multi-class classification, while also improving interpretability and reliability, which are crucial for clinical implementation. This research promotes the advancement of more precise, personalized, and timely diagnostic and therapeutic strategies for Diabetic Macular Edema, aiming to avert vision loss and improve patient outcomes. Full article
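PSO, as used above to tune the learning rate, batch size, and dropout, maintains a swarm of candidate solutions that are pulled toward each particle's personal best and the swarm's global best. A minimal one-dimensional sketch on a toy objective (a stand-in for validation loss; not the authors' implementation):

```python
import random

def pso(f, lo, hi, n_particles=20, iters=100, w=0.5, c1=1.5, c2=1.5, seed=0):
    """Minimal 1-D Particle Swarm Optimization: minimize f over [lo, hi]."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]                  # each particle's best position so far
    gbest = min(xs, key=f)         # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            vs[i] = (w * vs[i]
                     + c1 * rng.random() * (pbest[i] - xs[i])   # pull to personal best
                     + c2 * rng.random() * (gbest - xs[i]))     # pull to global best
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i]
            if f(xs[i]) < f(gbest):
                gbest = xs[i]
    return gbest

# toy objective standing in for "validation loss as a function of a hyperparameter"
best = pso(lambda x: (x - 3.0) ** 2, 0.0, 10.0)
print(round(best, 3))  # ≈ 3.0
```

For real hyperparameter tuning, each dimension of a particle would hold one hyperparameter and `f` would train and validate a model, which is what makes the search expensive.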
27 pages, 9637 KB  
Article
ConvNeXt-L-Based Recognition of Decorative Patterns in Historical Architecture: A Case Study of Macau
by Junling Zhou, Lingfeng Xie, Pia Fricker and Kuan Liu
Buildings 2025, 15(20), 3705; https://doi.org/10.3390/buildings15203705 - 14 Oct 2025
Abstract
As a well-known World Cultural Heritage Site, the Historic Centre of Macao’s historical buildings possess a wealth of decorative patterns. These patterns contain cultural esthetics, geographical environment, cultural traditions, and other elements from specific historical periods, deeply reflecting the evolution of religious rituals and political and economic systems throughout history. Through long-term research, this article constructs a dataset of 11,807 images of local decorative patterns of historical buildings in Macau, and proposes a fine-grained image classification method using the ConvNeXt-L model. The ConvNeXt-L model is an efficient convolutional neural network that has demonstrated excellent performance in image classification tasks in fields such as medicine and architecture. Its outstanding advantages lie in handling limited training samples, diverse image features, and complex scenes. The most typical advantage of this model is its structural integration of key design concepts from a Transformer, which significantly enhances the feature extraction and generalization ability of samples. In response to the objective reality that the decorative patterns of historical buildings in Macau have rich levels of detail and a limited number of functional building categories, ConvNeXt-L maximizes its ability to recognize and classify patterns while ensuring computational efficiency. This provides a more ideal technical path for the classification of small-sample complex images. This article constructs a deep learning system based on the PyTorch 1.11 framework and compares ResNet50, EfficientNet-B7, ViT-B/16, Swin-B, RegNet-Y-16GF, and ConvNeXt series models. The results indicate a positive correlation between model performance and structural complexity, with ConvNeXt-L being the most ideal in terms of accuracy in decorative pattern classification, due to its fusion of convolution and attention mechanisms. This study not only provides a multidimensional exploration for the protection and revitalization of Macao’s historical and cultural heritage and enriches theoretical support and practical foundations but also provides new research paths and methodological support for artificial intelligence technology to assist in the planning and decision-making of historical urban areas. Full article

19 pages, 4172 KB  
Article
Deep Learning Application of Fruit Planting Classification Based on Multi-Source Remote Sensing Images
by Jiamei Miao, Jian Gao, Lei Wang, Lei Luo and Zhi Pu
Appl. Sci. 2025, 15(20), 10995; https://doi.org/10.3390/app152010995 - 13 Oct 2025
Abstract
With global climate change, urbanization, and agricultural resource limitations, precision agriculture and crop monitoring are crucial worldwide. Integrating multi-source remote sensing data with deep learning enables accurate crop mapping, but selecting optimal network architectures remains challenging. To improve remote sensing-based fruit planting classification and support orchard management and rural revitalization, this study explored feature selection and network optimization. We proposed an improved CF-EfficientNet model (incorporating FGMF and CGAR modules) for fruit planting classification. Multi-source remote sensing data (Sentinel-1, Sentinel-2, and SRTM) were used to extract spectral, vegetation, polarization, terrain, and texture features, thereby constructing a high-dimensional feature space. Feature selection identified 13 highly discriminative bands, forming an optimal dataset, namely the preferred bands (PBs). At the same time, two classification datasets—multi-spectral bands (MS) and preferred bands (PBs)—were constructed, and five typical deep learning models were introduced to compare performance: (1) EfficientNetB0, (2) AlexNet, (3) VGG16, (4) ResNet18, (5) RepVGG. The experimental results showed that the EfficientNetB0 model based on the preferred band performed best in terms of overall accuracy (87.1%) and Kappa coefficient (0.677). Furthermore, a Fine-Grained Multi-scale Fusion (FGMF) and a Condition-Guided Attention Refinement (CGAR) were incorporated into EfficientNetB0, and the traditional SGD optimizer was replaced with Adam to construct the CF-EfficientNet architecture. The results indicated that the improved CF-EfficientNet model achieved high performance in crop classification, with an overall accuracy of 92.6% and a Kappa coefficient of 0.830. These represent improvements of 5.5 percentage points and 0.153, compared with the baseline model, demonstrating superiority in both classification accuracy and stability. Full article
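The Kappa coefficients reported above correct raw accuracy for the agreement expected by chance, which matters when classes are imbalanced. A small sketch of Cohen's kappa computed from a confusion matrix (illustrative; the toy matrix is invented, not from the study):

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows = true, cols = predicted)."""
    n = sum(sum(row) for row in confusion)
    po = sum(confusion[i][i] for i in range(len(confusion))) / n  # observed agreement
    row_tot = [sum(r) for r in confusion]
    col_tot = [sum(c) for c in zip(*confusion)]
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)   # chance agreement
    return (po - pe) / (1 - pe)

# toy 2-class confusion matrix: accuracy 0.85, but kappa discounts chance
cm = [[45, 5],
      [10, 40]]
print(round(cohens_kappa(cm), 3))  # 0.7
```

This is why an accuracy of 87.1% can coexist with a kappa of only 0.677: part of the raw agreement is attributable to chance.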

23 pages, 3293 KB  
Article
Organic and Mineral Fertilization on the Photosynthetic, Nutritional, and Productive Efficiency of (Ficus carica L.) Subjected to Conduction Systems in a Semi-Arid Region of Brazil
by Agda Malany Forte de Oliveira, Vander Mendonça, Patrycia Elen Costa Amorim, Raires Irlenizia da Silva Freire, Lucas Rodrigues Bezerra da Silva, David Emanoel Gomes da Silva, Fagner Nogueira Ferreira, Semako Ibrahim Bonou, Luderlândio de Andrade Silva, Pedro Dantas Fernandes, Alberto Soares de Melo and Francisco Vanies da Silva Sá
Agriculture 2025, 15(20), 2128; https://doi.org/10.3390/agriculture15202128 - 13 Oct 2025
Abstract
Fig tree growth and development are highly susceptible to variations influenced by abiotic factors and management practices, including fertilization and training systems. This study aimed to evaluate the effect of organic and mineral fertilization on the photosynthetic, nutritional, and productive efficiency of fig trees subjected to different training systems in semi-arid regions. The experimental design was randomized blocks in a 5 × 4 factorial scheme, with three blocks and three plants per plot. The treatments consisted of five fertilizer sources (mineral fertilizer (NPK) applied at a dose of 126 g N, 90 g P, and 90 g K per plant (M); and four organic sources—cattle manure (CM), organic compost (OC), chicken litter (CL), and sheep manure (SM), all applied at a dose of 10 kg per plant); and four types of training systems (plants with two branches (2B), three branches (3B), four branches (4B), and espalier). Our results demonstrated that the mineral fertilizer (M) and chicken litter (CL) treatments yielded the highest results, particularly in photosynthetic performance. Fig trees fertilized with mineral fertilizer and subjected to the 3B system showed enhanced net photosynthesis (36.96 µmol m−2 s−1) and, consequently, higher productivity of 21.28 t ha−1. Similarly, plants fertilized with chicken litter (CL) under the 4B system produced comparable results. These findings demonstrate that the use of mineral and organic fertilizers, combined with an appropriate training system, is a viable strategy for optimizing fig productivity and profitability in semi-arid conditions. Full article
(This article belongs to the Special Issue Advanced Cultivation Technologies for Horticultural Crops Production)

19 pages, 1951 KB  
Article
Enhancing Lemon Leaf Disease Detection: A Hybrid Approach Combining Deep Learning Feature Extraction and mRMR-Optimized SVM Classification
by Ahmet Saygılı
Appl. Sci. 2025, 15(20), 10988; https://doi.org/10.3390/app152010988 - 13 Oct 2025
Abstract
This study presents a robust and extensible hybrid classification framework for accurately detecting diseases in citrus leaves by integrating transfer learning-based deep learning models with classical machine learning techniques. Features were extracted using advanced pretrained architectures—DenseNet201, ResNet50, MobileNetV2, and EfficientNet-B0—and refined via the minimum redundancy maximum relevance (mRMR) method to reduce redundancy while maximizing discriminative power. These features were classified using support vector machines (SVMs), ensemble bagged trees, k-nearest neighbors (kNNs), and neural networks under stratified 10-fold cross-validation. On the lemon dataset, the best configuration (DenseNet201 + SVM) achieved 94.1 ± 4.9% accuracy, 93.2 ± 5.7% F1 score, and a balanced accuracy of 93.4 ± 6.0%, demonstrating strong and stable performance. To assess external generalization, the same pipeline was applied to mango and pomegranate leaves, achieving 100.0 ± 0.0% and 98.7 ± 1.5% accuracy, respectively—confirming the model’s robustness across citrus and non-citrus domains. Beyond accuracy, lightweight models such as EfficientNet-B0 and MobileNetV2 provided significantly higher throughput and lower latency, underscoring their suitability for real-time agricultural applications. These findings highlight the importance of combining deep representations with efficient classical classifiers for precision agriculture, offering both high diagnostic accuracy and practical deployability in field conditions. Full article
(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)
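The mRMR step above selects features greedily by trading relevance to the label against redundancy with already-chosen features. A sketch of that greedy loop, with absolute Pearson correlation standing in for the mutual-information scores mRMR normally uses (an assumption made here purely for a compact illustration; the data is synthetic):

```python
import numpy as np

def mrmr(X, y, k):
    """Greedy mRMR sketch: relevance/redundancy scored with |Pearson correlation|
    as a stand-in for mutual information."""
    n_feat = X.shape[1]
    corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
    relevance = np.array([corr(X[:, j], y) for j in range(n_feat)])
    selected = [int(relevance.argmax())]          # start with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy     # max relevance, min redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),   # informative
                     y + 0.1 * rng.normal(size=200),   # informative but redundant
                     rng.normal(size=200)])            # noise
sel = mrmr(X, y, 2)
print(sel)
```

The redundancy penalty is what keeps the second pick from simply duplicating the first, which is the property that made mRMR useful before the SVM stage described above.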

22 pages, 7434 KB  
Article
A Lightweight Image-Based Decision Support Model for Marine Cylinder Lubrication Based on CNN-ViT Fusion
by Qiuyu Li, Guichen Zhang and Enrui Zhao
J. Mar. Sci. Eng. 2025, 13(10), 1956; https://doi.org/10.3390/jmse13101956 - 13 Oct 2025
Abstract
Under the context of “Energy Conservation and Emission Reduction,” low-sulfur fuel has become widely adopted in maritime operations, posing significant challenges to cylinder lubrication systems. Traditional oil injection strategies, heavily reliant on manual experience, suffer from instability and high costs. To address this, a lightweight image retrieval model for cylinder lubrication is proposed, leveraging deep learning and computer vision to support oiling decisions based on visual features. The model comprises three components: a backbone network, a feature enhancement module, and a similarity retrieval module. Specifically, EfficientNetB0 serves as the backbone for efficient feature extraction under low computational overhead. MobileViT Blocks are integrated to combine local feature perception of Convolutional Neural Networks (CNNs) with the global modeling capacity of Transformers. To further improve receptive field and multi-scale representation, Receptive Field Blocks (RFB) are introduced between the components. Additionally, the Convolutional Block Attention Module (CBAM) attention mechanism enhances focus on salient regions, improving feature discrimination. A high-quality image dataset was constructed using WINNING’s large bulk carriers under various sea conditions. The experimental results demonstrate that the EfficientNetB0 + RFB + MobileViT + CBAM model achieves excellent performance with minimal computational cost: 99.71% Precision, 99.69% Recall, and 99.70% F1-score—improvements of 11.81%, 15.36%, and 13.62%, respectively, over the baseline EfficientNetB0. With an increase of only 0.3 GFLOPs in computation and 8.3 MB in model size, the approach balances accuracy and inference efficiency. The model also demonstrates good robustness and application stability in real-world ship testing, with potential for further adoption in the field of intelligent ship maintenance. Full article
(This article belongs to the Section Ocean Engineering)

32 pages, 6508 KB  
Article
An Explainable Web-Based Diagnostic System for Alzheimer’s Disease Using XRAI and Deep Learning on Brain MRI
by Serra Aksoy and Arij Daou
Diagnostics 2025, 15(20), 2559; https://doi.org/10.3390/diagnostics15202559 - 10 Oct 2025
Abstract
Background: Alzheimer’s disease (AD) is a progressive neurodegenerative condition marked by cognitive decline and memory loss. Despite advancements in AI-driven neuroimaging analysis for AD detection, clinical deployment remains limited due to challenges in model interpretability and usability. Explainable AI (XAI) frameworks such as XRAI offer potential to bridge this gap by providing clinically meaningful visualizations of model decision-making. Methods: This study developed a comprehensive, clinically deployable AI system for AD severity classification using 2D brain MRI data. Three deep learning architectures MobileNet-V3 Large, EfficientNet-B4, and ResNet-50 were trained on an augmented Kaggle dataset (33,984 images across four AD severity classes). The models were evaluated on both augmented and original datasets, with integrated XRAI explainability providing region-based attribution maps. A web-based clinical interface was built using Gradio to deliver real-time predictions and visual explanations. Results: MobileNet-V3 achieved the highest accuracy (99.18% on the augmented test set; 99.47% on the original dataset), while using the fewest parameters (4.2 M), confirming its efficiency and suitability for clinical use. XRAI visualizations aligned with known neuroanatomical patterns of AD progression, enhancing clinical interpretability. The web interface delivered sub-20 s inference with high classification confidence across all AD severity levels, successfully supporting real-world diagnostic workflows. Conclusions: This research presents the first systematic integration of XRAI into AD severity classification using MRI and deep learning. The MobileNet-V3-based system offers high accuracy, computational efficiency, and interpretability through a user-friendly clinical interface. These contributions demonstrate a practical pathway toward real-world adoption of explainable AI for early and accurate Alzheimer’s disease detection. Full article
(This article belongs to the Special Issue Alzheimer's Disease Diagnosis Based on Deep Learning)

25 pages, 4843 KB  
Article
Tools and Methods for Achieving Wi-Fi Sensing in Embedded Devices
by Jesus A. Armenta-Garcia, Felix F. Gonzalez-Navarro, Jesus Caro-Gutierrez and Conrado I. Garcia-Reyes
Sensors 2025, 25(19), 6220; https://doi.org/10.3390/s25196220 - 8 Oct 2025
Abstract
Wi-Fi sensing has emerged as a powerful approach to Human Activity Recognition (HAR) by utilizing Channel State Information (CSI). However, current implementations face two significant challenges: reliance on firmware-modified hardware for CSI collection and dependence on GPU/cloud-based deep learning models for inference. To address these limitations, we propose a two-fold embedded solution: a novel CSI collection tool built on low-cost microcontrollers that surpass existing embedded alternatives in packet rate efficiency under standard baud rate conditions and an optimized DenseNet-based HAR model deployable on resource-constrained edge devices without cloud dependency. In addition, a new HAR dataset is presented. To deal with the scarcity of training data, an Empirical Mode Decomposition (EMD)-based data augmentation method is presented. With this strategy, it was possible to enhance model accuracy from 59.91% to 97.55%. Leveraging this enhanced dataset, a compact DenseNet variant is presented. An accuracy of 92.43% at 232 ms inference latency is achieved when implemented on an ESP32-S3 microcontroller. Using as little as 127 kB of memory, the proposed model offers acceptable performance in terms of accuracy and privacy-preserving HAR at the edge; it also represents a scalable and low-cost Wi-Fi sensing solution. Full article
(This article belongs to the Section State-of-the-Art Sensors Technologies)

21 pages, 4053 KB  
Article
Self-Attention-Enhanced Deep Learning Framework with Multi-Scale Feature Fusion for Potato Disease Detection in Complex Multi-Leaf Field Conditions
by Ke Xie, Decheng Xu and Sheng Chang
Appl. Sci. 2025, 15(19), 10697; https://doi.org/10.3390/app151910697 - 3 Oct 2025
Abstract
Potato leaf diseases are recognized as a major threat to agricultural productivity and global food security, emphasizing the need for rapid and accurate detection methods. Conventional manual diagnosis is limited by inefficiency and susceptibility to bias, whereas existing automated approaches are often constrained by insufficient feature extraction, inadequate integration of multiple leaves, and poor generalization under complex field conditions. To overcome these challenges, a ResNet18-SAWF model was developed, integrating a self-attention mechanism with a multi-scale feature-fusion strategy within the ResNet18 framework. The self-attention module was designed to enhance the extraction of key features, including leaf color, texture, and disease spots, while the feature-fusion module was implemented to improve the holistic representation of multi-leaf structures under complex backgrounds. Experimental evaluation was conducted using a comprehensive dataset comprising both simple and complex background conditions. The proposed model was demonstrated to achieve an accuracy of 98.36% on multi-leaf images with complex backgrounds, outperforming baseline ResNet18 (91.80%), EfficientNet-B0 (86.89%), and MobileNet_V2 (88.53%) by 6.56, 11.47, and 9.83 percentage points, respectively. Compared with existing methods, superior performance was observed, with an 11.55 percentage point improvement over the average accuracy of complex background studies (86.81%) and a 0.7 percentage point increase relative to simple background studies (97.66%). These results indicate that the proposed approach provides a robust, accurate, and practical solution for potato leaf disease detection in real field environments, thereby advancing precision agriculture technologies. Full article
(This article belongs to the Section Agricultural Science and Technology)
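The self-attention module described above weighs every image region against every other region; the core computation is scaled dot-product attention. A minimal numpy sketch with randomly initialized projections (stand-ins for trained weights, not the paper's model):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over keys
    return weights @ V, weights                       # weighted sum of values

rng = np.random.default_rng(0)
n, d = 5, 8                                           # e.g. 5 spatial regions, 8 channels
X = rng.normal(size=(n, d))
W = [rng.normal(size=(d, d)) for _ in range(3)]
out, attn = self_attention(X, *W)
print(out.shape)  # (5, 8); each row of attn sums to 1
```

Inside a CNN such as ResNet18, the "sequence" is the set of flattened spatial positions of a feature map, which is how attention lets disease-spot features at one leaf location modulate features elsewhere.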

38 pages, 6947 KB  
Article
EfficientNet-B3-Based Automated Deep Learning Framework for Multiclass Endoscopic Bladder Tissue Classification
by A. A. Abd El-Aziz, Mahmood A. Mahmood and Sameh Abd El-Ghany
Diagnostics 2025, 15(19), 2515; https://doi.org/10.3390/diagnostics15192515 - 3 Oct 2025
Abstract
Background: Bladder cancer (BLCA) is a malignant growth that originates from the urothelial lining of the urinary bladder. Diagnosing BLCA is complex due to the variety of tumor features and its heterogeneous nature, which leads to significant morbidity and mortality. Understanding tumor histopathology is crucial for developing tailored therapies and improving patient outcomes. Objectives: Early diagnosis and treatment are essential to lower the mortality rate associated with bladder cancer. Manual classification of muscular tissues by pathologists is labor-intensive and relies heavily on experience, which can result in interobserver variability due to the similarities in cancerous cell morphology. Traditional methods for analyzing endoscopic images are often time-consuming and resource-intensive, making it difficult to efficiently identify tissue types. Therefore, there is a strong demand for a fully automated and reliable system for classifying smooth muscle images. Methods: This paper proposes a deep learning (DL) technique utilizing the EfficientNet-B3 model and a five-fold cross-validation method to assist in the early detection of BLCA. This model enables timely intervention and improved patient outcomes while streamlining the diagnostic process, ultimately reducing both time and costs for patients. We conducted experiments using the Endoscopic Bladder Tissue Classification (EBTC) dataset for multiclass classification tasks. The dataset was preprocessed using resizing and normalization methods to ensure consistent input. In-depth experiments were carried out utilizing the EBTC dataset, along with ablation studies to evaluate the best hyperparameters. A thorough statistical analysis and comparisons with five leading DL models—ConvNeXtBase, DenseNet-169, MobileNet, ResNet-101, and VGG-16—showed that the proposed model outperformed the others. 
Conclusions: The EfficientNet-B3 model achieved impressive results: accuracy of 99.03%, specificity of 99.30%, precision of 97.95%, recall of 96.85%, and an F1-score of 97.36%. These findings indicate that the EfficientNet-B3 model demonstrates significant potential in accurately and efficiently diagnosing BLCA. Its high performance and ability to reduce diagnostic time and cost make it a valuable tool for clinicians in the field of oncology and urology. Full article
(This article belongs to the Special Issue AI and Big Data in Medical Diagnostics)
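The abstract above reports per-class scores (accuracy, specificity, precision, recall, F1) averaged over five-fold cross-validation. As a minimal sketch of how such scores are derived, the function below computes them from one-vs-rest confusion-matrix counts; the counts used here are purely illustrative, not the paper's data.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard one-vs-rest metrics from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    specificity = tn / (tn + fp)          # true-negative rate
    precision   = tp / (tp + fp)
    recall      = tp / (tp + fn)          # sensitivity
    f1          = 2 * precision * recall / (precision + recall)
    return accuracy, specificity, precision, recall, f1

# Illustrative counts for a single tissue class (hypothetical numbers).
acc, spec, prec, rec, f1 = binary_metrics(tp=95, fp=2, fn=3, tn=900)
```

In a five-fold setup these metrics would be computed on each held-out fold and then averaged.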
21 pages, 2189 KB  
Article
Hybrid CNN-Swin Transformer Model to Advance the Diagnosis of Maxillary Sinus Abnormalities on CT Images Using Explainable AI
by Mohammad Alhumaid and Ayman G. Fayoumi
Computers 2025, 14(10), 419; https://doi.org/10.3390/computers14100419 - 2 Oct 2025
Abstract
Accurate diagnosis of sinusitis is essential due to its widespread prevalence and its considerable impact on patient quality of life. While multiple imaging techniques are available for detecting maxillary sinus abnormalities, computed tomography (CT) remains the preferred modality because of its high sensitivity and spatial resolution. Although recent advances in deep learning have led to the development of automated methods for sinusitis classification, many existing models perform poorly in the presence of complex pathological features and offer limited interpretability, which hinders their integration into clinical workflows. In this study, we propose a hybrid deep learning framework that combines EfficientNetB0, a convolutional neural network, with the Swin Transformer, a vision transformer, to improve feature representation. An attention-based fusion module is used to integrate both local and global information, thereby enhancing diagnostic accuracy. To improve transparency and support clinical adoption, the model incorporates explainable artificial intelligence (XAI) techniques using Gradient-weighted Class Activation Mapping (Grad-CAM). This allows for visualization of the regions influencing the model’s predictions, helping radiologists assess the clinical relevance of the results. We evaluate the proposed method on a curated maxillary sinus CT dataset covering four diagnostic categories: Normal, Opacified, Polyposis, and Retention Cysts. The model achieves a classification accuracy of 95.83%, with precision, recall, and F1-score all at 95%. Grad-CAM visualizations indicate that the model consistently focuses on clinically significant regions of the sinus anatomy, supporting its potential utility as a reliable diagnostic aid in medical practice. Full article
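The attention-based fusion module described above merges local CNN features with global transformer features. The abstract does not specify the exact formulation, so the numpy sketch below shows one common pattern under stated assumptions: a hypothetical learned scoring vector `w` produces softmax weights that decide how much each branch contributes to the fused representation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fusion(local_feat, global_feat, w):
    """Attention-weighted fusion of CNN (local) and transformer (global) features.

    `w` stands in for a learned scoring vector; in a trained model it would
    be optimized end-to-end with the rest of the network.
    """
    feats = np.stack([local_feat, global_feat])   # (2, d): one row per branch
    scores = feats @ w                            # scalar score per branch
    alpha = softmax(scores)                       # (2,) attention weights, sum to 1
    return alpha[0] * local_feat + alpha[1] * global_feat

rng = np.random.default_rng(0)
d = 8
fused = attention_fusion(rng.normal(size=d), rng.normal(size=d), rng.normal(size=d))
```

Because the weights sum to one, the fused vector is a convex combination of the two branches; identical inputs pass through unchanged.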
20 pages, 14055 KB  
Article
TL-Efficient-SE: A Transfer Learning-Based Attention-Enhanced Model for Fingerprint Liveness Detection Across Multi-Sensor Spoof Attacks
by Archana Pallakonda, Rayappa David Amar Raj, Rama Muni Reddy Yanamala, Christian Napoli and Cristian Randieri
Mach. Learn. Knowl. Extr. 2025, 7(4), 113; https://doi.org/10.3390/make7040113 - 1 Oct 2025
Abstract
Fingerprint authentication systems encounter growing threats from presentation attacks, making strong liveness detection crucial. This work presents a deep learning-based framework integrating EfficientNetB0 with a Squeeze-and-Excitation (SE) attention approach, using transfer learning to enhance feature extraction. The LivDet 2015 dataset, composed of both real and fake fingerprints taken using four optical sensors and spoofs made using PlayDoh, Ecoflex, and Gelatine, is used to train and test the model architecture. Stratified splitting is performed once the images being input have been scaled and normalized to conform to EfficientNetB0’s format. The SE module adaptively improves appropriate features to competently differentiate live from fake inputs. The classification head comprises fully connected layers, dropout, batch normalization, and a sigmoid output. Empirical results exhibit accuracy between 98.50% and 99.50%, with an AUC varying from 0.978 to 0.9995, providing high precision and recall for genuine users, and robust generalization across unseen spoof types. Compared to existing methods like Slim-ResCNN and HyiPAD, the novelty of our model lies in the Squeeze-and-Excitation mechanism, which enhances feature discrimination by adaptively recalibrating the channels of the feature maps, thereby improving the model’s ability to differentiate between live and spoofed fingerprints. This model has practical implications for deployment in real-time biometric systems, including mobile authentication and secure access control, presenting an efficient solution for protecting against sophisticated spoofing methods. Future research will focus on sensor-invariant learning and adaptive thresholds to further enhance resilience against varying spoofing attacks. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
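The Squeeze-and-Excitation mechanism highlighted above recalibrates feature-map channels: a global average pool ("squeeze") feeds a small bottleneck MLP whose sigmoid output gates each channel ("excite"). The numpy sketch below illustrates the mechanism with hypothetical weight matrices `w1` and `w2` (reduction ratio `r`); it is a minimal illustration, not the paper's implementation.

```python
import numpy as np

def squeeze_excite(feature_map, w1, w2):
    """SE channel recalibration on a (C, H, W) feature map.

    w1: (C//r, C) and w2: (C, C//r) are stand-ins for the learned
    bottleneck weights with reduction ratio r.
    """
    # Squeeze: global average pool per channel -> (C,)
    z = feature_map.mean(axis=(1, 2))
    # Excite: bottleneck MLP (ReLU) followed by sigmoid gating -> (C,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))
    # Scale: reweight each channel by its gate in (0, 1).
    return feature_map * s[:, None, None]

rng = np.random.default_rng(1)
C, H, W, r = 8, 4, 4, 2
out = squeeze_excite(rng.normal(size=(C, H, W)),
                     rng.normal(size=(C // r, C)),
                     rng.normal(size=(C, C // r)))
```

With zero weights the sigmoid gate is 0.5 for every channel, so the map is uniformly halved; trained weights instead amplify discriminative channels and suppress the rest.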