Search Results (3,511)

Search Parameters:
Keywords = vision deep learning

24 pages, 9767 KiB  
Article
Improved Binary Classification of Underwater Images Using a Modified ResNet-18 Model
by Mehrunnisa, Mikolaj Leszczuk, Dawid Juszka and Yi Zhang
Electronics 2025, 14(15), 2954; https://doi.org/10.3390/electronics14152954 (registering DOI) - 24 Jul 2025
Abstract
In recent years, the classification of underwater images has become one of the most remarkable areas of research in computer vision due to its useful applications in marine sciences, aquatic robotics, and sea exploration. Underwater imaging is pivotal for the evaluation of marine ecosystems, analysis of biological habitats, and monitoring of underwater infrastructure. Extracting useful information from underwater images is highly challenging due to factors such as light distortion, scattering, poor contrast, and complex foreground patterns. These difficulties cause traditional image processing and machine learning techniques to struggle to analyze images accurately, and classification performance suffers as a result. Recently, deep learning techniques, especially convolutional neural networks (CNNs), have emerged as influential tools for underwater image classification, contributing noteworthy improvements in accuracy and performance in the presence of these challenges. In this paper, we propose a modified ResNet-18 model for the binary classification of underwater images into raw and enhanced images. In the proposed model, we add new layers, such as linear, rectified linear unit (ReLU), and dropout layers, arranged in a block that is repeated three times to enhance feature extraction and improve learning. This enables the model to learn the complex patterns present in the images in more detail, which helps it perform the classification very well. Owing to these newly added layers, the proposed model addresses complexities such as noise, distortion, varying illumination conditions, and complex patterns by learning robust features from underwater image datasets. To handle the class imbalance present in the dataset, we applied a data augmentation technique. The proposed model achieved outstanding performance, with 96% accuracy, 99% precision, 92% sensitivity, 99% specificity, a 95% F1-score, and a 96% area under the receiver operating characteristic curve (AUC-ROC) score. These results demonstrate the strength and reliability of the proposed model in handling the challenges posed by underwater imagery and make it a favorable solution for advancing underwater image classification tasks.
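
The block structure described above is straightforward to reproduce. The following PyTorch sketch shows one plausible reading of the modification: a ResNet-18 backbone whose classifier is replaced by three repeated linear/ReLU/dropout blocks; the layer widths and dropout rate are assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class ModifiedResNet18(nn.Module):
    """ResNet-18 with a Linear/ReLU/Dropout block repeated three times."""

    def __init__(self, num_classes: int = 2, dropout: float = 0.5):
        super().__init__()
        # weights=None keeps the sketch offline; the paper likely starts
        # from pretrained weights (models.ResNet18_Weights.DEFAULT).
        backbone = models.resnet18(weights=None)
        width = backbone.fc.in_features          # 512 for ResNet-18
        backbone.fc = nn.Identity()              # expose the 512-d features
        self.backbone = backbone
        layers = []
        for _ in range(3):                       # the repeated block
            layers += [nn.Linear(width, width // 2), nn.ReLU(), nn.Dropout(dropout)]
            width //= 2
        self.head = nn.Sequential(*layers, nn.Linear(width, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))

model = ModifiedResNet18()
logits = model(torch.randn(1, 3, 224, 224))      # raw vs. enhanced logits
```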

15 pages, 2123 KiB  
Article
Multi-Class Visual Cyberbullying Detection Using Deep Neural Networks and the CVID Dataset
by Muhammad Asad Arshed, Zunera Samreen, Arslan Ahmad, Laiba Amjad, Hasnain Muavia, Christine Dewi and Muhammad Kabir
Information 2025, 16(8), 630; https://doi.org/10.3390/info16080630 (registering DOI) - 24 Jul 2025
Abstract
In an era where online interactions increasingly shape social dynamics, the pervasive issue of cyberbullying poses a significant threat to the well-being of individuals, particularly among vulnerable groups. Despite extensive research on text-based cyberbullying detection, the rise of visual content on social media platforms necessitates new approaches to address cyberbullying using images, a domain that has been largely overlooked. In this paper, we present a novel dataset specifically designed for the detection of visual cyberbullying, encompassing four distinct classes: abuse, curse, discourage, and threat. The initial cyberbullying visual indicators dataset (CVID) comprised 664 samples for training and validation, expanded through data augmentation techniques to ensure balanced and accurate results across all classes. We analyzed this dataset using several advanced deep learning models, including VGG16, VGG19, MobileNetV2, and Vision Transformer. The proposed model, based on DenseNet201, achieved the highest test accuracy of 99%, demonstrating its efficacy in identifying the visual cues associated with cyberbullying. To assess the proposed model's generalizability, stratified 5-fold cross-validation was also performed, in which the model achieved an average test accuracy of 99%. This work introduces a dataset and highlights the potential of leveraging deep learning models to address the multifaceted challenges of detecting cyberbullying in visual content.
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)
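
For readers who want a concrete starting point, the sketch below fine-tunes DenseNet201 for the four CVID classes. The head layout and input size are assumptions; the abstract specifies only the backbone and the class set.

```python
import torch
import torch.nn as nn
from torchvision import models

classes = ["abuse", "curse", "discourage", "threat"]

# weights=None keeps the sketch offline; in practice one would start from
# models.DenseNet201_Weights.DEFAULT and fine-tune on CVID.
model = models.densenet201(weights=None)
model.classifier = nn.Linear(model.classifier.in_features, len(classes))

with torch.no_grad():
    probs = model(torch.randn(1, 3, 224, 224)).softmax(dim=-1)
print(dict(zip(classes, probs.squeeze().tolist())))
```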

18 pages, 4165 KiB  
Article
Localization and Pixel-Confidence Network for Surface Defect Segmentation
by Yueyou Wang, Zixuan Xu, Li Mei, Ruiqing Guo, Jing Zhang, Tingbo Zhang and Hongqi Liu
Sensors 2025, 25(15), 4548; https://doi.org/10.3390/s25154548 - 23 Jul 2025
Abstract
Surface defect segmentation based on deep learning has been widely applied in industrial inspection. However, two major challenges persist in specific application scenarios: first, the imbalanced area distribution between defects and the background leads to degraded segmentation performance; second, fine gaps within defects are prone to over-segmentation. To address these issues, this study proposes a two-stage image segmentation network that integrates a Defect Localization Module and a Pixel Confidence Module. In the first stage, the Defect Localization Module performs coarse localization of defect regions and embeds the resulting feature vectors into the backbone of the second stage. In the second stage, the Pixel Confidence Module captures the probabilistic distribution of neighboring pixels, thereby refining the initial predictions. Experimental results demonstrate that the improved network achieves gains of 1.58% ± 0.80% in mPA and 1.35% ± 0.77% in mIoU on the self-built Carbon Fabric Defect Dataset, and 2.66% ± 1.12% in mPA and 1.44% ± 0.79% in mIoU on the public Magnetic Tile Defect Dataset, compared to the baseline network. These enhancements translate to more reliable automated quality assurance in industrial production environments.
(This article belongs to the Section Fault Diagnosis & Sensors)
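
For context on the reported numbers, the snippet below computes mean pixel accuracy (mPA) and mean IoU (mIoU) from a confusion matrix in the standard way; it is a reference implementation, not the authors' evaluation code.

```python
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    """mPA and mIoU from a (num_classes x num_classes) confusion matrix."""
    tp = np.diag(conf).astype(float)
    per_class_acc = tp / conf.sum(axis=1).clip(min=1)                 # per-class recall
    iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp).clip(min=1)
    return per_class_acc.mean(), iou.mean()

# Mock pixel counts with the defect/background imbalance the paper highlights:
conf = np.array([[980, 20],    # background row
                 [ 15, 85]])   # defect row
mpa, miou = segmentation_metrics(conf)
print(f"mPA={mpa:.3f}, mIoU={miou:.3f}")
```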

19 pages, 675 KiB  
Article
A Multicomponent Face Verification and Identification System
by Athanasios Douklias, Ioannis Zorzos, Εvangelos Maltezos, Vasilis Nousis, Spyridon Nektarios Bolierakis, Lazaros Karagiannidis, Eleftherios Ouzounoglou and Angelos Amditis
Appl. Sci. 2025, 15(15), 8161; https://doi.org/10.3390/app15158161 - 22 Jul 2025
Abstract
Face recognition is a biometric technology based on the identification or verification of facial features. Automatic face recognition is an active research field in computer vision and artificial intelligence (AI) that is fundamental for a variety of real-time applications. This research proposes the design and implementation of a face verification and identification system with a flexible, modular, secure, and scalable architecture. The proposed system incorporates several types of components: (i) portable capabilities (a mobile application and mixed reality [MR] glasses), (ii) enhanced monitoring and visualization via a user-friendly Web-based user interface (UI), and (iii) information sharing with other external systems via middleware. The experiments showed that these interconnected and complementary components deliver robust, real-time face identification and verification results. Furthermore, to identify a model of high accuracy, robustness, and speed for face identification and verification tasks, a comprehensive evaluation of multiple pre-trained face recognition models (FaceNet, ArcFace, Dlib, and MobileNetV2) on a curated version of the ID vs. Spot dataset was performed. Among the models evaluated, FaceNet emerged as the preferable choice for real-time tasks due to its balance between accuracy and inference speed, achieving an AUC of 0.99, Rank-1 accuracy of 91.8%, Rank-5 accuracy of 95.8%, an FNR of 2%, an FAR of 0.1%, an overall accuracy of 98.6%, and an inference speed of 52 ms.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
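
The verification step of FaceNet-style systems reduces to thresholding an embedding similarity, as in the hedged sketch below; the 512-d embedding size and the 0.6 threshold are illustrative assumptions, not the paper's operating point.

```python
import numpy as np

def verify(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 0.6) -> bool:
    """Declare a match when cosine similarity of the embeddings clears a threshold."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b) >= threshold

rng = np.random.default_rng(0)
anchor = rng.normal(size=512)                      # stand-in for a FaceNet embedding
same = anchor + rng.normal(scale=0.1, size=512)    # near-duplicate identity
other = rng.normal(size=512)                       # different identity
print(verify(anchor, same), verify(anchor, other))
```
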
27 pages, 10927 KiB  
Article
Enhanced Recognition of Sustainable Wood Building Materials Based on Deep Learning and Augmentation
by Wei Gan, Shengbiao Li, Jinyu Li, Shuqi Peng, Ruoxi Li, Lan Qiu, Baofeng Li and Yi He
Sustainability 2025, 17(15), 6683; https://doi.org/10.3390/su17156683 - 22 Jul 2025
Abstract
The accurate identification of wood patterns is critical for optimizing the use of sustainable wood building materials, promoting resource efficiency, and reducing waste in construction. This study presents a deep learning-based approach for enhanced wood material recognition, combining the EfficientNet architecture with advanced data augmentation techniques to achieve robust classification. The augmentation strategy incorporates geometric transformations (flips, shifts, and rotations) and photometric adjustments (brightness and contrast) to improve dataset diversity while preserving discriminative wood grain features. Validation was performed using a controlled augmentation pipeline to ensure realistic performance assessment. Experimental results demonstrate the model's effectiveness, achieving 88.9% accuracy (eight out of nine correct predictions), with further improvements from targeted image preprocessing. The approach provides valuable support for preliminary sustainable building material classification and can be deployed through user-friendly interfaces without requiring specialized AI expertise. The system retains critical wood pattern characteristics while enhancing adaptability to real-world variability, supporting reliable material classification in sustainable construction. This study highlights the potential of integrating optimized neural networks with tailored preprocessing to advance AI-driven sustainability in building material recognition, contributing to circular economy practices and resource-efficient construction.
(This article belongs to the Special Issue Analysis on Real-Estate Marketing and Sustainable Civil Engineering)
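
The augmentation strategy described maps directly onto standard torchvision transforms, as sketched below; the magnitudes (15° rotation, 10% shifts, ±0.2 brightness/contrast) are assumptions, not the paper's settings.

```python
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),                           # geometric: flips
    transforms.RandomVerticalFlip(),
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1)),   # rotations + shifts
    transforms.ColorJitter(brightness=0.2, contrast=0.2),        # photometric
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
# Applied per sample during training, e.g. train_transforms(pil_image).
```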

20 pages, 3000 KiB  
Article
NRNH-AR: A Small Robotic Agent Using Tri-Fold Learning for Navigation and Obstacle Avoidance
by Carlos Vasquez-Jalpa, Mariko Nakano, Martin Velasco-Villa and Osvaldo Lopez-Garcia
Appl. Sci. 2025, 15(15), 8149; https://doi.org/10.3390/app15158149 - 22 Jul 2025
Abstract
We propose a tri-fold learning algorithm, called Neuroevolution of Hybrid Neural Networks in a Robotic Agent (NRNH-AR, its acronym in Spanish), based on deep reinforcement learning (DRL) with self-supervised learning (SSL) and unsupervised learning (USL) steps, specifically designed for a small autonomous navigation robot with limited resources that operates in constrained physical environments. The algorithm was evaluated in four critical aspects: computational cost, learning stability, required memory size, and operation speed. The results show that the performance of NRNH-AR is within the ranges of the Deep Q Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithms. Thanks to its sequence of learning stages (SSL, USL, and DRL), the algorithm optimizes the use of resources and demonstrates adaptability in dynamic environments, a crucial aspect of navigation robotics. By integrating computer vision techniques based on a convolutional neural network (CNN), the algorithm rapidly interprets visual observations of the environment and detects a specific object while avoiding obstacles.
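
As a loose illustration of the DRL component such an agent might contain, the sketch below pairs a small CNN Q-network over visual observations with epsilon-greedy action selection; the architecture sizes and action set are assumptions, and the paper's SSL and USL stages are not reproduced here.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small CNN Q-network over RGB observations (sizes are assumptions)."""

    def __init__(self, n_actions: int = 4):   # e.g. forward/back/left/right
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(), nn.Flatten())
        self.q = nn.LazyLinear(n_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.q(self.features(obs))

def act(qnet: QNet, obs: torch.Tensor, eps: float = 0.1) -> int:
    q = qnet(obs)
    if torch.rand(()) < eps:
        return int(torch.randint(0, q.shape[-1], ()))  # explore
    return int(q.argmax())                             # exploit

print(act(QNet(), torch.randn(1, 3, 64, 64)))
```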

24 pages, 5200 KiB  
Article
DRFAN: A Lightweight Hybrid Attention Network for High-Fidelity Image Super-Resolution in Visual Inspection Applications
by Ze-Long Li, Bai Jiang, Liang Xu, Zhe Lu, Zi-Teng Wang, Bin Liu, Si-Ye Jia, Hong-Dan Liu and Bing Li
Algorithms 2025, 18(8), 454; https://doi.org/10.3390/a18080454 - 22 Jul 2025
Viewed by 22
Abstract
Single-image super-resolution (SISR) plays a critical role in enhancing visual quality for real-world applications, including industrial inspection and embedded vision systems. While deep learning-based approaches have made significant progress in SR, existing lightweight SR models often fail to accurately reconstruct high-frequency textures, especially under complex degradation scenarios, resulting in blurry edges and structural artifacts. To address this challenge, we propose the Dense Residual Fused Attention Network (DRFAN), a novel lightweight hybrid architecture designed to enhance high-frequency texture recovery under challenging degradation conditions. By coupling convolutional layers and attention mechanisms through gated interaction modules, the DRFAN enhances local details and global dependencies with linear computational complexity, enabling the efficient utilization of multi-level spatial information while effectively alleviating the loss of high-frequency texture details. To evaluate its effectiveness, we conducted ×4 super-resolution experiments on five public benchmarks. The DRFAN achieves the best performance among all compared lightweight models. Visual comparisons show that the DRFAN restores more accurate geometric structures, with gains of up to +1.2 dB PSNR and +0.0281 SSIM over SwinIR-S on Urban100 samples. Additionally, on a domain-specific rice grain dataset, the DRFAN outperforms SwinIR-S by +0.19 dB in PSNR and +0.0015 in SSIM, restoring clearer textures and grain boundaries essential for industrial quality inspection. The proposed method provides a compelling balance between model complexity and image reconstruction fidelity, making it well-suited for deployment in resource-constrained visual systems and industrial applications.
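
For reference, the PSNR figure quoted in the comparison (e.g., +0.19 dB over SwinIR-S) follows the standard definition sketched below; this is not the authors' evaluation code.

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return float(10 * torch.log10(max_val ** 2 / mse))

hr = torch.rand(1, 3, 128, 128)                       # mock ground-truth patch
sr = (hr + 0.02 * torch.randn_like(hr)).clamp(0, 1)   # mock super-resolved output
print(f"{psnr(sr, hr):.2f} dB")
```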

15 pages, 1193 KiB  
Article
Enhanced Brain Stroke Lesion Segmentation in MRI Using a 2.5D Transformer Backbone U-Net Model
by Mahsa Karimzadeh, Hadi Seyedarabi, Ata Jodeiri and Reza Afrouzian
Brain Sci. 2025, 15(8), 778; https://doi.org/10.3390/brainsci15080778 - 22 Jul 2025
Viewed by 37
Abstract
Background/Objectives: Accurate segmentation of brain stroke lesions from MRI images is a critical task in medical image analysis that is essential for timely diagnosis and treatment planning. Methods: This paper presents a novel approach for segmenting brain stroke lesions using a deep learning model based on the U-Net architecture. We enhanced the traditional U-Net by integrating a transformer-based backbone, specifically the Mix Vision Transformer (MiT), and compared its performance against other commonly used backbones such as ResNet and EfficientNet. Additionally, we implemented a 2.5D method, which leverages 2D networks to process three-dimensional data slices, effectively balancing the rich spatial context of 3D methods with the simplicity of 2D methods. The 2.5D approach captures inter-slice dependencies, leading to improved lesion delineation without the computational complexity of full 3D models. Utilizing the 2015 ISLES dataset, which includes MRI images and corresponding lesion masks for 20 patients, we conducted our experiments with 4-fold cross-validation to ensure robustness and reliability. To evaluate the effectiveness of our method, we conducted comparative experiments with several state-of-the-art (SOTA) segmentation models, including the CNN-based U-Net, nnU-Net, TransUNet, and SwinUNet. Results: Our proposed model outperformed all competing methods in terms of the Dice coefficient and Intersection over Union (IoU). The U-Net with the MiT backbone, combined with 2.5D data preparation, achieved Dice and IoU scores of 0.8153 ± 0.0101 and 0.7835 ± 0.0079, respectively, outperforming the other backbone configurations. Conclusions: These results indicate that the integration of transformer-based backbones and 2.5D techniques offers a significant advancement in the accurate segmentation of brain stroke lesions, paving the way for more reliable and efficient diagnostic tools in clinical settings.
(This article belongs to the Section Neural Engineering, Neuroergonomics and Neurorobotics)
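
The 2.5D preparation described can be sketched as stacking a slice with its axial neighbors into the channel dimension of a 2D network, as below; using one neighbor on each side is an assumption.

```python
import torch

def make_25d_input(volume: torch.Tensor, z: int, context: int = 1) -> torch.Tensor:
    """(D, H, W) volume -> (2*context+1, H, W) channel stack around slice z."""
    idx = [min(max(z + dz, 0), volume.shape[0] - 1)   # clamp at volume edges
           for dz in range(-context, context + 1)]
    return volume[idx]

vol = torch.randn(20, 192, 192)        # a mock MRI volume (e.g. one ISLES case)
x = make_25d_input(vol, z=10)          # shape (3, 192, 192), ready for a 2D U-Net
print(x.shape)
```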

37 pages, 1831 KiB  
Review
Deep Learning Techniques for Retinal Layer Segmentation to Aid Ocular Disease Diagnosis: A Review
by Oliver Jonathan Quintana-Quintana, Marco Antonio Aceves-Fernández, Jesús Carlos Pedraza-Ortega, Gendry Alfonso-Francia and Saul Tovar-Arriaga
Computers 2025, 14(8), 298; https://doi.org/10.3390/computers14080298 - 22 Jul 2025
Viewed by 36
Abstract
Age-related ocular conditions such as macular degeneration (AMD), diabetic retinopathy (DR), and glaucoma are leading causes of irreversible vision loss globally. Optical coherence tomography (OCT) provides essential non-invasive visualization of retinal structures for early diagnosis, but manual analysis of these images is labor-intensive and prone to variability. Deep learning (DL) techniques have emerged as powerful tools for automating retinal layer segmentation in OCT scans, potentially improving diagnostic efficiency and consistency. This review systematically evaluates the state of the art in DL-based retinal layer segmentation using the PRISMA methodology. We analyze various architectures (including CNNs, U-Net variants, GANs, and transformers), examine the characteristics and availability of datasets, discuss common preprocessing and data augmentation strategies, identify frequently targeted retinal layers, and compare performance evaluation metrics across studies. Our synthesis highlights significant progress, particularly with U-Net-based models, which often achieve Dice scores exceeding 0.90 for well-defined layers such as the retinal pigment epithelium (RPE). However, it also identifies ongoing challenges, including dataset heterogeneity, inconsistent evaluation protocols, difficulties in segmenting specific layers (e.g., OPL, RNFL), and the need for improved clinical integration. This review provides a comprehensive overview of current strengths, limitations, and future directions to guide research towards more robust and clinically applicable automated segmentation tools for enhanced ocular disease diagnosis.
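
The Dice score cited throughout the review (e.g., values above 0.90 for the RPE) has the standard form below; this is a reference implementation, not code from any surveyed study.

```python
import torch

def dice(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    """Dice coefficient for binary masks: 2*|A & B| / (|A| + |B|)."""
    pred, target = pred.float().flatten(), target.float().flatten()
    inter = (pred * target).sum()
    return float((2 * inter + eps) / (pred.sum() + target.sum() + eps))

mask = torch.zeros(64, 64); mask[20:40, 10:50] = 1   # mock RPE layer mask
pred = torch.zeros(64, 64); pred[22:40, 10:48] = 1   # mock prediction
print(f"Dice = {dice(pred, mask):.3f}")
```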

27 pages, 3019 KiB  
Article
New Deep Learning-Based Approach for Source Code Generation: Application to Computer Vision Systems
by Wafa Alshehri, Salma Kammoun Jarraya and Arwa Allinjawi
AI 2025, 6(7), 162; https://doi.org/10.3390/ai6070162 - 21 Jul 2025
Viewed by 203
Abstract
Deep learning has enabled significant progress in source code generation, aiming to reduce the manual, error-prone, and time-consuming aspects of software development. While many existing models rely on recurrent neural networks (RNNs) with sequence-to-sequence architectures, these approaches struggle with the long and complex token sequences typical of source code. To address this, we propose a grammar-based convolutional neural network (CNN) combined with a tree-based representation to enhance accuracy and efficiency. Our model achieves state-of-the-art results on the benchmark HEARTHSTONE dataset, with a BLEU score of 81.4 and an Acc+ of 62.1%. We further evaluate the model on our proposed dataset, AST2CVCode, designed for computer vision applications, achieving 86.2 BLEU and 51.9% EM. Additionally, we introduce BLEU+, an enhanced evaluation metric tailored for functional correctness in code generation, under which our model achieves a score of 92.0% on the AST2CVCode dataset. These results demonstrate the effectiveness of our approach in both model architecture and evaluation methodology.
(This article belongs to the Section AI Systems: Theory and Applications)
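
BLEU, the baseline metric the paper builds on, is conventionally computed over token sequences as in the sketch below (using NLTK); BLEU+, the paper's functional-correctness extension, is not reproduced here.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Token-level n-gram overlap between a generated program and its reference.
reference = "def add ( a , b ) : return a + b".split()
candidate = "def add ( x , y ) : return x + y".split()
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.3f}")
```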

15 pages, 677 KiB  
Article
Zero-Shot Learning for Sustainable Municipal Waste Classification
by Dishant Mewada, Eoin Martino Grua, Ciaran Eising, Patrick Denny, Pepijn Van de Ven and Anthony Scanlan
Recycling 2025, 10(4), 144; https://doi.org/10.3390/recycling10040144 - 21 Jul 2025
Viewed by 143
Abstract
Automated waste classification is an essential step toward efficient recycling and waste management. Traditional deep learning models, such as convolutional neural networks, rely on extensive labeled datasets to achieve high accuracy. However, the annotation process is labor-intensive and time-consuming, limiting the scalability of these approaches in real-world applications. Zero-shot learning is a machine learning paradigm that enables a model to recognize and classify objects it has never seen during training by leveraging semantic relationships and external knowledge sources. In this study, we investigate the potential of zero-shot learning for waste classification using two vision-language models: OWL-ViT and OpenCLIP. These models can classify waste without direct exposure to labeled examples by leveraging textual prompts. We apply this approach to the TrashNet dataset, which consists of images of municipal solid waste organized into six distinct categories: cardboard, glass, metal, paper, plastic, and trash. Our experiments yield an average classification accuracy of 76.30% with the OpenCLIP ViT-L/14-336 model, demonstrating the feasibility of zero-shot learning for waste classification while highlighting challenges in prompt sensitivity and class imbalance. Despite lower accuracy than CNN- and ViT-based classification models, zero-shot learning offers scalability and adaptability by enabling the classification of novel waste categories without retraining. This study underscores the potential of zero-shot learning in automated recycling systems, paving the way for more efficient, scalable, and annotation-free waste classification methodologies.
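
A minimal version of the zero-shot setup can be written against the open_clip library, as in the hedged sketch below; the prompt template and the image path are assumptions, while the model tag matches the ViT-L/14-336 variant the study reports.

```python
import torch
import open_clip
from PIL import Image

classes = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-L-14-336", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-L-14-336")

image = preprocess(Image.open("waste.jpg")).unsqueeze(0)      # hypothetical image path
text = tokenizer([f"a photo of {c} waste" for c in classes])  # assumed prompt template

with torch.no_grad():
    img_f = model.encode_image(image)
    txt_f = model.encode_text(text)
    img_f = img_f / img_f.norm(dim=-1, keepdim=True)
    txt_f = txt_f / txt_f.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_f @ txt_f.T).softmax(dim=-1)

print(classes[int(probs.argmax())])
```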

16 pages, 2914 KiB  
Article
Smart Dairy Farming: A Mobile Application for Milk Yield Classification Tasks
by Allan Hall-Solorio, Graciela Ramirez-Alonso, Alfonso Juventino Chay-Canul, Héctor A. Lee-Rangel, Einar Vargas-Bello-Pérez and David R. Lopez-Flores
Animals 2025, 15(14), 2146; https://doi.org/10.3390/ani15142146 - 21 Jul 2025
Viewed by 160
Abstract
This study analyzes the use of a lightweight image-based deep learning model to classify dairy cows into low-, medium-, and high-milk-yield categories by automatically detecting the udder region of the cow. The implemented model is based on the YOLOv11 architecture, which enables efficient object detection and classification with real-time performance. The model was trained on a public dataset of cow images labeled with 305-day milk yield records. Thresholds were established to define the three yield classes, and a balanced subset of labeled images was selected for training, validation, and testing. To assess the robustness and consistency of the proposed approach, the model was trained 30 times following the same experimental protocol. The system achieves precision, recall, and mean Average Precision (mAP@50) of 0.408 ± 0.044, 0.739 ± 0.095, and 0.492 ± 0.031, respectively, across all classes. The highest precision (0.445 ± 0.055), recall (0.766 ± 0.107), and mAP@50 (0.558 ± 0.036) were observed for the low-yield class. Qualitative analysis revealed that misclassifications mainly occurred near class boundaries, emphasizing the importance of consistent image acquisition conditions. The resulting model was deployed in a mobile application designed to support field-level assessment by non-specialist users. These findings demonstrate the practical feasibility of applying vision-based models to support decision-making in dairy production systems, particularly in settings where traditional data collection methods are unavailable or impractical.
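
The described workflow maps naturally onto the Ultralytics API, as sketched below; the dataset YAML, weights file, and example image are assumptions, with the three detection classes standing in for the low/medium/high yield categories.

```python
from ultralytics import YOLO

# Lightweight YOLO11 variant; "milk_yield.yaml" is a hypothetical dataset
# config with three classes (low, medium, high 305-day yield).
model = YOLO("yolo11n.pt")
model.train(data="milk_yield.yaml", epochs=100, imgsz=640)

results = model("cow.jpg")              # hypothetical field image
for box in results[0].boxes:
    print(results[0].names[int(box.cls)], float(box.conf))
```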

24 pages, 637 KiB  
Review
Deep Learning Network Selection and Optimized Information Fusion for Enhanced COVID-19 Detection: A Literature Review
by Olga Adriana Caliman Sturdza, Florin Filip, Monica Terteliu Baitan and Mihai Dimian
Diagnostics 2025, 15(14), 1830; https://doi.org/10.3390/diagnostics15141830 - 21 Jul 2025
Viewed by 263
Abstract
The rapid spread of COVID-19 increased the need for speedy diagnostic tools, which led scientists to conduct extensive research on deep learning (DL) applications that use chest imaging, such as chest X-ray (CXR) and computed tomography (CT). This review examines the development and performance of DL architectures, notably convolutional neural networks (CNNs) and emerging vision transformers (ViTs), in identifying COVID-19-related lung abnormalities. Individual ResNet architectures, along with other CNN models, demonstrate strong diagnostic performance through transfer learning; however, ViTs offer better performance, with improved interpretability and reduced data requirements. Multimodal diagnostic systems now incorporate alternative methods in addition to imaging, using lung ultrasound, clinical data, and cough sound evaluation. Information fusion techniques, which operate at the data, feature, and decision levels, further enhance diagnostic performance. However, progress in COVID-19 detection is hindered by restricted and non-uniform datasets, domain differences in imaging standards, and complications with diagnostic overfitting and poor generalization. Recent developments involve constructing expansive multi-source datasets, creating clinically oriented AI algorithms, and implementing distributed learning protocols to safeguard information security and system stability. While deep learning-based COVID-19 detection systems show strong potential for clinical application, broader validation, regulatory approval, and continuous adaptation remain essential for their successful deployment and for preparing future pandemic response strategies.
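
Decision-level fusion, one of the three fusion levels the review discusses, amounts to combining per-model class probabilities, as in the generic sketch below; the weights and probabilities are illustrative, not taken from any surveyed system.

```python
import torch

def fuse_decisions(prob_list, weights=None):
    """Weighted average of per-model class probabilities."""
    probs = torch.stack(prob_list)                    # (n_models, n_classes)
    if weights is None:
        weights = torch.ones(len(prob_list)) / len(prob_list)
    return (weights.unsqueeze(1) * probs).sum(dim=0)

cnn_probs = torch.tensor([0.7, 0.3])   # P(covid), P(non-covid) from a CXR CNN
vit_probs = torch.tensor([0.9, 0.1])   # the same case scored by a CT ViT
print(fuse_decisions([cnn_probs, vit_probs]))         # tensor([0.8000, 0.2000])
```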

21 pages, 8521 KiB  
Article
Estimating Forest Carbon Stock Using Enhanced ResNet and Sentinel-2 Imagery
by Jintong Ren, Lizhi Liu, You Wu, Lijian Ouyang and Zhenyu Yu
Forests 2025, 16(7), 1198; https://doi.org/10.3390/f16071198 - 20 Jul 2025
Viewed by 191
Abstract
Accurate estimation of forest carbon stock is critical for understanding ecosystem carbon dynamics and informing climate mitigation strategies. This study presents a deep learning framework that integrates Sentinel-2 multispectral imagery with an enhanced residual neural network to estimate aboveground forest carbon stock in the Liuchong River Basin, Bijie City, Guizhou Province, China. The proposed model incorporates multiscale residual blocks and channel attention mechanisms to improve spatial feature extraction and spectral dependency modeling. A dataset of 150 ground inventory plots was employed for supervised training and validation. Comparative experiments with Random Forest, Gradient Boosting Decision Trees (GBDT), and a Vision Transformer (ViT) demonstrate that the enhanced ResNet achieves the best performance, with a root mean square error (RMSE) of 23.02 Mg/ha and a coefficient of determination (R²) of 0.773 on the test set. Spatial mapping results further reveal that the model effectively captures fine-scale carbon stock variations across mountainous forested landscapes. These findings underscore the potential of combining multispectral remote sensing and advanced neural architectures for scalable, high-resolution forest carbon estimation in complex terrain.
(This article belongs to the Special Issue Mapping and Modeling Forests Using Geospatial Technologies)
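
The channel attention the enhanced ResNet incorporates can be sketched in squeeze-and-excitation form, as below; the reduction ratio and its placement within the network are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(x).unsqueeze(-1).unsqueeze(-1)  # per-channel weights
        return x * w                                  # recalibrated feature maps

feats = torch.randn(2, 64, 32, 32)    # e.g. features from Sentinel-2 patches
print(ChannelAttention(64)(feats).shape)
```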

15 pages, 508 KiB  
Review
The Role of Artificial Intelligence in the Diagnosis and Management of Diabetic Retinopathy
by Areeb Ansari, Nabiha Ansari, Usman Khalid, Daniel Markov, Kristian Bechev, Vladimir Aleksiev, Galabin Markov and Elena Poryazova
J. Clin. Med. 2025, 14(14), 5150; https://doi.org/10.3390/jcm14145150 - 20 Jul 2025
Viewed by 275
Abstract
Background/Objectives: Diabetic retinopathy (DR) is a progressive microvascular complication of diabetes mellitus and a leading cause of vision impairment worldwide. Early detection and timely management are critical in preventing vision loss, yet current screening programs face challenges, including limited specialist availability and variability in diagnoses, particularly in underserved areas. This literature review explores the evolving role of artificial intelligence (AI) in enhancing the diagnosis, screening, and management of diabetic retinopathy, examining AI's potential to improve diagnostic accuracy, accessibility, and patient outcomes through advanced machine-learning and deep-learning algorithms. Methods: We conducted a non-systematic review of the published literature on advancements in the diagnostics of diabetic retinopathy. Relevant articles were identified by searching the PubMed and Google Scholar databases. Studies focusing on the application of artificial intelligence in screening, diagnosis, and improving healthcare accessibility for diabetic retinopathy were included, and key information was extracted and synthesized to provide an overview of recent progress and clinical implications. Conclusions: Artificial intelligence holds transformative potential in diabetic retinopathy care by enabling earlier detection, improving screening coverage, and supporting individualized disease management. Continued research and ethical deployment will be essential to maximize AI's benefits and address challenges in real-world applications, ultimately improving global vision health outcomes.
(This article belongs to the Section Ophthalmology)
