Search Results (265)

Search Parameters:
Keywords = faster region-based CNN

24 pages, 9489 KB  
Article
Detection of Missing Insulators in High-Voltage Transmission Lines Using UAV Images
by Yulong Zhang, Xianghong Xue, Lingxia Mu, Jing Xin, Yichi Yang and Youmin Zhang
Drones 2026, 10(3), 213; https://doi.org/10.3390/drones10030213 - 18 Mar 2026
Viewed by 167
Abstract
Insulators are essential components in high-voltage transmission lines and require regular inspection to ensure reliable power delivery. Traditional manual inspection methods are inefficient and labor intensive, highlighting the need for intelligent and automated solutions. In this study, we propose a missing insulator detection method that integrates Unmanned Aerial Vehicle (UAV) imaging with deep learning techniques. Firstly, an improved Faster Region-based Convolutional Neural Network (Faster R-CNN) is employed to detect and localize insulators in aerial images. Secondly, the localized insulators are segmented using an improved U-Net to reduce background interference. A bounding box regression approach is adopted to obtain the minimum enclosing rectangles, and the insulators are aligned vertically. Adaptive thresholding is then applied to extract binary images of the insulators. These binary images are further transformed into defect curves, from which missing insulators are identified based on curve distribution. To address the limited availability of labeled samples, a transfer learning-based strategy is adopted to improve model generalization. A dataset of glass insulators was collected using a DJI M300 UAV equipped with an H20T camera along a 330 kV overhead transmission line. On the collected UAV insulator dataset, the proposed method achieved an AP@0.5 of 99.85% and an average IoU of 88.56% for insulator string detection, while the improved U-Net achieved an mIoU of 89.73% for insulator string segmentation. Outdoor flight experiments further verified performance under varying backgrounds and illumination conditions in our UAV inspection scenarios. Full article
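The defect-curve step described in this abstract lends itself to a compact sketch: once an insulator string is segmented, aligned vertically, and binarized, a missing disc appears as a run of empty rows in the row-wise pixel counts. A minimal illustration; the function names, the gap-width threshold, and the synthetic input are assumptions, not the authors' code:

```python
import numpy as np

def defect_curve(binary):
    """Collapse a vertically aligned binary insulator image into a 1-D
    'defect curve': the count of foreground pixels in each row."""
    return binary.sum(axis=1)

def missing_segments(curve, min_width=3):
    """Return (start, end) row ranges where the curve drops to zero,
    i.e. candidate positions of a missing insulator disc."""
    gaps, start = [], None
    for i, v in enumerate(curve):
        if v == 0 and start is None:
            start = i
        elif v > 0 and start is not None:
            if i - start >= min_width:
                gaps.append((start, i))
            start = None
    if start is not None and len(curve) - start >= min_width:
        gaps.append((start, len(curve)))
    return gaps
```

For example, a 10-row binary strip with rows 4–7 blank yields a single candidate gap spanning those rows.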

22 pages, 4777 KB  
Article
Defect-Aware RGB Representation and Resolution-Efficient Deep Learning for Photovoltaic Failure Detection in Electroluminescence Images
by Damian Grzechca, Fatima Ez-Zahiri, Łukasz Chruszczyk and Fei Bian
Appl. Sci. 2026, 16(4), 2148; https://doi.org/10.3390/app16042148 - 23 Feb 2026
Viewed by 330
Abstract
Electroluminescence (EL) imaging is widely used for non-destructive inspection of photovoltaic (PV) cells; however, the low contrast of grayscale EL images limits the performance of automated defect detection methods. This manuscript proposes a defect-aware EL image classification framework that enhances defect visibility through local contrast enhancement and physically motivated RGB false-color mapping. Instead of simple channel replication, grayscale intensities are segmented into defect-related ranges and encoded to emphasize cracks, inactive regions, healthy silicon emission, and conductive pathways. The approach is evaluated on the public ELPV benchmark dataset using ResNet–50, EfficientNet–B0, and EfficientNet–B3 architectures at two input resolutions. The proposed representation consistently improves defect discrimination and achieves a maximum classification accuracy, outperforming previously reported CNN-based results on the same dataset. Notably, comparable accuracy is obtained at lower resolution, significantly reducing computational cost and inference time, which supports deployment with cheaper sensors and faster inspection pipelines. Class imbalance is addressed using focal loss, class weighting, and threshold calibration without artificial resampling, preserving realistic operating conditions. The results confirm that combining defect-aware RGB representation with resolution-efficient learning provides an accurate and computationally practical solution for EL-based PV defect detection. Full article
(This article belongs to the Special Issue AI-Based Machinery Health Monitoring)
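The range-based false-color encoding described here can be sketched in a few lines: grayscale intensities are split into bands and each band is routed to its own channel. The cut-off values below are illustrative placeholders, not the paper's calibrated ranges:

```python
import numpy as np

def defect_aware_rgb(el_gray):
    """Map an 8-bit grayscale EL image to a 3-channel false-color image
    that routes each intensity band to its own channel."""
    rgb = np.zeros((*el_gray.shape, 3), dtype=np.uint8)
    dark = el_gray < 60                      # cracks / inactive area (assumed band)
    mid = (el_gray >= 60) & (el_gray < 180)  # healthy silicon emission
    bright = el_gray >= 180                  # busbars / conductive paths
    rgb[..., 0][dark] = 255 - el_gray[dark]  # emphasize cracks in red
    rgb[..., 1][mid] = el_gray[mid]          # healthy emission in green
    rgb[..., 2][bright] = el_gray[bright]    # conductive paths in blue
    return rgb
```

Unlike replicating the grayscale plane into all three channels, each band lands in a distinct channel, so a downstream CNN can separate defect classes by color.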

17 pages, 8796 KB  
Article
Subgrade Distress Detection in GPR Radargrams Using an Improved YOLOv11 Model
by Mingzhou Bai, Qun Ma, Hongyu Liu and Zilun Zhang
Sustainability 2026, 18(3), 1273; https://doi.org/10.3390/su18031273 - 27 Jan 2026
Viewed by 279
Abstract
This study compares three detectors—Single Shot MultiBox Detector (SSD), Faster Region-based Convolutional Neural Network (Faster R-CNN), and You Only Look Once v11 (YOLOv11)—for detecting subgrade distress in GPR radargrams. SSD converges fastest but shows weaker detection performance, while Faster R-CNN achieves higher localization accuracy at the cost of slower convergence. YOLOv11 offers the best overall performance. To push YOLOv11 further, we introduce three enhancements: a Multi-Scale Edge Enhancement Module (MEEM), a Multi-Feature Multi-Scale Attention (MFMSA) mechanism, and a hybrid configuration that combines both. On a representative dataset, YOLOv11_MEEM yields a 0.2 percentage-point increase in precision with a 0.2 percentage-point decrease in recall and a 0.3 percentage-point gain in mean Average Precision@0.5:0.95, indicating improved generalization and efficiency. YOLOv11_MFMSA achieves precision comparable to MEEM but suffers a substantial recall drop and slower inference. The hybrid YOLOv11_MEEM+MFMSA underperforms on key metrics due to gradient conflicts. MEEM reduces electromagnetic interference through dynamic edge enhancement, preserving real-time performance and robust generalization. Overall, MEEM-enhanced YOLOv11 is suitable for real-time subgrade distress detection in GPR radargrams. The research findings can offer technical support for the intelligent detection of subgrade engineering while also promoting the resilient development and sustainable operation and maintenance of urban infrastructure. Full article

22 pages, 15950 KB  
Article
An Automatic Identification Method for Large-Scale Landslide Hazard Potential Integrating InSAR and CRF-Faster RCNN: A Case Study of Ahai Reservoir Area in Jinsha River Basin
by Yujuan Dong, Yongfa Li, Xiaoqing Zuo, Na Liu, Xiaona Gu, Haoyi Shi, Rukun Jiang, Fangzhen Guo, Zhengxiong Gu and Yongzhi Chen
Remote Sens. 2026, 18(2), 283; https://doi.org/10.3390/rs18020283 - 15 Jan 2026
Viewed by 368
Abstract
Currently, the manual delineation of landslide anomalies from Interferometric Synthetic Aperture Radar (InSAR) deformation data is labor-intensive and time-consuming, creating a major bottleneck for operational large-scale landslide mapping. This study proposes an automated approach for large-scale landslide identification by integrating InSAR technology with an improved Faster Region-based Convolutional Neural Network (Faster R-CNN). First, surface deformation over the study area was obtained using the Small Baseline Subset Interferometric Synthetic Aperture Radar (SBAS-InSAR) technique. An enhanced CRF-Faster R-CNN model was then developed by incorporating a Residual Network with 50 layers (ResNet-50)-based backbone, strengthened with a Convolutional Block Attention Module (CBAM), within a Feature Pyramid Network (FPN) framework. This model was applied to deformation velocity maps for the automated detection of landslide-prone areas. Preliminary results were subsequently validated and refined using optical images to produce a final landslide inventory. The proposed method was evaluated in the Ahai Reservoir area of the Jinsha River Basin using 248 ascending and descending Sentinel-1A images acquired between January 2019 and December 2021. Its performance was compared with that of the standard Faster R-CNN model. The results indicate that the CRF-Faster R-CNN model outperforms the conventional approach in terms of landslide anomaly detection, convergence speed, and overall accuracy. A total of 38 potential landslide hazards were identified in the Ahai Reservoir area, with an 84% validation accuracy confirmed through field investigations. This study provides crucial technical support for the rapid identification and operational application of large-scale potential landslide hazards. Full article

19 pages, 3550 KB  
Article
CAG-Net: A Novel Change Attention Guided Network for Substation Defect Detection
by Dao Xiang, Xiaofei Du and Zhaoyang Liu
Mathematics 2026, 14(1), 178; https://doi.org/10.3390/math14010178 - 2 Jan 2026
Viewed by 452
Abstract
Timely detection and handling of substation defects plays a foundational role in ensuring the stable operation of power systems. Existing substation defect detection methods fail to make full use of the temporal information contained in substation inspection samples, resulting in problems such as weak generalization ability and susceptibility to background interference. To address these issues, a change attention guided substation defect detection algorithm (CAG-Net) based on a dual-temporal encoder–decoder framework is proposed. The encoder module employs a Siamese backbone network composed of efficient local-global context aggregation modules to extract multi-scale features, balancing local details and global semantics, and designs a change attention guidance module that takes feature differences as attention weights to dynamically enhance the saliency of defect regions and suppress background interference. The decoder module adopts an improved FPN structure to fuse high-level and low-level features, supplement defect details, and improve the model’s ability to detect small targets and multi-scale defects. Experimental results on the self-built substation multi-phase defect dataset (SMDD) show that the proposed method achieves 81.76% in terms of mAP, which is 3.79% higher than that of Faster R-CNN and outperforms mainstream detection models such as GoldYOLO and YOLOv10. Ablation experiments and visualization analysis demonstrate that the method can effectively focus on defect regions in complex environments, improving the positioning accuracy of multi-scale targets. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
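The "feature differences as attention weights" idea at the heart of CAG-Net can be reduced to a few lines of PyTorch. This toy module (the names and the residual form are assumptions, not CAG-Net itself) shows the mechanism:

```python
import torch
import torch.nn as nn

class ChangeAttentionGuide(nn.Module):
    """Use the feature difference between two inspection phases as an
    attention map that highlights changed (defect) regions."""
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, f_ref, f_cur):
        # |current - reference| is large where something changed.
        attn = torch.sigmoid(self.proj(torch.abs(f_cur - f_ref)))
        return f_cur * attn + f_cur  # residual keeps unchanged context
```

Because the attention is driven by the inter-phase difference rather than by the current frame alone, static background structures are suppressed by construction.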

17 pages, 5035 KB  
Article
An Improved Cascade R-CNN-Based Fastener Detection Method for Coating Workshop Inspection
by Jiaqi Liu, Shanhui Liu, Yuhong Chen, Jiawen Zhao and Jiahao Fu
Coatings 2026, 16(1), 37; https://doi.org/10.3390/coatings16010037 - 30 Dec 2025
Viewed by 365
Abstract
To address the challenges of small fastener targets, complex backgrounds, and the low efficiency of traditional manual inspection in coating workshop scenarios, this paper proposes an improved Cascade R-CNN-based fastener detection method. A VOC-format dataset was constructed covering three target categories—Marking-painted fastener, Fastener, and Fallen off—which represents typical inspection scenarios of coating equipment under diverse operating conditions and enhances the adaptability of the model. Within the Cascade R-CNN framework, three improvements were introduced: the Convolutional Block Attention Module (CBAM) was integrated into the ResNet-101 backbone to enhance feature representation of small objects; anchor scales were reduced to better align with the actual size distribution of fasteners; and Soft-NMS was adopted in place of conventional NMS to effectively reduce missed detections in overlapping regions. Experimental results demonstrate that the proposed method achieves a mean Average Precision (mAP) of 96.60% on the self-constructed dataset, with both Precision and Recall exceeding 95%, significantly outperforming Faster R-CNN and the original Cascade R-CNN. The method enables accurate detection and missing-state recognition of fasteners in complex backgrounds and small-object scenarios, providing reliable technical support for the automation and intelligence of printing equipment inspection. Full article
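Soft-NMS, adopted above in place of conventional NMS, decays the scores of boxes that overlap the current best detection instead of deleting them, which is what reduces missed detections in crowded regions. A self-contained sketch of the Gaussian variant (parameter defaults are illustrative):

```python
import numpy as np

def _iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores by exp(-IoU^2 / sigma)
    instead of discarding the boxes outright."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    keep, idxs = [], list(range(len(scores)))
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        if scores[best] < score_thresh:
            break
        keep.append(best)
        idxs.remove(best)
        for i in idxs:
            scores[i] *= np.exp(-(_iou(boxes[best], boxes[i]) ** 2) / sigma)
    return keep
```

A box heavily overlapping a stronger detection keeps a (reduced) score, so a genuinely distinct but overlapping fastener is still reported.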

16 pages, 4674 KB  
Article
Field-Oriented Rice Pest Detection: Dataset Construction and Performance Analysis
by Bocheng Mo, Zheng Zhang, Changcheng Li, Qifeng Zhang and Changjian Chen
Agronomy 2026, 16(1), 53; https://doi.org/10.3390/agronomy16010053 - 24 Dec 2025
Viewed by 684
Abstract
Rice is one of the world’s most important staple crops, and outbreaks of insect pests pose a persistent threat to yield stability and food security in major rice-growing regions. Reliable field-scale rice pest detection remains challenging due to limited datasets, heterogeneous imaging conditions, and inconsistent annotations. To address these limitations, we construct RicePest-30, a field-oriented dataset comprising 8848 images and 62,227 annotated instances covering 30 major rice pest species. Images were collected using standardized square-framing protocols to preserve spatial context and visual consistency under diverse illumination and background conditions. Based on RicePest-30, YOLOv11 was adopted as the primary detection framework and optimized through a systematic hyperparameter tuning process. The learning rate was selected via grid search within the range of 0.001–0.01, yielding an optimal value of 0.002. Training was conducted for up to 300 epochs with an early-stopping strategy to prevent overfitting. For fair comparison, YOLOv5s, YOLOv8s, Faster R-CNN, and RetinaNet were trained for the same number of epochs under unified settings, using the Adam optimizer with a learning rate of 0.001. Model performance was evaluated using Precision, Recall, AP@50, mAP@50:95, and counting error metrics. The experimental results indicate that YOLOv11 provides the most balanced performance across precision, localization accuracy, and counting stability. However, all models exhibit degraded performance in small-object scenarios, dense pest distributions, and visually similar categories. Error analyses further reveal that class imbalance and field-scene variability are the primary factors limiting detection robustness. Overall, this study contributes a high-quality, uniformly annotated rice pest dataset and a systematic benchmark of mainstream detection models under realistic field conditions. The findings highlight critical challenges in fine-grained pest recognition and provide a solid foundation for future research on small-object enhancement, adaptive data augmentation, and robust deployment of intelligent pest monitoring systems. Full article
(This article belongs to the Section Precision and Digital Agriculture)

22 pages, 3280 KB  
Article
A Novel Scenario-Based Comparative Framework for Short- and Medium-Term Solar PV Power Forecasting Using Deep Learning Models
by Elif Yönt Aydın, Kevser Önal, Cem Haydaroğlu, Heybet Kılıç, Özal Yıldırım, Oğuzhan Katar and Hüseyin Erdoğan
Appl. Sci. 2025, 15(24), 12965; https://doi.org/10.3390/app152412965 - 9 Dec 2025
Cited by 1 | Viewed by 765
Abstract
Accurate short- and medium-term forecasting of photovoltaic (PV) power generation is vital for grid stability and renewable energy integration. This study presents a comparative scenario-based approach using Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU) models trained with one year of real-time meteorological and production data from a 250 kWp grid-connected PV system located at Dicle University in Diyarbakır, Southeastern Anatolia, Turkey. The dataset includes hourly measurements of solar irradiance (average annual GHI 5.4 kWh/m2/day), ambient temperature, humidity, and wind speed, with missing data below 2% after preprocessing. Six forecasting scenarios were designed for different horizons (6 h to 1 month). Results indicate that the LSTM model achieved the best performance in short-term scenarios, reaching R2 values above 0.90 and lower MAE and RMSE compared to CNN and GRU. The GRU model showed similar accuracy with faster training time, while CNN produced higher errors due to the dominant temporal nature of PV output. These results align with recent studies that emphasize selecting suitable deep learning architectures for time-series energy forecasting. This work highlights the benefit of integrating real local meteorological data with deep learning models in a scenario-based design and provides practical insights for regional grid operators and energy planners to reduce production uncertainty. Future studies can improve forecast reliability by testing hybrid models and implementing real-time adaptive training strategies to better handle extreme weather fluctuations. Full article
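As a concrete anchor for the model comparison above, a minimal LSTM forecaster of the kind evaluated here can look as follows. The four input features mirror the abstract's irradiance, temperature, humidity, and wind speed; the layer sizes and window length are assumptions:

```python
import torch
import torch.nn as nn

class PVForecaster(nn.Module):
    """Minimal LSTM regressor: a window of hourly weather/production
    features in, the next-step PV power out."""
    def __init__(self, n_features=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # last hidden state -> next power
```

A GRU variant is obtained by swapping `nn.LSTM` for `nn.GRU`, which is one way the speed/accuracy trade-off reported above can be reproduced.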

20 pages, 8313 KB  
Article
Pipe Burst Detection and Localization in Water Distribution Networks Using Faster Region-Based Convolutional Neural Network
by Kyoungwon Min, Joong Hoon Kim, Donghwi Jung, Seungyub Lee and Doosun Kang
Water 2025, 17(23), 3380; https://doi.org/10.3390/w17233380 - 26 Nov 2025
Cited by 1 | Viewed by 1029
Abstract
Pipe leakage and bursts are the primary contributors to water losses in water distribution networks (WDNs). However, the use of object detection techniques for identifying such failures is underexplored. This study proposes a novel deep-learning-based framework for pipe burst detection and localization (PBD&L) within WDNs. The framework employs spatial encoding of pressure fields obtained from hydraulic simulations of normal and burst scenarios. These encoded images serve as inputs to a faster region-based convolutional neural network (Faster R-CNN) object detection model, specifically designed for infrastructure monitoring. The framework was tested on three WDNs—Fossolo, PB23, and CM53—under varying sensor coverages (100%, 75%, and 50%). The results indicate that the model consistently achieves high detection accuracy across different network configurations, even with limited sensor availability. For Fossolo and PB23, the model demonstrated stable performance; however, for the CM53 network, accuracy decreased at full sensor coverage, possibly owing to overfitting or signal redundancy. Overall, the proposed method presents a robust solution for PBD&L in WDNs, showcasing significant practical applicability. Its ability to maintain high performance under partial observability and diverse network conditions demonstrates its potential for integration into real-time smart water management systems, enabling automated monitoring, rapid response, and improved operational efficiency. Full article
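The "spatial encoding of pressure fields" step can be illustrated with a toy rasterizer that places each node's pressure at its scaled (x, y) pixel. The grid size and nearest-pixel scheme are assumptions, not the authors' encoding:

```python
import numpy as np

def encode_pressure_field(coords, pressures, grid=64):
    """Rasterize sparse node pressures into a 2-D image: each node's
    (x, y) position maps to a pixel whose value is its reading."""
    coords = np.asarray(coords, dtype=float)
    img = np.zeros((grid, grid), dtype=float)
    # Normalize network coordinates into pixel indices.
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    px = ((coords - lo) / span * (grid - 1)).astype(int)
    for (x, y), p in zip(px, pressures):
        img[y, x] = p
    return img
```

Images rendered this way for normal and burst scenarios become the training inputs of the object detector, with the burst location annotated as a bounding box.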

23 pages, 39304 KB  
Article
Anatomical Alignment of Femoral Radiographs Enables Robust AI-Powered Detection of Incomplete Atypical Femoral Fractures
by Doyoung Kwon, Jin-Han Lee, Joon-Woo Kim, Ji-Wan Kim, Sun-jung Yoon, Sungmoon Jeong and Chang-Wug Oh
Mathematics 2025, 13(22), 3720; https://doi.org/10.3390/math13223720 - 20 Nov 2025
Cited by 1 | Viewed by 742
Abstract
An incomplete atypical femoral fracture is subtle and requires early diagnosis. However, artificial intelligence models for these fractures often fail in real-world clinical settings due to the “domain shift” problem, where performance degrades when applied to new data sources. This study proposes a data-centric approach to overcome this problem. We introduce an anatomy-based four-step preprocessing pipeline to normalize femoral X-ray images. This pipeline consists of (1) semantic segmentation of the femur, (2) skeletonization and centroid extraction using RANSAC, (3) rotational alignment to the vertical direction, and (4) cropping a normalized region of interest (ROI). We evaluate the effectiveness of this pipeline across various one-stage (YOLO) and two-stage (Faster R-CNN) object detection models. On the source domain data, the proposed alignment pipeline significantly improves the performance of the YOLO model, with YOLOv10n achieving the best performance of 0.6472 at mAP@50–95. More importantly, in zero-shot evaluation on a completely new domain, standing AP X-ray, the model trained on aligned data exhibited strong generalization performance, while the existing models completely failed (mAP = 0); YOLOv10s, which applied the proposed method, achieved 0.4616 at mAP@50–95. The one-stage detectors showed more consistent performance gains from the alignment technique than the two-stage detector. Normalizing medical images based on inherent anatomical consistency is a highly effective and efficient strategy for achieving domain generalization. This data-driven paradigm, which simplifies the input to AI, can create clinically applicable, robust models without increasing the complexity of the model architecture. Full article
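Steps (2)–(3) of the pipeline—fitting the shaft axis with RANSAC and rotating it to vertical—can be sketched with a minimal RANSAC line fit. The iteration count, inlier tolerance, and function names are illustrative, not the paper's implementation:

```python
import numpy as np

def shaft_angle(skeleton_pts, iters=200, tol=2.0, rng=None):
    """Estimate the femoral shaft direction from skeleton points with a
    minimal RANSAC line fit; return the angle (degrees) between that
    direction and the vertical image axis."""
    if rng is None:
        rng = np.random.default_rng(0)
    pts = np.asarray(skeleton_pts, dtype=float)
    best_inliers, best_dir = 0, np.array([0.0, 1.0])
    for _ in range(iters):
        a, b = pts[rng.choice(len(pts), 2, replace=False)]
        d = b - a
        norm = np.linalg.norm(d)
        if norm == 0:
            continue
        d /= norm
        # Count points within tol of the candidate line through a and b.
        normal = np.array([-d[1], d[0]])
        inliers = int((np.abs((pts - a) @ normal) < tol).sum())
        if inliers > best_inliers:
            best_inliers, best_dir = inliers, d
    if best_dir[1] < 0:          # resolve the direction ambiguity
        best_dir = -best_dir
    return np.degrees(np.arctan2(best_dir[0], best_dir[1]))
```

Rotating the radiograph by the returned angle brings the shaft vertical, after which a fixed ROI crop yields the normalized input described above.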

12 pages, 1677 KB  
Article
Quantization of Faster R-CNN
by Tamás Menyhárt and Róbert Lakatos
Future Transp. 2025, 5(4), 175; https://doi.org/10.3390/futuretransp5040175 - 17 Nov 2025
Viewed by 841
Abstract
The Faster Region-based Convolutional Network (Faster R-CNN) is an efficient object detection model. However, its large size and significant computational requirements limit its applicability in embedded systems and real-time environments. Quantization is a proven method for reducing models’ size and computational requirements, but there is currently no open-source general implementation for quantizing Faster R-CNN. The main reason is that individual architecture components need to be quantized separately due to their structural characteristics. We present a general Faster R-CNN quantization algorithm, for which our implementation is open-source and compatible with the PyTorch (2.7.0+cu126, pt12) ecosystem. Our solution reduces the model size by 67.2% and the detection time by 50.4% while maintaining the accuracy measured on the test data within an error margin of 8.2% and a standard deviation of ±3.4%. It also allows for the visualization of model errors by extracting the model’s internal activation maps, supporting a more efficient understanding of its behavior. We demonstrate that the proposed method can effectively quantize Faster R-CNN, enabling the model to run on low-power hardware. This is particularly important in applications such as autonomous vehicles, embedded sensor systems, and real-time security surveillance, where fast and energy-efficient object detection is crucial. Full article
(This article belongs to the Special Issue Future of Vehicles (FoV2025))
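While the paper's open-source algorithm covers the full architecture, the core operation can be previewed on a single component: PyTorch's dynamic quantization converts a head's Linear weights to int8 with no calibration data. The stand-in box head below (sized to match torchvision's default 256×7×7 RoI features) is an assumption, not the authors' implementation:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

# Stand-in for one Faster R-CNN component: the two-layer box head whose
# Linear layers dominate parameter count outside the backbone.
box_head = nn.Sequential(nn.Linear(12544, 1024), nn.ReLU(),
                         nn.Linear(1024, 1024), nn.ReLU())

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly, shrinking size at a small accuracy cost.
q_head = quantize_dynamic(box_head, {nn.Linear}, dtype=torch.qint8)
```

As the abstract notes, a full Faster R-CNN needs each component (backbone, RPN, heads) handled separately, which is what their open-source implementation automates.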

23 pages, 8309 KB  
Article
Hybrid Faster R-CNN for Tooth Numbering in Periapical Radiographs Based on Fédération Dentaire Internationale System
by Yong-Shao Su, I Elizabeth Cha, Yi-Cheng Mao, Li-Hsin Chang, Zi-Chun Kao, Shun-Yuan Tien, Yuan-Jin Lin, Shih-Lun Chen, Kuo-Chen Li and Patricia Angela R. Abu
Diagnostics 2025, 15(22), 2900; https://doi.org/10.3390/diagnostics15222900 - 15 Nov 2025
Cited by 1 | Viewed by 1048
Abstract
Background/Objectives: Tooth numbering is essential because it allows dental clinicians to identify lesion locations during diagnosis, typically using the Fédération Dentaire Internationale system. However, accurate tooth numbering is challenging due to variations in periapical radiograph (PA) angles. In this study, we aimed to develop a deep learning-based tool to assist dentists in accurately identifying teeth via tooth numbering and improve diagnostic efficiency and accuracy. Methods: We developed a Hybrid Faster Region-based Convolutional Neural Network (R-CNN) technique and a custom loss function tailored for PA tooth numbering to accelerate training. Additionally, we developed a tooth-numbering position auxiliary localization algorithm to address challenges associated with missing teeth and extensive crown loss in existing datasets. Results: We achieved a maximum precision of 95.16% utilizing the transformer-based NextViT-Faster R-CNN hybrid model, along with an accuracy increase of at least 8.5% and a 19.8% reduction in training time compared to models without the proposed tooth-numbering position auxiliary localization algorithm and conventional methods. Conclusions: The results demonstrate the effectiveness of the proposed method in overcoming challenges in PA tooth numbering within AI-assisted dental diagnostics, enhancing clinical efficiency, and reducing the risk of misdiagnosis in dental practices. Full article
(This article belongs to the Special Issue 3rd Edition: AI/ML-Based Medical Image Processing and Analysis)
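For reference, the Fédération Dentaire Internationale notation that the model predicts is itself simple: a two-digit code whose first digit is the quadrant (1 upper right, 2 upper left, 3 lower left, 4 lower right for permanent teeth) and whose second digit is the position from the midline (1 central incisor through 8 third molar). A small helper, with names of our own choosing:

```python
def fdi_number(quadrant, position):
    """Compose an FDI two-digit tooth number from quadrant and position."""
    if quadrant not in (1, 2, 3, 4) or position not in range(1, 9):
        raise ValueError("invalid FDI quadrant/position")
    return quadrant * 10 + position

def fdi_parts(number):
    """Split an FDI number back into (quadrant, position)."""
    return divmod(number, 10)
```

So tooth 48, for example, is the lower-right third molar, which is the kind of label the detector must assign even when neighboring teeth are missing.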

9 pages, 1557 KB  
Proceeding Paper
XAI-Interpreter: A Dual-Attention Framework for Transparent and Explainable Decision-Making in Autonomous Vehicles
by Candaş Ünal, Pelin Öksüz, Tolga Bodrumlu and Musa Yazar
Eng. Proc. 2025, 118(1), 84; https://doi.org/10.3390/ECSA-12-26531 - 7 Nov 2025
Viewed by 329
Abstract
Autonomous vehicles need to explain their actions to improve reliability and build user trust. This study focuses on enhancing the transparency and explainability of the decision-making process in such systems. A module named XAI-Interpreter is developed to identify and highlight the most influential factors in driving decisions. The module combines two complementary methods: Learned Attention Weights (LAW) and Object-Level Attention (OLA). In the LAW method, images captured from the ego vehicle’s front and rear cameras in the CARLA simulation environment are processed using the Faster R-CNN model for object detection. GRAD-CAM is then applied to generate visual attention heatmaps, showing which regions and objects in the images affect the model’s decisions. The OLA method analyzes nearby dynamic objects, such as other vehicles, based on their size, speed, position, and orientation relative to the ego vehicle. Each object receives a normalized attention score between 0 and 1, indicating its influence on the vehicle’s behavior. These scores can be used in downstream modules such as planning, control, and safety. The module is currently tested in simulation. Future work will involve deploying the system on real vehicles. By helping the vehicle focus on the most critical elements in its surroundings, the Explainable Artificial Intelligence (XAI)-Interpreter supports more transparent and explainable autonomous driving systems. Full article
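The OLA scoring described above—a normalized attention value in [0, 1] derived from an object's size, speed, and position—can be caricatured with a weighted sum. Every weight and saturation constant below is an assumption for illustration only, not the paper's formulation:

```python
import math

def object_attention(size_m2, speed_mps, distance_m,
                     w_size=0.2, w_speed=0.4, w_dist=0.4):
    """Toy object-level attention score in [0, 1]: larger, faster, and
    closer objects receive more attention."""
    s_size = min(size_m2 / 10.0, 1.0)      # saturate near truck size
    s_speed = min(speed_mps / 20.0, 1.0)   # saturate near 72 km/h
    s_dist = math.exp(-distance_m / 30.0)  # nearer -> closer to 1
    return w_size * s_size + w_speed * s_speed + w_dist * s_dist
```

A nearby fast vehicle thus outranks a distant slow one, and the resulting scores can be passed to planning or safety modules as described.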

32 pages, 2758 KB  
Article
A Hybrid Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM)–Attention Model Architecture for Precise Medical Image Analysis and Disease Diagnosis
by Md. Tanvir Hayat, Yazan M. Allawi, Wasan Alamro, Salman Md Sultan, Ahmad Abadleh, Hunseok Kang and Aymen I. Zreikat
Diagnostics 2025, 15(21), 2673; https://doi.org/10.3390/diagnostics15212673 - 23 Oct 2025
Cited by 2 | Viewed by 2438
Abstract
Background: Deep learning (DL)-based medical image classification is becoming increasingly reliable, enabling physicians to make faster and more accurate decisions in diagnosis and treatment. A plethora of algorithms have been developed to classify and analyze various types of medical images. Among them, Convolutional Neural Networks (CNNs) have proven highly effective, particularly in medical image analysis and disease detection. Methods: To further enhance these capabilities, this research introduces MediVision, a hybrid DL-based model that integrates a vision backbone based on CNNs for feature extraction, capturing detailed patterns and structures essential for precise classification. These features are then processed through Long Short-Term Memory (LSTM), which identifies sequential dependencies to better recognize disease progression. An attention mechanism is then incorporated that selectively focuses on salient features detected by the LSTM, improving the model’s ability to highlight critical abnormalities. Additionally, MediVision utilizes a skip connection, merging attention outputs with LSTM outputs along with a Grad-CAM heatmap to visualize the most important regions of the analyzed medical image and further enhance feature representation and classification accuracy. Results: Tested on ten diverse medical image datasets (including Alzheimer’s disease, breast ultrasound, blood cell, chest X-ray, chest CT scans, diabetic retinopathy, kidney diseases, bone fracture multi-region, retinal OCT, and brain tumor), MediVision consistently achieved classification accuracies above 95%, with a peak of 98%. Conclusions: The proposed MediVision model offers a robust and effective framework for medical image classification, improving interpretability, reliability, and automated disease diagnosis. To support research reproducibility, the codes and datasets used in this study have been made publicly available through an open-access repository. Full article
(This article belongs to the Special Issue Machine-Learning-Based Disease Diagnosis and Prediction)
28 pages, 12549 KB  
Article
An Enhanced Faster R-CNN for High-Throughput Winter Wheat Spike Monitoring to Improved Yield Prediction and Water Use Efficiency
by Donglin Wang, Longfei Shi, Yanbin Li, Binbin Zhang, Guangguang Yang and Serestina Viriri
Agronomy 2025, 15(10), 2388; https://doi.org/10.3390/agronomy15102388 - 14 Oct 2025
Viewed by 936
Abstract
This study develops an innovative unmanned aerial vehicle (UAV)-based intelligent system for winter wheat yield prediction, addressing the inefficiencies of traditional manual counting methods (with an approximately 15% error rate) and enabling quantitative analysis of water–fertilizer interactions. By integrating an enhanced Faster Region-Based Convolutional Neural Network (Faster R-CNN) architecture with multi-source data fusion and machine learning, the system significantly improves both spike detection accuracy and yield forecasting performance. Field experiments during the 2022–2023 growing season captured high-resolution multispectral imagery under varied irrigation regimes and fertilization treatments. The optimized detection model incorporates ResNet-50 as the backbone feature extraction network, with residual connections and channel attention mechanisms, achieving a mean average precision (mAP) of 91.2% (at an IoU threshold of 0.5) and 88.72% recall while reducing computational complexity. The model outperformed YOLOv8 by a statistically significant 2.1% margin (p < 0.05). Using model-generated spike counts as input, the random forest (RF) regressor demonstrated superior yield prediction performance (R2 = 0.82, RMSE = 324.42 kg·ha−1), exceeding the Partial Least Squares Regression (PLSR) (R2 +46%, RMSE −44.3%), Least Squares Support Vector Machine (LSSVM) (R2 +32.3%, RMSE −32.4%), Support Vector Regression (SVR) (R2 +30.2%, RMSE −29.6%), and Backpropagation (BP) Neural Network (R2 +22.4%, RMSE −24.4%) models. Analysis of different water–fertilizer treatments revealed that while organic fertilizer under full irrigation (750 m3·ha−1) achieved the maximum yield benefit (13,679.26 CNY·ha−1), it showed relatively low water productivity (WP = 7.43 kg·m−3). Conversely, under deficit irrigation (450 m3·ha−1), the 3:7 organic/inorganic fertilizer treatment achieved optimal WP (11.65 kg·m−3) and WUE (20.16 kg·ha−1·mm−1) while increasing yield benefit by 25.46% compared to organic fertilizer alone. This research establishes an integrated technical framework for high-throughput spike monitoring and yield estimation, providing actionable insights for synergistic water–fertilizer management strategies in sustainable precision agriculture. Full article
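The WP and WUE figures reported above follow from simple ratios; the sketch below shows the two metrics as commonly defined. The yield and water values here are hypothetical illustrations, not the study's measurements, and WUE is assumed to divide by total seasonal water use (irrigation plus rainfall and soil-water depletion), not irrigation alone.

```python
def water_productivity(yield_kg_ha, irrigation_m3_ha):
    """Water productivity (WP) in kg per m^3 of irrigation water applied."""
    return yield_kg_ha / irrigation_m3_ha

def water_use_efficiency(yield_kg_ha, total_water_mm):
    """Water use efficiency (WUE) in kg·ha^-1·mm^-1.

    total_water_mm is assumed to be total seasonal water use in mm of
    depth (irrigation plus rainfall and soil-water depletion).
    """
    return yield_kg_ha / total_water_mm

# Hypothetical example values:
y = 5240.0  # grain yield, kg/ha
print(round(water_productivity(y, 450), 2))    # 11.64 kg/m^3
print(round(water_use_efficiency(y, 260), 2))  # 20.15 kg/ha/mm
```

Because WP divides by irrigation volume while WUE divides by total water depth, a deficit-irrigated treatment can score high on both metrics even when its absolute yield is below the fully irrigated one, which matches the trade-off the abstract reports.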
(This article belongs to the Section Water Use and Irrigation)
