Search Results (510)

Search Parameters:
Keywords = Res-U-Net

18 pages, 3368 KiB  
Article
Segmentation-Assisted Fusion-Based Classification for Automated CXR Image Analysis
by Shilu Kang, Dongfang Li, Jiaxin Xu, Aokun Mei and Hua Huo
Sensors 2025, 25(15), 4580; https://doi.org/10.3390/s25154580 - 24 Jul 2025
Abstract
Accurate classification of chest X-ray (CXR) images is crucial for diagnosing lung diseases in medical imaging. Existing deep learning models for CXR image classification face challenges in distinguishing non-lung features. In this work, we propose a new segmentation-assisted fusion-based classification method. The method involves two stages: first, we use a lightweight segmentation model, the Partial Convolutional Segmentation Network (PCSNet), based on an encoder–decoder architecture, to accurately obtain lung masks from CXR images. Then, a fusion of the masked CXR image with the original image enables classification using an improved lightweight ShuffleNetV2 model. The proposed method is trained and evaluated on segmentation datasets including the Montgomery County Dataset (MC) and Shenzhen Hospital Dataset (SH), and classification datasets such as Chest X-Ray Images for Pneumonia (CXIP) and COVIDx. Compared with seven segmentation models (U-Net, Attention-Net, SegNet, FPNNet, DANet, DMNet, and SETR), five classification models (ResNet34, ResNet50, DenseNet121, Swin Transformer, and ShuffleNetV2), and state-of-the-art methods, our PCSNet model achieved high segmentation performance on CXR images. Compared to the state-of-the-art Attention-Net model, the accuracy of PCSNet increased by 0.19% (98.94% vs. 98.75%) and the boundary accuracy improved by 0.3% (97.86% vs. 97.56%), while requiring 62% fewer parameters. For pneumonia classification using the CXIP dataset, the proposed strategy outperforms the current best model by 0.14% in accuracy (98.55% vs. 98.41%). For COVID-19 classification with the COVIDx dataset, the model reached an accuracy of 97.50%, an absolute improvement of 0.1% over CovXNet, and clinical metrics demonstrate more significant gains: specificity increased from 94.7% to 99.5%. These results highlight the model's effectiveness in medical image analysis, demonstrating clinically meaningful improvements over state-of-the-art approaches.
(This article belongs to the Special Issue Vision- and Image-Based Biomedical Diagnostics—2nd Edition)
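For readers who want the shape of the two-stage pipeline, a minimal PyTorch sketch follows; PCSNet itself is not reproduced (any mask-producing network can stand in), and the channel-wise fusion of original and masked images is an illustrative assumption, not the authors' exact operation.

```python
# Hypothetical sketch of the segment-then-fuse-then-classify pipeline;
# seg_model is any network producing a 1-channel lung-mask logit map.
import torch
import torch.nn as nn
from torchvision.models import shufflenet_v2_x1_0

def fuse_and_classify(cxr, seg_model, classifier):
    """cxr: (B, 1, H, W) chest X-ray batch."""
    with torch.no_grad():
        mask = (torch.sigmoid(seg_model(cxr)) > 0.5).float()  # lung mask
    masked = cxr * mask                      # lung-only view of the image
    fused = torch.cat([cxr, masked], dim=1)  # channel-wise fusion (assumed)
    return classifier(fused)

classifier = shufflenet_v2_x1_0(num_classes=2)
# ShuffleNetV2 expects 3 input channels; adapt the stem to the 2-channel fusion.
classifier.conv1[0] = nn.Conv2d(2, 24, kernel_size=3, stride=2, padding=1, bias=False)
```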

19 pages, 3805 KiB  
Article
Assessment of Urban Rooftop Photovoltaic Potential Based on Deep Learning: A Case Study of the Central Urban Area of Wuhan
by Yu Zhang, Wei He, Jinyan Hu, Chaohui Zhou, Bo Ren, Huiheng Luo, Zhiyong Tian and Weili Liu
Buildings 2025, 15(15), 2607; https://doi.org/10.3390/buildings15152607 - 23 Jul 2025
Abstract
Accurate assessment of urban rooftop solar photovoltaic (PV) potential is critical for the low-carbon energy transition. This study presents a deep learning-based approach using high-resolution (0.5 m) aerial imagery to automatically identify building rooftops in the central urban area of Wuhan, China (covering seven districts), and to estimate their PV installation potential. Two state-of-the-art semantic segmentation models (DeepLabv3+ and U-Net) were trained and evaluated on a local rooftop dataset; U-Net with a ResNet50 backbone achieved the best performance with an overall segmentation accuracy of ~94%. Using this optimized model, we extracted approximately 130 km² of suitable rooftop area, which could support an estimated 18.18 GW of PV capacity. These results demonstrate the effectiveness of deep learning for city-scale rooftop mapping and provide a data-driven basis for strategic planning of distributed PV installations to support carbon neutrality goals. The proposed framework can be generalized to facilitate large-scale solar energy assessments in other cities.
(This article belongs to the Special Issue Smart Technologies for Climate-Responsive Building Envelopes)
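A minimal sketch of the reported setup follows, assuming the segmentation_models_pytorch package for the U-Net/ResNet50 pairing; the closing arithmetic merely checks the power density implied by the reported 130 km² and 18.18 GW, not a figure from the paper.

```python
# A minimal sketch, assuming segmentation_models_pytorch; the paper's exact
# training configuration is not reproduced here.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet50",     # the backbone reported to perform best
    encoder_weights="imagenet",
    in_channels=3,               # RGB aerial tiles
    classes=1,                   # rooftop vs. background
)

# Back-of-the-envelope check of the numbers quoted above:
rooftop_km2 = 130.0
capacity_gw = 18.18
print(f"Implied power density: {capacity_gw * 1e9 / (rooftop_km2 * 1e6):.0f} W/m^2")
# ~140 W/m^2, a plausible module power density (illustrative only).
```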

21 pages, 2919 KiB  
Article
A Feasible Domain Segmentation Algorithm for Unmanned Vessels Based on Coordinate-Aware Multi-Scale Features
by Zhengxun Zhou, Weixian Li, Yuhan Wang, Haozheng Liu and Ning Wu
J. Mar. Sci. Eng. 2025, 13(8), 1387; https://doi.org/10.3390/jmse13081387 - 22 Jul 2025
Abstract
The accurate extraction of navigational regions from images of navigational waters plays a key role in ensuring on-water safety and the automation of unmanned vessels. Nonetheless, current methods encounter significant challenges in addressing fluctuations in water surface illumination, reflective disturbances, and surface undulations, among other disruptions, making it difficult to achieve rapid and precise boundary segmentation. To cope with these challenges, in this paper we propose a coordinate-aware multi-scale feature network (GASF-ResNet) for water segmentation. The method integrates the Global Grouping Coordinate Attention (GGCA) module in the four downsampling branches of ResNet-50, enhancing the model's ability to capture target features and improving the feature representation. To expand the model's receptive field and boost its capability to extract features of multi-scale targets, Atrous Spatial Pyramid Pooling (ASPP) is used. Combined with multi-scale feature fusion, this effectively enhances the expression of semantic information at different scales and improves the segmentation accuracy of the model in complex water environments. The experimental results show that the mean pixel accuracy (mPA) and mean intersection over union (mIoU) of the proposed method on the self-made dataset and on the USVInland unmanned ship dataset are 99.31% and 98.61%, and 98.55% and 99.27%, respectively, significantly better than the existing mainstream models. These results help overcome the background interference caused by water surface reflection and uneven lighting and enable accurate segmentation of the water area for the safe navigation of unmanned vessels, which is of great value for the stable operation of unmanned vessels in complex environments.
(This article belongs to the Section Ocean Engineering)
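The ASPP component referenced above can be sketched as follows (DeepLab-style atrous spatial pyramid pooling); the dilation rates and channel sizes are illustrative assumptions, not the paper's configuration.

```python
# Sketch of an ASPP block: parallel atrous convolutions at several dilation
# rates capture multi-scale context, then a 1x1 conv fuses the pyramid.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]          # multi-scale context
        return self.project(torch.cat(feats, dim=1))   # fuse the pyramid

x = torch.randn(1, 2048, 16, 16)   # e.g. ResNet-50 stage-4 features
print(ASPP(2048, 256)(x).shape)    # torch.Size([1, 256, 16, 16])
```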

15 pages, 1193 KiB  
Article
Enhanced Brain Stroke Lesion Segmentation in MRI Using a 2.5D Transformer Backbone U-Net Model
by Mahsa Karimzadeh, Hadi Seyedarabi, Ata Jodeiri and Reza Afrouzian
Brain Sci. 2025, 15(8), 778; https://doi.org/10.3390/brainsci15080778 - 22 Jul 2025
Abstract
Background/Objectives: Accurate segmentation of brain stroke lesions from MRI images is a critical task in medical image analysis that is essential for timely diagnosis and treatment planning. Methods: This paper presents a novel approach for segmenting brain stroke lesions using a deep learning model based on the U-Net neural network architecture. We enhanced the traditional U-Net by integrating a transformer-based backbone, specifically the Mix Vision Transformer (MiT), and compared its performance against other commonly used backbones such as ResNet and EfficientNet. Additionally, we implemented a 2.5D method, which leverages 2D networks to process three-dimensional data slices, effectively balancing the rich spatial context of 3D methods and the simplicity of 2D methods. The 2.5D approach captures inter-slice dependencies, leading to improved lesion delineation without the computational complexity of full 3D models. Utilizing the 2015 ISLES dataset, which includes MRI images and corresponding lesion masks for 20 patients, we conducted our experiments with 4-fold cross-validation to ensure robustness and reliability. To evaluate the effectiveness of our method, we conducted comparative experiments with several state-of-the-art (SOTA) segmentation models, including CNN-based U-Net, nnU-Net, TransUNet, and SwinUNet. Results: Our proposed model outperformed all competing methods in terms of Dice Coefficient and Intersection over Union (IoU), demonstrating its robustness and superiority. Our extensive experiments demonstrate that the proposed U-Net with the MiT backbone, combined with 2.5D data preparation, achieves superior performance metrics, specifically Dice and IoU scores of 0.8153 ± 0.0101 and 0.7835 ± 0.0079, respectively, outperforming other backbone configurations. Conclusions: These results indicate that the integration of transformer-based backbones and 2.5D techniques offers a significant advancement in the accurate segmentation of brain stroke lesions, paving the way for more reliable and efficient diagnostic tools in clinical settings.
(This article belongs to the Section Neural Engineering, Neuroergonomics and Neurorobotics)
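The 2.5D data preparation the authors describe amounts to feeding each target slice together with its neighbors as channels of a 2D network; a minimal sketch, with the context width as an assumption:

```python
# A minimal sketch of the 2.5D idea: each training sample is the target slice
# plus its immediate neighbors, stacked as channels for a 2D network.
import numpy as np

def make_25d_input(volume: np.ndarray, z: int, context: int = 1) -> np.ndarray:
    """volume: (D, H, W) MRI; returns (2*context+1, H, W) for slice z."""
    idx = np.clip(np.arange(z - context, z + context + 1), 0, volume.shape[0] - 1)
    return volume[idx]  # neighboring slices become input channels

vol = np.random.rand(20, 128, 128).astype(np.float32)
x = make_25d_input(vol, z=0)     # edge slices repeat their nearest neighbor
print(x.shape)                   # (3, 128, 128)
```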

29 pages, 9069 KiB  
Article
Prediction of Temperature Distribution with Deep Learning Approaches for SM1 Flame Configuration
by Gökhan Deveci, Özgün Yücel and Ali Bahadır Olcay
Energies 2025, 18(14), 3783; https://doi.org/10.3390/en18143783 - 17 Jul 2025
Abstract
This study investigates the application of deep learning (DL) techniques for predicting temperature fields in the SM1 swirl-stabilized turbulent non-premixed flame. Two distinct DL approaches were developed using a comprehensive CFD database generated via the steady laminar flamelet model coupled with the SST k-ω turbulence model. The first approach employs a fully connected dense neural network to directly map scalar input parameters (fuel velocity, swirl ratio, and equivalence ratio) to high-resolution temperature contour images. This approach was compared with other deep learning networks, namely ResNet, EfficientNetB0, and Inception V3, to better understand the model's performance; the Inception V3 model and the developed dense model outperformed ResNet and EfficientNetB0, and file sizes and usability were also examined. The second framework employs a U-Net-based convolutional neural network enhanced by an RGB Fusion preprocessing technique, which integrates multiple scalar fields from non-reacting (cold flow) conditions into composite images, significantly improving spatial feature extraction. The training and validation processes for both models were conducted using 80% of the CFD data for training and 20% for testing, which helped assess their ability to generalize to new input conditions. In the second approach, as in the first, experiments were conducted with ResNet, EfficientNetB0, and Inception V3 to evaluate model performance. The well-developed U-Net model stands out with its low error and small file size. The dense network is appropriate for direct parametric analyses, while the image-based U-Net model provides a rapid and scalable option for utilizing the cold flow CFD images. This framework can be further refined in future research to estimate more flow factors and tested against experimental measurements for enhanced applicability.
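The first approach, a dense network mapping three scalars to a temperature contour image, can be sketched as below; the layer widths and output resolution are assumptions, since the paper's exact architecture is not given here.

```python
# Sketch of a fully connected network mapping the three scalar inputs to a
# flattened temperature field; sizes are illustrative assumptions.
import torch
import torch.nn as nn

H, W = 64, 128  # output contour resolution (assumed)

dense_model = nn.Sequential(
    nn.Linear(3, 256),   # fuel velocity, swirl ratio, equivalence ratio
    nn.ReLU(),
    nn.Linear(256, 1024),
    nn.ReLU(),
    nn.Linear(1024, H * W),
    nn.Unflatten(1, (1, H, W)),  # reshape to a temperature "image"
)

params = torch.tensor([[30.0, 0.5, 0.8]])  # hypothetical operating point
print(dense_model(params).shape)           # torch.Size([1, 1, 64, 128])
```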

21 pages, 5148 KiB  
Article
Research on Buckwheat Weed Recognition in Multispectral UAV Images Based on MSU-Net
by Jinlong Wu, Xin Wu and Ronghui Miao
Agriculture 2025, 15(14), 1471; https://doi.org/10.3390/agriculture15141471 - 9 Jul 2025
Abstract
Quickly and accurately identifying weed areas is of great significance for improving weeding efficiency, reducing pesticide residues, protecting the soil ecological environment, and increasing crop yield and quality. To address the low detection efficiency in complex agricultural environments and the inability of existing weed-recognition methods for minor grain crops, based on unmanned aerial vehicles (UAVs), to handle multispectral input, a semantic segmentation model for buckwheat weeds based on MSU-Net (multispectral U-shaped network) was proposed to explore the influence of different band optimizations on recognition accuracy. Five spectral features, red (R), blue (B), green (G), red edge (REdge), and near-infrared (NIR), were collected in August, when the weeds were more prominent. Based on the U-Net image semantic segmentation model, the input module was improved to adaptively adjust the input bands. Because neuron death caused by the original ReLU activation function may lead to misidentification, ReLU was replaced by the Swish function to improve adaptability to complex inputs. Five single-band multispectral datasets and nine groups of multi-band combined data were each input into the improved MSU-Net model to verify the performance of our method. Experimental results show that among the single-band results, the B band performs better than the other bands, with mean pixel accuracy (mPA), mean intersection over union (mIoU), Dice, and F1 values of 0.75, 0.61, 0.87, and 0.80, respectively. Among the multi-band results, the R+G+B+NIR combination performs better than the other combinations, with mPA, mIoU, Dice, and F1 values of 0.76, 0.65, 0.85, and 0.78, respectively. Compared with U-Net, DenseASPP, PSPNet, and DeepLabv3, our method achieved a preferable balance between model accuracy and resource consumption. These results indicate that our method can adapt to multispectral input bands and achieve good results in weed segmentation tasks, and it can also serve as a reference for multispectral data analysis and semantic segmentation in the field of minor grain crops.
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)
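A minimal sketch of the two input-side changes described, an input stem sized to the chosen band combination and Swish (SiLU) in place of ReLU; the channel counts are assumptions.

```python
# Sketch of an adaptive input stem: the first U-Net block is built with
# however many spectral bands the experiment supplies, and uses Swish.
import torch
import torch.nn as nn

def input_stem(n_bands: int, out_ch: int = 64) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(n_bands, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.SiLU(),   # Swish: x * sigmoid(x), avoids ReLU's dead neurons
    )

rgbn = torch.randn(2, 4, 256, 256)   # R+G+B+NIR, the best combination above
print(input_stem(4)(rgbn).shape)     # torch.Size([2, 64, 256, 256])
```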

30 pages, 4399 KiB  
Article
Confident Learning-Based Label Correction for Retinal Image Segmentation
by Tanatorn Pethmunee, Supaporn Kansomkeat, Patama Bhurayanontachai and Sathit Intajag
Diagnostics 2025, 15(14), 1735; https://doi.org/10.3390/diagnostics15141735 - 8 Jul 2025
Abstract
Background/Objectives: In automatic medical image analysis, particularly for diabetic retinopathy, the accuracy of labeled data is crucial, as label noise can significantly complicate the analysis and lead to diagnostic errors. To tackle the issue of label noise in retinal image segmentation, an innovative label correction framework is introduced that combines Confident Learning (CL) with a human-in-the-loop re-annotation process to meticulously detect and rectify pixel-level labeling inaccuracies. Methods: Two CL-oriented strategies are assessed: Confident Joint Analysis (CJA) employing DeepLabv3+ with a ResNet-50 architecture, and Prune by Noise Rate (PBNR) utilizing ResNet-18. These methodologies are implemented on four publicly available retinal image datasets: HRF, STARE, DRIVE, and CHASE_DB1. After the models have been trained on the original labeled datasets, label noise is quantified, and amendments are executed on suspected misclassified pixels prior to the assessment of model performance. Results: The reduction in label noise yielded consistent advancements in accuracy, Intersection over Union (IoU), and weighted IoU across all the datasets. The segmentation of tiny structures, such as the fovea, demonstrated a significant enhancement following refinement. The Mean Boundary F1 Score (MeanBFScore) remained invariant, signifying the maintenance of boundary integrity. CJA and PBNR demonstrated strengths under different conditions, producing variations in performance that were dependent on the noise level and dataset characteristics. CL-based label correction techniques, when combined with human refinement, could significantly enhance segmentation accuracy and evaluation robustness, achieving Accuracy, IoU, and MeanBFScore values of 0.9156, 0.8037, and 0.9856, respectively, with regard to the original ground truth, reflecting increases of 4.05%, 9.95%, and 1.28%, respectively. Conclusions: This methodology represents a feasible and scalable solution to the challenge of label noise in medical image analysis, holding particular significance for real-world clinical applications.
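Confident Learning of this kind is available in the cleanlab package; a minimal pixel-level sketch follows, with random arrays standing in for real labels and softmax outputs (the paper's CJA and PBNR variants are not reproduced here).

```python
# A minimal sketch of CL-style noise detection at the pixel level, assuming
# the cleanlab package; data here is synthetic, for illustration only.
import numpy as np
from cleanlab.filter import find_label_issues

# Flatten per-pixel labels and model softmax outputs for one image.
labels = np.random.randint(0, 3, size=512 * 512)              # given (noisy) labels
pred_probs = np.random.dirichlet(np.ones(3), size=512 * 512)  # (N, classes)

suspect = find_label_issues(labels, pred_probs,
                            return_indices_ranked_by="self_confidence")
# 'suspect' indexes pixels whose labels disagree most with the model;
# these are the candidates forwarded to human re-annotation.
print(len(suspect))
```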

19 pages, 4052 KiB  
Article
RDM-YOLO: A Lightweight Multi-Scale Model for Real-Time Behavior Recognition of Fourth Instar Silkworms in Sericulture
by Jinye Gao, Jun Sun, Xiaohong Wu and Chunxia Dai
Agriculture 2025, 15(13), 1450; https://doi.org/10.3390/agriculture15131450 - 5 Jul 2025
Abstract
Accurate behavioral monitoring of silkworms (Bombyx mori) during the fourth instar development is crucial for enhancing productivity and welfare in sericulture operations. Current manual observation paradigms face critical limitations in temporal resolution, inter-observer variability, and scalability. This study presents RDM-YOLO, a computationally efficient deep learning framework derived from YOLOv5s architecture, specifically designed for the automated detection of three essential behaviors (resting, wriggling, and eating) in fourth instar silkworms. Methodologically, Res2Net blocks are first integrated into the backbone network to enable hierarchical residual connections, expanding receptive fields and improving multi-scale feature representation. Second, standard convolutional layers are replaced with distribution shifting convolution (DSConv), leveraging dynamic sparsity and quantization mechanisms to reduce computational complexity. Additionally, the minimum point distance intersection over union (MPDIoU) loss function is proposed to enhance bounding box regression efficiency, mitigating challenges posed by overlapping targets and positional deviations. Experimental results demonstrate that RDM-YOLO achieves 99% mAP@0.5 accuracy and 150 FPS inference speed on the datasets, significantly outperforming baseline YOLOv5s while reducing the model parameters by 24%. Specifically designed for deployment on resource-constrained devices, the model ensures real-time monitoring capabilities in practical sericulture environments.
(This article belongs to the Section Digital Agriculture)
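The MPDIoU loss mentioned above penalizes IoU by the squared distances between predicted and ground-truth corner points, normalized by the image diagonal; a minimal sketch, where the box format and mean reduction are assumptions:

```python
# Sketch of an MPDIoU bounding-box loss; boxes are (x1, y1, x2, y2).
import torch

def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    # Plain IoU of the two boxes.
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Squared top-left and bottom-right corner distances, diagonal-normalized.
    diag2 = img_w ** 2 + img_h ** 2
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    return (1 - (iou - d1 / diag2 - d2 / diag2)).mean()

p = torch.tensor([[10., 10., 50., 50.]])
t = torch.tensor([[12., 12., 48., 52.]])
print(mpdiou_loss(p, t, img_w=640, img_h=640))
```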

31 pages, 6788 KiB  
Article
A Novel Dual-Modal Deep Learning Network for Soil Salinization Mapping in the Keriya Oasis Using GF-3 and Sentinel-2 Imagery
by Ilyas Nurmemet, Yang Xiang, Aihepa Aihaiti, Yu Qin, Yilizhati Aili, Hengrui Tang and Ling Li
Agriculture 2025, 15(13), 1376; https://doi.org/10.3390/agriculture15131376 - 27 Jun 2025
Abstract
Soil salinization poses a significant threat to agricultural productivity, food security, and ecological sustainability in arid and semi-arid regions. Effective and timely mapping of different degrees of salinized soils is essential for sustainable land management and ecological restoration. Although deep learning (DL) methods have been widely employed for soil salinization extraction from remote sensing (RS) data, the integration of multi-source RS data with DL methods remains challenging due to issues such as limited data availability, speckle noise, geometric distortions, and suboptimal data fusion strategies. This study focuses on the Keriya Oasis, Xinjiang, China, utilizing RS data, including Sentinel-2 multispectral and GF-3 full-polarimetric SAR (PolSAR) images, to conduct soil salinization classification. We propose a Dual-Modal deep learning network for Soil Salinization named DMSSNet, which aims to improve the mapping accuracy of salinized soils by effectively fusing spectral and polarimetric features. DMSSNet incorporates self-attention mechanisms and a Convolutional Block Attention Module (CBAM) within a hierarchical fusion framework, enabling the model to capture both intra-modal and cross-modal dependencies and to improve spatial feature representation. Polarimetric decomposition features and spectral indices are jointly exploited to characterize diverse land surface conditions. Comprehensive field surveys and expert interpretation were employed to construct a high-quality training and validation dataset. Experimental results indicate that DMSSNet achieves an overall accuracy of 92.94%, a Kappa coefficient of 79.12%, and a macro F1-score of 86.52%, outperforming conventional DL models (ResUNet, SegNet, DeepLabv3+). The results confirm the superiority of attention-guided dual-branch fusion networks for distinguishing varying degrees of soil salinization across heterogeneous landscapes and highlight the value of integrating Sentinel-2 optical and GF-3 PolSAR data for complex land surface classification tasks.
(This article belongs to the Section Digital Agriculture)
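A CBAM block of the kind DMSSNet embeds can be sketched as follows (channel attention followed by spatial attention); the reduction ratio and kernel size are the usual defaults, assumed rather than taken from the paper.

```python
# Sketch of a CBAM block: a channel gate from pooled statistics, then a
# spatial gate from a 7x7 convolution over pooled feature maps.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // reduction), nn.ReLU(),
                                 nn.Linear(ch // reduction, ch))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)       # channel gate
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))              # spatial gate

print(CBAM(64)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```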

13 pages, 1014 KiB  
Article
Discrete Wavelet Transform-Based Data Fusion with ResUNet Model for Liver Tumor Segmentation
by Ümran Şeker Ertuğrul and Halife Kodaz
Electronics 2025, 14(13), 2589; https://doi.org/10.3390/electronics14132589 - 27 Jun 2025
Abstract
Liver tumors negatively affect vital functions such as digestion and nutrient storage, significantly reducing patients' quality of life. Therefore, early detection and accurate treatment planning are of great importance. This study aims to support physicians by automatically identifying the type and location of tumors, enabling rapid diagnosis and treatment. The segmentation process was carried out using deep learning methods based on artificial intelligence, particularly the U-Net architecture, which is designed for biomedical imaging. U-Net was modified by adding residual blocks, resulting in a deeper architecture called ResUNet. Due to the limited availability of medical data, both normal data fusion and discrete wavelet transform (DWT) methods were applied during the data preprocessing phase. A total of 131 liver tumor images, resized to 120 × 120 pixels, were analyzed. The DWT-based fusion method achieved more successful results, with a Dice coefficient of 94.45%. This study demonstrates the effectiveness of artificial intelligence-supported approaches in liver tumor segmentation and suggests that such applications will become more widely used in the medical field in the future.
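A minimal sketch of DWT preprocessing, assuming the PyWavelets package; how the paper actually fuses the sub-bands is not specified above, so stacking them as input channels is an illustrative choice.

```python
# A minimal sketch of 2D DWT decomposition, assuming PyWavelets (pywt).
import numpy as np
import pywt

ct_slice = np.random.rand(120, 120).astype(np.float32)  # resized input, as above

cA, (cH, cV, cD) = pywt.dwt2(ct_slice, "haar")  # approximation + detail bands
# One simple fusion: stack the four 60x60 sub-bands as channels so a
# ResUNet-style network sees both coarse structure and edge detail.
fused = np.stack([cA, cH, cV, cD], axis=0)
print(fused.shape)  # (4, 60, 60)
```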

31 pages, 4585 KiB  
Article
CAAF-ResUNet: Adaptive Attention Fusion with Boundary-Aware Loss for Lung Nodule Segmentation
by Thang Quoc Pham, Thai Hoang Le, Khai Dinh Lai, Dat Quoc Ngo, Tan Van Pham, Quang Hong Hua, Khang Quang Le, Huyen Duy Mai Le and Tuyen Ngoc Lam Nguyen
Medicina 2025, 61(7), 1126; https://doi.org/10.3390/medicina61071126 - 22 Jun 2025
Abstract
Background and Objectives: The accurate segmentation of pulmonary nodules in computed tomography (CT) remains a critical yet challenging task due to variations in nodule size, shape, and boundary ambiguity. This study proposes CAAF-ResUNet (Context-Aware Adaptive Attention Fusion ResUNet), a novel deep learning model designed to address these challenges through adaptive feature fusion and edge-sensitive learning. Materials and Methods: Central to our approach is the Adaptive Attention Controller (AAC), which dynamically adjusts the contribution of channel and position attention based on contextual features in each input. To further enhance boundary localization, we incorporate three complementary boundary-aware loss functions: Sobel, Laplacian, and Hausdorff. Results: An extensive evaluation of two benchmark datasets demonstrates the superiority of the proposed model, achieving Dice scores of 90.88% on LUNA16 and 85.92% on LIDC-IDRI, both exceeding prior state-of-the-art methods. A clinical validation of a dataset comprising 804 CT slices from 35 patients at the University Medical Center of Ho Chi Minh City confirmed the model's practical reliability, yielding a Dice score of 95.34% and a notably low Miss Rate of 4.60% under the Hausdorff loss configuration. Conclusions: These results establish CAAF-ResUNet as a robust and clinically viable solution for pulmonary nodule segmentation, offering enhanced boundary precision and minimized false negatives, two critical properties in early-stage lung cancer diagnosis and radiological decision support.
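One of the three boundary-aware losses named, the Sobel variant, can be sketched as an L1 comparison of edge maps; the exact weighting used in CAAF-ResUNet is an assumption left out here.

```python
# Sketch of a Sobel-style boundary loss: edge maps of the predicted
# probability mask and the ground truth are compared with L1.
import torch
import torch.nn.functional as F

def sobel_edges(mask):
    kx = torch.tensor([[[[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]]])
    ky = kx.transpose(2, 3)                       # Sobel y is the transpose
    gx = F.conv2d(mask, kx, padding=1)
    gy = F.conv2d(mask, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)   # gradient magnitude

def boundary_loss(pred_prob, target):
    """pred_prob, target: (B, 1, H, W); penalizes boundary disagreement."""
    return F.l1_loss(sobel_edges(pred_prob), sobel_edges(target))

p = torch.rand(2, 1, 64, 64)
t = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(boundary_loss(p, t))
```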

24 pages, 7924 KiB  
Article
Optimizing Car Collision Detection Using Large Dashcam-Based Datasets: A Comparative Study of Pre-Trained Models and Hyperparameter Configurations
by Muhammad Shahid, Martin Gregurić, Amirhossein Hassani and Marko Ševrović
Appl. Sci. 2025, 15(13), 7001; https://doi.org/10.3390/app15137001 - 21 Jun 2025
Abstract
The automatic identification of traffic collisions is an emerging topic in modern traffic surveillance systems. The increasing number of surveillance cameras at urban intersections connected to traffic surveillance systems has created new opportunities for leveraging computer vision techniques for automatic collision detection. This study investigates the effectiveness of transfer learning utilizing pre-trained deep learning models for collision detection through dashcam images. We evaluated several state-of-the-art (SOTA) image classification models and fine-tuned them using different hyperparameter combinations to test their performance on the car collision detection problem. Our methodology systematically investigates the influence of optimizers, loss functions, schedulers, and learning rates on model generalization. A comprehensive analysis is conducted using seven performance metrics to assess classification performance. Experiments on a large dashcam-based image dataset show that ResNet50, optimized with AdamW, a learning rate of 0.0001, a CosineAnnealingLR scheduler, and Focal Loss, emerged as the top performer, achieving an accuracy of 0.9782, an F1-score of 0.9617, and an IoU of 0.9262, indicating a strong ability to reduce false negatives.
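The winning configuration reported above translates almost directly into PyTorch; T_max for the cosine schedule and the focal gamma are assumptions, and the focal loss here is a generic implementation rather than the authors' code.

```python
# Sketch of the reported best setup: ResNet50 fine-tuning with AdamW,
# lr=1e-4, CosineAnnealingLR, and a focal loss for the binary task.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)   # collision / no collision

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

def focal_loss(logits, targets, gamma=2.0):
    ce = nn.functional.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)                  # probability of the true class
    return ((1 - pt) ** gamma * ce).mean()
```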

18 pages, 3051 KiB  
Article
Segmentation and Fractional Coverage Estimation of Soil, Illuminated Vegetation, and Shaded Vegetation in Corn Canopy Images Using CCSNet and UAV Remote Sensing
by Shanxin Zhang, Jibo Yue, Xiaoyan Wang, Haikuan Feng, Yang Liu and Meiyan Shu
Agriculture 2025, 15(12), 1309; https://doi.org/10.3390/agriculture15121309 - 18 Jun 2025
Abstract
The accurate estimation of corn canopy structure and light conditions is essential for effective crop management and informed variety selection. This study introduces CCSNet, a deep learning-based semantic segmentation model specifically developed to extract fractional coverages of soil, illuminated vegetation, and shaded vegetation from high-resolution corn canopy images acquired by UAVs. CCSNet improves segmentation accuracy by employing multi-level feature fusion and pyramid pooling to effectively capture multi-scale contextual information. The model was evaluated using Pixel Accuracy (PA), mean Intersection over Union (mIoU), and Recall, and was benchmarked against U-Net, PSPNet, and UNetFormer. On the test set, CCSNet utilizing a ResNet50 backbone achieved the highest accuracy, with an mIoU of 86.42% and a PA of 93.58%. In addition, its estimation of fractional coverage for key canopy components yielded a root mean squared error (RMSE) ranging from 3.16% to 5.02%. Compared to lightweight backbones (e.g., MobileNetV2), CCSNet exhibited superior generalization performance when integrated with deeper backbones. These results highlight CCSNet's capability to deliver high-precision segmentation and reliable phenotypic measurements. This provides valuable insights for breeders to evaluate light-use efficiency and facilitates intelligent decision-making in precision agriculture.
(This article belongs to the Special Issue Research Advances in Perception for Agricultural Robots)
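Once a CCSNet-style class map is available, the fractional coverages are simple pixel fractions; a minimal sketch, with hypothetical class ids:

```python
# Sketch of turning a per-pixel class map into fractional coverages of
# soil, illuminated vegetation, and shaded vegetation.
import numpy as np

SOIL, LIT_VEG, SHADED_VEG = 0, 1, 2   # hypothetical label ids

def fractional_coverage(class_map: np.ndarray) -> dict:
    total = class_map.size
    return {name: float((class_map == cid).sum()) / total
            for name, cid in [("soil", SOIL), ("illuminated", LIT_VEG),
                              ("shaded", SHADED_VEG)]}

pred = np.random.randint(0, 3, size=(512, 512))
print(fractional_coverage(pred))  # fractions summing to 1.0
```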

15 pages, 1943 KiB  
Article
LSE-Net: Integrated Segmentation and Ensemble Deep Learning for Enhanced Lung Disease Classification
by Bhavan Kumar Basavaraju and Mohammad Masum
Electronics 2025, 14(12), 2407; https://doi.org/10.3390/electronics14122407 - 12 Jun 2025
Abstract
Accurate classification of lung diseases is vital for timely diagnosis and effective treatment of respiratory conditions such as COPD, pneumonia, asthma, and lung cancer. Traditional diagnostic approaches often suffer from limited consistency and elevated false-positive rates, highlighting the demand for more dependable automated systems. To address this challenge, we introduce LSE-Net, an end-to-end deep learning framework that combines precise lung segmentation using an optimized U-Net++ with robust classification powered by an ensemble of DenseNet121 and ResNet50. Leveraging structured hyperparameter tuning and patient-level evaluation, LSE-Net achieves 92.7% accuracy, 96.7% recall, and an F1-score of 94.0%, along with improved segmentation performance (DSC = 0.59 ± 0.01, IoU = 0.523 ± 0.07). These results demonstrate LSE-Net's ability to reduce diagnostic uncertainty, enhance classification precision, and provide a practical, high-performing solution for real-world clinical deployment in lung disease assessment.
(This article belongs to the Section Artificial Intelligence)
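The ensemble step can be sketched as averaged softmax outputs from the two torchvision backbones; equal weighting and the class count are assumptions, not details from the paper.

```python
# Sketch of a two-model classification ensemble: average the softmax
# probabilities of DenseNet121 and ResNet50, then take the argmax.
import torch
from torchvision.models import densenet121, resnet50

n_classes = 4  # e.g. COPD, pneumonia, asthma, lung cancer (illustrative)
d = densenet121(num_classes=n_classes).eval()
r = resnet50(num_classes=n_classes).eval()

def ensemble_predict(x):
    with torch.no_grad():
        p = torch.softmax(d(x), dim=1) + torch.softmax(r(x), dim=1)
    return (p / 2).argmax(dim=1)

print(ensemble_predict(torch.randn(1, 3, 224, 224)))
```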

17 pages, 3120 KiB  
Article
LAAVOS: A DeAOT-Based Approach for Medaka Larval Ventricular Video Segmentation
by Kai Rao, Minghao Wang and Shutan Xu
Appl. Sci. 2025, 15(12), 6537; https://doi.org/10.3390/app15126537 - 10 Jun 2025
Abstract
Accurate segmentation of the ventricular region in embryonic heart videos of medaka fish (Oryzias latipes) holds significant scientific value for research on heart development mechanisms. However, existing medaka ventricular datasets are overly simplistic and fail to meet practical application requirements. Moreover, the video frames contain multiple complex interfering factors, including optical interference from the filming environment, dynamic color changes caused by blood flow, significant diversity in ventricular scales, image blurring in certain video frames, high similarity in organ structures, and indistinct boundaries between the ventricles and atria. These challenges mean existing methods still face notable technical difficulties in medaka embryonic ventricular segmentation tasks. To address them, this study first constructs a medaka embryonic ventricular video dataset containing 4200 frames with pixel-level annotations. Building upon this, we propose a semi-supervised video segmentation model based on the hierarchical propagation feature decoupling framework (DeAOT) and innovatively design an architecture that combines the LA-ResNet encoder with the AFPViS decoder, significantly improving the accuracy of medaka ventricular segmentation. Experimental results demonstrate that, compared to the traditional U-Net model, our method achieves a 13.48% improvement in the mean Intersection over Union (mIoU) metric. Additionally, compared to the state-of-the-art DeAOT method, it achieves a notable 4.83% enhancement in the comprehensive evaluation metric Jaccard and F-measure (J&F), providing reliable technical support for research on embryonic heart development.
(This article belongs to the Special Issue Pattern Recognition in Video Processing)
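The J term of the reported J&F metric is the per-frame Jaccard (IoU) averaged over the video; a minimal sketch follows (the boundary F-measure half is omitted for brevity):

```python
# Sketch of the region-similarity term J: mean Jaccard index over the
# binary ventricle masks of a video clip.
import numpy as np

def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

frames_pred = np.random.rand(10, 128, 128) > 0.5   # predicted masks
frames_gt = np.random.rand(10, 128, 128) > 0.5     # ground-truth masks
print(np.mean([jaccard(p, g) for p, g in zip(frames_pred, frames_gt)]))
```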
