Search Results (545)

Search Parameters:
Keywords = pre-visual detection

23 pages, 5900 KB  
Article
Hybrid Attention Mechanism Combined with U-Net for Extracting Vascular Branching Points in Intracavitary Images
by Kaiyang Xu, Haibin Wu, Liang Yu and Xin He
Electronics 2026, 15(2), 322; https://doi.org/10.3390/electronics15020322 - 11 Jan 2026
Abstract
To address the application requirements of Visual Simultaneous Localization and Mapping (VSLAM) in intracavitary environments and the scarcity of gold-standard datasets for deep learning methods, this study proposes a hybrid attention mechanism combined with U-Net for vascular branch point extraction in endoluminal images (SuperVessel). The network is initialized via transfer learning with pre-trained SuperRetina model parameters and integrated with a vascular feature detection and matching method based on dual-branch fusion and structure enhancement, generating a pseudo-gold-standard vascular branch point dataset. The framework employs a dual-decoder architecture, incorporates a dynamic up-sampling module (CBAM-Dysample) to refine local vessel features through hybrid attention mechanisms, designs a Dice-Det loss function weighted by branching features to prioritize vessel junctions, and introduces a dynamically weighted Triplet-Des loss function optimized for descriptor discrimination. Experiments on the in vivo test set demonstrate that the proposed method achieves an average Area Under Curve (AUC) of 0.760, with mean feature points, accuracy, and repeatability scores of 42,795, 0.5294, and 0.46, respectively. Compared to SuperRetina, the method maintains matching stability while exhibiting superior repeatability, feature point density, and robustness in low-texture/deformation scenarios. Ablation studies confirm the CBAM-Dysample module's efficacy in enhancing feature expression and convergence speed, offering a robust solution for intracavitary SLAM systems.
(This article belongs to the Section Computer Science & Engineering)
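As an editorial illustration of the hybrid attention this abstract describes, here is a minimal PyTorch sketch of a CBAM-style block (channel attention followed by spatial attention); the class names and dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze global context per channel, then gate the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):                      # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))     # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))      # global max pooling branch
        return x * torch.sigmoid(avg + mx)[..., None, None]

class SpatialAttention(nn.Module):
    """Gate each spatial location using channel-pooled statistics."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```

In the paper's CBAM-Dysample module this kind of gate would sit alongside a dynamic up-sampler; the sketch shows only the attention part.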

25 pages, 14552 KB  
Article
Transfer-Learning-Driven Large-Scale CNN Benchmarking with Explainable AI for Image-Based Dust Detection on Solar Panels
by Hafeez Anwar
Information 2026, 17(1), 52; https://doi.org/10.3390/info17010052 - 6 Jan 2026
Abstract
Solar panel power plants are typically established in regions with maximum solar irradiation, yet these conditions result in heavy dust accumulation on the panels, causing significant performance degradation and reduced power output. The paper addresses this issue via an image-based dust detection solution powered by deep learning, particularly convolutional neural networks (CNNs). Most such solutions use state-of-the-art CNNs either as backbones/feature extractors or propose custom models built upon them. Given such reliance, future research requires a comprehensive benchmarking of CNN models to identify the ones that achieve superior performance in classifying clean vs. dusty solar panels with respect to both accuracy and efficiency. To this end, we evaluate 100 CNN models belonging to 16 families for image-based dust detection on solar panels, where the pre-trained models of these CNN architectures are used to encode solar panel images. Upon these image encodings, we then train and test a linear support vector machine (SVM) to determine the best-performing models in terms of classification accuracy and training time. The use of such a simple classifier ensures a fair comparison where the encodings do not benefit from the classifier itself and their performance reflects each CNN's ability to capture the underlying image features. Experiments were conducted on a publicly available dust detection dataset, using stratified shuffle-split with 70–30, 80–20, and 90–10 splits, repeated 10 times. convnext_xxlarge and resnetv2_152 achieved the best classification rates of above 90%, with resnetv2_152 offering superior efficiency, which is also supported by feature analyses such as t-SNE and UMAP and explainable-AI (XAI) visualizations such as LIME. To prove their generalization capability, we tested the image encodings of resnetv2_152 on an unseen real-world image dataset captured via a drone camera, which achieved a remarkable accuracy of 96%. Consequently, our findings guide the selection of optimal CNN backbones for future image-based dust detection systems.
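The benchmarking protocol outlined here (frozen CNN encodings scored by a linear SVM over repeated stratified splits) can be sketched as follows; the ResNet-50 backbone is a stand-in for the 100 evaluated models, and data-loading details are omitted.

```python
import numpy as np
import torch
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.svm import LinearSVC
from sklearn.model_selection import StratifiedShuffleSplit

backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()        # drop the classifier head, keep encodings
backbone.eval()

@torch.no_grad()
def encode(images):                      # images: (N, 3, H, W), already preprocessed
    return backbone(images).numpy()

def benchmark(X, y, test_size=0.3, repeats=10):
    """Mean/std test accuracy of a linear SVM over repeated stratified splits."""
    sss = StratifiedShuffleSplit(n_splits=repeats, test_size=test_size)
    scores = []
    for train_idx, test_idx in sss.split(X, y):
        clf = LinearSVC().fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))
```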

25 pages, 4574 KB  
Article
Clustering-Based Approach for Enhanced Characterization of Anomalies in Traffic Flows
by Mohammed Khasawneh and Anjali Awasthi
Future Transp. 2026, 6(1), 11; https://doi.org/10.3390/futuretransp6010011 - 4 Jan 2026
Abstract
Traffic flow anomalies represent significant deviations from normal traffic behavior and disrupt the smooth operation of transportation systems. These may appear as unusually high or low traffic volumes compared to historical trends. Unexpectedly high volume can lead to congestion exceeding usual capacity, while unusually low volume might indicate incidents like road closures or malfunctioning traffic signals. Identifying and understanding both types of anomalies is crucial for effective traffic management. This paper presents a clustering-based approach for enhanced characterization of anomalies in traffic flows. Anomalies in traffic patterns are determined using three anomaly detection techniques: Elliptic Envelope, Isolation Forest, and Local Outlier Factor. These anomalies were newly detected in this work on the Montréal dataset after preprocessing, rather than directly reused from earlier studies. These methods were applied to a dataset that had been pre-processed using windowing techniques with different configuration settings to enhance the detection process. Then, to leverage the detected anomalies, we utilized clustering algorithms, specifically k-means and hierarchical clustering, to segment these anomalies. Each clustering algorithm was used to determine the optimal number of clusters. Subsequently, we characterized these clusters through detailed visualization and mapped them according to their unique characteristics. This approach not only identifies traffic anomalies effectively but also provides a comprehensive understanding of their spatial and temporal distributions, which is crucial for traffic management and urban planning.
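A compact sketch of the two-stage pipeline, assuming scikit-learn and already-windowed features; the contamination rate and cluster count are illustrative placeholders, not the paper's tuned values.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.cluster import KMeans

def detect_anomalies(X, contamination=0.05):
    """Run the three detectors; fit_predict marks anomalies with -1."""
    detectors = {
        "elliptic_envelope": EllipticEnvelope(contamination=contamination),
        "isolation_forest": IsolationForest(contamination=contamination),
        "local_outlier_factor": LocalOutlierFactor(contamination=contamination),
    }
    return {name: det.fit_predict(X) == -1 for name, det in detectors.items()}

def cluster_anomalies(X, anomaly_mask, n_clusters=4):
    """Segment the flagged windows so each cluster can be characterized."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X[anomaly_mask])
```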

12 pages, 465 KB  
Article
Using QR Codes for Payment Card Fraud Detection
by Rachid Chelouah and Prince Nwaekwu
Information 2026, 17(1), 39; https://doi.org/10.3390/info17010039 - 4 Jan 2026
Abstract
Debit and credit card payments have become the preferred method of payment for consumers, replacing paper checks and cash. However, this shift has also led to an increase in concerns regarding identity theft and payment security. To address these challenges, it is crucial to develop an effective, secure, and reliable payment system. This research presents a comprehensive study on payment card fraud detection using deep learning techniques. The introduction highlights the significance of a strong financial system supported by a quick and secure payment system. It emphasizes the need for advanced methods to detect fraudulent activities in card transactions. The proposed methodology focuses on the conversion of a comma-separated values (CSV) dataset into quick response (QR) code images, enabling the application of deep neural networks and transfer learning. This representation allows leveraging pre-trained image-based architectures and provides a layer of privacy by encoding numeric transaction attributes into visual patterns. The feature extraction process involves the use of a convolutional neural network, specifically a residual network architecture. The results obtained through the under-sampling dataset balancing method revealed promising performance in terms of precision, accuracy, recall, and F1 score for traditional models such as K-nearest neighbors (KNN), Decision Tree, Random Forest, AdaBoost, Bagging, and Gaussian Naive Bayes. Furthermore, the proposed deep neural network model achieved high precision, indicating its effectiveness in detecting card fraud. The model also achieved high accuracy, recall, and F1 score, showcasing its superior performance compared to traditional machine learning models. In summary, this research contributes to the field of payment card fraud detection by leveraging deep learning techniques. The proposed methodology offers a sophisticated approach to detecting fraudulent activities in card payment systems, addressing the growing concerns of identity theft and payment security. By deploying the trained model in an Android application, real-time fraud detection becomes possible, further enhancing the security of card transactions. The findings of this study provide insights and avenues for future advancements in the field of payment card fraud detection.
(This article belongs to the Section Information Security and Privacy)
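The CSV-to-QR conversion step might look like the following sketch using the `qrcode` package; the payload format (comma-joined attributes) and file naming are assumptions for illustration, not the paper's exact encoding.

```python
import csv
import os
import qrcode

def rows_to_qr_images(csv_path, out_dir="qr_images"):
    """Encode each transaction row as a QR image for an image-based CNN."""
    os.makedirs(out_dir, exist_ok=True)
    with open(csv_path, newline="") as f:
        for i, row in enumerate(csv.reader(f)):
            payload = ",".join(row)            # numeric attributes as text
            img = qrcode.make(payload)         # returns a PIL image of the code
            img.save(os.path.join(out_dir, f"txn_{i:06d}.png"))
```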

10 pages, 2891 KB  
Case Report
Posterior Reversible Encephalopathy Syndrome as an Under-Recognized Neurological Complication of Multisystem Inflammatory Syndrome in Children: A Case from Indonesia
by Ido Narpati Bramantya, Ratna Sutanto, Callistus Bruce Henfry Sulay and Gilbert Sterling Octavius
COVID 2026, 6(1), 8; https://doi.org/10.3390/covid6010008 - 31 Dec 2025
Abstract
Posterior Reversible Encephalopathy Syndrome (PRES) is a rare but potentially reversible neurological manifestation associated with Multisystem Inflammatory Syndrome in Children (MIS-C). We report an eight-year-old boy who developed PRES secondary to MIS-C following asymptomatic SARS-CoV-2 exposure. The patient presented with fever, seizures, decreased consciousness, and visual disturbances. MRI revealed characteristic bilateral parieto-occipital and posterior temporal cortical–subcortical hyperintensities, while CT scans were normal. The patient achieved full neurological recovery with corticosteroid therapy, blood pressure control, and supportive management. This case underscores the importance of early MRI in detecting PRES when clinical or CT findings are inconclusive, emphasizing the need for heightened awareness among pediatric clinicians to prevent irreversible neurological sequelae.
(This article belongs to the Section COVID Clinical Manifestations and Management)

33 pages, 40054 KB  
Article
MVDCNN: A Multi-View Deep Convolutional Network with Feature Fusion for Robust Sonar Image Target Recognition
by Yue Fan, Cheng Peng, Peng Zhang, Zhisheng Zhang, Guoping Zhang and Jinsong Tang
Remote Sens. 2026, 18(1), 76; https://doi.org/10.3390/rs18010076 - 25 Dec 2025
Abstract
Automatic Target Recognition (ATR) in single-view sonar imagery is severely hampered by geometric distortions, acoustic shadows, and incomplete target information due to occlusions and the slant-range imaging geometry, which frequently give rise to misclassification and hinder practical underwater detection applications. To address these critical limitations, this paper proposes a Multi-View Deep Convolutional Neural Network (MVDCNN) based on feature-level fusion for robust sonar image target recognition. The MVDCNN adopts a highly modular and extensible architecture consisting of four interconnected modules: an input reshaping module that adapts multi-view images to match the input format of pre-trained backbone networks via dimension merging and channel replication; a shared-weight feature extraction module that leverages Convolutional Neural Network (CNN) or Transformer backbones (e.g., ResNet, Swin Transformer, Vision Transformer) to extract discriminative features from each view, ensuring parameter efficiency and cross-view feature consistency; a feature fusion module that aggregates complementary features (e.g., target texture and shape) across views using max-pooling to retain the most salient characteristics and suppress noisy or occluded view interference; and a lightweight classification module that maps the fused feature representations to target categories. Additionally, to mitigate the data scarcity bottleneck in sonar ATR, we design a multi-view sample augmentation method based on sonar imaging geometric principles: this method systematically combines single-view samples of the same target via the combination formula and screens valid samples within a predefined azimuth range, constructing high-quality multi-view training datasets without relying on complex generative models or massive initial labeled data. Comprehensive evaluations on the Custom Side-Scan Sonar Image Dataset (CSSID) and Nankai Sonar Image Dataset (NKSID) demonstrate the superiority of our framework over single-view baselines. Specifically, the two-view MVDCNN achieves average classification accuracies of 94.72% (CSSID) and 97.24% (NKSID), with relative improvements of 7.93% and 5.05%, respectively; the three-view MVDCNN further boosts the average accuracies to 96.60% and 98.28%. Moreover, MVDCNN substantially elevates the precision and recall of small-sample categories (e.g., Fishing net and Small propeller in NKSID), effectively alleviating the class imbalance challenge. Mechanism validation via t-Distributed Stochastic Neighbor Embedding (t-SNE) feature visualization and prediction confidence distribution analysis confirms that MVDCNN yields more separable feature representations and more confident category predictions, with stronger intra-class compactness and inter-class discrimination in the feature space. The proposed MVDCNN framework provides a robust and interpretable solution for advancing sonar ATR and offers a technical paradigm for multi-view acoustic image understanding in complex underwater environments.
(This article belongs to the Special Issue Underwater Remote Sensing: Status, New Challenges and Opportunities)
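The shared-weight extraction and max-pooling fusion can be sketched in PyTorch as below; ResNet-18 stands in for the interchangeable backbones, and the flattening mirrors the input module's view merging. This is a schematic reading of the architecture, not the published code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiViewNet(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = resnet18()              # shared weights across all views
        self.backbone.fc = nn.Identity()        # keep 512-d features, drop head
        self.head = nn.Linear(512, num_classes) # lightweight classifier

    def forward(self, views):                   # views: (B, V, 3, H, W)
        b, v = views.shape[:2]
        feats = self.backbone(views.flatten(0, 1))  # one pass per view: (B*V, 512)
        feats = feats.view(b, v, -1)
        fused, _ = feats.max(dim=1)             # element-wise max across views
        return self.head(fused)
```

The max over the view axis is what lets a clean view dominate a noisy or occluded one, which matches the abstract's rationale for max-pooling fusion.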

30 pages, 3641 KB  
Article
Modified EfficientNet-B0 Architecture Optimized with Quantum-Behaved Algorithm for Skin Cancer Lesion Assessment
by Abdul Rehman Altaf, Abdullah Altaf and Faizan Ur Rehman
Diagnostics 2025, 15(24), 3245; https://doi.org/10.3390/diagnostics15243245 - 18 Dec 2025
Abstract
Background/Objectives: Skin cancer is one of the most common diseases in the world; early and accurate detection is associated with a survival rate of more than 90%, while the chance of mortality approaches 80% in cases of late diagnosis. Methods: A modified EfficientNet-B0 is developed based on mobile inverted bottleneck convolution with a squeeze-and-excitation approach. The 3 × 3 convolutional layer is used to capture low-level visual features, while the core features are extracted using a sequence of Mobile Inverted Bottleneck Convolution blocks having both 3 × 3 and 5 × 5 kernels. They not only balance fine-grained extraction with broader contextual representation but also increase the network's learning capacity while maintaining computational cost. The proposed architecture's hyperparameters and the extracted feature vectors of standard benchmark datasets (HAM10000, ISIC 2019, and MSLD v2.0) of dermoscopic images are optimized with the quantum-behaved particle swarm optimization algorithm (QBPSO). The merit function is formulated from the training loss, given as standard classification cross-entropy with label smoothing, along with mean fitness value (mfval), average accuracy (mAcc), mean computational time (mCT), and other standard performance indicators. Results: Comprehensive scenario-based simulations were performed using the proposed framework on publicly available datasets and found an mAcc of 99.62% and 92.5%, mfval of 2.912 × 10⁻¹⁰ and 1.7921 × 10⁻⁸, and mCT of 501.431 s and 752.421 s for the HAM10000 and ISIC 2019 datasets, respectively. The results are compared with state-of-the-art pre-trained models such as EfficientNet-B4, RegNetY-320, ResNeXt-101, EfficientNetV2-M, VGG-16, and DeepLab V3, as well as reported techniques based on Mask R-CNN, Deep Belief Net, Ensemble CNN, SCDNet, and FixMatch-LS, with accuracies varying from 85% to 94.8%. The reliability of the proposed architecture and the stability of QBPSO are examined through Monte Carlo simulation of 100 independent runs and their statistical soundness. Conclusions: The proposed framework reduces diagnostic errors and assists dermatologists in clinical decisions for improved patient outcomes, despite challenges like data imbalance and interpretability.
(This article belongs to the Special Issue Medical Image Analysis and Machine Learning)
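For readers unfamiliar with QBPSO, a single iteration of the standard quantum-behaved update (following Sun et al.'s formulation) looks roughly like this; the contraction–expansion coefficient `beta` and the array shapes are placeholders, not the paper's tuned settings or merit function.

```python
import numpy as np

def qpso_step(x, pbest, gbest, beta=0.75, rng=None):
    """One QPSO position update. x, pbest: (n_particles, dim); gbest: (dim,)."""
    rng = rng or np.random.default_rng()
    n, dim = x.shape
    mbest = pbest.mean(axis=0)                      # mean of all personal bests
    phi = rng.random((n, dim))
    p = phi * pbest + (1.0 - phi) * gbest           # per-particle local attractors
    u = rng.random((n, dim)) + 1e-12                # avoid log(1/0)
    sign = np.where(rng.random((n, dim)) < 0.5, 1.0, -1.0)
    # quantum delta-potential-well sampling around the attractor
    return p + sign * beta * np.abs(mbest - x) * np.log(1.0 / u)
```

In an optimization loop, each new position would be scored by the merit function (here, the loss-based fitness the abstract describes) and `pbest`/`gbest` updated accordingly.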

14 pages, 1241 KB  
Article
Rapid Detection of Chicken Residues on Poultry Plant Surfaces Using Color and Fluorescence Spectrometry
by Clark Griscom, Dongyi Wang, Corliss A. O’Bryan, Rimmo Rõõm and Philip G. Crandall
Foods 2025, 14(24), 4352; https://doi.org/10.3390/foods14244352 - 18 Dec 2025
Abstract
Color and fluorescence spectrometry were evaluated as rapid, objective tools for verifying the cleanliness of poultry-processing food-contacting surfaces contaminated with a model chicken solution across six common materials. Both techniques detected chicken residues at dilutions several orders of magnitude below human visual and olfactory thresholds, with stainless steel and blue plastic yielding the largest color differences between clean and contaminated states and fluorescence measurements remaining highly sensitive on all tested surfaces. Representative limits of detection were on the order of 1:50–1:100 dilution of chicken residue for color measurements on most surfaces and approximately 1:50 for fluorescence measurements, compared with human detection thresholds of approximately 1:50. Cleaning chemicals routinely used in poultry plants did not measurably reduce detection performance, and a simple machine learning classifier further improved separation of clean versus contaminated readings. These findings indicate that compact color and fluorescence instruments can provide fast, quantitative pre-sanitation checks that strengthen SSOP verification and reduce reliance on subjective human inspection in poultry processing facilities.
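A minimal sketch of the color-difference measurement plus a simple classifier of the kind the abstract mentions; the CIE76 formula and logistic regression are assumptions here, since the abstract does not specify which classifier was used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def delta_e_cie76(lab_ref, lab_sample):
    """CIE76 color difference: Euclidean distance between (L*, a*, b*) readings."""
    return np.linalg.norm(np.asarray(lab_sample) - np.asarray(lab_ref), axis=-1)

# features: columns [delta_E, fluorescence_intensity]; labels: 1 = contaminated
def fit_surface_classifier(features, labels):
    return LogisticRegression().fit(features, labels)
```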

21 pages, 3364 KB  
Article
Advancing Lateral Flow Detection in CRISPR/Cas12a Systems Through Rational Understanding and Design Strategies of Reporter Interactions
by Irina V. Safenkova, Maria V. Kamionskaya, Dmitriy V. Sotnikov, Sergey F. Biketov, Anatoly V. Zherdev and Boris B. Dzantiev
Biosensors 2025, 15(12), 812; https://doi.org/10.3390/bios15120812 - 13 Dec 2025
Abstract
CRISPR/Cas12a systems coupled with lateral flow tests (LFTs) are a promising route to rapid, instrument-free nucleic acid diagnostics because they convert target recognition into a simple visual readout via cleavage of dual-labeled single-stranded DNA reporters. However, the conventional CRISPR/Cas12a–LFT system is constructed in a format in which the intact reporter must block nanoparticle conjugate migration; this format can produce false-positive signals and shows strong dependence on component stoichiometry and kinetics. Here, we present the first combined experimental and theoretical analysis quantifying these limitations and defining practical solutions. The experimental evaluation included 480 variants of LFT configuration with reporters differing in the concentration of interacting components and the kinetic conditions of the interactions. The most influential factor leading to 100% false-positive results was insufficient interaction time between the components; pre-incubation of the conjugate with the reporter for 5 min eliminated these artifacts. Theoretical analysis of the LFT kinetics based on a mathematical model confirmed kinetic constraints at interaction times below a few minutes, which affect the detectable signal. Reporter concentration and conjugate architecture represented the second major factors: lowering reporter concentration to 20 nM and using smaller gold nanoparticles with multivalent fluorescent reporters markedly improved sensitivity. The difference in sensitivity between various LFT configurations exceeded 50-fold. The combination of identified strategies eliminated false-positive reactions and enabled the detection of up to 20 pM of DNA target (the hisZ gene of Erwinia amylovora, a bacterial phytopathogen). The strategies reported here are general and readily transferable to other DNA targets and CRISPR/Cas12a amplification-free diagnostics.
(This article belongs to the Special Issue CRISPR/Cas System-Based Biosensors)
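The kinetic constraint can be illustrated with a generic second-order binding model integrated over the five-minute pre-incubation window; the rate constants and concentrations below are illustrative placeholders, not the paper's fitted parameters.

```python
from scipy.integrate import solve_ivp

def binding(t, y, k_on, k_off):
    """y = [free conjugate, free reporter, bound complex], in molar units."""
    conj, rep, cplx = y
    rate = k_on * conj * rep - k_off * cplx
    return [-rate, -rate, rate]

# 1 nM conjugate, 20 nM reporter, integrated over a 5-minute (300 s) window
sol = solve_ivp(binding, (0.0, 300.0), [1e-9, 20e-9, 0.0],
                args=(1e6, 1e-3), dense_output=True)
bound_fraction = sol.y[2][-1] / 1e-9   # fraction of conjugate complexed at 5 min
```

Running the same model with interaction times under a minute shows a much lower bound fraction, which is the qualitative behavior behind the reported false positives.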

16 pages, 6248 KB  
Article
Building Modeling Process Using Point Cloud Data and the Digital Twin Approach: An Industrial Case Study from Turkey
by Zeliha Hazal Kandemir and Özge Akboğa Kale
Buildings 2025, 15(24), 4469; https://doi.org/10.3390/buildings15244469 - 10 Dec 2025
Abstract
This study presents a terrestrial-laser-scanning-based scan-to-BIM workflow that transforms point cloud data into a BIM-based digital twin and analyzes how data collected with LiDAR (Light Detection and Ranging) can be converted into an information-rich model using Autodesk ReCap and Revit. Point clouds provided by laser scanning were processed in the ReCap environment and imported into Revit in an application that took place within an industrial facility of approximately 240 m² in Izmir. The scans were registered and pre-processed in Autodesk ReCap 2022 and modeled in Autodesk Revit 2022, with visualization updates prepared in Autodesk Revit 2023. Geometric quality was evaluated using point-to-model distance checks, since the dataset was imported in a pre-registered form and ReCap did not provide station-level RMSE values. The findings indicate that the ReCap–Revit integration offers high geometric accuracy and visual detail for both building elements and production-line machinery, but that high data density and complex geometry limit processing performance and interactivity. The study highlights both the practical applicability and the current technical limitations of terrestrial-laser-scanning-based scan-to-BIM workflows in an industrial context, offering a replicable reference model for future digital twin implementations in Turkey.
(This article belongs to the Special Issue Digital Twins in Construction, Engineering and Management)
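A point-to-model distance check of the kind described can be sketched with Open3D: sample points from the exported BIM mesh, then measure nearest-neighbor distances from the scan cloud. File names, the sample count, and the reported statistics are placeholders.

```python
import numpy as np
import open3d as o3d

scan = o3d.io.read_point_cloud("scan.ply")           # registered scan cloud
model = o3d.io.read_triangle_mesh("bim_model.ply")   # mesh exported from the BIM
model_pts = model.sample_points_uniformly(number_of_points=500_000)

dists = np.asarray(scan.compute_point_cloud_distance(model_pts))
print(f"mean deviation: {dists.mean():.4f} m, "
      f"95th percentile: {np.percentile(dists, 95):.4f} m")
```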

27 pages, 4969 KB  
Article
LegalEye: Multimodal Court Deception Detection Across Multiple Languages
by Rommel Isaac A. Baldivas, Nivedha Sreenivasan, So Young Kang, Alexandra My-Linh Miller, Megan Chacko, Shreya Krishnan, Carmen Ayala, Esperanza Ayala and Dohyeong Kim
Behav. Sci. 2025, 15(12), 1707; https://doi.org/10.3390/bs15121707 - 9 Dec 2025
Abstract
This study introduces LegalEye, a multimodal machine-learning model developed to detect deception in courtroom settings across three languages: English, Spanish, and Tagalog. The research investigates whether integrating audio, visual, and textual data can enhance deception detection accuracy and reduce bias in diverse legal contexts. LegalEye uses neural networks and late fusion techniques to analyze multimodal courtroom testimony data. The dataset was carefully constructed with balanced representation across racial groups (White, Black, Hispanic, Asian) and genders, with attention to minimizing implicit bias. Performance was evaluated using accuracy and AUC across individual and combined modalities. The model achieved high deception detection rates—97% for English, 85% for Spanish, and 86% for Tagalog. Late fusion of modalities outperformed single-modality models, with visual features being most influential for English and Tagalog, while Spanish showed stronger audio and textual performance. The Tagalog audio model underperformed due to frequent code-switching. Dataset balancing helped mitigate demographic bias, though Asian representation remained limited. LegalEye shows strong potential for language-adaptive and culturally sensitive deception detection, offering a robust tool for pre-trial interviews and legal analysis. While not suited for real-time courtroom decisions, its objective insights can support legal counsel and promote fairer judicial outcomes. Future work should expand linguistic and demographic coverage.
(This article belongs to the Special Issue Advanced Studies in Human-Centred AI)
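Late fusion in its simplest form is a weighted combination of per-modality class probabilities, as in this sketch; equal weights are an assumption, and LegalEye's actual fusion scheme is not detailed in the abstract.

```python
import numpy as np

def late_fusion(prob_audio, prob_visual, prob_text, weights=(1/3, 1/3, 1/3)):
    """Each prob_* is (n_samples, n_classes) from an independently trained
    modality model; returns fused class predictions."""
    fused = (weights[0] * prob_audio
             + weights[1] * prob_visual
             + weights[2] * prob_text)
    return fused.argmax(axis=1)
```

Because each modality model is trained separately, weights can be tuned per language, which would fit the abstract's finding that visual features dominate for English and Tagalog while Spanish leans on audio and text.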

28 pages, 8872 KB  
Article
Development and Application of an Intelligent Recognition System for Polar Environmental Targets Based on the YOLO Algorithm
by Jun Jian, Zhongying Wu, Kai Sun, Jiawei Guo and Ronglin Gao
J. Mar. Sci. Eng. 2025, 13(12), 2313; https://doi.org/10.3390/jmse13122313 - 5 Dec 2025
Abstract
As global climate warming enhances the navigability of Arctic routes, their navigation value has become prominent, yet ships operating in ice-covered waters face severe threats from sea ice and icebergs. Existing manual observation and radar monitoring remain limited, highlighting an urgent need for efficient target recognition technology. This study focuses on polar environmental target detection by constructing a polar dataset with 1342 JPG images covering four classes, including sea ice, icebergs, ice channels, and ships, obtained via web collection and video frame extraction. The "Grounding DINO pre-annotation + LabelImg manual fine-tuning" strategy is employed to improve annotation efficiency and accuracy, with data augmentation further enhancing dataset diversity. After comparing YOLOv5n, YOLOv8n, and YOLOv11n, YOLOv8n is selected as the baseline model and improved by introducing the CBAM/SE attention mechanism, SCConv/AKConv convolutions, and a BiFPN network. Among these models, the improved YOLOv8n + SCConv achieves the best performance in polar target detection, with a mean average precision (mAP) of 0.844, 1.4% higher than the original model. It effectively reduces missed detections of sea ice and icebergs, thereby enhancing adaptability to complex polar environments. The experimental results demonstrate that the improved model exhibits good robustness in images of varying resolutions, scenes with water surface reflections, and AI-generated images. In addition, a visual GUI with image/video detection functions was developed to support real-time monitoring and result visualization. This research provides essential technical support for safe navigation in ice-covered waters, polar resource exploration, and scientific activities.
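The baseline training and evaluation loop can be sketched with the Ultralytics API; the dataset YAML and hyperparameters are placeholders, and the paper's architectural changes (CBAM/SE, SCConv/AKConv, BiFPN) would additionally require editing the model definition itself.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                   # baseline before modifications
model.train(data="polar.yaml", epochs=100, imgsz=640)  # 4-class polar dataset
metrics = model.val()                        # reports mAP50, mAP50-95, etc.
```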

19 pages, 3521 KB  
Article
Intelligent Traffic Management: Comparative Evaluation of YOLOv3, YOLOv5, and YOLOv8 for Vehicle Detection in Urban Environments in Montería, Colombia
by Darío Doria Usta, Ricardo Hundelshaussen, César López Martínez, João Felipe Coimbra Leite Costa and Diego Machado Marques
Future Transp. 2025, 5(4), 191; https://doi.org/10.3390/futuretransp5040191 - 5 Dec 2025
Abstract
This study compares the performance of three YOLO-based object detection models—YOLOv3, YOLOv5, and YOLOv8—for vehicle detection and classification at an urban intersection in Montería, Colombia. Recordings from five consecutive days, spanning three time slots, were used, totaling approximately 135,000 frames with variability in lighting and weather conditions. Frames were preprocessed by maintaining the aspect ratio and were normalized according to each model. The evaluation employed models pre-trained on COCO, without fine-tuning, enabling an objective assessment of their generalization capacity. Precision, recall, F1-score, and mAP@0.5 were computed globally and by vehicle class. YOLOv5 achieved the best balance between precision and recall (F1-score = 0.78) and the highest mAP (0.63), while YOLOv3 showed lower recall and mAP, and YOLOv8 performed competitively but slightly below YOLOv5. Cars and motorcycles were the most robust classes, whereas bicycles and trucks showed greater detection challenges. Visual evaluation confirmed stable performance on cloudy days and in light rain, with reduced accuracy under sunny conditions with high contrast. These findings highlight the potential of modern YOLO architectures for intelligent urban traffic monitoring and management. The generated dataset constitutes a replicable resource for future mobility research in similar contexts.
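Because the models were used pre-trained on COCO without fine-tuning, the evaluation reduces to inference plus class filtering, roughly as below; the weights file, video source, and confidence threshold are illustrative, and each model family would be loaded analogously (COCO class ids: 2 car, 3 motorcycle, 5 bus, 7 truck).

```python
from ultralytics import YOLO

VEHICLE_IDS = {2, 3, 5, 7}                   # COCO: car, motorcycle, bus, truck
model = YOLO("yolov8n.pt")                   # COCO weights, no fine-tuning

results = model.predict("intersection.mp4", conf=0.25, stream=True)
for r in results:
    vehicles = [b for b in r.boxes if int(b.cls) in VEHICLE_IDS]
    # accumulate per-class detections for precision/recall/mAP computation
```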

36 pages, 22245 KB  
Article
CMSNet: A SAM-Enhanced CNN–Mamba Framework for Damaged Building Change Detection in Remote Sensing Imagery
by Jianli Zhang, Liwei Tao, Wenbo Wei, Pengfei Ma and Mengdi Shi
Remote Sens. 2025, 17(23), 3913; https://doi.org/10.3390/rs17233913 - 3 Dec 2025
Abstract
In war and explosion scenarios, buildings often suffer varying degrees of damage characterized by complex, irregular, and fragmented spatial patterns, posing significant challenges for remote sensing–based change detection. Additionally, the scarcity of high-quality datasets limits the development and generalization of deep learning approaches. To overcome these issues, we propose CMSNet, an end-to-end framework that integrates the structural priors of the Segment Anything Model (SAM) with the efficient temporal modeling and fine-grained representation capabilities of CNN–Mamba. Specifically, CMSNet adopts CNN–Mamba as the backbone to extract multi-scale semantic features from bi-temporal images, while SAM-derived visual priors guide the network to focus on building boundaries and structural variations. A Pre-trained Visual Prior-Guided Feature Fusion Module (PVPF-FM) is introduced to align and fuse these priors with change features, enhancing robustness against local damage, non-rigid deformations, and complex background interference. Furthermore, we construct a new RWSBD (Real-world War Scene Building Damage) dataset based on Gaza war scenes, comprising 42,732 annotated building damage instances across diverse scales, offering a strong benchmark for real-world scenarios. Extensive experiments on RWSBD and three public datasets (CWBD, WHU-CD, and LEVIR-CD+) demonstrate that CMSNet consistently outperforms eight state-of-the-art methods in both quantitative metrics (F1, IoU, Precision, Recall) and qualitative evaluations, especially in fine-grained boundary preservation, small-scale change detection, and complex scene adaptability. Overall, this work introduces a novel detection framework that combines foundation model priors with efficient change modeling, along with a new large-scale war damage dataset, contributing valuable advances to both research and practical applications in remote sensing change detection. Additionally, the strong generalization ability and efficient architecture of CMSNet highlight its potential for scalable deployment and practical use in large-area post-disaster assessment.
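A schematic reading of prior-guided fusion: align a frozen foundation-model embedding to the change-feature grid and fuse by concatenation plus a 1×1 convolution. The module below is a hypothetical reconstruction for illustration, not the published PVPF-FM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PriorGuidedFusion(nn.Module):
    """Fuse frozen visual priors (e.g., SAM embeddings) with change features."""
    def __init__(self, change_ch, prior_ch, out_ch):
        super().__init__()
        self.align = nn.Conv2d(prior_ch, change_ch, kernel_size=1)
        self.fuse = nn.Conv2d(change_ch * 2, out_ch, kernel_size=1)

    def forward(self, change_feat, prior_feat):
        prior = self.align(prior_feat)                    # match channel width
        prior = F.interpolate(prior, size=change_feat.shape[-2:],
                              mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([change_feat, prior], dim=1))
```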

28 pages, 3284 KB  
Article
Diffusion-Enhanced Underwater Debris Detection via Improved YOLOv12n Framework
by Jianghan Tao, Fan Zhao, Yijia Chen, Yongying Liu, Feng Xue, Jian Song, Hao Wu, Jundong Chen, Peiran Li and Nan Xu
Remote Sens. 2025, 17(23), 3910; https://doi.org/10.3390/rs17233910 - 2 Dec 2025
Abstract
Detecting underwater debris is important for monitoring the marine environment but remains challenging due to poor image quality, visual noise, object occlusions, and diverse debris appearances in underwater scenes. This study proposes UDD-YOLO, a novel detection framework that, for the first time, applies a diffusion-based model to underwater image enhancement, introducing a new paradigm for improving perceptual quality in marine vision tasks. Specifically, the proposed framework integrates three key components: (1) a Cold Diffusion module that acts as a pre-processing stage to restore image clarity and contrast by reversing deterministic degradation such as blur and occlusion—without injecting stochastic noise—making it the first diffusion-based enhancement applied to underwater object detection; (2) an AMC2f feature extraction module that combines multi-scale separable convolutions and learnable normalization to improve representation for targets with complex morphology and scale variation; and (3) a Unified-IoU (UIoU) loss function designed to dynamically balance localization learning between high- and low-quality predictions, thereby reducing errors caused by occlusion or boundary ambiguity. Extensive experiments are conducted on the public underwater plastic pollution detection dataset, which includes 15 categories of underwater debris. The proposed method achieves a mAP50 of 81.8%, with 87.3% precision and 75.1% recall, surpassing eleven advanced detection models such as Faster R-CNN, RT-DETR-L, YOLOv8n, and YOLOv12n. Ablation studies verify the function of every module. These findings show that diffusion-driven enhancement, when coupled with feature extraction and localization optimization, offers a promising direction for accurate, robust underwater perception, opening new opportunities for environmental monitoring and autonomous marine systems.
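Cold diffusion inverts a deterministic degradation (e.g., blur) rather than stochastic noise; a generic sampling loop in that style (following Bansal et al.'s improved rule) is sketched below, with the restoration network `R` and degradation operator `D` left as placeholders and `D(x, 0)` assumed to be the identity. This is an illustration of the technique, not the UDD-YOLO module itself.

```python
import torch

@torch.no_grad()
def cold_diffusion_restore(x_t, R, D, T):
    """x_t: degraded image batch; R(x, t) predicts the clean image x0;
    D(x0, t) re-applies t steps of deterministic degradation."""
    for t in range(T, 0, -1):
        x0_hat = R(x_t, t)
        # improved sampling rule: swap degradation level t for t - 1
        x_t = x_t - D(x0_hat, t) + D(x0_hat, t - 1)
    return x_t
```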
