Search Results (521)

Search Parameters:
Keywords = Siamese Network

21 pages, 5493 KiB  
Article
Estimating Snow-Related Daily Change Events in the Canadian Winter Season: A Deep Learning-Based Approach
by Karim Malik, Isteyak Isteyak and Colin Robertson
J. Imaging 2025, 11(7), 239; https://doi.org/10.3390/jimaging11070239 - 14 Jul 2025
Viewed by 225
Abstract
Snow water equivalent (SWE), an essential parameter of snow, is largely studied to understand the impact of climate regime effects on snowmelt patterns. This study developed a Siamese Attention U-Net (Si-Att-UNet) model to detect daily change events in the winter season. The daily SWE change event detection task is treated as an image content comparison problem in which the Si-Att-UNet compares a pair of SWE maps sampled at two temporal windows. The model detected SWE similarity and dissimilarity with an F1 score of 99.3% at a 50% confidence threshold. The change events were derived from the model’s prediction of SWE similarity using the 50% threshold. Daily SWE change events increased between 1979 and 2018. However, the SWE change events were significant in March and April, with a positive Mann–Kendall test statistic (tau = 0.25 and 0.38, respectively). The highest frequency of zero-change events occurred in February. A comparison of the SWE change events and mean change segments with those of the northern hemisphere’s climate anomalies revealed that low temperature and low precipitation anomalies reduced the frequency of SWE change events. The findings highlight the influence of climate variables on daily changes in snow-related water storage in March and April.
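The tau values quoted above (0.25 and 0.38) are Kendall rank correlations of the yearly event series against time, the statistic at the core of the Mann–Kendall trend test. A minimal pure-Python sketch of that statistic, ignoring ties and the significance computation that the full test adds:

```python
from itertools import combinations

def kendall_tau_trend(series):
    """Kendall's tau of a series against time: count concordant minus
    discordant pairs, normalized by the number of pairs. Positive tau
    indicates an increasing trend, negative a decreasing one."""
    n = len(series)
    s = 0
    for i, j in combinations(range(n), 2):
        if series[j] > series[i]:
            s += 1
        elif series[j] < series[i]:
            s -= 1
    return s / (n * (n - 1) / 2)

# A strictly increasing series gives tau = 1.0; strictly decreasing, -1.0.
print(kendall_tau_trend([1, 2, 3, 4, 5]))   # 1.0
print(kendall_tau_trend([5, 4, 3, 2, 1]))   # -1.0
```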

16 pages, 2607 KiB  
Article
Deep Learning-Based Detection and Assessment of Road Damage Caused by Disaster with Satellite Imagery
by Jungeun Cha, Seunghyeok Lee and Hoe-Kyoung Kim
Appl. Sci. 2025, 15(14), 7669; https://doi.org/10.3390/app15147669 - 8 Jul 2025
Viewed by 544
Abstract
Natural disasters can cause severe damage to critical infrastructure such as road networks, significantly delaying rescue and recovery efforts. Conventional road damage assessments rely heavily on manual inspection, which is labor-intensive, time-consuming, and infeasible in large-scale disaster-affected areas. This study aims to propose a deep learning-based framework to automatically detect and quantitatively assess road damage using high-resolution pre- and post-disaster satellite imagery. To achieve this, the study systematically compares three distinct change detection approaches: single-timeframe overlay, difference-based segmentation, and Siamese feature fusion. Experimental results, validated over multiple runs, show the difference-based model achieved the highest overall F1-score (0.594 ± 0.025), surpassing the overlay and Siamese models by approximately 127.6% and 27.5%, respectively. However, a key finding of this study is that even this best-performing model is constrained by a low detection recall (0.445 ± 0.051) for the ‘damaged road’ class. This reveals that severe class imbalance is a fundamental hurdle in this domain for which standard training strategies are insufficient. This study establishes a crucial benchmark for the field, highlighting that future research must focus on methods that directly address class imbalance to improve detection recall. Despite its quantified limitations, the proposed framework enables the visualization of damage density maps, supporting emergency response strategies such as prioritizing road restoration and accessibility planning in disaster-stricken areas.

(This article belongs to the Special Issue Remote Sensing Image Processing and Application, 2nd Edition)

21 pages, 3406 KiB  
Article
ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification
by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu
Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025
Viewed by 568
Abstract
Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolution Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as parametric-based YOLO series models in defect detection, achieving higher accuracy and lower miss rates, highlighting its potential for high-reliability industrial deployment.
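The pipeline above pairs triplet-loss embedding learning with a KNN classifier over the learned embeddings. A schematic NumPy sketch of those two stages, using plain vectors in place of the ResNet-SE-CBAM features (the margin and data are illustrative, not the paper's settings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet embedding loss: pull the anchor toward the positive
    sample and push it at least `margin` farther from the negative."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def knn_classify(query, embeddings, labels, k=3):
    """KNN over embeddings, the classification stage described above:
    classes can be added or removed without retraining the network."""
    dists = np.linalg.norm(embeddings - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
labels = ["good", "good", "defect", "defect"]
print(knn_classify(np.array([0.05, 0.0]), emb, labels, k=3))  # good
```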

23 pages, 1523 KiB  
Article
Deep One-Directional Neural Semantic Siamese Network for High-Accuracy Fact Verification
by Muchammad Naseer, Jauzak Hussaini Windiatmaja, Muhamad Asvial and Riri Fitri Sari
Big Data Cogn. Comput. 2025, 9(7), 172; https://doi.org/10.3390/bdcc9070172 - 30 Jun 2025
Viewed by 655
Abstract
Fake news has eroded trust in credible news sources, driving the need for tools to verify the accuracy of circulating information. Fact verification addresses this issue by classifying claims as Supports (S), Refutes (R), or Not Enough Info (NEI) based on evidence. Neural Semantic Matching Networks (NSMN) is an algorithm designed for this purpose, but its reliance on BiLSTM has shown limitations, particularly overfitting. This study aims to enhance NSMN for fact verification through a structured framework comprising encoding, alignment, matching, and output layers. The proposed approach employed Siamese MaLSTM in the matching layer and introduced the Manhattan Fact Relatedness Score (MFRS) in the output layer, culminating in a novel algorithm called Deep One-Directional Neural Semantic Siamese Network (DOD–NSSN). Performance evaluation compared DOD–NSSN with NSMN and transformer-based algorithms (BERT, RoBERTa, XLM, XL-Net). Results demonstrated that DOD–NSSN achieved 91.86% accuracy and consistently outperformed other models, achieving over 95% accuracy across diverse topics, including sports, government, politics, health, and industry. The findings highlight the DOD–NSSN model’s capability to generalize effectively across various domains, providing a robust tool for automated fact verification.

(This article belongs to the Special Issue Machine Learning and AI Technology for Sustainable Development)
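MaLSTM-style matching, as used in the DOD–NSSN matching layer, scores a pair of encoded sentences by an exponentiated negative Manhattan distance, which maps any two embeddings into (0, 1]. A minimal sketch of that similarity (the MFRS score builds on such a Manhattan measure; its exact formula is defined in the paper):

```python
import numpy as np

def manhattan_similarity(u, v):
    """MaLSTM similarity: exp(-L1 distance) between two sentence
    embeddings. Identical vectors score 1.0; the score decays toward
    0 as the embeddings drift apart."""
    return float(np.exp(-np.abs(u - v).sum()))

# Identical embeddings -> 1.0; an L1 gap of 1 -> exp(-1) ~ 0.368.
print(manhattan_similarity(np.array([2.0, 3.0]), np.array([2.0, 3.0])))
```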

14 pages, 5250 KiB  
Article
An Enhanced Siamese Network-Based Visual Tracking Algorithm with a Dual Attention Mechanism
by Xueying Cai, Sheng Feng, Varshosaz Masood, Senang Ying, Binchao Zhou, Wentao Jia, Jianing Yang, Canlin Wei and Yucheng Feng
Electronics 2025, 14(13), 2579; https://doi.org/10.3390/electronics14132579 - 26 Jun 2025
Viewed by 226
Abstract
Aiming at the problems of SiamFC, such as shallow network architecture, a fixed template, a lack of semantic understanding, and temporal modeling, this paper proposes a robust target-tracking algorithm that incorporates both channel and spatial attention mechanisms. The backbone network of our algorithm adopts depthwise separable convolution to improve computational efficiency, adjusts the output stride and convolution kernel size to improve the network feature extraction capability, and optimizes the network structure through neural architecture search, enabling the extraction of deeper, richer features with stronger semantic information. In addition, we add channel attention to the target template branch after feature extraction to make it adaptively adjust the weights of different feature channels. In the search region branch, a sequential combination of channel and spatial attention is introduced to model spatial dependencies among pixels and suppress background and distractor information. Finally, we evaluate the proposed algorithm on the OTB2015, VOT2018, and VOT2016 datasets. The results show that our method achieves a tracking precision of 0.631 and a success rate of 0.468, improving upon the original SiamFC by 3.4% and 1.2%, respectively. The algorithm ensures robust tracking in complex scenarios, maintains real-time performance, and further reduces both parameter counts and overall computational complexity.

(This article belongs to the Special Issue Advances in Mobile Networked Systems)
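SiamFC, the baseline being improved here, localizes the target by cross-correlating a template feature map with a search-region feature map and taking the peak of the response map. A single-channel NumPy toy of that matching step (real trackers correlate deep multi-channel features produced by the Siamese backbone):

```python
import numpy as np

def cross_correlate(template, search):
    """Slide the template over the search region and score each
    position by the dot product; the peak locates the target."""
    th, tw = template.shape
    sh, sw = search.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(search[y:y + th, x:x + tw] * template)
    return out

search = np.zeros((5, 5))
search[2:4, 2:4] = 1.0          # the "target" patch
template = np.ones((2, 2))
resp = cross_correlate(template, search)
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # (2, 2): the response peaks at the target location
```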

37 pages, 3049 KiB  
Article
English-Arabic Hybrid Semantic Text Chunking Based on Fine-Tuning BERT
by Mai Alammar, Khalil El Hindi and Hend Al-Khalifa
Computation 2025, 13(6), 151; https://doi.org/10.3390/computation13060151 - 16 Jun 2025
Cited by 1 | Viewed by 831
Abstract
Semantic text chunking refers to segmenting text into coherently semantic chunks, i.e., into sets of statements that are semantically related. Semantic chunking is an essential pre-processing step in various NLP tasks, e.g., document summarization, sentiment analysis, and question answering. In this paper, we propose a hybrid, two-step semantic text chunking method that combines unsupervised semantic chunking based on similarities between sentence embeddings with pre-trained language models (PLMs), especially BERT fine-tuned on the semantic textual similarity (STS) task, to provide flexible and effective semantic text chunking. We evaluated the proposed method in English and Arabic. To the best of our knowledge, no Arabic dataset exists for assessing semantic text chunking at this level. Therefore, we created AraWiki50k, inspired by an existing English dataset, to evaluate our proposed text chunking method. Our experiments showed that exploiting the fine-tuned pre-trained BERT on STS enhances results over unsupervised semantic chunking by an average of 7.4 in the PK metric and by an average of 11.19 in the WindowDiff metric on four English evaluation datasets, and 0.12 in the PK and 2.29 in the WindowDiff for the Arabic dataset.

(This article belongs to the Section Computational Social Science)

26 pages, 5846 KiB  
Article
AGEN: Adaptive Error Control-Driven Cross-View Geo-Localization Under Extreme Weather Conditions
by Mengmeng Xu, Hongxiang Lv, Hai Zhu, Enlai Dong and Fei Wu
Sensors 2025, 25(12), 3749; https://doi.org/10.3390/s25123749 - 15 Jun 2025
Viewed by 564
Abstract
Cross-view geo-localization is a task of matching the same geographic image from different views, e.g., drone and satellite. Due to its GPS-free advantage, cross-view geo-localization is gaining increasing research interest, especially in drone-based localization and navigation applications. To guarantee system accuracy, existing methods have mainly focused on image augmentation and denoising, yet they still degrade under extreme weather conditions. In this paper, we propose a robust end-to-end image retrieval framework, AGEN, serving for cross-view geo-localization under extreme weather conditions. Inspired by the strengths of the DINOv2 network, particularly its strong performance in global feature extraction, while acknowledging its limitations in capturing fine-grained details, we integrate the DINOv2 network with the Local Pattern Network (LPN) algorithm module to extract valuable classification features more efficiently. Additionally, to further enhance model robustness, we innovatively introduce an Adaptive Error Control (AEC) module based on fuzzy control to optimize the loss function dynamically. Specifically, by adjusting loss weights adaptively, the AEC module allows the model to better handle complex and challenging scenarios. Experimental results demonstrate that AGEN achieves a Recall@1 accuracy of 91.71% on the University160k-WX dataset under extreme weather conditions. Through extensive experiments on two well-known public datasets, i.e., University-1652 and SUES-200, AGEN achieves state-of-the-art Recall@1 accuracy in both drone-view target localization tasks and drone navigation tasks, outperforming existing models. In particular, on the University-1652 dataset, AGEN reaches 95.43% Recall@1 in the drone-view target localization task, showcasing its superior capability in handling challenging scenarios.

(This article belongs to the Section Navigation and Positioning)

20 pages, 2150 KiB  
Article
Industrial Image Anomaly Detection via Synthetic-Anomaly Contrastive Distillation
by Junxian Li, Mingxing Li, Shucheng Huang, Gang Wang and Xinjing Zhao
Sensors 2025, 25(12), 3721; https://doi.org/10.3390/s25123721 - 13 Jun 2025
Viewed by 599
Abstract
Industrial image anomaly detection plays a critical role in intelligent manufacturing by automatically identifying defective products through visual inspection. While unsupervised approaches eliminate dependency on annotated anomaly samples, current teacher–student framework-based methods still face two fundamental limitations: insufficient discriminative capability for structural anomalies and suboptimal anomaly feature decoupling efficiency. To address these challenges, we propose a Synthetic-Anomaly Contrastive Distillation (SACD) framework for industrial anomaly detection. SACD comprises two pivotal components: (1) a reverse distillation (RD) paradigm whereby a pre-trained teacher network extracts hierarchically structured representations, subsequently guiding the student network with inverse architectural configuration to establish hierarchical feature alignment; (2) a group of feature calibration (FeaCali) modules designed to refine the student’s outputs by eliminating anomalous feature responses. During training, SACD adopts a dual-branch strategy, where one branch encodes multi-scale features from defect-free images, while a Siamese anomaly branch processes synthetically corrupted counterparts. FeaCali modules are trained to strip out a student’s anomalous patterns in anomaly branches, enhancing the student network’s exclusive modeling of normal patterns. We construct a dual-objective optimization integrating cross-model distillation loss and intra-model contrastive loss to train SACD for feature alignment and discrepancy amplification. At the inference stage, pixel-wise anomaly scores are computed through multi-layer feature discrepancies between the teacher’s representations and the student’s refined outputs. Comprehensive evaluations on the MVTec AD and BTAD benchmarks demonstrate that our method is effective and superior to current knowledge distillation-based approaches.

(This article belongs to the Section Industrial Sensors)
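At inference, the method scores each pixel by the feature discrepancy between teacher and student across layers. A NumPy sketch of that rule using accumulated cosine distances, assuming all layers share one resolution (real pipelines upsample each layer's map to a common size first):

```python
import numpy as np

def anomaly_map(teacher_feats, student_feats):
    """Pixel-wise anomaly score: 1 - cosine similarity per spatial
    location, summed over feature layers. Each list entry is a
    (C, H, W) feature map; all maps must share H and W here."""
    score = None
    for t, s in zip(teacher_feats, student_feats):
        tn = t / (np.linalg.norm(t, axis=0, keepdims=True) + 1e-8)
        sn = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-8)
        layer = 1.0 - (tn * sn).sum(axis=0)      # (H, W) per-layer map
        score = layer if score is None else score + layer
    return score
```

Matching teacher and student features yield scores near 0; opposed features approach the maximum of 2 per layer, flagging anomalous regions.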

19 pages, 6772 KiB  
Article
A Cross-Mamba Interaction Network for UAV-to-Satellite Geolocalization
by Lingyun Tian, Qiang Shen, Yang Gao, Simiao Wang, Yunan Liu and Zilong Deng
Drones 2025, 9(6), 427; https://doi.org/10.3390/drones9060427 - 12 Jun 2025
Viewed by 976
Abstract
The geolocalization of unmanned aerial vehicles (UAVs) in satellite-denied environments has emerged as a key research focus. Recent advancements in this area have been largely driven by learning-based frameworks that utilize convolutional neural networks (CNNs) and Transformers. However, both CNNs and Transformers face challenges in capturing global feature dependencies due to their restricted receptive fields. Inspired by state-space models (SSMs), which have demonstrated efficacy in modeling long sequences, we propose a pure Mamba-based method called the Cross-Mamba Interaction Network (CMIN) for UAV geolocalization. CMIN consists of three key components: feature extraction, information interaction, and feature fusion. It leverages Mamba’s strengths in global information modeling to effectively capture feature correlations between UAV and satellite images over a larger receptive field. For feature extraction, we design a Siamese Feature Extraction Module (SFEM) based on two basic vision Mamba blocks, enabling the model to capture the correlation between UAV and satellite image features. In terms of information interaction, we introduce a Local Cross-Attention Module (LCAM) to fuse cross-Mamba features, providing a solution for feature matching via deep learning. By aggregating features from various layers of SFEMs, we generate heatmaps for the satellite image that help determine the UAV’s geographical coordinates. Additionally, we propose a Center Masking strategy for data augmentation, which promotes the model’s ability to learn richer contextual information from UAV images. Experimental results on benchmark datasets show that our method achieves state-of-the-art performance. Ablation studies further validate the effectiveness of each component of CMIN.

19 pages, 1563 KiB  
Article
Small Object Tracking in LiDAR Point Clouds: Learning the Target-Awareness Prototype and Fine-Grained Search Region
by Shengjing Tian, Yinan Han, Xiantong Zhao and Xiuping Liu
Sensors 2025, 25(12), 3633; https://doi.org/10.3390/s25123633 - 10 Jun 2025
Viewed by 686
Abstract
Light Detection and Ranging (LiDAR) point clouds are an essential perception modality for artificial intelligence systems like autonomous driving and robotics, where the ubiquity of small objects in real-world scenarios substantially challenges the visual tracking of small targets amidst the vastness of point cloud data. Current methods predominantly focus on developing universal frameworks for general object categories, often sidelining the persistent difficulties associated with small objects. These challenges stem from a scarcity of foreground points and a low tolerance for disturbances. To this end, we propose a deep neural network framework that trains a Siamese network for feature extraction and innovatively incorporates two pivotal modules: the target-awareness prototype mining (TAPM) module and the regional grid subdivision (RGS) module. The TAPM module utilizes the reconstruction mechanism of the masked auto-encoder to distill prototypes within the feature space, thereby enhancing the salience of foreground points and aiding in the precise localization of small objects. To heighten the tolerance of disturbances in feature maps, the RGS module is devised to retrieve detailed features of the search area, capitalizing on Vision Transformer and pixel shuffle technologies. Furthermore, beyond standard experimental configurations, we have meticulously crafted scaling experiments to assess the robustness of various trackers when dealing with small objects. Comprehensive evaluations show our method achieves a mean Success of 64.9% and 60.4% under original and scaled settings, outperforming benchmarks by +3.6% and +5.4%, respectively.

(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems)

18 pages, 3721 KiB  
Article
Haptic–Vision Fusion for Accurate Position Identification in Robotic Multiple Peg-in-Hole Assembly
by Jinlong Chen, Deming Luo, Zhigang Xiao, Minghao Yang, Xingguo Qin and Yongsong Zhan
Electronics 2025, 14(11), 2163; https://doi.org/10.3390/electronics14112163 - 26 May 2025
Viewed by 504
Abstract
Multi-peg-hole assembly is a fundamental process in robotic manufacturing, particularly for circular aviation electrical connectors (CAECs) that require precise axial alignment. However, CAEC assembly poses significant challenges due to small apertures, posture disturbances, and the need for high error tolerance. This paper proposes a dual-stream Siamese network (DSSN) framework that fuses visual and tactile modalities to achieve accurate position identification in six-degree-of-freedom robotic connector assembly tasks. The DSSN employs ConvNeXt for visual feature extraction and SE-ResNet-50 with integrated attention mechanisms for tactile feature extraction, while a gated attention module adaptively fuses multimodal features. A bidirectional long short-term memory (Bi-LSTM) recurrent neural network is introduced to jointly model spatiotemporal deviations in position and orientation. Compared with state-of-the-art methods, the proposed DSSN achieves improvements of approximately 7.4%, 5.7%, and 5.4% in assembly success rates after 1, 5, and 10 buckling iterations, respectively. Experimental results validate that the integration of multimodal adaptive fusion and sequential spatiotemporal learning enables robust and precise robotic connector assembly under high-tolerance conditions.

28 pages, 5257 KiB  
Article
Comparative Evaluation of Sequential Neural Network (GRU, LSTM, Transformer) Within Siamese Networks for Enhanced Job–Candidate Matching in Applied Recruitment Systems
by Mateusz Łępicki, Tomasz Latkowski, Izabella Antoniuk, Michał Bukowski, Bartosz Świderski, Grzegorz Baranik, Bogusz Nowak, Robert Zakowicz, Łukasz Dobrakowski, Bogdan Act and Jarosław Kurek
Appl. Sci. 2025, 15(11), 5988; https://doi.org/10.3390/app15115988 - 26 May 2025
Viewed by 782
Abstract
Job–candidate matching is pivotal in recruitment, yet traditional manual or keyword-based methods can be laborious and prone to missing qualified candidates. In this study, we introduce the first Siamese framework that systematically contrasts GRU, LSTM, and Transformer sequential heads on top of a multilingual Sentence Transformer backbone, which is trained end-to-end with triplet loss on real-world recruitment data. This combination captures both long-range dependencies across document segments and global semantics, representing a substantial advance over approaches that rely solely on static embeddings. We compare the three heads using ranking metrics such as Top-K accuracy and Mean Reciprocal Rank (MRR). The Transformer-based model yields the best overall performance, with an MRR of 0.979 and a Top-100 accuracy of 87.20% on the test set. Visualization of learned embeddings (t-SNE) shows that self-attention more effectively clusters matching texts and separates them from irrelevant ones. These findings underscore the potential of combining multilingual base embeddings with specialized sequential layers to reduce manual screening efforts and improve recruitment efficiency.

(This article belongs to the Special Issue Innovations in Artificial Neural Network Applications)
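Mean Reciprocal Rank, one of the ranking metrics reported above, can be computed in a few lines. This sketch uses 1-based ranks and scores a query zero when its relevant item never appears in the ranking:

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: for each query, take 1 / (rank of the first relevant
    item), then average over queries. An MRR near 1.0 means the
    correct match is almost always ranked first."""
    total = 0.0
    for ranking, rel in zip(ranked_lists, relevant):
        for rank, item in enumerate(ranking, start=1):
            if item == rel:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Correct item first (1/1) and third (1/3) -> MRR = (1 + 1/3) / 2.
print(mean_reciprocal_rank([["a", "b"], ["x", "y", "z"]], ["a", "z"]))
```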

23 pages, 1701 KiB  
Article
Left Meets Right: A Siamese Network Approach to Cross-Palmprint Biometric Recognition
by Mohamed Ezz
Electronics 2025, 14(10), 2093; https://doi.org/10.3390/electronics14102093 - 21 May 2025
Viewed by 385
Abstract
What if you could identify someone’s right palmprint just by looking at their left—and vice versa? That is exactly what I set out to do. I built a specially adapted Siamese network that only needs one palm to reliably recognize the other, making biometric systems far more flexible in everyday settings. My solution rests on two simple but powerful ideas. First, Anchor Embedding through Feature Aggregation (AnchorEFA) creates a “super-anchor” by averaging four palmprint samples from the same person. This pooled anchor smooths out noise and highlights the consistent patterns shared between left and right palms. Second, I use a Concatenated Similarity Measurement—combining Euclidean distance with Element-wise Absolute Difference (EAD)—so the model can pick up both big structural similarities and tiny textural differences. I tested this approach on three public datasets (POLYU_Left_Right, TongjiS1_Left_Right, and CASIA_Left_Right) and saw a clear jump in accuracy compared to traditional methods. In fact, my four-sample AnchorEFA plus hybrid similarity metric did not just beat the baseline—it set a new benchmark for cross-palmprint recognition. In short, recognizing a palmprint from its opposite pair is not just feasible—it is practical, accurate, and ready for real-world use. This work opens the door to more secure, user-friendly biometric systems that still work even when only one palmprint is available.
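A minimal NumPy sketch of the two ideas described above, with illustrative function names: averaging several same-person embeddings into a super-anchor (AnchorEFA), and pairing Euclidean distance with the element-wise absolute difference (EAD) for the concatenated measurement:

```python
import numpy as np

def anchor_efa(samples):
    """AnchorEFA: average several palmprint embeddings of the same
    person into one 'super-anchor', smoothing out per-sample noise."""
    return np.mean(samples, axis=0)

def concatenated_similarity(a, b):
    """Hybrid measurement: a scalar Euclidean distance (global
    structure) plus the per-dimension absolute difference vector
    (fine texture), returned together for a downstream classifier."""
    euclid = float(np.linalg.norm(a - b))
    ead = np.abs(a - b)
    return euclid, ead

anchor = anchor_efa(np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 2.0], [2.0, 0.0]]))
print(anchor)  # [1. 1.]: the centroid of the four samples
```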

47 pages, 6632 KiB  
Article
Comparison of Deep Transfer Learning Against Contrastive Learning in Industrial Quality Applications for Heavily Unbalanced Data Scenarios When Data Augmentation Is Limited
by Amir Farmanesh, Raúl G. Sanchis and Joaquín Ordieres-Meré
Sensors 2025, 25(10), 3048; https://doi.org/10.3390/s25103048 - 12 May 2025
Viewed by 1479
Abstract
AI-oriented quality inspection in manufacturing often faces highly imbalanced data, as defective products are rare, and there are limited possibilities for data augmentation. This paper presents a systematic comparison between Deep Transfer Learning (DTL) and Contrastive Learning (CL) under such challenging conditions, addressing a critical gap in the industrial machine learning literature. We focus on a galvanized steel coil quality classification task with acceptable vs. defective classes, where the vast majority of samples (>95%) are acceptable. We implement a DTL approach using strategically fine-tuned YOLOv8 models pre-trained on large-scale datasets, and a CL approach using a Siamese network with multi-reference design to learn robust similarity metrics for one-shot classification. Experiments employ k-fold cross-validation and a held-out gold-standard test set of coil images, with statistical validation through bootstrap resampling. Results demonstrate that DTL significantly outperforms CL, achieving higher overall accuracy (81.7% vs. 61.6%), F1-score (79.2% vs. 62.1%), and precision (91.3% vs. 61.0%) on the challenging test set. Computational analysis reveals that DTL requires 40% less training time and 25% fewer parameters while maintaining superior generalization capabilities. We provide concrete guidance on when to select DTL over CL based on dataset characteristics, demonstrating that DTL is particularly advantageous when data augmentation is constrained by domain-specific spatial patterns. Additionally, we introduce a novel adaptive inspection framework that integrates human-in-the-loop feedback with domain adaptation techniques for continuous model improvement in production environments. Our comprehensive comparative analysis offers empirically validated insights into performance trade-offs between these approaches under extreme class imbalance, providing valuable direction for practitioners implementing industrial quality inspection systems with limited, skewed datasets.

(This article belongs to the Section Intelligent Sensors)

34 pages, 3562 KiB  
Article
Unknown IoT Device Identification Models and Algorithms Based on CSCL-Siamese Networks and Weighted-Voting Clustering Ensemble
by Junhao Qian, Wenyu Zheng, Xulin Lu and Zhihua Li
Appl. Sci. 2025, 15(10), 5274; https://doi.org/10.3390/app15105274 - 9 May 2025
Viewed by 316
Abstract
Current methods for identifying unknown Internet of Things (IoT) devices are relatively limited. Most approaches can identify only one type of unknown IoT device at a time, with relatively low accuracy. Herein, we propose a method for unknown IoT device identification (UDI) based on cost-sensitive contrastive loss (CSCL)-Siamese networks and a weighted-voting clustering ensemble (WVE). First, we integrate data visualization techniques with a permutation sample-pairing strategy to generate a complete and nonredundant set of positive–negative sample pairs. Then, we present an algorithm to generate permutation positive–negative sample pairs to provide a rich set of contrastive training data. To overcome the bias in the decision boundary caused by an insufficient number of positive sample pairs, we developed a Siamese network based on CSCL. The CSCL-Siamese network is used to identify known IoT devices and establish an embedded vector database for known IoT devices. Next, we extract the embedding vectors of unknown IoT devices using the trained CSCL-Siamese network and the embedded vector database. Finally, combining weighting factors with a voting ensemble strategy, we develop a UDI algorithm based on a WVE. This presented algorithm integrates the clustering capabilities of multiple unsupervised clustering algorithms to perform clustering on the extracted embedding vectors of unknown IoT devices, thereby enhancing the identification capability of the CSCL-WVE-UDI method. Experimental results demonstrate that the CSCL-WVE-UDI method can effectively identify multiple types of unknown IoT devices at the same time.
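A cost-sensitive contrastive loss of the kind described can be sketched by up-weighting the scarce positive (same-device) pairs in the standard contrastive objective, countering the decision-boundary bias mentioned above. This is an assumption-laden illustration, not the paper's CSCL formula; `pos_weight` and `margin` are arbitrary choices:

```python
def cs_contrastive_loss(d, same_pair, margin=1.0, pos_weight=2.0):
    """Contrastive loss on an embedding distance d: same-device pairs
    are pulled together (squared distance, up-weighted by pos_weight
    to offset their scarcity); different-device pairs are pushed
    apart until they exceed `margin`."""
    if same_pair:
        return pos_weight * d ** 2
    return max(0.0, margin - d) ** 2

# Positive pair at distance 0.5 is penalized twice as hard as in the
# unweighted loss; a negative pair beyond the margin costs nothing.
print(cs_contrastive_loss(0.5, True))    # 0.5
print(cs_contrastive_loss(2.0, False))   # 0.0
```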
