MDPI - Publisher of Open Access Journals

26 pages, 24887 KB

Open Access Article

Artificial Intelligence-Assisted Pathogen Detection: Algorithms, Biosensing Platforms, and Applications

by Jiani Liu, Wang Gao, Chengxi Guo, Wenzhuo Cai, Ziyan Tang, Song Li, Yan Deng, Xiaoguang Qu and Zhu Chen

Biosensors 2026, 16(5), 267; https://doi.org/10.3390/bios16050267 - 5 May 2026

Viewed by 1507

Abstract

Rapid and accurate pathogen detection serves as a core component in infectious disease prevention and control, clinical diagnosis and treatment, and public health surveillance systems. Although traditional detection methods have been widely adopted in clinical practice, they still exhibit significant limitations in terms of detection speed, throughput, automation levels, and adaptability to complex samples. In recent years, artificial intelligence (AI) technology has provided novel technical pathways for pathogen detection by leveraging its strengths in feature learning, pattern recognition, and multidimensional data modeling. The core contribution of this review lies in providing a novel, integrated analytical framework that overcomes the limitations of existing reviews, which often focus on a single modality (such as imaging alone or molecular diagnostics alone). Based on this framework, this paper systematically reviews AI research progress in pathogen detection, focusing on typical applications of machine learning and deep learning algorithms in analyzing imaging data, molecular diagnostic data, sensor signals, microscopic images, and multimodal data. It summarizes AI’s enabling value in enhancing detection sensitivity, specificity, automation, and point-of-care capabilities. Concurrently, this paper delves into key challenges facing AI-assisted pathogen detection, including data standardization, model generalization, interpretability, and clinical translation. It also outlines future trends toward intelligent, integrated, and clinically deployable applications. This paper aims to provide researchers and clinicians in the interdisciplinary field of artificial intelligence, biosensing, and clinical medicine with a comprehensive reference and roadmap for future development. Full article

(This article belongs to the Special Issue Materials and Techniques for Bioanalysis and Biosensing—2nd Edition)

24 pages, 2456 KB

Open Access Article

Adaptive Label Reweighting via Boundary-Aware Meta Learning for Long-Tail Legal Element Recognition

by Kun Han, Chengcheng Han and Pengcheng Zhao

Symmetry 2026, 18(4), 664; https://doi.org/10.3390/sym18040664 - 16 Apr 2026

Viewed by 249

Abstract

Legal element recognition, which identifies discrete factual elements in Chinese court judgments to support judicial analysis and case retrieval, faces a severe long-tail challenge: head-to-tail label-frequency ratios exceed 100:1, and over 60% of sentences carry no label, starving rare elements of training signal. Static reweighting methods assign fixed weights prior to training and cannot respond to the model’s evolving confidence; sample-level meta-learning couples all co-occurring label gradients to a single scalar, preventing independent tail-label amplification. We propose BML-Trans, a boundary-aware meta-learning framework that addresses both limitations. A label-wise meta-weighting mechanism maintains per-label gradient weights updated via bilevel hypergradient descent, decoupling tail-label amplification from co-occurring head labels. A boundary-aware meta-set concentrates calibration signal on high-uncertainty, tail-triggering sentences rather than on easy negatives, and a lightweight Multi-Scale Adapter sharpens the warm-up probability estimates on which boundary selection depends. Concretely, BML-Trans achieves an average Avg-F1 of 82.5% on CAIL2019 across the labor, divorce, and loan domains, outperforming the strongest baseline by 1.2 percentage points overall and by up to 5.7 percentage points on tail-label Macro-F1, at only 14% additional training cost. Ablation confirms a cascade dependency among the three components, establishing that the gains are structural rather than incidental to threshold selection or initialization. Full article

26 pages, 13111 KB

Open Access Review

Advancing Terahertz Biochemical Sensing: From Spectral Fingerprinting to Intelligent Detection

by Haitao Zhang, Zijie Dai, Yunxia Ye and Xudong Ren

Photonics 2026, 13(4), 379; https://doi.org/10.3390/photonics13040379 - 16 Apr 2026

Viewed by 1322

Abstract

Biochemical detection is fundamental to various scientific disciplines, yet conventional methods still face inherent bottlenecks in achieving rapid, ultrasensitive, and simultaneous multi-target analysis. Terahertz (THz) waves, characterized by their unique spectral fingerprinting capabilities and non-destructive properties, have emerged as a compelling platform for advanced biochemical sensing. This review outlines the evolution of THz biochemical sensing over the past two decades, tracing its progression from passive identification toward intelligent perception. We structure this technological trajectory around four core themes: sensitivity enhancement, specific recognition, multi-target visualization, and system intelligence. We first evaluate the fundamental limitations of direct detection techniques, such as THz time-domain spectroscopy (THz-TDS). Building on this, we examine how metamaterial-assisted architectures utilize high-quality-factor resonances to achieve trace-level detection, pushing the limits of detection (LOD) down to the ng/mL or even pg/mL scale, and how surface chemical functionalization provides a molecular lock mechanism for selective targeting in complex samples. Furthermore, we highlight the paradigm shift from single-point spectral measurements to spatially resolved multi-target imaging using pixelated metasurfaces. Finally, the review addresses emerging directions, including dynamically tunable intelligent metasurfaces, multimodal on-chip integration platforms, and the growing integration of artificial intelligence (AI) in inverse design and data interpretation, which achieves classification accuracies exceeding 95% even in complex matrices. By synthesizing these developments, this review provides a comprehensive perspective on the future trajectory of THz sensing technologies. Full article

(This article belongs to the Special Issue Advancements in Terahertz Metamaterial Optics, Devices, and Applications)

32 pages, 15567 KB

Open Access Article

Multi-Module Collaborative Optimization for SAR Image Aircraft Recognition: The SAR-YOLOv8l-ADE Network

by Xing Wang, Wen Hong, Qi Li, Yunqing Liu, Qiong Zhang and Ping Xin

Remote Sens. 2026, 18(2), 236; https://doi.org/10.3390/rs18020236 - 11 Jan 2026

Viewed by 541

Abstract

As a core node of the air transportation network, airports rely on aircraft model identification as a key link to support the development of smart aviation. Synthetic Aperture Radar (SAR), with its strong-penetration imaging capabilities, provides high-quality data support for this task. However, the field of SAR image interpretation faces numerous challenges. To address the core challenges in SAR image-based aircraft recognition, including insufficient dataset samples, single-dimensional target features, significant variations in target sizes, and high missed-detection rates for small targets, this study proposed an improved network architecture, SAR-YOLOv8l-ADE. Four modules achieve collaborative optimization: SAR-ACGAN integrates a self-attention mechanism to expand the dataset; SAR-DFE, a parameter-learnable dual-residual module, extracts multidimensional, detailed features; SAR-C2f, a residual module with multi-receptive field fusion, adapts to multi-scale targets; and 4SDC, a four-branch module with adaptive weights, enhances small-target recognition. Experimental results on the fused dataset SAR-Aircraft-EXT show that the mAP₅₀ of the SAR-YOLOv8l-ADE network is 6.1% higher than that of the baseline network YOLOv8l, reaching 96.5%. Notably, its recognition accuracy for small aircraft targets shows a greater improvement, reaching 95.2%. The proposed network outperforms existing methods in terms of recognition accuracy and generalization under complex scenarios, providing technical support for airport management and control, as well as for emergency rescue in smart aviation. Full article

20 pages, 5981 KB

Open Access Article

A Multimodal Visual–Textual Framework for Detection and Counting of Diseased Trees Caused by Invasive Species in Complex Forest Scenes

by Rui Zhang, Zhibo Chen, Guangyu Huo, Xiaoyu Zhang, Wenda Luo and Liping Mu

Remote Sens. 2025, 17(24), 3971; https://doi.org/10.3390/rs17243971 - 9 Dec 2025

Viewed by 632

Abstract

With the large-scale invasion of alien species, forest ecosystems are facing severe challenges, and the health of trees is increasingly threatened. Accurately detecting and counting trees affected by such invasive species has become a critical issue in forest conservation and resource management. Traditional detection methods usually rely on a single image modality, lack linguistic or semantic guidance, and can often model only one specific type of diseased tree during training, which makes it difficult to differentiate among and generalize to multiple diseased tree types and limits their practicality. To address these challenges, we propose an end-to-end multimodal diseased tree detection model. In the visual encoder of the model, we introduce rotational positional encoding to enhance the model’s ability to perceive detailed structures of trees in images. This design enables more accurate extraction of features related to diseased trees, especially when processing images with complex environments. We further introduce a cross-attention mechanism between the image and text modalities so that the model can deeply fuse visual and textual information, improving detection accuracy by understanding and recognizing the semantics of the disease. Additionally, the method generalizes well, enabling effective recognition based on textual descriptions even when samples are not available. Our model achieves the best results on the Larch Casebearer dataset and the Pests and Diseases Tree dataset, verifying the effectiveness and generalizability of the method. Full article

(This article belongs to the Special Issue Object Detection in Remote Sensing Images Based on Artificial Intelligence)

18 pages, 3102 KB

Open Access Article

MFFN-FCSA: Multi-Modal Feature Fusion Networks with Fully Connected Self-Attention for Radar Space Target Recognition

by Leiyao Liao, Yunda Jiang, Gengxin Zhang and Ziwei Liu

Appl. Sci. 2025, 15(22), 11940; https://doi.org/10.3390/app152211940 - 10 Nov 2025

Cited by 1 | Viewed by 1065

Abstract

Radar space target recognition faces inherent challenges due to complex electromagnetic scattering properties and limited training samples. Conventional single-modality approaches cannot fully characterize targets because their information is incomplete, and existing multi-modal fusion methods often neglect deep exploration of cross-modal feature correlations. To address this issue, this paper presents a novel multi-modal feature fusion network with fully connected self-attention (MFFN-FCSA) for robust radar space target recognition. The proposed framework integrates multi-modal radar data, including high-resolution range profiles (HRRPs) and inverse synthetic aperture radar (ISAR) images, to exploit their complementary characteristics. MFFN-FCSA consists of three modules: parallel convolutional branches for modality-specific feature extraction from HRRPs and ISAR images, an FCSA-based fusion module for cross-modal feature fusion, and a classification head. Specifically, the FCSA fusion module simultaneously learns spatial and channel-wise dependencies via a fully connected self-attention mechanism, which enables it to learn dynamic weights for discriminative features across modalities. Furthermore, the end-to-end MFFN-FCSA model incorporates a composite loss function that combines a focal cross-entropy loss to address class imbalance and a triplet margin loss for enhanced metric learning. Experimental results on a space target dataset with 10 categories show that our model achieves higher recognition accuracy than related single-modality and existing fusion approaches, and show promising generalization in few-shot and polarization-variation scenarios. Full article

(This article belongs to the Special Issue AI-Driven Computer Vision and Pattern Recognition: Challenges and Applications)
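
The composite loss mentioned in this abstract can be illustrated with a small sketch. The abstract only says that a focal cross-entropy term and a triplet margin term are combined; the additive form, the focusing parameter `gamma`, the `margin`, and the `triplet_weight` below are assumptions chosen for illustration, not the authors' settings.

```python
import math
from typing import Sequence


def focal_cross_entropy(probs: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma = 0 reduces to ordinary cross-entropy; larger gamma down-weights
    samples the classifier already gets right (the class-imbalance remedy).
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def triplet_margin(anchor: Sequence[float], positive: Sequence[float],
                   negative: Sequence[float], margin: float = 1.0) -> float:
    """Triplet margin loss with Euclidean distance: max(0, d(a,p) - d(a,n) + margin)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def composite_loss(probs: Sequence[float], target: int,
                   anchor: Sequence[float], positive: Sequence[float],
                   negative: Sequence[float], gamma: float = 2.0,
                   margin: float = 1.0, triplet_weight: float = 1.0) -> float:
    """Assumed combination: focal term plus a weighted triplet term."""
    return (focal_cross_entropy(probs, target, gamma)
            + triplet_weight * triplet_margin(anchor, positive, negative, margin))


if __name__ == "__main__":
    # Hypothetical class probabilities and 2-D embeddings for one training triplet.
    loss = composite_loss([0.7, 0.2, 0.1], 0, (0.0, 0.0), (0.1, 0.0), (2.0, 0.0))
    print(f"composite loss: {loss:.4f}")
```

In a real training loop both terms would be averaged over a batch and computed with an autodiff framework; the plain-Python version only shows how the two terms behave.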

17 pages, 2093 KB

Open Access Article

Plant Bioelectrical Signals for Environmental and Emotional State Classification

by Peter A. Gloor

Biosensors 2025, 15(11), 744; https://doi.org/10.3390/bios15110744 - 5 Nov 2025

Cited by 1 | Viewed by 3691

Abstract

In this study, we present a pilot investigation using a single Purple Heart plant (Tradescantia pallida) to explore whether its bioelectrical signals can support two classification tasks: environmental state detection and human emotion recognition. Using an AD8232 ECG sensor at a 400 Hz sampling rate, we recorded 3 s bioelectrical signal segments with 1 s overlap and converted them to mel-spectrograms for classification with a ResNet18 convolutional neural network (CNN). For lamp on/off detection, we achieved 85.4% accuracy with balanced precision (0.85–0.86) and recall (0.84–0.86) across 2767 spectrogram samples. For human emotion classification, the system performed best at a 1 s lag, reaching 73% accuracy in distinguishing happy from sad emotional states across 1619 samples. These results should be viewed as preliminary and exploratory, demonstrating feasibility rather than definitive evidence of plant-based emotion sensing. Replication across plants, days, and experimental sites will be essential to establish robustness. The current study is limited by a single-plant setup, modest sample size, and reliance on human face-tracking labels, which together preclude strong claims about generalizability. Full article

(This article belongs to the Special Issue Biosensing Technology in Agriculture and Biological Products)
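
The segmentation step described in this abstract (400 Hz sampling, 3 s segments, 1 s overlap) can be sketched as a sliding window. The sampling rate and durations come from the abstract; the function name and the choice to drop a trailing partial window are assumptions made for illustration, and the later mel-spectrogram conversion is not shown.

```python
from typing import List, Sequence


def segment_signal(
    samples: Sequence[float],
    sample_rate_hz: int = 400,  # from the abstract
    window_s: float = 3.0,      # from the abstract
    overlap_s: float = 1.0,     # from the abstract
) -> List[Sequence[float]]:
    """Split a 1-D recording into fixed-length, overlapping windows."""
    window = int(round(window_s * sample_rate_hz))             # 1200 samples
    hop = int(round((window_s - overlap_s) * sample_rate_hz))  # 800 samples
    if hop <= 0:
        raise ValueError("overlap must be shorter than the window")
    # Keep only complete windows; a trailing partial window is dropped (assumption).
    return [samples[start:start + window]
            for start in range(0, len(samples) - window + 1, hop)]


if __name__ == "__main__":
    recording = [0.0] * (10 * 400)  # 10 s of placeholder samples
    segments = segment_signal(recording)
    print(len(segments), len(segments[0]))  # prints "4 1200"
```

With a 1 s overlap, consecutive windows start 2 s apart, so a 10 s recording yields four complete segments.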

17 pages, 2060 KB

Open Access Article

Continuous Optical Biosensing of IL-8 Cancer Biomarker Using a Multimodal Platform

by A. L. Hernandez, K. Mandal, B. Santamaria, S. Quintero, M. R. Dokmeci, V. Jucaud and M. Holgado

Bioengineering 2025, 12(10), 1115; https://doi.org/10.3390/bioengineering12101115 - 17 Oct 2025

Cited by 1 | Viewed by 1404

Abstract

In this work, we used a label-free biosensor that provides optical readouts to perform continuous detection of human interleukin 8 (IL-8), which is especially overexpressed in certain cancers and, thus, could be an effective biomarker for cancer prognosis estimation and therapy evaluation. For this purpose, we engineered a compact, portable, and easy-to-assemble biosensing module. It combines a fluidic chip for reagent flow, a biosensing chip for signal transduction, and an optical readout head based on fiber optics in a single module. The biosensing chip is based on independent arrays of resonant nanopillar transducer (RNP) networks. We integrated the biosensing chip with the RNPs facing down in a simple and rapidly fabricated polydimethylsiloxane (PDMS) microfluidic chip, with inlet and outlet channels for the sample flowing through the RNPs. The RNPs were vertically oriented from the backside through an optical fiber mounted on a holder head fabricated ad hoc from polytetrafluoroethylene (PTFE). The optical fiber was connected to a visible spectrometer for optical response analysis and consecutive biomolecule detection. We obtained a sensogram showing anti-IL-8 immobilization and the specific recognition of IL-8. This unique portable and easy-to-handle module can be used for biomolecule detection within minutes and is particularly suitable for in-line sensing of physiological and biomimetic organ-on-a-chip systems. Continuous monitoring of cancer biomarkers is emerging as an efficient and non-invasive alternative to classical tools (imaging, immunohistology) for determining clinical prognostic factors and therapeutic responses to anticancer drugs. In addition, the multiplexed layout of the optical transducers and the simplicity of the monolithic sensing module offer the potential for high-throughput screening of a combination of different biomarkers, which, together with other medical exams (such as imaging and/or patient history), could become a cutting-edge technology for further and more accurate diagnosis and prediction of cancer and similar diseases. Full article

(This article belongs to the Section Biosignal Processing)

44 pages, 7582 KB

Open Access Editor’s Choice Article

Continuous Authentication in Resource-Constrained Devices via Biometric and Environmental Fusion

by Nida Zeeshan, Makhabbat Bakyt, Naghmeh Moradpoor and Luigi La Spada

Sensors 2025, 25(18), 5711; https://doi.org/10.3390/s25185711 - 12 Sep 2025

Cited by 2 | Viewed by 4087

Abstract

Continuous authentication allows devices to keep checking that the active user is still the rightful owner instead of relying on a single login. However, current methods can be tricked by forged faces, can reveal personal data, or can drain the battery. Additionally, the environment in which the user operates plays a vital role in determining the user’s online security. Attacks such as impersonation and replay can easily compromise the user or the device. We present a lightweight system that pairs face recognition with complex environmental sensing: the phone revalidates the user when the surrounding light or noise changes. A convolutional network turns each captured face into a 128-bit code, which is combined with a random “nonce” and protected by hashing. A camera–microphone module monitors light and sound to decide when to sample again, reducing unnecessary checks. We verified the protocol with formal security tools (Scyther v1.1.3) and confirmed resistance to replay, interception, deepfake, and impersonation attacks. Across 2700 authentication cycles on a Snapdragon 778G testbed, the median decision time decreased from 61.2 ± 3.4 ms to 42.3 ± 2.1 ms (p < 0.01, paired t-test). Data usage per authentication cycle fell by an average of 24.7% ± 1.8%, and mean energy consumption per cycle decreased from 21.3 mJ to 19.8 mJ (∆ = 6.6 mJ, 95% CI: 5.9–7.2). These differences were consistent across varying lighting (≤50, 50–300, >300 lux) and noise conditions (30–55 dB SPL). These results show that smart-sensor-triggered face recognition can offer secure and energy-efficient continuous verification, supporting smart imaging and deep-learning-based face recognition. Full article

(This article belongs to the Section Environmental Sensing)
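
The nonce-and-hash step in this abstract can be sketched as follows. The abstract does not name the hash function, the nonce length, or how the two values are combined, so SHA-256, a 16-byte nonce, and simple concatenation are assumptions here, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

FACE_CODE_BYTES = 16  # the abstract's 128-bit face code
NONCE_BYTES = 16      # assumed nonce length


def protect_face_code(face_code: bytes, nonce: bytes) -> bytes:
    """Hash nonce || face_code so the raw code is never sent and replays do not match."""
    if len(face_code) != FACE_CODE_BYTES:
        raise ValueError("expected a 128-bit (16-byte) face code")
    return hashlib.sha256(nonce + face_code).digest()


def verify(face_code: bytes, nonce: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    return hmac.compare_digest(protect_face_code(face_code, nonce), expected_digest)


if __name__ == "__main__":
    enrolled_code = secrets.token_bytes(FACE_CODE_BYTES)  # placeholder for a CNN output
    nonce = secrets.token_bytes(NONCE_BYTES)              # fresh per authentication cycle
    digest = protect_face_code(enrolled_code, nonce)
    print(verify(enrolled_code, nonce, digest))
```

Because a fresh nonce is drawn for each cycle, a digest captured in one cycle does not verify in the next, which is the property that resists replay. This sketch requires an exact match of the face code; a deployed system would need a matching scheme that tolerates small variations between captures.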

34 pages, 10418 KB

Open Access Article

Entropy-Fused Enhanced Symplectic Geometric Mode Decomposition for Hybrid Power Quality Disturbance Recognition

by Chencheng He, Wenbo Wang, Xuezhuang E, Hao Yuan and Yuyi Lu

Entropy 2025, 27(9), 920; https://doi.org/10.3390/e27090920 - 30 Aug 2025

Cited by 3 | Viewed by 1122

Abstract

Electrical networks face operational challenges from power quality-affecting disturbances. Since disturbance signatures directly affect classifier performance, optimized feature selection becomes critical for accurate power quality assessment. The pursuit of robust feature extraction inevitably constrains the dimensionality of the discriminative feature set, but the complexity of the recognition model will be increased and the recognition speed will be reduced if the feature vector dimension is too high. Building upon the aforementioned requirements, in this paper, we propose a feature extraction framework that combines improved symplectic geometric mode decomposition, refined generalized multiscale quantum entropy, and refined generalized multiscale reverse dispersion entropy. Firstly, based on the intrinsic properties of power quality disturbance (PQD) signals, the embedding dimension of symplectic geometric mode decomposition and the adaptive mode component screening method are improved, and the PQD signal undergoes tri-band decomposition via improved symplectic geometric mode decomposition (ISGMD), yielding distinct high-frequency, medium-frequency, and low-frequency components. Secondly, utilizing the enhanced symplectic geometric mode decomposition as a foundation, the perturbation features are extracted by the combination of refined generalized multiscale quantum entropy and refined generalized multiscale reverse dispersion entropy to construct high-precision and low-dimensional feature vectors. Finally, a double-layer composite power quality disturbance model is constructed by a deep extreme learning machine algorithm to identify power quality disturbance signals. After analysis and comparison, the proposed method is found to be effective even in a strong noise environment with a single interference, and the average recognition accuracy across different noise environments is 97.3%. Under the complex conditions involving multiple types of mixed perturbations, the average recognition accuracy is maintained above 96%. Compared with the existing CNN + LSTM method, the recognition accuracy of the proposed method is improved by 3.7%. In addition, its recognition accuracy in scenarios with small data samples is significantly better than that of traditional methods, such as single CNN models and LSTM models. The experimental results show that the proposed strategy can accurately classify and identify various power quality interferences and that it is better than traditional methods in terms of classification accuracy and robustness. The experimental results of the simulation and measured data show that the combined feature extraction methodology reliably extracts discriminative feature vectors from PQD. The double-layer combined classification model can further enhance the model’s recognition capabilities. This method has high accuracy and certain noise resistance. In the 30 dB white noise environment, the average classification accuracy of the model is 99.10% for the simulation database containing 63 PQD types. Meanwhile, for the test data based on a hardware platform, the average accuracy is 99.03%, and the approach’s dependability is further evidenced by rigorous validation experiments. Full article

25 pages, 5445 KB

Open Access Article

HyperspectralMamba: A Novel State Space Model Architecture for Hyperspectral Image Classification

by Jianshang Liao and Liguo Wang

Remote Sens. 2025, 17(15), 2577; https://doi.org/10.3390/rs17152577 - 24 Jul 2025

Cited by 10 | Viewed by 2425

Abstract

Hyperspectral image classification faces challenges with high-dimensional spectral data and complex dependencies between bands. This paper proposes HyperspectralMamba, a novel architecture for hyperspectral image classification that integrates state space modeling with adaptive recalibration mechanisms. The method addresses limitations in existing techniques through three key innovations: (1) a novel dual-stream architecture that combines SSM global modeling with parallel convolutional local feature extraction, distinguishing our approach from existing single-stream SSM methods; (2) a band-adaptive feature recalibration mechanism specifically designed for hyperspectral data that adaptively adjusts the importance of different spectral band features; and (3) an effective feature fusion strategy that integrates global and local features through residual connections. Experimental results on three benchmark datasets—Indian Pines, Pavia University, and Salinas Valley—demonstrate that the proposed method achieves overall accuracies of 95.31%, 98.60%, and 96.40%, respectively, significantly outperforming existing convolutional neural networks, attention-enhanced networks, and Transformer methods. HyperspectralMamba demonstrates an exceptional performance in small-sample class recognition and distinguishing spectrally similar terrain, while maintaining lower computational complexity, providing a new technical approach for high-precision hyperspectral image classification. Full article

(This article belongs to the Special Issue Recent Advances in the Processing of Hyperspectral Images (Second Edition))

14 pages, 743 KB

Open Access Article

AD-VAE: Adversarial Disentangling Variational Autoencoder

by Adson Silva and Ricardo Farias

Sensors 2025, 25(5), 1574; https://doi.org/10.3390/s25051574 - 4 Mar 2025

Cited by 8 | Viewed by 2530

Abstract

Face recognition (FR) is a less intrusive biometrics technology with various applications, such as security, surveillance, and access control systems. FR remains challenging, especially when there is only a single image per person as a gallery dataset and when dealing with variations like pose, illumination, and occlusion. Deep learning techniques have shown promising results in recent years using VAE and GAN, with approaches such as patch-VAE, VAE-GAN for 3D Indoor Scene Synthesis, and hybrid VAE-GAN models. However, in Single Sample Per Person Face Recognition (SSPP FR), the challenge of learning robust and discriminative features that preserve the subject’s identity persists. To address these issues, we propose a novel framework called AD-VAE, specifically for SSPP FR, using a combination of variational autoencoder (VAE) and Generative Adversarial Network (GAN) techniques. The proposed AD-VAE framework is designed to learn how to build representative identity-preserving prototypes from both controlled and wild datasets, effectively handling variations like pose, illumination, and occlusion. The method uses four networks: an encoder and decoder similar to VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The proposed framework achieves superior results on four controlled benchmark datasets—AR, E-YaleB, CAS-PEAL, and FERET—with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and achieves remarkable performance on the uncontrolled LFW dataset, with a recognition rate of 99.6%. The AD-VAE framework shows promising potential for future research and real-world applications. Full article

(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

24 pages, 6943 KB

Open Access Article

Multi-Channel Fusion Decision-Making Online Detection Network for Surface Defects in Automotive Pipelines Based on Transfer Learning VGG16 Network

by Jian Song, Yingzhong Tian and Xiang Wan

Sensors 2024, 24(24), 7914; https://doi.org/10.3390/s24247914 - 11 Dec 2024

Cited by 5 | Viewed by 1741

Abstract

Although approaches for the online surface detection of automotive pipelines exist, low defect area rates, small-sample and long-tailed data, and the difficulty of detection due to the variable morphology of defects are three major problems faced when using such methods. In order to solve these problems, this study combines traditional visual detection methods and deep neural network technology to propose a transfer learning multi-channel fusion decision network without significantly increasing the number of network layers or the structural complexity. Each channel of the network is designed according to the characteristics of different types of defects. Dynamic weights are assigned to achieve decision-level fusion through the use of a matrix of indicators to evaluate the performance of each channel’s recognition ability. In order to improve the detection efficiency and reduce the amount of data transmission and processing, an improved ROI detection algorithm for surface defects is proposed. It can enable the rapid screening of target surfaces for the high-quality and rapid acquisition of surface defect images. On an automotive pipeline surface defect dataset, the detection accuracy of the multi-channel fusion decision network with transfer learning was 97.78% and its detection speed was 153.8 FPS. The experimental results indicate that the multi-channel fusion decision network could simultaneously take into account the needs for real-time detection and accuracy, synthesize the advantages of different network structures, and avoid the limitations of single-channel networks. Full article

(This article belongs to the Section Communications)

23 pages, 3243 KB

Open AccessArticle

StarCAN-PFD: An Efficient and Simplified Multi-Scale Feature Detection Network for Small Objects in Complex Scenarios

by Zongxuan Chai, Tingting Zheng and Feixiang Lu

Electronics 2024, 13(15), 3076; https://doi.org/10.3390/electronics13153076 - 3 Aug 2024

Cited by 8 | Viewed by 3002

Abstract

Small object detection in traffic sign applications often faces challenges such as complex backgrounds, blurry samples, and multi-scale variations, and existing solutions tend to complicate the algorithms. In this study, we designed an efficient and simple network called StarCAN-PFD, based on the single-stage YOLOv8 framework, to accurately recognize small objects in complex scenarios. We proposed the StarCAN feature extraction network, enhanced with Context Anchor Attention (CAA). We designed the Pyramid Focus and Diffusion Network (PFDNet) to address multi-scale information loss and developed the Detail-Enhanced Conv Shared Detect (DESDetect) module to improve the recognition of complex samples while keeping the network lightweight. Experiments on the CCTSDB dataset validated the effectiveness of each module. Compared to YOLOv8, our algorithm improved mAP@0.5 by 4%, reduced the model size to less than half, and performed better on different traffic sign datasets. It excels at detecting small traffic sign targets in complex scenes, including challenging blurry, low-light night, occluded, and overexposed samples, which indicates strong generalization ability.
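For readers unfamiliar with the mAP@0.5 metric quoted above, the following sketch shows the underlying matching rule: a predicted box counts as a true positive when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5. The function names, the `(x1, y1, x2, y2)` box format, and the greedy matching order are assumptions for illustration; this is not the evaluation code used in the paper, and a full mAP computation would additionally average precision over recall levels and classes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_true_positives(preds, gts, threshold=0.5):
    """Greedily match predictions (assumed sorted by confidence, highest
    first) to ground-truth boxes; each ground truth is used at most once."""
    matched = set()
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_iou >= threshold:
            matched.add(best_j)
            true_positives += 1
    return true_positives
```

Because IoU is a ratio of areas, a shift of a few pixels costs a small box far more overlap than a large one, which is one reason small traffic signs are hard to score well on this metric.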

(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)

18 pages, 10575 KB

Open AccessArticle

Synthetic Image Generation Using Conditional GAN-Provided Single-Sample Face Image

by Muhammad Ali Iqbal, Waqas Jadoon and Soo Kyun Kim

Appl. Sci. 2024, 14(12), 5049; https://doi.org/10.3390/app14125049 - 10 Jun 2024

Cited by 15 | Viewed by 6291

Abstract

The performance of facial recognition systems decreases significantly when training images are scarce, and the problem is most severe when only one image per subject is available, the single sample per person (SSPP) setting. Probe images may vary in illumination, expression, and disguise, which makes them difficult to recognize accurately. In this work, we present a model based on a conditional GAN (CGAN) that generates six highly realistic facial expressions from a single neutral face image. To evaluate our approach comprehensively, we employed several pre-trained models (VGG-Face, ResNet-50, FaceNet, and DeepFace) along with a custom CNN model. Initially, these models achieved only about 76% accuracy on single-sample neutral images, highlighting the SSPP challenge. After fine-tuning on the synthetic expressions generated by our CGAN from these single images, their accuracy increased to around 99%, which shows that the method is highly effective in addressing the SSPP problem.

(This article belongs to the Special Issue Recent Research on Big Data Mining for Social Networks)

Search Results (40)