Discrepancy-Guided Complementary Fusion for Unsupervised Multimodal Anomaly Detection

Lee, Taehui; Jeong, Seyoung; Lee, Sang Jun

doi:10.3390/s26123757

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.


by Taehui Lee, Seyoung Jeong and Sang Jun Lee *

Division of Electronic Engineering, Jeonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju 54896, Republic of Korea

* Author to whom correspondence should be addressed.

Sensors 2026, 26(12), 3757; https://doi.org/10.3390/s26123757 (registering DOI)

Submission received: 13 May 2026 / Revised: 29 May 2026 / Accepted: 9 June 2026 / Published: 12 June 2026

(This article belongs to the Special Issue Multi-Sensor Fusion Technology for Feature Extraction and Intelligent Fault Diagnosis)


Abstract

In industrial inspection, subtle defects often appear as local variations in appearance or geometry, making reliable anomaly detection challenging. A single sensing modality can miss important defect cues, while multimodal inspection combines appearance and geometric information to represent industrial objects more comprehensively. Many existing multimodal anomaly detection methods adopt early fusion strategies that integrate features at an early stage of the network. Such early integration can dilute modality-specific anomaly responses and cause anomaly smoothing, leading to degraded detection and localization performance. To address these challenges, we propose a reconstruction-based unsupervised multimodal anomaly detection framework integrating Discrepancy-Guided Complementary Fusion (DGCF) and Noise to Feature (N2F). Specifically, DGCF reduces anomaly smoothing by exploiting cross-modal discrepancies to extract complementary information, rather than directly summing or concatenating features from different modalities. Furthermore, N2F injects Gaussian noise into the feature space to regularize feature reconstruction and encourage the decoder to learn robust normal representations. Experiments on the MVTec 3D-AD and Eyecandies datasets demonstrate the effectiveness of the proposed method, which achieves 97.3% I-AUROC, 99.6% P-AUROC, and 97.6% AUPRO on MVTec 3D-AD and 94.8% I-AUROC, 98.6% P-AUROC, and 93.4% AUPRO on Eyecandies.

Keywords: industrial anomaly detection; multimodal anomaly detection; multi-sensor fusion; feature-level fusion; feature extraction; unsupervised learning
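The abstract states that N2F injects Gaussian noise into the feature space so that the decoder learns to reconstruct normal features robustly. The snippet below is only a minimal sketch of that general idea, not the authors' implementation: the function names, the noise scale `sigma`, the toy feature values, and the use of a mean squared error against the clean features are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def inject_feature_noise(features, sigma=0.1, seed=None):
    """Return a copy of `features` with zero-mean Gaussian noise added.

    `features` is a list of feature vectors (lists of floats).
    `sigma` is a hypothetical noise scale; the abstract gives no value.
    """
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in features]


def mse(a, b):
    """Mean squared error between two equally shaped lists of vectors."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


# Toy "normal" features (assumed values, two vectors of dimension three).
clean = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]

# Perturb the features; a decoder would receive `noisy` as input.
noisy = inject_feature_noise(clean, sigma=0.1, seed=0)

# Assumed training signal: reconstruct the clean features from the noisy
# ones. With no decoder at all, the loss is just the injected noise energy.
loss_without_decoder = mse(noisy, clean)
```

In a setup like this, a decoder trained to map `noisy` back to `clean` cannot simply copy its input, which is one common motivation for denoising-style regularization.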
