Search Results (125)

Search Parameters:
Keywords = unsupervised fusion network

19 pages, 1198 KB  
Article
GSMTNet: Dual-Stream Video Anomaly Detection via Gated Spatio-Temporal Graph and Multi-Scale Temporal Learning
by Di Jiang, Huicheng Lai, Guxue Gao, Dan Ma and Liejun Wang
Electronics 2026, 15(6), 1200; https://doi.org/10.3390/electronics15061200 - 13 Mar 2026
Viewed by 122
Abstract
Video Anomaly Detection aims to identify video segments containing abnormal events. Detecting such anomalies relies heavily on temporal modeling, particularly when anomalies exhibit only subtle deviations from normal events. However, most existing methods inadequately model the heterogeneity in spatiotemporal relationships, especially the dynamic interactions between human pose and video appearance. To address this, we propose GSMTNet, a dual-stream heterogeneous unsupervised network integrating gated spatio-temporal graph convolution and multi-scale temporal learning. First, we introduce a dynamic graph structure learning module, which leverages gated spatio-temporal graph convolutions with manifold transformations to model latent spatial relationships via human pose graphs. This is coupled with a normalizing flow-based density estimation module to model the probability distribution of normal samples in a latent space. Second, we design a hybrid dilated temporal module that employs multi-scale temporal feature learning to simultaneously capture long- and short-term dependencies, thereby enhancing the separability between normal patterns and potential deviations. Finally, we propose a dual-stream fusion module to hierarchically integrate features learned from pose graphs and raw video sequences, followed by a prediction head that computes anomaly scores from the fused features. Extensive experiments demonstrate state-of-the-art performance, achieving 86.81% AUC on ShanghaiTech and 70.43% on UBnormal, outperforming existing methods in rare anomaly scenarios. Full article
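
The listing gives no implementation details for the dual-stream fusion module; purely as an illustration of the general idea (gating pose-graph features against video features and scoring the fused representation), a minimal PyTorch sketch is shown below. The module name, feature dimensions, and gating form are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class DualStreamFusionHead(nn.Module):
    """Toy sketch: gate-weighted fusion of pose-graph and video features,
    followed by a small prediction head that outputs an anomaly score."""

    def __init__(self, pose_dim=128, video_dim=256, fused_dim=128):
        super().__init__()
        self.pose_proj = nn.Linear(pose_dim, fused_dim)
        self.video_proj = nn.Linear(video_dim, fused_dim)
        # Gate decides, per sample, how much to trust each stream.
        self.gate = nn.Sequential(nn.Linear(2 * fused_dim, fused_dim), nn.Sigmoid())
        self.score_head = nn.Sequential(
            nn.Linear(fused_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, pose_feat, video_feat):
        p = self.pose_proj(pose_feat)               # (B, fused_dim)
        v = self.video_proj(video_feat)             # (B, fused_dim)
        g = self.gate(torch.cat([p, v], dim=-1))
        fused = g * p + (1.0 - g) * v               # gated combination of the two streams
        return self.score_head(fused).squeeze(-1)   # (B,) anomaly scores

scores = DualStreamFusionHead()(torch.randn(4, 128), torch.randn(4, 256))
print(scores.shape)  # torch.Size([4])
```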

22 pages, 3475 KB  
Article
Cross-Layer Feature Fusion and Attention-Based Class Feature Alignment Network for Unsupervised Cross-Domain Remote Sensing Scene Classification
by Jiahao Wei, Erzhu Li and Ce Zhang
Remote Sens. 2026, 18(6), 859; https://doi.org/10.3390/rs18060859 - 11 Mar 2026
Viewed by 105
Abstract
Remote sensing scene classification is one of the crucial techniques for high-resolution remote sensing image interpretation and has received widespread attention in recent years. However, acquiring high-quality labeled data is both costly and time-consuming, making unsupervised domain adaptation (UDA) an important research focus in scene classification. Existing UDA methods focus primarily on aligning the overall feature distributions across domains but neglect class feature alignment, resulting in the loss of critical class information. To address this issue, a cross-layer feature fusion and attention-based class feature alignment network (CFACA-NET) is proposed for unsupervised cross-domain remote sensing scene classification. Specifically, a multi-layer feature extraction module (MFEM) consisting of a cross-layer feature fusion module (CFFM), a multi-scale dynamic attention module (MSDAM), and a fused feature optimization module (FFOM) is designed to enhance the representation ability of scene features. A high-confidence sample selection module is further introduced, which utilizes evidence theory and information entropy to obtain reliable pseudo-labels. Finally, a class feature alignment module is proposed, incorporating a two-stage training strategy to achieve effective class feature alignment. Experimental results on three remote sensing scene classification datasets demonstrate that CFACA-NET outperforms existing state-of-the-art methods in cross-domain classification performance, effectively enhancing cross-domain adaptation capability. Full article
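
The high-confidence sample selection above combines evidence theory with information entropy; as a simplified stand-in for that step, the sketch below keeps only the entropy part and selects low-entropy target-domain predictions as pseudo-labels. The quantile threshold and function name are illustrative assumptions, not the paper's module.

```python
import numpy as np

def select_confident_pseudo_labels(probs, entropy_quantile=0.2):
    """Keep target samples whose predictive entropy lies in the lowest quantile.

    probs: (N, C) softmax outputs for unlabeled target-domain samples.
    Returns the indices of retained samples and their pseudo-labels.
    """
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)   # (N,)
    threshold = np.quantile(entropy, entropy_quantile)
    keep = np.where(entropy <= threshold)[0]
    pseudo_labels = probs[keep].argmax(axis=1)
    return keep, pseudo_labels

probs = np.random.dirichlet(alpha=[1.0] * 5, size=100)  # fake predictions for 5 classes
idx, labels = select_confident_pseudo_labels(probs)
print(len(idx), labels[:5])
```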

25 pages, 1853 KB  
Article
Deep Learning for Process Monitoring and Defect Detection of Laser-Based Powder Bed Fusion of Polymers
by Mohammadali Vaezi, Victor Klamert and Mugdim Bublin
Polymers 2026, 18(5), 629; https://doi.org/10.3390/polym18050629 - 3 Mar 2026
Viewed by 477
Abstract
Maintaining consistent part quality remains a critical challenge in industrial additive manufacturing, particularly in laser-based powder bed fusion of polymers (PBF-LB/P), where crystallization-driven thermal instabilities, governed by isothermal crystallization within a narrow sintering window, precipitate defects such as curling, warping, and delamination. In contrast to metal-based systems dominated by melt-pool hydrodynamics, polymer PBF-LB/P requires monitoring strategies capable of resolving subtle spatio-temporal thermal deviations under realistic industrial operating conditions. Although machine learning, particularly convolutional neural networks (CNNs), has demonstrated efficacy in defect detection, a structured evaluation of heterogeneous modeling paradigms and their deployment feasibility in polymer PBF-LB/P remains limited. This study presents a systematic cross-paradigm assessment of unsupervised anomaly detection (autoencoders and generative adversarial networks), supervised CNN classifiers (VGG-16, ResNet50, and Xception), hybrid CNN-LSTM architectures, and physics-informed neural networks (PINNs) using 76,450 synchronized thermal and RGB images acquired from a commercial industrial system operating under closed control constraints. CNN-based models enable frame- and sequence-level defect classification, whereas the PINN component complements detection by providing physically consistent thermal-field regression. The results reveal quantifiable trade-offs between detection performance, temporal robustness, physical consistency, and algorithmic complexity. Pre-trained CNNs achieve up to 99.09% frame-level accuracy but impose a substantial computational burden for edge deployment. The PINN model attains an RMSE of approximately 27 K under quasi-isothermal process conditions, supporting trend-level thermal monitoring. A lightweight hybrid CNN achieves 99.7% validation accuracy with 1860 parameters and a CPU-benchmarked forward-pass inference time of 1.6 ms (excluding sensor acquisition latency). Collectively, this study establishes a rigorously benchmarked, scalable, and resource-efficient deep-learning framework tailored to crystallization-dominated polymer PBF-LB/P, providing a technically grounded basis for real-time industrial quality monitoring. Full article
(This article belongs to the Special Issue Artificial Intelligence in Polymers)
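
The lightweight hybrid CNN mentioned in the abstract (about 1860 parameters) is not described in detail here; the sketch below only indicates how small a frame-level defect classifier of that kind can be. Layer sizes are assumptions and do not reproduce the reported parameter count.

```python
import torch
import torch.nn as nn

class TinyDefectCNN(nn.Module):
    """Illustrative frame-level defect classifier with very few parameters."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(8, 2)      # defect vs. no defect

    def forward(self, x):                      # x: (B, 1, H, W) thermal frame
        return self.classifier(self.features(x).flatten(1))

model = TinyDefectCNN()
print(sum(p.numel() for p in model.parameters()))  # a few hundred parameters
logits = model(torch.randn(2, 1, 64, 64))
print(logits.shape)  # torch.Size([2, 2])
```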

18 pages, 1427 KB  
Article
Whole-Slide Image Classification via Deep Feature Fusion and Unsupervised Conditional Domain Adaptation
by Pin Wang, Jinhua Zhang, Yongming Li, Tianqi Long and Pufei Li
Appl. Sci. 2026, 16(5), 2310; https://doi.org/10.3390/app16052310 - 27 Feb 2026
Viewed by 178
Abstract
Deep learning models have received widespread attention in pathological image classification and recognition tasks. However, their performance relies on large amounts of annotated data, which are difficult to obtain for pathological images, severely limiting model generalization. Moreover, whole-slide images (WSIs) are extremely large and must be divided into patches for processing, which often leads to the loss of global information and degrades recognition performance. To address these issues, this paper proposes a cross-domain WSI classification and recognition method based on deep feature fusion and conditional domain alignment (CTCA). The method targets unsupervised domain adaptation scenarios. It constructs a parallel architecture of transfer-pretrained networks, BreNet and Swin Transformer, to jointly extract local detail and global contextual features, achieving multi-scale and multi-perspective deep feature fusion. Subsequently, labels are introduced as conditional variables in the latent space to perform conditional domain alignment, preserving category correlations while reducing distribution discrepancies between domains. Finally, lesion regions in WSIs are visually annotated using predicted probability heatmaps. Experiments show that, under an unsupervised setting, the method effectively leverages small-scale labeled data to guide lesion recognition in unlabeled WSIs, outperforming existing unsupervised domain adaptation methods in accuracy and stability and enabling visualization of regions of interest to support clinical diagnosis. Full article
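
Conditional domain alignment with labels as conditioning variables can be realized in several ways; the sketch below shows one simple reading, a class-conditional penalty on the distance between per-class feature means of the source and target domains. It is an assumption about the general mechanism, not the paper's exact CTCA formulation.

```python
import torch

def class_conditional_alignment_loss(src_feat, src_labels, tgt_feat, tgt_labels, num_classes):
    """Penalize the distance between per-class feature means of the two domains.

    src_feat, tgt_feat: (Ns, D) and (Nt, D) latent features.
    src_labels: ground-truth source labels; tgt_labels: target pseudo-labels.
    """
    loss = src_feat.new_zeros(())
    matched = 0
    for c in range(num_classes):
        s_mask, t_mask = src_labels == c, tgt_labels == c
        if s_mask.any() and t_mask.any():
            loss = loss + (src_feat[s_mask].mean(0) - tgt_feat[t_mask].mean(0)).pow(2).sum()
            matched += 1
    return loss / max(matched, 1)

loss = class_conditional_alignment_loss(
    torch.randn(32, 64), torch.randint(0, 4, (32,)),
    torch.randn(40, 64), torch.randint(0, 4, (40,)), num_classes=4,
)
print(float(loss))
```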

16 pages, 1443 KB  
Article
DCRDF-Net: A Dual-Channel Reverse-Distillation Fusion Network for 3D Industrial Anomaly Detection
by Chunshui Wang, Jianbo Chen and Heng Zhang
Sensors 2026, 26(2), 412; https://doi.org/10.3390/s26020412 - 8 Jan 2026
Viewed by 372
Abstract
Industrial surface defect detection is essential for ensuring product quality, but real-world production lines often provide only a limited number of defective samples, making supervised training difficult. Multimodal anomaly detection with aligned RGB and depth data is a promising solution, yet existing fusion schemes tend to overlook modality-specific characteristics and cross-modal inconsistencies, so that defects visible in only one modality may be suppressed or diluted. In this work, we propose DCRDF-Net, a dual-channel reverse-distillation fusion network for unsupervised RGB–depth industrial anomaly detection. The framework learns modality-specific normal manifolds from nominal RGB and depth data and detects defects as deviations from these learned manifolds. It consists of three collaborative components: a Perlin-guided pseudo-anomaly generator that injects appearance–geometry-consistent perturbations into both modalities to enrich training signals; a dual-channel reverse-distillation architecture with guided feature refinement that denoises teacher features and constrains RGB and depth students towards clean, defect-free representations; and a cross-modal squeeze–excitation gated fusion module that adaptively combines RGB and depth anomaly evidence based on their reliability and agreement. Extensive experiments on the MVTec 3D-AD dataset show that DCRDF-Net achieves 97.1% image-level I-AUROC and 98.8% pixel-level PRO, surpassing current state-of-the-art multimodal methods on this benchmark. Full article
(This article belongs to the Section Sensor Networks)
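
The cross-modal squeeze-excitation gated fusion is summarized only briefly above; the sketch below illustrates one plausible form, an SE-style gate that produces per-sample weights for the RGB and depth anomaly features. Channel counts and the reduction ratio are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class SEGatedFusion(nn.Module):
    """Squeeze-excitation style gate that weights RGB vs. depth anomaly features."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(2 * channels, 2 * channels // reduction), nn.ReLU(),
            nn.Linear(2 * channels // reduction, 2), nn.Softmax(dim=-1),
        )

    def forward(self, rgb_map, depth_map):        # both: (B, C, H, W)
        # Squeeze: global average over spatial dims, concatenated across modalities.
        squeezed = torch.cat([rgb_map.mean(dim=(2, 3)), depth_map.mean(dim=(2, 3))], dim=1)
        w = self.fc(squeezed)                     # (B, 2) modality weights
        w_rgb = w[:, 0].view(-1, 1, 1, 1)
        w_depth = w[:, 1].view(-1, 1, 1, 1)
        return w_rgb * rgb_map + w_depth * depth_map

fused = SEGatedFusion(channels=16)(torch.randn(2, 16, 32, 32), torch.randn(2, 16, 32, 32))
print(fused.shape)  # torch.Size([2, 16, 32, 32])
```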

24 pages, 4080 KB  
Article
An Unsupervised Situation Awareness Framework for UAV Sensor Data Fusion Enabled by a Stabilized Deep Variational Autoencoder
by Anxin Guo, Zhenxing Zhang, Rennong Yang, Ying Zhang, Liping Hu and Leyan Li
Sensors 2026, 26(1), 111; https://doi.org/10.3390/s26010111 - 24 Dec 2025
Viewed by 562
Abstract
Effective situation awareness relies on the robust processing of high-dimensional data streams generated by onboard sensors. However, the application of deep generative models to extract features from complex UAV sensor data (e.g., GPS, IMU, and radar feeds) faces two fundamental challenges: critical training instability and the difficulty of representing multi-modal distributions inherent in dynamic flight maneuvers. To address this, this paper proposes a novel unsupervised sensor data processing framework to overcome these issues. Our core innovation is a deep generative model, VAE-WRBM-MDN, specifically engineered for stable feature extraction from non-linear time-series sensor data. We demonstrate that while standard Variational Autoencoders (VAEs) often struggle to converge on this task, our introduction of Weighted-uncertainty Restricted Boltzmann Machines (WRBM) for layer-wise pre-training ensures stable learning. Furthermore, the integration of a Mixture Density Network (MDN) enables the decoder to accurately reconstruct the complex, multi-modal conditional distributions of sensor readings. Comparative experiments validate our approach, achieving 95.69% classification accuracy in identifying situational patterns. The results confirm that our framework provides robust enabling technology for real-time intelligent sensing and raw data interpretation in autonomous systems. Full article
(This article belongs to the Section Intelligent Sensors)
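
The Mixture Density Network decoder mentioned above models each reconstructed sensor reading as a Gaussian mixture; the sketch below gives the standard MDN negative log-likelihood for that case, independent of the authors' VAE-WRBM-MDN implementation.

```python
import math
import torch

def mdn_nll(pi_logits, mu, log_sigma, target):
    """Negative log-likelihood of a 1-D Gaussian mixture per output dimension.

    pi_logits, mu, log_sigma: (B, D, K) mixture parameters for D sensor channels.
    target: (B, D) observed sensor values.
    """
    log_pi = torch.log_softmax(pi_logits, dim=-1)
    sigma = log_sigma.exp()
    t = target.unsqueeze(-1)                                  # (B, D, 1)
    log_prob = -0.5 * ((t - mu) / sigma) ** 2 - log_sigma - 0.5 * math.log(2.0 * math.pi)
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

B, D, K = 8, 6, 5
loss = mdn_nll(torch.randn(B, D, K), torch.randn(B, D, K), torch.randn(B, D, K), torch.randn(B, D))
print(float(loss))
```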

26 pages, 12587 KB  
Article
Shift-Invariant Unsupervised Pansharpening Based on Diffusion Model
by Jialei Xie, Luyan Ji, Jinzhou Ye, Jilei Liu, Qi Feng, Kejian Liu and Yongchao Zhao
Remote Sens. 2026, 18(1), 27; https://doi.org/10.3390/rs18010027 - 22 Dec 2025
Viewed by 366
Abstract
Pansharpening is a crucial topic in remote sensing, and numerous deep learning-based methods have recently been proposed to explore the potential of deep neural networks (DNNs). However, existing approaches are often sensitive to spatial translation errors between high-resolution panchromatic (HRPan) and low-resolution multispectral (LRMS) images, leading to noticeable artifacts in the fused results. To address this issue, we propose an unsupervised pansharpening method that is robust to translation misalignment between HRPan and LRMS inputs. The proposed framework integrates a shift-invariant module to estimate subpixel spatial offsets and a diffusion-based generative model to progressively enhance spatial and spectral details. Moreover, a multi-scale detail injection module is designed to guide the diffusion process with fine-grained structural information. In addition, a carefully formulated loss function is established to preserve the fidelity of fusion results and facilitate the estimation of translation errors. Experiments conducted on the GaoFen-2, GaoFen-1, and WorldView-2 datasets demonstrate that the proposed method achieves superior fusion quality compared with state-of-the-art approaches and effectively suppresses artifacts caused by translation errors. Full article
(This article belongs to the Section Remote Sensing Image Processing)
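
The shift-invariant module above estimates subpixel offsets with a learned component; for context only, the sketch below shows a classical non-learned baseline, phase correlation, which recovers an integer-pixel translation between a reference band and a shifted band. It is not the proposed method.

```python
import numpy as np

def phase_correlation_shift(ref, moving):
    """Estimate the integer (dy, dx) translation of `moving` relative to `ref`."""
    F1 = np.fft.fft2(ref)
    F2 = np.fft.fft2(moving)
    cross_power = np.conj(F1) * F2
    cross_power /= np.abs(cross_power) + 1e-12        # normalized cross-power spectrum
    corr = np.abs(np.fft.ifft2(cross_power))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap large positive indices back to negative shifts.
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return dy, dx

ref = np.random.rand(128, 128)
moving = np.roll(ref, shift=(3, -5), axis=(0, 1))
print(phase_correlation_shift(ref, moving))  # approximately (3, -5)
```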

27 pages, 2900 KB  
Article
Graph-SENet: An Unsupervised Learning-Based Graph Neural Network for Skeleton Extraction from Point Cloud
by Jie Li, Wei Guo and Wenli Zhang
Future Internet 2025, 17(12), 558; https://doi.org/10.3390/fi17120558 - 3 Dec 2025
Viewed by 654
Abstract
Extracting 3D skeletons from point clouds is a challenging task in computer vision. Most existing deep learning methods rely heavily on supervised data requiring extensive manual annotation. Consequently, re-labeling is often necessary for cross-category applications, while the process of 3D point cloud annotation is inherently time-consuming and expensive. Simultaneously, existing unsupervised methods often suffer from significant skeleton point deviations due to limited capabilities in modeling local structures. To address these limitations, we propose Graph-SENet, an unsupervised learning-based graph neural network method for skeleton extraction. This method integrates dynamic graph convolution with a multi-level feature fusion mechanism to more comprehensively capture local geometric relationships. Through a multi-dimensional unsupervised feature loss, it learns the structural representation of skeleton points, significantly improving the precision and stability of skeleton point localization under annotation-free conditions. Furthermore, we propose a graph autoencoder structure optimized by cosine similarity to predict topological connections between skeleton points, thereby recovering semantically consistent and structurally complete 3D skeleton representations in an end-to-end manner. Experimental results on multiple datasets, including ShapeNet, ITOP, and Soybean-MVS, demonstrate that Graph-SENet outperforms existing mainstream unsupervised methods in terms of Chamfer Distance and F1-score. It exhibits superior accuracy, robustness, and cross-category generalization capabilities, effectively reducing manual annotation costs while enhancing the completeness and semantic consistency of skeleton recovery. These results validate the application potential and practical value of Graph-SENet in 3D structure understanding and downstream 3D analysis tasks. Full article
(This article belongs to the Special Issue Algorithms and Models for Next-Generation Vision Systems)
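
Topology prediction from a cosine-similarity-optimized graph autoencoder can be illustrated, in its simplest decoder form, by thresholding pairwise cosine similarity between latent skeleton-point embeddings; the sketch below assumes that reading and is not the Graph-SENet code. The threshold value is an assumption.

```python
import torch
import torch.nn.functional as F

def predict_skeleton_edges(node_embeddings, threshold=0.8):
    """Decode an adjacency matrix from latent skeleton-point embeddings.

    node_embeddings: (N, D) latent vectors, one per predicted skeleton point.
    Returns a symmetric boolean (N, N) adjacency without self-loops.
    """
    z = F.normalize(node_embeddings, dim=1)     # unit-norm rows
    similarity = z @ z.t()                      # pairwise cosine similarity
    no_self_loops = ~torch.eye(z.size(0), dtype=torch.bool)
    return (similarity > threshold) & no_self_loops

edges = predict_skeleton_edges(torch.randn(16, 32), threshold=0.3)
print(edges.shape, int(edges.sum()))
```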

31 pages, 2154 KB  
Review
Application of Machine Learning in Food Safety Risk Assessment
by Qingchuan Zhang, Zhe Lu, Zhenqiao Liu, Jialu Li, Mingchao Chang and Min Zuo
Foods 2025, 14(23), 4005; https://doi.org/10.3390/foods14234005 - 22 Nov 2025
Cited by 5 | Viewed by 2103
Abstract
With the increasing globalization of supply chains, ensuring food safety has become more complex, necessitating advanced approaches for risk assessment. This study aims to review the transformative role of machine learning (ML) and deep learning (DL) in enabling intelligent food safety management by efficiently analyzing high-quality and nonlinear data. We systematically summarize recent advances in the application of ML and DL, focusing on key areas such as biotoxin detection, heavy metal contamination, analysis of pesticide and veterinary drug residues, and microbial risk prediction. While traditional algorithms including support vector machines and random forests demonstrate strong performance in classification and risk evaluation, unsupervised methods such as K-means and hierarchical cluster analysis facilitate pattern recognition in unlabeled datasets. Furthermore, novel DL architectures, such as convolutional neural networks, recurrent neural networks, and transformers, enable automated feature extraction and multimodal data integration, substantially improving detection accuracy and efficiency. In conclusion, we recommend future work to emphasize model interpretability, multi-modal data fusion, and integration into HACCP systems, thereby supporting intelligent, interpretable, and real-time food safety management. Full article
(This article belongs to the Section Food Analytical Methods)

19 pages, 3290 KB  
Article
Multi-Granularity Content-Aware Network with Semantic Integration for Unsupervised Anomaly Detection
by Xinyu Guo, Shihui Zhao, Jianbin Xue, Dongdong Liu, Xinyang Han, Shuai Zhang and Yufeng Zhang
Appl. Sci. 2025, 15(21), 11842; https://doi.org/10.3390/app152111842 - 6 Nov 2025
Cited by 1 | Viewed by 800
Abstract
Unsupervised anomaly detection has been widely applied to industrial scenarios. Recently, transformer-based methods have also been developed and have produced good performance. Although the global dependencies in anomaly images are considered, the typical patch partition strategy in the vanilla self-attention mechanism ignores the content consistencies in anomaly defects or normal regions. To sufficiently exploit the content consistency in images, we propose the multi-granularity content-aware network with semantic integration (MGCA-Net), in which superpixel segmentation is introduced into feature space to divide images according to their spatial structures. Specifically, we adopt a pre-trained ResNet as the encoder to extract features. Then, we design content-aware attention blocks (CAABs) to capture the global information in features at different granularities. In this block, we impose superpixel segmentation on the features from the encoder and employ the superpixels as tokens for the learning of global relationships. Because the superpixels are divided according to their content consistencies, the spatial structures of objects in anomaly or normal regions are preserved. Meanwhile, the multi-granularity semantic integration block is devised to further integrate the global information of all granularities. Next, we use semantic-guided fusion blocks (SGFBs) to progressively upsample the features with the help of CAABs. Finally, the differences between the outputs of CAABs and SGFBs are calculated and merged to predict the anomaly defects. Thanks to the preservation of content consistency of objects, experimental results on two benchmark datasets demonstrate that our proposed MGCA-Net achieves superior anomaly detection performance over state-of-the-art methods. Full article
(This article belongs to the Topic Intelligent Image Processing Technology)
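
Using superpixels as attention tokens amounts to segmenting an image into content-consistent regions and pooling features within each region; the sketch below does this with scikit-image's SLIC. The segment count and mean pooling are assumptions about the general mechanism, not the published CAAB blocks.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_tokens(image, features, n_segments=64):
    """Pool a feature map into one token per superpixel.

    image: (H, W, 3) RGB array used to compute the segmentation.
    features: (H, W, C) feature map aligned with the image.
    Returns (num_superpixels, C) tokens and the (H, W) label map.
    """
    labels = slic(image, n_segments=n_segments, compactness=10, start_label=0)
    tokens = np.stack(
        [features[labels == s].mean(axis=0) for s in np.unique(labels)], axis=0
    )
    return tokens, labels

image = np.random.rand(64, 64, 3)
features = np.random.rand(64, 64, 16)
tokens, labels = superpixel_tokens(image, features)
print(tokens.shape)  # roughly (64, 16), one token per superpixel
```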

27 pages, 14010 KB  
Article
A Novel Unsupervised Structural Damage Detection Method Based on TCN-GAT Autoencoder
by Yanchun Ni, Qiyuan Jin and Rui Hu
Sensors 2025, 25(21), 6724; https://doi.org/10.3390/s25216724 - 3 Nov 2025
Cited by 2 | Viewed by 1256
Abstract
Over service lives spanning several decades, structural damage detection is crucial for ensuring the safety and durability of engineering structures. However, existing methods often overlook the spatiotemporal coupling in multi-sensor data, hindering the full exploitation of structural dynamic evolution and spatial correlations. This paper proposes an autoencoder model integrating Temporal Convolutional Networks (TCN) and Graph Attention Networks (GAT), termed TCNGAT-AE, to establish an unsupervised damage detection method. The model utilizes the TCN module to extract temporal dependencies and dynamic features from vibration signals, while leveraging the GAT module to explicitly capture the spatial topological relationships within the sensor network, thereby achieving deep fusion of spatiotemporal features. The proposed method adopts an “offline training-online detection” framework, requiring only data from the healthy state of the structure for training, and employs reconstruction error as the damage indicator. To validate the proposed method, two sets of experimentally measured data are utilized: one from the Z-24 concrete box-girder bridge under ambient excitation, and the other from the Old Ada Bridge under vehicle load excitation. Additionally, ablation studies are conducted to analyze the effectiveness of the spatiotemporal fusion mechanism. Results demonstrate that the proposed method achieves effective damage detection across different structural types and excitation scenarios. Furthermore, the explicit modeling of spatiotemporal features significantly enhances detection performance, with the anomaly detection rate showing substantial improvement compared to baseline models utilizing only temporal or spatial modeling. Moreover, this end-to-end framework processes raw vibration signals directly, avoiding complex preprocessing. This makes it highly suitable for practical and near-real-time monitoring. The findings of this study demonstrate that the damage detection method based on TCNGAT-AE can be effectively applied to structural safety monitoring in complex engineering environments, and can be further integrated with real-time monitoring systems of critical structures for online analysis. Full article
(This article belongs to the Special Issue Women’s Special Issue Series: Sensors)
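
The “offline training-online detection” scheme above scores incoming segments by reconstruction error against a model trained only on healthy-state data; the sketch below shows that generic decision rule with a percentile threshold, independent of the TCN-GAT architecture. The threshold percentile and the toy stand-in model are assumptions.

```python
import numpy as np

def fit_threshold(healthy_errors, percentile=99.0):
    """Offline step: derive a damage threshold from healthy-state reconstruction errors."""
    return np.percentile(healthy_errors, percentile)

def detect_damage(model_reconstruct, signals, threshold):
    """Online step: flag segments whose reconstruction error exceeds the threshold.

    model_reconstruct: callable mapping a (N, T, S) batch of sensor segments
    to a reconstruction of the same shape (the trained autoencoder).
    """
    recon = model_reconstruct(signals)
    errors = np.mean((signals - recon) ** 2, axis=(1, 2))   # per-segment MSE
    return errors, errors > threshold

# Toy usage with an identity "autoencoder" plus noise standing in for a trained model.
healthy = np.random.randn(200, 128, 8)
fake_model = lambda x: x + 0.05 * np.random.randn(*x.shape)
tau = fit_threshold(np.mean((healthy - fake_model(healthy)) ** 2, axis=(1, 2)))
errors, flags = detect_damage(fake_model, np.random.randn(10, 128, 8), tau)
print(flags)
```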

29 pages, 5406 KB  
Article
An Efficient 3D Multi-Object Tracking Algorithm for Low-Cost UGV Using Multi-Level Data Association
by Xiaochun Yang, Anmin Huang, Jin Lou, Junhua Gou, Wenxing Fu and Jie Yan
Drones 2025, 9(11), 747; https://doi.org/10.3390/drones9110747 - 28 Oct 2025
Cited by 1 | Viewed by 1226
Abstract
3D object detection and tracking technologies are increasingly being adopted in unmanned ground vehicles, as robust perception systems significantly improve the obstacle avoidance performance of a UGV. However, most existing algorithms depend heavily on computationally intensive point cloud neural networks, rendering them unsuitable for resource-constrained platforms. In this work, we propose an efficient 3D object detection and tracking method specially designed for deployment on low-cost vehicle platforms. For the detection phase, our method integrates an image-based 2D detector with data fusion techniques to coarsely extract object point clouds, followed by an unsupervised learning approach to isolate objects from noisy point cloud data. For the tracking process, we propose a multi-target tracking algorithm based on multi-level data association. This method introduces an additional data association step to handle targets that fail in 3D detection, thereby effectively reducing the impact of detection errors on tracking performance. Moreover, our method enhances association precision between detection outputs and existing trajectories through the integration of 2D and 3D information, thereby further mitigating the adverse effects of detection inaccuracies. By adopting unsupervised learning as an alternative to complex neural networks, our approach demonstrates strong compatibility with both low-resolution LiDAR and GPU-free computing platforms. Experiments on the KITTI benchmark demonstrate that our tracking framework achieves significant computational efficiency gains while maintaining detection accuracy. Furthermore, experimental evaluations on a real-world UGV platform demonstrate the deployment feasibility of our approach. Full article
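
The unsupervised step above, isolating an object from the noisy points that fall inside a 2D detection frustum, is commonly handled with density-based clustering; the sketch below uses scikit-learn's DBSCAN and keeps the largest cluster. The eps and min_samples values are assumed, not taken from the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_object_points(frustum_points, eps=0.5, min_samples=10):
    """Return the largest DBSCAN cluster among the points inside a detection frustum.

    frustum_points: (N, 3) LiDAR points that project into a 2D bounding box.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(frustum_points)
    valid = labels[labels >= 0]                    # ignore noise points (label -1)
    if valid.size == 0:
        return np.empty((0, 3))
    largest = np.bincount(valid).argmax()          # most populated cluster
    return frustum_points[labels == largest]

points = np.random.randn(500, 3) * 0.3 + np.array([5.0, 0.0, -1.0])  # toy object cluster
noise = np.random.uniform(-10, 10, size=(100, 3))
obj = extract_object_points(np.vstack([points, noise]))
print(obj.shape)
```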

47 pages, 3959 KB  
Review
A Review of Deep Learning in Rotating Machinery Fault Diagnosis and Its Prospects for Port Applications
by Haifeng Wang, Hui Wang and Xianqiong Tang
Appl. Sci. 2025, 15(21), 11303; https://doi.org/10.3390/app152111303 - 22 Oct 2025
Cited by 5 | Viewed by 5192
Abstract
As port operations rapidly evolve toward intelligent and heavy-duty applications, fault diagnosis for core equipment demands higher levels of real-time performance and robustness. Deep learning, with its powerful autonomous feature learning capabilities, demonstrates significant potential in mechanical fault prediction and health management. This paper first provides a systematic review of deep learning research advances in rotating machinery fault diagnosis over the past eight years, focusing on the technical approaches and application cases of four representative models: Deep Belief Networks (DBNs), Convolutional Neural Networks (CNNs), Auto-encoders (AEs), and Recurrent Neural Networks (RNNs). These models respectively embody four core paradigms: unsupervised feature generation, spatial pattern extraction, data reconstruction learning, and temporal dependency modeling, which together form the technological foundation of contemporary intelligent diagnostics. Building upon this foundation, this paper delves into the unique challenges encountered when transferring these methods from generic laboratory components to specialized port equipment such as shore cranes and yard cranes—including complex operating conditions, harsh environments, and system coupling. It further explores future research directions, including cross-condition transfer, multi-source information fusion, and lightweight deployment, aiming to provide theoretical references and implementation pathways for the technological advancement of intelligent operation and maintenance in port equipment. Full article

21 pages, 14964 KB  
Article
An Automated Framework for Abnormal Target Segmentation in Levee Scenarios Using Fusion of UAV-Based Infrared and Visible Imagery
by Jiyuan Zhang, Zhonggen Wang, Jing Chen, Fei Wang and Lyuzhou Gao
Remote Sens. 2025, 17(20), 3398; https://doi.org/10.3390/rs17203398 - 10 Oct 2025
Cited by 3 | Viewed by 1042
Abstract
Levees are critical for flood defence, but their integrity is threatened by hazards such as piping and seepage, especially during high-water-level periods. Traditional manual inspections for these hazards and associated emergency response elements, such as personnel and assets, are inefficient and often impractical. While UAV-based remote sensing offers a promising alternative, the effective fusion of multi-modal data and the scarcity of labelled data for supervised model training remain significant challenges. To overcome these limitations, this paper reframes levee monitoring as an unsupervised anomaly detection task. We propose a novel, fully automated framework that unifies geophysical hazards and emergency response elements into a single analytical category of “abnormal targets” for comprehensive situational awareness. The framework consists of three key modules: (1) a state-of-the-art registration algorithm to precisely align infrared and visible images; (2) a generative adversarial network to fuse the thermal information from IR images with the textural details from visible images; and (3) an adaptive, unsupervised segmentation module where a mean-shift clustering algorithm, with its hyperparameters automatically tuned by Bayesian optimization, delineates the targets. We validated our framework on a real-world dataset collected from a levee on the Pajiang River, China. The proposed method demonstrates superior performance over all baselines, achieving an Intersection over Union of 0.348 and a macro F1-Score of 0.479. This work provides a practical, training-free solution for comprehensive levee monitoring and demonstrates the synergistic potential of multi-modal fusion and automated machine learning for disaster management. Full article
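
The adaptive segmentation module above tunes mean-shift hyperparameters with Bayesian optimization; as a simplified stand-in, the sketch below scores a small set of candidate bandwidths with a silhouette criterion and keeps the best clustering. The candidate grid and scoring choice are assumptions, not the paper's optimizer.

```python
import numpy as np
from sklearn.cluster import MeanShift
from sklearn.metrics import silhouette_score

def segment_fused_image(pixels, bandwidth_candidates=(0.1, 0.2, 0.4)):
    """Cluster pixel feature vectors with mean shift, picking the bandwidth
    that maximizes the silhouette score (a simple proxy for Bayesian tuning).

    pixels: (N, D) features, e.g. fused intensity plus normalized coordinates.
    """
    best_labels, best_score = None, -np.inf
    for bw in bandwidth_candidates:
        labels = MeanShift(bandwidth=bw, bin_seeding=True).fit_predict(pixels)
        if len(np.unique(labels)) < 2:
            continue                               # silhouette needs >= 2 clusters
        score = silhouette_score(pixels, labels, sample_size=min(1000, len(pixels)))
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels

pixels = np.random.rand(500, 3)                    # toy fused-image features
labels = segment_fused_image(pixels)
print(None if labels is None else len(np.unique(labels)))
```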

35 pages, 5316 KB  
Review
Machine Learning for Quality Control in the Food Industry: A Review
by Konstantinos G. Liakos, Vassilis Athanasiadis, Eleni Bozinou and Stavros I. Lalas
Foods 2025, 14(19), 3424; https://doi.org/10.3390/foods14193424 - 4 Oct 2025
Cited by 17 | Viewed by 10725
Abstract
The increasing complexity of modern food production demands advanced solutions for quality control (QC), safety monitoring, and process optimization. This review systematically explores recent advancements in machine learning (ML) for QC across six domains: Food Quality Applications; Defect Detection and Visual Inspection Systems; Ingredient Optimization and Nutritional Assessment; Packaging—Sensors and Predictive QC; Supply Chain—Traceability and Transparency and Food Industry Efficiency; and Industry 4.0 Models. Following a PRISMA-based methodology, a structured search of the Scopus database using thematic Boolean keywords identified 124 peer-reviewed publications (2005–2025), from which 25 studies were selected based on predefined inclusion and exclusion criteria, methodological rigor, and innovation. Neural networks dominated the reviewed approaches, with ensemble learning as a secondary method, and supervised learning prevailing across tasks. Emerging trends include hyperspectral imaging, sensor fusion, explainable AI, and blockchain-enabled traceability. Limitations in current research include domain coverage biases, data scarcity, and underexplored unsupervised and hybrid methods. Real-world implementation challenges involve integration with legacy systems, regulatory compliance, scalability, and cost–benefit trade-offs. The novelty of this review lies in combining a transparent PRISMA approach, a six-domain thematic framework, and Industry 4.0/5.0 integration, providing cross-domain insights and a roadmap for robust, transparent, and adaptive QC systems in the food industry. Full article
(This article belongs to the Special Issue Artificial Intelligence for the Food Industry)
