Search Results (73)

Search Parameters:
Keywords = multistage feature fusion

21 pages, 1672 KiB  
Article
TSE-APT: An APT Attack-Detection Method Based on Time-Series and Ensemble-Learning Models
by Mingyue Cheng, Ga Xiang, Qunsheng Yang, Zhixing Ma and Haoyang Zhang
Electronics 2025, 14(15), 2924; https://doi.org/10.3390/electronics14152924 - 22 Jul 2025
Abstract
Advanced Persistent Threat (APT) attacks pose a serious challenge to traditional detection methods. These methods often suffer from high false-alarm rates and limited accuracy due to the multi-stage and covert nature of APT attacks. In this paper, we propose TSE-APT, a time-series ensemble model that addresses these two limitations. It combines multiple machine-learning models, such as Random Forest (RF), Multi-Layer Perceptron (MLP), and Bidirectional Long Short-Term Memory (BiLSTM) models, to dynamically capture correlations between multiple stages of the attack process based on time-series features. It discovers hidden features through the integration of multiple machine-learning models to significantly improve the accuracy and robustness of APT detection. First, we extract a collection of dynamic time-series features such as traffic mean, flow duration, and flag frequency. We fuse them with static contextual features, including the port service matrix and protocol type distribution, to effectively capture the multi-stage behaviors of APT attacks. Then, we utilize an ensemble-learning model with a dynamic weight-allocation mechanism using a self-attention network to adaptively adjust each sub-model's contribution. The experiments showed that time-series feature fusion significantly enhanced detection performance. The RF, MLP, and BiLSTM ensemble achieved 96.7% accuracy, considerably improving recall and lowering the false-positive rate. The adaptive mechanism optimizes the model's performance and reduces false-alarm rates. This study provides an analytical method for APT attack detection that considers both temporal dynamics and static contextual characteristics, and it offers new ideas for security protection in complex networks. Full article
(This article belongs to the Special Issue AI in Cybersecurity, 2nd Edition)
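The dynamic weight-allocation idea above can be sketched as softmax-normalized weights over the RF, MLP, and BiLSTM sub-model outputs. This is a minimal illustration, not the paper's implementation: the attention scores here are plain inputs rather than the output of a learned self-attention network, and all names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_predict(sub_probs, attention_scores):
    """Combine per-model P(attack) values (e.g. from RF, MLP, BiLSTM) using
    softmax-normalized weights derived from per-model relevance scores."""
    weights = softmax(attention_scores)
    return sum(w * p for w, p in zip(weights, sub_probs))

# Equal scores reduce the ensemble to a plain average of the three sub-models.
p = ensemble_predict([0.9, 0.6, 0.3], [0.0, 0.0, 0.0])
```

In the paper the scores would be produced per sample by the self-attention network, letting the ensemble lean on whichever sub-model is most reliable for the current attack stage.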
Show Figures

Figure 1

18 pages, 4374 KiB  
Article
Elevation-Aware Domain Adaptation for Semantic Segmentation of Aerial Images
by Zihao Sun, Peng Guo, Zehui Li, Xiuwan Chen and Xinbo Liu
Remote Sens. 2025, 17(14), 2529; https://doi.org/10.3390/rs17142529 - 21 Jul 2025
Abstract
Recent advancements in Earth observation technologies have accelerated remote sensing (RS) data acquisition, yet cross-domain semantic segmentation remains challenged by domain shifts. Traditional unsupervised domain adaptation (UDA) methods often rely on computationally intensive and unstable generative adversarial networks (GANs). This study introduces elevation-aware domain adaptation (EADA), a multi-task framework that integrates elevation estimation (via digital surface models) with semantic segmentation to address distribution discrepancies. EADA employs a shared encoder and task-specific decoders, enhanced by a spatial attention-based feature fusion module. Experiments on Potsdam and Vaihingen datasets under cross-domain settings (e.g., Potsdam IRRG → Vaihingen IRRG) show that EADA achieves state-of-the-art performance, with a mean IoU of 54.62% and an F1-score of 65.47%, outperforming single-stage baselines. Elevation awareness significantly improves the segmentation of height-sensitive classes, such as buildings, while maintaining computational efficiency. Compared to multi-stage approaches, EADA’s end-to-end design reduces training complexity without sacrificing accuracy. These results demonstrate that incorporating elevation data effectively mitigates domain shifts in RS imagery. However, lower accuracy for elevation-insensitive classes suggests the need for further refinement to enhance overall generalizability. Full article
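The shared-encoder, task-specific-decoder layout described above can be sketched with plain callables: features are computed once and fed to both the segmentation head and the elevation head. The toy encoder and decoders below are stand-ins, not EADA's actual networks.

```python
def make_multitask_model(encoder, seg_decoder, elev_decoder):
    """Shared-encoder multi-task layout: one feature pass, two task heads."""
    def forward(x):
        feats = encoder(x)                              # shared features, computed once
        return seg_decoder(feats), elev_decoder(feats)  # segmentation and elevation outputs
    return forward

# Toy stand-ins: the "encoder" doubles each value; each head reduces differently.
model = make_multitask_model(
    encoder=lambda x: [2 * v for v in x],
    seg_decoder=lambda f: max(f),   # pretend class score
    elev_decoder=lambda f: sum(f),  # pretend height estimate
)
seg_out, elev_out = model([1, 2, 3])
```

The design point is that the elevation task regularizes the shared features, so height-sensitive classes like buildings benefit without a second encoder's cost.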

25 pages, 4232 KiB  
Article
Multimodal Fusion Image Stabilization Algorithm for Bio-Inspired Flapping-Wing Aircraft
by Zhikai Wang, Sen Wang, Yiwen Hu, Yangfan Zhou, Na Li and Xiaofeng Zhang
Biomimetics 2025, 10(7), 448; https://doi.org/10.3390/biomimetics10070448 - 7 Jul 2025
Abstract
This paper presents FWStab, a specialized video stabilization dataset tailored for flapping-wing platforms. The dataset encompasses five typical flight scenarios, featuring 48 video clips with intense dynamic jitter. The corresponding Inertial Measurement Unit (IMU) sensor data are synchronously collected, which jointly provide reliable support for multimodal modeling. Based on this, to address the issue of poor image acquisition quality due to severe vibrations in aerial vehicles, this paper proposes a multi-modal signal fusion video stabilization framework. This framework effectively integrates image features and inertial sensor features to predict smooth and stable camera poses. During the video stabilization process, the true camera motion originally estimated based on sensors is warped to the smooth trajectory predicted by the network, thereby optimizing the inter-frame stability. This approach maintains the global rigidity of scene motion, avoids visual artifacts caused by traditional dense optical flow-based spatiotemporal warping, and rectifies rolling shutter-induced distortions. Furthermore, the network is trained in an unsupervised manner by leveraging a joint loss function that integrates camera pose smoothness and optical flow residuals. When coupled with a multi-stage training strategy, this framework demonstrates remarkable stabilization adaptability across a wide range of scenarios. The entire framework employs Long Short-Term Memory (LSTM) to model the temporal characteristics of camera trajectories, enabling high-precision prediction of smooth trajectories. Full article
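The unsupervised objective described above, camera-pose smoothness plus optical-flow residuals, can be illustrated on a 1-D pose track. The second-difference penalty and the linear weighting below are assumptions for illustration, not the paper's exact loss.

```python
def smoothness_loss(traj):
    """Second-difference penalty on a 1-D pose trajectory: zero for any
    constant-velocity (linear) trajectory, large for jittery ones."""
    return sum((traj[i + 1] - 2 * traj[i] + traj[i - 1]) ** 2
               for i in range(1, len(traj) - 1))

def joint_loss(pred_traj, flow_residuals, alpha=1.0):
    """Pose-smoothness term plus a weighted optical-flow residual term."""
    return smoothness_loss(pred_traj) + alpha * sum(r * r for r in flow_residuals)

# A linear trajectory incurs no smoothness penalty; only the flow residuals remain.
loss = joint_loss([0.0, 1.0, 2.0, 3.0], [0.1, -0.1], alpha=1.0)
```

In the paper an LSTM predicts the smooth trajectory and this kind of joint loss is what lets it train without ground-truth stable video.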

17 pages, 1609 KiB  
Article
Parallel Multi-Scale Semantic-Depth Interactive Fusion Network for Depth Estimation
by Chenchen Fu, Sujunjie Sun, Ning Wei, Vincent Chau, Xueyong Xu and Weiwei Wu
J. Imaging 2025, 11(7), 218; https://doi.org/10.3390/jimaging11070218 - 1 Jul 2025
Abstract
Self-supervised depth estimation from monocular image sequences provides depth information without costly sensors like LiDAR, offering significant value for autonomous driving. Although self-supervised algorithms can reduce the dependence on labeled data, their performance is still affected by scene occlusions, lighting differences, and sparse textures. Existing methods do not consider the enhancement and interactive fusion of features. In this paper, we propose a novel parallel multi-scale semantic-depth interactive fusion network. First, we adopt a multi-stage feature attention network for feature extraction, and a parallel semantic-depth interactive fusion module is introduced to refine edges. Furthermore, we employ a metric loss based on semantic edges to take full advantage of semantic geometric information. Our network is trained and evaluated on the KITTI dataset. The experimental results show that our method achieves satisfactory performance compared to existing methods. Full article

18 pages, 2503 KiB  
Article
Defect Identification and Diagnosis for Distribution Network Electrical Equipment Based on Fused Image and Voiceprint Joint Perception
by An Chen, Junle Liu, Silin Liu, Jinchao Fan and Bin Liao
Energies 2025, 18(13), 3451; https://doi.org/10.3390/en18133451 - 30 Jun 2025
Abstract
As the scale of distribution networks expands, existing defect identification methods face numerous challenges, including limitations in single-modal feature identification, insufficient cross-modal information fusion, and the lack of a multi-stage feedback mechanism. To address these issues, we first propose a joint perception of image and voiceprint features based on bidirectional coupled attention, which enhances deep interaction across modalities and overcomes the shortcomings of traditional methods in cross-modal fusion. Secondly, a defect identification and diagnosis method for distribution network electrical equipment based on two-stage convolutional neural networks (CNNs) is introduced, which makes the network pay more attention to typical and frequent defects and enhances defect diagnosis accuracy and robustness. The proposed algorithm is compared with two baseline algorithms. Baseline 1 is a long short-term memory (LSTM)-based algorithm that performs separate feature extraction and processing for image and voiceprint signals without coupling the features of the two modalities, and Baseline 2 is a traditional CNN algorithm that uses classical convolutional layers for feature learning and classification through pooling and fully connected layers. Compared with the two baselines, simulation results demonstrate that the proposed method improves accuracy by 12.1% and 33.7%, recall by 12.5% and 33.1%, and diagnosis efficiency by 22.92% and 60.42%, respectively. Full article
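The cross-modal coupling described above can be sketched with scaled dot-product attention in both directions: image features query the voiceprint features and vice versa. This is a minimal stand-in for the paper's bidirectional coupled attention; the feature vectors and the plain concatenation-free output are illustrative assumptions.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention: one query vector over lists of keys/values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def bidirectional_couple(img_feats, voice_feats):
    """Each modality queries the other and keeps the attended result, so
    information flows in both directions between image and voiceprint."""
    img_attended = [attend(q, voice_feats, voice_feats) for q in img_feats]
    voice_attended = [attend(q, img_feats, img_feats) for q in voice_feats]
    return img_attended, voice_attended

# With a single voiceprint vector, every image query simply retrieves it.
img_out, voice_out = bidirectional_couple([[1.0, 0.0]], [[5.0, 7.0]])
```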

22 pages, 44010 KiB  
Article
SMM-POD: Panoramic 3D Object Detection via Spherical Multi-Stage Multi-Modal Fusion
by Jinghan Zhang, Yusheng Yang, Zhiyuan Gao, Hang Shi and Yangmin Xie
Remote Sens. 2025, 17(12), 2089; https://doi.org/10.3390/rs17122089 - 18 Jun 2025
Abstract
Panoramic 3D object detection is a challenging task due to image distortion, sensor heterogeneity, and the difficulty of combining information from multiple modalities over a wide field-of-view (FoV). To address these issues, we propose SMM-POD, a novel framework that introduces a spherical multi-stage fusion strategy for panoramic 3D detection. Our approach creates a five-channel spherical image aligned with LiDAR data and uses a quasi-uniform Voronoi sphere (UVS) model to reduce projection distortion. A cross-attention-based feature extraction module and a transformer encoder–decoder with spherical positional encoding enable the accurate and efficient fusion of image and point cloud features. For precise 3D localization, we adopt a Frustum PointNet module. Experiments on the DAIR-V2X-I benchmark and our self-collected SHU-3DPOD dataset show that SMM-POD achieves a state-of-the-art performance across all object categories. It significantly improves the detection of small objects like cyclists and pedestrians and maintains stable results under various environmental conditions. These results demonstrate the effectiveness of SMM-POD in panoramic multi-modal 3D perception and establish it as a strong baseline for wide FoV object detection. Full article
(This article belongs to the Section Urban Remote Sensing)
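The projection-distortion problem the abstract raises can be made concrete with the standard equirectangular pixel-to-ray mapping. Note this is the conventional mapping, not the paper's quasi-uniform Voronoi sphere (UVS): it is shown precisely because uniform pixel grids under it over-sample the poles, which is the distortion UVS is designed to avoid.

```python
import math

def pixel_to_ray(u, v, width, height):
    """Map an equirectangular panorama pixel (u, v) to a unit-length ray.
    Rows near v = 0 or v = height map to tiny polar caps, so equal pixel
    areas cover very unequal solid angles near the poles."""
    lon = (u / width) * 2 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2 - (v / height) * math.pi   # latitude in (-pi/2, pi/2]
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)

# The image center looks straight down the +x axis; the top row maps to the pole.
center = pixel_to_ray(1024, 512, 2048, 1024)
top = pixel_to_ray(0, 0, 2048, 1024)
```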

23 pages, 5700 KiB  
Article
Hybrid Deep Learning Architecture with Adaptive Feature Fusion for Multi-Stage Alzheimer’s Disease Classification
by Ahmad Muhammad, Qi Jin, Osman Elwasila and Yonis Gulzar
Brain Sci. 2025, 15(6), 612; https://doi.org/10.3390/brainsci15060612 - 6 Jun 2025
Abstract
Background/Objectives: Alzheimer’s disease (AD), a progressive neurodegenerative disorder, demands precise early diagnosis to enable timely interventions. Traditional convolutional neural networks (CNNs) and deep learning models often fail to effectively integrate localized brain changes with global connectivity patterns, limiting their efficacy in Alzheimer’s disease (AD) classification. Methods: This research proposes a novel deep learning framework for multi-stage Alzheimer’s disease (AD) classification using T1-weighted MRI scans. The adaptive feature fusion layer, a pivotal advancement, facilitates the dynamic integration of features extracted from a ResNet50-based CNN and a vision transformer (ViT). Unlike static fusion methods, our adaptive feature fusion layer employs an attention mechanism to dynamically integrate ResNet50’s localized structural features and vision transformer (ViT) global connectivity patterns, significantly enhancing stage-specific Alzheimer’s disease classification accuracy. Results: Evaluated on the Alzheimer’s 5-Class (AD5C) dataset comprising 2380 MRI scans, the framework achieves an accuracy of 99.42% (precision: 99.55%; recall: 99.46%; F1-score: 99.50%), surpassing the prior benchmark of 98.24% by 1.18%. Ablation studies underscore the essential role of adaptive feature fusion in minimizing misclassifications, while external validation on a four-class dataset confirms robust generalizability. Conclusions: This framework enables precise early Alzheimer’s disease (AD) diagnosis by integrating multi-scale neuroimaging features, empowering clinicians to optimize patient care through timely and targeted interventions. Full article
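The fusion arithmetic behind an adaptive layer like the one above can be sketched as a learned gate blending the CNN's localized features with the ViT's global features. Here the gate score is a plain input rather than the paper's attention mechanism, so this only illustrates the blend itself; all names are hypothetical.

```python
import math

def adaptive_fuse(cnn_feat, vit_feat, gate_score):
    """Gated blend of CNN (local structure) and ViT (global connectivity)
    feature vectors. A learned layer would produce gate_score per sample."""
    g = 1.0 / (1.0 + math.exp(-gate_score))   # sigmoid gate in (0, 1)
    return [g * c + (1.0 - g) * v for c, v in zip(cnn_feat, vit_feat)]

# A zero score gives g = 0.5: an even blend of the two feature vectors.
fused = adaptive_fuse([1.0, 0.0], [0.0, 1.0], gate_score=0.0)
```

Static fusion fixes g for every input; the adaptive version's advantage is that g can shift toward whichever branch is more informative for a given scan.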

17 pages, 11008 KiB  
Article
Retinex-Based Low-Light Image Enhancement via Spatial-Channel Redundancy Compression and Joint Attention
by Jinlong Chen, Zhigang Xiao, Xingguo Qin and Deming Luo
Electronics 2025, 14(11), 2212; https://doi.org/10.3390/electronics14112212 - 29 May 2025
Abstract
Low-light image enhancement (LLIE) methods based on Retinex theory often involve complex, multi-stage training and are commonly built on convolutional neural networks (CNNs). However, CNNs suffer from limitations in capturing long-range dependencies and often introduce redundant computations, leading to high computational costs. To address these issues, we propose a lightweight and efficient LLIE framework that incorporates an optimized CNN compression strategy and a novel attention mechanism. Specifically, we design a Spatial-Channel Feature Reconstruction Module (SCFRM) to suppress spatial and channel redundancy via split-reconstruction and separation-fusion strategies. SCFRM is composed of two parts, a Spatial Feature Enhancement Unit (SFEU) and a Channel Refinement Block (CRB), which together enhance feature representation while reducing computational load. Additionally, we introduce a Joint Attention (JOA) mechanism that captures long-range dependencies across spatial dimensions while preserving positional accuracy. Our Retinex-based framework separates the processing of illumination and reflectance components using a Denoising Network (DNNet) and a Light Enhancement Network (LINet). SCFRM is embedded into DNNet for improved denoising, while JOA is applied in LINet for precise brightness adjustment. Extensive experiments on multiple benchmark datasets demonstrate that our method achieves superior or comparable performance to state-of-the-art LLIE approaches, while significantly reducing computational complexity. On the LOL and VE-LOL datasets, our approach achieves the best or second-best scores in terms of PSNR and SSIM metrics, validating its effectiveness and efficiency. Full article
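The Retinex decomposition underlying this framework, image = reflectance × illumination, can be shown in closed form on scalar pixels. The paper's DNNet and LINet learn these maps; the gamma-brightening step below is a common textbook choice used only to illustrate the recomposition, not the paper's method.

```python
def retinex_enhance(pixels, illumination, gamma=0.45):
    """Retinex sketch: recover reflectance R = I / L per pixel, then recompose
    with a gamma-brightened illumination L ** gamma (gamma < 1 lifts shadows)."""
    out = []
    for i_px, l_px in zip(pixels, illumination):
        l = max(l_px, 1e-6)                      # avoid division by zero
        reflectance = i_px / l                   # illumination-free content
        out.append(min(1.0, reflectance * (l ** gamma)))
    return out

# A dim pixel (I = L = 0.25) has reflectance 1.0 and is lifted to 0.25 ** 0.45.
enhanced = retinex_enhance([0.25], [0.25], gamma=0.45)
```

Separating the two components is what lets the framework denoise reflectance and brighten illumination with different networks, as the abstract describes.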

27 pages, 6636 KiB  
Article
SCF-CIL: A Multi-Stage Regularization-Based SAR Class-Incremental Learning Method Fused with Electromagnetic Scattering Features
by Yunpeng Zhang, Mengdao Xing, Jinsong Zhang and Sergio Vitale
Remote Sens. 2025, 17(9), 1586; https://doi.org/10.3390/rs17091586 - 30 Apr 2025
Abstract
Synthetic aperture radar (SAR) recognition systems often need to collect new data and update the network accordingly. However, the network faces the challenge of catastrophic forgetting, where previously learned knowledge might be lost during the incremental learning of new data. To improve the applicability and sustainability of SAR target classification methods, we propose a multi-stage regularization-based class-incremental learning (CIL) method for SAR targets, called SCF-CIL, which addresses catastrophic forgetting. This method offers three main contributions. First, for the feature extractor, we fuse the convolutional neural network features with the scattering center features using a cross-attention feature fusion structure, ensuring both the plasticity and stability of the extracted features. Next, an overfitting training strategy is applied to provide clustering space for unseen classes with an acceptable trade-off in the accuracy of the current classes. Finally, we analyze the influence of training with imbalanced data on the last fully connected layer and introduce a multi-stage regularization method by dividing the calculation of the fully connected layer into three parts and applying regularization to each. Our experiments on SAR datasets demonstrate the effectiveness of these improvements. Full article
(This article belongs to the Special Issue Recent Advances in SAR: Signal Processing and Target Recognition)

18 pages, 6047 KiB  
Article
AS-YOLO: A Novel YOLO Model with Multi-Scale Feature Fusion for Intracranial Aneurysm Recognition
by Jun Yang, Chen Wang, Yang Chen, Zhengkui Chen and Jijun Tong
Electronics 2025, 14(8), 1692; https://doi.org/10.3390/electronics14081692 - 21 Apr 2025
Abstract
Intracranial aneurysm is a common clinical disease that seriously endangers the health of patients. In view of the shortcomings of existing intracranial aneurysm recognition methods in dealing with complex aneurysm morphologies, varying sizes, as well as multi-scale feature extraction and lightweight deployment, this study introduces an intracranial aneurysm detection framework, AS-YOLO, which is designed to enhance recognition precision while ensuring compatibility with lightweight device deployment. Built on the YOLOv8n backbone, this approach incorporates a cascaded enhancement module to refine representation learning across scales. In addition, a multi-stage fusion strategy was employed to facilitate efficient integration of cross-scale semantic features. Then, the detection head was improved by proposing an efficient depthwise separable convolutional aggregation detection head. This modification significantly lowers both the parameter count and computational burden without compromising recognition precision. Finally, the SIoU-based regression loss was employed, enhancing the bounding box alignment and boosting overall detection performance. Compared with the original YOLOv8, the proposed solution achieves higher recognition precision for aneurysm detection—boosting mAP@0.5 by 8.7% and mAP@0.5:0.95 by 4.96%. Meanwhile, the overall model complexity is effectively reduced, with a parameter count reduction of 8.21%. Incorporating multi-scale representation fusion and lightweight design, the introduced model maintains high detection accuracy and exhibits strong adaptability in environments with limited computational resources, including mobile health applications. Full article
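The parameter savings behind the depthwise separable detection head mentioned above are easy to verify arithmetically: a standard k×k convolution costs c_in·c_out·k², while a depthwise k×k plus a 1×1 pointwise projection costs c_in·k² + c_in·c_out. The channel sizes below are illustrative, not taken from AS-YOLO.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise
    projection to c_out channels (biases omitted)."""
    return c_in * k * k + c_in * c_out

# Example layer: 64 -> 128 channels with 3 x 3 kernels.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
```

For this example the separable form needs roughly an eighth of the weights, which is the kind of reduction that makes lightweight deployment feasible without touching the receptive field.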

23 pages, 7938 KiB  
Article
Non-Destructive Detection of Chilled Mutton Freshness Using a Dual-Branch Hierarchical Spectral Feature-Aware Network
by Jixiang E, Chengjun Zhai, Xinhua Jiang, Ziyang Xu, Muqiu Wudan and Danyang Li
Foods 2025, 14(8), 1379; https://doi.org/10.3390/foods14081379 - 17 Apr 2025
Abstract
Precise detection of meat freshness levels is essential for food consumer safety and real-time quality monitoring. This study aims to achieve high-accuracy freshness detection of chilled mutton by integrating hyperspectral imaging with deep learning methods. Although hyperspectral data can effectively capture changes in mutton freshness, sparse raw spectra require optimal data processing strategies to minimize redundancy. Therefore, this study employs a multi-stage data processing approach to enhance the purity of feature spectra. Meanwhile, to address issues such as overlapping feature categories, imbalanced sample distributions, and insufficient intermediate features, we propose a Dual-Branch Hierarchical Spectral Feature-Aware Network (DBHSNet) for chilled mutton freshness detection. First, at the feature interaction stage, the PBCA module addresses the drawback that global and local branches in a conventional dual-branch framework tend to perceive spectral features independently. By enabling effective information exchange and bidirectional flow between the two branches, and injecting positional information into each spectral band, the model’s awareness of sequential spectral bands is enhanced. Second, at the feature fusion stage, the task-driven MSMHA module is introduced to address the dynamics of freshness variation and the accumulation of different metabolites. By leveraging multi-head attention and cross-scale fusion, the model more effectively captures both the overall spectral variation trends and fine-grained feature details. Third, at the classification output stage, dynamic loss weighting is set according to training epochs and relative losses to balance classification performance, effectively mitigating the impact of insufficiently discriminative intermediate features.
The results demonstrate that the DBHSNet enables a more precise assessment of mutton freshness, achieving up to 7.59% higher accuracy than conventional methods under the same preprocessing conditions, while maintaining superior weighted metrics. Overall, this study offers a novel approach for mutton freshness detection and provides valuable support for freshness monitoring in cold-chain meat systems. Full article
(This article belongs to the Section Food Analytical Methods)
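The dynamic loss weighting described at the classification stage, set from training epochs and relative losses, can be sketched as a schedule that starts uniform and gradually shifts weight toward the stages with the largest relative loss. The linear blend below is an assumed schedule for illustration; the paper's exact formula is not reproduced here.

```python
def dynamic_loss_weights(stage_losses, epoch, total_epochs):
    """Blend uniform weights toward relative-loss weights as training advances:
    t = 0 gives equal weights, t = 1 weights each stage by its share of the loss."""
    n = len(stage_losses)
    total = sum(stage_losses)
    relative = [l / total for l in stage_losses]
    t = epoch / total_epochs                     # training progress in [0, 1]
    return [(1 - t) / n + t * r for r in relative]

# Early epochs weight all stages equally; later epochs favour the lossiest stage.
w_start = dynamic_loss_weights([2.0, 1.0, 1.0], epoch=0, total_epochs=100)
w_end = dynamic_loss_weights([2.0, 1.0, 1.0], epoch=100, total_epochs=100)
```

Because the weights always sum to one, the schedule reallocates attention across stages without changing the overall loss scale.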

18 pages, 2813 KiB  
Article
Multi-Scale Feature Fusion Based on Difference Enhancement for Remote Sensing Image Change Detection
by Haoyuan Hou, Yixuan Wang, Qin Qin, Yin Tan and Tonglai Liu
Symmetry 2025, 17(4), 590; https://doi.org/10.3390/sym17040590 - 12 Apr 2025
Abstract
Remote sensing image change detection is a core task of remote sensing image analysis; its purpose is to identify and quantify land cover changes in different periods. However, when the existing methods deal with complex features and subtle changes in buildings, vegetation, water bodies, roads, and other ground objects, there are often problems of false detection and missing detection, which affect the detection accuracy. To improve the accuracy of change detection, a multi-scale feature fusion network based on difference enhancement (FEDNet) is proposed. The FEDNet consists of a difference enhancement module (DEM) and a multi-scale feature fusion module (MFM). By summing the variation features of two-phase remote sensing images, the DEM enhances pixel-level differences, captures subtle changes, and aggregates features. The MFM fully integrates the multi-stage deep semantic information, which enables better extraction of changing features in complex scenes. Experiments on the LEVIR-CD, CLCD, WHU, NJDS, and GBCNR datasets show that the FEDNet significantly improves the detection efficiency of changes in buildings, cities, and vegetation. In terms of F1 value, IoU (Intersection over Union), precision, and recall rate, the FEDNet is superior to existing methods, which verifies its excellent performance. Full article
(This article belongs to the Section Computer)
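The basic signal a difference-enhancement module amplifies can be shown on flattened two-phase feature maps: the absolute difference isolates change while the element-wise sum the abstract mentions aggregates shared structure. This is a stand-in for the DEM, whose exact aggregation is not spelled out here.

```python
def difference_enhance(feat_t1, feat_t2):
    """Pixel-wise change cues from two acquisition dates: |a - b| highlights
    what changed, a + b aggregates what both phases agree on."""
    diff = [abs(a - b) for a, b in zip(feat_t1, feat_t2)]
    agg = [a + b for a, b in zip(feat_t1, feat_t2)]
    return diff, agg

# Only the changed element (0.9 -> 0.1) produces a non-zero difference response.
diff, agg = difference_enhance([0.2, 0.9, 0.4], [0.2, 0.1, 0.4])
```

In the full network these cues are computed per scale and handed to the multi-scale fusion module rather than used directly as a change map.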

30 pages, 9702 KiB  
Article
SiamCTCA: Cross-Temporal Correlation Aggregation Siamese Network for UAV Tracking
by Qiaochu Wang, Faxue Liu, Bao Zhang, Jinghong Liu, Fang Xu and Yulong Wang
Drones 2025, 9(4), 294; https://doi.org/10.3390/drones9040294 - 10 Apr 2025
Abstract
In aerial target-tracking research, complex scenarios place extremely high demands on the precision and robustness of tracking algorithms. Although existing target-tracking algorithms achieve good performance in general scenarios, they ignore the correlation between contextual information to some extent, and operations between features exacerbate information loss, degrading precision and robustness, especially in UAV target tracking. In response, we propose SiamCTCA, a new lightweight Siamese-based tracker. Its innovative cross-temporal aggregation strategy and three feature correlation fusion networks play key roles: the Transformer multistage embedding achieves cross-branch information fusion with the help of intertemporal correlation interactive vision Transformer modules to efficiently integrate features at different levels; the feed-forward residual multidimensional fusion edge mechanism reduces information loss by introducing residuals to cope with dynamic changes in the search region; and the response significance filter aggregation network suppresses the shallow-noise amplification problem of neural networks. Ablation and comparison experiments confirm the effectiveness of these modules, indicating that the tracker delivers excellent tracking performance at faster speeds than competing trackers and is therefore well suited for deployment on UAV platforms. Full article
(This article belongs to the Special Issue Detection, Identification and Tracking of UAVs and Drones)

30 pages, 14090 KiB  
Article
Integrated Technologies for Smart Building Energy Systems Refurbishment: A Case Study in Italy
by Lorenzo Villani, Martina Casciola and Davide Astiaso Garcia
Buildings 2025, 15(7), 1041; https://doi.org/10.3390/buildings15071041 - 24 Mar 2025
Abstract
This study presents an integrated approach for adapting building energy systems using Machine Learning (ML), the Internet of Things (IoT), and Building Information Modeling (BIM) in a hotel retrofit in Italy. In a concise multi-stage process, long-term climatic data and on-site technical documentation were analyzed to create a detailed BIM model. This model enabled energy simulations using the Carrier–Pizzetti method and supported the design of a hybrid HVAC system—integrating VRF and hydronic circuits—further enhanced by a custom ML algorithm for adaptive, predictive energy management through BIM and IoT data fusion. The study also incorporated photovoltaic panels and solar collectors, reducing reliance on non-renewable energy sources. Results demonstrate the effectiveness of smart energy management, showcasing significant potential for scalability in similar building typologies. Future improvements include integrating a temporal evolution model, refining feature selection using advanced optimization techniques, and expanding validation across multiple case studies. This research highlights the transformative role of ML, IoT, and BIM in achieving sustainable, smart, and efficient building energy systems, offering a replicable framework for sustainable renovations in the hospitality sector. Full article
(This article belongs to the Special Issue Sustainable and Smart Energy Systems in the Built Environment)

26 pages, 6921 KiB  
Article
Automated Docking System for LNG Loading Arm Based on Machine Vision and Multi-Sensor Fusion
by Rui Xiang, Wuwei Feng, Songling Song and Hao Zhang
Appl. Sci. 2025, 15(5), 2264; https://doi.org/10.3390/app15052264 - 20 Feb 2025
Abstract
With the growth of global liquefied natural gas (LNG) demand, automation technology has become a key trend to improve the efficiency and safety of LNG handling. In this study, a novel automatic docking system is proposed which adopts a staged docking strategy based on a monocular camera for positioning and combines ultrasonic sensors to achieve multi-stage optimization in the fine docking stage. In the coarse docking stage, the system acquires flange image data through the monocular camera, calculates 3D coordinates based on geometric feature extraction and coordinate transformation, and completes the preliminary target localization and fast approach; in the fine docking stage, the ultrasonic sensor is used to measure the multidirectional distance deviation, and the fusion of the monocular data is used to make dynamic adjustments to achieve high-precision alignment and localization. Simulation and experimental verification show that the system has good robustness in complex environments, such as wind and waves, and can achieve docking accuracy within 3 mm, which is better than the traditional manual docking method. This study provides a practical solution for automated docking of LNG loading arms, which can significantly improve the efficiency and safety of LNG loading and unloading operations. Full article
