Search Results (28)

Search Parameters:
Keywords = entropy guided fusion

23 pages, 3055 KB  
Article
RDPNet: A Multi-Scale Residual Dilated Pyramid Network with Entropy-Based Feature Fusion for Epileptic EEG Classification
by Tongle Xie, Wei Zhao, Yanyouyou Liu and Shixiao Xiao
Entropy 2025, 27(8), 830; https://doi.org/10.3390/e27080830 - 5 Aug 2025
Viewed by 431
Abstract
Epilepsy is a prevalent neurological disorder affecting approximately 50 million individuals worldwide. Electroencephalogram (EEG) signals play a vital role in the diagnosis and analysis of epileptic seizures. However, traditional machine learning techniques often rely on handcrafted features, limiting their robustness and generalizability across diverse EEG acquisition settings, seizure types, and patients. To address these limitations, we propose RDPNet, a multi-scale residual dilated pyramid network with entropy-guided feature fusion for automated epileptic EEG classification. RDPNet combines residual convolution modules to extract local features and a dilated convolutional pyramid to capture long-range temporal dependencies. A dual-pathway fusion strategy integrates pooled and entropy-based features from both shallow and deep branches, enabling robust representation of spatial saliency and statistical complexity. We evaluate RDPNet on two benchmark datasets: the University of Bonn and TUSZ. On the Bonn dataset, RDPNet achieves 99.56–100% accuracy in binary classification, 99.29–99.79% in ternary tasks, and 95.10% in five-class classification. On the clinically realistic TUSZ dataset, it reaches a weighted F1-score of 95.72% across seven seizure types. Compared with several baselines, RDPNet consistently outperforms existing approaches, demonstrating superior robustness, generalizability, and clinical potential for epileptic EEG analysis. Full article
(This article belongs to the Special Issue Complexity, Entropy and the Physics of Information II)
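
The entropy pathway above can be pictured as attaching a Shannon-entropy descriptor to each feature channel and concatenating it with the pooled activations. A minimal PyTorch sketch under assumed shapes (an illustration of the idea, not the authors' implementation):

```python
import torch

def shannon_entropy(feat, bins=32, eps=1e-8):
    """Per-channel Shannon entropy of a feature batch shaped (B, C, T)."""
    B, C, T = feat.shape
    ent = torch.empty(B, C)
    for b in range(B):
        for c in range(C):
            hist = torch.histc(feat[b, c], bins=bins)
            p = hist / (hist.sum() + eps)
            ent[b, c] = -(p * torch.log2(p + eps)).sum()
    return ent  # (B, C)

def fuse_pooled_and_entropy(feat):
    """Concatenate average-pooled activations with channel-wise entropy."""
    pooled = feat.mean(dim=-1)               # (B, C) spatial saliency
    ent = shannon_entropy(feat)              # (B, C) statistical complexity
    return torch.cat([pooled, ent], dim=1)   # (B, 2C) fused descriptor

x = torch.randn(4, 64, 256)   # e.g. 4 EEG segments, 64 feature channels, 256 samples
print(fuse_pooled_and_entropy(x).shape)  # torch.Size([4, 128])
```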

34 pages, 1156 KB  
Systematic Review
Mathematical Modelling and Optimization Methods in Geomechanically Informed Blast Design: A Systematic Literature Review
by Fabian Leon, Luis Rojas, Alvaro Peña, Paola Moraga, Pedro Robles, Blanca Gana and Jose García
Mathematics 2025, 13(15), 2456; https://doi.org/10.3390/math13152456 - 30 Jul 2025
Viewed by 445
Abstract
Background: Rock–blast design is a canonical inverse problem that joins elastodynamic partial differential equations (PDEs), fracture mechanics, and stochastic heterogeneity. Objective: Guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol, a systematic review of mathematical methods for geomechanically informed blast modelling and optimisation is provided. Methods: A Scopus–Web of Science search (2000–2025) retrieved 2415 records; semantic filtering and expert screening reduced the corpus to 97 studies. Topic modelling with Bidirectional Encoder Representations from Transformers Topic (BERTOPIC) and bibliometrics organised them into (i) finite-element and finite–discrete element simulations, including arbitrary Lagrangian–Eulerian (ALE) formulations; (ii) geomechanics-enhanced empirical laws; and (iii) machine-learning surrogates and multi-objective optimisers. Results: High-fidelity simulations delimit blast-induced damage with ≤0.2 m mean absolute error; extensions of the Kuznetsov–Ram equation cut median-size mean absolute percentage error (MAPE) from 27% to 15%; Gaussian-process and ensemble learners reach a coefficient of determination (R2>0.95) while providing closed-form uncertainty; Pareto optimisers lower peak particle velocity (PPV) by up to 48% without productivity loss. Synthesis: Four themes emerge—surrogate-assisted PDE-constrained optimisation, probabilistic domain adaptation, Bayesian model fusion for digital-twin updating, and entropy-based energy metrics. Conclusions: Persisting challenges in scalable uncertainty quantification, coupled discrete–continuous fracture solvers, and rigorous fusion of physics-informed and data-driven models position blast design as a fertile test bed for advances in applied mathematics, numerical analysis, and machine-learning theory. Full article
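
Of the surrogate families surveyed, Gaussian-process regressors are singled out for returning closed-form predictive uncertainty. A generic scikit-learn illustration with invented blast-design features (burden, spacing, powder factor) and a synthetic target, not data from the review:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
# Toy blast-design inputs: [burden (m), spacing (m), powder factor (kg/m^3)]
X = rng.uniform([2.0, 2.5, 0.3], [5.0, 6.0, 1.2], size=(40, 3))
y = 0.8 * X[:, 2] - 0.1 * X[:, 0] + rng.normal(0, 0.02, 40)  # synthetic PPV-like response

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0, 0.5])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = np.array([[3.5, 4.0, 0.7]])
mean, std = gp.predict(X_new, return_std=True)   # closed-form uncertainty
print(f"predicted PPV proxy: {mean[0]:.3f} +/- {1.96 * std[0]:.3f}")
```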

23 pages, 3791 KB  
Article
A Method for Few-Shot Radar Target Recognition Based on Multimodal Feature Fusion
by Yongjing Zhou, Yonggang Li and Weigang Zhu
Sensors 2025, 25(13), 4162; https://doi.org/10.3390/s25134162 - 4 Jul 2025
Viewed by 454
Abstract
Enhancing generalization capabilities and robustness in scenarios with limited sample sizes, while simultaneously decreasing reliance on extensive and high-quality datasets, represents a significant area of inquiry within the domain of radar target recognition. This study introduces a few-shot learning framework that leverages multimodal feature fusion. We develop a cross-modal representation optimization mechanism tailored for the target recognition task by incorporating natural resonance frequency features that elucidate the target’s scattering characteristics. Furthermore, we establish a multimodal fusion classification network that integrates bi-directional long short-term memory and residual neural network architectures, facilitating deep bimodal fusion through an encoding-decoding framework augmented by an energy embedding strategy. To optimize the model, we propose a cross-modal equilibrium loss function that amalgamates similarity metrics from diverse features with cross-entropy loss, thereby guiding the optimization process towards enhancing metric spatial discrimination and balancing classification performance. Empirical results derived from simulated datasets indicate that the proposed methodology achieves a recognition accuracy of 95.36% in the 5-way 1-shot task, surpassing traditional unimodal image and concatenation fusion feature approaches by 2.26% and 8.73%, respectively. Additionally, the inter-class feature separation is improved by 18.37%, thereby substantiating the efficacy of the proposed method. Full article
(This article belongs to the Section Radar Sensors)
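
The cross-modal equilibrium loss described above mixes a feature-similarity metric with cross-entropy. One plausible formulation, with the weighting and cosine distance chosen purely for illustration:

```python
import torch
import torch.nn.functional as F

def cross_modal_equilibrium_loss(logits, labels, feat_img, feat_nrf, alpha=0.5):
    """Classification cross-entropy plus a term pulling the two modal
    embeddings of the same target together (cosine distance)."""
    ce = F.cross_entropy(logits, labels)
    sim = F.cosine_similarity(feat_img, feat_nrf, dim=1)   # (B,)
    align = (1.0 - sim).mean()                              # 0 when modalities agree
    return ce + alpha * align

logits = torch.randn(8, 5)                 # 5-way classification
labels = torch.randint(0, 5, (8,))
feat_img = torch.randn(8, 128)             # image-branch embedding
feat_nrf = torch.randn(8, 128)             # natural-resonance-frequency branch
print(cross_modal_equilibrium_loss(logits, labels, feat_img, feat_nrf))
```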

26 pages, 4773 KB  
Article
LSE-CVCNet: A Generalized Stereoscopic Matching Network Based on Local Structural Entropy and Multi-Scale Fusion
by Wenbang Yang, Yong Zhao, Ye Gu, Lu Huang, Jianhua Li and Jianchuan Zhao
Entropy 2025, 27(6), 614; https://doi.org/10.3390/e27060614 - 9 Jun 2025
Viewed by 395
Abstract
This study presents LSE-CVCNet, a novel stereo matching network designed to resolve challenges in dynamic scenes, including dynamic feature misalignment caused by texture variability and contextual ambiguity from occlusions. By integrating three key innovations—local structural entropy (LSE) to quantify structural uncertainty in disparity maps and guide adaptive attention, a cross-image attention mechanism (CIAM-T) to asymmetrically extract features from left/right images for improved feature alignment, and multi-resolution cost volume fusion (MRCV-F) to preserve fine-grained details through multi-scale fusion—LSE-CVCNet enhances disparity estimation accuracy and cross-domain generalization. The experimental results demonstrate robustness under varying lighting, occlusions, and complex geometries, outperforming state-of-the-art methods across multiple data sets. Ablation studies validate each module’s contribution, while cross-domain tests confirm generalization in unseen scenarios. This work establishes a new paradigm for adaptive stereo matching in dynamic environments. Full article
(This article belongs to the Section Multidisciplinary Applications)
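
Local structural entropy (LSE) scores the disorder of disparities inside a neighbourhood, so high values flag occlusions and other uncertain structure. A simple windowed estimate (window size and binning are assumptions, not the paper's settings):

```python
import numpy as np

def local_structural_entropy(disp, win=7, bins=16):
    """Shannon entropy of the disparity histogram inside each win x win window."""
    h, w = disp.shape
    pad = win // 2
    padded = np.pad(disp, pad, mode="edge")
    lo, hi = disp.min(), disp.max() + 1e-6
    out = np.zeros_like(disp, dtype=np.float64)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + win, j:j + win]
            hist, _ = np.histogram(patch, bins=bins, range=(lo, hi))
            p = hist / hist.sum()
            p = p[p > 0]
            out[i, j] = -(p * np.log2(p)).sum()
    return out  # high values mark structurally uncertain regions (e.g. occlusions)

disp = np.random.rand(48, 64) * 64.0  # toy disparity map
lse = local_structural_entropy(disp)
print(lse.shape, lse.max())
```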

22 pages, 1877 KB  
Article
Malicious Cloud Service Traffic Detection Based on Multi-Feature Fusion
by Zhouguo Chen, Chen Deng, Xiang Gao, Xinze Li and Hangyu Hu
Electronics 2025, 14(11), 2190; https://doi.org/10.3390/electronics14112190 - 28 May 2025
Viewed by 371
Abstract
With the rapid growth of cloud computing, malicious attacks targeting cloud services have become increasingly sophisticated and prevalent. To address the limitations of traditional detection methods—such as reliance on single-dimensional features and poor generalization—we propose a novel malicious request detection model based on multi-feature fusion. The model adopts a dual-branch architecture that independently extracts and learns from statistical attributes (e.g., field lengths, entropy) and field attributes (e.g., semantic content of the requested fields). To enhance feature representation, an attention-based fusion mechanism is introduced to dynamically weight and integrate field-level features, while a Gini coefficient-guided random forest algorithm is used to select the most informative statistical features. This design enables the model to capture both structural and semantic characteristics of cloud service traffic. Extensive experiments on the benchmark CSIC 2010 dataset and a real-world labeled cloud service dataset demonstrated that the proposed model significantly outperforms existing approaches in terms of accuracy, precision, recall, and F1 score. These results validate the effectiveness and robustness of our multi-feature fusion approach for detecting malicious requests in cloud environments. Full article
(This article belongs to the Special Issue Advancements in AI-Driven Cybersecurity and Securing AI Systems)
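
The Gini-coefficient-guided selection step can be read as ranking statistical attributes by a random forest's impurity-based importances and keeping the top ones. A scikit-learn sketch with made-up request features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
# Toy statistical attributes of requests: [url_length, body_length, param_count, payload_entropy]
X = rng.normal(size=(500, 4))
y = (X[:, 3] + 0.5 * X[:, 0] + rng.normal(0, 0.5, size=500) > 0).astype(int)  # synthetic labels

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print("Gini importances:", np.round(forest.feature_importances_, 3))

order = np.argsort(forest.feature_importances_)[::-1]
keep = order[:2]                      # retain the most informative statistical features
X_selected = X[:, keep]
print("kept feature indices:", keep, X_selected.shape)
```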

21 pages, 4536 KB  
Article
Feature Attention Cycle Generative Adversarial Network: A Multi-Scene Image Dehazing Method Based on Feature Attention
by Na Li, Na Liu, Yanan Duan and Yuyang Chai
Appl. Sci. 2025, 15(10), 5374; https://doi.org/10.3390/app15105374 - 12 May 2025
Viewed by 425
Abstract
For the clearing of hazy images, it is difficult to obtain dehazing datasets with paired mapping images. Currently, most algorithms are trained on synthetic datasets with insufficient complexity, which leads to model overfitting. At the same time, the physical characteristics of fog in the real world are ignored in most current algorithms; that is, the degree of fog is related to the depth of field and scattering coefficient. Moreover, most current dehazing algorithms only consider the image dehazing of land scenes and ignore maritime scenes. To address these problems, we propose a multi-scene image dehazing algorithm based on an improved cycle generative adversarial network (CycleGAN). The generator structure is improved based on the CycleGAN model, and a feature fusion attention module is proposed. This module obtains relevant contextual information by extracting different levels of features. The obtained feature information is fused using the idea of residual connections. An attention mechanism is introduced in this module to retain more feature information by assigning different weights. During the training process, the atmospheric scattering model is established to guide the learning of the neural network using its prior information. The experimental results show that, compared with the baseline model, the peak signal-to-noise ratio (PSNR) increases by 32.10%, the structural similarity index (SSIM) increases by 31.07%, the information entropy (IE) increases by 4.79%, and the NIQE index is reduced by 20.1% in quantitative comparison. Meanwhile, it demonstrates better visual effects than other advanced algorithms in qualitative comparisons on synthetic datasets and real datasets. Full article
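
The atmospheric scattering prior mentioned above is usually written I(x) = J(x)t(x) + A(1 - t(x)) with transmission t(x) = exp(-beta * d(x)), tying haze strength to depth of field and the scattering coefficient. A toy NumPy rendering of that model (beta, airlight and the depth map are placeholders):

```python
import numpy as np

def add_haze(clear, depth, beta=1.2, airlight=0.9):
    """Apply the atmospheric scattering model I = J*t + A*(1 - t), t = exp(-beta*d)."""
    t = np.exp(-beta * depth)            # transmission falls with depth of field
    t = t[..., None]                     # broadcast over RGB channels
    return clear * t + airlight * (1.0 - t)

clear = np.random.rand(120, 160, 3)                     # stand-in for a clear image in [0, 1]
depth = np.tile(np.linspace(0.1, 3.0, 160), (120, 1))   # toy depth increasing to the right
hazy = add_haze(clear, depth)
print(hazy.shape, hazy.min(), hazy.max())
```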

23 pages, 6234 KB  
Article
SPIFFNet: A Statistical Prediction Interval-Guided Feature Fusion Network for SAR and Optical Image Classification
by Yingying Kong and Xin Ma
Remote Sens. 2025, 17(10), 1667; https://doi.org/10.3390/rs17101667 - 9 May 2025
Viewed by 465
Abstract
The problem of the feature extraction and fusion classification of optical and SAR data remains challenging due to the differences in optical and synthetic aperture radar (SAR) imaging mechanisms. To this end, a statistical prediction interval-guided feature fusion network, SPIFFNet, is proposed for optical and SAR image classification. It consists of two modules, the feature propagation module (FPM) and the feature fusion module (FFM). Specifically, FPM imposes restrictions on the scale factor of the batch normalization (BN) layer by means of statistical prediction interval, and features exceeding the scale factor of the interval are considered redundant and are replaced by features from other modalities to improve the classification accuracy and enhance the information interaction. In the feature fusion stage, we combine channel attention (CA), spatial attention (SA), and multiscale squeeze enhanced axial attention (MSEA) to propose FFM to improve and fuse cross-modal features in a multiscale cross-learning manner. To counteract category imbalance, we also implement a weighted cross-entropy loss function. Extensive experiments on three optical–SAR datasets show that SPIFFNet exhibits excellent performance. Full article
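
The weighted cross-entropy used against category imbalance is commonly built from inverse class frequencies. A PyTorch sketch of that convention (not necessarily the paper's exact weighting):

```python
import torch
import torch.nn.functional as F

def inverse_frequency_weights(labels, num_classes):
    """Weight each class by the inverse of its sample/pixel frequency."""
    counts = torch.bincount(labels.flatten(), minlength=num_classes).float()
    return counts.sum() / (num_classes * counts.clamp(min=1))

labels = torch.randint(0, 4, (8, 64, 64))   # toy classification map, 4 classes
logits = torch.randn(8, 4, 64, 64)          # network outputs
w = inverse_frequency_weights(labels, num_classes=4)
loss = F.cross_entropy(logits, labels, weight=w)
print(w, loss.item())
```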

24 pages, 3113 KB  
Article
Gradual Geometry-Guided Knowledge Distillation for Source-Data-Free Domain Adaptation
by Yangkuiyi Zhang and Song Tang
Mathematics 2025, 13(9), 1491; https://doi.org/10.3390/math13091491 - 30 Apr 2025
Viewed by 502
Abstract
Because they require access to the source data during the transfer phase, conventional domain adaptation methods have recently raised safety and privacy concerns. Research attention has therefore shifted to a more practical setting known as source-data-free domain adaptation (SFDA). The new challenge is how to obtain reliable semantic supervision in the absence of source domain training data and labels on the target domain. To that end, in this work, we introduce a novel Gradual Geometry-Guided Knowledge Distillation (G2KD) approach for SFDA. Specifically, to address the lack of supervision, we used the local geometry of the data to construct a more credible probability distribution over the potential categories, termed geometry-guided knowledge. Then, knowledge distillation was adopted to integrate this extra information for boosting the adaptation. More specifically, first, we constructed a neighborhood geometry for any target data using a similarity comparison on the whole target dataset. Second, based on pre-obtained semantic estimation by clustering, we mined soft semantic representations expressing the geometry-guided knowledge by semantic fusion. Third, using these softened labels, we performed knowledge distillation regulated by the new objective. Considering the unsupervised setting of SFDA, in addition to the distillation loss and student loss, we introduced a mixed entropy regulator that minimizes the entropy of individual data while maximizing the mutual entropy with augmented data to exploit neighbor relations. Our contribution is that, through local geometry discovery with semantic representation and self-knowledge distillation, the semantic information hidden in local structures is transformed into effective semantic self-supervision. In addition, our knowledge distillation works in a gradual way that helps capture dynamic variations in the local geometry, mitigating guidance degradation and deviation at the same time. Extensive experiments on five challenging benchmarks confirmed the state-of-the-art performance of our method. Full article
(This article belongs to the Special Issue Robust Perception and Control in Prognostic Systems)
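
The mixed entropy regulator can be approximated by the familiar information-maximization pair: keep each prediction confident (low per-sample entropy) while keeping batch-level class usage diverse (high marginal entropy). The augmentation-based mutual term is omitted in this rough sketch:

```python
import torch
import torch.nn.functional as F

def mixed_entropy_regularizer(logits, eps=1e-8):
    """Low per-sample entropy plus high mean-prediction entropy;
    returns a scalar loss to be minimized."""
    p = F.softmax(logits, dim=1)
    sample_entropy = -(p * torch.log(p + eps)).sum(dim=1).mean()
    p_mean = p.mean(dim=0)
    marginal_entropy = -(p_mean * torch.log(p_mean + eps)).sum()
    return sample_entropy - marginal_entropy

logits = torch.randn(32, 10)
print(mixed_entropy_regularizer(logits))
```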

20 pages, 2423 KB  
Article
Symmetry-Guided Prototype Alignment and Entropy Consistency for Multi-Source Pedestrian ReID in Power Grids: A Domain Adaptation Framework
by Jia He, Lei Zhang, Xiaofeng Zhang, Tong Xu, Kejun Wang, Pengsheng Li and Xia Liu
Symmetry 2025, 17(5), 672; https://doi.org/10.3390/sym17050672 - 28 Apr 2025
Viewed by 442
Abstract
This study proposes a multi-source unsupervised domain adaptation framework for person re-identification (ReID), addressing cross-domain feature discrepancies and label scarcity in electric power field operations. Inspired by symmetry principles in feature space optimization, the framework integrates (1) a Reverse Attention-based Feature Fusion (RAFF) module aligning cross-domain features using symmetry-guided prototype interactions that enforce bidirectional style-invariant representations and (2) a Self-Correcting Pseudo-Label Loss (SCPL) dynamically adjusting confidence thresholds using entropy symmetry constraints to balance source-target domain knowledge transfer. Experiments demonstrate 92.1% rank-1 accuracy on power industry benchmarks, outperforming DDAG and MTL by 9.5%, with validation confirming robustness in operational deployments. The symmetric design principles significantly enhance model adaptability to inherent symmetry breaking caused by heterogeneous power grid environments. Full article
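
The self-correcting pseudo-label idea can be pictured as admitting only target samples whose prediction entropy stays under a confidence threshold; SCPL adjusts that threshold dynamically, whereas the sketch below uses a fixed one for clarity:

```python
import torch
import torch.nn.functional as F

def select_pseudo_labels(logits, max_entropy=0.5):
    """Return pseudo-labels and a mask of samples confident enough to train on."""
    p = F.softmax(logits, dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1)
    pseudo = p.argmax(dim=1)
    mask = entropy < max_entropy       # a self-correcting scheme would adapt this threshold
    return pseudo, mask

logits = torch.randn(16, 8)            # 16 unlabeled target images, 8 identities
pseudo, mask = select_pseudo_labels(logits)
print(pseudo[mask], mask.float().mean())
```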

18 pages, 3318 KB  
Article
A Cross-Modal Attention-Driven Multi-Sensor Fusion Method for Semantic Segmentation of Point Clouds
by Huisheng Shi, Xin Wang, Jianghong Zhao and Xinnan Hua
Sensors 2025, 25(8), 2474; https://doi.org/10.3390/s25082474 - 14 Apr 2025
Cited by 1 | Viewed by 1805
Abstract
To bridge the modality gap between camera images and LiDAR point clouds in autonomous driving systems, a critical challenge exacerbated by the inability of current fusion methods to effectively integrate cross-modal features, we propose the Cross-Modal Fusion (CMF) framework. This attention-driven architecture enables hierarchical multi-sensor data fusion, achieving state-of-the-art performance in semantic segmentation tasks. The CMF framework first projects point clouds onto the camera coordinates through perspective projection to provide spatio-depth information for RGB images. Then, a two-stream feature extraction network is proposed to extract features from the two modalities separately, and multilevel fusion of the two modalities is realized by a residual fusion module (RCF) with cross-modal attention. Finally, we design a perceptual alignment loss that integrates cross-entropy with feature matching terms, effectively minimizing the semantic discrepancy between camera and LiDAR representations during fusion. The experimental results on the SemanticKITTI and nuScenes benchmark datasets demonstrate that the CMF method achieves mean intersection over union (mIoU) scores of 64.2% and 79.3%, respectively, outperforming existing state-of-the-art methods in accuracy and exhibiting enhanced robustness in complex scenarios. The ablation studies further validate that enhancing feature interaction and fusion in semantic segmentation models through cross-modal attention and the perceptually guided cross-entropy loss (Pgce) is effective in improving segmentation accuracy and robustness. Full article
(This article belongs to the Section Sensing and Imaging)
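
The initial projection of LiDAR points into the camera frame follows the standard pinhole model. A NumPy sketch with placeholder intrinsics and extrinsics (not the SemanticKITTI or nuScenes calibration):

```python
import numpy as np

def project_points(points_lidar, T_cam_from_lidar, K):
    """Project Nx3 LiDAR points into pixel coordinates; returns (u, v, depth)."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])   # homogeneous coords
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1                 # keep points in front of the camera
    pts_cam = pts_cam[in_front]
    uvw = (K @ pts_cam.T).T
    u, v = uvw[:, 0] / uvw[:, 2], uvw[:, 1] / uvw[:, 2]
    return u, v, pts_cam[:, 2]

K = np.array([[720.0, 0, 320], [0, 720.0, 240], [0, 0, 1]])   # toy intrinsics
T = np.eye(4)                                                  # toy extrinsics
pts = np.random.uniform(-10, 10, size=(1000, 3)) + np.array([0, 0, 15.0])
u, v, depth = project_points(pts, T, K)
print(u.shape, depth.min())
```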

20 pages, 946 KB  
Article
Multi-Modal Temporal Dynamic Graph Construction for Stock Rank Prediction
by Ying Liu, Zengyu Wei, Long Chen, Cai Xu and Ziyu Guan
Mathematics 2025, 13(5), 845; https://doi.org/10.3390/math13050845 - 3 Mar 2025
Viewed by 1394
Abstract
Stock rank prediction is an important and challenging task. Recently, graph-based prediction methods have emerged as a valuable approach for capturing the complex relationships between stocks. Existing works mainly construct static undirected relational graphs, leading to two main drawbacks: (1) overlooking the bidirectional asymmetric effects of stock data, i.e., financial messages affect each other differently when they occur at different nodes of the graph; and (2) failing to capture the dynamic relationships of stocks over time. In this paper, we propose a Multi-modal Temporal Dynamic Graph method (MTDGraph). MTDGraph comprehensively considers the bidirectional relationships from multi-modal stock data (price and texts) and models the time-varying relationships. In particular, we generate the textual relationship strength from the topic sensitivity and the text topic embeddings. Then, we inject a causality factor via the transfer entropy between the interrelated stock historical sequential embeddings as the historical relationship strength. Afterwards, we apply both the textual and historical relationship strengths to guide the multi-modal information propagation in the graph. The framework of the MTDGraph method consists of the stock-level sequential embedding layer, the inter-stock relation embedding layer based on temporal dynamic graph construction and the multi-model information fusion layer. Finally, the MTDGraph optimizes the point-wise regression loss and the ranking-aware loss to obtain the appropriate stock rank list. We empirically validate MTDGraph in the publicly available dataset, CMUN-US and compare it with state-of-the-art baselines. The proposed MTDGraph method outperforms the baseline methods in both accuracy and investment revenues. Full article
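
The causality factor above relies on transfer entropy between stock return series. A plug-in histogram estimate with lag 1 (the binning and lag are illustrative choices, not the paper's):

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y, bins=3):
    """Estimate T(Y -> X) = I(x_{t+1}; y_t | x_t) on quantile-discretized series."""
    xd = np.digitize(x, np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1]))
    yd = np.digitize(y, np.quantile(y, np.linspace(0, 1, bins + 1)[1:-1]))
    triples = list(zip(xd[1:], xd[:-1], yd[:-1]))            # (x_{t+1}, x_t, y_t)
    n = len(triples)
    p_xyz = Counter(triples)
    p_xz = Counter((a, b) for a, b, _ in triples)             # (x_{t+1}, x_t)
    p_z = Counter(b for _, b, _ in triples)                   # x_t
    p_zy = Counter((b, c) for _, b, c in triples)             # (x_t, y_t)
    te = 0.0
    for (a, b, c), n_xyz in p_xyz.items():
        cond_full = n_xyz / p_zy[(b, c)]                      # p(x_{t+1} | x_t, y_t)
        cond_hist = p_xz[(a, b)] / p_z[b]                     # p(x_{t+1} | x_t)
        te += (n_xyz / n) * np.log2(cond_full / cond_hist)
    return te

rng = np.random.default_rng(0)
y = rng.normal(size=500)
x = np.roll(y, 1) + 0.5 * rng.normal(size=500)   # x partly driven by lagged y
print(transfer_entropy(x, y), transfer_entropy(y, x))  # the first value should be larger
```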

20 pages, 4387 KB  
Article
Convolutional Sparse Modular Fusion Algorithm for Non-Rigid Registration of Visible–Infrared Images
by Tao Luo, Ning Chen, Xianyou Zhu, Heyuan Yi and Weiwen Duan
Appl. Sci. 2025, 15(5), 2508; https://doi.org/10.3390/app15052508 - 26 Feb 2025
Viewed by 554
Abstract
Existing image fusion algorithms involve extensive models and high computational demands when processing source images that require non-rigid registration, which may not align with the practical needs of engineering applications. To tackle this challenge, this study proposes a comprehensive framework for convolutional sparse fusion in the context of non-rigid registration of visible–infrared images. Our approach begins with an attention-based convolutional sparse encoder to extract cross-modal feature encodings from source images. To enhance feature extraction, we introduce a feature-guided loss and an information entropy loss to guide the extraction of homogeneous and isolated features, resulting in a feature decomposition network. Next, we create a registration module that estimates the registration parameters based on homogeneous feature pairs. Finally, we develop an image fusion module by applying homogeneous and isolated feature filtering to the feature groups, resulting in high-quality fused images with maximized information retention. Experimental results on multiple datasets indicate that, compared with similar studies, the proposed algorithm achieves an average improvement of 8.3% in image registration and 30.6% in fusion performance in mutual information. In addition, in downstream target recognition tasks, the fusion images generated by the proposed algorithm show a maximum improvement of 20.1% in average relative accuracy compared with the original images. Importantly, our algorithm maintains a relatively lightweight computational and parameter load. Full article
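
Mutual information, the fusion metric reported above, is computed from the joint grey-level histogram of two images. A reference-style NumPy version (the bin count is a free choice):

```python
import numpy as np

def mutual_information(img_a, img_b, bins=64):
    """MI between two equally sized images from their joint histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
visible = rng.random((128, 128))
infrared = 0.7 * visible + 0.3 * rng.random((128, 128))   # partially shared content
fused = 0.5 * visible + 0.5 * infrared
print(mutual_information(fused, visible), mutual_information(fused, infrared))
```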

32 pages, 28406 KB  
Article
Infrared and Harsh Light Visible Image Fusion Using an Environmental Light Perception Network
by Aiyun Yan, Shang Gao, Zhenlin Lu, Shuowei Jin and Jingrong Chen
Entropy 2024, 26(8), 696; https://doi.org/10.3390/e26080696 - 16 Aug 2024
Viewed by 1686
Abstract
The complementary combination of emphasizing target objects in infrared images and rich texture details in visible images can effectively enhance the information entropy of fused images, thereby providing substantial assistance for downstream composite high-level vision tasks, such as intelligent nighttime vehicle driving. However, mainstream fusion algorithms lack specific research on the contradiction between the low information entropy and high pixel intensity of visible images in harsh-light nighttime road environments. As a result, fusion algorithms that perform well under normal conditions can only produce low-information-entropy fusion images that resemble the information distribution of visible images under harsh light interference. In response to these problems, we designed an image fusion network resilient to harsh light environment interference, incorporating entropy and information theory principles to enhance robustness and information retention. Specifically, an edge feature extraction module was designed to extract key edge features of salient targets to optimize fusion information entropy. Additionally, a harsh light environment aware (HLEA) module was proposed to avoid the decrease in fusion image quality caused by the contradiction between low information entropy and high pixel intensity, based on the information distribution characteristics of harsh light visible images. Finally, an edge-guided hierarchical fusion (EGHF) module was designed to achieve robust feature fusion, minimizing irrelevant noise entropy and maximizing useful information entropy. Extensive experiments demonstrate that, compared to other advanced algorithms, the fusion results of the proposed method contain more useful information and offer significant advantages in high-level vision tasks under harsh nighttime lighting conditions. Full article
(This article belongs to the Special Issue Methods in Artificial Intelligence and Information Processing II)
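
The information entropy the method tries to preserve is simply the Shannon entropy of the fused image's grey-level histogram; a saturated, harsh-light frame scores near zero while a detail-rich frame approaches 8 bits:

```python
import numpy as np

def image_entropy(img_uint8):
    """Shannon entropy (bits) of an 8-bit image's grey-level distribution."""
    hist = np.bincount(img_uint8.ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
low_entropy = np.full((100, 100), 200, dtype=np.uint8)          # saturated, harsh-light-like frame
high_entropy = rng.integers(0, 256, (100, 100), dtype=np.uint8)  # detail-rich frame
print(image_entropy(low_entropy), image_entropy(high_entropy))   # ~0 bits vs ~8 bits
```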

16 pages, 14703 KB  
Article
Infrared/Visible Light Fire Image Fusion Method Based on Generative Adversarial Network of Wavelet-Guided Pooling Vision Transformer
by Haicheng Wei, Xinping Fu, Zhuokang Wang and Jing Zhao
Forests 2024, 15(6), 976; https://doi.org/10.3390/f15060976 - 1 Jun 2024
Cited by 3 | Viewed by 1606
Abstract
To address issues of detail loss, limited matching datasets, and low fusion accuracy in infrared/visible light fire image fusion, a novel method based on the Generative Adversarial Network of Wavelet-Guided Pooling Vision Transformer (VTW-GAN) is proposed. The algorithm employs a generator and discriminator network architecture, integrating the efficient global representation capability of Transformers with wavelet-guided pooling for extracting finer-grained features and reconstructing higher-quality fusion images. To overcome the shortage of image data, transfer learning is utilized to apply the well-trained model to fire image fusion, thereby improving fusion precision. The experimental results demonstrate that VTW-GAN outperforms the DenseFuse, IFCNN, U2Fusion, SwinFusion, and TGFuse methods in both objective and subjective aspects. Specifically, on the KAIST dataset, the fusion images show significant improvements in Entropy (EN), Mutual Information (MI), and Quality Assessment based on Gradient-based Fusion (Qabf) by 2.78%, 11.89%, and 10.45%, respectively, over the next-best values. On the Corsican Fire dataset, compared to data-limited fusion models, the transfer-learned fusion images enhance the Standard Deviation (SD) and MI by 10.69% and 11.73%, respectively, and compared to other methods, they perform well in Average Gradient (AG), SD, and MI, improving them by 3.43%, 4.84%, and 4.21%, respectively, from the next-best values. Compared with DenseFuse, the operation efficiency is improved by 78.3%. The method achieves favorable subjective image outcomes and is effective for fire-detection applications. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

20 pages, 1503 KB  
Article
EFE-LSTM: A Feature Extension, Fusion and Extraction Approach Using Long Short-Term Memory for Navigation Aids State Recognition
by Jingjing Cao, Zhipeng Wen, Liang Huang, Jinshan Dai and Hu Qin
Mathematics 2024, 12(7), 1048; https://doi.org/10.3390/math12071048 - 30 Mar 2024
Viewed by 1499
Abstract
Navigation aids play a crucial role in guiding ship navigation and marking safe water areas. Therefore, ensuring the accurate and efficient recognition of a navigation aid’s state is critical for maritime safety. To address the issue of sparse features in navigation aid data, this paper proposes an approach that involves three distinct processes: the extension of rank entropy space, the fusion of multi-domain features, and the extraction of hidden features (EFE). Based on these processes, this paper introduces a new LSTM model termed EFE-LSTM. Specifically, in the feature extension module, we introduce a rank entropy operator for space extension. This method effectively captures uncertainty in data distribution and the interrelationships among features. The feature fusion module introduces new features in the time domain, frequency domain, and time–frequency domain, capturing the dynamic features of signals across multiple dimensions. Finally, in the feature extraction module, we employ the BiLSTM model to capture the hidden abstract features of navigational signals, enabling the model to more effectively differentiate between various navigation aids states. Extensive experimental results on four real-world navigation aid datasets indicate that the proposed model outperforms other benchmark algorithms, achieving the highest accuracy among all state recognition models at 92.32%. Full article
(This article belongs to the Special Issue Application of Machine Learning and Data Mining)
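
The rank entropy operator used for space extension is reminiscent of permutation entropy, which scores a signal by the distribution of ordinal (rank) patterns in short windows. A standard permutation-entropy sketch, offered as an analogy rather than the paper's exact operator:

```python
import numpy as np
from collections import Counter
from math import factorial

def permutation_entropy(x, order=3, delay=1):
    """Normalized Shannon entropy of ordinal (rank) patterns of length `order`."""
    patterns = [
        tuple(np.argsort(x[i:i + order * delay:delay]))
        for i in range(len(x) - (order - 1) * delay)
    ]
    counts = np.array(list(Counter(patterns).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum() / np.log2(factorial(order)))

rng = np.random.default_rng(0)
regular = np.sin(np.linspace(0, 20 * np.pi, 1000))   # e.g. a steady beacon signal
noisy = rng.normal(size=1000)                        # an erratic one
print(permutation_entropy(regular), permutation_entropy(noisy))  # lower vs close to 1.0
```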
