Search Results (164)

Search Parameters:
Keywords = representation mutual information

19 pages, 3498 KiB  
Article
Timestamp-Guided Knowledge Distillation for Robust Sensor-Based Time-Series Forecasting
by Jiahe Yan, Honghui Li, Yanhui Bai, Jie Liu, Hairui Lv and Yang Bai
Sensors 2025, 25(15), 4590; https://doi.org/10.3390/s25154590 - 24 Jul 2025
Viewed by 293
Abstract
Accurate time-series forecasting plays a vital role in sensor-driven applications such as energy monitoring, traffic flow prediction, and environmental sensing. While most existing approaches focus on extracting local patterns from historical observations, they often overlook the global temporal information embedded in timestamps. However, this information represents a valuable yet underutilized aspect of sensor-based data that can significantly enhance forecasting performance. In this paper, we propose a novel timestamp-guided knowledge distillation framework (TKDF), which integrates both historical and timestamp information through mutual learning between heterogeneous prediction branches to improve forecasting robustness. The framework comprises two complementary branches: a Backbone Model that captures local dependencies from historical sequences, and a Timestamp Mapper that learns global temporal patterns encoded in timestamp features. To enhance information transfer and reduce representational redundancy, a self-distillation mechanism is introduced within the Timestamp Mapper. Extensive experiments on multiple real-world sensor datasets—covering electricity consumption, traffic flow, and meteorological measurements—demonstrate that the TKDF consistently improves the performance of mainstream forecasting models. Full article
(This article belongs to the Section Intelligent Sensors)

16 pages, 1547 KiB  
Article
Two-Party Quantum Private Comparison with Pauli Operators
by Min Hou, Yue Wu and Shibin Zhang
Axioms 2025, 14(8), 549; https://doi.org/10.3390/axioms14080549 - 22 Jul 2025
Viewed by 149
Abstract
Quantum private comparison (QPC) is a quantum cryptographic protocol designed to enable two mutually distrustful parties to securely compare sensitive data without disclosing their private information to each other or any external entities. This study proposes a novel QPC protocol that leverages Bell states to ensure data privacy, utilizing the fundamental principles of quantum mechanics. Within this framework, two participants, each possessing a secret integer, encode the binary representation of their values using Pauli-X and Pauli-Z operators applied to quantum states transmitted from a semi-honest third party (TP). The TP, which is bound to protocol compliance and prohibited from colluding with either participant, measures the received sequences to determine the comparison result without accessing the participants’ original inputs. Theoretical analyses and simulations validate the protocol’s strong security, high efficiency, and practical feasibility in quantum computing environments. An advantage of the proposed protocol lies in its optimized utilization of Bell states, which enhances qubit efficiency and experimental practicality. Moreover, the proposed protocol outperforms several existing Bell-state-based QPC schemes in terms of efficiency. Full article
(This article belongs to the Special Issue Recent Advances in Quantum Mechanics and Mathematical Physics)

21 pages, 7084 KiB  
Article
Chinese Paper-Cutting Style Transfer via Vision Transformer
by Chao Wu, Yao Ren, Yuying Zhou, Ming Lou and Qing Zhang
Entropy 2025, 27(7), 754; https://doi.org/10.3390/e27070754 - 15 Jul 2025
Viewed by 321
Abstract
Style transfer technology has seen substantial attention in image synthesis, notably in applications like oil painting, digital printing, and Chinese landscape painting. However, it is often difficult to generate migrated images that retain the essence of paper-cutting art and have strong visual appeal when trying to apply the unique style of Chinese paper-cutting art to style transfer. Therefore, this paper proposes a new method for Chinese paper-cutting style transformation based on the Transformer, aiming at realizing the efficient transformation of Chinese paper-cutting art styles. Specifically, the network consists of a frequency-domain mixture block and a multi-level feature contrastive learning module. The frequency-domain mixture block explores spatial and frequency-domain interaction information, integrates multiple attention windows along with frequency-domain features, preserves critical details, and enhances the effectiveness of style conversion. To further embody the symmetrical structures and hollowed hierarchical patterns intrinsic to Chinese paper-cutting, the multi-level feature contrastive learning module is designed based on a contrastive learning strategy. This module maximizes mutual information between multi-level transferred features and content features, improves the consistency of representations across different layers, and thus accentuates the unique symmetrical aesthetics and artistic expression of paper-cutting. Extensive experimental results demonstrate that the proposed method outperforms existing state-of-the-art approaches in both qualitative and quantitative evaluations. Additionally, we created a Chinese paper-cutting dataset that, although modest in size, represents an important initial step towards enriching existing resources. This dataset provides valuable training data and a reference benchmark for future research in this field. Full article
(This article belongs to the Section Multidisciplinary Applications)

17 pages, 4019 KiB  
Article
Oil-Painting Style Classification Using ResNet with Conditional Information Bottleneck Regularization
by Yaling Dang, Fei Duan and Jia Chen
Entropy 2025, 27(7), 677; https://doi.org/10.3390/e27070677 - 25 Jun 2025
Viewed by 683
Abstract
Automatic classification of oil-painting styles holds significant promise for art history, digital archiving, and forensic investigation by offering objective, scalable analysis of visual artistic attributes. In this paper, we introduce a deep conditional information bottleneck (CIB) framework, built atop ResNet-50, for fine-grained style classification of oil paintings. Unlike traditional information bottleneck (IB) approaches that minimize the mutual information I(X;Z) between input X and latent representation Z, our CIB minimizes the conditional mutual information I(X;Z|Y), where Y denotes the painting’s style label. We implement this conditional term using a matrix-based Rényi’s entropy estimator, thereby avoiding costly variational approximations and ensuring computational efficiency. We evaluate our method on two public benchmarks: the Pandora dataset (7740 images across 12 artistic movements) and the OilPainting dataset (19,787 images across 17 styles). Our method outperforms the prevalent ResNet with a relative performance gain of 13.1% on Pandora and 11.9% on OilPainting. Beyond quantitative gains, our approach yields more disentangled latent representations that cluster semantically similar styles, facilitating interpretability. Full article
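For orientation, the quantities named in this abstract can be written out as below. This is a sketch under stated assumptions, not the paper's exact loss: the cross-entropy term and the trade-off weight β follow the usual deep-IB form and are assumed here. The entropy decomposition of the conditional mutual information of X and Z given Y is a general identity, and the last line is the matrix-based Rényi α-order entropy in its commonly cited form, with K a kernel Gram matrix over a mini-batch.

```latex
\begin{align*}
% Assumed deep-IB style objectives; \beta is a hypothetical trade-off weight
\mathcal{L}_{\mathrm{IB}}  &= \mathcal{L}_{\mathrm{CE}} + \beta\, I(X;Z) \\
\mathcal{L}_{\mathrm{CIB}} &= \mathcal{L}_{\mathrm{CE}} + \beta\, I(X;Z \mid Y) \\
% General identity: conditional mutual information from joint entropies
I(X;Z \mid Y) &= H(X,Y) + H(Z,Y) - H(X,Y,Z) - H(Y) \\
% Matrix-based Renyi entropy of a trace-normalized Gram matrix A
S_\alpha(A) &= \frac{1}{1-\alpha}\,\log_2 \operatorname{tr}\!\left(A^{\alpha}\right),
\qquad A = \frac{K}{\operatorname{tr}(K)}
\end{align*}
```

In this family of estimators, joint entropies are typically obtained from the trace-normalized Hadamard product of the per-variable Gram matrices, so each term of the identity can be evaluated without a variational bound.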

17 pages, 699 KiB  
Article
Fusion-Optimized Multimodal Entity Alignment with Textual Descriptions
by Chenchen Wang, Chaomurilige, Yu Weng, Xuan Liu and Zheng Liu
Information 2025, 16(7), 534; https://doi.org/10.3390/info16070534 - 24 Jun 2025
Viewed by 302
Abstract
Multimodal knowledge graph entity alignment is a key basic task of knowledge fusion and integration, which is used to identify entities that are semantically equivalent but have different representation forms in different knowledge graphs. Previous entity alignment research has mostly focused on encoding and utilizing basic features such as entity names and attributes; however, it is difficult to comprehensively capture the rich semantic information of entities by solely relying on these basic features. To effectively overcome this limitation, this paper proposes a fusion-optimized multimodal entity alignment method, FMEA-TD. Compared with previous work, this method makes full use of the textual description information in the knowledge graph to provide rich supplements for entity features, thereby better capturing the entity semantics and solving the problems faced by relying solely on the entity’s own features. FMEA-TD is able to effectively fuse the entity’s own information and text description information through multimodal cooperation confidence, establish the interaction mechanism between them, and thus promote mutual collaboration between different modalities, which enhances the model’s ability to understand the semantic text. Experiments show that FMEA-TD outperforms current state-of-the-art baseline methods on public knowledge graph datasets. Full article
(This article belongs to the Section Artificial Intelligence)

28 pages, 11793 KiB  
Article
Unsupervised Multimodal UAV Image Registration via Style Transfer and Cascade Network
by Xiaoye Bi, Rongkai Qie, Chengyang Tao, Zhaoxiang Zhang and Yuelei Xu
Remote Sens. 2025, 17(13), 2160; https://doi.org/10.3390/rs17132160 - 24 Jun 2025
Cited by 1 | Viewed by 388
Abstract
Cross-modal image registration for unmanned aerial vehicle (UAV) platforms presents significant challenges due to large-scale deformations, distinct imaging mechanisms, and pronounced modality discrepancies. This paper proposes a novel multi-scale cascaded registration network based on style transfer that achieves superior performance: up to 67% reduction in mean squared error (from 0.0106 to 0.0068), 9.27% enhancement in normalized cross-correlation, 26% improvement in local normalized cross-correlation, and 8% increase in mutual information compared to state-of-the-art methods. The architecture integrates a cross-modal style transfer network (CSTNet) that transforms visible images into pseudo-infrared representations to unify modality characteristics, and a multi-scale cascaded registration network (MCRNet) that performs progressive spatial alignment across multiple resolution scales using diffeomorphic deformation modeling to ensure smooth and invertible transformations. A self-supervised learning paradigm based on image reconstruction eliminates reliance on manually annotated data while maintaining registration accuracy through synthetic deformation generation. Extensive experiments on the LLVIP dataset demonstrate the method’s robustness under challenging conditions involving large-scale transformations, with ablation studies confirming that style transfer contributes 28% MSE improvement and diffeomorphic registration prevents 10.6% performance degradation. The proposed approach provides a robust solution for cross-modal image registration in dynamic UAV environments, offering significant implications for downstream applications such as target detection, tracking, and surveillance. Full article
(This article belongs to the Special Issue Advances in Deep Learning Approaches: UAV Data Analysis)

28 pages, 723 KiB  
Article
Targeting Rural Poverty: A Generalized Ordered Logit Model Analysis of Multidimensional Deprivation in Ethiopia’s Bilate River Basin
by Frew Moges, Tekle Leza and Yishak Gecho
Economies 2025, 13(7), 181; https://doi.org/10.3390/economies13070181 - 24 Jun 2025
Viewed by 299
Abstract
Understanding the complex and multidimensional nature of poverty is essential for designing effective and targeted policy interventions in rural Ethiopia. This study examined the determinants of multidimensional poverty in Bilate River Basin in South Ethiopia, employing cross-sectional household survey data collected in 2024. A total of 359 households were selected using a multistage sampling technique, ensuring representation across agro-ecological and socio-economic zones. The analysis applied the Generalized Ordered Logit (GOLOGIT) model to categorize households into four mutually exclusive poverty statuses: non-poor, vulnerable, poor, and extremely poor. The results reveal that age, dependency ratio, education level, livestock and ox ownership, access to information and credit, health status, and grazing land access significantly influence poverty status. Higher dependency ratios and poor health substantially increase the likelihood of extreme poverty, while livestock ownership and access to grazing land reduce it. Notably, credit use and access to information typically considered poverty reducing were associated with increased extreme poverty risks, likely due to poor financial literacy and exposure to misinformation. These findings underscored the multidimensional and dynamic nature of poverty, driven by both structural and behavioral factors. Policy implications point to the importance of integrated interventions that promote education, health, financial literacy, and access to productive assets to ensure sustainable poverty reduction and improved rural livelihoods in Ethiopia. Full article

27 pages, 3926 KiB  
Article
A Multi-Source Embedding-Based Named Entity Recognition Model for Knowledge Graph and Its Application to On-Site Operation Violations in Power Grid Systems
by Lingwen Meng, Yulin Wang, Guobang Ban, Yuanjun Huang, Xinshan Zhu and Shumei Zhang
Electronics 2025, 14(13), 2511; https://doi.org/10.3390/electronics14132511 - 20 Jun 2025
Viewed by 335
Abstract
With the increasing complexity of power grid field operations, frequent operational violations have emerged as a major concern in the domain of power grid field operation safety. To support dispatchers in accurately identifying and addressing violation risks, this paper introduces a profiling approach for power grid field operation violations based on knowledge graph techniques. The method enables deep modeling and structured representation of violation behaviors. In the structured data processing phase, statistical analysis is conducted based on predefined rules, and mutual information is employed to quantify the contribution of various operational factors to violations. At the municipal bureau level, statistical modeling of violation characteristics is performed to support regional risk assessment. For unstructured textual data, a multi-source embedding-based named entity recognition (NER) model is developed, incorporating domain-specific power lexicon information to enhance the extraction of key entities. High-weight domain terms related to violations are further identified using the TF-IDF algorithm to characterize typical violation behaviors. Based on the extracted entities and relationships, a knowledge graph of field operation violations is constructed, providing a computable and inferable semantic representation of operational scenarios. Finally, visualization techniques are applied to present the structural patterns and distributional features of violations, offering graph-based support for violation risk analysis and dispatch decision-making. Experimental results demonstrate that the proposed method effectively identifies critical features of violation behaviors and provides a structured foundation for intelligent decision support in power grid operation management. Full article
(This article belongs to the Special Issue Knowledge Information Extraction Research)
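The structured-data step in this abstract, using mutual information to quantify how much each operational factor contributes to violations, can be illustrated with a minimal sketch. The factor names and records below are hypothetical, and the plug-in (count-based) estimator is an assumption; the abstract does not say which estimator the authors use.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical records: 1 = violation occurred, 0 = no violation
violation = [1, 1, 0, 0, 1, 0, 1, 0]

# Hypothetical binary operational factors, one value per record
factors = {
    "night_shift": [1, 1, 0, 0, 1, 0, 1, 0],
    "large_crew": [1, 0, 1, 0, 1, 0, 1, 0],
}

# Rank factors by how much information they carry about the violation label
scores = {name: mutual_information(col, violation) for name, col in factors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A factor that is statistically independent of the label scores zero, so the ranking separates informative factors from uninformative ones without assuming a linear relationship.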

24 pages, 1408 KiB  
Review
Biomolecular Basis of Life
by Janusz Wiesław Błaszczyk
Metabolites 2025, 15(6), 404; https://doi.org/10.3390/metabo15060404 - 16 Jun 2025
Viewed by 568
Abstract
Life is defined descriptively by the capacity for metabolism, homeostasis, self-organization, growth, adaptation, information metabolism, and reproduction. All these are achieved by a set of self-organizing and self-sustaining processes, among which energy and information metabolism play a dominant role. The energy metabolism of the human body is based on glucose and lipid metabolism. All energy-dependent life processes are controlled by phosphate and calcium signaling. To maintain the optimal levels of energy metabolism, cells, tissues, and the nervous system communicate mutually, and as a result of this signaling, metabolism emerges with self-awareness, which allows for conscious social interactions, which are the most significant determinants of human life. Consequently, the brain representation of our body and the egocentric representation of the environment are built. The last determinant of life optimization is the limited life/death cycle, which exhibits the same pattern at cellular and social levels. This narrative review is my first attempt to systematize our knowledge of life phenomena. Due to the extreme magnitude of this challenge, in the current article, I tried to summarize the current knowledge about fundamental life processes, i.e., energy and information metabolism, and, thus, initiate a broader discussion about the life and future of our species. Full article
(This article belongs to the Section Thematic Reviews)

20 pages, 3404 KiB  
Article
Dynamic Synergy Network Analysis Reveals Stage-Specific Regional Dysfunction in Alzheimer’s Disease
by Xiaoyan Zhang, Chao Han, Jingbo Xia, Lingli Deng and Jiyang Dong
Brain Sci. 2025, 15(6), 636; https://doi.org/10.3390/brainsci15060636 - 12 Jun 2025
Viewed by 473
Abstract
Background: Alzheimer’s disease (AD) is a prevalent neurodegenerative disorder characterized by progressive neurodegeneration and connectivity deterioration. While resting-state functional magnetic resonance imaging (fMRI) provides critical insights into brain network abnormalities, traditional mutual information-based methods exhibit inherent limitations in characterizing the dynamic synergistic mechanisms between cerebral regions. Method: This study pioneered the application of an Integrated Information Decomposition (ΦID) framework in AD brain network analysis, constructing single-sample network models based on ΦID-derived synergy metrics to systematically compare their differences with mutual information-based methods in pathological sensitivity, computational robustness, and network representation capability, while detecting brain regions with declining dynamic synergy during AD progression through intergroup t-tests. Result: The key findings are as follows: (1) synergy metrics exhibited lower intra-group coefficient of variation than mutual information metrics, indicating higher computational stability; (2) single-sample reconstruction significantly enhanced the statistical power in intergroup difference detection; (3) synergy metrics captured brain network features that are undetectable by traditional mutual information methods, with more pronounced differences between networks; (4) key node analysis demonstrated spatiotemporal degradation patterns progressing from initial dysfunction in orbitofrontal–striatal–temporoparietal pathways accompanied by multi-regional impairments during prodromal stages, through moderate-phase decline located in the right middle frontal and postcentral gyri, to advanced-stage degeneration of the right supramarginal gyrus and left inferior parietal lobule.
ΦID-driven dynamic synergy network analysis provides novel information integration theory-based biomarkers for AD progression diagnosis and potentially lays the foundation for pathological understanding and subsequent targeted therapy development. Full article
(This article belongs to the Special Issue Using Neuroimaging to Explore Neurodegenerative Diseases)

21 pages, 9082 KiB  
Article
Multi-Source Pansharpening of Island Sea Areas Based on Hybrid-Scale Regression Optimization
by Dongyang Fu, Jin Ma, Bei Liu and Yan Zhu
Sensors 2025, 25(11), 3530; https://doi.org/10.3390/s25113530 - 4 Jun 2025
Viewed by 797
Abstract
To address the demand for high spatial resolution data in the water color inversion task of multispectral satellite images in island sea areas, a feasible solution is to process through multi-source remote sensing data fusion methods. However, the inherent biases among multi-source sensors and the spectral distortion caused by the dynamic changes of water bodies in island sea areas restrict the fusion accuracy, necessitating more precise fusion solutions. Therefore, this paper proposes a pansharpening method based on Hybrid-Scale Mutual Information (HSMI). This method effectively enhances the accuracy and consistency of panchromatic sharpening results by integrating mixed-scale information into scale regression. Secondly, it introduces mutual information to quantify the spatial–spectral correlation among multi-source data to balance the fusion representation under mixed scales. Finally, the performance of various popular pansharpening methods was compared and analyzed using the coupled datasets of Sentinel-2 and Sentinel-3 in typical island and reef waters of the South China Sea. The results show that HSMI can enhance the spatial details and edge clarity of islands while better preserving the spectral characteristics of the surrounding sea areas. Full article
(This article belongs to the Section Sensing and Imaging)

22 pages, 7036 KiB  
Article
Clustering Method for Edge and Inner Buildings Based on DGI Model and Graph Traversal
by Hesheng Huang and Yijun Zhang
ISPRS Int. J. Geo-Inf. 2025, 14(6), 222; https://doi.org/10.3390/ijgi14060222 - 3 Jun 2025
Viewed by 346
Abstract
Accurate clustering of buildings is a prerequisite for map generalization in densely populated urban data. Edge buildings at the edge of building groups, identified through human-eye recognition, may serve as boundary constraints for clustering. This paper proposes the use of seven Gestalt factors to distinguish edge buildings from other buildings. The DGI model is employed to produce high-quality node embeddings by optimizing the mutual information between the local node representations and the global summary vector. We then conduct training to identify edge buildings in the two test datasets using eight feature combinations. This research introduces a modified distance metric called the ‘m_dis’ feature, which is used to describe the closeness between two adjacent buildings. Finally, the clusters of edge and inner buildings are determined through a constrained graph traversal that is based on the ‘m_dis’ feature. This method is capable of effectively identifying and distinguishing densely distributed building groups in Chengdu City, China, as demonstrated by experimental results. It offers novel concepts for edge building recognition in dense urban areas, confirms the significance of the LOF factor and the ‘m_dis’ feature, and achieves superior clustering results in comparison to other methods. Additionally, this semi-supervised clustering method (DGI-EIC) has the potential to achieve an ARI index of approximately 0.5. Full article

31 pages, 8228 KiB  
Article
From Words to Ratings: Machine Learning and NLP for Wine Reviews
by Iliana Ilieva, Margarita Terziyska and Teofana Dimitrova
Beverages 2025, 11(3), 80; https://doi.org/10.3390/beverages11030080 - 1 Jun 2025
Viewed by 1023
Abstract
Wine production is an important sector of the food industry in Bulgaria, contributing to both economic development and cultural heritage. The present study aims to show how natural language processing (NLP) and machine learning methods can be applied to analyze expert-written Bulgarian wine descriptions and to extract patterns related to wine quality and style. Based on a bilingual dataset of reviews (in Bulgarian and English), semantic analysis, classification, regression and clustering models were used, which combine textual and structured data. The descriptions were transformed into numerical representations using a pre-trained language model (BERT), after which algorithms were used to predict style categories and ratings. Additional sentiment and segmentation analyses revealed differences between wine types, and clustering identified thematic structures in the expert language. The comparison between predefined styles and automatically derived clusters was evaluated using metrics such as Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI). The resulting analysis shows that text descriptions contain valuable information that allows for automated wine profiling. These findings can be applied by a wide range of stakeholders—researchers, producers, retailers, and marketing specialists. Full article
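The two clustering-agreement metrics named in this abstract, the Adjusted Rand Index and Normalized Mutual Information, can be computed from label counts alone. The sketch below is a plain-Python illustration with hypothetical style labels and cluster ids; the arithmetic-mean normalization for NMI is an assumption, since the abstract does not state which variant the authors report.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (1.0 = identical partitions)."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length (at least 2 items)")
    # Number of item pairs that share a cell, a row, or a column of the contingency table
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_information(labels_a, labels_b):
    """NMI with arithmetic-mean normalization (assumed variant)."""
    n = len(labels_a)
    if n != len(labels_b) or n == 0:
        raise ValueError("need two non-empty labelings of the same length")
    ca, cb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum((c / n) * log(c * n / (ca[a] * cb[b])) for (a, b), c in joint.items())
    h_a = -sum((c / n) * log(c / n) for c in ca.values())
    h_b = -sum((c / n) * log(c / n) for c in cb.values())
    if h_a == 0.0 or h_b == 0.0:
        return 1.0 if h_a == h_b else 0.0  # at least one single-cluster labeling
    return mi / ((h_a + h_b) / 2)


# Hypothetical predefined styles versus automatically derived cluster ids
styles = ["red", "red", "white", "white", "rose", "rose"]
clusters = [0, 0, 1, 1, 2, 2]

ari = adjusted_rand_index(styles, clusters)
nmi = normalized_mutual_information(styles, clusters)
```

Both scores ignore the actual label names, so a clustering that reproduces the predefined styles under different ids still scores at the maximum; ARI additionally corrects for chance and can go negative when agreement is worse than random.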

21 pages, 3106 KiB  
Article
Fine-Grained Identification of Benthic Diatom Scanning Electron Microscopy Images Using a Deep Learning Framework
by Fengjuan Feng, Shuo Wang, Xueqing Zhang, Xiaoyao Fang, Yuyang Xu and Jianlei Liu
J. Mar. Sci. Eng. 2025, 13(6), 1095; https://doi.org/10.3390/jmse13061095 - 30 May 2025
Viewed by 352
Abstract
Benthic diatoms are key primary producers in aquatic ecosystems and sensitive bioindicators for water quality monitoring; for example, the Yellow River Basin exhibits high diatom species diversity. However, traditional microscopic identification of such species remains inefficient and inaccurate. To enable automated identification, we established a benthic diatom dataset containing 3157 SEM images of 32 genera/species from the Yellow River Basin and developed a novel identification method. Specifically, the knowledge extraction module distinguishes foreground features from background noise by guiding spatial attention to focus on mutually exclusive regions within the image. This mechanism allows the network to focus more on foreground features that are useful for the classification task while significantly reducing the interference of background noise. Furthermore, a dual knowledge guidance module is designed to enhance the discriminative representation of fine-grained diatom images. This module strengthens multi-region foreground features through grouped channel attention, supplemented with contextual information through convolution-refined background features assigned low weights. Finally, the proposed method integrates multi-granularity learning, knowledge distillation, and multi-scale training strategies, further improving the classification accuracy. The experimental results demonstrate that the proposed network outperforms comparative methods on both the self-built diatom dataset and a public diatom dataset. Ablation studies and visualization further validate the efficacy of each module. Full article
(This article belongs to the Section Marine Biology)

24 pages, 6314 KiB  
Article
CDFAN: Cross-Domain Fusion Attention Network for Pansharpening
by Jinting Ding, Honghui Xu and Shengjun Zhou
Entropy 2025, 27(6), 567; https://doi.org/10.3390/e27060567 - 27 May 2025
Viewed by 480
Abstract
Pansharpening provides a computational solution to the resolution limitations of imaging hardware by enhancing the spatial quality of low-resolution multispectral (LRMS) images using high-resolution panchromatic (PAN) guidance. From an information-theoretic perspective, the task involves maximizing the mutual information between PAN and LRMS inputs while minimizing spectral distortion and redundancy in the fused output. However, traditional spatial-domain methods often fail to preserve high-frequency texture details, leading to entropy degradation in the resulting images. On the other hand, frequency-based approaches struggle to effectively integrate spatial and spectral cues, often neglecting the underlying information content distributions across domains. To address these shortcomings, we introduce a novel architecture, termed the Cross-Domain Fusion Attention Network (CDFAN), specifically designed for the pansharpening task. CDFAN is composed of two core modules: the Multi-Domain Interactive Attention (MDIA) module and the Spatial Multi-Scale Enhancement (SMCE) module. The MDIA module utilizes discrete wavelet transform (DWT) to decompose the PAN image into frequency sub-bands, which are then employed to construct attention mechanisms across both wavelet and spatial domains. Specifically, wavelet-domain features are used to formulate query vectors, while key features are derived from the spatial domain, allowing attention weights to be computed over multi-domain representations. This design facilitates more effective fusion of spectral and spatial cues, contributing to superior reconstruction of high-resolution multispectral (HRMS) images. Complementing this, the SMCE module integrates multi-scale convolutional pathways to reinforce spatial detail extraction at varying receptive fields.
Additionally, an Expert Feature Compensator is introduced to adaptively balance contributions from different scales, thereby optimizing the trade-off between local detail preservation and global contextual understanding. Comprehensive experiments conducted on standard benchmark datasets demonstrate that CDFAN achieves notable improvements over existing state-of-the-art pansharpening methods, delivering enhanced spectral–spatial fidelity and producing images with higher perceptual quality. Full article
(This article belongs to the Section Signal and Data Analysis)
