Search Results (26)

Search Parameters:
Keywords = deep variational information bottleneck

26 pages, 1490 KB  
Systematic Review
Object Detection in Optical Remote Sensing Images: A Systematic Review of Methods, Benchmarks, and Operational Applications
by Neus Fontanet Garcia and Piero Boccardo
Remote Sens. 2026, 18(9), 1289; https://doi.org/10.3390/rs18091289 - 23 Apr 2026
Abstract
Object detection in optical remote sensing imagery has emerged as a crucial task in computer vision, with applications ranging from environmental monitoring to disaster management, precision agriculture, and urban planning. This review systematically examines current methodologies, categorising them into four principal approaches: (1) template matching-based methods, which leverage predefined patterns for object identification; (2) knowledge-based methods, which incorporate geometric and contextual information to enhance detection accuracy; (3) object-based image analysis (OBIA), which segments images into meaningful objects using spectral and spatial properties; and (4) machine learning-based methods, particularly deep convolutional neural networks (CNNs), which have revolutionised the field through automatic feature learning. Each methodology’s performance characteristics, computational requirements, and suitability for different remote sensing applications are analysed. Our systematic review, following PRISMA guidelines, analysed 189 studies published from 2010 to 2025, of which 73 provided quantitative results on standard benchmarks. The three most critical challenges identified are as follows: (1) the annotation bottleneck, as dense bounding box labelling of remote sensing imagery remains highly labour-intensive for deep learning approaches; (2) extreme scale variation spanning 2–3 orders of magnitude within single scenes; and (3) domain adaptation failures when models encounter new geographic regions or sensor characteristics. This review identifies critical research gaps and proposes prioritised future directions, emphasising foundation models for zero-shot detection, efficient architectures for resource-constrained deployment, and standardised benchmarks with size-specific metrics. The analysis provides practitioners with evidence-based decision frameworks for method selection and researchers with a roadmap for advancing object detection in remote sensing applications.
20 pages, 3665 KB  
Article
SDS-Former: A Transformer-Based Method for Semantic Segmentation of Arid Land Remote Sensing Imagery
by Yujie Du, Junfu Fan, Kuan Li and Yongrui Li
Algorithms 2026, 19(5), 325; https://doi.org/10.3390/a19050325 - 22 Apr 2026
Abstract
Semantic segmentation of land use and land cover (LULC) in arid regions remains challenging due to severe class imbalance, fragmented spatial distributions, and high spectral similarity among different land cover types. These characteristics often lead to an information bottleneck in deep segmentation networks and hinder the extraction of discriminative semantic representations. To address these issues, we propose SDS-Former, a lightweight semantic segmentation network specifically designed for remote sensing imagery in arid environments. SDS-Former incorporates an SSM-inspired Lightweight Semantic Enhancement (LSE) module to strengthen contextual modeling and alleviate the loss of discriminative information in deep features. To tackle scale variations, a Dynamic Selective Feature Fusion (DSFF) module is employed in the decoder to adaptively weight and fuse high-level semantics with low-level spatial details. Furthermore, a Feature Refinement Head (FRH) is introduced to enhance boundary localization and improve the recognition of small-scale and sparsely distributed land cover objects. Extensive ablation and comparative experiments demonstrate that SDS-Former consistently outperforms representative semantic segmentation methods across multiple evaluation metrics. On the Tarim Basin dataset, the proposed network achieves a mean Intersection over Union (mIoU) of 82.51% and an F1 score of 86.47%, indicating its superior effectiveness and robustness. Qualitative results further verify that SDS-Former exhibits clear advantages in distinguishing spectrally similar land cover types and preserving the spatial continuity of ground objects in complex arid-region scenes.
20 pages, 8567 KB  
Article
Latent Diffusion Model for Chlorophyll Remote Sensing Spectral Synthesis Integrating Bio-Optical Priors and Band Attention Mechanisms
by Jinming Liu, Haoran Zhang, Jianlong Huang, Hanbin Wen, Qinpei Chen, Jiayi Liu, Chaowen Wen, Huiling Tang and Zhaohua Sun
Appl. Sci. 2026, 16(8), 3892; https://doi.org/10.3390/app16083892 - 17 Apr 2026
Abstract
Global freshwater resources face severe water quality degradation, with chlorophyll-a (Chl-a) concentration serving as a critical eutrophication indicator. While deep learning methods enable accurate Chl-a retrieval from remote sensing reflectance (Rrs) spectra, the scarcity of paired Rrs-Chl-a samples limits model generalization and causes overfitting, particularly in optically complex inland waters. To address this data bottleneck, we propose a physics-constrained latent diffusion model for synthesizing high-fidelity paired Rrs-Chl-a data to augment limited training sets for deep learning-based water quality retrieval. Our framework integrates three key innovations: (1) a lightweight variational autoencoder achieving 8.6:1 latent space compression, reducing computational overhead while preserving spectral features; (2) band-selective attention mechanisms targeting chlorophyll-sensitive wavelengths (440, 550, 680, and 700–750 nm) based on bio-optical principles; and (3) physics-guided conditional encoding that captures concentration-dependent spectral responses across oligotrophic to eutrophic regimes. Evaluated on the GLORIA dataset, our model demonstrates superior performance in spectral similarity (0.535), sample diversity (0.072), and distribution matching (Fréchet distance 0.0008) compared to conventional generative models. When applied to data augmentation, synthetic spectra improved downstream Chl-a retrieval from R² = 0.75 to 0.91, reducing RMSE by 39%. This physics-informed generative approach addresses data scarcity in aquatic remote sensing research, supporting global needs for enhanced understanding of inland and coastal water quality dynamics in data-limited regions.
21 pages, 1931 KB  
Article
A Shapelet Transform-Based Method for Structural Damage Identification: A Case Study on a Wooden Truss Bridge
by Ke Gan, Yingzhuo Ye, Fulin Nie, Ching Tai Ng and Liujie Chen
Sensors 2026, 26(8), 2323; https://doi.org/10.3390/s26082323 - 9 Apr 2026
Abstract
The impact of environmental disturbances and sensor deployment variations on damage identification represents a critical bottleneck that constrains the practical effectiveness of structural health monitoring. Existing methods addressing these challenges often suffer from poor interpretability due to information loss during feature extraction or exhibit insufficient sensitivity in identifying early-stage minor damage. This paper proposes a damage identification method based on the Shapelet Transform and Random Forest classifier, which extracts highly interpretable local shape features from vibration response signals to achieve robust identification of structural state changes. The study utilizes measured random vibration response data from a timber truss bridge. The dataset comprises four reference states collected on different dates and five damage states simulated by additional masses ranging from +23.5 g to +193.7 g, with sensors deployed in both vertical and horizontal directions. The Shapelet Transform selects local subsequences with high information gain from the original time series as features, which are subsequently classified using the Random Forest algorithm. The experimental design systematically investigates the influence of different damage severities, sensor locations, and environmental variations on method performance. The results demonstrate that with a Shapelet extraction time of 10 min, the method achieves 100% identification accuracy across multiple operating conditions that comprehensively consider environmental variations, sensor location differences, and varying damage severities. When the extraction time is reduced to 5 min, 3 min, and 1 min, the average accuracies are 93.98%, 89.51%, and 58.48%, respectively. The method effectively identifies the minimum simulated damage (+23.5 g), which represents only 0.07% of the total structural mass, while maintaining stable performance under varying sensor locations and environmental conditions. Compared to traditional methods based on global frequency-domain features or statistical characteristics, the proposed method extracts physically meaningful local Shapelet features, offering significant advantages in interpretability. In contrast to deep learning approaches, this method demonstrates greater robustness under limited sample conditions. This study confirms that the combined framework of the Shapelet Transform and Random Forest can effectively address multiple real-world challenges in structural health monitoring, delivering high accuracy, strong robustness, and excellent interpretability, thereby providing a novel approach for developing practical real-time damage identification systems.
(This article belongs to the Section Industrial Sensors)
22 pages, 4742 KB  
Article
PromptSeg: An End-to-End Universal Medical Image Segmentation Method via Visual Prompts
by Minfan Zhao, Bingxun Wang, Jun Shi and Hong An
Entropy 2026, 28(3), 342; https://doi.org/10.3390/e28030342 - 18 Mar 2026
Abstract
Deep learning has achieved remarkable advancements in medical image segmentation, yet its generalization capability across unseen tasks remains a significant challenge. The variety of task objectives, disease-dependent labeling variations, and multi-center data contribute to the high uncertainty of task-specific models on unseen distributions. In this study, we propose PromptSeg, an innovative Transformer-based unified framework for universal 2D medical image segmentation. From an information-theoretic perspective, PromptSeg formulates the segmentation process as a conditional entropy minimization problem, utilizing visual prompts as side information to reduce the uncertainty of the target task. Guided by the information bottleneck principle, PromptSeg aims to utilize the provided visual prompts to filter out redundant noise and learn contextual representations, thereby breaking the restrictions of the task-specific paradigm. When faced with unseen datasets or segmentation targets, our method only requires a few annotated visual prompt pairs to extract task-specific semantics and segment the query images without retraining. Extensive experiments on CT and MRI datasets demonstrate that PromptSeg not only outperforms state-of-the-art methods but also exhibits strong multi-modality generalization capabilities.
26 pages, 4173 KB  
Article
Physics-Guided Variational Causal Intervention Network for Few-Shot Radar Jamming Recognition
by Dong Xia, Liming Lv, Youjian Zhang, Yanxi Lu, Fang Li, Lin Liu, Xiang Liu, Yajun Zeng and Zhan Ge
Sensors 2026, 26(6), 1900; https://doi.org/10.3390/s26061900 - 18 Mar 2026
Abstract
Rapid and accurate recognition of radar active jamming is a prerequisite for cognitive electronic countermeasures. However, under complex electromagnetic environments with scarce training samples, existing deep learning models are prone to capturing spurious correlations induced by environmental confounders, resulting in notable performance degradation. To address this causal confounding issue, we propose a physics-guided variational causal intervention network (PG-VCIN). First, we reconstruct a structured causal model of jamming signal generation, decoupling observations into robust physical statistical features and sensitive time–frequency image representations. Physical priors are then leveraged to perform dynamic precision-weighted modulation of visual feature extraction, enforcing physical consistency at the representation learning stage. Second, we formulate deconfounding within an active inference framework and introduce a variational information bottleneck to optimize mutual information, thereby filtering out high-complexity redundant information attributable to confounders while preserving the essential causal semantics. Finally, we numerically approximate the causal effect by imposing dual intervention constraints in the latent space, including intra-class invariance and confounder invariance. Experiments on a semi-physical simulation dataset demonstrate that the proposed method achieves substantially higher recognition accuracy than several representative few-shot baselines in extremely low-sample regimes, validating the effectiveness of integrating physical mechanisms with causal inference.
(This article belongs to the Section Radar Sensors)
29 pages, 6070 KB  
Article
Clastic Rock Lithology Identification Based on Multivariate Feature Enhancement and Dynamic Confidence-Weighted Ensemble
by Kang Chen, Guoyun Zhong and Fan Diao
Appl. Sci. 2026, 16(4), 1808; https://doi.org/10.3390/app16041808 - 12 Feb 2026
Abstract
The strong heterogeneity of clastic reservoirs and the phenomenon of similar log responses for different lithologies (i.e., “same spectrum, different rocks”) significantly weaken feature separability. Furthermore, distribution shifts between different wells cause traditional models to suffer from severe generalization bottlenecks in cross-well applications. To address this critical challenge, this paper proposes a dual-driven framework comprising “Multivariate Feature Enhancement + Dynamic Ensemble”. At the feature level, physics-informed enhancement and multi-scale statistics are introduced to construct a multivariate high-dimensional feature system, thereby strengthening the representation of geological patterns. At the model level, a sample-aware Dynamic Confidence-Weighted Ensemble (DCWE) strategy is designed to achieve sample-wise adaptive decision-making based on prediction uncertainty, fundamentally breaking through the limitations of fixed weights in static ensembles. This method combines the complementary advantages of Gradient Boosting Decision Trees (GBDT) and deep sequence networks, enabling the simultaneous capture of local textural variations and continuous trends across depths. Based on rigorous Leave-One-Group-Out (LOGO) cross-validation, the proposed framework achieves a maximum accuracy of 84.58%. It significantly reduces the misclassification rate in lithology transition zones and for minority class samples, while maintaining the geological continuity of prediction results. These results verify the significant advantages of the proposed method in cross-well generalization scenarios.
36 pages, 630 KB  
Article
Semantic Communication Unlearning: A Variational Information Bottleneck Approach for Backdoor Defense in Wireless Systems
by Sümeye Nur Karahan, Merve Güllü, Mustafa Serdar Osmanca and Necaattin Barışçı
Future Internet 2026, 18(1), 17; https://doi.org/10.3390/fi18010017 - 28 Dec 2025
Abstract
Semantic communication systems leverage deep neural networks to extract and transmit essential information, achieving superior performance in bandwidth-constrained wireless environments. However, their vulnerability to backdoor attacks poses critical security threats, where adversaries can inject malicious triggers during training to manipulate system behavior. This paper introduces Selective Communication Unlearning (SCU), a novel defense mechanism based on Variational Information Bottleneck (VIB) principles. SCU employs a two-stage approach: (1) joint unlearning to remove backdoor knowledge from both encoder and decoder while preserving legitimate data representations, and (2) contrastive compensation to maximize feature separation between poisoned and clean samples. Extensive experiments on the RML2016.10a wireless signal dataset demonstrate that SCU achieves 629.5 ± 191.2% backdoor mitigation (5-seed average; 95% CI: [364.1%, 895.0%]), with peak performance of 1486% under optimal conditions, while maintaining only 11.5% clean performance degradation. This represents an order-of-magnitude improvement over detection-based defenses and fundamentally outperforms existing unlearning approaches that achieve near-zero or negative mitigation. We validate SCU across seven signal processing domains, four adaptive backdoor types, and varying SNR conditions, demonstrating unprecedented robustness and generalizability. The framework achieves a 243 s unlearning time, making it practical for resource-constrained edge deployments in 6G networks.
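As general background to the VIB mechanism this entry builds on (not code from the paper), the standard deep VIB objective combines a prediction loss with a KL compression penalty on a stochastic latent code; unlearning methods of this kind tune the compression side to squeeze out trigger-specific information. A minimal NumPy sketch, with a toy batch and all shapes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per sample."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)

def vib_objective(logits, labels, mu, logvar, beta=1e-3):
    """Variational IB bound: cross-entropy (prediction term) + beta * KL (compression term)."""
    shifted = logits - logits.max(axis=1, keepdims=True)           # stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    return ce + beta * kl_to_standard_normal(mu, logvar).mean()

# toy batch: 4 samples, 3 classes, 8-dimensional latent code
mu = rng.normal(size=(4, 8)) * 0.1
logvar = np.full((4, 8), -2.0)
logits = rng.normal(size=(4, 3))
labels = np.array([0, 1, 2, 0])
loss = vib_objective(logits, labels, mu, logvar)
```

Raising `beta` tightens the compression term, which is the lever such defenses adjust to discard backdoor-correlated features while keeping task-relevant ones.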
(This article belongs to the Special Issue Future Industrial Networks: Technologies, Algorithms, and Protocols)
18 pages, 2081 KB  
Article
Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction
by Guoliang Yang, Yuyu Zhang and Hao Yang
Sensors 2026, 26(1), 105; https://doi.org/10.3390/s26010105 - 23 Dec 2025
Abstract
The large scale and shape variation in breast lesions make their segmentation extremely challenging. A breast ultrasound image segmentation model integrating Mamba-CNN and feature interaction is proposed for breast ultrasound images with a large amount of speckle noise and multiple artifacts. The model first uses the visual state space model (VSS) as an encoder for feature extraction to better capture its long-range dependencies. Second, a hybrid attention enhancement mechanism (HAEM) is designed at the bottleneck between the encoder and the decoder to provide fine-grained control of the feature map in both the channel and spatial dimensions, so that the network captures key features and regions more comprehensively. The decoder uses transposed convolution to upsample the feature map, gradually increasing the resolution and recovering its spatial information. Finally, the cross-fusion module (CFM) is constructed to simultaneously focus on the spatial information of the shallow feature map as well as the deep semantic information, which effectively reduces the interference of noise and artifacts. Experiments are carried out on BUSI and UDIAT datasets, and the Dice similarity coefficient and HD95 metrics reach 76.04% and 20.28 mm, respectively, which show that the algorithm can effectively solve the problems of noise and artifacts in ultrasound image segmentation, and the segmentation performance is improved compared with the existing algorithms.
(This article belongs to the Section Sensing and Imaging)
17 pages, 1173 KB  
Article
AL-Net: Adaptive Learning for Enhanced Cell Nucleus Segmentation in Pathological Images
by Zhuping Chen, Sheng-Lung Peng, Rui Yang, Ming Zhao and Chaolin Zhang
Electronics 2025, 14(17), 3507; https://doi.org/10.3390/electronics14173507 - 2 Sep 2025
Abstract
Precise segmentation of cell nuclei in pathological images is the foundation of cancer diagnosis and quantitative analysis, but blurred boundaries, scale variability, and staining differences have long constrained its reliability. To address this, this paper proposes AL-Net—an adaptive learning network that breaks through these bottlenecks through three innovative mechanisms: First, it integrates dilated convolutions with attention-guided skip connections to dynamically integrate multi-scale contextual information, adapting to variations in cell nucleus morphology and size. Second, it employs self-scheduling loss optimization: during the initial training phase, it focuses on region segmentation (Dice loss) and later switches to a boundary refinement stage, introducing gradient manifold constraints to sharpen edge localization. Finally, it designs an adaptive optimizer strategy, leveraging symbolic exploration (Lion) to accelerate convergence, and switches to gradient fine-tuning after reaching a dynamic threshold to stabilize parameters. On the 2018 Data Science Bowl dataset, AL-Net achieved state-of-the-art performance (Dice coefficient 92.96%, IoU 86.86%), reducing boundary error by 15% compared to U-Net/DeepLab; in cross-domain testing (ETIS/ColonDB polyp segmentation), it demonstrated over 80% improvement in generalization performance. AL-Net establishes a new adaptive learning paradigm for computational pathology, significantly enhancing diagnostic reliability.
(This article belongs to the Special Issue Image Segmentation, 2nd Edition)
17 pages, 4019 KB  
Article
Oil-Painting Style Classification Using ResNet with Conditional Information Bottleneck Regularization
by Yaling Dang, Fei Duan and Jia Chen
Entropy 2025, 27(7), 677; https://doi.org/10.3390/e27070677 - 25 Jun 2025
Abstract
Automatic classification of oil-painting styles holds significant promise for art history, digital archiving, and forensic investigation by offering objective, scalable analysis of visual artistic attributes. In this paper, we introduce a deep conditional information bottleneck (CIB) framework, built atop ResNet-50, for fine-grained style classification of oil paintings. Unlike traditional information bottleneck (IB) approaches that minimize the mutual information I(X;Z) between input X and latent representation Z, our CIB minimizes the conditional mutual information I(X;Z|Y), where Y denotes the painting’s style label. We implement this conditional term using a matrix-based Rényi’s entropy estimator, thereby avoiding costly variational approximations and ensuring computational efficiency. We evaluate our method on two public benchmarks: the Pandora dataset (7740 images across 12 artistic movements) and the OilPainting dataset (19,787 images across 17 styles). Our method outperforms the plain ResNet baseline with a relative performance gain of 13.1% on Pandora and 11.9% on OilPainting. Beyond quantitative gains, our approach yields more disentangled latent representations that cluster semantically similar styles, facilitating interpretability.
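The matrix-based Rényi estimator mentioned here computes entropy directly from the eigenvalue spectrum of a trace-normalized kernel Gram matrix, which is what lets such a CIB term skip variational approximations. A rough NumPy sketch of that estimator (the RBF kernel width, α, and data shapes are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def matrix_renyi_entropy(X, sigma=1.0, alpha=2.0):
    """Matrix-based Renyi alpha-entropy of a sample batch X (n, d):
    build an RBF Gram matrix, normalize it to unit trace, and take the
    alpha-entropy of its eigenvalue spectrum."""
    n = len(X)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * sigma**2))
    # normalized Gram matrix: eigenvalues are non-negative and sum to 1
    d = np.sqrt(np.diag(K))
    A = K / np.outer(d, d) / n
    eig = np.clip(np.linalg.eigvalsh(A), 0.0, None)
    return np.log2(np.sum(eig**alpha)) / (1.0 - alpha)
```

With α = 2 this reduces to −log₂ tr(A²); identical samples give entropy 0, while n well-separated samples approach the maximum log₂ n. Joint and conditional quantities in this framework are built from Hadamard products of such normalized matrices.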
17 pages, 1312 KB  
Article
Uncertainty Detection: A Multi-View Decision Boundary Approach Against Healthcare Unknown Intents
by Yongxiang Zhang and Raymond Y. K. Lau
Appl. Sci. 2025, 15(13), 7114; https://doi.org/10.3390/app15137114 - 24 Jun 2025
Abstract
Chatbots, automatic dialogue systems empowered by deep learning-oriented AI technology, have gained increasing attention in healthcare e-services for their ability to provide medical information around the clock. A formidable challenge is that chatbot dialogue systems have difficulty handling queries with unknown intents due to technical bottlenecks and a restricted user-intent answering scope. Furthermore, the wide variation in users’ consultation needs and levels of medical knowledge further complicates the chatbot’s ability to understand natural human language. Failure to deal with unknown intents may lead to a significant risk of incorrect information acquisition. In this study, we develop an unknown intent detection model to facilitate chatbots’ decisions in responding to uncertain queries. Our work focuses on algorithmic innovation for high-risk healthcare scenarios, where asymmetric knowledge between patients and experts exacerbates intent recognition challenges. Given the multi-role context, we propose a novel query representation learning approach involving multiple views from chatbot users, medical experts, and system developers. Unknown intent detection is then accomplished through the transformed representation of each query, leveraging adaptive determination of intent decision boundaries. We conducted laboratory-level experiments and empirically validated the proposed method based on the real-world user query data from the Tianchi lab and medical information from the Xunyiwenyao website. Across all tested unknown intent ratios (25%, 50%, and 75%), our multi-view boundary learning method was proven to outperform all benchmark models on the metrics of accuracy score, macro F1-score, and macro F1-scores over known intent classes and over the unknown intent class.
(This article belongs to the Special Issue Digital Innovations in Healthcare)
22 pages, 1444 KB  
Article
Importance-Aware Resource Allocations for MIMO Semantic Communication
by Yue Cao, Youlong Wu, Lixiang Lian and Meixia Tao
Entropy 2025, 27(6), 605; https://doi.org/10.3390/e27060605 - 5 Jun 2025
Abstract
This study proposes a separate source-channel coding (SSCC) framework to address semantic communication challenges in MIMO systems, overcoming the limitations of joint source-channel coding (JSCC) in channel adaptation and model reusability. Traditional systems suffer from bit-level redundancy in 6G, while JSCC struggles with complex channel variations. Our solution decouples semantic processing from channel coding through a three-tier architecture: (1) a variational autoencoder (VAE)-based semantic encoder and decoder for source coding; (2) a communication-informed bottleneck attribution (CIBA) mechanism quantifying feature importance for learning tasks; and (3) an importance-aware resource allocation scheme aligning communication objectives with deep learning tasks. Systematic experiments validate CIBA’s effectiveness in deriving importance scores that bridge learning tasks and communication optimization. Comparisons of feature perturbation schemes confirm the necessity of importance-aware resource allocation, with the proposed allocation strategy outperforming conventional methods in task performance metrics. The SSCC design enhances model reusability while maintaining adaptability to diverse MIMO configurations. By integrating interpretable AI with resource management, this work establishes a foundation for SSCC semantic communication systems in resource-constrained environments, prioritizing semantic fidelity and task efficacy over bit-level redundancy. The methodology highlights the critical role of importance awareness in optimizing both communication efficiency and learning task performance.
(This article belongs to the Special Issue Semantic Information Theory)
21 pages, 6529 KB  
Article
The Supervised Information Bottleneck
by Nir Z. Weingarten, Zohar Yakhini, Moshe Butman and Ronit Bustin
Entropy 2025, 27(5), 452; https://doi.org/10.3390/e27050452 - 22 Apr 2025
Abstract
The Information Bottleneck (IB) framework offers a theoretically optimal approach to data modeling, although it is often intractable. Recent efforts have optimized supervised deep neural networks (DNNs) using a variational upper bound on the IB objective, leading to enhanced robustness to adversarial attacks. In these studies, supervision assumes a dual role: sometimes as a presumably constant and observed random variable and at other times as its variational approximation. This work proposes an extension to the IB framework, together with a derivation of its variational bound, that resolves this duality. Applying the resulting bound as an objective for supervised DNNs induces empirical improvements and provides an information-theoretic motivation for decoder regularization.
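As background (this is the standard deep VIB bound from the literature, not the paper's extended objective), the variational upper bound referred to here relaxes the intractable IB trade-off \(\max\, I(Z;Y) - \beta\, I(Z;X)\) into a minimizable training loss:

```latex
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x,y)}\,\mathbb{E}_{p_\theta(z \mid x)}\big[-\log q_\phi(y \mid z)\big]
  + \beta\,\mathbb{E}_{p(x)}\, D_{\mathrm{KL}}\big(p_\theta(z \mid x)\,\big\|\, r(z)\big)
```

where \(q_\phi(y \mid z)\) is a variational decoder and \(r(z)\) a variational prior over the latent code; the dual role of the label noted in the abstract enters through the decoder term.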
(This article belongs to the Section Information Theory, Probability and Statistics)
18 pages, 12627 KB  
Article
Cross-Domain Feature Fusion Network: A Lightweight Road Extraction Model Based on Multi-Scale Spatial-Frequency Feature Fusion
by Lin Gao, Tianyang Shi and Lincong Zhang
Appl. Sci. 2025, 15(4), 1968; https://doi.org/10.3390/app15041968 - 13 Feb 2025
Abstract
Road extraction is a key task in the field of remote sensing image processing. Existing road extraction methods primarily leverage spatial domain features of remote sensing images, often neglecting the valuable information contained in the frequency domain. Spatial domain features capture semantic information and accurate spatial details for different categories within the image, while frequency domain features are more sensitive to areas with significant gray-scale variations, such as road edges and shadows caused by tree occlusions. To fully extract and effectively fuse spatial and frequency domain features, we propose a Cross-Domain Feature Fusion Network (CDFFNet). The framework consists of three main components: the Atrous Bottleneck Pyramid Module (ABPM), the Frequency Band Feature Separator (FBFS), and the Domain Fusion Module (DFM). First, the FBFS is used to decompose image features into low-frequency and high-frequency components. These components are then integrated with shallow spatial features and deep features extracted through the ABPM. Finally, the DFM is employed to perform spatial–frequency feature selection, ensuring consistency and complementarity between the spatial and frequency domain features. The experimental results on the CHN6_CUG and Massachusetts datasets confirm the effectiveness of CDFFNet.
(This article belongs to the Special Issue Intelligent Computing and Remote Sensing—2nd Edition)