MDPI - Publisher of Open Access Journals

29 pages, 17421 KB

Open Access Article

Cross-Modality Spectral Expansion Combined with Physical–Semantic Dual Priors for Cloud Detection in GF-1 Imagery

by Jing Zhang, Kexiao Shen, Liangnong Song, Shiyi Pan and Yunsong Li

Remote Sens. 2026, 18(11), 1689; https://doi.org/10.3390/rs18111689 (registering DOI) - 23 May 2026

Cloud detection in high-resolution Gaofen-1 (GF-1) imagery is challenging due to the absence of short-wave infrared (SWIR) bands, which prevents the use of physically interpretable indices such as the Normalized Difference Snow Index (NDSI) and often leads to severe cloud–snow confusion. To address this limitation, we propose a unified framework, termed the Cross-Modality Spectral Expansion and Dual-Prior Network (CMSE-DPNet), that integrates cross-modality spectral expansion with physical–semantic dual priors. First, an improved CycleGAN reconstructs 13-band pseudo-Sentinel-2 spectra from four-band GF-1 imagery, enabling the computation of snow-sensitive physical indices. Second, a Snow-Aware Feature Attention Guidance Module (SAFAGM) introduces pixel-level physical priors derived from NDSI, while a Label-Guided Channel Attention Module (LG-CAM) injects scene-level semantic priors inferred from geographic metadata using a large language model. These complementary priors guide the network to better distinguish clouds from spectrally similar backgrounds. Experiments on the GF-1 dataset show that the proposed method achieves an F1-score of 94.41% and an Intersection over Union (IoU) of 89.40%, outperforming several state-of-the-art cloud detection methods. The results indicate that cross-modality spectral expansion combined with physical–semantic prior guidance effectively improves cloud detection performance in complex cloud–snow coexistence scenarios.
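As a rough illustration of the physical index this abstract relies on, NDSI is conventionally defined as (Green - SWIR) / (Green + SWIR). The sketch below assumes per-pixel reflectance values held in plain lists; the function name `ndsi`, the `eps` guard, and the sample reflectances are hypothetical and do not come from the paper.

```python
def ndsi(green, swir, eps=1e-6):
    """Per-pixel NDSI = (green - swir) / (green + swir)."""
    # eps is an illustrative guard against division by zero on dark pixels.
    return [(g - s) / (g + s + eps) for g, s in zip(green, swir)]

# Hypothetical reflectances: snow is bright in green and dark in SWIR,
# whereas cloud stays bright in both bands.
green = [0.80, 0.75]   # [snow pixel, cloud pixel]
swir = [0.10, 0.60]
values = ndsi(green, swir)
```

With these made-up inputs the snow pixel receives a much higher NDSI than the cloud pixel, which is the contrast that a missing SWIR band makes unavailable on GF-1.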

(This article belongs to the Special Issue AI-Driven Hyperspectral Image Classification and Processing in Remote Sensing)

16 pages, 2831 KB

Open Access Article

2.5D Context Encoding with Latent-Space Variational Diffusion for CBCT-to-CT Synthesis

by Yeon Su Park and Ji Hye Won

Electronics 2026, 15(11), 2246; https://doi.org/10.3390/electronics15112246 - 22 May 2026

Abstract

Cone-beam computed tomography (CBCT) is widely used in image-guided radiotherapy because of its low radiation dose and on-board acquisition capability. However, CBCT images often suffer from scatter artifacts, increased noise, reduced soft-tissue contrast, and inaccurate Hounsfield Unit (HU) values, which limit their direct use for accurate dose calculation and quantitative analysis. To address this limitation, we propose a CBCT-to-CT synthesis framework based on 2.5D context encoding (concatenating five adjacent slices along the channel dimension) and latent-space variational diffusion. The proposed method combines a Vector Quantized Variational Autoencoder (VQ-VAE) and a U-shaped Vision Transformer (U-ViT)-based latent-space Variational Diffusion Model (VDM) to translate CBCT images into synthetic CT (sCT) images in a compressed latent space. To incorporate inter-slice anatomical context while preserving the computational efficiency of 2D processing, five adjacent CBCT slices are concatenated along the channel dimension and used as input. We evaluated the proposed method on the SynthRAD2025 paired CBCT-CT dataset covering head-and-neck, thoracic, and abdominal regions. Under the provided benchmark setting, quantitative evaluation on the validation set showed that the proposed 2.5D model improved peak signal-to-noise ratio (PSNR) from 25.39 dB to 27.44 dB (averaged across regions), structural similarity index measure (SSIM) from 0.813 to 0.846, reduced mean squared error (MSE) from 0.00313 to 0.00200, and lowered Fréchet inception distance (FID) from 1009.33 to 869.53 compared with the 2D baseline. Qualitative results also showed improved anatomical consistency and reduced artifact-related distortions. 
These findings suggest that neighboring-slice context can enhance HU fidelity and overall image quality in a computationally practical synthesis framework, supporting the usefulness of efficient AI-based cross-modality reconstruction for radiotherapy-related imaging workflows.
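The 2.5D input construction described above (five adjacent slices concatenated along the channel dimension) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `stack_2p5d` is hypothetical, and clamping at the volume borders is an assumption, since the abstract does not say how edge slices are handled.

```python
def stack_2p5d(volume, context=5):
    """Turn a volume (list of 2D slices) into per-slice inputs that carry
    `context` adjacent slices as channels."""
    half = context // 2
    last = len(volume) - 1
    stacked = []
    for i in range(len(volume)):
        # Assumption: clamp neighbour indices so border slices repeat the edge.
        idx = [min(max(i + d, 0), last) for d in range(-half, half + 1)]
        stacked.append([volume[j] for j in idx])
    return stacked

# Toy volume: 6 slices of 2x2 pixels, each filled with its own slice index.
volume = [[[z, z], [z, z]] for z in range(6)]
inputs = stack_2p5d(volume)
```

Each element of `inputs` is a five-channel 2D sample, so a 2D network can see neighbouring anatomy without the cost of full 3D convolutions.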

(This article belongs to the Special Issue Intelligent Approaches for Solving Software Problems with AI Techniques, 2nd Edition)

33 pages, 6735 KB

Open Access Article

ADDFNet: A Robotic Grasping Depth Map Completion Network Integrating Differential Enhancement Convolution and Hybrid Attention

by Nan Liu, Yi-Horng Lai, Yue Wu, Jiaen Wang and Xian Yu

Actuators 2026, 15(6), 280; https://doi.org/10.3390/act15060280 - 22 May 2026

Abstract

In the field of industrial robotic vision, accurate recognition and localization of transparent objects pose significant challenges. Unlike opaque objects, transparent objects are difficult to distinguish in RGB images, and due to refraction and reflection, their depth information often suffers from large-area missing or erroneous values, leading to failed grasp pose prediction. Therefore, depth completion is crucial for transparent object grasping tasks. However, existing depth completion methods still exhibit obvious limitations. Multi-stage optimization methods, while achieving high accuracy, involve complex pipelines and high computational costs. Single-stage end-to-end networks, when processing sparse edge features of transparent objects that are also contaminated by background interference, are constrained by the receptive field and smoothing effect of conventional convolutions, often resulting in contour blurring and loss of geometric details. Moreover, existing methods still lack sufficient capability in modeling multi-directional gradient variations of transparent objects under complex backgrounds. To address these issues, this paper proposes ADDFNet for transparent object depth completion, achieving synergistic improvement in accuracy and robustness through two key designs: MDAM and CMFR. To tackle the problem of sparse edge features of transparent objects that are easily disturbed by noise, we design the Multi-directional Differential Attention Module (MDAM), which explicitly extracts multi-directional gradient information through multi-branch differential convolution. Within MDAM, we introduce the Detail Enhancement Differential sub-Module (DEDM) and the Dynamic Convolution with Symmetry-enhanced Geometry Attention sub-module (DSCA) to enhance the network’s perception of fine contours and improve global–local synergistic modeling capability. 
To address insufficient cross-modal information interaction, we introduce the Cross-Modal Feature Refinement (CMFR) module, which utilizes RGB context to guide and enhance depth features layer by layer during the encoding stage, improving the accuracy and robustness of depth completion while mitigating feature degradation caused by traditional simple fusion approaches. Experimental results on the ClearPose and TransCG datasets demonstrate that ADDFNet outperforms comparison methods in terms of RMSE, REL, MAE, and threshold accuracy metrics, exhibiting more stable performance in edge recovery and internal detail reconstruction of transparent objects.
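For readers unfamiliar with the depth-completion metrics named above, the following sketch computes RMSE, REL, MAE, and a threshold accuracy over valid pixels. The function name, the validity rule (positive depths only), and the 1.05 threshold are assumptions made for illustration; the abstract does not state which thresholds were used.

```python
import math

def depth_metrics(pred, gt, threshold=1.05):
    """RMSE, REL, MAE and threshold accuracy over pixels with valid depth."""
    # Assumption: a depth of 0 marks a missing value and is skipped.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    n = len(pairs)
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    rel = sum(abs(p - g) / g for p, g in pairs) / n
    mae = sum(abs(p - g) for p, g in pairs) / n
    # Fraction of pixels whose ratio max(p/g, g/p) falls under the threshold.
    delta = sum(max(p / g, g / p) < threshold for p, g in pairs) / n
    return {"rmse": rmse, "rel": rel, "mae": mae, "delta": delta}

pred = [1.00, 2.04, 0.50, 3.00]
gt = [1.00, 2.00, 0.00, 4.00]   # third pixel has no ground-truth depth
m = depth_metrics(pred, gt)
```

Lower RMSE, REL, and MAE are better, while a higher threshold accuracy is better.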

(This article belongs to the Special Issue Actuation and Sensing of Intelligent Soft Robots—2nd Edition)

20 pages, 4449 KB

Open Access Article

Multimodal Factor Analysis Reveals Five Robust Phenotypes of Healthy Aging in a Russian Population Cohort

by Lyubov V. Machekhina, Alexandra A. Melnitskaya, Mikhail S. Arbatskiy, Anna V. Permyakova, Alexey V. Churov, Irina D. Strazhesko and Olga N. Tkacheva

Biomedicines 2026, 14(5), 1158; https://doi.org/10.3390/biomedicines14051158 - 20 May 2026

Viewed by 154

Abstract

Background/Objectives: Population aging necessitates a shift from disease-focused paradigms to a holistic characterization of biological aging processes. While chronological age remains the primary metric, it poorly captures inter-individual variability in physiological resilience and health trajectories. This study aimed to identify robust, multidimensional aging phenotypes independent of chronological age and sex using integrative factor analysis of heterogeneous biomedical data from a Russian cohort—a population underrepresented in aging research. Methods: We analyzed data from 1201 conditionally healthy adults (aged 18–99 years) enrolled in the RUSS AGE study. A comprehensive dataset comprising 118 variables across 11 modalities—including biochemical markers, anthropometry, physical function, cognitive-emotional assessments, lifestyle factors, and psychosocial indicators—was integrated using Multi-Omics Factor Analysis v2 (MOFA2). Following the extraction of 16 latent factors and residualization for demographic confounders, consensus clustering was performed to identify distinct aging phenotypes. Phenotype stability was internally recapitulated using gradient-boosting classifiers (XGBoost, CatBoost) in a stratified five-fold cross-validation and on a held-out test set. Results: MOFA2 identified 16 stable latent factors, explaining 21.3% of the total variance and capturing coordinated variation across metabolic, inflammatory, cardiovascular, cognitive, and behavioral domains. Consensus clustering revealed five reproducible phenotypes—Anemic (n = 82), Metabolically Subcompensated (n = 99), Metabolically Decompensated (n = 304), Overloaded (n = 302), and Balanced (n = 414)—characterized by distinct multisystem profiles independent of age (p > 0.05 after FDR correction) and sex. 
Supervised classification achieved high discriminative performance (macro F1-score = 0.75, OvR ROC-AUC = 0.93 on the held-out test set), quantifying the internal reconstructability of the phenotype labels from the original feature space rather than external generalization to an independent cohort. Conclusions: This study demonstrates the feasibility of data-driven, biologically coherent phenotyping of healthy aging using integrative factor analysis. The identified phenotypes represent stable configurations of physiological, functional, and psychosocial characteristics that transcend chronological age, providing a foundation for the future development of risk-stratification tools, preventive interventions, and biological-age calculators, subject to subsequent validation in longitudinal and independent external cohorts.
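The residualization step mentioned in the Methods (removing demographic effects from the latent factors before clustering) can be illustrated with a one-covariate ordinary least squares fit. This is a simplified sketch under stated assumptions: the study adjusts for both age and sex, which would call for a multiple regression, and the function name and toy numbers below are hypothetical.

```python
def residualize(y, x):
    """Remove the linear effect of covariate x from y using simple OLS."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # What is left after subtracting the fitted line is uncorrelated with x.
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

age = [20, 40, 60, 80]            # hypothetical covariate
factor = [1.0, 2.5, 2.5, 4.0]     # hypothetical latent factor scores
resid = residualize(factor, age)
```

Clustering on `resid` rather than `factor` is what allows phenotypes to be described as independent of the adjusted covariate.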

(This article belongs to the Section Molecular and Translational Medicine)

28 pages, 15464 KB

Open Access Article

Spatio-Temporal Reconstruction of MODIS LAI Using a Self-Supervised Framework for Vegetation Dynamics Monitoring Across China

by Huijing Wu, Ting Tian, Haitao Wei and Hongwei Li

Land 2026, 15(5), 833; https://doi.org/10.3390/land15050833 (registering DOI) - 13 May 2026

Viewed by 162

Abstract

Leaf Area Index (LAI) is a key biophysical parameter for characterizing terrestrial vegetation dynamics and land surface processes. Time-series MODIS LAI products are widely used in ecological and land-related research, but cloud contamination and sensor noise lead to widespread spatio-temporal gaps, limiting their ability to support long-term, consistent vegetation monitoring over large areas. To address this issue, this study proposes a novel self-supervised LAI reconstruction framework (SSLAI) for generating gap-free and ecologically consistent LAI datasets across China. The framework integrates cross-modal environmental fusion, multi-scale spatio-temporal modeling, and adaptive phenological constraints to ensure the reconstructed LAI aligns with realistic vegetation growth rhythms. SSLAI outperforms seven traditional and state-of-the-art deep learning methods, maintaining a root mean square error (RMSE) below 0.20 even with 16 missing time windows. Field validation confirms its high accuracy, with a coefficient of determination (R²) of 0.885 and an RMSE of 0.477. Furthermore, SSLAI’s response to meteorological changes aligns with ecological principles, demonstrating favorable physical interpretability and ecological rationality. The reconstructed LAI exhibits superior spatial completeness and temporal consistency compared with MODIS, VIIRS, and GLASS products, and performs robustly under variable climatic conditions. This study provides an effective self-supervised solution for MODIS LAI gap-filling over large regions, and the generated high-quality LAI dataset can serve as a reliable data foundation for vegetation dynamics monitoring, land surface modeling, and global change research.
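The evaluation idea behind self-supervised gap-filling (hide observed values, reconstruct them, and score only the hidden positions) can be sketched as below. Linear interpolation is used purely as a stand-in reconstructor and is not the SSLAI model; the function names and the toy series are hypothetical, and the code assumes at least one observed value in the series.

```python
import math

def fill_gaps_linear(series):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled

def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

truth = [1.0, 2.0, 3.0, 4.0, 3.0]   # toy LAI time series
hidden = [1, 3]                      # positions masked out for evaluation
masked = [None if i in hidden else v for i, v in enumerate(truth)]
filled = fill_gaps_linear(masked)
score = rmse([filled[i] for i in hidden], [truth[i] for i in hidden])
```

Because the masked values are known, the RMSE on `hidden` can be computed without any external labels, which is what makes this kind of evaluation self-supervised.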

21 pages, 2388 KB

Open Access Article

FedMIR: Multimodal Federated Learning with Missing Modality Imputation and Distribution-Aware Routing

by Hongyu Xiong and Ming Dai

Sensors 2026, 26(10), 2954; https://doi.org/10.3390/s26102954 - 8 May 2026

Viewed by 242

Abstract

Existing multimodal federated learning methods typically assume complete modality availability and struggle with heterogeneity between training and testing data distributions, making them unsuitable for handling missing modalities and distribution drift in distributed learning scenarios such as the Internet of Things (IoT). To address these challenges, we present FedMIR, a novel framework for multimodal federated learning. Our key observation is that heterogeneous modalities can be mapped into a shared semantic space, where cross-modal dependencies can be effectively modeled. Based on this insight, FedMIR leverages contrastive learning to align image–text modalities in a shared latent space and employs conditional generation to reconstruct missing modality representations. The completed representations are then routed through a mixture-of-experts backbone conditioned on the estimated distribution state. FedMIR shares only model parameters and distribution statistics with the server. This design enables the model to operate under missing modality settings while adaptively allocating expert knowledge to cope with distribution drift. We validate FedMIR on federated image–text retrieval benchmarks under heterogeneity and missing data conditions, demonstrating its effectiveness compared to representative federated learning baselines.
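Contrastive image-text alignment of the kind mentioned above is commonly implemented with a symmetric InfoNCE loss. The sketch below shows that generic loss in plain Python; it is an assumption that FedMIR uses this particular form, and the function name, the temperature of 0.07, and the toy embeddings are illustrative only.

```python
import math

def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings."""
    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    img = [normalise(v) for v in image_emb]
    txt = [normalise(v) for v in text_emb]
    n = len(img)
    # Cosine similarity between every image and every text, scaled.
    logits = [[sum(a * b for a, b in zip(img[i], txt[j])) / temperature
               for j in range(n)] for i in range(n)]

    def cross_entropy_diagonal(matrix):
        total = 0.0
        for i, row in enumerate(matrix):
            top = max(row)
            log_sum_exp = top + math.log(sum(math.exp(v - top) for v in row))
            total += log_sum_exp - row[i]   # matching pair sits on the diagonal
        return total / len(matrix)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

aligned = info_nce([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = info_nce([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

The loss is near zero when each image embedding matches its own text and large when the pairs are swapped, which is the pressure that pulls the two modalities into one shared space.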

(This article belongs to the Section Internet of Things)

28 pages, 3943 KB

Open Access Article

Weak Calibration Cross-Fusion Framework for Multi-Modal 3D Object Detection on Unmanned Surface Vehicles

by Yong Li, Dehang Lian, Jialong Du, Dongxu Gao, Xiangrong Xu and Xiang Gong

J. Mar. Sci. Eng. 2026, 14(9), 867; https://doi.org/10.3390/jmse14090867 (registering DOI) - 6 May 2026

Viewed by 291

Abstract

The field of intelligent transportation on inland waterways is experiencing rapid growth, driven by the global pursuit of enhanced waterway safety, operational efficiency, and environmental sustainability. In real-world autonomous operation scenarios of unmanned surface vehicles (USVs), image-based 2D object detection methods are insufficient to meet the demands of 3D environmental modeling and accurate perception of dynamic objects. Existing 3D perception systems for USVs depend heavily on precise sensor calibration. However, projection offsets between point clouds and images, caused by water surface fluctuations and complex outdoor environments, hinder the practical deployment of these methods. To address these limitations, we propose a weak calibration multi-modal 3D object detection algorithm based on cross-view fusion, termed RCF-Free (Radar-Camera Fusion, Free from precise calibration). Inspired by autonomous driving solutions, we design a Triple-Path Cross-View Fusion module that achieves high-quality cross-view feature fusion without requiring accurate calibration parameters, while simultaneously detecting complete bird’s-eye view (BEV) bounding boxes. We further enhance the spatial layout comprehension of the visual branch through a Mobile Self-Attention Module (MAM) and effectively encode sparse point cloud features in BEV space using a dedicated BEV-Point feature encoder. Additionally, we reconstruct and introduce two water-related 3D object detection datasets, FloW-BEV and WaterScenes-BEV. Experimental results demonstrate that RCF-Free achieves $mAP_{BEV}50$ scores of 60.5% and 69.3% on the FloW-BEV and WaterScenes-BEV datasets, respectively, showing the effectiveness in water surface object detection. Moreover, on the DAIR-V2X-I dataset for autonomous driving scenarios, the model attains $mAP_{3D}50$ scores of 73.3%, 61.2%, and 61.2% across three task difficulty levels, illustrating strong cross-domain generalization capability.
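The BEV detection metric reported above presumably counts a detection as correct when its overlap with a ground-truth box reaches an IoU of 0.5, which is the usual meaning of a trailing 50 in mAP metrics; that reading is an assumption here. The sketch below uses axis-aligned boxes for simplicity, whereas real BEV boxes are typically rotated, and all names and coordinates are hypothetical.

```python
def bev_iou(a, b):
    """IoU of two axis-aligned BEV boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

pred_box = (0, 0, 4, 2)   # hypothetical predicted footprint, in metres
gt_box = (1, 0, 5, 2)     # hypothetical ground-truth footprint
iou = bev_iou(pred_box, gt_box)
is_true_positive = iou >= 0.5
```

Average precision is then obtained by ranking detections by confidence and integrating precision over recall, with the mean taken across classes.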

(This article belongs to the Section Ocean Engineering)

31 pages, 4372 KB

Open Access Article

Text-Anchored Residual Cross-Modal Fusion for Multimodal Sentiment Analysis: A Unified and Protocol-Aware Evaluation on MVSA-Single

by Kosala Natarajan and Nirmalrani Vairaperumal

Appl. Sci. 2026, 16(9), 4514; https://doi.org/10.3390/app16094514 - 4 May 2026

Viewed by 444

Abstract

Multimodal sentiment analysis aims to infer sentiment polarity by jointly modeling textual and visual information. Despite recent advances in pretrained language and vision encoders, sentiment prediction from social media posts remains challenging because textual and visual modalities are often weakly aligned, semantically noisy, and unevenly informative. Recent studies have emphasized the importance of fine-grained cross-modal fusion, stronger pretrained visual representations, and strategies for reducing modality bias in MVSA-style benchmarks. In this work, we present a systematic implementation-driven study of multimodal sentiment classification on MVSA-Single. We first construct a clean three-class sentiment-consistent subset and then implement a wide set of baselines, including text-only DistilBERT, image-only ResNet18, simple multimodal fusion, gated fusion, residual fusion, multi-task contrastive fusion, DINOv2-based fusion, and attention bottleneck fusion. Building on these experiments, we propose a semantic cross-modal fusion architecture that combines a RoBERTa text encoder with a CLIP vision encoder through cross-attention, allowing textual representations to selectively attend to sentiment-relevant visual signals. On the clean 2592-sample subset, the proposed model achieved the best overall performance, reaching 82.63% validation accuracy, 79.62% test accuracy and 79.42 weighted F1, outperforming all other implemented baselines under the same experimental pipeline and dataset setting. To improve comparability with prior MVSA-Single studies, we additionally reconstructed a broader processed setting from the 4511-sample HDF5 version and aligned 4318 text–image pairs with original image files. On this harder protocol-matched setting, the same model achieved 72.69% test accuracy and 70.66 weighted F1, revealing a substantial performance gap caused by dataset construction and residual multimodal noise. 
These findings show that strong cross-modal semantic alignment contributes more to robust multimodal sentiment prediction than simply increasing architectural complexity and that CLIP-based visual semantics are more beneficial than DINOv2 in our text–image sentiment setting.
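The cross-attention fusion described above, in which a text representation attends over visual features and the result is added back to the text, can be sketched in a few lines. This is a single-head illustration without learned projection matrices, so it is far simpler than a RoBERTa plus CLIP model; the function name and the toy vectors are hypothetical.

```python
import math

def cross_attend(text_vec, image_patches):
    """One text query attends over image patch vectors; the attended visual
    summary is added back to the text vector as a residual."""
    d = len(text_vec)
    scores = [sum(q * k for q, k in zip(text_vec, patch)) / math.sqrt(d)
              for patch in image_patches]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    attended = [sum(w * patch[i] for w, patch in zip(weights, image_patches))
                for i in range(d)]
    fused = [t + a for t, a in zip(text_vec, attended)]
    return weights, fused

text_vec = [1.0, 0.0]
patches = [[4.0, 0.0], [0.0, 4.0]]   # first patch aligns with the text vector
weights, fused = cross_attend(text_vec, patches)
```

The patch that agrees with the text receives most of the attention weight, so the text stays the anchor and the image contributes only where it is relevant.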

48 pages, 2556 KB

Open Access Review

Security and Privacy in Generative Semantic Communication Systems: A Comprehensive Survey

by Mehwish Ali Naqvi and Insoo Sohn

Mathematics 2026, 14(9), 1522; https://doi.org/10.3390/math14091522 - 30 Apr 2026

Viewed by 388

Abstract

Semantic communication (SemCom) has emerged as a task-oriented communication paradigm that prioritizes meaning delivery over exact bit recovery. The integration of generative artificial intelligence (GenAI) into SemCom further enables knowledge-guided inference, multimodal reconstruction, and semantic compression through architectures such as large language models, variational autoencoders, generative adversarial networks, and diffusion models. At the same time, this integration introduces new security and privacy risks, including semantic eavesdropping, model inversion, semantic jamming, covert backdoors, prompt manipulation, and knowledge-base leakage, which are not adequately captured by conventional communication security models. In this survey, we provide a security-centric review of GenAI-assisted semantic communication systems by organizing the literature according to threat models, attack surfaces, defence strategies, and semantic modalities across text, image, and multimodal settings. The survey was conducted using IEEE Xplore, ACM Digital Library, SpringerLink, arXiv, and Google Scholar. Approximately 180 papers were initially screened, and 53 representative studies published between 2021 and 2026 were selected for detailed review. Based on this analysis, we classify the major threats into adversarial perturbation, jamming, poisoning and backdoor attacks, privacy leakage and semantic eavesdropping, and generative-model-specific vulnerabilities involving diffusion, large language models, and multimodal foundation models. We further map the corresponding defences, including adversarial training, model ensembling, semantic-aware encryption, diffusion-guided denoising, privacy-preserving representation learning, and secure resource allocation. 
The survey also identifies persistent open challenges, including the lack of standardized semantic security metrics, unified benchmarks, cross-layer evaluation frameworks, and robust defences for GenAI-native and multimodal semantic communication systems. Overall, this work provides a structured reference for the design of secure, trustworthy, and attack-resilient generative semantic communication systems for future intelligent networks.

(This article belongs to the Special Issue Advances in Blockchain and Intelligent Computing)

22 pages, 62112 KB

Open Access Article

Semantic-Guided Multi-Level Collaborative Fusion Network for Visible and Infrared Images

by Lijun Yuan, Chuanjiang Xie, Ming Yang, Xiaoguang Tu, Qiqin Li and Xinyu Zhu

Sensors 2026, 26(9), 2577; https://doi.org/10.3390/s26092577 - 22 Apr 2026

Viewed by 233

Abstract

The main value of image fusion lies in improving downstream tasks. However, the fused representations produced by current approaches lack semantic information, which limits their compatibility with subsequent tasks. To mitigate this limitation, a semantic-guided multi-level collaborative fusion network, termed DSIFuse, is proposed. By leveraging semantic priors and global context extracted from auxiliary segmentation branches, a multi-level interaction space is constructed to explicitly refine cross-modal features. Specifically, a cross-modal feature correction mechanism is designed to enhance semantic alignment by injecting complementary visible–infrared information at each layer, while a three-level interaction strategy gradually integrates unimodal features and semantic maps to generate semantically enriched representations. To mitigate semantic information loss during image reconstruction, a semantic compensation block is employed, incorporating interactive representations from prior layers and global semantic maps into the multi-scale decoder. Finally, the overall loss combines semantic supervision, gradient, and intensity terms. Experiments on public datasets indicate that DSIFuse generates clear fused images with improved structural consistency and reduced artifacts. Under a unified benchmark, the fused representations also yield improved performance in downstream object detection tasks.

(This article belongs to the Section Sensing and Imaging)

35 pages, 5739 KB

Open Access Article

Multi-Scale Atrous Feature Fusion Based on a VGG19-UNet Encoder for Brain Tumor Segmentation

by Shoffan Saifullah and Rafał Dreżewski

Appl. Sci. 2026, 16(8), 3971; https://doi.org/10.3390/app16083971 - 19 Apr 2026

Viewed by 311

Abstract

Accurate brain tumor segmentation from magnetic resonance imaging (MRI) remains challenging due to heterogeneous tumor morphology, intensity variability, and multi-scale structural complexity. This study proposes a DeepLabV3+-based segmentation framework integrating a VGG19-UNet encoder, Atrous Spatial Pyramid Pooling (ASPP), and low-level feature refinement to simultaneously capture hierarchical semantics and boundary-sensitive spatial details. The architecture enhances receptive field coverage without additional downsampling while preserving fine-grained contour information during reconstruction. Extensive evaluation was conducted on the Figshare Brain Tumor Segmentation (FBTS) dataset and the BraTS 2021 and BraTS 2018 benchmarks, focusing on Whole Tumor segmentation across multiple MRI modalities and tumor grades. Under five-fold cross-validation, the proposed model achieved a mean Dice Similarity Coefficient of 0.9717 and Jaccard Index of 0.9456 on FBTS, with stable and competitive performance across FLAIR, T1, T2, and T1CE modalities in both HGG and LGG cases. Boundary-level analysis further confirmed controlled Hausdorff Distance and low Average Symmetric Surface Distance. Statistical validation and ablation analysis demonstrate consistent improvements over baseline U-Net configurations. The proposed framework provides a robust and computationally efficient solution for automated brain tumor segmentation across heterogeneous datasets.
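The two overlap metrics reported above can be computed from a binary mask pair as in this small sketch. The function name and the toy masks are hypothetical; masks are flattened to lists of 0 and 1 for brevity.

```python
def dice_and_jaccard(pred, target):
    """Dice coefficient and Jaccard index for two flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return dice, jaccard

pred = [1, 1, 1, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, target)
```

For a single mask pair the two are tied by Jaccard = Dice / (2 - Dice). That identity does not carry over to averages over folds or images, so reported mean values need not satisfy it exactly.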

(This article belongs to the Special Issue Research on Artificial Intelligence in Healthcare)

52 pages, 14386 KB

Open Access Review

Trustworthy Intelligence: Split Learning–Embedded Large Language Models for Smart IoT Healthcare Systems

by Mahbuba Ferdowsi, Nour Moustafa, Marwa Keshk and Benjamin Turnbull

Electronics 2026, 15(7), 1519; https://doi.org/10.3390/electronics15071519 - 4 Apr 2026

Viewed by 583

Abstract

The Internet of Things (IoT) plays an increasingly central role in healthcare by enabling continuous patient monitoring, remote diagnosis, and data-driven clinical decision-making through interconnected medical devices and sensing infrastructures. Despite these advances, IoT healthcare systems remain constrained by persistent challenges related to data privacy, computational efficiency, scalability, and regulatory compliance. Federated learning (FL) reduces reliance on centralised data aggregation but remains vulnerable to inference-based privacy risks, while edge-oriented approaches are limited by device heterogeneity and restricted computational and energy resources; the deployment of large language models (LLMs) further exacerbates concerns surrounding privacy exposure, communication overhead, and practical feasibility. This study introduces Trustworthy Intelligence (TI) as a guiding framework for privacy-preserving distributed intelligence in IoT healthcare, explicitly integrating predictive performance, privacy protection, and deployment-oriented system design. Within this framework, split learning (SL) is examined as a core architectural mechanism and extended to support split-aware LLM integration across heterogeneous devices, supported by a structured taxonomy spanning architectural configurations, system adaptation strategies, and evaluation considerations. The study establishes a systematic mapping between SL design choices and representative healthcare scenarios, including wearable monitoring, multi-modal data fusion, clinical text analytics, and cross-institutional collaboration, and analyses key technical challenges such as activation-level privacy leakage, early-round vulnerability, reconstruction risks, and communication–computation trade-offs. An energy- and resource-aware adaptive cut layer selection strategy is outlined to support efficient deployment across devices with varying capabilities. 
A proof-of-concept experimental evaluation compares the proposed SL–LLM framework with centralised learning (CL), federated learning (FL), and conventional SL in terms of training latency, communication overhead, model accuracy, and privacy exposure under realistic IoT constraints. The evaluation provides system-level evidence for the applicability of the TI framework in distributed healthcare environments and outlines directions for clinically viable and regulation-aligned IoT healthcare intelligence.
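The abstract mentions an energy- and resource-aware adaptive cut layer selection strategy but gives no details. The following is a minimal sketch of one way such a selection could work, not the authors' method. It assumes per-layer compute and energy costs are known in advance, and the names `DeviceBudget` and `select_cut_layer` are hypothetical. It also ignores activation size and communication cost, which a real strategy would likely weigh.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DeviceBudget:
    """Hypothetical per-batch resource limits for one IoT device."""
    max_flops: float      # compute the device can spend per batch
    max_energy_j: float   # energy the device can spend per batch (joules)


def select_cut_layer(
    layer_flops: Sequence[float],
    layer_energy_j: Sequence[float],
    budget: DeviceBudget,
) -> int:
    """Return how many leading layers the device can host within its budget.

    Greedy assumption: layers run in order, so the cut is placed after the
    deepest layer whose cumulative cost still fits. A result of 0 means the
    device cannot host even the first layer.
    """
    if len(layer_flops) != len(layer_energy_j):
        raise ValueError("per-layer cost lists must have the same length")

    used_flops = 0.0
    used_energy = 0.0
    cut = 0
    for index, (flops, energy) in enumerate(zip(layer_flops, layer_energy_j)):
        if (used_flops + flops > budget.max_flops
                or used_energy + energy > budget.max_energy_j):
            break
        used_flops += flops
        used_energy += energy
        cut = index + 1
    return cut


# Illustrative (made-up) costs for a four-layer model.
flops_per_layer = [1.0, 2.0, 4.0, 8.0]
energy_per_layer = [0.1, 0.2, 0.4, 0.8]
wearable = DeviceBudget(max_flops=4.0, max_energy_j=1.0)
wearable_cut = select_cut_layer(flops_per_layer, energy_per_layer, wearable)
```

Under this sketch a more capable device receives a deeper cut, so fewer layers are left to the server. Because a cut of 0 would mean sending raw inputs off the device, a deployment would probably treat that case as "device unsuitable" rather than as a valid split.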

(This article belongs to the Special Issue Engineering Multimodal Medical Digital Twins: Sensor Fusion, Multimodal Learning, and Edge–Cloud AI for Real-Time Personalized Care)

17 pages, 1622 KB

Open AccessArticle

Comparison of Limb Symmetry Index Values Across Different Knee Flexor Strength Testing Conditions in Healthy Male Recreational Athletes

by Natalia Urban and Aleksandra Królikowska

Appl. Sci. 2026, 16(7), 3440; https://doi.org/10.3390/app16073440 - 1 Apr 2026

Viewed by 752

Abstract

Background/Objectives: Restoring lower-limb strength and symmetry is crucial after ACL injury and reconstruction. The limb symmetry index (LSI) is often used to assess strength symmetry for return-to-sport decisions, but the choice of assessment method can influence outcomes. This study aimed to compare LSI across common knee flexor testing methods in healthy male athletes and to examine associations between absolute strength outcomes, thereby establishing baseline reference values for LSI in a healthy population. Methods: Twenty-two healthy recreationally active males participated in this prospective cross-sectional study. Knee flexor strength was assessed bilaterally using three force plate isometric tests, a static dynamometer-based test (isometric), and isokinetic dynamometer-based tests. Absolute strength values were normalized to body mass. LSI values were calculated for each testing condition. Differences in LSI across modalities were analyzed with repeated-measures ANOVA, and associations between normalized strength outcomes were assessed using Pearson correlation coefficients. Results: LSI values ranged from 96.69 to 101.83 across the testing conditions, with no significant differences observed between measures. Normalized absolute strength outcomes demonstrated very strong correlations within the same measurement category (r = 0.86–0.94 for force plate tests and r = 0.88–0.96 for isokinetic tests). In contrast, correlations between isometric and isokinetic strength outcomes were moderate (r = 0.41–0.67). Conclusions: LSI values were consistent across knee flexor strength testing modalities, suggesting that symmetry assessment is relatively robust to the measurement method in the studied group. In contrast, normalized absolute strength outcomes showed only moderate and variable associations across modalities, indicating that different testing approaches assess related but not interchangeable aspects of muscle strength.
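As an illustration of the LSI calculation mentioned in the Methods, here is a minimal sketch using a commonly used definition: the test limb's strength as a percentage of the reference limb's strength. The abstract does not state which limb served as the reference or the exact formula used, so that choice is an assumption here, and the function names and numeric values are hypothetical.

```python
def normalize_to_body_mass(strength: float, body_mass_kg: float) -> float:
    """Express an absolute strength value per kilogram of body mass."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return strength / body_mass_kg


def limb_symmetry_index(test_limb: float, reference_limb: float) -> float:
    """Return the LSI in percent (assumed definition: test / reference * 100)."""
    if reference_limb <= 0:
        raise ValueError("reference limb strength must be positive")
    return 100.0 * test_limb / reference_limb


# Made-up peak knee flexor forces (newtons) for one participant.
body_mass_kg = 80.0
test_force_n = 290.0
reference_force_n = 300.0

lsi_raw = limb_symmetry_index(test_force_n, reference_force_n)
lsi_normalized = limb_symmetry_index(
    normalize_to_body_mass(test_force_n, body_mass_kg),
    normalize_to_body_mass(reference_force_n, body_mass_kg),
)
```

Under this definition, dividing both limbs by the same body mass leaves the ratio unchanged, so normalization affects the absolute strength comparisons but not the LSI itself.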

(This article belongs to the Special Issue Recent Advances in the Prevention and Rehabilitation of ACL Injuries—2nd Edition)

29 pages, 6909 KB

Open AccessArticle

MDE-UNet: A Physically Guided Asymmetric Fusion Network for Multi-Source Meteorological Data Lightning Identification

by Yihua Chen, Yuanpeng Han, Yujian Zhang, Yi Liu, Lin Song, Jialei Wang, Xinjue Wang and Qilin Zhang

Remote Sens. 2026, 18(7), 1027; https://doi.org/10.3390/rs18071027 - 29 Mar 2026

Viewed by 390

Abstract

Utilizing multi-source meteorological data for lightning identification is crucial for monitoring severe convective weather. However, several key challenges persist in this field: dimensional imbalance and modal competition among multi-source heterogeneous data, model training bias caused by the extreme sparsity of lightning samples, and an imbalance between false alarms and missed detections resulting from complex background noise. To address these challenges, this paper proposes a lightning identification network guided by physical priors and constrained by supervision. First, to tackle the issue of modal competition in fusing satellite (high-dimensional) and radar (low-dimensional) data, a physical prior-guided asymmetric radar information enhancement mechanism is introduced. This mechanism uses radar physical features as contextual guidance to selectively enhance the latent weak radar signatures. Second, at the architectural level, a multi-source multi-scale feature fusion module and a weighted sliding window–multilayer perceptron (MLP) enhanced decoding unit are constructed. The former achieves the coupling of multi-scale physical features at a 2 km grid scale through cross-level semantic alignment, building a highly consistent feature field that effectively improves the model’s ability to detect lightning signals. The latter leverages adaptive receptive fields and the nonlinear modeling capability of MLPs to effectively smooth spatially discrete noise, ensuring spatial continuity in the reconstructed results. Finally, to address the model bias caused by severe class imbalance between positive and negative samples—resulting from the extreme sparsity of lightning events—an asymmetrically weighted BCE-DICE loss function is designed. Its “asymmetric” characteristic is implemented by assigning different penalty weights to false-positive and false-negative predictions. 
This loss function balances pixel-level accuracy and inter-class equilibrium while imposing high-weight penalties on false-positive predictions, achieving synergistic optimization of feature enhancement and directional suppression. Experimental results show that the proposed method effectively increases the hit rate while substantially reducing the false alarm rate, enabling efficient utilization of multi-source data and high-precision identification of lightning strike areas.
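To make the idea of an asymmetrically weighted BCE-DICE loss concrete, the sketch below combines a binary cross-entropy term with separate false-positive and false-negative weights and a soft Dice term. It is an illustrative formulation only: the abstract does not give the exact loss, so the weights `w_fp` and `w_fn`, the mixing factor `dice_weight`, and the function name are all assumptions. It uses plain Python lists so it runs without a deep learning framework.

```python
import math
from typing import Sequence


def asymmetric_bce_dice(
    probs: Sequence[float],
    targets: Sequence[int],
    w_fp: float = 2.0,         # assumed penalty weight for false positives
    w_fn: float = 1.0,         # assumed penalty weight for false negatives
    dice_weight: float = 0.5,  # assumed mix between the BCE and Dice terms
    eps: float = 1e-7,
) -> float:
    """Weighted BCE plus soft Dice loss over flattened pixel predictions."""
    if len(probs) != len(targets) or len(probs) == 0:
        raise ValueError("probs and targets must be non-empty and equal length")

    bce_sum = 0.0
    intersection = 0.0
    prob_sum = 0.0
    target_sum = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logs finite
        # target 1, low p  -> missed detection, scaled by w_fn
        # target 0, high p -> false alarm, scaled by w_fp
        bce_sum -= w_fn * t * math.log(p) + w_fp * (1 - t) * math.log(1.0 - p)
        intersection += p * t
        prob_sum += p
        target_sum += t

    bce = bce_sum / len(probs)
    dice = 1.0 - (2.0 * intersection + eps) / (prob_sum + target_sum + eps)
    return (1.0 - dice_weight) * bce + dice_weight * dice


# One lightning pixel and one background pixel, with a confident false alarm.
false_alarm_loss = asymmetric_bce_dice([0.9, 0.9], [1, 0], w_fp=3.0)
symmetric_loss = asymmetric_bce_dice([0.9, 0.9], [1, 0], w_fp=1.0)
```

Raising `w_fp` above `w_fn` makes a confident false alarm cost more than an equally confident miss in the cross-entropy term, which matches the abstract's emphasis on penalizing false positives. The Dice term addresses the class imbalance caused by sparse lightning pixels.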

(This article belongs to the Special Issue Advancing Remote Sensing Through Large Multimodal Foundation Models: Toward Intelligent Earth Observation)

18 pages, 763 KB

Open AccessReview

The Current Landscape of Artificial Intelligence in Positron Emission Tomography (PET) Imaging Across the Cancer Continuum

by Wut Yee The Zar, Mi Rim Kim, Aruni Ghose, Sola Adeleke, Manoj Gupta, Partha S. Choudhary, Anirudh Shankar, Srishti Mohapatra, Stergios Boussios and Akash Maniam

J. Clin. Med. 2026, 15(6), 2446; https://doi.org/10.3390/jcm15062446 - 23 Mar 2026

Viewed by 1066

Abstract

PET scans have long been used in oncology imaging to provide molecular and metabolic information about diseases. The use of artificial intelligence (AI) in PET scans in oncology theranostics has the potential to optimise the PET modality and overcome the constraints of PET scans, such as semi-quantitative metrics, reader subjectivity, and variability across scanners/institutions. Advances in AI and radiomics are overcoming those limitations through deep learning lesion detection, enhanced image reconstruction, and improved noise resolution, which allows ultra-low dose acquisitions, while physics-informed models integrate with PET systems to strengthen interpretability and quantitative accuracy. There are also predictive AI frameworks that link PET imaging biomarkers to therapy response and outcomes, support individualised care, and can even simulate treatment response and help with treatment planning. However, challenges remain. Most AI PET studies are retrospective, single-centre, and underpowered (small sample), with limited external validation and inconsistent standardisation (in acquisition, segmentation, and extraction), leading to poor reproducibility and inflated performance estimates. Furthermore, ethical considerations, including data protection and transparency, need to be addressed before implementation. Federated learning, physics-informed frameworks, and adherence to standardised protocols offer steps towards regulated AI systems. In summary, with the integration of deep learning, radiomics and reconstruction, PET is evolving from an imaging modality into a platform capable of predicting treatment response and guiding treatment. With rigorous prospective validation, cross-institutional collaboration, and regulatory standardisation, AI in PET would advance nuclear medicine imaging in oncology.

(This article belongs to the Special Issue AI-Enhanced Medical Imaging for Cancer Diagnosis)

Search Results (119)
