MDPI - Publisher of Open Access Journals

25 pages, 5937 KB

Open Access Article

CGSTA-Net: A Cross-Domain Generative Prior-Assisted Structure–Texture Adaptive Network for Remote Sensing Image Dehazing

by Xiaoyan Li, Yankun Zhao and Na Niu

Symmetry 2026, 18(6), 1027; https://doi.org/10.3390/sym18061027 (registering DOI) - 14 Jun 2026

Abstract

Image dehazing is important for the correct interpretation of optical remote sensing images. However, current dehazing networks tend to suffer from a limited receptive field, texture loss caused by conventional downsampling, and a failure to exploit complementary cross-domain information. To address these problems, we propose a Cross-domain Generative Prior-assisted Structure–Texture Adaptive Network (CGSTA-Net) for remote sensing image dehazing. It is a dual-stream encoder–decoder framework that enhances the domain-specific information of the RGB image and the generated prior and then integrates them adaptively for haze-free reconstruction. To minimize information loss during downsampling, wavelet pooling is introduced to preserve frequency-aware structural and textural features. Additionally, a Structure–Texture Calibration Block is designed to simultaneously improve local frequency textures and construct sparse long-range structural dependencies, achieving better restoration under spatially non-uniform haze. To fuse the representations from the RGB and generated prior images, a Prior-aware Gated Adaptive Fusion module dynamically balances the domain-specific features and preserves fine details during multi-level feature fusion. Finally, we use pixel-level contrastive learning to guide the latent space away from hazy distributions, enhancing the discriminability of the features. Extensive experiments on three datasets, namely RSID, RICE-I and HRSD, demonstrate that CGSTA-Net effectively restores images under varying haze conditions and significantly outperforms the latest dehazing methods in visual quality and quantitative performance. Specifically, compared with the strongest competing method, CGSTA-Net increases PSNR by 22.9% on RSID, by 13.2% on RICE-I, and by 7.2% on HRSD.
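To make the wavelet pooling idea concrete, here is a minimal sketch of a single-level 2D Haar transform used as a downsampling step. The abstract does not say which wavelet the authors use or how the sub-bands feed the network, so the Haar choice and the function names `haar_pool` and `haar_unpool` are assumptions made purely for illustration. The point it shows is that, unlike max or average pooling, the four half-resolution sub-bands together retain enough information to rebuild the input exactly.

```python
def haar_pool(image):
    """Single-level 2D Haar transform over non-overlapping 2x2 blocks.

    Returns four half-resolution sub-bands: the low-pass band plus the
    column-difference, row-difference, and diagonal detail bands.
    Haar is an illustrative assumption, not the paper's stated wavelet.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, col_row, row_row, diag_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)    # low-pass (structure)
            col_row.append((a - b + d - e) / 2)   # difference across columns
            row_row.append((a + b - d - e) / 2)   # difference across rows
            diag_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        col_detail.append(col_row)
        row_detail.append(row_row)
        diag_detail.append(diag_row)
    return ll, col_detail, row_detail, diag_detail


def haar_unpool(ll, col_detail, row_detail, diag_detail):
    """Inverse of haar_pool: rebuilds the full-resolution image."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, h = ll[r][c], col_detail[r][c]
            v, d = row_detail[r][c], diag_detail[r][c]
            top += [(s + h + v + d) / 2, (s - h + v - d) / 2]
            bottom += [(s + h - v - d) / 2, (s - h - v + d) / 2]
        image += [top, bottom]
    return image
```

Passing a 4×4 image through `haar_pool` and then `haar_unpool` returns the original pixel values, which is the lossless property that motivates replacing conventional pooling.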

(This article belongs to the Section Computer)

26 pages, 4861 KB

Open Access Article

Class-Aware Semantic Calibration for Cross-Scene Hyperspectral Image Classification

by Boshan Shi, Yanbo Liu, Youqiang Zhang and Guo Cao

Remote Sens. 2026, 18(12), 1976; https://doi.org/10.3390/rs18121976 (registering DOI) - 14 Jun 2026

Abstract

Cross-scene Hyperspectral Image (HSI) classification faces substantial domain shifts caused by sensor heterogeneity, acquisition variation, and scene diversity. While benchmark annotations are assigned to individual center pixels, local patches often contain implicit multi-label semantics due to spectral mixing and spatial overlap. This mismatch distorts prediction structure, exacerbates generalization errors, and limits the effectiveness of standard domain generalization (DG) techniques focused solely on feature or prediction invariance. We propose Class-Aware Semantic Calibration (CASC), a systematic semantic structure calibration framework that addresses three complementary distortions induced by mismatched patch supervision: (i) Balance corrects class frequency bias via reweighted supervision; (ii) Separability enhances boundary decision stability through margin-based logit calibration; and (iii) Independence reduces domain-specific spurious co-occurrence via prediction covariance decorrelation. To preserve calibrated semantics under pseudo-source shift, we further introduce a complementary DualAlign (DA) module, which jointly aligns feature statistics and prediction distributions, enforcing consistency at both representation and semantic levels. Extensive experiments on three cross-scene benchmarks (Houston, Pavia, and WHU-Hi) demonstrate that CASC-DA consistently improves performance over strong baselines, achieving an average gain of 3.0% in overall accuracy and 4.9% in Kappa coefficient compared with the best-performing baseline on each dataset. These results underscore the importance of semantic structure calibration for domain-generalized HSI classification.
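The "prediction covariance decorrelation" term can be pictured with a small sketch. The abstract does not give the exact loss, so the form below (sum of squared off-diagonal entries of the batch covariance of class predictions) is only one common way to write such a penalty, and the function name `offdiag_covariance_penalty` is hypothetical.

```python
def offdiag_covariance_penalty(preds):
    """Sum of squared off-diagonal entries of the class-prediction covariance.

    preds: list of per-sample prediction vectors (batch x classes).
    Uses the population covariance (divides by the batch size).
    This is an assumed, illustrative form, not the paper's stated loss.
    """
    n = len(preds)
    k = len(preds[0])
    means = [sum(row[j] for row in preds) / n for j in range(k)]
    penalty = 0.0
    for i in range(k):
        for j in range(k):
            if i == j:
                continue  # only cross-class terms are penalized
            cov = sum(
                (row[i] - means[i]) * (row[j] - means[j]) for row in preds
            ) / n
            penalty += cov ** 2
    return penalty
```

The penalty is zero when class scores do not co-vary across the batch and grows as pairs of classes rise and fall together, which is the kind of spurious co-occurrence the Independence component is described as reducing.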

(This article belongs to the Section Remote Sensing Image Processing)

32 pages, 7334 KB

Open Access Article

Text Semantic Guided Spatial–Frequency Fusion Network for HSI–LiDAR Land-Cover Classification

by Aili Wang, Manman Yao, Haoran Lv and Haisong Chen

Remote Sens. 2026, 18(12), 1957; https://doi.org/10.3390/rs18121957 (registering DOI) - 12 Jun 2026

Abstract

Joint classification of hyperspectral images (HSI) and light detection and ranging (LiDAR) data is important for land-cover recognition, as it can exploit both spectral discrimination and structural elevation information. However, existing methods mainly focus on visual feature fusion and insufficiently utilize class-level semantic priors, which limits their discriminative capability at complex boundaries, for visually similar categories, and in limited-sample scenarios. To address these issues, this paper proposes a text-guided multimodal semantic fusion network for HSI–LiDAR classification. Specifically, a Channel-Modulated Mobile Convolution Module (CMMC) is designed to extract modality-specific features, a Spatial–Frequency Feature Enhancement Module (SFFE) is introduced to enhance spatial-boundary and frequency-domain structural representations, and a Bidirectional Cross-Modal Fusion Module (BCMF) is developed to promote complementary interaction between spectral and structural information. Meanwhile, class-level textual descriptions are constructed from class names, color attributes, and geographical contexts, and a text encoder is employed to obtain semantic prototypes. Furthermore, a multi-branch vision–text semantic alignment mechanism projects HSI features, LiDAR features, and fused features into a shared semantic space for joint constraints, improving semantic consistency and class separability. Experiments on the Houston2013, Augsburg, and Trento datasets demonstrate the effectiveness of the proposed method. It achieves an overall accuracy of 98.76% on Houston2013, with improvements of 0.62%, 0.52%, and 0.67 in overall accuracy, average accuracy, and Kappa coefficient × 100 over the best competing results, respectively. The proposed method also obtains the best overall metrics on Augsburg and Trento, and ablation studies verify the effectiveness of the proposed components.

(This article belongs to the Special Issue Deep Learning for Multi-Sensor Remote Sensing: Advancements in Image Classification and Semantic Segmentation)

22 pages, 1865 KB

Open Access Article

An Explainable Artificial Intelligence Framework for the Classification of Pumpkin Seed Varieties (Cucurbita pepo L.) Using Morphological Features

by Sajad Sabzi, Omid Daliran, Raziyeh Pourdarbani, Ginés García-Mateos and José Miguel Molina-Martínez

Appl. Sci. 2026, 16(12), 5958; https://doi.org/10.3390/app16125958 (registering DOI) - 12 Jun 2026

Abstract

Accurate automatic classification of seed varieties is important for seed sorting, quality assurance, and plant breeding, yet reliable discrimination remains difficult when cultivars exhibit highly similar visual characteristics. This study presents a reproducible and interpretable framework for the binary classification of two Turkish pumpkin seed varieties using tabular morphological descriptors extracted from segmented seed images. Unlike many previous machine learning studies in this domain, which offer limited interpretability and leave model decisions largely as a black box, the proposed approach places Explainable Artificial Intelligence (XAI) at the center of the analysis. The framework combines biologically meaningful feature engineering, Optuna-based hyperparameter optimization, repeated stratified cross-validation, and a comparative evaluation of XGBoost, LightGBM, and CatBoost. Model explainability was investigated using SHapley Additive exPlanations (SHAP) to identify the morphological traits driving both global and instance-level predictions, while corrected repeated k-fold t-tests were used to assess the statistical significance of performance differences, which confirmed comparable accuracy among the three boosting models and a significant advantage over the baseline classifiers. All three boosting ensembles consistently outperformed the baseline classifiers (SVM, Logistic Regression, and Random Forest) on the hold-out test set. CatBoost achieved the best overall results, with an accuracy of 0.888, an F1-score of 0.879, and an MCC of 0.777. SHAP analysis consistently highlighted compactness, roundness, eccentricity, and engineered interaction descriptors as the most influential predictors. Overall, the proposed XAI-driven framework provides an accurate and transparent solution for pumpkin seed classification.
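Since the abstract reports the Matthews correlation coefficient (MCC) alongside accuracy and F1, a short sketch of how MCC is computed from a binary confusion matrix may help in reading those numbers. The function name is hypothetical, and any counts passed to it here are made up for illustration; the paper's confusion matrix is not given in the abstract.

```python
import math


def matthews_corrcoef_binary(tp, fp, fn, tn):
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention when
    one row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (total disagreement) through 0 (chance level) to 1 (perfect prediction), and it uses all four cells of the confusion matrix, which is why it is often reported for binary tasks in addition to accuracy.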

(This article belongs to the Section Agricultural Science and Technology)

23 pages, 93772 KB

Open Access Article

TriCross-D2D: A Cross-Scene, Cross-View, and Cross-Weather Dataset for Drone-to-Drone Detection

by Wei Tang, Qilong Li, Yueping Peng, Hexiang Hao, Wenchao Kang, Xuekai Zhang, Liming Hou and Hongyan Lu

Drones 2026, 10(6), 459; https://doi.org/10.3390/drones10060459 (registering DOI) - 12 Jun 2026

Abstract

Drone-to-drone (D2D) detection is a critical yet underexplored task in low-altitude intelligent perception, where UAV targets are often small, weakly textured, motion-affected, and disturbed by complex backgrounds and environmental changes. Existing cross-domain detection datasets mainly focus on ground objects or single-factor shifts, making them insufficient for evaluating D2D detection under coupled real-world variations. To address this gap, we present TriCross-D2D, an RGB air-to-air UAV detection dataset and benchmark with three explicit domain shifts: scene, viewpoint, and weather. Built from real flight videos and controlled synthetic fog, TriCross-D2D contains 13 RGB video sequences, 23,403 raw frames, 7045 benchmark images, and 9771 annotated UAV instances. It provides a fixed split of 4045 Source_train images, 2000 Target_train images, and 1000 Target_val images, supporting both unsupervised domain adaptation (UDA) and semi-supervised domain adaptation (SSDA). The dataset is dominated by small objects, with extremely tiny, tiny, and small targets accounting for 73.8% of all instances. Benchmark results show that existing cross-domain detectors still achieve limited performance on TriCross-D2D, especially under stricter localization and recall metrics. Single-factor analysis further reveals that the coupled scene–viewpoint–weather protocol is more challenging than isolated shifts, with viewpoint variation producing a particularly strong domain gap. As an exploratory enhanced baseline, SCOPE-DA-RTDETR improves DA-RTDETR from 28.63/13.12/22.39 to 29.94/13.71/23.40 in $AP_{50}$/$AP_{50\text{-}95}$/AR, showing consistent but modest gains. These findings demonstrate that TriCross-D2D provides a challenging and discriminative benchmark for cross-domain D2D small-object detection.

(This article belongs to the Special Issue Detection, Identification and Tracking of UAVs and Drones: 2nd Edition)

26 pages, 2010 KB

Open Access Article

A Dual-Stage Multimodal Alignment Approach for Robust Breast Cancer Diagnosis via Visual–Textual Computing

by Ramazan Ozgur Dogan

Appl. Sci. 2026, 16(12), 5934; https://doi.org/10.3390/app16125934 - 11 Jun 2026

Viewed by 99

Abstract

Manual classification of breast cancer is resource-intensive, slow, and subject to inter-observer variability, motivating automated deep learning solutions. Most current methods rely on unimodal imaging data and struggle with domain generalization (DG) across varied clinical environments. We propose a Dual-Stage Multimodal Alignment approach that integrates breast ultrasound (US) imagery with clinical text reports to improve diagnostic stability. The method proceeds in two stages: (1) Local Correlation Alignment (LCA), which aligns fine-grained visual features with textual embeddings to capture localized lesion attributes, and (2) Global Attention Alignment (GAA), which applies multi-head self-attention to the joint visual–textual sequence to encourage domain-invariant representations. We evaluate the approach on a harmonized, leakage-free repository of 6880 images aggregated from six public US datasets (BUS-CoT, BrEaST, BUS-BRA, BUS-UCLM, BLUI, BUSI) under three protocols: independent benchmarking on BUS-CoT, pooled cross-dataset evaluation, and zero-shot domain generalization on unseen unimodal target domains. On the BUS-CoT benchmark, the 198M-parameter model reaches 0.8177 accuracy and 0.8852 AUC, on par with the 7-billion-parameter Qwen2.5-VL-7B with chain-of-thought reasoning (0.8064 accuracy, 0.8354 AUC) while using roughly 1/35 the parameter count. In the pooled setting, it is competitive with single-domain state-of-the-art methods on individual subsets (e.g., 0.9576 AUC on BUSI, 0.8741 accuracy on BUS-BRA). Under zero-shot transfer without clinical text, per-domain AUC ranges from 0.7360 to 0.8060 across four unseen targets, providing a lower bound under cross-scanner shift. These results indicate that task-specific multimodal alignment can rival large vision-language models in breast US diagnosis at a fraction of the parameter count.

14 pages, 1234 KB

Open Access Article

Enhancing Whole Slide Image Classification in Renal Cell Carcinoma via Swin Transformer-Based Multiple Instance Learning

by Bohan Zhang and Gao Zhen

Bioengineering 2026, 13(6), 680; https://doi.org/10.3390/bioengineering13060680 (registering DOI) - 11 Jun 2026

Viewed by 126

Abstract

Renal cell carcinoma (RCC) comprises histologic subtypes with distinct prognosis and treatment implications. This single-cohort study evaluated slide-level weakly supervised subtype classification for clear cell RCC (ccRCC), papillary RCC (pRCC), and chromophobe RCC (chRCC) using 928 diagnostic H&E whole-slide images (WSIs) from 928 patients in TCGA-RCC. We propose Swin-CLAM, a controlled modification of CLAM in which the conventional CNN patch encoder is replaced by an ImageNet-pretrained Swin-Tiny Transformer, while the CLAM-SB bag-level aggregation module is kept unchanged. WSIs were segmented, tiled into non-overlapping $256 \times 256$ patches at an effective $20\times$ magnification, encoded offline, and classified using slide-level labels only. In five-fold patient-level cross-validation on TCGA-RCC, Swin-CLAM achieved a macro-averaged AUC of $0.976 \pm 0.008$, an accuracy of $94.8 \pm 1.0\%$, and a macro-F1 of $0.940 \pm 0.012$, with the largest gain observed for chRCC. Attention heatmaps and t-SNE plots were used as qualitative, exploratory analyses rather than formal evidence of interpretability. These results suggest that stronger patch-level representation can improve CLAM-based RCC subtype classification under a fixed MIL aggregator. However, the study does not establish clinical readiness, and external validation, calibration, domain-shift analysis, and expert region-level assessment are needed before practical deployment.
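The bag-level aggregation that stays fixed in this comparison can be pictured as attention pooling over patch embeddings. The sketch below is a simplified illustration under stated assumptions: in CLAM-style models the attention scores come from a small learned network, whereas here they are passed in directly, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_embeddings, attention_scores):
    """Slide-level embedding as an attention-weighted average of patches.

    patch_embeddings: list of equal-length feature vectors, one per patch.
    attention_scores: one raw score per patch (assumed given here; a real
    model would predict them from the embeddings).
    """
    weights = softmax(attention_scores)
    dim = len(patch_embeddings[0])
    bag = [
        sum(w * emb[d] for w, emb in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return bag, weights
```

Because the aggregator only sees the embeddings, swapping the patch encoder (CNN versus Swin-Tiny) changes the inputs to this step without changing the step itself, which is what makes the comparison a controlled one. The returned weights are also what attention heatmaps visualize.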

(This article belongs to the Special Issue Advances in Computational Imaging and Artificial Intelligence for Biomedical and Clinical Applications)

26 pages, 2289 KB

Open Access Article

VI-MSFFN: A Visible-Infrared Multi-Scale Feature Fusion Network for Cross-Modal Detection in Remote Sensing

by Yurong Yue, Weiwei Qin, Hao Chi, Baiwei An, Dingyi Wu, Wenxin Guo and Jingyi Xiong

Remote Sens. 2026, 18(12), 1938; https://doi.org/10.3390/rs18121938 - 11 Jun 2026

Viewed by 62

Abstract

To address the insufficient robustness of single-modality methods and the limited multi-scale object detection accuracy of remote sensing image detection (RSID) in complex environments, this paper proposes a multimodal RSID network named VI-MSFFN. The model adopts a symmetric parallel dual-branch architecture to achieve independent extraction and collaborative modeling of visible and infrared modal features. A cross-modal multi-scale sparse cross-attention fusion module is proposed and applied to the P4 and P5 feature layers, and a high-low-level feature collaborative cross-modal fusion strategy is constructed to achieve efficient and robust cross-modal feature fusion while enhancing multi-scale object modeling capability and suppressing feature redundancy and noise. Additionally, a progressive feature interaction and fusion architecture is designed to combine spatial and frequency domain information to strengthen deep object representation. Experimental results on the VEDAI and Drone Vehicle datasets demonstrate that VI-MSFFN achieves state-of-the-art (SOTA) performance in detection accuracy, robustness, and generalization ability. The proposed method effectively addresses the detection challenges of RSID and has significant application value in the field of multimodal RSID.

25 pages, 3608 KB

Open Access Article

GC²MFND: Multi-Granularity Conflict and Domain-Guided Calibration for Multimodal Fake News Detection

by Yanming Sun, Mingyue Zhang and Fujun Zhang

Entropy 2026, 28(6), 672; https://doi.org/10.3390/e28060672 (registering DOI) - 11 Jun 2026

Viewed by 179

Abstract

On current social media platforms, multimodal fake news has permeated various fields, and multi-domain fake news detection has garnered significant attention in the academic community. Existing multi-domain methods primarily employ feature fusion techniques based on text–image alignment, neglecting the extraction of conflicting information across modalities and failing to address the domain-dependent nature of cross-modal feature conflicts. To address this, we propose GC²MFND, a Multi-Granularity Conflict and Domain-Guided Calibration model for Multimodal Fake News Detection. The model captures conflicting features through a domain-aware multi-granularity conflict extraction module and mitigates feature suppression using a domain-guided multimodal feature calibration module. Finally, it combines domain-adaptive aggregation with multi-view evidence integration to achieve robust decision-making under supervised contrastive learning constraints. Under known domain conditions, the experimental results demonstrate that GC²MFND outperforms existing multi-domain baseline methods, achieving accuracy rates of 95.3%, 95.7%, and 81.2% on the Weibo, Weibo21, and FineFake datasets, respectively, representing improvements of 1.1%, 1.2%, and 1.4% over the corresponding multi-domain baselines.

(This article belongs to the Section Multidisciplinary Applications)

20 pages, 16427 KB

Open Access Article

Lightweight Spatial-Frequency Collaborative Interaction Network for RGB-D Salient Object Detection

by Yitong Lu and Ziguan Cui

Sensors 2026, 26(12), 3708; https://doi.org/10.3390/s26123708 - 10 Jun 2026

Viewed by 234

Abstract

RGB-D salient object detection (SOD) aims to segment the most prominent objects from the background with a pair of given RGB and depth images. Existing RGB-D methods usually rely on heavy backbones to achieve high accuracy, while current lightweight methods struggle to maintain competitive performance. To break this intractable trade-off between effectiveness and model complexity, we propose a Lightweight Spatial-Frequency Collaborative Interaction Network (SFCINet), a unified and highly efficient framework. The core of SFCINet resides in the synergy between spatial-domain features and frequency-domain global priors. Specifically, we introduce the Spatial-Frequency Synergy (SFS) module, which shifts the perspective to a joint complex Fourier domain. By adaptively learning and optimizing the decoupled amplitude and phase components, it effectively isolates clutter to yield a purified global frequency-synergized prior, which modulates the spatial branches to eliminate cross-modal discrepancies for subsequent feature fusion while supplementing global information during decoding. To alleviate the interference caused by cross-modal representation discrepancies, we design the Cross-Guidance Interaction (CMGI) module, which employs a reciprocal anchoring mechanism. It guides the counterpart to mutually filter irrelevant noise and select task-relevant information, achieving fusion in an efficient manner. Finally, we present a Calibrated Hierarchical Decoder (CHD), which injects frequency-synergized global priors into the hierarchical decoding process. It re-establishes the connection between the frequency and spatial domains, ultimately achieving global-local consistency. Extensive experiments demonstrate that SFCINet delivers superior performance over state-of-the-art methods.
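The "decoupled amplitude and phase components" mentioned for the SFS module refer to the polar form of a Fourier spectrum. The sketch below shows only that decomposition and its inverse on a 1D signal with a plain discrete Fourier transform; the paper works on 2D features and learns how to modify the two components, none of which is reproduced here. All function names are hypothetical.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_amplitude_phase(spectrum):
    """Decouple each frequency coefficient into magnitude and angle."""
    return [abs(z) for z in spectrum], [cmath.phase(z) for z in spectrum]


def merge_amplitude_phase(amplitude, phase):
    """Recombine magnitude and angle into complex coefficients."""
    return [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
```

A frequency-domain module can adjust the amplitude and phase lists separately between `split_amplitude_phase` and `merge_amplitude_phase`; leaving both untouched and applying `idft` recovers the input signal up to floating-point error.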

(This article belongs to the Section Sensing and Imaging)

22 pages, 6333 KB

Open Access Article

Frequency-Aware Bidirectional Interactive Mamba Network for Image Super-Resolution

by Youchen Chen, Ye Liu, Xing Wang, Qingze Zhang, Xingrui Zhang, Kerang Cao and Hoekyung Jung

Electronics 2026, 15(12), 2562; https://doi.org/10.3390/electronics15122562 - 10 Jun 2026

Viewed by 151

Abstract

Although state space models (SSMs) represented by Mamba have achieved long-range dependency modeling with linear complexity, they still struggle to process high-frequency and low-frequency components differently in image super-resolution (SR) tasks. To address the drawbacks of existing wavelet-Mamba methods, such as texture distortion and unstable training caused by independent frequency band modeling, this paper proposes a frequency-aware bidirectional interactive Mamba network (FABIMNet). The network uses the discrete wavelet transform to decouple the frequency-domain components of an image. Its core is the proposed bidirectional high-frequency enhancement (Bi-HFE) module, which constructs an interaction mechanism in which low-frequency components guide high-frequency generation and high-frequency feedback corrects the low-frequency structure. This module cooperates with the lightweight cross-band interaction (LCBI) module to achieve information synchronization and complementarity during deep feature extraction. Extensive experiments demonstrate that the proposed method achieves a competitive trade-off between computational efficiency and reconstruction performance across five benchmark datasets.

26 pages, 6396 KB

Open Access Article

A Method for Multimodal Information Extraction and Knowledge Graph Construction in Substation Secondary System

by Wenting Zha, Yue Liu, Dengrui Peng and Zhipeng Su

Entropy 2026, 28(6), 655; https://doi.org/10.3390/e28060655 - 9 Jun 2026

Viewed by 143

Abstract

Multi-source heterogeneous data in substation secondary systems are typically characterized by high entropy and disorder, which pose significant challenges for cross-modal information integration and efficient retrieval. Therefore, a method for multimodal information extraction and knowledge graph construction is proposed, enabling structured processing of heterogeneous data from multiple sources. For the image modality, positional and semantic information is extracted using YOLOv8n and Optical Character Recognition (OCR) techniques. To mitigate the effects of uncertain connection topology and noise interference, a Heuristic Circular Stepping Search Algorithm (HCSA) is designed to achieve deterministic path tracing of information flows. For the text modality, a RoFormer-BiLSTM-CRF model enhanced with Rotary Position Embedding (RoPE) is developed to alleviate information degradation in long-sequence texts, thereby enabling high-accuracy extraction of entities and relationships. Furthermore, by combining the domain ontology mapping rules and string similarity, the extracted device entities from the two modalities are aligned, thereby converting scattered data into a structured knowledge graph. Experiments conducted on the secondary-side data of a substation in China demonstrate that the proposed method effectively extracts multimodal information from substation secondary systems, providing valuable support for information management and decision-making assistance in complex industrial systems.

(This article belongs to the Special Issue Methods in Artificial Intelligence and Information Processing, 4th Edition)

21 pages, 321 KB

Open Access Article

Psychosocial Burden, Multi-System Somatic Symptom Severity, and Weight-Related Stigma in Late Adolescents and Young Adults: A Cross-Sectional Survey from Romania

by Raluca Maior, Hajnal Finta, Halit Tanju Besler, Elena Mardale, Simona Toncean and Vladimir Bacarea

Life 2026, 16(6), 969; https://doi.org/10.3390/life16060969 - 9 Jun 2026

Viewed by 178

Abstract

Evidence on the interplay between perceived stress, dietary behaviour, and weight-related psychosocial burden in Romanian young adults remains scarce. This cross-sectional study assessed associations between BMI, perceived stress, multi-system somatic symptom severity, and psychosocial burden in 117 participants aged 16 to 20 years (89.7% female; mean age 19.23 ± 0.74 years; mean BMI 22.66 ± 3.85 kg/m²), recruited by convenience sampling in Târgu Mureș, Romania, during June 2025. Non-parametric methods were used throughout. Female participants scored significantly higher than males across digestive (p < 0.001), neurological (p = 0.001), cutaneous (p = 0.014), and total symptom domains (p < 0.001), with a median total symptom score of 21.0 versus 3.0 in males. Perceived stress correlated positively with neurological (rS = 0.445), cardiovascular (rS = 0.350), digestive (rS = 0.316), and total symptom scores (rS = 0.401; all p < 0.001). BMI was not associated with somatic symptoms but correlated with weight-related stigma (rS = 0.391, p < 0.001). Emotional distress was prevalent regardless of weight status: 60.7% reported food-related guilt and 59.8% reduced self-confidence, yet only 6.0% had consulted a mental health professional. Stress management, nutritional counselling, and body image support should target young adults across all BMI categories.
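The rS values above are Spearman rank correlations, that is, Pearson correlations computed on ranks rather than raw scores. A minimal sketch of that calculation follows, with tied values given their average rank. The function names are illustrative, and the study's data are not reproduced; any inputs used with it here are invented.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering of scores matters, the coefficient is insensitive to skewed questionnaire totals, which fits the abstract's statement that non-parametric methods were used throughout.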

(This article belongs to the Special Issue Nutrition, Exercise and Stress)

19 pages, 2225 KB

Open Access Article

Pancreas Segmentation Using a Two-Stage Pipeline of Faster R-CNN and TransUNet

by Yunjung Hong, Servas Adolph Tarimo and Jiyoung Woo

Appl. Sci. 2026, 16(12), 5764; https://doi.org/10.3390/app16125764 - 8 Jun 2026

Viewed by 93

Abstract

Pancreas segmentation in computed tomography (CT) images remains a challenging task due to the organ’s variable shape, size, and low contrast against surrounding tissues. In this study, we propose a two-stage pancreas segmentation framework that combines region localization using Faster R-CNN and pixel-level segmentation using a hybrid TransUNet architecture. To address issues related to class imbalance, we utilize the publicly available NIH pancreas CT dataset and apply downsampling techniques to construct a balanced training set. The segmentation stage incorporates a Transformer encoder into the U-Net framework and employs a novel DHD loss function, which combines the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD) to enhance both region-level and boundary-level accuracy. In experiments on this open dataset, the proposed model achieved a mean DSC of 88.98%, a precision of 91.0%, and a recall of 94.4%, the highest among the compared approaches on the NIH dataset and above baselines such as U-Net and standard-loss TransUNet; however, differences in preprocessing protocols should be considered when making direct comparisons with prior work. To assess generalizability beyond the training domain, we further evaluated the NIH-trained model on the BTCV Multi-Organ Segmentation dataset, a CT dataset from a completely different institution, using inference-time adaptation strategies without any fine-tuning, and achieved a mean DSC of 62.96% in a zero-shot cross-dataset setting. A fully automatic end-to-end pipeline, in which a Faster R-CNN detector fine-tuned on BTCV training cases predicts the bounding boxes used for cropping, was evaluated using 5-fold cross-validation on all 30 BTCV cases and achieved a mean DSC of 66.50% ± 8.55%, with no manual annotation used at any stage.
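The two quantities combined in the DHD loss can be sketched on small sets of foreground pixel coordinates. The Dice and Hausdorff definitions below are the standard ones; how the paper weights or differentiates them is not stated in the abstract, so `combined_dhd_score` and its `hd_weight` parameter are hypothetical placeholders rather than the authors' formulation.

```python
import math


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two sets of pixel coordinates."""
    if not pred and not target:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2 * len(pred & target) / (len(pred) + len(target))


def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(a, b):
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(pred, target), directed(target, pred))


def combined_dhd_score(pred, target, hd_weight=0.1):
    """Illustrative combination: Dice loss plus a weighted Hausdorff term.

    hd_weight is an assumed value; the abstract does not give the weighting.
    """
    dice_loss = 1.0 - dice_coefficient(pred, target)
    return dice_loss + hd_weight * hausdorff_distance(pred, target)
```

Dice rewards overlap in area, so it is dominated by the bulk of the organ, while the Hausdorff term is driven by the single worst boundary point. Combining them is a way to penalize both regional and boundary errors, which matches the motivation given for the loss.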

(This article belongs to the Special Issue Applied and Innovative Computational Intelligence Systems: 4th Edition)

35 pages, 1263 KB

Open Access Systematic Review

Advances in Artificial Intelligence-Enabled Crop Pest and Disease Detection: A Systematic Review

by Zhen Ma, Cundeng Wang, Xinzhong Wang and Xuegeng Chen

Agriculture 2026, 16(12), 1262; https://doi.org/10.3390/agriculture16121262 - 7 Jun 2026

Viewed by 420

Abstract

Crop disease and pest detection technology is transitioning from single-sensor monitoring to intelligent perception and multimodal fusion. Following the PRISMA 2020 standard, this paper systematically reviews the relevant core literature. It summarizes the development history of spectral sensing technology and analyzes the physical mechanisms of hyperspectral and multispectral imaging in the early identification of crop diseases. The focus is on the architectural evolution of deep learning models, including lightweight convolutional neural networks (CNNs), vision transformers (ViTs) with long-range dependency modeling capabilities, and the computationally efficient state space model Mamba. In addition, research progress in spatial-spectral joint learning, heterogeneous data fusion, and vision-language models (VLMs) for improving system robustness and interpretability is introduced. By synthesizing the integrated applications of UAV remote sensing, Internet of Things (IoT) edge computing, and intelligent robots in staple and cash crops, this paper summarizes how integrated perception, decision-making, and execution systems are implemented. To address the insufficient cross-domain generalization ability and uneven allocation of computing resources in existing models, this paper offers perspectives on the future development of agricultural artificial intelligence (AI) toward foundation-model-driven, edge-intelligent collaborative, and green, sustainable directions, providing a theoretical reference for engineering applications in intelligent plant protection.

(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)
