Search Results (864)

Search Parameters:
Keywords = cross-domain image

25 pages, 5937 KB  
Article
CGSTA-Net: A Cross-Domain Generative Prior-Assisted Structure–Texture Adaptive Network for Remote Sensing Image Dehazing
by Xiaoyan Li, Yankun Zhao and Na Niu
Symmetry 2026, 18(6), 1027; https://doi.org/10.3390/sym18061027 (registering DOI) - 14 Jun 2026
Abstract
Image dehazing is important for the correct interpretation of optical remote sensing images. However, current dehazing networks tend to suffer from a limited receptive field, from texture information loss caused by conventional downsampling, and from a failure to exploit complementary cross-domain information. To address these problems, we propose a Cross-domain Generative Prior-assisted Structure–Texture Adaptive Network (CGSTA-Net) for remote sensing image dehazing. It is a dual-stream encoder–decoder framework that enhances the domain-specific information of the RGB and generated-prior images and then integrates them adaptively for haze-free reconstruction. To minimize information loss during downsampling, wavelet pooling is introduced to retain frequency-aware structural and textural features. Additionally, a Structure–Texture Calibration Block is designed to simultaneously improve local frequency textures and construct sparse long-range structural dependencies, achieving better restoration under spatially non-uniform haze. To fuse the representations from the RGB and generated-prior images, a Prior-aware Gated Adaptive Fusion module is developed to dynamically balance the domain-specific features and preserve fine details during multi-level feature fusion. Finally, pixel-level contrastive learning is used to guide the latent space away from hazy distributions, enhancing the discriminability of the features. Extensive experiments on three datasets, namely RSID, RICE-I and HRSD, demonstrate that CGSTA-Net effectively restores images under varying haze conditions and significantly outperforms the latest dehazing methods in both visual quality and quantitative performance. Specifically, compared with the strongest competing method, CGSTA-Net increases PSNR by 22.9% on RSID, 13.2% on RICE-I, and 7.2% on HRSD.
(This article belongs to the Section Computer)
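The CGSTA-Net abstract above mentions wavelet pooling as a way to downsample without discarding frequency information. As a rough illustration of the general idea only (not the authors' implementation, which the abstract does not specify), the following sketch applies a single-level 2D Haar decomposition with NumPy. The function names `haar_wavelet_pool` and `haar_wavelet_unpool` are hypothetical, and the choice of the Haar wavelet is an assumption.

```python
import numpy as np


def haar_wavelet_pool(x):
    """Single-level 2D Haar decomposition of an (H, W) array with even H and W.

    Returns four half-resolution sub-bands: one low-frequency band and
    three detail bands (row, column, and diagonal differences).
    """
    x = np.asarray(x, dtype=float)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0        # low-pass in both directions
    row_diff = (a + b - c - d) / 2.0   # difference between the two rows
    col_diff = (a - b + c - d) / 2.0   # difference between the two columns
    diag = (a - b - c + d) / 2.0       # diagonal detail
    return low, row_diff, col_diff, diag


def haar_wavelet_unpool(low, row_diff, col_diff, diag):
    """Invert haar_wavelet_pool exactly (the transform is orthonormal)."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + row_diff + col_diff + diag) / 2.0
    x[0::2, 1::2] = (low + row_diff - col_diff - diag) / 2.0
    x[1::2, 0::2] = (low - row_diff + col_diff - diag) / 2.0
    x[1::2, 1::2] = (low - row_diff - col_diff + diag) / 2.0
    return x
```

Unlike max or average pooling, the four sub-bands together keep all of the information in the input, which is why the inverse can reconstruct it exactly.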
26 pages, 4861 KB  
Article
Class-Aware Semantic Calibration for Cross-Scene Hyperspectral Image Classification
by Boshan Shi, Yanbo Liu, Youqiang Zhang and Guo Cao
Remote Sens. 2026, 18(12), 1976; https://doi.org/10.3390/rs18121976 (registering DOI) - 14 Jun 2026
Abstract
Cross-scene Hyperspectral Image (HSI) classification faces substantial domain shifts caused by sensor heterogeneity, acquisition variation, and scene diversity. While benchmark annotations are assigned to individual center pixels, local patches often contain implicit multi-label semantics due to spectral mixing and spatial overlap. This mismatch distorts prediction structure, exacerbates generalization errors, and limits the effectiveness of standard domain generalization (DG) techniques focused solely on feature or prediction invariance. We propose Class-Aware Semantic Calibration (CASC), a systematic semantic structure calibration framework that addresses three complementary distortions induced by mismatched patch supervision: (i) Balance corrects class frequency bias via reweighted supervision; (ii) Separability enhances boundary decision stability through margin-based logit calibration; and (iii) Independence reduces domain-specific spurious co-occurrence via prediction covariance decorrelation. To preserve calibrated semantics under pseudo-source shift, we further introduce a complementary DualAlign (DA) module, which jointly aligns feature statistics and prediction distributions, enforcing consistency at both representation and semantic levels. Extensive experiments on three cross-scene benchmarks (Houston, Pavia, and WHU-Hi) demonstrate that CASC-DA consistently improves performance over strong baselines, achieving an average gain of 3.0% in overall accuracy and 4.9% in Kappa coefficient compared with the best-performing baseline on each dataset. These results underscore the importance of semantic structure calibration for domain-generalized HSI classification. Full article
(This article belongs to the Section Remote Sensing Image Processing)
32 pages, 7334 KB  
Article
Text Semantic Guided Spatial–Frequency Fusion Network for HSI–LiDAR Land-Cover Classification
by Aili Wang, Manman Yao, Haoran Lv and Haisong Chen
Remote Sens. 2026, 18(12), 1957; https://doi.org/10.3390/rs18121957 (registering DOI) - 12 Jun 2026
Abstract
Joint classification of hyperspectral images (HSI) and light detection and ranging (LiDAR) data is important for land-cover recognition, as it can exploit both spectral discrimination and structural elevation information. However, existing methods mainly focus on visual feature fusion and insufficiently utilize class-level semantic priors, which limits their discriminative capability in complex boundaries, visually similar categories, and limited-sample scenarios. To address these issues, this paper proposes a text-guided multimodal semantic fusion network for HSI–LiDAR classification. Specifically, a Channel-Modulated Mobile Convolution Module (CMMC) is designed to extract modality-specific features, a Spatial–Frequency Feature Enhancement Module (SFFE) is introduced to enhance spatial-boundary and frequency-domain structural representations, and a Bidirectional Cross-Modal Fusion Module (BCMF) is developed to promote complementary interaction between spectral and structural information. Meanwhile, class-level textual descriptions are constructed from class names, color attributes, and geographical contexts, and a text encoder is employed to obtain semantic prototypes. Furthermore, a multi-branch vision–text semantic alignment mechanism projects HSI features, LiDAR features, and fused features into a shared semantic space for joint constraints, improving semantic consistency and class separability. Experiments on the Houston2013, Augsburg, and Trento datasets demonstrate the effectiveness of the proposed method. It achieves an overall accuracy of 98.76% on Houston2013, with improvements of 0.62%, 0.52%, and 0.67 in overall accuracy, average accuracy, and Kappa coefficient × 100 over the best competing results, respectively. The proposed method also obtains the best overall metrics on Augsburg and Trento, and ablation studies verify the effectiveness of the proposed components. Full article
22 pages, 1865 KB  
Article
An Explainable Artificial Intelligence Framework for the Classification of Pumpkin Seed Varieties (Cucurbita pepo L.) Using Morphological Features
by Sajad Sabzi, Omid Daliran, Raziyeh Pourdarbani, Ginés García-Mateos and José Miguel Molina-Martínez
Appl. Sci. 2026, 16(12), 5958; https://doi.org/10.3390/app16125958 (registering DOI) - 12 Jun 2026
Abstract
Accurate automatic classification of seed varieties is important for seed sorting, quality assurance, and plant breeding, yet reliable discrimination remains difficult when cultivars exhibit highly similar visual characteristics. This study presents a reproducible and interpretable framework for the binary classification of two Turkish pumpkin seed varieties using tabular morphological descriptors extracted from segmented seed images. Unlike many previous machine learning studies in this domain, which offer limited interpretability and leave model decisions largely as a black box, the proposed approach places Explainable Artificial Intelligence (XAI) at the center of the analysis. The framework combines biologically meaningful feature engineering, Optuna-based hyperparameter optimization, repeated stratified cross-validation, and a comparative evaluation of XGBoost, LightGBM, and CatBoost. Model explainability was investigated using SHapley Additive exPlanations (SHAP) to identify the morphological traits driving both global and instance-level predictions, while corrected repeated k-fold t-tests were used to assess the statistical significance of performance differences, which confirmed comparable accuracy among the three boosting models and a significant advantage over the baseline classifiers. All three boosting ensembles consistently outperformed the baseline classifiers (SVM, Logistic Regression, and Random Forest) on the hold-out test set. CatBoost achieved the best overall results, with an accuracy of 0.888, an F1-score of 0.879, and an MCC of 0.777. SHAP analysis consistently highlighted compactness, roundness, eccentricity, and engineered interaction descriptors as the most influential predictors. Overall, the proposed XAI-driven framework provides an accurate and transparent solution for pumpkin seed classification. Full article
(This article belongs to the Section Agricultural Science and Technology)
23 pages, 93772 KB  
Article
TriCross-D2D: A Cross-Scene, Cross-View, and Cross-Weather Dataset for Drone-to-Drone Detection
by Wei Tang, Qilong Li, Yueping Peng, Hexiang Hao, Wenchao Kang, Xuekai Zhang, Liming Hou and Hongyan Lu
Drones 2026, 10(6), 459; https://doi.org/10.3390/drones10060459 (registering DOI) - 12 Jun 2026
Abstract
Drone-to-drone (D2D) detection is a critical yet underexplored task in low-altitude intelligent perception, where UAV targets are often small, weakly textured, motion-affected, and disturbed by complex backgrounds and environmental changes. Existing cross-domain detection datasets mainly focus on ground objects or single-factor shifts, making them insufficient for evaluating D2D detection under coupled real-world variations. To address this gap, we present TriCross-D2D, an RGB air-to-air UAV detection dataset and benchmark with three explicit domain shifts: scene, viewpoint, and weather. Built from real flight videos and controlled synthetic fog, TriCross-D2D contains 13 RGB video sequences, 23,403 raw frames, 7045 benchmark images, and 9771 annotated UAV instances. It provides a fixed split of 4045 Source_train images, 2000 Target_train images, and 1000 Target_val images, supporting both unsupervised domain adaptation (UDA) and semi-supervised domain adaptation (SSDA). The dataset is dominated by small objects, with extremely tiny, tiny, and small targets accounting for 73.8% of all instances. Benchmark results show that existing cross-domain detectors still achieve limited performance on TriCross-D2D, especially under stricter localization and recall metrics. Single-factor analysis further reveals that the coupled scene–viewpoint–weather protocol is more challenging than isolated shifts, with viewpoint variation producing a particularly strong domain gap. As an exploratory enhanced baseline, SCOPE-DA-RTDETR improves DA-RTDETR from 28.63/13.12/22.39 to 29.94/13.71/23.40 in AP50/AP50:95/AR, showing consistent but modest gains. These findings demonstrate that TriCross-D2D provides a challenging and discriminative benchmark for cross-domain D2D small-object detection.
26 pages, 2010 KB  
Article
A Dual-Stage Multimodal Alignment Approach for Robust Breast Cancer Diagnosis via Visual–Textual Computing
by Ramazan Ozgur Dogan
Appl. Sci. 2026, 16(12), 5934; https://doi.org/10.3390/app16125934 - 11 Jun 2026
Viewed by 99
Abstract
Manual classification of breast cancer is resource-intensive, slow, and subject to inter-observer variability, motivating automated deep learning solutions. Most current methods rely on unimodal imaging data and struggle with domain generalization (DG) across varied clinical environments. We propose a Dual-Stage Multimodal Alignment approach that integrates breast ultrasound (US) imagery with clinical text reports to improve diagnostic stability. The method proceeds in two stages: (1) Local Correlation Alignment (LCA), which aligns fine-grained visual features with textual embeddings to capture localized lesion attributes, and (2) Global Attention Alignment (GAA), which applies multi-head self-attention to the joint visual–textual sequence to encourage domain-invariant representations. We evaluate the approach on a harmonized, leakage-free repository of 6880 images aggregated from six public US datasets (BUS-CoT, BrEaST, BUS-BRA, BUS-UCLM, BLUI, BUSI) under three protocols: independent benchmarking on BUS-CoT, pooled cross-dataset evaluation, and zero-shot domain generalization on unseen unimodal target domains. On the BUS-CoT benchmark, the 198M-parameter model reaches 0.8177 accuracy and 0.8852 AUC, on par with the 7-billion-parameter Qwen2.5-VL-7B with chain-of-thought reasoning (0.8064 accuracy, 0.8354 AUC) while using roughly 1/35 the parameter count. In the pooled setting, it is competitive with single-domain state-of-the-art methods on individual subsets (e.g., 0.9576 AUC on BUSI, 0.8741 accuracy on BUS-BRA). Under zero-shot transfer without clinical text, per-domain AUC ranges from 0.7360 to 0.8060 across four unseen targets, providing a lower bound under cross-scanner shift. These results indicate that task-specific multimodal alignment can rival large vision-language models in breast US diagnosis at a fraction of the parameter count. Full article
14 pages, 1234 KB  
Article
Enhancing Whole Slide Image Classification in Renal Cell Carcinoma via Swin Transformer-Based Multiple Instance Learning
by Bohan Zhang and Gao Zhen
Bioengineering 2026, 13(6), 680; https://doi.org/10.3390/bioengineering13060680 (registering DOI) - 11 Jun 2026
Viewed by 126
Abstract
Renal cell carcinoma (RCC) comprises histologic subtypes with distinct prognosis and treatment implications. This single-cohort study evaluated slide-level weakly supervised subtype classification for clear cell RCC (ccRCC), papillary RCC (pRCC), and chromophobe RCC (chRCC) using 928 diagnostic H&E whole-slide images (WSIs) from 928 patients in TCGA-RCC. We propose Swin-CLAM, a controlled modification of CLAM in which the conventional CNN patch encoder is replaced by an ImageNet-pretrained Swin-Tiny Transformer, while the CLAM-SB bag-level aggregation module is kept unchanged. WSIs were segmented, tiled into non-overlapping 256×256 patches at an effective 20× magnification, encoded offline, and classified using slide-level labels only. In five-fold patient-level cross-validation on TCGA-RCC, Swin-CLAM achieved a macro-averaged AUC of 0.976±0.008, an accuracy of 94.8±1.0%, and a macro-F1 of 0.940±0.012, with the largest gain observed for chRCC. Attention heatmaps and t-SNE plots were used as qualitative, exploratory analyses rather than formal evidence of interpretability. These results suggest that stronger patch-level representation can improve CLAM-based RCC subtype classification under a fixed MIL aggregator. However, the study does not establish clinical readiness, and external validation, calibration, domain-shift analysis, and expert region-level assessment are needed before practical deployment. Full article
26 pages, 2289 KB  
Article
VI-MSFFN: A Visible-Infrared Multi-Scale Feature Fusion Network for Cross-Modal Detection in Remote Sensing
by Yurong Yue, Weiwei Qin, Hao Chi, Baiwei An, Dingyi Wu, Wenxin Guo and Jingyi Xiong
Remote Sens. 2026, 18(12), 1938; https://doi.org/10.3390/rs18121938 - 11 Jun 2026
Viewed by 62
Abstract
To address insufficient single-modality robustness and limited multi-scale object detection accuracy in remote sensing image detection (RSID) in complex environments, this paper proposes a multimodal RSID network named VI-MSFFN. The model adopts a symmetric parallel dual-branch architecture to achieve independent extraction and collaborative modeling of visible and infrared modal features. A cross-modal multi-scale sparse cross-attention fusion module is proposed and applied to the P4 and P5 feature layers, and a collaborative cross-modal fusion strategy over high- and low-level features is constructed to achieve efficient and robust cross-modal feature fusion while enhancing multi-scale object modeling capability and suppressing feature redundancy and noise. Additionally, a progressive feature interaction and fusion architecture is designed that combines spatial- and frequency-domain information to strengthen deep object representation. Experimental results on the VEDAI and Drone Vehicle datasets demonstrate that VI-MSFFN achieves state-of-the-art (SOTA) performance in detection accuracy, robustness, and generalization ability. The proposed method effectively addresses the detection challenges of RSID and has significant application value in the field of multimodal RSID.
25 pages, 3608 KB  
Article
GC2MFND: Multi-Granularity Conflict and Domain-Guided Calibration for Multimodal Fake News Detection
by Yanming Sun, Mingyue Zhang and Fujun Zhang
Entropy 2026, 28(6), 672; https://doi.org/10.3390/e28060672 (registering DOI) - 11 Jun 2026
Viewed by 179
Abstract
On current social media platforms, multimodal fake news has permeated various fields. Multi-domain fake news detection has garnered significant attention in the academic community. Existing multi-domain methods primarily employ feature fusion techniques based on text–image alignment, neglecting the extraction of conflicting information across modalities and failing to address the domain-dependent nature of cross-modal feature conflicts. To address this, we propose a Multi-Granularity Conflict and Domain-Guided Calibration for Multimodal Fake News Detection model (GC2MFND). This model captures conflicting features through the domain-aware multi-granularity conflict extraction module and mitigates feature suppression using the domain-guided multimodal feature calibration module. Finally, it combines domain-adaptive aggregation with multi-view evidence integration to achieve robust decision-making under supervised contrastive learning constraints. Under known domain conditions, the experimental results demonstrate that GC2MFND outperforms existing multi-domain baseline methods, achieving accuracy rates of 95.3%, 95.7%, and 81.2% on the Weibo, Weibo21, and FineFake datasets, respectively, representing improvements of 1.1%, 1.2%, and 1.4% over the corresponding multi-domain baselines. Full article
(This article belongs to the Section Multidisciplinary Applications)
20 pages, 16427 KB  
Article
Lightweight Spatial-Frequency Collaborative Interaction Network for RGB-D Salient Object Detection
by Yitong Lu and Ziguan Cui
Sensors 2026, 26(12), 3708; https://doi.org/10.3390/s26123708 - 10 Jun 2026
Viewed by 234
Abstract
RGB-D salient object detection (SOD) aims to segment the most prominent objects from the background with a pair of given RGB and depth images. Existing RGB-D methods usually rely on heavy backbones to achieve high accuracy, while current lightweight methods struggle to maintain competitive performance. To break this intractable trade-off between effectiveness and model complexity, we propose a Lightweight Spatial-Frequency Collaborative Interaction Network (SFCINet), a unified and highly efficient framework. The core of SFCINet resides in the synergy between spatial-domain features and frequency-domain global priors. Specifically, we introduce the Spatial-Frequency Synergy (SFS) module, which shifts the perspective to a joint complex Fourier domain. By adaptively learning and optimizing the decoupled amplitude and phase components, it effectively isolates clutter to yield a purified global frequency-synergized prior, which modulates the spatial branches to eliminate cross-modal discrepancies for subsequent feature fusion while supplementing global information during decoding. To alleviate the interference caused by cross-modal representation discrepancies, we design the Cross-Guidance Interaction (CMGI) module, which employs a reciprocal anchoring mechanism. It guides the counterpart to mutually filter irrelevant noise and select task-relevant information, achieving fusion in an efficient manner. Finally, we present a Calibrated Hierarchical Decoder (CHD), which injects frequency-synergized global priors into the hierarchical decoding process. It re-establishes the connection between the frequency and spatial domains, ultimately achieving global-local consistency. Extensive experiments demonstrate that SFCINet delivers superior performance over state-of-the-art methods. Full article
(This article belongs to the Section Sensing and Imaging)
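The SFCINet abstract above refers to working in the Fourier domain on decoupled amplitude and phase components. The sketch below only illustrates that decomposition and its inverse with NumPy's FFT; it is not the SFS module itself, whose learned processing is not described in the abstract. The function names `split_amplitude_phase` and `merge_amplitude_phase` are hypothetical.

```python
import numpy as np


def split_amplitude_phase(feat):
    """Return the amplitude and phase of the 2D Fourier spectrum of `feat`.

    `feat` is a real array whose last two axes are spatial (H, W).
    """
    spec = np.fft.fft2(feat)  # complex spectrum over the last two axes
    return np.abs(spec), np.angle(spec)


def merge_amplitude_phase(amplitude, phase):
    """Rebuild a spatial-domain array from amplitude and phase components."""
    spec = amplitude * np.exp(1j * phase)
    # For amplitude/phase taken from a real input, the imaginary part of the
    # inverse transform is numerical noise, so only the real part is kept.
    return np.fft.ifft2(spec).real
```

A network operating in this domain would modify the amplitude and phase between the two calls; here the round trip simply returns the original input.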
22 pages, 6333 KB  
Article
Frequency-Aware Bidirectional Interactive Mamba Network for Image Super-Resolution
by Youchen Chen, Ye Liu, Xing Wang, Qingze Zhang, Xingrui Zhang, Kerang Cao and Hoekyung Jung
Electronics 2026, 15(12), 2562; https://doi.org/10.3390/electronics15122562 - 10 Jun 2026
Viewed by 151
Abstract
Although state space models (SSMs) represented by Mamba have achieved long-range dependency modeling with linear complexity, they still struggle to realize differentiated processing of high-frequency and low-frequency components in image super-resolution (SR) tasks. To address the drawbacks of existing wavelet-Mamba methods, such as texture distortion and unstable training caused by independent frequency band modeling, this paper proposes a frequency-aware bidirectional interactive mamba network (FABIMNet). The network uses discrete wavelet transform to decouple image frequency-domain components. The core lies in the proposed bidirectional high-frequency enhancement (Bi-HFE) module, which constructs an interaction mechanism of low-frequency guiding high-frequency generation and high-frequency feedback correcting low-frequency structure, and cooperates with the lightweight cross-band interaction (LCBI) module to achieve information synchronization and complementarity during deep feature extraction. Extensive experiments demonstrate that the proposed method achieves a competitive trade-off between computational efficiency and reconstruction performance across five benchmark datasets. Full article
26 pages, 6396 KB  
Article
A Method for Multimodal Information Extraction and Knowledge Graph Construction in Substation Secondary System
by Wenting Zha, Yue Liu, Dengrui Peng and Zhipeng Su
Entropy 2026, 28(6), 655; https://doi.org/10.3390/e28060655 - 9 Jun 2026
Viewed by 143
Abstract
Multi-source heterogeneous data in substation secondary systems are typically characterized by high entropy and disorder, which pose significant challenges for cross-modal information integration and efficient retrieval. Therefore, a method for multimodal information extraction and knowledge graph construction is proposed, enabling structured processing of heterogeneous data from multiple sources. For the image modality, positional and semantic information is extracted using YOLOv8n and Optical Character Recognition (OCR) techniques. To mitigate the effects of uncertain connection topology and noise interference, a Heuristic Circular Stepping Search Algorithm (HCSA) is designed to achieve deterministic path tracing of information flows. For the text modality, a RoFormer-BiLSTM-CRF model enhanced with Rotary Position Embedding (RoPE) is developed to alleviate information degradation in long-sequence texts, thereby enabling high-accuracy extraction of entities and relationships. Furthermore, by combining the domain ontology mapping rules and string similarity, the extracted device entities from the two modalities are aligned, thereby converting scattered data into a structured knowledge graph. Experiments conducted on the secondary-side data of a substation in China demonstrate that the proposed method effectively extracts multimodal information from substation secondary systems, providing valuable support for information management and decision-making assistance in complex industrial systems. Full article
21 pages, 321 KB  
Article
Psychosocial Burden, Multi-System Somatic Symptom Severity, and Weight-Related Stigma in Late Adolescents and Young Adults: A Cross-Sectional Survey from Romania
by Raluca Maior, Hajnal Finta, Halit Tanju Besler, Elena Mardale, Simona Toncean and Vladimir Bacarea
Life 2026, 16(6), 969; https://doi.org/10.3390/life16060969 - 9 Jun 2026
Viewed by 178
Abstract
Evidence on the interplay between perceived stress, dietary behaviour, and weight-related psychosocial burden in Romanian young adults remains scarce. This cross-sectional study assessed associations between BMI, perceived stress, multi-system somatic symptom severity, and psychosocial burden in 117 participants aged 16 to 20 years (89.7% female; mean age 19.23 ± 0.74 years; mean BMI 22.66 ± 3.85 kg/m2), recruited by convenience sampling in Târgu Mureș, Romania, during June 2025. Non-parametric methods were used throughout. Female participants scored significantly higher than males across digestive (p < 0.001), neurological (p = 0.001), cutaneous (p = 0.014), and total symptom domains (p < 0.001), with a median total symptom score of 21.0 versus 3.0 in males. Perceived stress correlated positively with neurological (rS = 0.445), cardiovascular (rS = 0.350), digestive (rS = 0.316), and total symptom scores (rS = 0.401; all p < 0.001). BMI was not associated with somatic symptoms but correlated with weight-related stigma (rS = 0.391, p < 0.001). Emotional distress was prevalent regardless of weight status: 60.7% reported food-related guilt and 59.8% reduced self-confidence, yet only 6.0% had consulted a mental health professional. Stress management, nutritional counselling, and body image support should target young adults across all BMI categories. Full article
(This article belongs to the Special Issue Nutrition, Exercise and Stress)
19 pages, 2225 KB  
Article
Pancreas Segmentation Using a Two-Stage Pipeline of Faster R-CNN and TransUNet
by Yunjung Hong, Servas Adolph Tarimo and Jiyoung Woo
Appl. Sci. 2026, 16(12), 5764; https://doi.org/10.3390/app16125764 - 8 Jun 2026
Viewed by 93
Abstract
Pancreas segmentation in computed tomography (CT) images remains a challenging task due to the organ’s variable shape, size, and low contrast against surrounding tissues. In this study, we propose a two-stage pancreas segmentation framework that combines region localization using Faster R-CNN and pixel-level segmentation using a hybrid TransUNet architecture. To address issues related to class imbalance, we utilize the publicly available NIH pancreas CT dataset and apply downsampling techniques to construct a balanced training set. The segmentation stage incorporates a Transformer encoder into the U-Net framework and employs a novel DHD loss function, which combines the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD) to enhance both region-level and boundary-level accuracy. Through the experiments on the open dataset, our proposed model achieved a mean DSC of 88.98%, precision of 91.0%, and recall of 94.4%, the highest among compared approaches on the NIH dataset, outperforming the baselines such as U-Net and standard-loss TransUNet; however, differences in preprocessing protocols should be considered when making direct comparisons with prior work. To assess generalizability beyond the training domain, we further evaluated the NIH-trained model on the BTCV Multi-Organ Segmentation dataset, a completely different institution’s CT dataset, using inference-time adaptation strategies without any fine-tuning, achieving a mean DSC of 62.96% in a zero-shot cross-dataset setting. A fully automatic end-to-end pipeline where a Faster R-CNN detector fine-tuned on BTCV training cases predicts bounding boxes used for cropping was evaluated using 5-fold cross-validation on all 30 BTCV cases and achieved a mean DSC of 66.50% ± 8.55%, with no manual annotation used at any stage. Full article
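The pancreas segmentation abstract above reports results as a Dice Similarity Coefficient (DSC). For readers unfamiliar with the metric, here is a minimal NumPy sketch of DSC on binary masks. It covers only the Dice term, not the paper's DHD loss or the Hausdorff Distance; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for this example.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks of equal shape.

    DSC = 2 * |pred AND target| / (|pred| + |target|), in the range [0, 1].
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / denom
```

For example, a prediction with two foreground pixels that overlaps a one-pixel ground truth in exactly one pixel gives 2 * 1 / (2 + 1), or about 0.667.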
35 pages, 1263 KB  
Systematic Review
Advances in Artificial Intelligence-Enabled Crop Pest and Disease Detection: A Systematic Review
by Zhen Ma, Cundeng Wang, Xinzhong Wang and Xuegeng Chen
Agriculture 2026, 16(12), 1262; https://doi.org/10.3390/agriculture16121262 - 7 Jun 2026
Viewed by 420
Abstract
The detection technology of crop diseases and pests is transitioning from single sensor monitoring to intelligent perception and multimodal fusion. This paper follows the PRISMA 2020 standard and systematically reviews the relevant core literature. This paper systematically summarizes the development history of spectral sensing technology and analyzes the physical mechanisms of hyperspectral and multispectral imaging in early identification of crop diseases. The focus is on the architectural evolution of deep learning models, including lightweight convolutional neural networks (CNNs), vision transformers (ViTs) with long-range dependency modeling capabilities, and the efficient computing state space model Mamba. In addition, the research progress of spatial spectral joint learning, heterogeneous data fusion, and vision-language models (VLMs) in improving system robustness and interpretability are introduced. By synthesizing the integrated applications of UAV remote sensing, Internet of Things (IoT) edge computing and intelligent robots in staple and cash crops, this paper summarizes the implementation of the integrated system of perception, decision-making and execution. To address the issues of insufficient cross-domain generalization ability and uneven allocation of computing resources in existing models, this paper provides perspectives on the future development of agricultural artificial intelligence (AI) towards foundation model-driven, edge-intelligent collaboration, and green sustainable direction, which can provide theoretical reference for engineering applications in the field of intelligent plant protection. Full article
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)