MDPI - Publisher of Open Access Journals

26 pages, 3229 KB

Open Access Review

Artificial Intelligence Algorithms in Tunnel Construction Risk Management: A Review of Research Trends, Application Scenarios and Bottlenecks

by Junqian Zhang, Jianling Huang, Xiaodong Hu, Qing’e Wang, Huihua Chen and Zhenxu Guo

Buildings 2026, 16(12), 2446; https://doi.org/10.3390/buildings16122446 (registering DOI) - 20 Jun 2026

Abstract

As tunnel engineering continues to advance toward deeper, longer, and more complex projects, the risks encountered during the construction phase have evolved into a combination of various disaster types and the accumulation of multiple contributing factors. Traditional empirical and semi-empirical risk management methods are increasingly revealing shortcomings in terms of timeliness, accuracy, and the ability to process multi-source data. In recent years, driven by advancements in computing power and sensor technology, artificial intelligence algorithms (AI algorithms) such as machine learning and deep learning have been rapidly adopted in tunnel construction risk management. This paper retrieved relevant literature from the Web of Science database covering the period from 2010 to 2025. After rigorous screening, 96 highly relevant papers were selected for bibliometric analysis. This paper systematically reviews research progress from two perspectives: algorithmic models and engineering applications. The review indicates that, in terms of algorithmic models, traditional machine learning, convolutional neural network, recurrent neural network, generative adversarial network, Transformer, and graph neural network constitute a multi-level technical framework encompassing feature representation, risk perception, and intelligent decision-making. In terms of applications, AI algorithms have been widely integrated into typical scenarios such as geological hazard identification and prediction, surrounding rock stability and deformation prediction, rock burst assessment and early warning, lining defect detection and structural safety assessment, construction-induced ground settlement prediction, and tunnel gas and fire hazard prediction, significantly enhancing risk identification and early warning capabilities. 
However, several challenges remain, including the scarcity of high-quality datasets, the prevalence of noisy, incomplete, and heterogeneous monitoring data, insufficient coupling between model interpretability and engineering mechanisms, limited cross-project transferability, and the lack of integrated management systems for multi-hazard lifecycle control. Based on this, this paper proposes future research directions in areas such as data infrastructure development, integration of mechanism constraints, and multi-hazard collaborative modeling, aiming to provide guidance for the further development of intelligent risk management in tunnel construction.

(This article belongs to the Section Construction Management, and Computers & Digitization)

14 pages, 1969 KB

Open Access Article

Radiomics-Guided Multi-Sequence Learning for Pathological Complete Response Prediction from Breast MRI with Missing Auxiliary Sequences

by Xinyuan Xiang, Wenyu Yin and Jiayue Li

J. Imaging 2026, 12(6), 271; https://doi.org/10.3390/jimaging12060271 - 18 Jun 2026

Viewed by 86

Abstract

Pathological complete response (pCR) after neoadjuvant chemotherapy (NACT) provides an endpoint for treatment evaluation in breast cancer. Multi-sequence breast MRI can support pCR prediction, but routine examinations may lack usable T1-weighted or T2-weighted sequences. Many models merge radiomic and deep features by concatenation, leaving the interaction between handcrafted descriptors and learned representations weakly specified. We developed a radiomics-guided framework for pCR prediction from multi-sequence breast MRI. The model uses a multi-branch 2.5D encoder for sequence-specific features, radiomics-guided channel recalibration, and masked token fusion to aggregate available sequence tokens. We evaluated the framework on 157 patients from the I-SPY1 Trial cohort with patient-level five-fold cross-validation, fixed sequence-combination analysis, and slice-window sensitivity analysis. The full model achieved 78.4% accuracy and 0.809 AUC, compared with 75.8% accuracy and 0.788 AUC for the strongest channel-concatenation baseline. In this cohort, radiomics-guided multi-sequence learning was feasible, with external validation required before clinical interpretation.
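The abstract above does not spell out how masked token fusion works, so the following is only a minimal sketch of one plausible reading: each MRI sequence contributes a feature token, and a softmax restricted to the available sequences decides the mixing weights. The function name, the per-sequence relevance scores, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def masked_token_fusion(tokens, scores, available):
    """Fuse per-sequence feature tokens while ignoring missing sequences.

    tokens:    list of equal-length feature vectors, one per MRI sequence
    scores:    list of unnormalized relevance scores, one per sequence (assumed input)
    available: list of booleans, False where the sequence is missing
    """
    if not any(available):
        raise ValueError("at least one sequence must be available")
    # Softmax over available sequences only; missing sequences get weight 0.
    max_score = max(s for s, a in zip(scores, available) if a)
    exps = [math.exp(s - max_score) if a else 0.0 for s, a in zip(scores, available)]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tokens[0])
    fused = [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]
    return fused, weights


# Toy example: the third sequence is missing, so its token and score are ignored.
fused, weights = masked_token_fusion(
    tokens=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    scores=[0.0, 0.0, 10.0],
    available=[True, True, False],
)
```

Masking before normalization, rather than zero-filling the missing token, keeps the fused vector on the same scale regardless of how many sequences are present.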

(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications, 2nd Edition)

29 pages, 6688 KB

Open Access Article

CGMSN: CFAR-Guided Mode-Selective Network for SAR Target Detection

by Lingjuan Yu, Xinya Xiong, Xiaochun Xie, Miaomiao Liang, Xiangchun Yu, Xuan Jiao and Wen Hong

Remote Sens. 2026, 18(12), 2040; https://doi.org/10.3390/rs18122040 - 18 Jun 2026

Viewed by 85

Abstract

Improving detection performance across diverse synthetic aperture radar (SAR) scenes remains challenging because different datasets exhibit different levels of target–background separability. To address this issue, we propose a constant false alarm rate (CFAR)-guided mode-selective network (CGMSN), which selects an appropriate feature-fusion mode according to the CFAR target–background separation margin. Specifically, CFAR is used as an interpretable statistical tool to construct an anomaly response map. The separation margin is then calculated by comparing the average CFAR anomaly responses of annotated target regions and their surrounding contextual backgrounds. Based on this indicator, a You Only Look Once version 8 (YOLOv8)-based mode-selective detector is constructed with three key components. First, a lightweight representation-enhanced backbone that integrates ResNet18 and a dilated convolutional spatial pyramid (DCSP) module is adopted to improve contextual representation while maintaining moderate model complexity. Second, a mode-selective neck (MSN) is designed with three predefined fusion modes, where the appropriate fusion depth is selected according to the CFAR-guided target–background separation margin of each dataset. Third, a complete intersection over union (CIoU)-modulated head (CMH) is developed to enhance classification-regression alignment and suppress clutter-induced responses. Experiments on SAR-Aircraft-1.0, High-Resolution SAR Images Dataset (HRSID), and SAR Ship Detection Dataset (SSDD) indicate that datasets with smaller CFAR target–background separation margins benefit from deeper fusion, while datasets with larger separation margins can adopt shallower fusion. Moreover, the proposed CGMSN achieves superior performance over representative detectors, demonstrating its effectiveness on the evaluated SAR datasets with diverse scene characteristics.
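The separation margin described above can be illustrated with a small sketch. It assumes a cell-averaging CFAR-style response (pixel value divided by the mean of a surrounding training ring) and assumes the "comparison" is a simple difference of means; the abstract confirms neither choice. The function names, window sizes, and the synthetic scene are hypothetical.

```python
def ca_cfar_response(image, guard=1, train=2):
    """Cell-averaging CFAR-style response: pixel value over local clutter mean."""
    h, w = len(image), len(image[0])
    reach = guard + train
    response = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ring = [
                image[yy][xx]
                for yy in range(max(0, y - reach), min(h, y + reach + 1))
                for xx in range(max(0, x - reach), min(w, x + reach + 1))
                # Keep only training cells, i.e. cells outside the guard window.
                if abs(yy - y) > guard or abs(xx - x) > guard
            ]
            clutter = sum(ring) / len(ring) if ring else image[y][x]
            response[y][x] = image[y][x] / (clutter + 1e-9)
    return response


def separation_margin(response, box, pad=2):
    """Mean response inside the box minus mean response in a ring around it.

    Assumes pad >= 1 so that the context ring is not empty.
    """
    y0, x0, y1, x1 = box  # half-open: rows y0..y1-1, columns x0..x1-1
    h, w = len(response), len(response[0])
    target, context = [], []
    for y in range(max(0, y0 - pad), min(h, y1 + pad)):
        for x in range(max(0, x0 - pad), min(w, x1 + pad)):
            inside = y0 <= y < y1 and x0 <= x < x1
            (target if inside else context).append(response[y][x])
    return sum(target) / len(target) - sum(context) / len(context)


# Synthetic 9x9 scene: uniform clutter with one bright 3x3 target in the middle.
scene = [[1.0] * 9 for _ in range(9)]
for y in range(3, 6):
    for x in range(3, 6):
        scene[y][x] = 5.0
margin = separation_margin(ca_cfar_response(scene), box=(3, 3, 6, 6))
```

Under this reading, a dataset-level margin would be an average over all annotated boxes, and a small value would point toward the deeper fusion modes mentioned in the abstract.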

(This article belongs to the Special Issue Target Recognition and Detection Based on High Resolution Radar Images (Second Edition))

32 pages, 3409 KB

Open Access Article

xServeNet: An Explainable Deep Neural Network for Web Services Classification

by Yilong Yang, Muhammad Ali Khan, Zhaotian Li and Weiru Wang

Electronics 2026, 15(12), 2711; https://doi.org/10.3390/electronics15122711 - 18 Jun 2026

Viewed by 156

Abstract

Web service classification plays an important role in software reuse, service discovery, and automatic metadata organization. Although recent deep learning approaches have improved classification performance by using service names and natural-language descriptions, most existing methods still operate as black-box models and offer limited insight into how different metadata sources influence classification decisions. This lack of transparency reduces their practical usefulness for developers who need to verify predicted categories, analyze incorrect classifications, and improve service metadata quality. A well-trained interpretable model can not only help developers choose more appropriate and reliable categories for each web service, but also help write a more reasonable service name and description. In this paper, we present xServeNet, an explainability-oriented extension of ServeNet for transparent web service classification. xServeNet preserves the BERT-based representation and CNN–BiLSTM feature extractor of ServeNet and introduces (i) an instance-wise dynamic source-fusion mechanism that adaptively combines service-name and service-description features according to their semantic contribution, and (ii) model-internal importance indicators at both the source and word levels that support inspection of classification decisions without introducing additional trainable parameters. We benchmark xServeNet against eleven machine learning baselines on two real-world ProgrammableWeb datasets of 10,943 and 14,086 services covering 50 categories. xServeNet reaches 71.08% Top-1/91.35% Top-5 accuracy on the original dataset and 74.10% Top-1/92.95% Top-5 accuracy on the updated dataset, consistently improving Top-1 accuracy over ServeNet while remaining competitive on Top-5, and achieving the lowest per-category Top-5 standard deviation among all twelve compared methods. 
In practice, the importance indicators support three concrete activities at the service registry: helping developers verify predicted categories at registration time, iterating on description wording when the predicted category looks wrong, and supporting registry curators in flagging likely mislabelled services for review.
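The Top-1 and Top-5 figures quoted above follow the usual top-k accuracy definition: a prediction counts as correct when the true category is among the k highest-scoring categories. The sketch below shows that definition only; the function name and the toy scores are illustrative and unrelated to the xServeNet code or data.

```python
def top_k_accuracy(score_rows, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Three services, three candidate categories (toy numbers).
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

With 50 categories, a wide gap between Top-1 and Top-5 suggests that many errors fall on closely related categories, which is where inspecting source-level and word-level importance is most useful.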

(This article belongs to the Special Issue New Trends in Machine Learning, System and Digital Twins)

24 pages, 3312 KB

Open Access Article

Leveraging Multi-Source Data Fusion Approach for Fine-Grained Affective-Appraisal Analysis in TPD-Oriented Online Professional Learning

by Di Chen, Xinyue Xu, Ruiyang Gao and Yuhong Liu

Behav. Sci. 2026, 16(6), 1025; https://doi.org/10.3390/bs16061025 - 18 Jun 2026

Viewed by 133

Abstract

Teacher professional development (TPD) is increasingly mediated by online platforms, yet emotion analysis in this context remains underdeveloped because teachers’ professional discourse is often reflective, evaluative, and shaped by professional norms. To address this challenge, this study proposes a fine-grained, low-intrusion affective-appraisal analysis framework for TPD-oriented online professional learning that integrates textual evidence with platform interaction logs. The framework retains pleasure, arousal, and dominance from the pleasure–arousal–dominance (PAD) model and introduces utility as an appraisal-related dimension, capturing teachers’ perceived usefulness, value judgment, and professional learning gain. Methodologically, it combines textual representations based on Bidirectional Encoder Representations from Transformers (BERT), intra-week long short-term memory (LSTM) aggregation, interpretable behavioral-log features, and feature-level fusion. Data were collected from an authentic TPD-oriented online course involving 107 pre-service teachers, yielding 1276 teacher-week samples from 4300 texts and 264,028 interaction records. Results show that intra-week sequential modeling improves the macro-averaged F1 score (Macro-F1) over both the term frequency–inverse document frequency plus support vector machine (TF-IDF+SVM) baseline and BERT-based weekly text concatenation, with statistically significant gains over the non-sequential BERT-concat model across all four dimensions. Adding interaction logs improves accuracy across all dimensions and provides complementary process-based evidence, especially for arousal and utility. By linking a four-dimensional affective-appraisal framework with text-log fusion, this study offers a scalable and context-sensitive approach to affective-appraisal analytics in pre-service teacher professional learning.

(This article belongs to the Section Educational Psychology)

25 pages, 6003 KB

Open Access Article

Multi-Scale Feature Fusion for Intelligent Recognition of Tunnel Face Fractures

by Qiang Gong, Jiaying Fan, Ning Zhang, Hongliang Liu, Xinbo Jiang, Changyuan Chen, Wenfeng Tu and Yuxue Chen

Appl. Sci. 2026, 16(12), 6182; https://doi.org/10.3390/app16126182 - 18 Jun 2026

Viewed by 159

Abstract

Accurate recognition of fractures on tunnel faces is essential for evaluating surrounding-rock integrity and ensuring excavation safety, yet it remains difficult because fracture traces are slender, irregular, discontinuous, and easily obscured by complex rock textures and illumination variability. This study proposes MF-DeepLabv3+, an enhanced DeepLabv3+-based semantic segmentation framework for tunnel-face fracture identification and geometric characterization. Unlike existing attention-based DeepLab variants that mainly enhance global feature representation, MF-DeepLabv3+ is specifically designed for thin and discontinuous tunnel-face fracture segmentation by integrating a Multi-Scale Cross Attention module for multi-receptive-field feature interaction, a Feature Smoothing Module for noise suppression and fracture-continuity enhancement, and a lightweight MobileNetV2 backbone for improved computational efficiency. A dataset of 2153 annotated images collected from the Qingdao Jiaozhou Bay Second Subsea Tunnel and the Yantai Urban Rapid Road Tunnel was established for training and evaluation. Considering the strong class imbalance between fracture and background pixels, Accuracy is reported only as an auxiliary metric, while mAP, mIoU, per-class IoU, and fracture-specific Precision, Recall, and F1-score are emphasized to provide a more reliable assessment of segmentation performance. Comparative and ablation experiments show that MF-DeepLabv3+ achieved 82.56% mAP and 62.99% mIoU, with an auxiliary Accuracy of 92.47%. Compared with the original DeepLabv3+ baseline, the proposed model achieved a substantial improvement in mAP and a modest improvement in mIoU, indicating enhanced fracture recognition capability and slightly improved region-level overlap, at the cost of a moderate increase in computation.
Fracture grouping and post-processing were further performed using edge detection, Hough transform, connected-component analysis, and fitted-line geometry to estimate fracture length and width. The proposed method therefore enables more reliable tunnel-face fracture recognition and provides quantitative geometric information for engineering assessment and geological interpretation.
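The reason Accuracy is treated as auxiliary under class imbalance can be seen with a small worked example. The sketch below computes standard binary pixel metrics on flattened masks; the function name and the synthetic 100-pixel masks are illustrative and are not drawn from the paper's dataset.

```python
def pixel_metrics(pred, truth):
    """Binary segmentation metrics for flattened 0/1 masks (1 = fracture pixel)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / len(truth)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "iou": iou}


# 100 pixels, only 5 of which belong to a fracture.
truth = [0] * 95 + [1] * 5
all_background = [0] * 100                      # predicts no fracture at all
partial = [0] * 94 + [1] + [1, 1, 1, 0, 0]      # 3 hits, 1 false alarm, 2 misses

blind = pixel_metrics(all_background, truth)
useful = pixel_metrics(partial, truth)
```

A model that never predicts a fracture still scores 0.95 accuracy here while its fracture IoU is 0, which is why overlap-based and fracture-specific metrics are the more informative ones for thin structures.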

27 pages, 2820 KB

Open Access Review

Phenotyping of Histology Imaging Data with Histomics

by Fnu Neha, Deepshikha Bhati and Deepak Kumar Shukla

AI 2026, 7(6), 228; https://doi.org/10.3390/ai7060228 - 18 Jun 2026

Viewed by 205

Abstract

Whole-slide imaging has transformed histopathology into a data-rich domain; however, many computational pathology models encode tissue morphology within latent representations, limiting interpretability, reproducibility, and generalization. This review positions histomics as an intermediate phenotype representation layer linking histological images with downstream clinical inference through structured descriptors of tissue morphology, spatial organization, and tissue architecture. Unlike prior reviews focused primarily on feature extraction or predictive performance, the study adopts a representation-centric perspective of histomics. A taxonomy of histomic features across biological scales is presented, and artificial intelligence frameworks, including machine learning, deep learning, weakly supervised learning, and multimodal approaches, are systematically examined. Key challenges, including segmentation dependence, feature instability, aggregation variability, and domain shift, are critically analyzed alongside emerging developments in foundation models, representation learning, and multimodal pathology. The review provides a unified framework for understanding histomic representations and identifies future directions for developing robust, interpretable, and generalizable computational pathology systems.

33 pages, 4450 KB

Open Access Article

Attention-Enhanced Hybrid CNN–ViT Framework for Genus-Level Classification of Selected Macrofungi from Basidiospore Micrographs

by Şuheda Aldemir Terman, Mustafa Emre Akçay, Ebubekir Seyyarer, Faruk Ayata and İsmail Acar

Appl. Sci. 2026, 16(12), 6167; https://doi.org/10.3390/app16126167 - 18 Jun 2026

Viewed by 171

Abstract

The development of rapid and reproducible image analysis approaches that support genus-level pre-classification of macrofungi is important for taxonomic pre-evaluation and controlled microscopic data analysis. In this study, an advanced deep learning-based approach, namely the Attention-Enhanced Hybrid CNN–ViT Framework, was rigorously evaluated for genus-level classification, using basidiospore micrographs of five carefully selected macrofungal genera. The proposed approach integrates the ability of convolutional neural networks to identify local texture and contour patterns with the global context-modelling capability of Vision Transformer structures. The objective is to enhance the extraction of distinctive representations from microscopic spore images through feature fusion and attention mechanisms. A series of experiments was conducted on a curated dataset consisting of light microscopy images of the genera Agaricus, Hebeloma, Inocybe, Amanita, and Russula. The models were compared using a range of evaluation metrics, including accuracy, F1-score, MCC, ROC-AUC, and PR-AUC. The results showed that the InceptionV3 + ViT-B16 + Fusion configuration was the most successful hybrid model, achieving an accuracy of 0.9213 ± 0.0182, an F1-score of 0.9212 ± 0.0179, a Matthews correlation coefficient (MCC) of 0.9040 ± 0.0222, a receiver operating characteristic (ROC)-area under the curve (AUC) of 0.9896 ± 0.0069, and a precision-recall (PR)-AUC of 0.9684 ± 0.0192, respectively. The present findings demonstrate that basidiospore images can carry distinctive visual information for genus-level automated classification under controlled conditions. However, it is important to note that these results should not be interpreted as claims of species-level identification or field generalisability. This is due to the use of a single microscope-camera system, a single preparation protocol, and the absence of an independent external test set. 
The present study demonstrates that deep learning-based microscopic image analysis can be evaluated as a preliminary classification tool in macrofungal taxonomy. It also shows that such tools can provide a foundation for future work supported by specimen-level validation, external test sets, and different imaging protocols.

(This article belongs to the Section Applied Microbiology)

23 pages, 2071 KB

Open Access Review

XAI2Brain: A Perspective on Mechanistic Interpretability for Brain–AI Alignment

by Richard Jiang, Yongchen Zhou, Boyuan Wang, Plamen Angelov and Qiang Ni

Mach. Learn. Knowl. Extr. 2026, 8(6), 167; https://doi.org/10.3390/make8060167 - 18 Jun 2026

Viewed by 206

Abstract

The convergence of artificial intelligence (AI), explainable AI (XAI), and neuroscience is fostering new opportunities for understanding both machine and biological intelligence through interpretable and human-centered learning paradigms. In this Perspective, we introduce XAI2Brain as a conceptual framework for brain–AI alignment, positioning mechanistic interpretability as an intermediate layer connecting neural network representations, human understanding, and neuroscience-inspired AI design. Rather than viewing XAI solely as a post hoc transparency tool, we emphasize its emerging role in enabling mechanistic analysis of internal model representations, concept-level reasoning, and interactive human–AI alignment. We define XAI2Brain as a multi-level conceptual framework rather than a deployable system, explicitly aimed at structuring brain–AI alignment across representation-level, mechanism-level, and interaction-level perspectives. We survey the evolution of XAI methodologies—from feature attribution and concept-based explanations to mechanistic and human-centric interpretability approaches—and discuss how these methods may support bidirectional knowledge transfer between AI systems and cognitive neuroscience. Importantly, we adopt a cautious stance on brain–AI analogy, explicitly recognizing that artificial neural representations are not equivalent to biological neural representations, and instead focusing on functional and informational correspondences rather than structural equivalence. Unlike conventional human-in-the-loop or reinforcement learning from human feedback paradigms that primarily optimize behavioral outputs, XAI2Brain focuses on cognitively interpretable and mechanistically grounded alignment between AI systems and human reasoning processes. 
This alignment promotes interactive human-in-the-loop intelligence, empowering humans to comprehend, guide, and refine AI systems, while enabling AI systems to better interpret human instructions, intentions, and contextual reasoning. We further discuss the challenges of scaling explainability to large generative and multimodal models, including issues of interpretability robustness, cognitive compatibility, evaluation, and ethical accountability. We also highlight key limitations of current mechanistic interpretability methods, including explanation instability, representation superposition, and lack of causal guarantees, underscoring that these challenges remain open research problems. Rather than proposing a complete artificial brain architecture, this Perspective outlines a research roadmap toward more interpretable, adaptive, and neuroscience-inspired AI systems capable of supporting future brain–AI integration and collaborative intelligence. We additionally clarify that this work follows a narrative perspective review methodology with structured thematic synthesis of the literature. By framing explainability as a bridge between mechanistic AI understanding, cognitive science, and human-centered interaction, XAI2Brain highlights the importance of interpretable alignment for the next generation of brain-inspired AI systems.

(This article belongs to the Section Learning)

23 pages, 2980 KB

Open Access Article

Grouped Feature Representation and Gated Multilayer Perceptron for Event-Level Football Pass Outcome Prediction

by Yijuan Yuan, Shaosong Wang, Yonghong Deng and Zhibin Li

Entropy 2026, 28(6), 703; https://doi.org/10.3390/e28060703 - 17 Jun 2026

Viewed by 154

Abstract

Accurate prediction of football pass outcomes is important for tactical analysis, decision evaluation, and skill-oriented feedback in student football training and physical education. However, event-level pass outcome prediction remains challenging because pass success is jointly influenced by spatial context, defensive pressure, receiver-related cues, and historical coordination between players. To address this issue, this study proposes an information-guided multilayer perceptron (IGMLP) based on grouped feature representation and gated feature fusion using structured event data. In the proposed framework, input variables are organized into interpretable semantic feature groups, including contextual features, pressure-aware features, historical coordination features, and receiver-related features. These groups are encoded through separate branches and adaptively fused by a group-level gating mechanism for nonlinear pass outcome modeling. Unlike conventional gated neural architectures that usually apply generic gates to hidden units, channels, or sequential states, the proposed gated design operates at the semantic feature-group level and adaptively weights football-specific information sources according to their relevance to each pass event. Using the StatsBomb open-event dataset, both prediction and recognition paths were constructed, and the proposed model was compared with standard multilayer perceptron (MLP), residual neural network (ResNet), boosting tree (BT), convolutional neural network (CNN), and long short-term memory network (LSTM). In the prediction path, IGMLP achieved an Accuracy of 0.9184, Precision of 0.9295, Recall of 0.9837, F1-score of 0.9558, and AUC of 0.9325. In the recognition path, IGMLP achieved an Accuracy of 0.9808, Precision of 0.9882, Recall of 0.9902, F1-score of 0.9893, and AUC of 0.9925. These results indicate that semantic feature grouping and gated feature fusion are effective for event-level football pass outcome prediction. 

(This article belongs to the Section Signal and Data Analysis)

30 pages, 2505 KB

Open Access Article

A Knowledge Graph Multi-Hop Question Answering Method Based on Adaptive Graph Convolutional Neural Networks

by Cheng Gan, Yuhang Cai, Shenyi Qian, Songhe Jin, Bowen Fu, Tongxin Zhao and Daiyi Li

Symmetry 2026, 18(6), 1048; https://doi.org/10.3390/sym18061048 - 17 Jun 2026

Viewed by 177

Abstract

Multi-hop question answering (MQA) requires models to perform multi-step reasoning and integrate multiple knowledge sources. However, existing methods combining pre-trained language models (PLMs) and graph neural networks (GNNs) often suffer from low computational efficiency, insufficient deep semantic fusion, and imbalanced modeling of heterogeneous relations. To solve these problems, we propose a Dynamic Hierarchical Adaptive Graph Convolution Network (DHACNet). First, to deal with the issues of insufficient computational efficiency and feature interpretability, we introduce Dynamic Sparse Activation (DSA). A trainable gate unit is used to generate importance masks for the encoder outputs, keeping only the task-relevant neurons. This greatly decreases the computational burden and enhances the interpretability of the model’s decisions. Second, to alleviate insufficient deep semantic fusion, we design a Hierarchical Feature Fusion (HFF) mechanism. It adaptively weights and fuses hidden states from different layers, enhancing the extraction and representation of deep textual semantics. Furthermore, for graph structure modeling, we present Adaptive Graph Convolution (AGC), which assigns learnable weights to different edge types in the graph, thereby improving heterogeneous relation modeling. Finally, hierarchical graph pooling is introduced, which integrates attention mechanism and Top-K selection to achieve efficient and robust graph-level representation. The experimental results show that our proposed model maintains the symmetry between the text representation and graph representation through adaptive layered fusion and relational perceptual graph propagation. This symmetry-aware reasoning process encourages semantic consistency during multi-hop inference and makes knowledge integration more robust.
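The pooling step mentioned above combines attention with Top-K selection. The abstract gives no formula, so the sketch below shows one common pattern rather than the DHACNet implementation: keep the K highest-scoring nodes, then average their features with softmax weights over the kept scores. The function name, the node scores, and the toy graph are hypothetical.

```python
import math


def top_k_attention_pool(node_feats, scores, k):
    """Keep the k highest-scoring nodes and return their attention-weighted mean.

    node_feats: list of equal-length feature vectors, one per graph node
    scores:     list of per-node importance scores (assumed to come from a scorer)
    """
    # Indices of the k best nodes; ties keep their original order.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    max_score = max(scores[i] for i in keep)
    exps = [math.exp(scores[i] - max_score) for i in keep]
    total = sum(exps)
    dim = len(node_feats[0])
    pooled = [
        sum((e / total) * node_feats[i][d] for e, i in zip(exps, keep))
        for d in range(dim)
    ]
    return pooled, keep


# Three nodes; the low-scoring third node is dropped before pooling.
pooled, kept = top_k_attention_pool(
    node_feats=[[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]],
    scores=[1.0, 1.0, -5.0],
    k=2,
)
```

Selecting before weighting means discarded nodes contribute nothing to the graph-level vector, which is one way such pooling can reduce sensitivity to irrelevant entities retrieved from the knowledge graph.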

(This article belongs to the Section Computer)

30 pages, 719 KB

Open Access Article

A Multimodal Sensor-Based Self-Supervised Learning Framework for Low-Noise System State Prediction and Anomaly Detection

by Kexin Guo, Jingwen Wang, Jiayu Lin, Ningjing Chen, Hengyuan Chen, Zilang Zhou and Manzhou Li

Sensors 2026, 26(12), 3851; https://doi.org/10.3390/s26123851 - 17 Jun 2026

Viewed by 172

Abstract

To address the challenges of strong signal noise, pronounced cross-modal asynchrony, high subjectivity in manually defined state labels, and insufficient model stability under extreme abnormal conditions in multi-source sensor systems, a low-noise system state prediction and anomaly detection method based on multimodal sensor signals and self-supervised representation learning is proposed. Environmental sensing data, device status data, network transmission data, operational behavior data, and event log data are uniformly modeled as system state perception signals. A temporal masking-based state structure modeling method, a state-oriented contrastive learning representation constraint mechanism, and a state representation and downstream prediction task alignment strategy are designed to learn stable, transferable, and interpretable system state features. Experimental results demonstrate that the proposed method achieves the best performance in multimodal sensor state prediction and anomaly detection tasks, with mean squared error (MSE), mean absolute error (MAE), and root mean square error (RMSE) values of 0.0167, 0.0856, and 0.1291, respectively, outperforming baseline models such as GARCH, MLP, LSTM, TCN, and Transformer. Meanwhile, IC, RankIC, and AUC reach 0.494, 0.460, and 0.815, respectively, indicating stronger state-ranking capability and improved discrimination between high-abnormality and low-abnormality states. At the classification recognition level, superior accuracy, precision, recall, and F1-score are also achieved by the proposed method, suggesting that potential abnormal states can be identified more accurately. Ablation experiments verify the effectiveness of multimodal fusion, temporal masking modeling, self-supervised contrastive constraints, and task alignment strategies. Robustness experiments further show that lower prediction errors and higher AUC can still be maintained under high-fluctuation and extreme-shock states, demonstrating strong noise resistance, stability, and practical application potential in complex sensor system scenarios.
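The three error metrics quoted in this abstract are related by definition: RMSE is the square root of MSE. The reported figures are consistent with that, since 0.1291 squared is about 0.01667, which rounds to the reported MSE of 0.0167. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code; the function name `regression_errors` and the sample values are hypothetical.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MSE, MAE, RMSE) for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-sample residuals between target and prediction.
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    # RMSE is defined as the square root of MSE.
    rmse = math.sqrt(mse)
    return mse, mae, rmse


if __name__ == "__main__":
    # Made-up sample values, only to show the call.
    mse, mae, rmse = regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
    print(mse, mae, rmse)
```

Because RMSE squares each residual before averaging, it weights large errors more heavily than MAE, which is why the two are usually reported together.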

(This article belongs to the Special Issue Deep Learning for Perception and Recognition Based on Sensor Data: Methods and Applications, 2nd Edition)


33 pages, 3372 KB

Open Access Article

A Genomics-Guided Multimodal Contrastive Learning Framework for Clinically Significant Prostate Cancer Risk Stratification with Missing Clinical Data

by Abdullah, Muhammad Shahid, Muhammad Ateeb Ather, Zulaikha Fatima, Carlos Guzmán Sánchez Mejorada, Miguel Jesús Torres Ruiz, Rolando Quintero Téllez, Miguel Félix Mata-Rivera and Roberto Zagal-Flores

Cancers 2026, 18(12), 1952; https://doi.org/10.3390/cancers18121952 - 16 Jun 2026

Viewed by 225

Abstract


Background: Heterogeneous data integration remains a major challenge in intelligent information systems, particularly under missing-modality and cross-domain conditions. Existing multimodal fusion approaches often rely on complete datasets and weak alignment mechanisms, limiting their robustness and practical applicability. Objectives: This study aims to develop and evaluate a genomics-guided multimodal representation learning framework that enables robust heterogeneous data fusion, reliable cross-modal correspondence, and accurate prediction under incomplete-data conditions. Methods: We propose a multimodal learning architecture that models genomics as the primary biological anchor and learns conditional projections to imaging modalities, including multiparametric MRI and whole-slide histopathology (WSI). The framework formulates multimodal fusion as a genomics-guided contrastive learning problem, incorporates domain-specific optimization constraints, and learns a latent shared-state representation to support inference without requiring fully paired datasets. Evaluation was conducted using public datasets, including TCGA-PRAD and TCIA, across low-risk versus higher-risk/clinically significant prostate cancer (csPCa) discrimination, Gleason-based risk stratification, and clinically significant outcome prediction tasks under realistic multimodal and missing-modality scenarios. Results: In the adequately powered Genomics + WSI cohort (n = 486), the framework achieved an AUROC of 0.985 ± 0.005 for low-risk versus higher-risk/csPCa discrimination (p < 0.001). Exploratory analysis in a small, matched Genomics + MRI cohort (n = 28) yielded an AUROC of 0.980 ± 0.006 for the same endpoint; these findings are reported descriptively with bootstrap confidence intervals due to limited sample size. Because the negative reference group consisted of low-risk prostate cancer cases rather than cancer-free controls, results are interpreted as within-cancer risk discrimination rather than de novo cancer detection. The framework achieved weighted accuracy up to 92.1%, Cohen’s κ up to 0.86, and reduced critical decision errors by 58%. Calibration remained strong (ECE 0.021–0.024), and decision-curve analysis indicated improved utility with reduced unnecessary invasive workups in retrospective modeling. Robustness analysis demonstrated AUROC degradation below 0.04 under domain shifts. Single-modality inference using genomics alone maintained AUROC > 0.90. Interpretability analysis revealed feature attributions aligned with domain-relevant genomic markers. Conclusions: The proposed framework provides a scalable and generalizable solution for heterogeneous multimodal data fusion, supporting reliable prediction, robustness to missing modalities, and applicability to complex information systems beyond the studied domain.

(This article belongs to the Section Molecular Cancer Biology)


21 pages, 2831 KB

Open Access Article

Frequency-Guided Cross-Modal Interaction for Multimodal Yeast Classification Based on Light-Scattering and Microscopy Images

by Zexi Cheng, Xiaoxuan Liu, Shamanth Shankarnarayan, Manisha Gupta, Wojciech Rozmus, Ying Yin Tsui, Daniel A. Charlebois and Mrinal Mandal

J. Imaging 2026, 12(6), 263; https://doi.org/10.3390/jimaging12060263 - 16 Jun 2026

Viewed by 203

Abstract


Accurate identification of pathogenic yeasts is essential for clinical diagnosis and effective antifungal therapy. However, current approaches predominantly rely on microscopy-based models, which require large-scale annotated datasets and exhibit limited generalization across morphologically similar species. In contrast, light-scattering (LS) imaging captures the diffraction patterns generated by internal cellular structures, providing volumetric biophysical cues that extend beyond surface morphology, yet its indirect representations pose major challenges for feature discrimination. Our objective is to develop fast and accurate methods to detect various species of yeasts. We propose FPA-YeastNet, which is a frequency-enhanced single-modality deep learning architecture that improves yeast classification in LS images by leveraging discriminative frequency-domain features. Building upon this enhanced modality, we further propose FGCA-YeastNet, a frequency-guided cross-attention network designed to integrate LS and microscopy information for complementary representation learning. The proposed multimodal model facilitates synergistic interactions between volumetric scattering structures and fine-grained cellular textures through adaptive fusion and bidirectional attention, leading to improved robustness and interpretability. Comprehensive classification experiments conducted on a multimodal yeast dataset demonstrate that FGCA-YeastNet effectively bridges the performance gap between LS and microscopy modalities, achieving significant improvements over both unimodal and multimodal baselines. The FPA-YeastNet yields an average accuracy improvement of 6.26% compared with LS-only models, and FGCA-YeastNet further provides mean gains of 19.97% and 7.67% over unimodal and multimodal baseline models, respectively. 
Experimental results demonstrate the diagnostic potential of light scattering and microscopic imaging and underscore the effectiveness of frequency-guided multimodal collaboration for reliable and interpretable yeast classification in clinical microbiology.

(This article belongs to the Section Computer Vision and Pattern Recognition)


30 pages, 21352 KB

Open Access Article

Early Visible Greenness Change in Forest Burned Areas Across Burn Severity and Mountainous Topography Using UAV RGB Imagery

by Qinyan Gu, Chao Xi, Weili Kou, Zhengshen Huang, Jiangxia Ye and Qiuhua Wang

Fire 2026, 9(6), 258; https://doi.org/10.3390/fire9060258 - 16 Jun 2026

Viewed by 279

Abstract


Understanding post-fire visible greenness change is important for assessing spatial heterogeneity in mountainous burned landscapes, but satellite observations often cannot capture local variation. This study developed a workflow using Unmanned Aerial Vehicle (UAV) Red–Green–Blue (RGB) imagery for RGB-interpreted burn severity classification and Green Leaf Index (GLI)-derived visible greenness change analysis three years after fire. The workflow integrated object-based Random Forest (RF) classification, bi-temporal GLI difference (ΔGLI) detection, and terrain-stratified analysis under RGB-only conditions. Object-based multi-feature representation, including a 41-dimensional (41D) feature set of color, texture, and gradient metrics, supported local burn severity mapping, although performance gain over the 23-dimensional (23D) set was modest and not statistically significant. The burned area was dominated by high and moderate severity classes. GLI-derived analysis showed limited visible greenness increase (mean ΔGLI = 0.0058), with slightly more than half of pixels being positive; high severity areas had higher ΔGLI, while low severity areas showed limited or negative values. ΔGLI also varied across terrain, being higher on steeper slopes, mid-to-upper elevations, and east-facing aspects. The workflow provides a practical local-scale approach for post-fire analysis using high-resolution UAV RGB imagery, with results interpreted as case-specific visible greenness patterns rather than comprehensive ecological recovery.
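For readers unfamiliar with the index, the Green Leaf Index is conventionally defined per pixel as GLI = (2G − R − B) / (2G + R + B), so it ranges from −1 to 1 and is 0 for a neutral gray pixel. The sketch below illustrates a bi-temporal difference under stated assumptions: ΔGLI is taken as the later value minus the earlier one (so positive means greener), and an all-zero pixel is mapped to 0. The function names and pixel values are hypothetical, and this is not the authors' workflow.

```python
def gli(r, g, b):
    """Green Leaf Index for one RGB pixel: (2G - R - B) / (2G + R + B)."""
    total = 2 * g + r + b
    if total == 0:
        # Assumption: an all-zero (black) pixel is treated as GLI = 0
        # to avoid division by zero.
        return 0.0
    return (2 * g - r - b) / total


def delta_gli(earlier_rgb, later_rgb):
    """Bi-temporal GLI difference.

    Assumed sign convention: later minus earlier, so a positive value
    indicates an increase in visible greenness.
    """
    return gli(*later_rgb) - gli(*earlier_rgb)


if __name__ == "__main__":
    # Made-up pixel values: a gray pixel that later turns greener.
    print(delta_gli((100, 100, 100), (100, 150, 50)))
```

In practice the same arithmetic would be applied to co-registered image arrays rather than single pixels; the per-pixel form is used here only to keep the example dependency-free.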


Search Results (877)
