Search Results (141)

Search Parameters:
Keywords = heterogeneous graph embedding

23 pages, 7986 KB  
Article
Leveraging Spot–Gene Heterogeneous Graphs for Unified Spatially Resolved Transcriptomics Domain Detection on Single-Slice and Multi-Slice Data
by Lina Xia, Zhenyue Ding, Xun Zhang, Kun Qian and Hongwei Li
Genes 2026, 17(3), 310; https://doi.org/10.3390/genes17030310 - 7 Mar 2026
Abstract
Background: Spatially resolved transcriptomics (SRT) enables simultaneous measurement of gene expression and spatial location, but the existing domain detection methods are limited by over-reliance on spot-to-spot proximity, rigid pre-alignment requirements for multi-slice datasets, and inadequate mitigation of batch effects. This study aims to develop a unified method for accurate spatial domain identification across both single-slice and multi-slice SRT datasets. Methods: We propose a novel method named spatially resolved transcriptomics heterogeneous graph contrastive learning (stHGCL), which integrates a spot–gene heterogeneous graph, a dual-stage encoder (comprising LightGCN and GCN), and a neighborhood-driven contrastive learning module. The heterogeneous graph captures high-order structural information through spot–gene connections mediated by shared genes; the dual-stage encoder refines spot embeddings by fusing gene expression and spatial location; contrastive learning enhances intra-cluster compactness and mitigates batch effects. Results: stHGCL was validated on seven benchmark datasets from platforms including 10x Visium, BaristaSeq, STARmapSeq, Slide-seq, and Stereo-seq. It outperformed nine single-slice and eight multi-slice state-of-the-art methods. It achieved the highest mean Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) scores and could accurately delineate complex spatial domains with distinct boundaries, and even achieved cross-slice spatial domain detection for unaligned multi-slice datasets. Ablation studies confirmed the effectiveness of its main modules. Conclusions: stHGCL effectively captures high-order structural and spatial information and mitigates batch effects. It provides a robust scalable solution for unified spatial domain detection in SRT, facilitating insights into the spatial domains across both single-slice and multi-slice experimental paradigms. Full article
(This article belongs to the Section Bioinformatics)
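To make the LightGCN stage of the abstract above concrete: on a spot–gene bipartite graph, one LightGCN propagation step is just neighbor averaging with symmetric degree normalization and no learned weight matrices. The sketch below is an illustrative reconstruction, not the authors' code; the function and variable names are invented, and the real stHGCL pipeline operates on learned embeddings inside a larger dual-stage encoder.

```python
import numpy as np

def lightgcn_step(incidence, spot_emb, gene_emb):
    # incidence: (n_spots, n_genes) binary matrix, 1 where a spot expresses a gene.
    # One LightGCN step: symmetric degree normalization, then plain averaging
    # of neighbor embeddings -- no nonlinearity, no learned transform.
    d_spot = np.maximum(incidence.sum(axis=1, keepdims=True), 1)  # spot degrees
    d_gene = np.maximum(incidence.sum(axis=0, keepdims=True), 1)  # gene degrees
    norm = incidence / np.sqrt(d_spot) / np.sqrt(d_gene)
    new_spot = norm @ gene_emb    # each spot aggregates its genes' embeddings
    new_gene = norm.T @ spot_emb  # each gene aggregates its spots' embeddings
    return new_spot, new_gene

# Toy graph: 3 spots, 4 genes, 2-d embeddings.
incidence = np.array([[1, 1, 0, 0],
                      [0, 1, 1, 0],
                      [0, 0, 1, 1]], dtype=float)
rng = np.random.default_rng(0)
spot_out, gene_out = lightgcn_step(incidence,
                                   rng.normal(size=(3, 2)),
                                   rng.normal(size=(4, 2)))
```

Because spots connected through shared genes exchange information in two hops, stacking such steps is what lets the heterogeneous graph capture the high-order structure the abstract describes.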

23 pages, 3612 KB  
Article
A Security Framework for Resilient Smart Grids Based on Self-Organizing Graph Neural Cellular Automata
by Rongxu Hou, Yiying Zhang, Siwei Li, Yeshen He and Pizhen Zhang
Algorithms 2026, 19(3), 195; https://doi.org/10.3390/a19030195 - 5 Mar 2026
Abstract
As smart grids evolve into complex cyber-physical systems, conventional static defenses struggle to address time-varying topologies and Advanced Persistent Threats (APTs). We propose the Security Framework for Resilient Smart Grids based on Self-Organizing Graph Neural Cellular Automata (SG-GNC). Specifically, a Neural Homeostatic Embedding (NHE) mechanism utilizes variational graph autoencoders to construct a continuous health manifold for unsupervised anomaly detection, while a Neural Cellular Automata (NCA) engine employs shared-weight local rules to empower nodes with decentralized self-healing capabilities. Finally, a Generative Adversarial Immunity (GAI) strategy facilitates active defense co-evolution, enhancing robustness against zero-day attacks. Experimental results on the IEEE 118 and 300-bus systems demonstrate an average detection accuracy of 98.23%, significantly outperforming benchmarks. In scenarios involving dynamic topology and zero-day attacks, the framework maintains over 96% accuracy with an inference latency of only 9.45 ms. These findings validate the capability of SG-GNC to provide resilient, endogenous defense in complex heterogeneous environments. Full article
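The "shared-weight local rules" of the NCA engine can be sketched in a few lines: every node runs the same update, mixing its own state with its neighbors' mean state. This is a minimal stand-in, not the SG-GNC implementation; the names and the linear blending rule are assumptions (the paper's rule is a learned neural update).

```python
import numpy as np

def nca_step(adj, states, w_self=0.6, w_neigh=0.4):
    # Every node applies the same (shared-weight) local rule: blend its own
    # state with the mean state of its graph neighbors. Decentralized
    # self-healing emerges from iterating such local updates.
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1)
    neigh_mean = (adj @ states) / deg
    return w_self * states + w_neigh * neigh_mean

# Toy path graph of 3 nodes; middle node recovers state from its neighbors.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
states = np.array([[1.0], [0.0], [1.0]])
out = nca_step(adj, states)
```

Iterating the step pulls the "damaged" middle node toward its neighbors' consensus, which is the intuition behind the decentralized self-healing claim.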

30 pages, 1138 KB  
Article
An Axiomatic Relational–Informational Framework for Emergent Geometry and Effective Spacetime
by Călin Gheorghe Buzea, Florin Nedeff, Diana Mirilă, Valentin Nedeff, Oana Rusu, Maricel Agop and Decebal Vasincu
Axioms 2026, 15(2), 154; https://doi.org/10.3390/axioms15020154 - 20 Feb 2026
Abstract
This work is axiomatic and structural in nature and is not intended as a phenomenological physical theory, but as a framework clarifying minimal informational primitives from which geometric and dynamical descriptions may emerge. We present a background-independent framework in which physical geometry, interaction-like forces, and spacetime arise as effective descriptions of constrained relational information rather than as fundamental entities. The only primitive structure is a network of degrees of freedom linked by admissible informational relations, each subject to quantifiable constraints on accessibility or flow. The motivation is to identify whether a single minimal relational primitive can account jointly for the emergence of geometry, forces, and spacetime, without presupposing a manifold, fields, or fundamental interactions. The framework is formalized using weighted relational graphs in which constraint weights encode limitations on information flow between degrees of freedom. Effective geometry is defined operationally through minimal constraint cost along relational paths, yielding an emergent metric without assuming spatial embedding. Relational evolution is modeled via a minimal configuration-space dynamics defined by local rewrite moves, and a statistical description is introduced through an informational action that governs coarse-grained response rather than serving as a fundamental dynamical law. Curvature-like observables are defined using transport-based comparisons of local accessibility structure. Within this setting, metric structure emerges from constrained relational accessibility, while curvature-like behavior arises from heterogeneity in constraint structure. Effective forces appear as entropic or informational action gradients with respect to coarse-grained control parameters that modulate relational constraints, and are interpreted as emergent responses rather than primitive interactions. 
A finite worked example explicitly demonstrates the emergence of nontrivial distance, curvature proxies, and an effective force via geodesic switching under constraint variation, without assuming fundamental spacetime, fields, or particles. The results support an interpretation in which geometry, forces, and spacetime are representational features of constrained information flow rather than fundamental elements of physical law. The framework clarifies conceptual distinctions and points of compatibility with existing approaches to emergent spacetime, and it outlines qualitative expectations for regimes in which smooth geometric descriptions are expected to break down. The work delineates the scope and limits of geometric description without proposing a complete phenomenological theory. Full article

18 pages, 2697 KB  
Article
TGCformer: A Transformer-Based Dual-Channel Fusion Framework for Power Load Anomaly Detection
by Li Xu, Shouwei Chen, Xiaoping Wu, Qu Wang, Yu Liu and Yasi Peng
Electronics 2026, 15(4), 874; https://doi.org/10.3390/electronics15040874 - 19 Feb 2026
Abstract
Existing methods for power load anomaly detection suffer from several limitations, including insufficient extraction of multi-scale temporal features, difficulty in capturing long-range dependencies, and inefficient fusion of heterogeneous Time-Graph information. To address these issues, this study proposes the TGCformer, an enhanced framework for Time-Graph feature fusion. First, a dual-channel feature extraction module is constructed. The temporal path utilizes Time Series Feature Extraction based on Scalable Hypothesis Tests (TSFresh) to enhance the explicit pattern representation of the load sequences, while the graph-learning path employs a Sparse Unified Graph Attention Network v2 (Sparse Unified GATv2) to model global semantic correlations among time steps. Together, these two paths provide more interpretable and structured inputs for the subsequent fusion module. Subsequently, a multi-head cross-attention mechanism is designed, where temporal features serve as the Query and graph-level embeddings as the Key and Value to guide the feature fusion process. This design ensures the effective integration of complementary information while suppressing noise. Experimental results on the public Irish CER Smart Meter Dataset demonstrate the effectiveness of the proposed model. Specifically, TGCformer consistently outperforms four classic deep learning baselines (XceptionTime, InceptionTime, FormerTime, and LSTM-GNN), demonstrating competitive detection accuracy and robustness. Full article
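The fusion step described above, with temporal features as the Query and graph embeddings as Key and Value, is standard scaled dot-product cross-attention. A single-head sketch under that assumption (illustrative names; TGCformer itself uses a multi-head variant inside a larger network):

```python
import numpy as np

def cross_attention(temporal, graph_emb):
    # Single-head scaled dot-product cross-attention:
    # temporal features (T, d) act as the Query,
    # graph-level embeddings (N, d) act as Key and Value.
    d = temporal.shape[1]
    scores = temporal @ graph_emb.T / np.sqrt(d)   # (T, N) similarities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ graph_emb                     # (T, d) fused features

rng = np.random.default_rng(1)
temporal = rng.normal(size=(5, 8))   # 5 time steps, 8-d temporal features
graph_emb = rng.normal(size=(7, 8))  # 7 graph nodes, 8-d embeddings
fused = cross_attention(temporal, graph_emb)
```

Each output row is a convex combination of graph embeddings, so the temporal path decides, per time step, which graph-level information to pull in; that is the noise-suppression mechanism the abstract refers to.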

28 pages, 8127 KB  
Article
CARAG: Context-Aware Retrieval-Augmented Generation for Railway Operation and Maintenance Question Answering over Spatial Knowledge Graph
by Wenkui Zheng, Mengzheng Yang, Yanfei Ren, Haoyu Wang, Chun Zeng and Yong Zhang
ISPRS Int. J. Geo-Inf. 2026, 15(2), 78; https://doi.org/10.3390/ijgi15020078 - 14 Feb 2026
Abstract
General-purpose large language models excel at open-domain question answering, but in railway operation and maintenance (O&M) scenarios they still suffer from hallucinated knowledge and poor domain adaptation. In practice, railway O&M knowledge mainly arises from two heterogeneous sources: spatio-temporal data such as train trajectories, which are organized along the spatial layout of railway lines, and domain documents such as operating rules, which exhibit varying degrees of structural regularity. Traditional retrieval-augmented generation (RAG) systems usually flatten these multi-source data into a single unstructured text space and perform global retrieval in one embedding space, which easily introduces noisy context and makes it difficult to precisely target knowledge for specific lines, sections, or equipment states. To overcome these limitations, we propose CARAG, a context-aware RAG framework tailored to railway O&M data. CARAG treats domain documents and spatial data as a unified knowledge substrate and builds a spatial knowledge graph with concept and instance levels. On top of this knowledge graph, a GraphReAct-based multi-turn interaction mechanism guides the LLM to reason and act over the concept knowledge graph, dynamically navigating to spatially and semantically relevant candidate regions, within which vector retrieval and instance-level graph retrieval are performed. Experiments show that CARAG significantly outperforms baseline RAG methods on RAGAS metrics, confirming the effectiveness of structure-guided multi-step reasoning for question answering over multi-source heterogeneous railway O&M data. Full article
(This article belongs to the Special Issue LLM4GIS: Large Language Models for GIS)

21 pages, 1511 KB  
Article
SKNet-GAT: A Novel Multi-Source Data Fusion Approach for Distribution Network State Estimation
by Huijia Liu, Chengkai Yin and Sheng Ye
Energies 2026, 19(4), 1012; https://doi.org/10.3390/en19041012 - 14 Feb 2026
Abstract
This paper tackles the growing uncertainty in distribution networks caused by distributed generation, load fluctuations, and frequent topological changes. It proposes a multi-source data fusion framework using enhanced selective convolution (SKNet) and graph attention networks (GAT). First, heterogeneous measurement data, including Phasor Measurement Unit (PMU) and Supervisory Control and Data Acquisition (SCADA) data, are processed through a unified normalization and outlier elimination technique to ensure data quality. Second, SKNet is utilized to extract spatiotemporal multi-scale features, improving the detection of both rapid disturbances and long-term trends. Third, the extracted features are fed into GAT to model node electrical couplings, while power flow residual constraints are embedded in the loss function to enforce the physical validity of the estimated states. This physics-informed design overcomes a key limitation of pure data-driven models and enables an end-to-end framework that integrates data-driven learning with physical mechanism constraints. Finally, comprehensive validation is performed on the improved IEEE 33-node and IEEE 123-node test systems. The test scenarios include Gaussian measurement noise, data outliers, missing measurements, and topological changes. The results show that the proposed method outperforms baseline models such as Multi-Scale Graph Attention Network (MS-GAT), Bidirectional Long Short-Term Memory (BiLSTM), and traditional weighted least squares (WLS). It achieves Root Mean Square Error (RMSE) reductions of up to 18% and Mean Absolute Error (MAE) reductions of up to 15%. The average inference latency is only 10–18 ms. Even under unknown topological changes, the estimation error increases by only 15–25%. These results demonstrate the superior accuracy, robustness, and real-time performance of the proposed method for intelligent distribution network state estimation. Full article
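The physics-informed design above, power flow residuals embedded in the loss, amounts to adding a penalty on equation mismatch to the usual supervised term. A hedged sketch (the residual function and weighting below are toy stand-ins, not the paper's actual power-flow formulation):

```python
import numpy as np

def physics_informed_loss(v_est, v_true, residual_fn, lam=0.1):
    # Supervised data term plus a penalty on power-flow residuals.
    # residual_fn stands in for the network's power-flow equations;
    # lam controls how strongly physical consistency is enforced.
    data_loss = np.mean((v_est - v_true) ** 2)
    phys_loss = np.mean(residual_fn(v_est) ** 2)
    return data_loss + lam * phys_loss

# Toy stand-in: "residual" is deviation from a flat 1.0 p.u. voltage profile.
toy_residual = lambda v: v - 1.0
loss = physics_informed_loss(np.array([1.02, 0.98]),
                             np.array([1.00, 1.00]),
                             toy_residual)
```

The key point is that a state estimate violating the physical equations is penalized even when it fits the measurements, which is how such designs curb the failure modes of purely data-driven models.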

34 pages, 10560 KB  
Review
Large Language Models for High-Entropy Alloys: Literature Mining, Design Orchestration, and Evaluation Standards
by Yutong Guo and Chao Yang
Metals 2026, 16(2), 162; https://doi.org/10.3390/met16020162 - 29 Jan 2026
Abstract
High-entropy alloys (HEAs) present a fundamental design paradox: their exceptional properties arise from complex, high-dimensional composition–process–microstructure–property (CPMP) relationships, yet the knowledge needed to navigate this space is fragmented across a vast and unstructured literature. Large language models (LLMs) offer a transformative interface to this complexity. By extracting structured facts from text, they can convert dispersed and heterogeneous evidence (i.e., findings scattered across many studies and reported with inconsistent test protocols or characterization standards) into queryable knowledge graphs. Through code generation and tool composition, they can automate simulation pipelines, surrogate model construction, and inverse design workflows. This review analyzes how LLMs can augment key stages of HEA research—from intelligent literature mining and multimodal data integration (using LLMs to automatically extract and structure data from texts and to combine information across text, images, and other data sources) to model-driven design and closed-loop experimentation—illustrated by emerging case studies. We propose concrete evaluation protocols that measure direct scientific utility, including knowledge-graph completeness, workflow setup efficiency, and experimental validation hit rates. We also confront practical limitations: data sparsity and noise, model hallucination, domain bias (where models may exhibit superior predictive performance for specific, well-represented alloy systems over others due to imbalances in training data), and the imperative for reproducible infrastructure. We argue that domain-specialized LLMs, embedded within grounded, verifiable research systems, can not only accelerate HEA discovery but also standardize the representation, sharing, and reuse of community knowledge. Full article

18 pages, 2183 KB  
Article
Uncovering miRNA–Disease Associations Through Graph Based Neural Network Representations
by Alessandro Orro
Biomedicines 2026, 14(2), 289; https://doi.org/10.3390/biomedicines14020289 - 28 Jan 2026
Abstract
Background: MicroRNAs (miRNAs) are an important class of non-coding RNAs that regulate gene expression by binding to target mRNAs and influencing cellular processes such as differentiation, proliferation, and apoptosis. Dysregulated miRNA expression has been implicated in many human diseases, including cancer and cardiovascular and neurodegenerative disorders. Identifying disease-related miRNAs is therefore essential for understanding disease mechanisms and supporting biomarker discovery, but the time and cost of experimental validation remain the main bottlenecks. Methods: We present a graph-based learning framework that models the complex relationships between miRNAs, diseases, and related biological entities within a heterogeneous network. The model employs a message-passing neural architecture to learn structured embeddings from multiple node and edge types, integrating biological priors from curated resources. This network representation enables the inference of novel miRNA–disease associations, even in sparsely annotated regions of the network. The approach was trained and validated on a benchmark dataset using ten replicated experiments to ensure robustness. Results: The method achieved an average AUC–ROC of ~98%, outperforming previously reported computational approaches on the same dataset. Predictions were consistent across validation folds, and robustness analyses confirmed stability and highlighted the most informative inputs. Conclusions: Integrating heterogeneous biological information through graph neural network representation learning offers a powerful and generalizable way to predict relevant associations, including miRNA–disease associations, and provides a robust computational framework to support biomedical discovery and translational research. Full article
(This article belongs to the Special Issue Bioinformatics Analysis of RNA for Human Health and Disease)

34 pages, 1418 KB  
Article
Hybrid Dual-Context Prompted Cross-Attention Framework with Language Model Guidance for Multi-Label Prediction of Human Off-Target Ligand–Protein Interactions
by Abdullah, Zulaikha Fatima, Muhammad Ateeb Ather, Liliana Chanona-Hernandez and José Luis Oropeza Rodríguez
Int. J. Mol. Sci. 2026, 27(2), 1126; https://doi.org/10.3390/ijms27021126 - 22 Jan 2026
Abstract
Accurately identifying drug off-targets is essential for reducing toxicity and improving the success rate of pharmaceutical discovery pipelines. However, current deep learning approaches often struggle to fuse chemical structure, protein biology, and multi-target context. Here, we introduce HDPC-LGT (Hybrid Dual-Prompt Cross-Attention Ligand–Protein Graph Transformer), a framework designed to predict ligand binding across sixteen human translation-related proteins clinically associated with antibiotic toxicity. HDPC-LGT combines graph-based chemical reasoning with protein language model embeddings and structural priors to capture biologically meaningful ligand–protein interactions. The model was trained on 216,482 experimentally validated ligand–protein pairs from the Chemical Database of Bioactive Molecules (ChEMBL) and the Protein–Ligand Binding Database (BindingDB) and evaluated using scaffold-level, protein-level, and combined holdout strategies. HDPC-LGT achieves a macro receiver operating characteristic–area under the curve (macro ROC–AUC) of 0.996 and a micro F1-score (micro F1) of 0.989, outperforming Deep Drug–Target Affinity Model (DeepDTA), Graph-based Drug–Target Affinity Model (GraphDTA), Molecule–Protein Interaction Transformer (MolTrans), Cross-Attention Transformer for Drug–Target Interaction (CAT–DTI), and Heterogeneous Graph Transformer for Drug–Target Affinity (HGT–DTA) by 3–7%. External validation using the Papyrus universal bioactivity resource (Papyrus), the Protein Data Bank binding subset (PDBbind), and the benchmark Yamanishi dataset confirms strong generalisation to unseen chemotypes and proteins. HDPC-LGT also provides biologically interpretable outputs: cross-attention maps, Integrated Gradients (IG), and Gradient-weighted Class Activation Mapping (Grad-CAM) highlight catalytic residues in aminoacyl-tRNA synthetases (aaRSs), ribosomal tunnel regions, and pharmacophoric interaction patterns, aligning with known biochemical mechanisms. 
By integrating multimodal biochemical information with deep learning, HDPC-LGT offers a practical tool for off-target toxicity prediction, structure-based lead optimisation, and polypharmacology research, with potential applications in antibiotic development, safety profiling, and rational compound redesign. Full article
(This article belongs to the Section Molecular Informatics)

18 pages, 2210 KB  
Article
SPINET-KSP: A Multi-Modal LLM-Graph Foundation Model for Contextual Prediction of Kinase-Substrate-Phosphatase Triads
by Michael Olaolu Arowolo, Marian Emmanuel Okon, Davis Austria, Muhammad Azam and Sulaiman Olaniyi Abdulsalam
Kinases Phosphatases 2026, 4(1), 3; https://doi.org/10.3390/kinasesphosphatases4010003 - 22 Jan 2026
Abstract
Reversible protein phosphorylation is an important regulatory mechanism in cellular signalling and disease, governed by the opposing actions of kinases and phosphatases. Most existing computational methods predict kinase–substrate or phosphatase–substrate interactions in isolation and lack specificity for biological conditions, neglecting triadic regulation. We present SPINET-KSP, a multi-modal LLM–Graph foundation model engineered for context-aware prediction of kinase–substrate–phosphatase (KSP) triads. SPINET-KSP integrates high-confidence interactomes (SIGNOR, BioGRID, STRING), structural contacts obtained from AlphaFold3, ESM-3 sequence embeddings, and a 512-dimensional cell-state manifold covering 1612 quantitative phosphoproteomic conditions. A heterogeneous KSP graph is encoded by a cross-attention Graphormer with Reversible Triad Attention to model kinase–phosphatase antagonism. SPINET-KSP, pre-trained on 3.41 million validated phospho-sites using masked phosphorylation modelling and contrastive cell-state learning, achieves an AUROC of 0.852 for kinase-family classification (sensitivity 0.821, specificity 0.834, MCC 0.655) and a Pearson correlation coefficient of 0.712 for phospho-occupancy prediction. On independent 2025 mass spectrometry datasets, it recovers 72% of known cancer-resistance triads within the top 10 rankings and uncovers 247 additional triads validated by orthogonal proteomics. SPINET-KSP is the first foundation model for simulating context-dependent reversible phosphorylation, enabling the targeting of dysregulated kinase–phosphatase pathways in disease. Full article

22 pages, 795 KB  
Article
HIEA: Hierarchical Inference for Entity Alignment with Collaboration of Instruction-Tuned Large Language Models and Small Models
by Xinchen Shi, Zhenyu Han and Bin Li
Electronics 2026, 15(2), 421; https://doi.org/10.3390/electronics15020421 - 18 Jan 2026
Abstract
Entity alignment (EA) facilitates knowledge fusion by matching semantically identical entities in distinct knowledge graphs (KGs). Existing embedding-based methods rely solely on intrinsic KG facts and often struggle with long-tail entities due to insufficient information. Recently, large language models (LLMs), empowered by rich background knowledge and strong reasoning abilities, have shown promise for EA. However, most current LLM-enhanced approaches follow the in-context learning paradigm, requiring multi-round interactions with carefully designed prompts to perform additional auxiliary operations, which leads to substantial computational overhead. Moreover, they fail to fully exploit the complementary strengths of embedding-based small models and LLMs. To address these limitations, we propose HIEA, a novel hierarchical inference framework for entity alignment. By instruction-tuning a generative LLM with a unified and concise prompt and a knowledge adapter, HIEA produces alignment results with a single LLM invocation. Meanwhile, embedding-based small models not only generate candidate entities but also support the LLM through data augmentation and certainty-aware source entity classification, fostering deeper collaboration between small models and LLMs. Extensive experiments on both standard and highly heterogeneous benchmarks demonstrate that HIEA consistently outperforms existing embedding-based and LLM-enhanced methods, achieving absolute Hits@1 improvements of up to 5.6%, while significantly reducing inference cost. Full article
(This article belongs to the Special Issue AI-Powered Natural Language Processing Applications)
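The division of labor described above, where embedding-based small models generate candidate entities for the LLM to adjudicate, typically starts from a nearest-neighbor search in embedding space. A minimal sketch of that candidate-generation step (hypothetical names; HIEA's actual small models and scoring are more elaborate):

```python
import numpy as np

def top_k_candidates(src_emb, tgt_embs, k=3):
    # Rank target-KG entities by cosine similarity to a source entity
    # embedding; the resulting short list is what the instruction-tuned
    # LLM would then resolve in a single invocation.
    src = src_emb / np.linalg.norm(src_emb)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sims = tgt @ src
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy target KG with three entity embeddings.
tgt_embs = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.9, 0.1]])
idx, sims = top_k_candidates(np.array([1.0, 0.1]), tgt_embs, k=2)
```

Pre-filtering to k candidates is what keeps the LLM stage to one concise prompt instead of the multi-round interactions the abstract criticizes.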

23 pages, 13094 KB  
Article
PDR-STGCN: An Enhanced STGCN with Multi-Scale Periodic Fusion and a Dynamic Relational Graph for Traffic Forecasting
by Jie Hu, Bingbing Tang, Langsha Zhu, Yiting Li, Jianjun Hu and Guanci Yang
Systems 2026, 14(1), 102; https://doi.org/10.3390/systems14010102 - 18 Jan 2026
Abstract
Accurate traffic flow prediction is a core component of intelligent transportation systems, supporting proactive traffic management, resource optimization, and sustainable urban mobility. However, urban traffic networks exhibit heterogeneous multi-scale periodic patterns and time-varying spatial interactions among road segments, which are not sufficiently captured by many existing spatio-temporal forecasting models. To address this limitation, this paper proposes PDR-STGCN (Periodicity-Aware Dynamic Relational Spatio-Temporal Graph Convolutional Network), an enhanced STGCN framework that jointly models multi-scale periodicity and dynamically evolving spatial dependencies for traffic flow prediction. Specifically, a periodicity-aware embedding module is designed to capture heterogeneous temporal cycles (e.g., daily and weekly patterns) and emphasize dominant social rhythms in traffic systems. In addition, a dynamic relational graph construction module adaptively learns time-varying spatial interactions among road nodes, enabling the model to reflect evolving traffic states. Spatio-temporal feature fusion and prediction are achieved through an attention-based Bidirectional Long Short-Term Memory (BiLSTM) network integrated with graph convolution operations. Extensive experiments are conducted on three datasets, including Metro Traffic Los Angeles (METR-LA), Performance Measurement System Bay Area (PEMS-BAY), and a real-world traffic dataset from Guizhou, China. Experimental results demonstrate that PDR-STGCN consistently outperforms state-of-the-art baseline models. For next-hour traffic forecasting, the proposed model achieves average reductions of 16.50% in RMSE, 9.00% in MAE, and 0.34% in MAPE compared with the second-best baseline. 
Beyond improved prediction accuracy, PDR-STGCN reveals latent spatio-temporal evolution patterns and dynamic interaction mechanisms, providing interpretable insights for traffic system analysis, simulation, and AI-driven decision-making in urban transportation networks. Full article
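A common way to realize the periodicity-aware embedding described above is to encode each timestamp with sine/cosine pairs at the daily and weekly periods, so both cycles are continuous and wrap-around-safe. This is a generic sketch of that idea, not PDR-STGCN's learned embedding module; the function name and the hours-since-Monday convention are assumptions.

```python
import math

def periodic_features(t_hours):
    # Encode a timestamp (hours since Monday 00:00) as daily and weekly
    # sine/cosine pairs, so downstream layers can pick up both cycles
    # without a discontinuity at midnight or at the week boundary.
    day_phase = 2 * math.pi * (t_hours % 24) / 24
    week_phase = 2 * math.pi * (t_hours % 168) / 168
    return [math.sin(day_phase), math.cos(day_phase),
            math.sin(week_phase), math.cos(week_phase)]

feats = periodic_features(36.0)  # Tuesday 12:00
```

Concatenating such features to each node's input is a lightweight way to expose the "dominant social rhythms" of traffic to the spatio-temporal model.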

33 pages, 465 KB  
Article
A Multi-Stage NLP Framework for Knowledge Discovery from Crop Disease Research Literature
by Jantima Polpinij, Manasawee Kaenampornpan, Christopher S. G. Khoo, Wei-Ning Cheng and Bancha Luaphol
Mathematics 2026, 14(2), 299; https://doi.org/10.3390/math14020299 - 14 Jan 2026
Viewed by 446
Abstract
Extracting and organizing knowledge from the agricultural crop disease research literature are challenging tasks because of heterogeneous terminologies, complicated symptom descriptions, and the unstructured nature of scientific documents. In this study, we developed a multi-stage natural language processing (NLP) pipeline to automate knowledge extraction, organization, and integration from the agricultural research literature into a domain-consistent crop disease knowledge graph. The pipeline combines transformer-based sentence embeddings with variational deep clustering to extract topics, which are further refined via facet-aware relevance scoring to select the sentences included in the summary. Lexicon-guided named entity recognition enables precise identification and normalization of terms for crops, diseases, symptoms, etc. Relation extraction based on a combination of lexical, semantic, and contextual features generates meaningful triplets for the knowledge graph. The experimental results show that the method yielded consistently good results at each stage of the knowledge extraction process. Among the combinations of embedding and deep clustering methods, SciBERT + VaDE achieved the best clustering results. The extraction of representative sentences for disease symptoms, control/treatment, and prevention achieved high F1-scores of around 0.8. The resulting knowledge graph has high node coverage and relation completeness, as well as high precision and recall in triplet generation. The multi-stage NLP pipeline effectively converts unstructured agricultural research texts into a coherent and semantically rich knowledge graph, providing a basis for further research in crop disease analysis, knowledge retrieval, and data-driven decision support in agricultural informatics. Full article
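The lexicon-guided step from entity mentions to knowledge-graph triplets can be illustrated with a minimal sketch. The lexicons, the `affected_by` relation label, and the co-occurrence heuristic below are illustrative assumptions, not the paper's actual feature-based relation extractor:

```python
import re

# Hypothetical lexicon entries; a real system would load curated term lists.
CROP_LEXICON = {"rice", "maize", "wheat"}
DISEASE_LEXICON = {"blast", "rust", "smut"}

def extract_triplets(sentence):
    """Naive lexicon-guided extraction: pair every crop with every disease
    mentioned in the same sentence as (crop, 'affected_by', disease)."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    crops = tokens & CROP_LEXICON
    diseases = tokens & DISEASE_LEXICON
    return [(c, "affected_by", d) for c in sorted(crops) for d in sorted(diseases)]
```

Each emitted triplet becomes an edge in the knowledge graph; the paper's pipeline additionally scores candidate relations with lexical, semantic, and contextual features rather than assuming every co-occurrence holds.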

16 pages, 7621 KB  
Article
Weighted Sampling Enclosing Subgraphs-Based Link Prediction in Attributed Graphs
by Ganglin Hu
Information 2026, 17(1), 66; https://doi.org/10.3390/info17010066 - 11 Jan 2026
Viewed by 227
Abstract
Link prediction is a fundamental problem for graphs, as it can reveal potential relationships between users. Graph embedding can encode graph structural relations and heterogeneous attribute features in a continuous vector space, which is effective for link prediction. However, graph embedding methods for large-scale graphs suffer from high computation and space costs, and sampling enclosing subgraphs is a practical yet efficient way to obtain the most features at the least cost. Nevertheless, existing sampling techniques may lose essential features when the number of randomly sampled nodes is small, because node features are assumed to follow a uniform distribution. In this paper, we propose a novel large-scale graph sampling strategy for link prediction, named Weighted Sampling Enclosing subgraphs-based Link prediction (WSEL), to resolve this issue; it maximally preserves the structural and attribute features of enclosing subgraphs with less sampling. More specifically, we first extract the feature importance of each node in an enclosing subgraph and take this importance as the node weight. Then, random walk node sequences are obtained through multiple weighted random walks from a target pair of nodes, generating a weighted sampling of enclosing subgraphs. By leveraging the weighted sampling of enclosing subgraphs, WSEL can scale to larger graphs with much less overhead while maintaining the essential information of the original graph. Experiments on real-world datasets demonstrate that our model can scale to larger graphs while maintaining competitive link prediction performance at substantially reduced computational cost. Full article
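The core sampling idea, a random walk that favors high-importance neighbors, can be sketched as follows. The adjacency-dict representation and function name are illustrative assumptions; the paper's importance scores would supply the `weights` values:

```python
import random

def weighted_random_walk(adj, weights, start, length, rng=None):
    """One walk of up to `length` nodes where each step picks a neighbor
    with probability proportional to its importance weight."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj.get(walk[-1], [])
        if not nbrs:               # dead end: stop early
            break
        w = [weights[n] for n in nbrs]
        walk.append(rng.choices(nbrs, weights=w, k=1)[0])
    return walk
```

Running several such walks from both endpoints of a target node pair and taking the union of visited nodes yields a weighted sample of the enclosing subgraph, biased toward the features that matter most.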

24 pages, 1916 KB  
Article
ServiceGraph-FM: A Graph-Based Model with Temporal Relational Diffusion for Root-Cause Analysis in Large-Scale Payment Service Systems
by Zhuoqi Zeng and Mengjie Zhou
Mathematics 2026, 14(2), 236; https://doi.org/10.3390/math14020236 - 8 Jan 2026
Viewed by 410
Abstract
Root-cause analysis (RCA) in large-scale microservice-based payment systems is challenging due to complex failure propagation along service dependencies, limited availability of labeled incident data, and heterogeneous service topologies across deployments. We propose ServiceGraph-FM, a pretrained graph-based model for RCA, where “foundation” denotes a self-supervised graph encoder pretrained on large-scale production cluster traces and then adapted to downstream diagnosis. ServiceGraph-FM introduces three components: (1) masked graph autoencoding pretraining to learn transferable service-dependency embeddings for cross-topology generalization; (2) a temporal relational diffusion module that models anomaly propagation as graph diffusion on dynamic service graphs (i.e., Laplacian-governed information flow with learnable edge propagation strengths); and (3) a causal attention mechanism that leverages multi-hop path signals to better separate likely causes from correlated downstream effects. Experiments on the Alibaba Cluster Trace and synthetic PayPal-style topologies show that ServiceGraph-FM outperforms state-of-the-art baselines, improving Top-1 accuracy by 23.7% and Top-3 accuracy by 18.4% on average, and reducing mean time to detection by 31.2%. In zero-shot deployment on unseen architectures, the pretrained model retains 78.3% of its fully fine-tuned performance, indicating strong transferability for practical incident management. Full article
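Laplacian-governed anomaly propagation of the kind the diffusion module models can be sketched in one explicit update step. This is a minimal illustration with a fixed diffusion rate `alpha`, not the paper's learnable edge propagation strengths:

```python
import numpy as np

def diffusion_step(adj, scores, alpha=0.2):
    """One step of graph diffusion: anomaly scores flow along edges under
    the random-walk Laplacian, x <- x - alpha * L_rw @ x."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0              # avoid division by zero on isolated nodes
    P = adj / deg[:, None]           # row-stochastic transition matrix
    L_rw = np.eye(len(adj)) - P      # random-walk Laplacian
    return scores - alpha * (L_rw @ scores)
```

Iterating this step smooths an initial anomaly spike across the service-dependency graph; a learnable version would replace the uniform `alpha` with per-edge propagation strengths fitted from incident data.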
(This article belongs to the Section E1: Mathematics and Computer Science)
