MDPI - Publisher of Open Access Journals

57 pages, 5985 KB

Open Access Review

Mathematical Framework for Explainable Vehicle Systems Integrating Graph-Theoretic Road Geometry and Constrained Optimization

by Asif Mehmood and Faisal Mehmood

Mathematics 2026, 14(10), 1710; https://doi.org/10.3390/math14101710 (registering DOI) - 15 May 2026

Abstract

Deep learning models are widely used in autonomous vehicle systems for perception, localization, and decision-making. However, their lack of transparency poses significant challenges in safety-critical environments. This systematic review presents a unified mathematical framework for explainable deep learning that integrates multimodal inputs, graph-theoretic road geometry, uncertainty modeling, and intrinsically interpretable representations. Road-structured priors, including lane topology and spatial constraints, are incorporated into learning and optimization so that model predictions and explanations remain physically and semantically grounded. The review synthesizes methods across saliency-based, concept-based, causal, and intrinsic explainability and extends them to vision-language models, enabling language-grounded, human-interpretable reasoning in autonomous vehicle systems. While vision-language models offer a new paradigm for semantic explainability, their limitations, such as hallucinations, misgrounding, and reduced reliability under distribution shifts, are also critically examined. Beyond the role of road priors in improving alignment and robustness, another key contribution of this work is a set of quantitative evaluation metrics for road-aware explainability that link explanations to spatial consistency, uncertainty alignment, and graph-structured reasoning. The overall framework connects latent representations, predictions, and explanations within a single formulation, enabling systematic comparison and analysis across models. Based on a PRISMA-guided review of 164 studies, this research identifies gaps in real-world reliability, temporal reasoning, and standardized evaluation, and outlines future directions including human-in-the-loop systems, regulatory readiness, and language-based auditing. Overall, this study advances a mathematically grounded and road-aware perspective on explainable vehicle AI that helps bridge the gap between high-performance models and transparent, trustworthy autonomous systems. Full article

(This article belongs to the Special Issue Applications of Deep Learning and Convolutional Neural Network)

19 pages, 8217 KB

Open Access Article

A GIN-Based Pre-Identification Method for Dominant Flow Channels in Connection-Element Reservoirs: An Optimized Ant Colony Algorithm Search Scheme

by Zihao Zheng, Siying Chen, Fulin An, Shengquan Yu, Haotong Guo, Ze Du, Hua Xiang and Yunfeng Xu

Processes 2026, 14(10), 1605; https://doi.org/10.3390/pr14101605 - 15 May 2026

Abstract

Dominant flow channels formed during the late stages of waterflooding can severely reduce sweep efficiency and intensify ineffective interwell circulation. Conventional identification approaches, including tracer testing, well testing, and numerical simulation, often suffer from high operational cost, long execution time, or limited adaptability to heterogeneous interwell connectivity. Although ant colony optimization (ACO) is suitable for path-search problems in reservoir networks, its performance depends strongly on hyperparameter settings, and sample-by-sample parameter tuning introduces substantial online computational overhead. This study proposes a structure-informed GIN–ACO framework for adaptive dominant flow channel identification in connection-element reservoir graphs. A physics-constrained benchmark model is first established using Darcy’s law and the connection element method to provide reference flow paths. A geometry-based surrogate model is then developed to approximate flow splitting coefficients efficiently while preserving the main physical trends. Based on graph topology and geometric descriptors, a graph isomorphism network is trained to predict task-specific ACO parameters, replacing iterative online search with direct parameter inference. Experiments on 1000 synthetic reservoir graphs show that the proposed method achieves a 100% success rate with an average online computation time of 143.5 ms, outperforming fixed-parameter ACO, PSO-ACO, and BO-ACO. On 20 semi-realistic SPE10 reservoir models, GIN–ACO achieves a success rate of 92 ± 1% with an average runtime of 160.3 ± 5 ms. Ablation studies further confirm that graph-structure learning, combined topology–geometry features, and GIN-based parameter prediction are essential for robust performance. The proposed framework provides a promising and computationally efficient route for structure-aware dominant channel identification in connection-element reservoir models. Full article

(This article belongs to the Section AI-Enabled Process Engineering)
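
The abstract notes that ACO performance depends strongly on its hyperparameter settings. As a rough illustration of why, the sketch below implements the standard Ant System transition rule, in which the probability of choosing a link is proportional to pheromone raised to `alpha` times heuristic desirability raised to `beta`. It is an assumption that `alpha` and `beta` are among the parameters the GIN predicts (the abstract does not list them), and the node names and values are hypothetical.

```python
def transition_probabilities(pheromone, heuristic, alpha, beta):
    """Standard Ant System rule: p_j is proportional to tau_j**alpha * eta_j**beta.

    pheromone, heuristic: dicts mapping each candidate next node to a positive value.
    alpha, beta: ACO hyperparameters (assumed here to come from some predictor).
    """
    weights = {
        node: (pheromone[node] ** alpha) * (heuristic[node] ** beta)
        for node in pheromone
    }
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Hypothetical node with three outgoing connections.
tau = {"w1": 1.0, "w2": 2.0, "w3": 1.0}   # pheromone levels
eta = {"w1": 0.5, "w2": 0.5, "w3": 1.0}   # heuristic desirability

# beta = 0 ignores the heuristic, so the pheromone-rich link w2 dominates.
probs_low_beta = transition_probabilities(tau, eta, alpha=1.0, beta=0.0)
# A large beta shifts the choice toward the heuristically best link w3.
probs_high_beta = transition_probabilities(tau, eta, alpha=1.0, beta=3.0)
```

The same graph yields a different preferred link under the two settings, which is the kind of sensitivity that motivates predicting parameters per graph rather than fixing them.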

30 pages, 1991 KB

Open Access Article

Query-Driven Candidate Relation Screening for Scene Graph-Based Visual Relation Retrieval

by Wan Wang, Ke Wang and Huiqin Wang

Appl. Sci. 2026, 16(10), 4947; https://doi.org/10.3390/app16104947 (registering DOI) - 15 May 2026

Abstract

Scene graph generation (SGG) provides a structured representation for visual understanding. However, most existing methods are designed to optimize global triplet recall rather than retrieve relation instances specified by a user query. In query-driven visual relation retrieval, two major challenges arise: the target relation must compete with a highly redundant candidate space, and query semantics are not incorporated before relation classification. To address these challenges, we propose a Query-Driven Candidate Relation Screening (QCRS) module, which injects query semantics into the candidate screening process. Specifically, QCRS encodes the query and candidate visual relation features, and then filters query-relevant candidates through relevance scoring. By reducing interference from irrelevant candidates and avoiding redundant computation, QCRS improves the final exact triplet hit performance and enhances the interpretability of query-specific relations, thereby facilitating query-driven visual relation retrieval. Built upon the strong EGTR baseline, QCRS learns query relevance to prioritize relation instances matching the target query, enabling precise triplet retrieval. Extensive ablation studies and analyses on the VG150 benchmark validate the effectiveness of the proposed approach: when integrated with EGTR, QCRS improves PairR@50 from 61.52% to 80.06% and ETR@50 from 30.54% to 47.07%, achieving absolute gains of over 16 percentage points in both correct object pair retention and end-to-end target relation retrieval performance. Full article
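
The screening step described above (score each candidate relation against the query, then keep only the relevant ones) can be sketched as a top-k filter. This is a minimal illustration, not the QCRS module itself: cosine similarity stands in for the learned relevance scorer, and the feature vectors and relation names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def screen_candidates(query_vec, candidates, k):
    """Return the names of the k candidates most similar to the query.

    candidates: dict mapping a candidate relation name to its feature vector.
    Cosine similarity is an assumed stand-in for a learned relevance score.
    """
    scored = [(cosine(query_vec, feat), name) for name, feat in candidates.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical query embedding and candidate relation features.
query = [1.0, 0.0]
candidates = {
    "person-riding-horse": [0.9, 0.1],
    "person-near-tree": [0.1, 0.9],
    "horse-on-grass": [0.5, 0.5],
}
kept = screen_candidates(query, candidates, k=2)
```

Only the retained candidates would be passed on to relation classification, which is where the reduction in redundant computation comes from.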

26 pages, 3343 KB

Open Access Article

Graph Sampling Contrastive Self-Supervised Graph Neural Network for Network Traffic Anomaly Detection

by Min Yang and Caiming Liu

Electronics 2026, 15(10), 2119; https://doi.org/10.3390/electronics15102119 - 15 May 2026

Abstract

With the increasing scale and complexity of network traffic, anomaly detection faces significant challenges, particularly under the scarcity of labeled data in real-world environments. Although graph neural networks (GNNs) effectively model relational structures, most existing approaches rely on supervised learning, limiting their applicability in weakly labeled or unlabeled scenarios. To address these limitations, this paper proposes a self-supervised graph neural network framework, termed EGSCA, for network traffic anomaly detection. The framework employs a GNN to jointly model node and edge information, enabling the learning of discriminative representations. On this basis, a graph contrastive learning strategy is designed, where diverse subgraphs are generated via breadth-first search (BFS) to effectively capture local structural patterns. Meanwhile, a hybrid contrastive loss based on Wasserstein distance and Gromov–Wasserstein distance is introduced to achieve collaborative optimization between feature-space alignment and structural consistency under unlabeled conditions. Experimental results on multiple benchmark datasets demonstrate that the proposed method achieves competitive performance. Notably, it achieves the best results on datasets NF-BoT-IoT and NF-BoT-IoT-v2, with average improvements of approximately 3.2% in F1-score and 1.7% in DR over the strongest baseline. Further analysis indicates that the model yields more pronounced performance gains in scenarios with high class separability. Full article

(This article belongs to the Special Issue AI in Cybersecurity, 3rd Edition)
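
The abstract mentions generating subgraphs via breadth-first search. A minimal sketch of size-bounded BFS sampling is given below; the adjacency-list representation, the `max_nodes` cap, and the toy graph are assumptions made for illustration, not details taken from EGSCA.

```python
from collections import deque


def bfs_subgraph(adjacency, start, max_nodes):
    """Collect up to max_nodes nodes in BFS order from start, plus the induced edges.

    adjacency: dict mapping each node to a list of neighbours (undirected graph).
    """
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        for nbr in adjacency.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                visited.append(nbr)
                queue.append(nbr)
                if len(visited) == max_nodes:
                    break
    nodes = set(visited)
    # Keep each undirected edge once, using node ordering to deduplicate.
    edges = [
        (u, v)
        for u in visited
        for v in adjacency.get(u, [])
        if v in nodes and u < v
    ]
    return visited, edges


# Hypothetical flow graph: a-b, a-c, b-d, d-e.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
sub_nodes, sub_edges = bfs_subgraph(graph, "a", max_nodes=3)
```

Sampling from different start nodes produces different local views of the same graph, which is what a contrastive objective needs as input.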

25 pages, 1542 KB

Open Access Article

Machine Learning Integration of In-Silico QSAR, Graph Neural Networks and Docking Reveal Natural Products Inhibitors Against Mycobacterium tuberculosis

by Sakthidhasan Periasamy, Rajesh Ramasamy, Rajasekar Chinnaiyan and Arun Sridhar

Sci. Pharm. 2026, 94(2), 39; https://doi.org/10.3390/scipharm94020039 - 14 May 2026

Abstract

Background/Objectives: Tuberculosis (TB), caused by Mycobacterium tuberculosis, remains a major global health challenge, exacerbated by the emergence of multidrug-resistant strains and limited efficacy of existing therapies. Given the involvement of multiple essential mycobacterial proteins, multitarget drug discovery represents a rational therapeutic strategy. Methods: In this study, an integrated in silico pipeline combining machine learning–based quantitative structure–activity relationship modeling, graph neural network–driven drug–target affinity prediction, molecular docking, molecular dynamics (MD) simulations, and pharmacokinetic–toxicity profiling was employed to identify potential antitubercular leads from natural products. Results: A curated library of over 0.69 million compounds from the COCONUT database was systematically screened against seven essential M. tuberculosis protein targets. Machine learning and heterogeneous graph neural network models effectively captured complex ligand–protein interaction patterns, enabling high-confidence multitarget prioritization. Structure-based docking and MM-GBSA analyses revealed favorable binding affinities, further supported by 100 ns Molecular Dynamics simulations demonstrating stable binding and conformational integrity. In silico ADMET and toxicity predictions identified pharmacokinetically balanced candidates, while density functional theory calculations corroborated favorable electronic properties. Conclusion: Notably, a myricetin-based flavonoid glycoside exhibited consistent multitarget binding and dynamic stability across all targets. Overall, this study underscores the potential of integrated artificial intelligence and structure-based approaches in accelerating natural product-based antitubercular drug discovery and supports further experimental validation of prioritized leads. Full article

22 pages, 447 KB

Open Access Article

Graph-Contrastive Pretraining for Payload-Free Encrypted-Traffic Intrusion Detection: Cross-Dataset OOD Transfer with Frozen Artifacts

by Miguel Arcos-Argudo, Rodolfo Bojorque and David Galarza-García

Algorithms 2026, 19(5), 389; https://doi.org/10.3390/a19050389 - 13 May 2026

Viewed by 4

Abstract

Encrypted transport increasingly limits the visibility required by intrusion detection systems (IDS), motivating payload-free learning from flow statistics and protocol metadata. We introduce GCP, a graph-contrastive pretraining framework that casts flows as nodes in a sparse graph and learns transferable node embeddings via an InfoNCE-style objective with graph-specific augmentations. The learned encoder is evaluated through frozen-embedding linear probing and cross-dataset out-of-domain (OOD) transfer, within a fully scripted pipeline that freezes run manifests and artifacts to make every reported number traceable and reproducible. Experiments cover enterprise IDS and encrypted DNS/DoH traffic using CICIDS2017, UNSW-NB15, and DoH-Combined at three label granularities (L1/L2/L3), for both binary detection ($y$) and finer-grained targets ($y_{multi}$), aggregated over five fixed split seeds with 95% confidence intervals. Results show that GCP yields a pronounced in-domain advantage on UNSW-NB15 for $y$ (Macro-F1 $\approx 0.993$) while substantially reducing false-alarm rate (FAR $\approx 0.013$) compared with strong tabular baselines. In feature-separable regimes (CICIDS2017 and DoH L1/L2), boosted-tree and supervised baselines remain difficult to surpass, but ablations confirm that graph structure alone is insufficient without contrastive pretraining. OOD transfer is strongly source–target dependent, with the most reliable transfer within closely related DoH domains, highlighting dataset shift as a first-class evaluation criterion for encrypted-traffic IDS. Full article

(This article belongs to the Special Issue Scalable Algorithms for Large-Scale Graph Neural Networks)
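
For readers unfamiliar with the InfoNCE-style objective mentioned in the abstract, the following is a minimal pure-Python sketch of the generic loss. It assumes two augmented views of the same nodes, with row i of each view forming the positive pair and all other rows acting as negatives. The temperature value, the function names, and the toy embeddings are illustrative assumptions, not GCP's actual configuration.

```python
import math


def _normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosines."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; row i of view_a and row i of view_b are the positive pair."""
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    losses = []
    for i, anchor in enumerate(a):
        logits = [
            sum(x * y for x, y in zip(anchor, other)) / temperature for other in b
        ]
        # Numerically stable log-sum-exp over positive and negatives.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical embeddings of two flows under two augmentations.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # each row matches its partner
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # partners swapped

aligned_loss = info_nce(view_a, aligned)
misaligned_loss = info_nce(view_a, misaligned)
```

The loss is lower when each embedding is closest to its own augmented partner, which is the property pretraining optimizes before the encoder is frozen for linear probing.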

22 pages, 3368 KB

Open Access Article

QGKM: A Quantum Fidelity-Based Graph Clustering Framework for Robust Data Pattern Recognition in Education Social Networks

by Neal N. Xiong, Weiqing Long, Dacheng He, Xiangwei Meng, Zulong Diao, Sergey M. Avdoshin and Yevgeni Koucheryavy

Algorithms 2026, 19(5), 386; https://doi.org/10.3390/a19050386 - 13 May 2026

Viewed by 59

Abstract

In the era of data-driven education, educational social networks generate large volumes of high-dimensional and complex-structured data through learner interactions, collaborative activities, and resource-sharing behaviors, posing significant challenges to traditional unsupervised learning methods. Such data often exhibit non-convex distributions, heterogeneity, and noise sensitivity, making conventional clustering approaches insufficient for capturing their intrinsic structural relationships. To address this issue, this paper proposes Quantum Fidelity-Based Graph K-Means (QGKM), a clustering framework for robust pattern recognition in educational social networks. Specifically, QGKM employs quantum state encoding to map complex educational data into a quantum state space and utilizes quantum fidelity as a similarity metric to uncover latent correlations that Euclidean distance cannot effectively capture. In addition, the incorporation of k-nearest neighbor graphs preserves the local geometric structure of learner interaction networks, while a deterministic greedy hierarchical merging strategy eliminates the instability caused by random initialization. Experimental results on seven real-world datasets demonstrate that QGKM consistently outperforms classical K-Means in clustering accuracy. The proposed framework provides an effective solution for learning pattern discovery, learner profiling, and intelligent recommendation in digital education environments. Full article

(This article belongs to the Special Issue Artificial Intelligence in Education: Innovations and Implications)
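
To illustrate how a fidelity-based similarity differs from Euclidean distance, the sketch below assumes amplitude encoding of real-valued feature vectors into pure states, for which fidelity reduces to the squared inner product of the normalized vectors. The abstract does not specify QGKM's encoding, so both the encoding and the function names are assumptions.

```python
import math


def amplitude_encode(x):
    """Normalize a real vector to unit length (assumed amplitude encoding)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return [v / norm for v in x]


def fidelity(x, y):
    """Fidelity of two pure states with real amplitudes: squared overlap."""
    psi, phi = amplitude_encode(x), amplitude_encode(y)
    overlap = sum(a * b for a, b in zip(psi, phi))
    return overlap ** 2


same_direction = fidelity([1.0, 0.0], [2.0, 0.0])   # differ only in scale
orthogonal = fidelity([1.0, 0.0], [0.0, 1.0])
partial = fidelity([1.0, 1.0], [1.0, 0.0])
```

Under this encoding, two vectors that differ only in magnitude have fidelity 1 even though their Euclidean distance is non-zero, one simple way in which the two measures can disagree.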

29 pages, 12420 KB

Open Access Article

A Dueling DQN-Based Hyper-Heuristic Framework for Learning Path Optimization

by Yong-Wei Zhang, Ming-Yang Zhu, Wen-Kai Xia, Xin-Yang Zhang and Jin-Di Liu

Big Data Cogn. Comput. 2026, 10(5), 153; https://doi.org/10.3390/bdcc10050153 - 13 May 2026

Viewed by 55

Abstract

Learning path optimization is crucial in intelligent educational systems, with the core challenge of efficient multi-objective sequential decision-making under complex prerequisite constraints. To address the poor generalization of existing methods relying on fixed operator scheduling or handcrafted heuristics, this paper proposes a hyper-heuristic framework based on Dueling Deep Q-Network (Dueling DQN-HH), formulating operator selection as a sequential decision-making process for dynamic adaptive scheduling of low-level operators. The framework adopts priority-based encoding to unify learning path representation (decoupling the hyper-heuristic layer from the problem domain) and designs a composite reward mechanism integrating reward shaping, exploration incentives, and computational cost awareness to balance solution quality and efficiency. Additionally, it employs a dueling network architecture with prioritized experience replay to enhance policy learning stability. Experimental results show the proposed method outperforms representative baseline algorithms in solution quality, convergence stability, and computational efficiency. The framework demonstrates superior performance across multiple objectives, particularly in minimizing the total learning time (F_time), as validated on two heterogeneous datasets: MOOCCube (Computer Science) and PsyDataset (Psychology). Further ablation studies and operator evolution analyses verify its adaptive scheduling capability under different objectives and knowledge graph structures, demonstrating strong objective independence and cross-dataset generalization. Full article

(This article belongs to the Section Data Mining and Machine Learning)
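
The dueling architecture named in the abstract splits the Q-function into a state value and per-action advantages. A minimal sketch of the standard aggregation step, Q(s, a) = V(s) + A(s, a) - mean(A), applied to operator selection is shown below. The operator names and numeric values are hypothetical, and in the actual framework V and A would be outputs of a trained network rather than constants.

```python
def dueling_q_values(state_value, advantages):
    """Standard dueling aggregation: Q = V + A - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def select_operator(state_value, advantages, operator_names):
    """Greedy choice of the low-level operator with the highest Q-value."""
    q = dueling_q_values(state_value, advantages)
    best = max(range(len(q)), key=q.__getitem__)
    return operator_names[best]


# Hypothetical low-level operators acting on a learning path.
operators = ["swap", "insert", "reverse"]
q_values = dueling_q_values(2.0, [1.0, 3.0, -1.0])
chosen = select_operator(2.0, [1.0, 3.0, -1.0], operators)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, which is the usual motivation for this form of the aggregation.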

19 pages, 1192 KB

Open Access Article

From Ontology to Application: A Semantic Architecture for Music Education in Low-Code Environments

by Ioannis Kakaras, Vasilios Zoumboulidis, Ioannis Paliokas and Stavros Valsamidis

Electronics 2026, 15(10), 2071; https://doi.org/10.3390/electronics15102071 - 13 May 2026

Viewed by 140

Abstract

This study investigates the design, development, and practical exploitation of an educational ontology for classical guitar instruction, within a semantically driven and application-oriented framework. The proposed approach aims to bridge the gap between formal knowledge representation and its functional use in real educational contexts. The ontology is developed using OWL in the Protégé environment and systematically models core pedagogical elements, including learning objectives, technical skills, instructional practices, and assessment processes, in alignment with the official curriculum. The semantic model is stored and managed as an RDF graph within a GraphDB repository, where it supports consistency checking and semantic querying through SPARQL. For application development, the ontological model is subsequently translated into a structured tabular schema suitable for the AppSheet low-code environment. Thus, GraphDB functions as a semantic validation and knowledge management layer, whereas the educational application operates on an application-oriented representation derived from the ontology rather than on a live RDF backend. The proposed three-tier architecture (Ontology–GraphDB–Application) demonstrates how Semantic Web technologies can support the transformation of abstract knowledge models into functional educational systems. The results highlight the capacity of ontology-driven approaches to enhance the organization, reusability, and pedagogical coherence of instructional knowledge, while enabling scalable and accessible application development through low-code technologies. The study contributes to the field of educational technology by providing a practical framework for integrating semantic knowledge representation into music education and laying a semantic foundation for future extensions toward adaptive and intelligent learning environments. Full article

(This article belongs to the Section Computer Science & Engineering)

22 pages, 635 KB

Open Access Article

Preference-Guided Debiasing and Denoising Social Recommendation

by Jun Li, Shenghan Li, Huachang Zeng and Shengda Zhuo

Information 2026, 17(5), 473; https://doi.org/10.3390/info17050473 - 12 May 2026

Viewed by 89

Abstract

User behaviors and social interactions on online platforms are intricately intertwined, naturally forming complex graph structures. Leveraging this structure, Graph Neural Networks (GNNs) efficiently aggregate neighborhood information and have become a prevailing paradigm for social recommendation. However, existing methods often overemphasize social modeling while overlooking the joint effects of preference-guided relation filtering and user/item biases, rendering them vulnerable to noise from redundant ties. To address these limitations, we propose PDDSR, a Preference-Guided Debiasing and Denoising Social Recommendation framework. Specifically, for debiasing, PDDSR explicitly models user rating bias and item popularity bias as learnable vectors, integrating them into embedding learning to mitigate bias drift at the embedding level. Simultaneously, for denoising, the model employs a social relation confidence mechanism guided by user preferences and adopts an adaptive graph denoising strategy to retain highly informative connections, effectively capturing social influence while filtering out noise. Extensive experiments on the Ciao and Epinions datasets demonstrate that PDDSR consistently outperforms state-of-the-art methods, and notably on the Ciao dataset, the MAE and RMSE are improved by 1.90% and 1.87%, respectively. These results validate the effectiveness and robustness of the joint debiasing and denoising mechanism in complex social recommendation scenarios. Full article

(This article belongs to the Topic Graph Neural Networks and Learning Systems)

27 pages, 4468 KB

Open Access Article

A Molecular–Protein Fusion Framework for Rapid Virtual Screening: Accelerating Lead Discovery for “Undruggable” Oncogenic Targets

by Chenxi Zhou, Yanni Zhu, Chenrui Yang, Yu Gao, Jianyang Lu and Dengming Ming

Pharmaceuticals 2026, 19(5), 753; https://doi.org/10.3390/ph19050753 (registering DOI) - 12 May 2026

Viewed by 216

Abstract

Background/Objectives: KRAS G12D is one of the most frequent oncogenic mutations in pancreatic ductal adenocarcinoma (PDAC) and remains challenging to target because of its limited druggable binding pockets. This study aimed to develop a machine learning-based framework for rapid virtual screening of potential KRAS G12D inhibitors. Methods: A molecular–protein fusion prediction framework, MPFF-IS, was constructed by integrating the ESM2 protein language model with an MPNN-GNN molecular graph network to enable joint representation learning of protein and compound features. The model was trained using a KRAS G12D inhibitor dataset and applied to screen compounds from multiple chemical libraries. AutoDock Vina docking and 300 ns GROMACS molecular dynamics simulations were subsequently performed for structural validation. Results: MPFF-IS achieved favorable predictive performance on the test dataset and identified 2663 candidate compounds from more than 134,000 screened molecules. Several candidate ligands exhibited favorable binding affinity, stable protein–ligand interactions, and enhanced structural stability compared with reference inhibitors, including MRTX1133 and BI-2852. Molecular dynamics analyses further supported the stability of the predicted complexes and the involvement of key binding residues within the KRAS G12D pocket. Conclusions: These findings demonstrate that MPFF-IS can efficiently identify potential KRAS G12D inhibitors and may provide a useful computational framework for precision drug discovery targeting difficult oncogenic proteins. Full article

(This article belongs to the Special Issue Artificial Intelligence: New Molecules, Therapeutic Targets and Discovery of New Drugs)

29 pages, 5091 KB

Open Access Article

RNAFoldDiff-Based Sequence-Aware Graph Diffusion for Accurate RNA 3D Structure Prediction

by Abdullah Al-Refai, Mohammad F. Al-Hammouri, Bandi Vamsi and Ali Al Bataineh

Algorithms 2026, 19(5), 381; https://doi.org/10.3390/a19050381 - 11 May 2026

Viewed by 214

Abstract

Accurate prediction of RNA tertiary structure remains a core challenge in computational biology. Existing models often struggle with diverse topologies and intricate long-range interactions. We introduce RNAFoldDiff, a generative framework that integrates a sequence-aware graph transformer with a geometric diffusion process for end-to-end RNA 3D structure prediction. RNA sequences and secondary structures are converted into graph representations that capture backbone connectivity and base pair topology. The transformer models local motifs and global dependencies, while the diffusion module iteratively denoises coordinates into physically consistent conformations. The model was pretrained on more than 15,000 structural motifs from the RNA 3D Hub and fine-tuned on complete RNAs from the RNA-Puzzles dataset. In benchmarking tests, RNAFoldDiff achieved an average root mean square deviation (RMSD) of 2.64 Å, a Global Distance Test (GDT) score of 68.7%, and a base pair accuracy of 89.5%, reducing RMSD by nearly 30% and improving GDT by 9 points compared to RoseTTAFoldNA. The framework also outperformed FARFAR2, SimRNA, and RNAformer. Ablation experiments confirmed the contributions of diffusion refinement, edge-aware graph encoding, and motif-level pretraining, while qualitative analyses showed biologically plausible folds including helices, junctions, and multiloops. By combining topology-aware graph learning with generative diffusion, RNAFoldDiff advances RNA tertiary structure modeling and provides a practical tool for RNA design, ribozyme analysis, and structure-guided drug discovery. Full article
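
Since RMSD is the headline metric above, a minimal sketch of its computation may help. The version below assumes the predicted and reference coordinates are already superimposed; structural benchmarks normally apply an optimal superposition first (for example with the Kabsch algorithm), which is omitted here. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate lists.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples, in the same
    atom order. No superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom structure displaced by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
deviation = rmsd(reference, predicted)
```

Because no alignment is performed, a rigid translation like the one in this example counts fully toward the deviation, which is exactly what a superposition step would remove.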

20 pages, 6641 KB

Open Access Article

Topology-Aware Road Extraction from Remote Sensing Images Using Deep Learning and Graph-Based Connectivity Refinement

by Zixuan Teng, Zezhong Zheng, Xiangyang Sun and Hao Xue

ISPRS Int. J. Geo-Inf. 2026, 15(5), 208; https://doi.org/10.3390/ijgi15050208 - 9 May 2026

Viewed by 298

Abstract

Road networks are fundamental components of transportation infrastructure and play a crucial role in various geospatial applications. Although deep learning-based semantic segmentation models have achieved promising results in extracting roads from high-resolution remote sensing imagery, the resulting networks often suffer from topological fragmentation due to occlusions and shadows. To address this issue, we propose a topology-aware road extraction method that integrates deep learning-based segmentation with a graph-based connectivity refinement strategy. Specifically, a Pyramid Scene Parsing Network (PSPNet) is first employed to generate initial road probability maps. Subsequently, a connectivity-oriented post-processing pipeline is introduced, which incorporates a multi-source cost function strategy and a direction-aware Dijkstra search algorithm. By utilizing endpoint tangent vectors as inertial weights, the algorithm effectively reconstructs fragmented segments while ensuring geometric smoothness and topological consistency. Furthermore, a dynamic road width restoration strategy is applied to transform refined skeletons into physically consistent road entities. Experiments conducted on two publicly available datasets, CHN6-CUG and DeepGlobe, demonstrate the effectiveness of the proposed method. Quantitative results show that the refinement process significantly enhances road connectivity with a minimal trade-off in pixel-level accuracy. Specifically, the Conn metric increases by 0.1989 on the CHN6-CUG dataset and 0.3055 on the DeepGlobe dataset, while MIoU remains high with only marginal decreases of 1.07% and 0.45%, respectively. These findings indicate that the method effectively restores structural continuity, helping with reliable road network generation and subsequent integration into Geographic Information System (GIS)-based applications such as urban planning and autonomous navigation. Full article

(This article belongs to the Topic Digital and Intelligent Technologies and Application in Urban Construction, Operation, Maintenance, and Renewal)

18 pages, 650 KB

Open AccessArticle

Dynamic Topic-Based Hierarchical Prompt Learning for Multi-Label Image Classification

by Zhiwen Chen, Yijia Zhang, Miao Liu and Yue Peng

Electronics 2026, 15(10), 2025; https://doi.org/10.3390/electronics15102025 - 9 May 2026

Viewed by 159

Abstract

Flat label supervision often constrains multi-label image classification, as it struggles to fully capture inherent label dependencies. It provides limited guidance to the hierarchical features that naturally emerge in Vision Transformers. To address this structural misalignment, we propose Dynamic Topic-based Hierarchical Prompt Learning (DyT-HPL). Instead of relying on predefined and fixed label graphs, DyT-HPL utilizes offline hierarchical clustering to construct multi-granularity semantic priors, from which hierarchical prompts are dynamically retrieved. A frozen visual query branch generates stable semantic queries, which are then used to retrieve discrete prompts from constructed coarse-mid-fine prompt pools. These hierarchical prompts are adaptively injected into different network depths, ensuring that semantic guidance with different abstraction levels is introduced at the most suitable architectural stages. To maintain stable routing and prevent prompt mode collapse, we jointly optimize the architecture with asymmetric classification, surrogate matching, and intra-pool diversity losses. This tripartite design promotes a diverse prompt space and isolates routing updates from final predictions. Comprehensive experiments on MS-COCO, NUS-WIDE, and Corel5k demonstrate that DyT-HPL achieves consistent and favorable performance across diverse settings, highlighting the value of hierarchical semantic guidance with different abstraction levels.
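The query-based retrieval from coarse, mid, and fine prompt pools described in this abstract can be pictured with a small sketch. The abstract does not state how queries are matched to prompts, so cosine similarity against per-prompt key vectors is an assumption here, as are the function names and the use of strings as stand-ins for learnable prompt embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve_prompts(query, pool, top_k=1):
    """Return the top_k prompts whose keys best match the query.

    pool is a list of (key_vector, prompt) pairs. Matching by cosine
    similarity is an assumption, not a detail taken from the paper.
    """
    ranked = sorted(pool, key=lambda kp: cosine(query, kp[0]), reverse=True)
    return [prompt for _, prompt in ranked[:top_k]]


def hierarchical_retrieve(query, pools, top_k=1):
    """Query every granularity level independently with the same query."""
    return {level: retrieve_prompts(query, pool, top_k)
            for level, pool in pools.items()}


# Usage with hypothetical two-dimensional keys and placeholder prompts.
pools = {
    "coarse": [([1.0, 0.0], "animal"), ([0.0, 1.0], "vehicle")],
    "mid": [([0.9, 0.3], "mammal"), ([0.2, 0.9], "road vehicle")],
    "fine": [([1.0, 0.2], "dog"), ([0.1, 1.0], "car")],
}
selected = hierarchical_retrieve([0.9, 0.1], pools)
```

In a setup like this, the prompts selected at each level would then be injected at different network depths. That injection step is outside the scope of the sketch.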

(This article belongs to the Section Computer Science & Engineering)

34 pages, 1758 KB

Open AccessReview

Sensor-Driven Deep Learning for Smart Home Intelligence: Signal Analysis, Multimodal Perception, and System-Level Applications

by Chenchen Wu, Ziqian Yang and Tao Sun

Sensors 2026, 26(10), 2993; https://doi.org/10.3390/s26102993 - 9 May 2026

Viewed by 506

Abstract

Smart home environments are evolving toward context-aware intelligent systems with the rapid integration of the Internet of Things (IoT), edge computing, and artificial intelligence. In such settings, large volumes of heterogeneous sensor data must be continuously processed to support perception, behavior understanding, and autonomous decision-making. Deep learning has emerged as a key approach for transforming raw sensor signals into structured representations that enable these functions. This review examines recent advances in deep learning for smart home applications from a sensor-driven perspective. Existing studies are organized into five major domains: human activity recognition, health monitoring and assisted living, smart energy management, security monitoring and anomaly detection, and voice interaction and intelligent control. Representative methodological paradigms (including convolutional and recurrent neural networks, Transformers, graph-based learning, multimodal fusion, and deep reinforcement learning) are discussed with emphasis on their roles in signal representation, multimodal integration, and decision-oriented modeling. Despite notable progress, several challenges continue to limit real-world deployment. These include the scarcity of high-quality labeled data, privacy and security concerns associated with continuous sensing, limited generalization across environments and users, constraints of edge devices, and the limited interpretability of model output. Addressing these issues requires advances not only in model design but also in data-efficient learning, privacy-preserving architectures, and system-level integration. Future research is expected to focus on multimodal perception, distributed and edge intelligence, knowledge-enhanced modeling, and human-centered explainable systems. By synthesizing current developments and highlighting open challenges, this review aims to support the development of robust and deployable deep learning solutions for next-generation smart home systems.

(This article belongs to the Special Issue Sensor Signal Analysis for Intelligent Health Management and Autonomous Systems)
