Search Results (43)

Search Parameters:
Keywords = hierarchical graph fusion

23 pages, 6538 KB  
Article
Multi-Scale Graph-Decoupling Spatial–Temporal Network for Traffic Flow Forecasting in Complex Urban Environments
by Hongtao Li, Wenzheng Liu and Huaixian Chen
Electronics 2026, 15(3), 495; https://doi.org/10.3390/electronics15030495 - 23 Jan 2026
Viewed by 121
Abstract
Accurate traffic flow forecasting is a fundamental component of Intelligent Transportation Systems and proactive urban mobility management. However, the inherent complexity of urban traffic flow, characterized by non-stationary dynamics and multi-scale temporal dependencies, poses significant modeling challenges. Existing spatio-temporal models often struggle to reconcile the discrepancy between static physical road constraints and highly dynamic, state-dependent spatial correlations, while their reliance on fixed temporal receptive fields limits the capacity to disentangle overlapping periodicities and stochastic fluctuations. To bridge these gaps, this study proposes a novel Multi-scale Graph-Decoupling Spatial–temporal Network (MS-GSTN). MS-GSTN leverages a Hierarchical Moving Average decomposition module to recursively partition raw traffic flow signals into constituent patterns across diverse temporal resolutions, ranging from systemic daily trends to high-frequency transients. Subsequently, a Tri-graph Spatio-temporal Fusion module synergistically models scale-specific dependencies by integrating an adaptive temporal graph, a static spatial graph, and a data-driven dynamic spatial graph within a unified architecture. Extensive experiments on four large-scale real-world benchmark datasets demonstrate that MS-GSTN consistently achieves superior forecasting accuracy compared to representative state-of-the-art models. Quantitatively, the proposed framework yields an overall reduction in Mean Absolute Error of up to 6.2% and maintains enhanced stability across multiple forecasting horizons. Visualization analysis further confirms that MS-GSTN effectively identifies scale-dependent spatial couplings, revealing that long-term traffic flow trends propagate through global network connectivity while short-term variations are governed by localized interactions. Full article
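Illustrative sketch (not taken from the article): the abstract describes a Hierarchical Moving Average decomposition that recursively splits a traffic signal into components at different temporal resolutions. One generic way to do this, assuming a plain centered moving average and hypothetical window sizes, is to peel off a smooth trend at each scale and pass the residual to the next scale. The function names and the synthetic signal are assumptions made for illustration only.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average; edge padding keeps the output the same length as x."""
    left = (window - 1) // 2
    right = window - 1 - left
    padded = np.pad(x, (left, right), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def hierarchical_decompose(x, windows):
    """Peel one smooth component off per window; return the components plus the final residual."""
    residual = np.asarray(x, dtype=float)
    components = []
    for w in windows:
        trend = moving_average(residual, w)  # smooth part at this scale
        components.append(trend)
        residual = residual - trend          # what remains goes to the next scale
    components.append(residual)
    return components

# Hypothetical input: three "days" of 5-minute samples (288 per day).
t = np.arange(288 * 3)
signal = 100 + 30 * np.sin(2 * np.pi * t / 288) + 5 * np.sin(2 * np.pi * t / 12)
parts = hierarchical_decompose(signal, windows=[288, 12])
```

By construction the components sum back to the original signal, so no information is lost by the split.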

22 pages, 795 KB  
Article
HIEA: Hierarchical Inference for Entity Alignment with Collaboration of Instruction-Tuned Large Language Models and Small Models
by Xinchen Shi, Zhenyu Han and Bin Li
Electronics 2026, 15(2), 421; https://doi.org/10.3390/electronics15020421 - 18 Jan 2026
Viewed by 133
Abstract
Entity alignment (EA) facilitates knowledge fusion by matching semantically identical entities in distinct knowledge graphs (KGs). Existing embedding-based methods rely solely on intrinsic KG facts and often struggle with long-tail entities due to insufficient information. Recently, large language models (LLMs), empowered by rich background knowledge and strong reasoning abilities, have shown promise for EA. However, most current LLM-enhanced approaches follow the in-context learning paradigm, requiring multi-round interactions with carefully designed prompts to perform additional auxiliary operations, which leads to substantial computational overhead. Moreover, they fail to fully exploit the complementary strengths of embedding-based small models and LLMs. To address these limitations, we propose HIEA, a novel hierarchical inference framework for entity alignment. By instruction-tuning a generative LLM with a unified and concise prompt and a knowledge adapter, HIEA produces alignment results with a single LLM invocation. Meanwhile, embedding-based small models not only generate candidate entities but also support the LLM through data augmentation and certainty-aware source entity classification, fostering deeper collaboration between small models and LLMs. Extensive experiments on both standard and highly heterogeneous benchmarks demonstrate that HIEA consistently outperforms existing embedding-based and LLM-enhanced methods, achieving absolute Hits@1 improvements of up to 5.6%, while significantly reducing inference cost. Full article
(This article belongs to the Special Issue AI-Powered Natural Language Processing Applications)

15 pages, 1262 KB  
Article
Structured Scene Parsing with a Hierarchical CLIP Model for Images
by Yunhao Sun, Xiaoao Chen, Heng Chen, Yiduo Liang and Ruihua Qi
Appl. Sci. 2026, 16(2), 788; https://doi.org/10.3390/app16020788 - 12 Jan 2026
Viewed by 175
Abstract
Visual Relationship Prediction (VRP) is crucial for advancing structured scene understanding, yet existing methods struggle with ineffective multimodal fusion, static relationship representations, and a lack of logical consistency. To address these limitations, this paper proposes a Hierarchical CLIP model (H-CLIP) for structured scene parsing. Our approach leverages a pre-trained CLIP backbone to extract aligned visual, textual, and spatial features for entities and their union regions. A multi-head self-attention mechanism then performs deep, dynamic multimodal fusion. The core innovation is a consistency and reversibility verification mechanism, which imposes algebraic constraints as a regularization loss to enforce logical coherence in the learned relation space. Extensive experiments on the Visual Genome dataset demonstrate the superiority of the proposed method. H-CLIP significantly outperforms state-of-the-art baselines on the predicate classification task, achieving a Recall@50 score of 64.31% and a Mean Recall@50 of 36.02%, thereby validating its effectiveness in generating accurate and logically consistent scene graphs even under long-tailed distributions. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

21 pages, 6454 KB  
Article
Probabilistic Photovoltaic Power Forecasting with Reliable Uncertainty Quantification via Multi-Scale Temporal–Spatial Attention and Conformalized Quantile Regression
by Guanghu Wang, Yan Zhou, Yan Yan, Zhihan Zhou, Zikang Yang, Litao Dai and Junpeng Huang
Sustainability 2026, 18(2), 739; https://doi.org/10.3390/su18020739 - 11 Jan 2026
Viewed by 236
Abstract
Accurate probabilistic forecasting of photovoltaic (PV) power generation is crucial for grid scheduling and renewable energy integration. However, existing approaches often produce prediction intervals with limited calibration accuracy, and the interdependence among meteorological variables is frequently overlooked. This study proposes a probabilistic forecasting framework based on a Multi-scale Temporal–Spatial Attention Quantile Regression Network (MTSA-QRN) and an adaptive calibration mechanism to enhance uncertainty quantification and ensure statistically reliable prediction intervals. The framework employs a dual-pathway architecture: a temporal pathway combining Temporal Convolutional Networks (TCN) and multi-head self-attention to capture hierarchical temporal dependencies, and a spatial pathway based on Graph Attention Networks (GAT) to model nonlinear meteorological correlations. A learnable gated fusion mechanism adaptively integrates temporal–spatial representations, and weather-adaptive modules enhance robustness under diverse atmospheric conditions. Multi-quantile prediction intervals are calibrated using conformalized quantile regression to ensure reliable uncertainty coverage. Experiments on a real-world PV dataset (15 min resolution) demonstrate that the proposed method offers more accurate and sharper uncertainty estimates than competitive benchmarks, supporting risk-aware operational decision-making in power systems. Quantitative evaluation on a real-world 40 MW photovoltaic plant demonstrates that the proposed MTSA-QRN achieves a CRPS of 0.0400 before calibration, representing an improvement of over 55% compared with representative deep learning baselines such as Quantile-GRU, Quantile-LSTM, and Quantile-Transformer. After adaptive calibration, the proposed method attains a reliable empirical coverage close to the nominal level (PICP90 = 0.9053), indicating effective uncertainty calibration. 
Although the calibrated prediction intervals become wider, the model maintains a competitive CRPS value (0.0453), striking a favorable balance between reliability and probabilistic accuracy. These results demonstrate the effectiveness of the proposed framework for reliable probabilistic photovoltaic power forecasting. Full article
(This article belongs to the Topic Sustainable Energy Systems)
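Illustrative sketch (not taken from the article): the abstract reports calibrating quantile intervals with conformalized quantile regression and measuring coverage with PICP. The standard split-conformal recipe computes a conformity score per calibration sample, takes a finite-sample quantile of those scores, and widens (or narrows) every interval by that amount. The function names, the toy calibration data, and the clamping of the rank to the calibration-set size are assumptions made for illustration.

```python
import numpy as np

def cqr_offset(q_lo_cal, q_hi_cal, y_cal, alpha=0.1):
    """Split-conformal offset for quantile intervals on a held-out calibration set."""
    # Positive score: the target fell outside the interval; negative: it was inside.
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    k = min(k, n)  # simplification: the formal quantile is infinite when k > n
    return np.sort(scores)[k - 1]

def picp(lo, hi, y):
    """Prediction interval coverage probability: share of targets inside [lo, hi]."""
    return float(np.mean((y >= lo) & (y <= hi)))

# Hypothetical calibration data: a constant predicted interval [0, 1].
q_lo = np.zeros(9)
q_hi = np.ones(9)
y = np.array([0.5, 1.2, -0.1, 0.3, 1.5, 0.9, 0.0, 1.05, 0.7])

offset = cqr_offset(q_lo, q_hi, y, alpha=0.1)
coverage_before = picp(q_lo, q_hi, y)
coverage_after = picp(q_lo - offset, q_hi + offset, y)
```

Widening by the offset raises coverage at the cost of sharpness, which matches the trade-off between reliability and CRPS that the abstract describes.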

22 pages, 2074 KB  
Article
Traffic Flow Prediction Model Based on Attention Mechanism Spatio-Temporal Graph Convolutional Network on U.S. Highways
by Ruiying Zhang and Yin Han
Appl. Sci. 2026, 16(1), 559; https://doi.org/10.3390/app16010559 - 5 Jan 2026
Viewed by 259
Abstract
Traffic flow prediction is a fundamental component of intelligent transportation systems and plays a critical role in traffic management and autonomous driving. However, accurately modeling highway traffic remains challenging due to dynamic congestion propagation, lane-level heterogeneity, and non-recurrent traffic events. To address these challenges, this paper proposes an improved attention-mechanism spatio-temporal graph convolutional network, termed AMSGCN, for highway traffic flow prediction. AMSGCN introduces an adaptive adjacency matrix learning mechanism to overcome the limitations of static graphs and capture time-varying spatial correlations and congestion propagation paths. A hierarchical multi-scale spatial attention mechanism is further designed to jointly model local congestion diffusion and long-range bottleneck effects, enabling an adaptive spatial receptive field under congested conditions. To enhance temporal modeling, a gating-based fusion strategy dynamically balances periodic patterns and recent observations, allowing effective prediction under both regular and abnormal traffic scenarios. In addition, direction-aware encoding is incorporated to suppress interference from opposite-direction lanes, which is essential for directional highway traffic systems. Extensive experiments on multiple benchmark datasets, including PeMS and PEMSF, demonstrate the effectiveness and robustness of AMSGCN. In particular, on the I-24 MOTION dataset, AMSGCN achieves an RMSE reduction of 11.0% compared to ASTGCN and 17.4% relative to the strongest STGCN baseline. Ablation studies further confirm that dynamic and multi-scale spatial attention provides the primary performance gains, while temporal gating and direction-aware modeling offer complementary improvements. These results indicate that AMSGCN is a robust and effective solution for highway traffic flow prediction. Full article
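Illustrative sketch (not taken from the article): the abstract mentions a gating-based fusion that balances periodic patterns against recent observations. A common generic formulation, assumed here and not confirmed as AMSGCN's own, computes a sigmoid gate from the concatenated features and uses it as a per-element convex weight. The shapes, the parameter names `W` and `b`, and the random inputs are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(periodic, recent, W, b):
    """Blend two feature maps of shape (n, d) with a learned element-wise gate in (0, 1)."""
    gate = sigmoid(np.concatenate([periodic, recent], axis=-1) @ W + b)
    return gate * periodic + (1.0 - gate) * recent

rng = np.random.default_rng(0)
n_nodes, d = 4, 3
periodic = rng.normal(size=(n_nodes, d))  # e.g. features from daily/weekly history
recent = rng.normal(size=(n_nodes, d))    # e.g. features from the last few time steps

# With zero parameters the gate is 0.5 everywhere, so the result is a plain average.
fused_neutral = gated_fusion(periodic, recent, np.zeros((2 * d, d)), np.zeros(d))
# A large positive bias saturates the gate and the output follows the periodic branch.
fused_periodic = gated_fusion(periodic, recent, np.zeros((2 * d, d)), np.full(d, 50.0))
```

In a trained model `W` and `b` would be learned, letting the gate shift toward recent observations during abnormal traffic.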

21 pages, 886 KB  
Article
A Dual-Attention CNN–GCN–BiLSTM Framework for Intelligent Intrusion Detection in Wireless Sensor Networks
by Laith H. Baniata, Ashraf ALDabbas, Jaffar M. Atwan, Hussein Alahmer, Basil Elmasri and Chayut Bunterngchit
Future Internet 2026, 18(1), 5; https://doi.org/10.3390/fi18010005 - 22 Dec 2025
Viewed by 409
Abstract
Wireless Sensor Networks (WSNs) are increasingly being used in mission-critical infrastructures. In such applications, they are evaluated on the risk of cyber intrusions that can target the already constrained resources. Traditionally, Intrusion Detection Systems (IDS) in WSNs have been based on machine learning techniques; however, these models fail to capture the nonlinear, temporal, and topological dependencies across the network nodes. As a result, they often suffer degradation in detection accuracy and exhibit poor adaptability against evolving threats. To overcome these limitations, this study introduces a hybrid deep learning-based IDS that integrates multi-scale convolutional feature extraction, dual-stage attention fusion, and graph convolutional reasoning. Moreover, bidirectional long short-term memory components are embedded into the unified framework. Through this combination, the proposed architecture effectively captures the hierarchical spatial–temporal correlations in the traffic patterns, thereby enabling precise discrimination between normal and attack behaviors across several intrusion classes. The model has been evaluated on a publicly available benchmarking dataset, and it has been found to attain higher classification capability in multiclass scenarios. Furthermore, the model outperforms conventional IDS-focused approaches. In addition, the proposed design aims to retain suitable computational efficiency, making it appropriate for edge and distributed deployments. Consequently, this makes it an effective solution for next-generation WSN cybersecurity. Overall, the findings emphasize that combining topology-aware learning with multi-branch attention mechanisms offers a balanced trade-off between interpretability, accuracy, and deployment efficiency for resource-constrained WSN environments. Full article

23 pages, 2909 KB  
Article
A Symmetry-Aware Hierarchical Graph-Mamba Network for Spatio-Temporal Road Damage Detection
by Zichun Tian, Xiaokang Shao, Yuqi Bai, Qianyun Zhang, Zhuxuanzi Wang and Yingrui Ji
Symmetry 2025, 17(12), 2173; https://doi.org/10.3390/sym17122173 - 17 Dec 2025
Viewed by 422
Abstract
The prompt and precise detection of road damage is vital for effective infrastructure management, forming the foundation for intelligent transportation systems and cost-effective pavement maintenance. While current convolutional neural network (CNN)-based methodologies have made progress, they are fundamentally limited by treating damages as independent, isolated entities, thereby ignoring the intrinsic spatial symmetry and topological organization inherent in complex damage patterns like alligator cracking. This conceptual asymmetry in modeling leads to two major deficiencies: “context blindness,” which overlooks essential structural interrelations, and “temporal inconsistency” in video analysis, resulting in unstable, flickering predictions. To address this, we propose a Spatio-Temporal Graph Mamba You-Only-Look-Once (STG-Mamba-YOLO) network, a novel architecture that introduces a symmetry-informed, hierarchical reasoning process. Our approach explicitly models and integrates contextual dependencies across three levels to restore a holistic and consistent structural representation. First, at the pixel level, a Mamba state-space model within the YOLO backbone enhances the modeling of long-range spatial dependencies, capturing the elongated symmetry of linear cracks. Second, at the object level, an intra-frame damage Graph Network enables explicit reasoning over the topological symmetry among damage candidates, effectively reducing false positives by leveraging their relational structure. Third, at the sequence level, a Temporal Graph Mamba module tracks the evolution of this damage graph, enforcing temporal symmetry across frames to ensure stable, non-flickering results in video streams. Comprehensive evaluations on multiple public benchmarks demonstrate that our method outperforms existing state-of-the-art approaches. 
STG-Mamba-YOLO shows significant advantages in identifying intricate damage topologies while ensuring robust temporal stability, thereby validating the effectiveness of our symmetry-guided, multi-level contextual fusion paradigm for structural health monitoring. Full article

31 pages, 36598 KB  
Article
Spatio-Temporal and Semantic Dual-Channel Contrastive Alignment for POI Recommendation
by Chong Bu, Yujie Liu, Jing Lu, Manqi Huang, Maoyi Li and Jiarui Li
Big Data Cogn. Comput. 2025, 9(12), 322; https://doi.org/10.3390/bdcc9120322 - 15 Dec 2025
Viewed by 372
Abstract
Point-of-Interest (POI) recommendation predicts users’ future check-ins based on their historical trajectories and plays a key role in location-based services (LBS). Traditional approaches such as collaborative filtering and matrix factorization model user–POI interaction matrices but fail to fully leverage spatio-temporal information and semantic attributes, leading to weak performance on sparse and long-tail POIs. Recently, Graph Neural Networks (GNNs) have been applied by constructing heterogeneous user–POI graphs to capture high-order relations. However, they still struggle to effectively integrate spatio-temporal and semantic information and to enhance the discriminative power of learned representations. To overcome these issues, we propose Spatio-Temporal and Semantic Dual-Channel Contrastive Alignment for POI Recommendation (S2DCRec), a novel framework integrating spatio-temporal and semantic information. It employs hierarchical relational encoding to capture fine-grained behavioral patterns and high-level semantic dependencies. The model jointly captures user–POI interactions, temporal dynamics, and semantic correlations in a unified framework. Furthermore, our alignment strategy ensures micro-level collaborative and spatio-temporal consistency and macro-level semantic coherence, enabling fine-grained embedding fusion and interpretable contrastive learning. Experiments on two real-world datasets, Foursquare NYC and Yelp, show that S2DCRec outperforms all baselines, improving F1 scores by 4.04% and 3.01%, respectively. These results demonstrate the effectiveness of the dual-channel design in capturing both sequential and semantic dependencies for accurate POI recommendation. Full article
(This article belongs to the Topic Graph Neural Networks and Learning Systems)

26 pages, 477 KB  
Article
MTSA-CG: Mongolian Text Sentiment Analysis Based on ConvBERT and Graph Attention Network
by Qingdaoerji Ren, Qihui Wang, Ying Lu, Yatu Ji and Nier Wu
Electronics 2025, 14(23), 4581; https://doi.org/10.3390/electronics14234581 - 23 Nov 2025
Viewed by 516
Abstract
In Mongolian Text Sentiment Analysis (MTSA), the scarcity of annotated sentiment datasets and the insufficient consideration of syntactic dependency and topological structural information pose significant challenges to accurately capturing semantics and effectively extracting emotional features. To address these issues, this paper proposes a Mongolian Text Sentiment Analysis model based on ConvBERT and Graph Attention Network (MTSA-CG). Firstly, the ConvBERT pre-trained model is employed to extract textual features under limited data conditions, aiming to mitigate the shortcomings caused by data scarcity. Concurrently, textual data are transformed into graph-structured data, integrating co-occurrence, dependency, and similarity information into a Graph Attention Network (GAT) to capture syntactic and structural cues, enabling a deeper understanding of semantic and emotional connotations for more precise sentiment classification. The proposed multi-graph fusion strategy employs a hierarchical attention mechanism that dynamically weights different graph types based on their semantic relevance, distinguishing it from conventional graph aggregation methods. Experimental results demonstrate that, in comparison with various advanced baseline models, the proposed method significantly enhances the accuracy of MTSA. Full article

26 pages, 61479 KB  
Article
Graph-Based Multi-Resolution Cosegmentation for Coarse-to-Fine Object-Level SAR Image Change Detection
by Jingxing Zhu, Miao Yu, Feng Wang, Guangyao Zhou, Niangang Jiao, Yuming Xiang and Hongjian You
Remote Sens. 2025, 17(22), 3736; https://doi.org/10.3390/rs17223736 - 17 Nov 2025
Viewed by 545
Abstract
The ongoing launch of high-resolution satellites has led to a significant increase in the volume of synthetic aperture radar (SAR) data, enabling high-resolution, high-revisit Earth observation that efficiently supports subsequent high-resolution SAR change detection. To address the issues of speckle noise interference, insufficient integrity of change targets, and blurred boundary location in high-resolution SAR change detection, we propose a coarse-to-fine framework based on multi-scale segmentation and a hybrid structure graph (HSG), which consists of three modules: multi-scale segmentation, difference measurement, and change refinement. First, we propose a graph-based multi-resolution co-segmentation (GMRCS) in the multi-scale segmentation module to generate hierarchically nested superpixel masks. A two-stage ranking (TSR) strategy is designed to help GMRCS better approximate the target edges and preserve the spatio-temporal structure of changed regions. Then, we introduce a graph model and measure the difference level based on the HSG. The multi-scale difference image (DI) is generated by constructing the HSG for bi-temporal SAR images and comparing the consistency of the HSGs to reduce the effect of speckle noise. Finally, the coarse-scale change information is gradually mapped to the fine scale based on the multi-scale fusion refinement (FR) strategy, yielding the binary change map (BCM). Experimental results on three high-resolution SAR change detection datasets demonstrate the superiority of our proposed algorithm in preserving the integrity and structural precision of change targets compared with several state-of-the-art methods. Full article
(This article belongs to the Special Issue SAR Image Change Detection: From Hand-Crafted to Deep Learning)

26 pages, 1967 KB  
Article
A Symmetric Multiscale Feature Fusion Architecture Based on CNN and GNN for Hyperspectral Image Classification
by Yaoqun Xu, Junyi Wang, Zelong You and Xin Li
Symmetry 2025, 17(11), 1930; https://doi.org/10.3390/sym17111930 - 11 Nov 2025
Viewed by 645
Abstract
Convolutional neural networks (CNNs) and graph convolutional networks (GCNs) have been widely applied to hyperspectral image classification tasks, but both exhibit certain limitations. To address these issues, this paper proposes a multi-scale feature fusion architecture (MCGNet). Symmetry serves as the core design principle of MCGNet, where its parallel CNN-GCN branches and multi-scale fusion mechanism strike a balance between local spectral-spatial features and global graph structural dependencies, effectively reducing redundancy and enhancing generalization capabilities. The architecture comprises four modules: the Spectral Noise Suppression (SNS) module enhances the signal-to-noise ratio of spectral features; the Local Spectral Extraction (LSE) module employs deep separable convolutions to extract local spectral-spatial features; the Superpixel-level Graph Convolution (SGC) module performs graph convolution on superpixel graphs to precisely capture dependencies between object regions; and the Pixel-level Graph Convolution (PGC) module, constructed via adaptive sparse pixel graphs based on spectral and spatial similarity, accurately captures irregular boundaries and fine-grained non-local relationships between pixels. These modules form a symmetric, hierarchical feature learning pipeline integrated within a unified framework. Experiments on three public datasets (Indian Pine, Pavia University, and Salinas) demonstrate that MCGNet outperforms baseline methods in overall accuracy, average precision, and Kappa coefficient. This symmetric design not only enhances classification performance but also endows the model with strong theoretical interpretability and cross-dataset robustness, highlighting the significance of symmetry principles in hyperspectral image analysis. Full article
(This article belongs to the Section Computer)

25 pages, 689 KB  
Article
UMEAD: Unsupervised Multimodal Entity Alignment for Equipment Knowledge Graphs via Dual-Space Embedding
by Siyu Zhu, Qitao Tai, Jingbo Wang, Mingfei Tang, Liang Wang, Ning Li, Shoulu Hou and Xiulei Liu
Symmetry 2025, 17(11), 1869; https://doi.org/10.3390/sym17111869 - 5 Nov 2025
Viewed by 856
Abstract
The symmetry between different representation spaces plays a crucial role in effectively modeling complex multimodal data. To address the challenge of equipment knowledge graphs containing hierarchical relationships that cannot be fully represented in a single space, this study proposes UMEAD, an unsupervised multimodal entity alignment method based on dual-space embeddings. The method simultaneously learns graph embeddings in both Euclidean and hyperbolic spaces, forming a structural symmetry where the Euclidean space captures local regularities and the hyperbolic space models global hierarchies. Their complementarity achieves a balanced and symmetric representation of multimodal knowledge. An adaptive feature fusion strategy is further employed to dynamically weight semantic and visual modalities, enhancing the symmetry and complementarity between different modalities. To reduce reliance on scarce pre-aligned data, pseudo seed instances are generated from multimodal features, and an iterative constraint mechanism progressively enlarges the training set, enabling unsupervised alignment. Experiments on public datasets, including EMMEAD, FB15K-DB15K, and FB15K-YAGO15K, demonstrate that the combination of dual-space embeddings, adaptive fusion, and iterative constraints significantly improves alignment accuracy. In summary, the proposed method reduces dependence on pre-aligned data, strengthens multimodal and structural alignment, and its symmetric embedding and fusion design offers a promising approach for the construction and application of multimodal knowledge graphs in the equipment domain. Full article
(This article belongs to the Special Issue Symmetry and Its Applications in Computer Vision)
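Illustrative sketch (not taken from the article): the abstract contrasts Euclidean embeddings, which capture local regularities, with hyperbolic embeddings, which model hierarchies. Assuming the Poincaré ball as the hyperbolic model (the abstract does not name one), the two distance functions below show why: hyperbolic distance grows much faster than Euclidean distance as points approach the boundary of the ball, which leaves room for tree-like structure. The function names and sample points are hypothetical.

```python
import numpy as np

def euclidean_distance(u, v):
    return float(np.linalg.norm(u - v))

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball; both points must have norm < 1."""
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq / denom))

origin = np.zeros(2)
mid = np.array([0.5, 0.0])
near_edge = np.array([0.9, 0.0])

d_euc_mid = euclidean_distance(origin, mid)
d_hyp_mid = poincare_distance(origin, mid)
d_euc_edge = euclidean_distance(origin, near_edge)
d_hyp_edge = poincare_distance(origin, near_edge)
```

From the origin, the Poincaré distance to a point of norm r equals ln((1 + r) / (1 - r)), so moving from r = 0.5 to r = 0.9 stretches the hyperbolic distance far more than the Euclidean one.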

17 pages, 940 KB  
Article
ON-NSW: Accelerating High-Dimensional Vector Search on Edge Devices with GPU-Optimized NSW
by Taeyoon Park, Haena Lee, Yedam Na and Wook-Hee Kim
Sensors 2025, 25(20), 6461; https://doi.org/10.3390/s25206461 - 19 Oct 2025
Viewed by 1413
Abstract
The Industrial Internet of Things (IIoT) increasingly relies on vector embeddings for analytics and AI-driven applications such as anomaly detection, predictive maintenance, and sensor fusion. Efficient approximate nearest neighbor search (ANNS) is essential for these workloads. Graph-based methods are among the most representative methods for ANNS. However, most existing graph-based methods, such as Hierarchical Navigable Small World (HNSW), are designed for CPU execution on high-end servers and give little consideration to the unique characteristics of edge devices. In this work, we present ON-NSW, a GPU-optimized design of HNSW for edge devices. ON-NSW employs a flat graph structure derived from HNSW to fully exploit GPU parallelism. In addition, it carefully places HNSW components in the unified memory architecture of the NVIDIA Jetson Orin Nano. ON-NSW also introduces warp-level parallel neighbor exploration and lightweight synchronization to reduce search latency. Our experimental results on real-world high-dimensional datasets show that ON-NSW achieves up to 1.44× higher throughput than the original HNSW on the NVIDIA Jetson device while maintaining comparable recall. These results demonstrate that ON-NSW provides an effective design for enabling efficient and high-throughput vector search on embedded edge platforms. Full article
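Illustrative sketch (not taken from the article): the abstract refers to searching a flat navigable-small-world graph. The simplest form of graph-based nearest neighbor search is a greedy walk: start at an entry node and keep moving to whichever neighbor is closest to the query until no neighbor improves the distance. This CPU-only version uses a beam width of one and a hypothetical toy graph; production indexes, including the GPU design described above, keep a larger candidate list and explore neighbors in parallel.

```python
import numpy as np

def greedy_search(vectors, neighbors, query, entry=0):
    """Walk the graph toward the query; stop at a node no neighbor can beat."""
    current = entry
    current_dist = float(np.linalg.norm(vectors[current] - query))
    while True:
        best, best_dist = current, current_dist
        for nb in neighbors[current]:
            d = float(np.linalg.norm(vectors[nb] - query))
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:  # local minimum: approximate nearest neighbor
            return current, current_dist
        current, current_dist = best, best_dist

# Hypothetical index: five 1-D points connected in a chain.
vectors = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

node, dist = greedy_search(vectors, neighbors, np.array([3.2]))
```

The walk can stop at a local minimum that is not the true nearest neighbor, which is why such methods are approximate and are evaluated by recall.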

14 pages, 1149 KB  
Article
Modality Information Aggregation Graph Attention Network with Adversarial Training for Multi-Modal Knowledge Graph Completion
by Hankiz Yilahun, Elyar Aili, Seyyare Imam and Askar Hamdulla
Information 2025, 16(10), 907; https://doi.org/10.3390/info16100907 - 16 Oct 2025
Viewed by 756
Abstract
Multi-modal knowledge graph completion (MMKGC) aims to complete knowledge graphs by integrating structural information with multi-modal (e.g., visual, textual, and numerical) features and leveraging cross-modal reasoning within a unified semantic space to infer and supplement missing factual knowledge. Current MMKGC methods have advanced in terms of integrating multi-modal information but have overlooked the imbalance in modality importance for target entities. Treating all modalities equally dilutes critical semantics and amplifies irrelevant information, which in turn limits the semantic understanding and predictive performance of the model. To address these limitations, we propose a modality information aggregation graph attention network with adversarial training for multi-modal knowledge graph completion (MIAGAT-AT). MIAGAT-AT focuses on hierarchically modeling complex cross-modal interactions. By combining the multi-head attention mechanism with modality-specific projection methods, it captures global semantic dependencies and dynamically adjusts the weight of modality embeddings according to the importance of each modality, thereby improving cross-modal information fusion. Moreover, through the use of random noise and multi-layer residual blocks, the adversarial training generates high-quality multi-modal feature representations, thereby effectively enhancing information from imbalanced modalities. Experimental results demonstrate that our approach significantly outperforms 18 existing baselines and establishes a strong performance baseline across three distinct datasets.

43 pages, 16029 KB  
Article
Research on Trajectory Planning for a Limited Number of Logistics Drones (≤3) Based on Double-Layer Fusion GWOP
by Jian Deng, Honghai Zhang, Yuetan Zhang and Yaru Sun
Drones 2025, 9(10), 671; https://doi.org/10.3390/drones9100671 - 24 Sep 2025
Cited by 1 | Viewed by 737
Abstract
Trajectory planning for logistics UAVs in complex environments faces a key challenge: balancing global search breadth with fine constraint accuracy. Traditional algorithms struggle to simultaneously manage large-scale exploration and complex constraints, and lack sufficient modeling capabilities for multi-UAV systems, limiting cluster logistics efficiency. To address these issues, we propose a GWOP algorithm based on dual-layer fusion of GWO and GRPO and incorporate a graph attention network (GAT). First, CEC2017 benchmark functions are used to evaluate GWOP's convergence accuracy and balanced exploration in multi-peak, high-dimensional environments. A hierarchical collaborative architecture, “GWO global coarse-grained search + GRPO local fine-tuning”, is used to overcome the limitations of single-algorithm frameworks. The GAT model constructs a dynamic “environment–UAV–task” association network, enabling environmental feature quantification and multi-constraint adaptation. A multi-factor objective function and constraints are integrated with multi-task cascading decoupling optimization to form a closed-loop collaborative optimization framework. Experimental results show that in single-UAV scenarios, GWOP reduces flight cost (FV) by over 15.85% on average. In multi-UAV collaborative scenarios, average path length (APL), optimal path length (OPL), and FV are reduced by 4.08%, 14.08%, and 24.73%, respectively. In conclusion, the proposed method outperforms traditional approaches in path length, obstacle avoidance, and trajectory smoothness, offering a more efficient planning solution for smart logistics.