Search Results (11)

Search Parameters:
Keywords = dual-level graph feature interactions

23 pages, 6348 KiB  
Article
A Framework for Predicting Winter Wheat Yield in Northern China with Triple Cross-Attention and Multi-Source Data Fusion
by Shuyan Pan and Liqun Liu
Plants 2025, 14(14), 2206; https://doi.org/10.3390/plants14142206 - 16 Jul 2025
Viewed by 207
Abstract
To solve the issue that existing yield prediction methods do not fully capture the interaction between multiple factors, we propose a winter wheat yield prediction framework with triple cross-attention for multi-source data fusion. This framework consists of three modules: a multi-source data processing module, a multi-source feature fusion module, and a yield prediction module. The multi-source data processing module collects satellite, climate, and soil data based on the winter wheat planting range and constructs a multi-source feature sequence set by combining statistical data. The multi-source feature fusion module first extracts deeper-level feature information based on the characteristics of the different data sources and then performs multi-source feature fusion through a triple cross-attention fusion mechanism. The encoder part of the yield prediction module adds a graph attention mechanism, forming a dual branch with the original multi-head self-attention mechanism to capture global dependencies while enhancing the preservation of local feature information. The decoder generates the final predicted output. The results show that: (1) Using 2021 and 2022 as test sets, the mean absolute error of our method is 385.99 kg/hm², and the root mean squared error is 501.94 kg/hm², both lower than those of other methods. (2) The jointing-heading stage (March to April) is the most crucial period affecting winter wheat production. (3) Our model can predict the final winter wheat yield nearly a month in advance.
(This article belongs to the Section Plant Modeling)
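The paper's triple cross-attention fusion is not detailed in the abstract, but its core operation — one data source attending over another — is standard scaled dot-product cross-attention. A minimal NumPy sketch; the names `sat`/`clim` and all dimensions are illustrative assumptions, not the paper's:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Scaled dot-product cross-attention: one modality queries another.

    query_feats:   (n_q, d) features from one source (e.g. satellite)
    context_feats: (n_c, d) features from another source (e.g. climate)
    Returns (n_q, d) context-enriched query features.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)   # (n_q, n_c)
    weights = softmax(scores, axis=-1)                    # rows sum to 1
    return weights @ context_feats                        # (n_q, d)

rng = np.random.default_rng(0)
sat = rng.normal(size=(5, 8))    # satellite feature sequence (illustrative)
clim = rng.normal(size=(7, 8))   # climate feature sequence (illustrative)
fused = cross_attention(sat, clim)
print(fused.shape)  # (5, 8)
```

In a "triple" arrangement, each source would take a turn as the query against the others; the single-pair case above is the building block.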

19 pages, 6903 KiB  
Article
GT-SRR: A Structured Method for Social Relation Recognition with GGNN-Based Transformer
by Dejiao Huang, Menglei Xia, Ruyi Chang, Xiaohan Kong and Shuai Guo
Sensors 2025, 25(10), 2992; https://doi.org/10.3390/s25102992 - 9 May 2025
Viewed by 419
Abstract
Social relationship recognition (SRR) holds significant value in fields such as behavior analysis and intelligent social systems. However, existing methods primarily focus on modeling individual visual traits, interaction patterns, and scene-level contextual cues, often failing to capture the complex dependencies among these features and the hierarchical structure of social groups, which are crucial for effective reasoning. To overcome these limitations, this paper proposes an SRR model that integrates a Gated Graph Neural Network (GGNN) and a Transformer. The SRR task in this model is image-based. Specifically, a novel and robust hybrid feature extraction module captures individual characteristics, relative positional information, and group-level cues, which are used to construct relation nodes and group nodes. A modified GGNN is then employed to model the logical dependencies between features. Nevertheless, GGNN alone lacks the capacity to dynamically adjust feature importance, which may result in ambiguous relationship representations. The Transformer’s multi-head self-attention (MSA) mechanism is therefore integrated to improve feature interaction modeling, fusing pairwise features, graph-structured features, and group-level information so that the model captures global context and higher-order dependencies effectively. Experimental results on public datasets such as PISC demonstrate that the proposed approach outperforms comparison models including Dual-Glance, GRM, GRRN, Graph-BERT, and SRT in terms of accuracy and mean average precision (mAP), validating its effectiveness in multi-feature representation learning and global reasoning.
(This article belongs to the Section Sensor Networks)
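For readers unfamiliar with GGNNs, the gated update the abstract refers to mixes aggregated neighbor messages into each node's state through GRU-style gates. A minimal single-step NumPy sketch in the style of the original GGNN formulation; the 4-node graph and weight shapes are illustrative, and biases are omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(h, A, Wz, Wr, Wh, Uz, Ur, Uh):
    """One GGNN propagation step (biases omitted for brevity).

    h: (n, d) node states; A: (n, n) adjacency (row i lists senders to i).
    Neighbor messages are aggregated, then GRU-style gates decide how much
    of the candidate state replaces each node's current state.
    """
    m = A @ h                                   # (n, d) aggregated messages
    z = sigmoid(m @ Wz + h @ Uz)                # update gate
    r = sigmoid(m @ Wr + h @ Ur)                # reset gate
    h_tilde = np.tanh(m @ Wh + (r * h) @ Uh)    # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
n, d = 4, 6
h = rng.normal(size=(n, d))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # a small undirected chain
Ws = [rng.normal(scale=0.1, size=(d, d)) for _ in range(6)]
h_next = ggnn_step(h, A, *Ws)
print(h_next.shape)  # (4, 6)
```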

21 pages, 3508 KiB  
Article
Pedestrian Trajectory Prediction Based on Dual Social Graph Attention Network
by Xinhai Li, Yong Liang, Zhenhao Yang and Jie Li
Appl. Sci. 2025, 15(8), 4285; https://doi.org/10.3390/app15084285 - 13 Apr 2025
Viewed by 721
Abstract
Pedestrian trajectory prediction poses significant challenges for autonomous systems due to the intricate nature of social interactions in densely populated environments. While the existing methods frequently encounter difficulties in effectively quantifying the nuanced social relationships, we propose a novel dual social graph attention network (DSGAT) that systematically models multi-level interactions. This framework is specifically designed to enhance the extraction of pedestrian interaction features within the environment, thereby improving the trajectory prediction accuracy. The network architecture consists of two primary branches, namely an individual branch and a group branch, which are responsible for modeling personal and collective pedestrian behaviors, respectively. For individual feature modeling, we propose the Spatio-Temporal Weighted Graph Attention Network (STWGAT) branch, which incorporates a newly developed directed social attention function to explicitly capture both the direction and intensity of pedestrian interactions. This mechanism enables the model to more effectively represent the fine-grained social dynamics. Subsequently, leveraging the STWGAT’s processing of directed weighted graphs, the network’s ability to aggregate spatiotemporal information and refine individual interaction representations is further strengthened. To effectively account for the critical group dynamics, a dedicated group attention function is designed to identify and quantify the collective behaviors within pedestrian crowds. This facilitates a more comprehensive understanding of the complex social interactions, leading to an enhanced trajectory prediction accuracy. Extensive comparative experiments conducted on the widely used ETH and UCY benchmark datasets demonstrate that the proposed network consistently surpasses the baseline methods across the key evaluation metrics, including the Average Displacement Error (ADE) and Final Displacement Error (FDE). 
These results confirm the effectiveness and robustness of the DSGAT-based approach in handling complex pedestrian interaction scenarios.
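The paper's "directed social attention function" is its own contribution, but the underlying mechanism — attention weights computed per receiver over a directed graph — follows the familiar GAT pattern. A single-head NumPy sketch under that assumption; the example graph, shapes, and the self-loop convention are illustrative:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def directed_gat_layer(h, mask, W, a_src, a_dst):
    """Single-head graph attention over a directed graph.

    h: (n, d) node features; mask[i, j] is True if edge j -> i exists
    (self-loops included so every node has at least one neighbor);
    W: (d, f) shared projection; a_src, a_dst: (f,) attention vectors.
    """
    z = h @ W                                           # (n, f) projections
    e = leaky_relu(np.add.outer(z @ a_dst, z @ a_src))  # e[i, j]: edge j -> i
    e = np.where(mask, e, -np.inf)                      # drop non-edges
    w = np.exp(e - e.max(axis=1, keepdims=True))        # stable softmax per receiver
    w = w / w.sum(axis=1, keepdims=True)
    return w @ z                                        # (n, f) attended features

rng = np.random.default_rng(2)
h = rng.normal(size=(4, 5))
mask = np.eye(4, dtype=bool)                      # self-loops
mask[1, 0] = mask[2, 1] = mask[3, 2] = True       # directed chain 0->1->2->3
W = rng.normal(scale=0.5, size=(5, 3))
a_src, a_dst = rng.normal(size=3), rng.normal(size=3)
out = directed_gat_layer(h, mask, W, a_src, a_dst)
print(out.shape)  # (4, 3)
```

Because `mask` is not symmetric, node 1 attends to node 0 but not vice versa — which is how direction of influence between pedestrians would be encoded.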

30 pages, 11973 KiB  
Article
A Novel and Powerful Dual-Stream Multi-Level Graph Convolution Network for Emotion Recognition
by Guoqiang Hou, Qiwen Yu, Guang Chen and Fan Chen
Sensors 2024, 24(22), 7377; https://doi.org/10.3390/s24227377 - 19 Nov 2024
Viewed by 1401
Abstract
Emotion recognition enables machines to more acutely perceive and understand users’ emotional states, thereby offering more personalized and natural interactive experiences. Given the regularity of the responses of brain activity to human cognitive processes, we propose a powerful and novel dual-stream multi-level graph convolution network (DMGCN) with the ability to capture the hierarchies of connectivity between cerebral cortex neurons and improve computational efficiency. The DMGCN consists of a hierarchical dynamic geometric interaction neural network (HDGIL) and a multi-level feature fusion classifier (M2FC). First, the HDGIL diversifies representations by learning emotion-related representations in multi-level graphs. Subsequently, the M2FC integrates the advantages of early and late feature fusion, enabling more details from EEG samples to be added to the final representations. We conducted extensive experiments to validate the superiority of our model over numerous state-of-the-art (SOTA) baselines in terms of classification accuracy and the efficiency of graph embedding and information propagation, achieving accuracies of 98.73%, 95.97%, 72.74% and 94.89% as well as increases of up to 0.59%, 0.32%, 2.24% and 3.17% over baselines on the DEAP-Arousal, DEAP-Valence, DEAP and SEED datasets, respectively. Additionally, these experiments demonstrated the effectiveness of each module for emotion recognition tasks.

15 pages, 856 KiB  
Article
DAFE-MSGAT: Dual-Attention Feature Extraction and Multi-Scale Graph Attention Network for Polyphonic Piano Transcription
by Rui Cao, Zushuang Liang, Zheng Yan and Bing Liu
Electronics 2024, 13(19), 3939; https://doi.org/10.3390/electronics13193939 - 5 Oct 2024
Viewed by 1340
Abstract
Automatic music transcription (AMT) aims to convert raw audio signals into symbolic music. This is a highly challenging task in the fields of signal processing and artificial intelligence, and it holds significant application value in music information retrieval (MIR). Existing methods based on convolutional neural networks (CNNs) often fall short in capturing the time-frequency characteristics of audio signals and tend to overlook the interdependencies between notes when processing polyphonic piano music with multiple simultaneous notes. To address these issues, we propose a dual-attention feature extraction and multi-scale graph attention network (DAFE-MSGAT). Specifically, we design a dual-attention feature extraction module (DAFE) to enhance the frequency- and time-domain features of the audio signal, and we utilize a long short-term memory network (LSTM) to capture the temporal features within the audio signal. We introduce a multi-scale graph attention network (MSGAT), which leverages the various implicit relationships between notes to enhance the interaction between different notes. Experimental results demonstrate that our model achieves high accuracy in detecting the onset and offset of notes on public datasets. In both frame-level and note-level metrics, DAFE-MSGAT achieves performance comparable to the state-of-the-art methods, showcasing exceptional transcription capabilities.
(This article belongs to the Section Artificial Intelligence)
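The LSTM used here for temporal features follows the standard cell equations. A minimal single-step NumPy sketch with gate order input, forget, cell, output; the shapes are illustrative and biases are folded into `b`:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, Wx, Wh, b):
    """One LSTM time step. Wx: (d_in, 4d), Wh: (d, 4d), b: (4d,).

    The four gate blocks in z are ordered input (i), forget (f),
    cell candidate (g), output (o)."""
    d = h.shape[-1]
    z = x @ Wx + h @ Wh + b
    i, f, g, o = z[:d], z[d:2*d], z[2*d:3*d], z[3*d:]
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
    h_new = sigmoid(o) * np.tanh(c_new)               # emit hidden state
    return h_new, c_new

d_in, d = 3, 4
rng = np.random.default_rng(3)
x, h, c = rng.normal(size=d_in), np.zeros(d), np.zeros(d)
Wx = rng.normal(scale=0.1, size=(d_in, 4 * d))
Wh = rng.normal(scale=0.1, size=(d, 4 * d))
b = np.zeros(4 * d)
h1, c1 = lstm_step(x, h, c, Wx, Wh, b)
print(h1.shape, c1.shape)  # (4,) (4,)
```

Running the step over a sequence of spectral frames (one call per frame, threading `h` and `c` through) is what lets the cell accumulate temporal context.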

30 pages, 73805 KiB  
Article
DTT-CGINet: A Dual Temporal Transformer Network with Multi-Scale Contour-Guided Graph Interaction for Change Detection
by Ming Chen, Wanshou Jiang and Yuan Zhou
Remote Sens. 2024, 16(5), 844; https://doi.org/10.3390/rs16050844 - 28 Feb 2024
Cited by 1 | Viewed by 2011
Abstract
Deep learning has dramatically enhanced remote sensing change detection. However, existing neural network models often face challenges like false positives and missed detections due to factors like lighting changes, scale differences, and noise interference. Additionally, change detection results often fail to capture target contours accurately. To address these issues, we propose a novel transformer-based hybrid network. In this study, we analyze the structural relationship in bi-temporal images and introduce a cross-attention-based transformer to model this relationship. First, we use a tokenizer to express the high-level features of the bi-temporal image as several semantic tokens. Then, we use a dual temporal transformer (DTT) encoder to capture dense spatiotemporal contextual relationships among the tokens. The features extracted at the coarse scale are refined into finer details through the DTT decoder. Concurrently, we feed the backbone’s low-level features into a contour-guided graph interaction module (CGIM) that utilizes joint attention to capture semantic relationships between object regions and the contour. Then, we use a feature pyramid decoder to integrate the multi-scale outputs of the CGIM. Convolutional block attention modules (CBAMs) employ channel and spatial attention to reweight the feature maps. Finally, the classifier discriminates change pixels from the difference feature map and generates the final change map. Several experiments have demonstrated that our model shows significant advantages over other methods in terms of efficiency, accuracy, and visual effects.
(This article belongs to the Section AI Remote Sensing)
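CBAM is a published module: channel attention from average- and max-pooled descriptors through a shared MLP, followed by spatial attention over channel-wise average/max maps. The sketch below simplifies the spatial branch — a two-element weight `Ws` stands in for CBAM's 7×7 convolution — so treat it as an illustration of the reweighting idea only, not the exact module:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_reweight(x, W1, W2, Ws):
    """Channel then spatial reweighting of a (c, h, w) feature map.

    W1: (c, c // r) and W2: (c // r, c) form the shared channel MLP;
    Ws: (2,) mixes the channel-wise avg/max maps (a stand-in for CBAM's
    7x7 conv). Both attention maps lie in (0, 1), so magnitudes shrink.
    """
    avg, mx = x.mean(axis=(1, 2)), x.max(axis=(1, 2))            # (c,) descriptors
    ca = sigmoid(np.maximum(avg @ W1, 0) @ W2
                 + np.maximum(mx @ W1, 0) @ W2)                  # channel attention
    x = x * ca[:, None, None]
    sa = sigmoid(Ws[0] * x.mean(axis=0) + Ws[1] * x.max(axis=0)) # (h, w) spatial
    return x * sa[None, :, :]

rng = np.random.default_rng(4)
x = rng.normal(size=(8, 5, 5))                 # illustrative feature map
W1 = rng.normal(scale=0.3, size=(8, 2))        # reduction ratio r = 4
W2 = rng.normal(scale=0.3, size=(2, 8))
y = cbam_reweight(x, W1, W2, np.array([0.5, 0.5]))
print(y.shape)  # (8, 5, 5)
```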

15 pages, 1788 KiB  
Article
Multiple Information-Aware Recurrent Reasoning Network for Joint Dialogue Act Recognition and Sentiment Classification
by Shi Li and Xiaoting Chen
Information 2023, 14(11), 593; https://doi.org/10.3390/info14110593 - 1 Nov 2023
Cited by 1 | Viewed by 1710
Abstract
The task of joint dialogue act recognition (DAR) and sentiment classification (DSC) aims to predict both the act and sentiment labels of each utterance in a dialogue. Existing methods mainly focus on either the local or the global semantic features of the dialogue from a single perspective, disregarding the impact of the other part. Therefore, we propose a multiple information-aware recurrent reasoning network (MIRER). Firstly, the sequence information is passed to multiple local information layers for fine-grained feature extraction through a BiLSTM-connected hybrid CNN group method. Secondly, to obtain global semantic features that are speaker-, context-, and temporal-sensitive, we design a speaker-aware temporal reasoning heterogeneous graph to characterize interactions between utterances spoken by different speakers, incorporating different types of nodes and meta-relations with node-edge-type-dependent parameters. We also design a dual-task temporal reasoning heterogeneous graph to realize self-interaction and mutual interaction at the semantic and prediction levels, with labels constantly revised and improved during dual-task recurrent reasoning. MIRER fully integrates context-level features, fine-grained features, and global semantic features, including speaker, context, and temporal sensitivity, to better simulate conversation scenarios. We validated the method on two public dialogue datasets, Mastodon and DailyDialog, and the experimental results show that MIRER outperforms various existing baseline models.

19 pages, 3783 KiB  
Article
DSPose: Dual-Space-Driven Keypoint Topology Modeling for Human Pose Estimation
by Anran Zhao, Jingli Li, Hongtao Zeng, Hongren Cheng and Liangshan Dong
Sensors 2023, 23(17), 7626; https://doi.org/10.3390/s23177626 - 3 Sep 2023
Cited by 1 | Viewed by 2523
Abstract
Human pose estimation is the basis of many downstream tasks, such as motor intervention, behavior understanding, and human–computer interaction. Existing human pose estimation methods rely too heavily on the similarity of keypoints at the image feature level, which makes them vulnerable to three problems: object occlusion, keypoint ghosting, and interference from neighboring poses. We propose a dual-space-driven topology model for the human pose estimation task. Firstly, the model extracts relatively accurate keypoint features through a Transformer-based feature extraction method. Then, the correlation of keypoints in physical space is introduced to alleviate the erroneous localization caused by excessive dependence on feature-level representations. Finally, through a graph convolutional neural network, the spatial correlation of keypoints and the feature correlation are effectively fused to obtain more accurate human pose estimation results. Experimental results on real datasets further verify the effectiveness of our proposed model.

16 pages, 3633 KiB  
Article
HDGFI: Hierarchical Dual-Level Graph Feature Interaction Model for Personalized Recommendation
by Xinxin Ma and Zhendong Cui
Entropy 2022, 24(12), 1799; https://doi.org/10.3390/e24121799 - 9 Dec 2022
Viewed by 2126
Abstract
Against the background of information overload, recommendation systems have attracted wide attention as one of the most important means of addressing this problem. Feature interaction considers not only the impact of each individual feature but also the combination of two or more features, and it has become an important research field in recommendation systems. There are two essential problems in current feature interaction research. One is that not all feature interactions generate positive gains; some may increase noise. The other is that the process of feature interaction is implicit and uninterpretable. In this paper, a Hierarchical Dual-level Graph Feature Interaction (HDGFI) model is proposed to solve these problems in recommendation systems. The model regards features as nodes and interactions between features as edges in a graph structure. Interaction noise is filtered by selecting beneficial interactions with a hierarchical edge selection module. At the same time, the importance of interactions between nodes is modeled from two perspectives in order to learn the representations of feature nodes at a finer granularity. Experimental results show that the proposed HDGFI model achieves higher accuracy than existing models.
(This article belongs to the Special Issue Information Network Mining and Applications)
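The hierarchical edge selection module is specific to HDGFI, but the general idea of pruning a feature-interaction graph can be sketched simply: score each candidate feature pair and keep only the highest-scoring edges. The inner-product scoring and top-k rule below are generic stand-ins, not the paper's method:

```python
import numpy as np

def top_k_interactions(emb, k):
    """Score every feature pair by the inner product of its embeddings and
    keep the k highest-scoring edges of the interaction graph.

    emb: (n, d) feature-field embeddings. Returns [(i, j, score), ...].
    """
    n = emb.shape[0]
    scores = emb @ emb.T                       # pairwise interaction scores
    pairs = [(i, j, float(scores[i, j]))
             for i in range(n) for j in range(i + 1, n)]
    pairs.sort(key=lambda t: -t[2])            # strongest interactions first
    return pairs[:k]

emb = np.array([[1.0, 0.0],    # feature 0
                [0.9, 0.1],    # feature 1: nearly parallel to feature 0
                [0.0, 1.0]])   # feature 2: orthogonal to feature 0
edges = top_k_interactions(emb, k=2)
print(edges[0][:2])  # (0, 1) -- the most aligned pair survives selection
```

Edges that fall below the cutoff simply never enter the graph, which is one way to keep noisy interactions from contributing to the final representation.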

14 pages, 2312 KiB  
Article
Dual-Channel Interactive Graph Convolutional Networks for Aspect-Level Sentiment Analysis
by Zhouxin Lan, Qing He and Liu Yang
Mathematics 2022, 10(18), 3317; https://doi.org/10.3390/math10183317 - 13 Sep 2022
Cited by 7 | Viewed by 1761
Abstract
Aspect-level sentiment analysis aims to identify the sentiment polarity of one or more aspect terms in a sentence. At present, many researchers have applied dependency trees and graph neural networks (GNNs) to aspect-level sentiment analysis and achieved promising results. However, when a sentence contains multiple aspects, most methods model each aspect independently, ignoring the sentiment connections between aspects. To address this problem, this paper proposes a dual-channel interactive graph convolutional network (DC-GCN) model for aspect-level sentiment analysis. The model considers both syntactic structure information and multi-aspect sentiment dependencies in sentences and employs graph convolutional networks (GCNs) to learn their node representations. In particular, to better capture the representations of aspect and opinion words, we exploit the attention mechanism to interactively learn the syntactic information features and multi-aspect sentiment dependency features produced by the GCN. In addition, we construct the word embedding layer with the BERT pre-training model to better learn the contextual semantic information of sentences. The experimental results on the restaurant, laptop, and Twitter datasets show that, compared with the state-of-the-art models, the accuracy improves by up to 1.86%, 2.50%, 1.36%, and 0.38%, and the Macro-F1 values by up to 1.93%, 0.61%, and 0.4%, respectively.
(This article belongs to the Section E1: Mathematics and Computer Science)
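The GCN layer both channels rely on is the standard symmetric-normalized propagation rule, H' = ReLU(D̂^(-1/2) Â D̂^(-1/2) H W) with Â = A + I. A minimal NumPy sketch; the 3-node graph and identity weight are illustrative:

```python
import numpy as np

def gcn_layer(X, A, W):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * np.outer(d_inv_sqrt, d_inv_sqrt)   # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0)                # propagate, then ReLU

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # a 3-node path graph (e.g. dependency arcs)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])               # illustrative node features
H = gcn_layer(X, A, np.eye(2))
print(H.shape)  # (3, 2)
```

In a dual-channel setup, the same layer runs once with a syntactic adjacency and once with a multi-aspect sentiment adjacency; only `A` differs between the channels.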

19 pages, 3101 KiB  
Article
MBHAN: Motif-Based Heterogeneous Graph Attention Network
by Qian Hu, Weiping Lin, Minli Tang and Jiatao Jiang
Appl. Sci. 2022, 12(12), 5931; https://doi.org/10.3390/app12125931 - 10 Jun 2022
Cited by 5 | Viewed by 5433
Abstract
Graph neural networks are graph-based deep learning technologies that have attracted significant attention from researchers because of their powerful performance. Heterogeneous graph neural networks focus on the heterogeneity of the nodes and links in a graph, which is more effective at preserving semantic knowledge when representing data interactions in real-world graph structures. Unfortunately, most heterogeneous graph neural networks tend to transform heterogeneous graphs into homogeneous graphs when using meta-paths for representation learning. This paper therefore presents a novel motif-based hierarchical heterogeneous graph attention network algorithm, MBHAN, that addresses this problem by incorporating a hierarchical dual attention mechanism at the node level and motif level. Node-level attention aims to learn the importance between a node and its neighboring nodes within its corresponding motif. Motif-level attention learns the importance of different motifs in the heterogeneous graph. In view of the different vector-space features of different node types in heterogeneous graphs, MBHAN also aggregates the features of each node type separately, so that they can jointly participate in downstream tasks after passing through segregated, independent shallow neural networks. MBHAN’s superior network representation learning capability has been validated by extensive experiments on two real-world datasets.
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)
