1. Introduction
Most online course recommendation models rely on traditional machine learning or deep learning methods to model sequential data, neglecting the complex associations among course entities and lacking explicit representations of the relationships between courses. A good personalized recommendation model needs to fully explore the high-order latent relationships between learner nodes, such as course category connections, co-learning connections, and similarity connections. Moreover, most of these models capture users’ long-term preferences but ignore their recent learning preferences, which degrades recommendation quality. This paper presents an attention-based hypergraph neural network framework for personalized online learning recommendations. The algorithm fully uncovers high-order latent relationships among learners, courses, and instructors, such as co-learning relationships among learners, course category relationships, and learners’ similar preference relationships.
Personalized learning is an inevitable trend in the information era [1]. Tailoring education to individual students’ needs, while paying attention to their personal differences, is an important research topic in online course recommendation. Personalized recommendations on MOOC platforms intelligently match courses through user behavior analysis. For instance, when User A completes an introductory Python course with high scores, the system recommends “Practical Data Analysis” based on the learning paths of similar learners. If the user maintains consistent engagement during fixed evening hours (8–10 PM), live-streamed courses are prioritized. When repeated viewing of machine learning content is detected, the “Fundamentals of Artificial Intelligence” course is automatically highlighted on the homepage. This dynamic personalization enhances user engagement and improves course completion rates. Xia et al. [2] addressed critical limitations in existing implicit feedback algorithms, including unreasonable positive–negative sample division, neglect of user operation frequency, and inaccurate user preference modeling, proposing an Implicit Feedback and Weighted User Preference-based recommendation algorithm (IFW-LFM). Shen et al. [3] developed a forgetting function-enhanced Mean Bayesian Personalized Ranking (MBPR) algorithm, integrating the Ebbinghaus forgetting curve with conventional BPR methodology to achieve personalized recommendations. Du et al. [4] leveraged both low- and high-order user features through a Deep Factorization Machine architecture, combining information extraction units with cross-network structures to construct a hybrid recommendation model for deep behavioral preference mining. For sequential recommendation systems challenged by noise interference and imprecise signal representation, Wen et al. [5] proposed the DeepEMA model, employing exponential moving averaging for noise reduction and trend extraction, coupled with a multi-module framework to capture diverse signals, demonstrating effectiveness across four benchmark datasets. Wang [6] incorporated Mel filters into long-/short-term user preference modeling, accounting for emotional attention while analyzing temporal sequences, and subsequently developed a music recommendation algorithm validated against random forest and generalized predictive control baselines. Song [7] achieved precise knowledge demand pattern recognition by synthesizing multidimensional data, including user preference features, historical borrowing behaviors, and digital access logs, enabling intelligent matching of personalized knowledge resources. This data-driven service paradigm significantly enhanced collection utilization efficiency while optimizing the service experience through differentiated demand fulfillment. Tong et al. [8] employed virtual simulation technology to extract characteristic patterns from learning resources, establishing personalized features via data function modeling with concurrent anomaly detection to minimize bias, ultimately constructing data association nodes for intelligent resource recommendation in virtual training environments. Deng et al. [9] implemented natural language processing techniques for multidimensional feature extraction from course descriptions, instructor qualifications, and institutional profiles. By analyzing historical course selection records and associated features to build user preference models, their XGBoost-based framework generated personalized course recommendations through feature vector matching, achieving accuracy improvements via systematic feature engineering and machine learning integration.
Deep learning architectures simulate the hierarchical structure of the human brain and are effective at extracting features from content information. Deep learning models therefore have unique advantages in feature extraction, enabling more accurate representations of external data, and the field of deep learning in personalized recommendation is flourishing [10]. Leveraging the powerful learning capabilities of neural networks can better capture students’ complex learning patterns and knowledge states while effectively handling large-scale learning data [11,12]. However, traditional machine learning and deep learning methods model sequential data in isolation, ignoring the complex associations among entities. With the growing need to address complex relationships and structural issues, researchers have turned to graph data structures, which have proven more effective in capturing relationships between entities [13].
In this context, Graph Neural Networks (GNNs; see Appendix A) have been introduced into personalized recommendation. GNNs are a class of deep learning models specifically designed to handle graph-structured data. They can not only effectively capture relationships between entities but also learn global information from the entire graph. GNNs are advantageous tools for modeling multiple entities and their relationships in recommendation systems, serving as a form of graph embedding technology that maps learners, courses, and knowledge concepts into a unified vector space. By computing vector similarities or predicting edge properties, GNNs transform recommendation problems into classical graph tasks, thereby providing new enhanced functionalities for intelligent recommendation services [14,15,16].
Combining attention mechanisms allows the model to focus on the most relevant nodes or edges during message propagation. Attention mechanisms [17,18] enable the model to focus on the most relevant parts of the input data, thereby improving the performance and effectiveness of the recommendation model. Incorporating a mechanism that accounts for the differing importance of nodes enhances the model’s ability to learn representations. Unlike traditional GNNs, attention mechanisms can dynamically learn weights based on the similarity between nodes or edges. This dynamically adaptive attention mechanism can serve as a resource allocation scheme and is a primary approach to addressing information overload: when computational resources are limited, they can be devoted to the most important information.
For online course recommendation, although a significant amount of research has been conducted, existing studies still suffer from several issues. Current research primarily employs machine learning and deep learning methods to model online courses while neglecting sequential relationships between courses. Few studies utilize multi-modal heterogeneous data to deeply investigate personalized course recommendation through hypergraphs and attention mechanisms. There is limited exploration of common learning behaviors and similar learning interactions among different learners, with insufficient research on personalized recommendation from the perspective of global relational graphs. Traditional course recommendation models assign equal weights to each learner and recommended item, and their node embedding methods overlook the varying importance of nodes by connecting and projecting all nodes into the same space. In reality, different nodes may have distinct significance; for instance, beginners and advanced learners likely prioritize different course content (e.g., programming novices and algorithm engineers may weight their demand for a Python course very differently).
To address these challenges, this study makes the following contributions:
We propose a heterogeneous hypergraph and attention mechanism-based online course recommendation algorithm (HHAOCR). By constructing a heterogeneous hypergraph with entities including students, courses, and instructors, our model globally explores sequential relationships, co-learning relationships, and similarity relationships among courses. Compared to traditional recommendation models, this framework more comprehensively captures inter-entity connections, thereby enhancing recommendation accuracy.
To tackle the difficulty in discovering high-order connectivity and nonlinear relationships within multi-modal heterogeneous data in online learning scenarios, we designed five heterogeneous hypergraphs using hypergraph convolution operators. Each vertex aggregates information through its associated hyperedges, enabling the model to more accurately capture high-order relationships among students, instructors, and courses.
We propose a fusion method integrating heterogeneous hypergraphs with attention mechanisms. By combining attention mechanisms with hypergraph convolution operators, we develop a dynamically adaptive attention mechanism. This includes a coefficient matrix calculation method between hyperedges and vertices, as well as an attention score computation method between hyperedges, to identify the most relevant nodes or edges in the graph. This mechanism allows dynamic weight adjustment during feature extraction, enabling the model to focus precisely on critical information and relationships, thereby significantly improving the personalization and precision of recommendations.
The rest of this paper is organized as follows: Section 2 presents the fundamental preliminaries and preparatory work. Section 3 provides the formal definition and a comprehensive description of our proposed algorithm. Section 4 details the experimental datasets and methodological procedures, then analyzes and discusses the experimental results. Section 5 concludes the study with key findings. Finally, Section 6 critically examines the limitations of this work and outlines potential directions for future research.
3. Proposed Method
3.1. Model Construction Based on Hypergraphs and Attention Mechanisms
We propose a Heterogeneous Hypergraph and Attention-based Online Course Recommendation (HHAOCR) algorithm. The model architecture is shown in Figure 1. The model first learns course embedding representations from the interaction sequences of online learning courses. These embeddings are fed into an attention layer, which helps the model focus on important nodes, and subsequently into an LSTM network. Simultaneously, the interaction sequences are combined with instructor and course sequences to construct a hypergraph, whose features are learned through a hypergraph neural network; this helps the model capture the higher-order relationships between learners, courses, and instructors. The hypergraph features are then fused with the course embedding vectors, and the model finally produces online course recommendations.
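To make the pipeline concrete, the following is a minimal Keras sketch of the architecture just described; the vocabulary size, layer widths, and the assumption that hypergraph features arrive as a precomputed vector are illustrative choices, not the exact configuration of HHAOCR.

```python
import tensorflow as tf

# Hedged sketch of the forward pass in Figure 1. The course-vocabulary size
# (706) and all layer widths are illustrative assumptions.
course_seq = tf.keras.Input(shape=(None,), dtype=tf.int32)       # course interaction sequence
emb = tf.keras.layers.Embedding(input_dim=706, output_dim=128)(course_seq)
attn = tf.keras.layers.Attention()([emb, emb])                   # attention over important nodes
seq_repr = tf.keras.layers.LSTM(128)(attn)                       # sequential preference encoding

hyper_repr = tf.keras.Input(shape=(128,))                        # features from the hypergraph network
fused = tf.keras.layers.Concatenate()([seq_repr, hyper_repr])    # fuse the two branches
score = tf.keras.layers.Dense(1)(fused)                          # preference score for recommendation

model = tf.keras.Model(inputs=[course_seq, hyper_repr], outputs=score)
```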
3.2. Definition of Hypergraph
A hypergraph can be defined as H = {V, E}, where V = {v1, v2, v3, …, vn} is the vertex set and E = {e1, e2, e3, …, ek} is the hyperedge set. Unlike edges in ordinary graphs, each hyperedge e ∈ E is a nonempty subset of V and can therefore connect multiple vertices; each hyperedge induces a subgraph of the hypergraph H. The hypergraph can be represented by an incidence matrix $\mathbf{H} \in \mathbb{R}^{n \times k}$, where n represents the number of vertices and k represents the number of hyperedges, as shown in Figure 2. The incidence matrix is defined as follows:

$h(v, e) = \begin{cases} 1, & \text{if } v \in e \\ 0, & \text{otherwise} \end{cases}$ (1)
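As a quick illustration of Equation (1), the sketch below builds the incidence matrix for a toy hypergraph with four vertices and two hyperedges; the membership lists are made up for the example.

```python
import numpy as np

# Toy hypergraph: e1 = {v0, v1, v2}, e2 = {v2, v3} (illustrative memberships).
hyperedges = [{0, 1, 2}, {2, 3}]
n, k = 4, len(hyperedges)

H = np.zeros((n, k))                 # incidence matrix, vertices x hyperedges (Eq. (1))
for e, members in enumerate(hyperedges):
    for v in members:
        H[v, e] = 1.0

print(H)
# [[1. 0.]
#  [1. 0.]
#  [1. 1.]
#  [0. 1.]]
```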
The vertex classification problem on a hypergraph can be described using the following regularization framework [30]:

$\arg\min_{f} \left\{ \mathcal{R}_{emp}(f) + \lambda \, \Omega(f) \right\}$ (2)

Here, $\Omega(f)$ is a regularization term on the hypergraph, $\mathcal{R}_{emp}(f)$ represents the supervised empirical loss, and $f$ is a classification function. The regularization term $\Omega(f)$ is defined as follows:

$\Omega(f) = \frac{1}{2} \sum_{e \in E} \sum_{u, v \in e} \frac{w(e)}{\delta(e)} \left( \frac{f(u)}{\sqrt{d(u)}} - \frac{f(v)}{\sqrt{d(v)}} \right)^{2}$ (3)
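A direct NumPy transcription of Equation (3) may help clarify the notation; here `f` is a vector of per-vertex scores, and `w` holds the hyperedge weights (the degree definitions used below are given in Equations (4) and (5) of the next subsection).

```python
import numpy as np

def hypergraph_regularizer(f, H, w):
    """Compute Omega(f) from Equation (3) by looping over hyperedges."""
    d = H @ w                 # vertex degrees, Eq. (4)
    delta = H.sum(axis=0)     # hyperedge degrees, Eq. (5)
    omega = 0.0
    for e in range(H.shape[1]):
        verts = np.flatnonzero(H[:, e])
        for u in verts:
            for v in verts:
                omega += (w[e] / delta[e]) * (f[u] / np.sqrt(d[u]) - f[v] / np.sqrt(d[v])) ** 2
    return omega / 2.0
```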
3.3. Hypergraph Convolution Operator
First, the degree of a hypergraph vertex is defined as $d(v)$, which represents the (weighted) number of hyperedges passing through the vertex:

$d(v_i) = \sum_{e \in E} w(e) \, h(v_i, e)$ (4)

In Equation (4), $w(e)$ represents the weight of a hyperedge connected to vertex i, indicating the contribution of that hyperedge to the vertex degree. $h(v_i, e)$ indicates whether the i-th vertex is connected to hyperedge e: $h(v_i, e) = 1$ if connected, and $h(v_i, e) = 0$ otherwise.

The degree of a hyperedge represents how many vertices the hyperedge connects, and it is defined as follows:

$\delta(e) = \sum_{v \in V} h(v, e)$ (5)

The diagonal matrices N and S are, respectively, the degree matrices of vertices and hyperedges in the hypergraph.

In the hypergraph, convolutional network operators are used to estimate the probabilities of transitions between vertices, so that the embedding representation of each vertex can be propagated throughout the entire hypergraph. The hypergraph convolutional network operator is defined as follows:

$X^{(l+1)} = \sigma\left( N^{-1/2} \mathbf{H} W S^{-1} \mathbf{H}^{\top} N^{-1/2} X^{(l)} P \right)$ (6)

In Equation (6), $X^{(l)}$ represents the vertex features input at layer l, $X^{(l+1)}$ represents the vertex features output at layer l + 1, P represents the trainable weight matrix between layers l and l + 1, $\sigma$ represents an activation function, W is the diagonal matrix of hyperedge weights, N represents the degree matrix of vertices in the hypergraph, and S represents the degree matrix of hyperedges in the hypergraph. $\mathbf{H}$ is the incidence matrix with vertices as rows and hyperedges as columns; conversely, if a message-passing path from vertices to hyperedges is specified, vertices are considered as columns and hyperedges as rows. The specific operational steps are as follows:
Hyperedge Feature Construction: Construct the feature representation of hyperedges by aggregating the features of vertices connected to each hyperedge.
Vertex Feature Classification and Aggregation: Classify and aggregate the features of all hyperedges related to a specific vertex to generate an enhanced vertex feature representation.
Model Training and Nonlinear Activation: Train the model using a nonlinear activation function to capture higher-order dependencies and complex structural features in the hypergraph.
In summary, this hypergraph convolutional network operator effectively captures higher-order correlations in the hypergraph, thereby significantly enhancing the model’s expressive power and generalization performance.
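The operator in Equation (6) can be written compactly in NumPy. The sketch below implements a single layer under the definitions above; the tanh activation is an illustrative choice.

```python
import numpy as np

def hypergraph_conv(X, H, w, P, activation=np.tanh):
    """One hypergraph convolution layer, Equation (6).

    X: (n, d) vertex features at layer l; H: (n, k) incidence matrix;
    w: (k,) hyperedge weights; P: (d, d') trainable weight matrix.
    """
    W = np.diag(w)                                  # hyperedge weight matrix
    N_inv_sqrt = np.diag(1.0 / np.sqrt(H @ w))      # N^{-1/2}, vertex degrees from Eq. (4)
    S_inv = np.diag(1.0 / H.sum(axis=0))            # S^{-1}, hyperedge degrees from Eq. (5)
    # vertices -> hyperedges -> vertices message passing, then linear transform
    return activation(N_inv_sqrt @ H @ W @ S_inv @ H.T @ N_inv_sqrt @ X @ P)
```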
3.4. Heterogeneous Hypergraph and Attention Fusion
3.4.1. Construction of Heterogeneous Hypergraphs
To more accurately capture the higher-order relationships between students, instructors, and courses, five types of heterogeneous hypergraphs are constructed: ① a hypergraph with students as hyperedges and courses as vertices, with incidence matrix $\mathbf{H}_1$; ② a hypergraph with courses as hyperedges and students as vertices, with incidence matrix $\mathbf{H}_2$; ③ a hypergraph with instructors as hyperedges and students as vertices, with incidence matrix $\mathbf{H}_3$; ④ a hypergraph where courses with similar selections are hyperedges and students are vertices, with incidence matrix $\mathbf{H}_4$; and ⑤ a hypergraph with instructors as hyperedges and courses as vertices, with incidence matrix $\mathbf{H}_5$.
These five types of hypergraphs include student nodes, instructor nodes, course nodes, and hyperedges, which further describe higher-order relationships between different types of nodes through hyperedges. First Hypergraph: The subgraph of students and courses represents the collection of course sequences for students’ online learning. This can identify connections between students and uncover relationships among them. Second Hypergraph: The subgraph of courses and students represents the collection of students enrolled in a specific online course. This subgraph can identify students’ online course learning behaviors, thereby recognizing similarities in their course selections. Third Hypergraph: The subgraph of online instructors and students reveals which students chose courses taught by a particular instructor. It represents the collection of students enrolled in a specific instructor’s online courses and indirectly reflects the popularity of the instructor’s courses and students’ preferences for instructors. Fourth Hypergraph: The subgraph of similar course selections and students explores the similarity in courses taken by students directly connected (e.g., through edges) and those not directly connected (not connected by edges) to a particular student. Fifth Hypergraph: The subgraph of instructors and courses represents the collection of courses offered by online instructors.
Traditionally, such definitions are given per hyperedge. To reduce the complexity of the algorithm, we define the incidence matrices directly over the entity sets, with $\mathbf{H}_1 \in \mathbb{R}^{j \times k}$, $\mathbf{H}_2 \in \mathbb{R}^{k \times j}$, $\mathbf{H}_3 \in \mathbb{R}^{k \times l}$, $\mathbf{H}_4 \in \mathbb{R}^{(c+n) \times j}$, and $\mathbf{H}_5 \in \mathbb{R}^{j \times l}$, where k, j, and l represent the number of students, courses, and instructors, respectively. c and n represent the number of students who are directly connected and chose similar courses and the number of students who are not directly connected but chose similar courses, respectively.
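As a hedged sketch of how these incidence matrices can be assembled from interaction records (the record format, helper, and toy data below are illustrative, not the paper's exact data pipeline):

```python
import numpy as np

def build_incidence(pairs, n_vertices, n_hyperedges):
    """Incidence matrix H (vertices x hyperedges) from (vertex, hyperedge) id pairs."""
    H = np.zeros((n_vertices, n_hyperedges))
    for v, e in pairs:
        H[v, e] = 1.0
    return H

# Illustrative counts: k students, j courses, l instructors.
k, j, l = 5, 4, 2
enrollments = [(0, 0), (1, 0), (1, 1), (2, 2), (3, 4)]   # (course, student) pairs, made up
teaches = [(0, 0), (1, 0), (2, 1), (3, 1)]               # (course, instructor) pairs, made up

H1 = build_incidence(enrollments, j, k)   # students as hyperedges, courses as vertices
H2 = H1.T                                 # courses as hyperedges, students as vertices
H5 = build_incidence(teaches, j, l)       # instructors as hyperedges, courses as vertices
# H3 (instructors over students) and H4 (similar-selection courses over students)
# are built the same way from attendance and course-similarity records.
```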
Based on the definition of the hypergraph convolution operator given in Equations (4)–(6) of Section 3.3, the convolution formulas for these five hypergraphs are defined as Equations (7)–(11) below.
First, for the hypergraph with students as hyperedges and courses as vertices, the hypergraph convolution formula is defined as:

$X_1^{(l+1)} = \sigma\left( N_1^{-1/2} \mathbf{H}_1 W_1 S_1^{-1} \mathbf{H}_1^{\top} N_1^{-1/2} X_1^{(l)} P_1 \right)$ (7)

Second, for the hypergraph with courses as hyperedges and students as vertices, the hypergraph convolution formula is defined as:

$X_2^{(l+1)} = \sigma\left( N_2^{-1/2} \mathbf{H}_2 W_2 S_2^{-1} \mathbf{H}_2^{\top} N_2^{-1/2} X_2^{(l)} P_2 \right)$ (8)

Third, for the hypergraph with instructors as hyperedges and students as vertices, the hypergraph convolution formula is defined as:

$X_3^{(l+1)} = \sigma\left( N_3^{-1/2} \mathbf{H}_3 W_3 S_3^{-1} \mathbf{H}_3^{\top} N_3^{-1/2} X_3^{(l)} P_3 \right)$ (9)

Fourth, for the hypergraph with courses that have similar elective patterns as hyperedges and students as vertices, the hypergraph convolution formula is defined as:

$X_4^{(l+1)} = \sigma\left( N_4^{-1/2} \mathbf{H}_4 W_4 S_4^{-1} \mathbf{H}_4^{\top} N_4^{-1/2} X_4^{(l)} P_4 \right)$ (10)

Fifth, for the hypergraph with instructors as hyperedges and courses as vertices, the hypergraph convolution formula is defined as:

$X_5^{(l+1)} = \sigma\left( N_5^{-1/2} \mathbf{H}_5 W_5 S_5^{-1} \mathbf{H}_5^{\top} N_5^{-1/2} X_5^{(l)} P_5 \right)$ (11)
In the matrix multiplications described in Equations (7)–(11) above, embedding fusion representation learning from hypergraph vertices to hyperedges is achieved, capturing the high-order features of online students’ courses and preferences. Multiple users and multiple courses are processed using hypergraph convolution calculations, and the results are concatenated for further operations, as sketched below.
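A hedged toy illustration of this step, reusing the `hypergraph_conv` sketch from Section 3.3: the three hypergraphs whose vertices are students (②, ③, ④) produce student representations that can be concatenated along the feature axis. All shapes and the random data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
k, j, l, d = 5, 4, 2, 8                    # students, courses, instructors, feature dim
X_s = rng.normal(size=(k, d))              # initial student features

student_views = []
for n_edges in (j, l, j):                  # hyperedge counts for H2, H3, H4 (illustrative)
    H = (rng.random((k, n_edges)) > 0.5).astype(float)
    H[H.sum(axis=1) == 0, 0] = 1.0         # every vertex on at least one hyperedge
    H[0, H.sum(axis=0) == 0] = 1.0         # every hyperedge nonempty
    student_views.append(hypergraph_conv(X_s, H, np.ones(n_edges), rng.normal(size=(d, d))))

fused_students = np.concatenate(student_views, axis=-1)   # (k, 3d) fused representation
```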
In the convolution operations of the five heterogeneous hypergraphs described above, each vertex aggregates information through its associated hyperedges. However, the relationships between different hyperedges and vertices possess varying degrees of importance. To further enhance the model’s capability to capture the significance of different relationships and nodes, an attention mechanism is introduced into the model.
3.4.2. Calculation of Attention Weights
In heterogeneous hypergraphs, each vertex is connected to multiple hyperedges, and each hyperedge, in turn, connects multiple vertices. To capture the differences in importance between different hyperedges and vertices, attention mechanisms based on hyperedge aggregation and vertex aggregation are designed, respectively. The embedding matrices of the five different types of sub-hypergraphs and hyperedges mentioned above will be input into the attention network.
In the algorithm framework, the attention mechanism based on hyperedge aggregation aggregates the embeddings of all hyperedges connected to a particular vertex. The weight matrix P trained in Equation (6) is used to transform the features of vertices and hyperedges. The coefficient matrix between hyperedges and vertices can be calculated using the following formula:

$\alpha_{ev} = \frac{\exp\left( \mathrm{LeakyReLU}\left( \mathbf{a}^{\top} \left[ P x_e \,\|\, P x_v \right] \right) \right)}{\sum_{u \in \mathcal{N}_e} \exp\left( \mathrm{LeakyReLU}\left( \mathbf{a}^{\top} \left[ P x_e \,\|\, P x_u \right] \right) \right)}$ (12)

In Equation (12), $P x_e$ and $P x_v$ represent the transformed feature vectors of the hyperedge and vertex nodes, and $[\cdot \,\|\, \cdot]$ denotes their concatenation. $P$ is the weight matrix for linear transformation, $\mathbf{a}$ is a learnable attention vector, $\mathcal{N}_e$ is the set of all vertices connected to hyperedge $e$, and $x_v$ is the feature vector of vertex $v$. The feature vector of hyperedge $e$ is $x_e = \sum_{v \in \mathcal{N}_e} \alpha_{ev} P x_v$, i.e., the weighted sum of the feature vectors of the vertices connected to the hyperedge.
The attention mechanism based on vertex aggregation is similar to the above hyperedge-based attention aggregation mechanism, as they share the same coefficient matrix. The feature representations of hyperedges and vertices in the other four types of hypergraphs are obtained using the same method.
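A GAT-style sketch of Equation (12) under the assumptions stated there (in particular, the learnable attention vector `a` is an assumption of this reconstruction):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hyperedge_aggregate(x_e, X_v, P, a, slope=0.2):
    """Hyperedge-based attention aggregation, Eq. (12); a hedged sketch.

    x_e: (d,) current hyperedge feature; X_v: (m, d) features of the m vertices
    connected to the hyperedge; P: (d, d') linear transform; a: (2*d',) attention vector.
    """
    pe, Pv = P.T @ x_e, X_v @ P                                   # transformed features
    scores = np.array([np.concatenate([pe, pv]) @ a for pv in Pv])
    scores = np.where(scores > 0, scores, slope * scores)         # LeakyReLU
    alpha = softmax(scores)                                       # coefficients alpha_ev
    return alpha @ Pv                                             # x_e = sum_v alpha_ev * P x_v
```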
To reveal the similarity between the course lists that online learners focus on, i.e., the similarity between hyperedges whose vertices are courses, note that if the courses two online learners focus on are highly similar, the list of courses they both focus on is relatively long, indicating a high attention score between them; otherwise, the attention score is low. The attention scores between hyperedges are calculated using Equations (13) and (14), as follows:

$\mathrm{sim}(e_i, e_j) = \frac{x_{e_i} \cdot x_{e_j}}{\| x_{e_i} \|_2 \, \| x_{e_j} \|_2}$ (13)

Equation (13) calculates the cosine similarity between hyperedge $e_i$ and hyperedge $e_j$: the numerator is the dot product of vector $x_{e_i}$ and vector $x_{e_j}$, and $\| x_{e_i} \|_2$ and $\| x_{e_j} \|_2$ represent the L2 norms of the two vectors. The formula for calculating the attention coefficients between hyperedges is shown in Equation (14), and the calculation method is as follows:
$\beta_{ij} = \frac{| e_i \cap e_j |}{\max(| e_i |, | e_j |)}$ (14)

In Equation (14), $|e_i|$ and $|e_j|$ represent the number of vertices in hyperedge i and hyperedge j, respectively. Since two hyperedges may have different numbers of vertices, the calculation uses the maximum of the two vertex counts to normalize the attention coefficient. If the two hyperedges share no common list of learning courses, the attention coefficient is 0; the larger the common course list, the higher the attention coefficient.
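Both quantities can be computed directly from the hyperedge embeddings and membership sets. A minimal sketch follows; how the two scores are combined inside the attention layer is left open, as the text does not pin it down.

```python
import numpy as np

def hyperedge_similarity(x_i, x_j):
    """Cosine similarity between two hyperedge embeddings, Eq. (13)."""
    return x_i @ x_j / (np.linalg.norm(x_i) * np.linalg.norm(x_j))

def hyperedge_attention_coeff(members_i, members_j):
    """Overlap-based attention coefficient, Eq. (14): 0 when no courses are shared."""
    return len(members_i & members_j) / max(len(members_i), len(members_j))
```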
3.5. Model Optimization
Finally, by utilizing the embedding representations of users and courses, the preference of user u for online course c is predicted, as represented below:

$\hat{y}_{uc} = \mathbf{x}_u^{\top} \mathbf{x}_c$ (15)
To optimize the model parameters, the pairwise BPR (Bayesian Personalized Ranking) loss function, which is commonly used in recommendation systems [31], is adopted. The selection of courses in the interactions between students and online courses reflects students’ learning preferences. Specifically, each observed course and an unobserved course form a pair of items. BPR aims to optimize the ranking of these pairs, with the objective function being:

$\mathcal{L} = \sum_{(u, i, j)} -\ln \sigma\left( \hat{y}_{ui} - \hat{y}_{uj} \right) + \lambda \| \Theta \|_2^2$ (16)

In Equation (16), i and j represent the positive and negative instances, respectively, $\sigma$ represents the sigmoid activation function, $\Theta$ represents the set of model parameters, and $\lambda$ controls the L2 regularization strength to prevent overfitting. The Adam optimizer is used to train the model on this objective. During training, users u are randomly sampled; each observed course in a user’s interaction sequence is treated as a positive instance i, and an unobserved course is sampled as the negative instance j, forming (u, i, j) triplets. The objective function is used to iteratively train and update the model parameters.
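A minimal TensorFlow sketch of the training objective in Equation (16); the epsilon guard and the way parameters are passed in are implementation conveniences, not details specified by the paper.

```python
import tensorflow as tf

def bpr_loss(pos_scores, neg_scores, params, reg_lambda=0.01):
    """Pairwise BPR loss, Equation (16).

    pos_scores / neg_scores: predicted preferences y_ui / y_uj for sampled triplets.
    params: trainable variables; reg_lambda: L2 strength (0.01 was optimal here).
    """
    ranking = -tf.reduce_sum(tf.math.log(tf.sigmoid(pos_scores - neg_scores) + 1e-10))
    l2 = tf.add_n([tf.nn.l2_loss(p) for p in params])
    return ranking + reg_lambda * l2
```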
4. Experiments
4.1. Datasets
MOOCCube is a large-scale dataset for MOOC (Massive Open Online Course) research, with all data sourced from real-world online classrooms. It includes 199,199 online learners, 1736 online instructors, and more than 700 courses. In the experiments, the training, validation, and testing data consist of the historical online learning data of 18,890 learners and the teaching data of 1250 online instructors. We allocated 60% of the total data for training, 20% for validation, and the remaining 20% for testing.
4.2. Parameter Settings
In the experiments, the top-k recommended courses are selected as the final evaluation results, with k = {5, 10, 15}. The learning rate = {0.001, 0.005, 0.01, 0.05, 0.1}, the embedding dimensions for student users and courses D = {32, 64, 128, 256, 512}, and the batch size Batch-Size = {32, 64, 128, 256}. The model was trained on an NVIDIA GeForce RTX 3090 GPU (Nvidia, Santa Clara, CA, USA) and built using the TensorFlow 2.0 framework. Optuna 2.0.0 was used to search for the optimal hyperparameters, and the best results were achieved with an embedding dimension D of 128, a batch size of 256, and a learning rate of 0.01. The optimal regularization coefficient λ was found to be 0.01 through experimentation.
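The hyperparameter search can be reproduced in outline with Optuna, as below; `train_and_evaluate` is a hypothetical routine standing in for the full training loop, and the trial count is an arbitrary choice.

```python
import optuna

def objective(trial):
    lr = trial.suggest_categorical("learning_rate", [0.001, 0.005, 0.01, 0.05, 0.1])
    dim = trial.suggest_categorical("embedding_dim", [32, 64, 128, 256, 512])
    batch = trial.suggest_categorical("batch_size", [32, 64, 128, 256])
    # train_and_evaluate is a hypothetical stand-in returning validation NDCG
    return train_and_evaluate(lr, dim, batch)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
print(study.best_params)
```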
4.3. Evaluation Metrics
To objectively evaluate the performance of the various algorithms, the metrics used include NDCG (Normalized Discounted Cumulative Gain), Recall, and F1-Score. The higher the values of these metrics, the better the model’s performance. They are defined as shown in Equations (17)–(19).
The NDCG formula is as follows:

$\mathrm{NDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}, \quad \mathrm{DCG@}k = \sum_{i=1}^{k} \frac{rel_i}{\log_2 (i+1)}, \quad \mathrm{IDCG@}k = \sum_{i=1}^{k} \frac{rel_i^{ideal}}{\log_2 (i+1)}$ (17)

In Equation (17), DCG (Discounted Cumulative Gain) represents the unnormalized cumulative gain, and IDCG (Ideal DCG) represents the maximum possible DCG value under the ideal ranking. $rel_i$ represents the relevance score of the item at the i-th position in the recommendation list, and $rel_i^{ideal}$ represents the relevance score of the item at the i-th position in the ideal ranking.
The Recall formula is as follows:

$\mathrm{Recall} = \frac{TP}{TP + FN}$ (18)

The F1-Score formula is as follows:

$\mathrm{F1\text{-}Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}$ (19)

In Equations (18) and (19), TP, FN, and FP represent true positives, false negatives, and false positives, respectively, in the recommendation results.
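The three metrics translate directly into code. A sketch assuming binary relevance (an item is relevant iff the user actually took the course), which is a common convention though not stated explicitly here:

```python
import numpy as np

def ndcg_at_k(ranked_items, relevant, k):
    """NDCG@k with binary relevance, Eq. (17)."""
    gains = [1.0 / np.log2(i + 2) for i, item in enumerate(ranked_items[:k]) if item in relevant]
    ideal = [1.0 / np.log2(i + 2) for i in range(min(k, len(relevant)))]
    return sum(gains) / sum(ideal) if ideal else 0.0

def recall_at_k(ranked_items, relevant, k):
    """Recall@k, Eq. (18): hits among the top-k over all relevant items."""
    hits = len(set(ranked_items[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def f1_at_k(ranked_items, relevant, k):
    """F1-Score@k, Eq. (19), from precision@k and recall@k."""
    hits = len(set(ranked_items[:k]) & relevant)
    precision, recall = hits / k, hits / len(relevant)
    return 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0
```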
From Figure 3, Figure 4 and Figure 5, it can be observed that the impact of the learning rate on NDCG, F1-Score, and Recall first increases and then decreases, with a learning rate of around 0.01 yielding the best results. The NDCG, F1-Score, and Recall metrics gradually improve as the length of the recommendation list increases, indicating that a longer recommendation list allows more relevant courses to be recommended.
From Figure 6, Figure 7 and Figure 8, it can be observed that the NDCG, F1-Score, and Recall metrics perform best when the embedding dimension D is 128. This may be because, at this dimension, the model can effectively capture the characteristics of the data. As D increases further, all three metrics begin to decline, indicating that the model starts to overfit when the embedding dimension exceeds 128.
From Figure 9, Figure 10 and Figure 11, it can be observed that the model performs best on the NDCG, F1-Score, and Recall metrics when the batch size is 128; beyond 128, performance on these metrics begins to decline. This may be because, at this batch size, the model can perform gradient updates effectively and converge stably.
4.4. Comparative Methods
We selected feature-based methods and deep learning-based methods to conduct comparative experiments with our proposed method. The methods involved in the comparative experiments are as follows:
POP [32]: A non-personalized popularity baseline that ranks courses by overall popularity and recommends the most frequently selected items to every user.
BPR [33]: A Bayesian optimization approach for recommendation tasks, providing a generic learning algorithm that optimizes models with respect to the BPR-Opt criterion, based on stochastic gradient descent with bootstrap sampling.
GRU4Rec [34]: Implements session-based recommendation using a recurrent neural network architecture based on GRUs.
BERT4Rec [31]: A classic pre-training method that realizes a bidirectional transformer-based recommendation model.
Light-GCN [35]: Learns student and course embeddings by propagating them over the student-course interaction graph and uses the weighted sum of the embeddings learned at all layers as the final embeddings.
TP-GNN [36]: A graph neural network-based MOOC course recommendation method that captures high-level semantic relationships between courses through graph convolutional networks.
IRS-GCNet [37]: A university course intelligent recommendation system based on graph convolutional networks. The system uses graph convolutional networks to obtain representations of students’ English skills and employs an adjacency contrastive learning strategy to reduce errors caused by information loss during message passing between neighboring nodes.
4.5. Experimental Results and Analysis
We evaluated the above seven algorithms using the NDCG, F1-Score, and Recall metrics, assessing the recommendation capabilities of the top 5, top 10, and top 15 recommendations for each algorithm. To ensure fair comparative experiments, each algorithm used the parameter settings from its original article, and the Bayesian hyperparameter optimization library Optuna was employed to find the optimal hyperparameters. The experimental results are shown in Table 1, Table 2 and Table 3 and Figure 12. The last three columns of Table 1, Table 2 and Table 3 correspond to the differences in the three metrics between each algorithm and the proposed HHAOCR.
From the comparative experimental results in Table 1, Table 2 and Table 3 and Figure 12, the proposed heterogeneous hypergraph and attention-based online course recommendation algorithm (HHAOCR) was compared against the seven baselines, comprising feature-based methods (POP, BPR) and deep learning-based methods (GRU4Rec, BERT4Rec, Light-GCN, TP-GNN, IRS-GCNet). The following conclusions were drawn:
In the case of recommending the top 5 items, the HHAOCR algorithm demonstrated the best performance, with NDCG@5 reaching 0.7965, F1-Score@5 at 0.6945, and Recall@5 at 0.7102. Compared to the second-ranked IRS-GCNet, HHAOCR improved these three metrics by 0.0173, 0.0408, and 0.0510, respectively. HHAOCR also significantly outperformed POP, BPR, GRU4Rec, BERT4Rec, and Light-GCN, indicating a clear advantage in accuracy and recall for the top 5 recommendations.
In the case of recommending the top 10 items, the HHAOCR algorithm still performed exceptionally well, with NDCG@10 reaching 0.8593, F1-Score@10 at 0.7813, and Recall@10 at 0.7514. Compared to the second-ranked IRS-GCNet, HHAOCR improved by 0.0613, 0.0824, and 0.0724, respectively. Compared to the other algorithms, HHAOCR’s NDCG@10 and F1-Score@10 were 0.5233 and 0.3939 higher than POP’s, 0.3736 and 0.2906 higher than BPR’s, and 0.3179 and 0.3024 higher than GRU4Rec’s. This further validates HHAOCR’s superior performance in recommending the top 10 items.
In the case of recommending the top 15 items, the HHAOCR algorithm continued to perform excellently, with NDCG@15 reaching 0.8938, F1-Score@15 at 0.8092, and Recall@15 at 0.7698. Compared to the second-ranked TP-GNN, HHAOCR improved by 0.0699, 0.0907, and 0.0509, respectively. Compared to the other algorithms, HHAOCR’s NDCG@15 and F1-Score@15 were 0.4001 and 0.2966 higher than POP’s, 0.3749 and 0.2828 higher than BPR’s, and 0.2595 and 0.2609 higher than GRU4Rec’s. This indicates that HHAOCR retains a significant advantage in recommending the top 15 items.
Although POP, BPR, GRU4Rec, BERT4Rec, Light-GCN, TP-GNN, and IRS-GCNet performed well on certain metrics, they had limitations in capturing complex relationships and dynamic features. By constructing a heterogeneous hypergraph, the HHAOCR algorithm is able to capture many-to-many relationships among users, courses, and instructors, considering not only users’ historical behaviors but also the semantic relationships between courses and users’ preferences. This makes HHAOCR significantly superior to the other algorithms in recommendation accuracy and recall. Additionally, the attention mechanism allows HHAOCR to dynamically focus on important features and relationships, thereby improving the personalization and precision of recommendations. The role of the hypergraphs and attention mechanisms is especially pronounced for the top 5 and top 10 recommendations, and they also yield significant improvements in NDCG, F1-Score, and Recall for longer recommendation lists.
5. Conclusions
In this study, we proposed a heterogeneous hypergraph and attention mechanism-based online course recommendation algorithm (HHAOCR) and compared it with several existing recommendation algorithms (POP, BPR, GRU4Rec, BERT4Rec, Light-GCN, TP-GNN, IRS-GCNet) through comprehensive experiments. The experimental results demonstrate that HHAOCR outperforms the other algorithms in recommending the top 5, top 10, and top 15 items.
Specifically, in recommending the top 5 items, HHAOCR achieved an NDCG@5 of 0.7965, F1-Score@5 of 0.6945, and Recall@5 of 0.7102; in recommending the top 10 items, NDCG@10 was 0.8593, F1-Score@10 was 0.7813, and Recall@10 was 0.7514; in recommending the top 15 items, NDCG@15 was 0.8938, F1-Score@15 was 0.8092, and Recall@15 was 0.7698. Notably, HHAOCR’s performance on the NDCG metric stood out.
The superior performance of HHAOCR is primarily attributed to its heterogeneous hypergraph structure and attention mechanism. The heterogeneous hypergraph captures complex higher-order relationships between users and courses, while the attention mechanism dynamically focuses on important features and relationships, thereby enhancing the personalization and precision of recommendations. Additionally, by utilizing the Bayesian hyperparameter optimization library Optuna, we were able to find an optimal set of hyperparameter configurations, further boosting the model’s performance.
In summary, the HHAOCR algorithm demonstrates significant advantages in online course recommendation, achieving optimal results across multiple evaluation metrics. This provides new insights for future online education and personalized course recommendation systems.