Review

Sequential Recommendation System Based on Deep Learning: A Survey

by Peiyang Wei 1,2,3,4,5,*, Hongping Shu 2,4, Jianhong Gan 2,4,6, Xun Deng 7, Yi Liu 2, Wenying Sun 8, Tinghui Chen 1, Can Hu 2, Zhenzhen Hu 2,6, Yonghong Deng 2,6, Wen Qin 6,9 and Zhibin Li 2,6,7
1 School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China
3 Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China
4 Automatic Software Generation & Intelligence Service Key Laboratory of Sichuan Province, Chengdu 610225, China
5 Key Laboratory of Remote Sensing Application and Innovation, Chongqing 401147, China
6 Dazhou Key Laboratory of Government Data Security, Sichuan University of Arts and Science, Dazhou 635000, China
7 Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
8 Lyle School of Engineering, Southern Methodist University, Dallas, TX 75205, USA
9 School of Computer Science, Sichuan Normal University, Chengdu 610101, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(11), 2134; https://doi.org/10.3390/electronics14112134
Submission received: 5 April 2025 / Revised: 18 May 2025 / Accepted: 21 May 2025 / Published: 24 May 2025
(This article belongs to the Section Computer Science & Engineering)

Abstract

With the rapid development of deep learning in artificial intelligence, sequential recommendation systems play an increasingly important role in e-commerce, social media, digital entertainment, and other fields. This work systematically reviews the research progress of deep learning in sequential recommendation systems from a methodological perspective. This paper focuses on analyzing three dominant technical paradigms: contrastive learning, graph neural networks, and attention mechanisms, elucidating their theoretical innovations and evolutionary trajectories in sequential recommendation systems. Through empirical investigation, we categorize the prevailing evaluation metrics, benchmark datasets, and characteristic distributions of typical application scenarios within this domain. This work further proposes promising avenues for sequential recommendation systems in the future.

1. Introduction

With the rapid development of the Internet, we have entered an era of information explosion. The vast amount of information on the Internet has brought us unprecedented challenges. Recommendation systems (RSs) have emerged alongside the rise of e-commerce [1,2,3,4,5], social networks, and other fields. RSs aim to provide users with personalized recommendations based on their interests and characteristics [5,6,7,8,9,10]. By analyzing and mining the relationships between collected user data and item information, an RS identifies items or content that users may be interested in and presents the relevant recommendations to them [10,11,12,13,14,15].
Traditional RS technologies are generally classified into four major categories [16]:
(a) Collaborative Filtering (CF): The basic theory is to model user preferences based on their historical interaction data [17,18,19]. In collaborative filtering, the system collects and analyzes users’ historical behavior data, such as ratings, preferences, or purchase records. Then, based on the similarity between users, it matches users with similar interests and preferences, and it recommends the items or content that the target users may be interested in [1]. However, it has some issues, such as the sparsity problem, cold start problem, and limited generalization ability.
(b) Content-Based Filtering (CBF) [20,21]: This is based on the characteristics of items and user preferences for recommendations. Its core idea is to adopt the matching degree between the attributes of items and users’ personal preferences to make recommendations. When a user shows interest in an item or provides a positive review, the system attempts to find other items with similar features and recommends them to the user [2]. Compared with collaborative filtering, content-based methods are less susceptible to the cold start issue.
(c) Knowledge-Based Filtering (KBF): This mainly relies on understanding and analyzing the characteristics, attributes, and constraints of items and users to generate recommendation results that satisfy user requirements and constraints. Compared with other recommendation algorithms, KBF focuses on modeling and applying system knowledge and rules, which transforms domain knowledge into recommendation strategies to generate personalized recommendation results [22,23].
(d) Hybrid Filtering (HF): This refers to the combination of at least two different techniques to overcome the limitations of using a single method [5,24]. One approach integrates content-based and collaborative filtering algorithms to generate separate recommendation ranking lists, which are merged into the final recommendation results [25]. Notable examples of hybrid recommendation systems include weighted and switching hybrid recommendation systems. A weighted hybrid recommendation system calculates the score of each recommended item from the results of all available recommendation algorithms in the system. For instance, the simplest form of a weighted hybrid recommendation system is a linear combination of recommendation scores. A switching hybrid recommendation system, in contrast, switches between different recommendation techniques based on certain rules [16].
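The weighted and switching hybrids described in (d) can be illustrated with a minimal Python sketch. The function names, the example scores, the weights, and the cold-start switching rule (`min_interactions`) are illustrative assumptions and are not taken from the cited works.

```python
from typing import Dict, List


def weighted_hybrid(score_lists: List[Dict[str, float]],
                    weights: List[float]) -> Dict[str, float]:
    """Linearly combine per-item scores produced by several recommenders."""
    combined: Dict[str, float] = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    return combined


def switching_hybrid(cf_scores: Dict[str, float],
                     cbf_scores: Dict[str, float],
                     num_interactions: int,
                     min_interactions: int = 5) -> Dict[str, float]:
    """Hypothetical rule: fall back to content-based scores for cold-start users."""
    return cbf_scores if num_interactions < min_interactions else cf_scores


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring items (ties broken by item id)."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Made-up scores from a collaborative (cf) and a content-based (cbf) recommender.
cf = {"a": 0.9, "b": 0.2, "c": 0.4}
cbf = {"a": 0.1, "b": 0.8, "d": 0.6}
merged = weighted_hybrid([cf, cbf], weights=[0.7, 0.3])
print(top_k(merged, 2))
```

In this sketch, an item missing from one recommender simply contributes no score from it; a real system would also need to normalize scores so that the recommenders are comparable.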
A traditional RS models user–item interaction behavior in a static manner, which can only capture users’ general preferences [15,16]. However, users’ preferences and items’ popularity change over time, rather than being static [26,27]. Additionally, users’ shopping behaviors usually occur continuously, rather than in isolation [28,29]. The interaction behavior between users and items essentially has temporal dependencies [30]. Traditional content-based and collaborative filtering RSs cannot effectively capture these sequential dependencies [3,31].
This dynamism is crucial for accurately providing personalized recommendations for users or items. A sequential recommendation system (SRS) takes time into consideration, enabling it to capture the temporal sequence of user behavior, as well as changes in preferences and interests at different time points [32]. The main research methods of SRSs include auto-regressive integrated moving averages (ARIMAs) [33] based on traditional sequential modeling, support vector machines (SVMs) [34], gradient boosting regression trees (GBRTs) [35], and the hidden Markov model (HMM) [36] based on machine learning methods, as well as the most recent deep learning (DL) techniques [37]. In general, the primary objective of SRSs is to predict a sequence of items that a user is likely to interact with in the future based on their previous interactions, thereby recommending items of potential interest to the users [4,38,39]. This method allows the system to better comprehend the users’ evolving process and provide more dynamic and personalized recommendations [40].
The ARIMA model for sequential recommendation is based on traditional statistical methods and mainly involves determining the parameters of the sequential parametric model [41], solving for the model parameters [42], and using the solved model to complete future recommendation tasks [43].
Regression analysis methods from machine learning are commonly applied in SRSs. By leveraging the stable prediction capability of the SVM for nonlinear sequences, the variable x is mapped into a high-dimensional feature space. Within this feature space, a function f that accurately represents the relationship between the input and the output data is identified. Subsequently, the sequence is fitted, and recommendations are made through regression analysis [44]. The GBRT iteratively fits the negative gradient of the loss function to minimize it, thereby obtaining the optimal recommendation model [45]. The HMM provides a probabilistic framework for modeling multivariate sequential recommendations. The HMM is a doubly stochastic process, consisting of a hidden Markov chain with a certain number of states and a set of observable random functions. The former is unobservable, but it can be inferred through the processes that generate the series of observations [46].
DL methods have achieved remarkable accomplishments in fields such as computer vision and natural language processing. This has led an increasing number of researchers to introduce them to SRSs and related applications. Deep neural networks can achieve better representations of high-dimensional data by constructing various network architectures, reducing the reliance on manual feature engineering and model design. By defining the loss function, deep neural networks can conduct end-to-end training more conveniently. Since deep neural networks can construct temporal feature representations through multiple nonlinear layers, learn the internal transformation rules of the sequence, and then provide more reliable and efficient recommendation results, they have attracted much attention in current SRSs [47].
At the forefront of the SRS domain, the integration of multimodal data fusion and large language models (LLMs) has emerged as a highly promising research direction. Currently, numerous challenges exist. Multimodal data exhibit heterogeneity, lacking a unified representation paradigm, and their spatiotemporal alignment and semantic fusion mechanisms remain immature [48]. The requirement of large-scale data for LLM training exacerbates the risk of data privacy leakage. Meanwhile, there are technical bottlenecks in updating training data in real time to capture dynamic user behaviors. In existing research, DL-based multimodal fusion methods leverage self-attention mechanisms and Transformer architectures to achieve cross-modal feature alignment and semantic fusion. In terms of LLM integration, researchers have attempted to employ LLMs as knowledge reasoning engines, fusing open-domain knowledge with structured data from RSs through techniques such as prompt engineering and parameter fine-tuning, thereby enhancing the interpretability and generalization capabilities of recommendations [49]. This research work proposes several suggestions, such as constructing a dynamic adaptive multimodal fusion framework and developing efficient feature dimensionality reduction and semantic enhancement algorithms; exploring lightweight LLM deployment schemes, optimizing inference efficiency by integrating model distillation and quantization compression technologies and introducing reinforcement learning to achieve dynamic policy optimization; and establishing an incremental learning mechanism under privacy protection to balance data security and model timeliness. Although there are currently few high-level research achievements in this field, innovative explorations are expected to bring paradigm changes to SRSs. 
Due to space limitations and the insufficient richness of the literature, we will still focus on the domain of general DL methods in the field of sequential recommendation [50].
Our research will classify and elaborate on the progress in terms of the methods and models of SRSs based on DL [51].
The principal contributions of this paper are presented as follows. (i) Our work presents a novel taxonomy for DL-based SRSs. The classification framework systematically delineates how each paradigm tackles the distinct challenges of sequential pattern modeling in recommendation tasks. (ii) This work systematically summarizes and analyzes the prevailing evaluation metrics, benchmark datasets, and application scenarios in SRSs, offering researchers a handbook-style reference.
The rest of this survey is organized as follows. Section 2 presents the SRSs based on DL. Section 3 explains the adopted metrics, datasets, and typical scenarios. Moreover, Section 4 describes several future development directions for SRSs, thereby offering some insights into this field. Finally, Section 5 concludes this review.

2. SRSs Based on DL

In this section, this paper summarizes the advancing research on DL in SRSs, thereby categorizing the current research into three areas: contrastive learning (CL), graph neural networks (GNNs), and attention mechanisms.

2.1. Sequence Recommendation Based on CL

This section focuses on data sparsity and data noise and summarizes the most recent CL-based models that address them.
In SR, data sparsity and data noise are critical issues. Existing SRSs frequently suffer from data sparsity, which makes it difficult to learn high-quality user representations. In [15], a novel multi-task CL framework is proposed for SR, which combines the traditional next-item prediction task with a CL objective that obtains self-supervised signals from raw user behavior sequences. Consequently, it can extract meaningful user patterns and effectively encode user representations.
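As a rough illustration of how a contrastive objective can be trained jointly with next-item prediction, the following Python sketch computes an InfoNCE-style loss and adds it to a main loss. The InfoNCE form, the cosine similarity, the temperature, the weight `cl_weight`, and the placeholder next-item loss value are all assumptions made for illustration; the framework in [15] may differ in each of these choices.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def info_nce(anchor: Sequence[float],
             positive: Sequence[float],
             negatives: List[Sequence[float]],
             temperature: float = 0.5) -> float:
    """-log( exp(sim(a, p)/tau) / (exp(sim(a, p)/tau) + sum_n exp(sim(a, n)/tau)) )."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def multi_task_loss(next_item_loss: float,
                    contrastive_loss: float,
                    cl_weight: float = 0.1) -> float:
    """Joint objective: main recommendation loss plus a weighted CL term."""
    return next_item_loss + cl_weight * contrastive_loss


# Two augmented views of the same sequence should be close (anchor, positive);
# representations of other sequences act as negatives.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
cl = info_nce(anchor, positive, negatives)
total = multi_task_loss(next_item_loss=1.2, contrastive_loss=cl)  # 1.2 is a placeholder
```

The contrastive term is small when the two views of one sequence are similar and dissimilar to the negatives, which is the self-supervised signal that complements the sparse supervised one.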
Sparse information on user behavior also makes it difficult to learn high-quality user preference representations. To address this issue, researchers propose a novel item attribute-aware CL framework for SR [52]. In [53], a time density-aware SR network with CL is presented, which explores the impact of time density information on user preferences. Because user–item interactions are sparse [54], its key idea is to incorporate CL into the time density-aware SR network to mitigate the lack of supervised learning signals [55].
SRSs grounded in CL have witnessed remarkable advancements in mitigating the sparsity and noise inherent in knowledge graphs [56]. In [57,58], DiffKG is proposed as an innovative knowledge graph diffusion model. This model operates via a distinctive knowledge graph diffusion paradigm, which delicately balances the processes of destruction and reconstruction, thereby enhancing the performance of the recommendation system. The ablation study of DiffKG’s key components has illuminated the efficacy of CL in augmenting knowledge graphs [59]. It has also underscored the pivotal role played by the diffusion model in elevating system performance [60]. Moreover, the KGCL framework, through relation-aware and cross-view CL mechanisms, enriches item representation and suppresses noise [61]. This approach effectively addresses the long-tail distribution issue prevalent in knowledge graphs [62]. Collectively, these research outcomes offer novel perspectives and methodologies for SRSs to grapple with the challenges of knowledge graph sparsity and noise [63]. As a result, they hold the potential to further enhance the accuracy and robustness of such recommendation systems [64], thereby contributing to the ongoing development of this crucial area within the field of data-driven decision-making [65].
Various intents drive user interactions with items, but these latent intents are unobservable [66]. In [8], a general learning paradigm called intent CL for SR is proposed, which uses latent user intent variables and also alleviates the issue of data sparsity [67]. The critical idea is to learn the user’s intent distribution function from unlabeled user behavior sequences and to optimize the SR model by contrastive self-supervised learning [68]. By learning latent intents, the recommendation can be improved [69]. Essentially, it infers the user’s intents from unlabeled data and applies them to SR [70], thereby enhancing the recommendation’s effectiveness [71].
Data sparsity makes it difficult for a single model to predict the next user–item interaction or aggregate user representations [72], which can result in inaccurate recommendations [73,74]. When a CL model adopts data augmentation to alleviate the data sparsity problem [75], the augmentation can amplify the noise in the original sequence [76]. Additionally, noise frequently interferes with the user’s primary intent [77]. In [9], a novel multi-intent CL recommendation framework called ICORec is proposed. The primary idea is to select the user’s intent and apply CL to address the denoising issue. The multi-intent representation of the enhanced sequence is shown in Equation (1):
$$\mathrm{view1}^{(k)}_{s_u^{a_i}} = \mathrm{LI}^{(k)}_{s_u^{a_i}} + \mathrm{GI}^{(k)}_{s_u^{a_i}}, \qquad \mathrm{view2}^{(k)}_{s_u^{a_j}} = \mathrm{LI}^{(k)}_{s_u^{a_j}} + \mathrm{GI}^{(k)}_{s_u^{a_j}}, \tag{1}$$
where view1 and view2 are each the sum of the local intent representation (LI) and the global intent representation (GI) from the same intent prototype; thus, k levels of intent representation are obtained for each enhanced sequence.
As shown in Figure 1, two enhanced sequences are generated as inputs for two independent sequence encoders, thereby ultimately generating k levels of intent-based positive sample pairs, which results in 2k intent representations.
In [12], the issues of data sparsity and noisy data in SRSs are addressed by implementing contrastive self-supervised learning and momentum contrast. The basic idea adopts a dynamic queue and a moving average encoder to expand negative samples [78]. By enhancing sequence-level and embedding-level methods, the representations of all historical encoder outputs are pushed into the dynamic queue [71]. Thereafter, the momentum-updating mechanism is combined with a new instance-weighting mechanism, which penalizes false negatives and ensures model effectiveness [79]. In [80], semantically enriched CL in collaborative filtering is discussed [81], thereby enhancing semantic associations between nodes and reducing noise from structural neighborhoods [82].
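The dynamic queue and moving-average encoder mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the general momentum-contrast idea under stated assumptions: parameters are flat lists of floats, the momentum value is arbitrary, and the class and function names are hypothetical. The instance-weighting mechanism of [12] is not modeled.

```python
from collections import deque
from typing import Deque, List


def momentum_update(key_params: List[float],
                    query_params: List[float],
                    momentum: float = 0.999) -> List[float]:
    """Moving-average update: key <- m * key + (1 - m) * query."""
    return [momentum * k + (1.0 - momentum) * q
            for k, q in zip(key_params, query_params)]


class NegativeQueue:
    """Fixed-size FIFO queue of past encoder outputs reused as negatives."""

    def __init__(self, max_size: int) -> None:
        self._queue: Deque[List[float]] = deque(maxlen=max_size)

    def enqueue(self, batch_keys: List[List[float]]) -> None:
        # A deque with maxlen drops its oldest entries once it is full.
        self._queue.extend(batch_keys)

    def negatives(self) -> List[List[float]]:
        return list(self._queue)


queue = NegativeQueue(max_size=3)
queue.enqueue([[1.0], [2.0]])   # representations from an earlier batch
queue.enqueue([[3.0], [4.0]])   # the oldest entry [1.0] is evicted
key_params = momentum_update([1.0, 2.0], [0.0, 0.0], momentum=0.5)
```

Because the key encoder changes slowly, the representations stored in the queue stay roughly consistent with each other, which is what allows many more negatives than a single batch provides.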
Beyond the issues of data sparsity and noisy data, models trained solely on an item prediction loss frequently fail to obtain appropriate sequence representations. In [75], a long short-term interest CL framework combined with a filter-enhanced SR is proposed to jointly address data sparsity, data noise, and the learning of appropriate sequence representations. This approach applies a filtering algorithm to user interaction sequences, thereby reducing noise in the sequence data. Specifically, filters are employed in the backbone network of the contrastive learning SR model to denoise sequential behaviors. Additionally, a loss function is designed to learn sequence representations and mitigate the impact of data sparsity. The loss functions are shown in Equations (2) and (3):
$$\mathrm{Loss}_{bpr}(a, p, q) = \sigma\left(\langle a, q \rangle - \langle a, p \rangle\right), \tag{2}$$
$$\mathrm{Loss}_{tri}(a, p, q) = \max\{d(a, p) - d(a, q) + m,\ 0\}, \tag{3}$$
where $\sigma(\cdot)$ is the activation function, $\langle \cdot , \cdot \rangle$ denotes the inner product, $d$ represents the Euclidean distance, and $m$ is the margin. By substituting the representations $H_l$, $H_s$, $p_l$, and $p_s$ into the above formulas, four corresponding contrastive losses can be constructed, as shown in Equation (4), where $f$ denotes one of the loss functions above:
$$\mathrm{Loss}_{con}^{u,t} = f(H_l, p_l, p_s) + f(p_l, H_l, H_s) + f(H_s, p_s, p_l) + f(p_s, H_s, H_l). \tag{4}$$
Two independent encoders are adopted to model the user’s long-term and short-term interests based on the filter-enhanced interaction sequences. Thereafter, a user-specific gating mechanism is constructed to capture long-term and short-term interests that align with the user’s personalized preferences. Furthermore, these interests are integrated into an attention network to learn interest representations in SR.
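The losses in Equations (2)–(4) can be made concrete with a small Python sketch. It assumes that Equation (2) applies the activation to the difference of the two inner products, that Equation (3) is the standard margin-based triplet loss, and that the activation is softplus; the source only says "activation function", so softplus is an assumption, as are the toy vectors.

```python
import math
from typing import Callable, Sequence

Vec = Sequence[float]


def dot(u: Vec, v: Vec) -> float:
    return sum(a * b for a, b in zip(u, v))


def euclidean(u: Vec, v: Vec) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def softplus(x: float) -> float:
    return math.log1p(math.exp(x))


def loss_bpr(a: Vec, p: Vec, q: Vec,
             activation: Callable[[float], float] = softplus) -> float:
    """Eq. (2): small when the anchor a is more similar to p than to q."""
    return activation(dot(a, q) - dot(a, p))


def loss_tri(a: Vec, p: Vec, q: Vec, margin: float = 1.0) -> float:
    """Eq. (3): zero once a is closer to p than to q by at least the margin."""
    return max(euclidean(a, p) - euclidean(a, q) + margin, 0.0)


def loss_con(h_l: Vec, h_s: Vec, p_l: Vec, p_s: Vec,
             f: Callable[[Vec, Vec, Vec], float]) -> float:
    """Eq. (4): sum of four contrastive terms over long/short-term pairs."""
    return (f(h_l, p_l, p_s) + f(p_l, h_l, h_s)
            + f(h_s, p_s, p_l) + f(p_s, h_s, h_l))


# Toy vectors: the long-term pair and the short-term pair point in different directions.
h_l, p_l = [1.0, 0.0], [1.0, 0.0]
h_s, p_s = [0.0, 1.0], [0.0, 1.0]
well_separated = loss_con(h_l, h_s, p_l, p_s, loss_tri)
```

With these vectors, every triplet term already satisfies the margin, so the contrastive loss vanishes; mixing up the long-term and short-term vectors would make it positive.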
In [10], a dual contrastive network method is proposed to address the data sparsity issue from a new perspective of integrating the auxiliary user sequences of items. Specifically, two major CL methods are introduced. The first one is dual-representation CL, which is achieved by minimizing the distance between embeddings and the sequential representations of users or items. The second one is dual-interest CL, which aims to maintain the consistency between static and dynamic interests by auxiliary training under self-supervised next-item prediction.
Additionally, in some SR models, CL is adopted with random augmentation of user sequences, which can alleviate the data sparsity issue. However, random augmentation does not guarantee that the augmented positive or negative views maintain semantic similarity. To address this issue, researchers propose a GNN-guided CL method for SR [13]. The guiding process adopts a GNN to obtain user embeddings and an encoder to determine the importance score of each item; various data augmentation methods then construct contrastive views based on the importance scores. This approach improves recommendation performance and alleviates the data sparsity issue. Moreover, when augmentation produces false positive or false negative sequences, recommendation performance may decline. To solve this issue, researchers propose an explanation-guided augmentation and explanation-guided contrastive learning SR model framework [83]. The critical idea of explanation-guided augmentation is to adopt explanation methods to determine the importance of items in the user sequence and to generate positive and negative sequences accordingly. Explanation-guided contrastive learning SR combines self-supervised and supervised CL on the positive and negative sequences generated by the explanation-guided augmentation operations to improve sequence representation learning, thereby achieving highly accurate recommendation results.
Addressing popularity bias is another challenge in SRSs. Existing CL-based methods have limitations in addressing popularity bias and in separating user conformity from true interests. To address this issue, researchers propose a new de-biased contrastive learning paradigm for recommendation (DCRec), which combines sequential pattern encoding with global collaborative relation modeling through adaptive conformity-aware augmentation [14].
As shown in Figure 2, in the overall framework of DCRec, $G_c$ and $G_t$ are built to encode the sequences from diversified views (left part). In addition, reasonable interaction-level conformity weights $\omega$ are generated from the rich structure of $G_t$ (right part). These weights are constrained to a normal distribution and make the cross-view CL adaptive and conformity-aware.
The authors of [84] experimentally demonstrate that, in CL-based recommendation models, CL implicitly alleviates popularity bias by learning more uniformly distributed user or item representations. Additionally, they reveal that the supposedly necessary graph augmentation plays a negligible role. Based on this finding, a simple CL method is proposed that discards graph augmentation and instead adds uniform noise to the embedding space to create contrastive views. In [37], the authors adopt the node degree in the global graph as a node attribute of the local session graph. Specifically, these attributes are obtained by statistics and embedded with trainable parameters, which can eliminate the influence of popularity in session aggregation. Consequently, this attribute module can be easily used in various tasks to address the popularity bias issue.
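The idea of creating contrastive views by perturbing embeddings, rather than augmenting the graph, can be sketched as follows. The source only states that uniform noise is added to the embedding space; rescaling the noise to a fixed L2 norm `epsilon` is an assumption made here so that the perturbation size is controlled, and the function name is hypothetical.

```python
import math
import random
from typing import List, Optional


def noisy_view(embedding: List[float],
               epsilon: float = 0.1,
               rng: Optional[random.Random] = None) -> List[float]:
    """Return a perturbed copy of an embedding to serve as a contrastive view.

    Noise is drawn uniformly from [-1, 1] per dimension and rescaled so that
    the perturbation has L2 norm epsilon (assumption, see lead-in).
    """
    rng = rng or random.Random()
    noise = [rng.uniform(-1.0, 1.0) for _ in embedding]
    norm = math.sqrt(sum(n * n for n in noise)) or 1.0
    return [e + epsilon * n / norm for e, n in zip(embedding, noise)]


rng = random.Random(0)
item_embedding = [0.5, -0.2, 0.1]
view_a = noisy_view(item_embedding, epsilon=0.1, rng=rng)
view_b = noisy_view(item_embedding, epsilon=0.1, rng=rng)
# view_a and view_b form a positive pair for the contrastive loss.
```

Two independently perturbed copies of the same embedding stay close to the original, so they can act as a positive pair without any structural change to the interaction graph.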
Beyond data sparsity and popularity bias, CL has been applied to SR in the following aspects.
Integrating self-supervised signals into sequence recommendation by CL alleviates the issue of data sparsity only slightly. Because complex collaborative information and common behavioral information, such as user–item, user–user, and item–item relationships, are insufficiently modeled, it is difficult to learn rich embedding representations of users or items. To this end, a multi-level CL framework is proposed for sequence recommendation (MCLSR) [17]. Unlike previous sequence recommendation methods based on CL, MCLSR learns the representations of users and items by performing cross-view CL from four specific views on two different levels, namely the interest and feature levels.
Some existing GNNs mainly focus on summarizing information from the perspective of spatial graph structures and ignore the temporal relationship between item neighbors during message passing, which leads to information loss. In [85], a new framework called “session-based recommendation with spatio-temporal contrastive learning-enhanced GNNs” is proposed, which supplements the temporal representation of the main GNN-based supervised recommendation task with an auxiliary cross-view CL task.
Although contrastive self-supervised learning can alleviate data sparsity issues, the complexity of temporal interactions limits its effectiveness in understanding user intentions. In [86], a new CL method in the frequency domain is proposed for SR, which adopts a multi-task learning framework. This method combines CL and recommendation learning through joint training, thereby optimizing the user representation encoder.
In reinforcement learning-based recommendation models, if user–item interactions are sparse and dynamic rewards based on user preferences are unavailable, it results in sub-optimal strategies. In [87], a method that combines dynamic intrinsic reward signals with a contrastive discriminator-enhanced reinforcement learning framework is proposed. In this approach, the CL module is primarily adopted to learn representations of item sequences.
Existing methods frequently mix users’ long-term and short-term interests, which results in poor recommendation accuracy and interpretability. In [11], a CL framework with self-supervised learning is proposed that can decouple long-term and short-term interests in recommendations, thereby achieving a stronger decoupling.

2.2. Sequence Recommendation Based on GNNs

This subsection focuses on item transition patterns, item dependencies, item representations, user interests and preferences, and user intentions and summarizes the most recent GNN-based SR models.
Item Transition Patterns: Traditional recommendation methods only adopt recurrent neural networks to process sequential data, which may not fully capture semantic-based preferences and complex transitions among items. To address this issue, researchers model session sequences as session graphs and adopt GNNs to capture these complex transitions [29].
As depicted in Figure 3, this approach mainly consists of a key-value memory network (KV-MN) and a GNN. The GNN component is adopted to capture complex transition relationships. By associating the existing external knowledge base (KB) entities with items in the recommendation system, the key-value memory network can integrate KB knowledge.
GNNs are adopted to capture complex transition patterns by generating node representations based on the graph topology, thereby modeling complex item connections. In [27], session graphs are constructed from the items in historical sessions, and GNNs are used to obtain the corresponding embeddings and capture complex transitions on the session graphs. Specifically, gated graph neural networks (GGNNs) [28], a variant of GNNs, are employed to learn node vectors. For a node $v_{s,i}$ in the session graph $G_s$, the updating rule is shown in Equations (5)–(9):
$$a_{s,i}^{(t)} = A_{s,i:}\left[v_1^{(t-1)}, \ldots, v_{s_n}^{(t-1)}\right]^{\top} H + b, \tag{5}$$
$$z_{s,i}^{(t)} = \sigma\left(W_z a_{s,i}^{(t)} + U_z v_i^{(t-1)}\right), \tag{6}$$
$$r_{s,i}^{(t)} = \sigma\left(W_r a_{s,i}^{(t)} + U_r v_i^{(t-1)}\right), \tag{7}$$
$$\tilde{v}_i^{(t)} = \tanh\left(W_o a_{s,i}^{(t)} + U_o\left(r_{s,i}^{(t)} \odot v_i^{(t-1)}\right)\right), \tag{8}$$
$$v_i^{(t)} = \left(1 - z_{s,i}^{(t)}\right) \odot v_i^{(t-1)} + z_{s,i}^{(t)} \odot \tilde{v}_i^{(t)}, \tag{9}$$
where $t$ is the propagation step, $A_{s,i:} \in \mathbb{R}^{1 \times 2n}$ is the $i$-th row of the matrix $A_s$, which corresponds to the node $v_{s,i}$, and $H \in \mathbb{R}^{d \times 2d}$ and $b \in \mathbb{R}^{d}$ are the weight and bias parameters, respectively. $\left[v_1^{(t-1)}, \ldots, v_{s_n}^{(t-1)}\right]$ is the list of node vectors in session $s$, and $z_{s,i}^{(t)}$ and $r_{s,i}^{(t)}$ are the update and reset gates, respectively. $\sigma(\cdot)$ is the sigmoid activation function, and $\odot$ represents element-wise multiplication.
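To make the gated update concrete, the following pure-Python sketch performs one propagation step for every node of a small session graph. It is a simplified illustration rather than the implementation used in [27]: it assumes a single n × n adjacency matrix instead of concatenated incoming and outgoing matrices, square d × d weight matrices, and column-vector matrix products; the toy graph and identity weights are made up.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def ggnn_step(adj: Matrix, nodes: List[Vector], H: Matrix, b: Vector,
              Wz: Matrix, Uz: Matrix, Wr: Matrix, Ur: Matrix,
              Wo: Matrix, Uo: Matrix) -> List[Vector]:
    """One gated propagation step over all nodes of a session graph."""
    d = len(b)
    updated = []
    for i, v_prev in enumerate(nodes):
        # Aggregation: combine neighbour vectors using row i of the adjacency
        # matrix, then apply the weight H and the bias b.
        agg = [sum(adj[i][j] * nodes[j][k] for j in range(len(nodes)))
               for k in range(d)]
        a = [x + y for x, y in zip(matvec(H, agg), b)]
        # Update gate z and reset gate r.
        z = [sigmoid(x + y) for x, y in zip(matvec(Wz, a), matvec(Uz, v_prev))]
        r = [sigmoid(x + y) for x, y in zip(matvec(Wr, a), matvec(Ur, v_prev))]
        # Candidate state built from the reset-gated previous state.
        gated = [ri * vi for ri, vi in zip(r, v_prev)]
        cand = [math.tanh(x + y)
                for x, y in zip(matvec(Wo, a), matvec(Uo, gated))]
        # New state: gate-controlled interpolation of old state and candidate.
        updated.append([(1.0 - zi) * vi + zi * ci
                        for zi, vi, ci in zip(z, v_prev, cand)])
    return updated


# Toy session with two items, where node 0 aggregates from node 1.
identity = [[1.0, 0.0], [0.0, 1.0]]
adjacency = [[0.0, 1.0], [0.0, 0.0]]
node_vectors = [[1.0, -1.0], [0.5, 0.25]]
new_vectors = ggnn_step(adjacency, node_vectors, identity, [0.0, 0.0],
                        identity, identity, identity, identity,
                        identity, identity)
```

The update gate decides how much of the previous node vector is kept, while the reset gate decides how much of it enters the candidate state, which matches the roles of the two gates in the equations above.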
Most existing SR models only focus on the items that a user interacts with consecutively in a session to capture the transition patterns among items, and they ignore the user’s micro-behaviors and item attributes. The authors of [88] incorporate user micro-behaviors and item knowledge into multi-task learning for session-based recommendation (MKM-SR). Specifically, a given session is modeled at the micro-behavior level, using a series of item–operation pairs rather than just an item sequence. MKM-SR also adopts a gated graph neural network (GGNN) to fully capture the transition patterns within sessions.
Existing personalized session-based recommenders are limited to the sessions of the current user, which ignores useful item transition patterns from other users’ historical sessions and from other sessions that may contain items both relevant and irrelevant to the current session. In [34], a novel heterogeneous global graph neural network (HG-GNN) is proposed that leverages item transitions from all sessions to infer user preferences for both current and historical sessions. In [38], a novel approach, called a global context-enhanced graph neural network (GCE-GNN), is presented, which utilizes item transitions from all sessions in a more detailed manner to infer user preferences in the current session.
A GNN typically propagates information only from adjacent items, which ignores information from items that are not directly connected. Additionally, GNN-based methods frequently face severe overfitting problems. In [33], a star GNN with highway networks (SGNN-HN) is proposed, which applies an SGNN to model the complex transition relationships among items in the current session. In this approach, the highway network adaptively selects embeddings from item representations, thereby avoiding the overfitting issue.
Although existing methods capture item transition patterns in the current session and neighboring sessions, they do not accurately filter out noise in sessions or expand the range of feasible data in a more reasonable way. In [35], a GNN with global noise filtering is proposed for session-based recommendations (GNN-GNF), which aims to filter noise data more comprehensively and reasonably by utilizing item transition patterns. The handling of noise is mainly reflected in data preprocessing, and the GNN-GNF adopts an item-based filtering module to capture the user’s primary intent. Moreover, the filtering module filters out sessions unrelated to the target session intent by edge matching.
Session graph-based methods have achieved excellent performance [33,38,42]. However, they treat each transition relationship among items equally and ignore the rich information on user interest drift within these transitions. To fill this gap, researchers propose a novel model called a time-enhanced graph neural network (TE-GNN) [89], which aims to capture the complex user interest drift patterns in sessions. In TE-GNN, a time-enhanced session graph (TES-Graph) is constructed, and the transition relationships among items are then handled adaptively according to the degree of user interest drift.
Item dependencies refer to the correlation or association among different items in a user’s sequence for SR. Utilizing item dependencies can enhance the accuracy and personalization of the recommendation system, thereby providing users with attractive and relevant recommendation results. In [37], a method called a recurrent neural graph neural network (RN-GNN) is proposed, which adopts GNNs to jointly model intra-session and inter-session item dependencies in session-based recommendations.
In each layer of a GNN model, the information carried by nodes propagates one step along the edges. Thus, each layer can only capture one-hop relationships. By stacking multiple layers, GNN models can capture up to L-hop relationships, where L is the number of layers. However, stacking more layers leads to overfitting and over-smoothing issues [41] and does not enhance performance [42]. Therefore, the optimal number of layers for GNN models is no more than three, which means that these models can capture three-hop relationships at most. However, session lengths can easily exceed three in practical applications. Owing to this limitation of the network structure, GNN-based models cannot capture long-range dependencies.
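The one-hop-per-layer argument can be illustrated with a minimal sketch. It assumes an undirected toy session graph in which consecutive items are linked; the function name and the example session are hypothetical and not taken from any of the cited models.

```python
def reachable_after_layers(edges, source, num_layers):
    """Return the nodes whose information can reach `source` after
    `num_layers` rounds of one-hop message passing."""
    # Undirected neighbor lists for a toy session graph (an assumption).
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    reached = {source}
    for _ in range(num_layers):
        # Each layer extends the receptive field by exactly one hop.
        frontier = set()
        for node in reached:
            frontier |= neighbors.get(node, set())
        reached |= frontier
    return reached


# A session of six distinct items clicked in order forms a chain graph.
session = [0, 1, 2, 3, 4, 5]
chain_edges = list(zip(session, session[1:]))

for layers in (1, 2, 3):
    print(layers, sorted(reachable_after_layers(chain_edges, 0, layers)))
```

With three layers, item 0 only receives information from items 1 to 3, so items 4 and 5 of this six-item session remain out of reach.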
To address the issue of capturing long-range dependencies, researchers present a highly effective generative model, which can learn high-level representations from both short-term and long-term item dependencies [90]. In addition, the proposed network architecture consists of a stack of dilated convolution layers, which can effectively increase the receptive field without relying on lossy pooling operations.
Additionally, researchers have proposed a novel GNN model named LESSR (lossless edge-order preserving aggregation and shortcut graph attention for session-based recommendation) to address the issue of capturing long-range dependencies [40]. The workflow of LESSR is shown in Figure 4 and proceeds as follows.
Given an input session, it is first converted into a losslessly encoded graph, which is referred to as an edge-order preserving (EOP) multigraph, and then a shortcut graph is constructed. The EOP multigraph addresses the lossy session encoding issue, while the shortcut graph addresses the ineffective capture of long-range dependencies. These graphs, together with the item embeddings, are passed through multiple edge-order preserving aggregation (EOPA) and shortcut graph attention (SGAT) layers, thereby generating latent features for all nodes. The EOPA layer utilizes the EOP multigraph to capture local context information, while the SGAT layer adopts the shortcut graph to efficiently capture long-range dependencies. Thereafter, a readout function with an attention mechanism is applied to generate a graph-level embedding from all node embeddings. Finally, the graph-level embedding is combined with the user's recent interest to make recommendations.
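The two graph constructions can be sketched as follows. This is a simplified reading rather than the implementation of [40]: it assumes that the EOP multigraph keeps one edge per transition together with its position, and that the shortcut graph links each item to every item that appears later in the session, plus self-loops. Both function names are hypothetical.

```python
def build_eop_multigraph(session):
    """One (source, target, position) edge per transition. Keeping the
    position preserves edge order, so the session can be recovered."""
    return [(u, v, pos) for pos, (u, v) in enumerate(zip(session, session[1:]))]


def build_shortcut_graph(session):
    """Edges from every item to each item appearing later in the session,
    plus self-loops (an assumption of this sketch)."""
    edges = set()
    for i, u in enumerate(session):
        edges.add((u, u))
        for v in session[i + 1:]:
            edges.add((u, v))
    return edges


session = ["a", "b", "c", "b", "d"]
eop_edges = build_eop_multigraph(session)
shortcut_edges = build_shortcut_graph(session)

# The ordered multigraph is lossless: the session can be decoded from it.
decoded = [eop_edges[0][0]] + [target for _, target, _ in eop_edges]
print(decoded == session)
print(("a", "d") in shortcut_edges)  # a one-hop shortcut across the session
```

In the shortcut graph, the first and last items are direct neighbors, so a single attention layer can relate them without stacking one layer per hop.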
In [91], each session is represented as a graph rather than a linear sequence structure, and a novel fully connected graph neural network (FGNN) is proposed to learn complex item dependencies. The FGNN adopts graphs to represent each session, which is more expressive than a sequence structure, and then applies GNNs to the session graphs to learn item and session embeddings. The FGNN consists of two modules: (a) a weighted graph attention layer (WGAT), which encodes the information among nodes in the session graph into item embeddings, and (b) a readout function, which aggregates these item embeddings into a graph-level representation, i.e., the session embedding, which is adopted to learn and determine appropriate item dependencies.
In [41], the authors mention that a GNN suffers from over-smoothing when its number of layers exceeds three. Over-smoothing is an issue in which the representations of all nodes in the GNN converge to the same value. For example, researchers have combined heterogeneous knowledge graphs with a GNN [92], but the effectiveness of this approach is limited by the over-smoothing issue of the GNN model. The authors of [39] do not consider the over-smoothing issue. Additionally, existing session-based recommendation systems mainly rely on mining sequential patterns within sessions, which is insufficient to capture more complex dependencies among items. To address these issues, researchers designed a hybrid sequential gated GNN (HGNN) [93]. The HGNN model is based on hybrid sequential propagation, thereby avoiding unimportant patterns, addressing the over-smoothing issue, and capturing complex dependencies during propagation. Returning to the generative model of [90], it aims to explicitly encode the interdependencies among items, which allows the distribution of the output sequence to be estimated directly over the original item sequence rather than the desired item. Additionally, 1D dilated convolution layers are stacked to increase the receptive field when modeling long-range dependencies, without resorting to inefficient large filters.
For an element $h$ in the item sequence, the 1D dilated convolution operator $*_l$ is defined as shown in Equation (10):
$$(x *_l g)(h) = \sum_{i=0}^{f-1} x_{h - l \cdot i}\, g(i), \tag{10}$$
where $g$ is the filter function, $f$ is the filter size, and $l$ is the dilation factor. The dilated convolution structure is suitable for modeling long-range item sequences, because it enlarges the receptive field without requiring larger filters or a larger number of network layers.
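A minimal sketch of the operator may help. It reads Equation (10) as $y_h = \sum_{i=0}^{f-1} x_{h - l \cdot i}\, g(i)$, treats positions before the start of the sequence as zeros, and uses scalar inputs instead of embedding vectors. These choices, the function names, and the toy values are assumptions made for illustration.

```python
def dilated_conv1d(x, g, dilation):
    """Causal 1D dilated convolution: y[h] = sum_i x[h - dilation*i] * g[i].
    Positions before the start of the sequence count as zeros."""
    out = []
    for h in range(len(x)):
        total = 0.0
        for i, weight in enumerate(g):
            idx = h - dilation * i
            if idx >= 0:
                total += x[idx] * weight
        out.append(total)
    return out


def receptive_field(filter_size, dilations):
    """Input positions visible to one output of a stack of dilated layers."""
    return 1 + sum((filter_size - 1) * d for d in dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
g = [1.0, 1.0]  # filter of size f = 2

print(dilated_conv1d(x, g, dilation=2))  # [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(2, [1, 2, 4]))     # 8
```

Three layers with dilations 1, 2, and 4 already cover eight positions with filters of size two, which is the efficiency argument made above.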
Item representation refers to the process of transforming each item into a feature vector. These vectors are typically adopted to capture the critical characteristics and attributes of items for computation and matching in the recommendation system. In [42], session-based recommendation with graph neural networks (SR-GNN) is developed, which uses a gated graph neural network to learn item representations on a session graph. With the success of SR-GNN, GNNs are increasingly adopted in session-based recommendation systems.
Existing methods based on GNNs construct each session as a graph and capture rich transition relationships among items for generating item representations, while they frequently ignore the natural noise in sessions. In [94], two denoising modules are designed to provide accurate item representations from the perspectives of sequential and graph-structured data.
The graphs adopted in GNNs are usually constructed from static patterns, which may introduce noise into the graph structure when user preferences change. In [95], a novel method called a dynamic global structure-enhanced multi-channel graph neural network (DGS-MGNN) is designed to learn accurate item representations from multiple perspectives. The DGS-MGNN dynamically generates local, global, and consistency graphs and learns more informative item representations from the corresponding graphs. Meanwhile, existing research only considers the number of interactions when item graphs are constructed and therefore fails to capture multi-dimensional transition relationships among items. In [36], the authors emphasize the importance of multi-dimensional information and propose a GNN model that integrates item category and interaction time information (CT-GNN). This method combines item category and interaction time information with a multi-layer graph convolutional network to form multi-dimensional and refined item representations.
In [96], a method called dual attention transfer based on multi-dimensional integration (DAT-MDI) is proposed. DAT adopts a latent mapping method based on slot attention mechanisms to extract user representation information from different sessions in multiple domains. MDI employs GNNs for graph-structured inputs, such as session graphs and global graphs, and gated recurrent units (GRUs) for sequences to learn item representations in each session. The multi-level session representations are then combined by a soft attention mechanism. Figure 5 shows the local item representation.
The session graph contains paired item transitions within the current session. Since the neighbors of an item in the graph differ in importance, an attention mechanism is adopted to learn the weights of the different neighboring nodes. The attention weights are computed in a manner similar to soft attention, as shown in Equations (11)–(13):
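$$e_{ij}^{*} = \mathrm{LeakyReLU}\left(a_{r_{ij}}^{\top}\left(h_{v_i}^{*} \odot h_{v_j}^{*}\right) + a_{r_{ij}}^{\top}\left(h_{v_i}^{*} + h_{v_j}^{*}\right)\right), \tag{11}$$
$$\alpha_{ij}^{*} = \frac{\exp\left(e_{ij}^{*}\right)}{\sum_{v_k \in N_{v_i}^{s}} \exp\left(e_{ik}^{*}\right)}, \tag{12}$$
$$h_{l,v_i}^{*} = \sum_{v_j \in N_{v_i}^{s}} \alpha_{ij}^{*}\, h_{v_j}^{*}, \tag{13}$$
where $e_{ij}^{*}$ represents the importance of node $v_j$ to node $v_i$, $\mathrm{LeakyReLU}$ is the activation function, $a_{*} \in \mathbb{R}^{d}$ are the weight vectors, and $\alpha_{ij}^{*}$ denotes the attention coefficient, which is asymmetric. $r_{ij}$ is the relation between $v_i$ and $v_j$, and $\odot$ denotes the element-wise product. The output for each node is obtained by calculating the linear combination of the neighbor features weighted by these coefficients, as in Equation (13).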
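The three steps of Equations (11)–(13) can be sketched in plain Python. The sketch assumes that the two terms inside the activation are an element-wise product and an element-wise sum of the two item vectors, each projected by the relation-specific weight vector; the operators are not fully recoverable from the text, so this is one plausible reading. The LeakyReLU slope, the relation names, and all numeric values are hypothetical.

```python
import math


def leaky_relu(z, slope=0.2):
    # The slope value is an assumption; the text does not specify it.
    return z if z > 0 else slope * z


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def attend(h_i, neighbors, relation_weights):
    """neighbors: list of (relation, h_j) pairs for node i.
    relation_weights: maps a relation name to its weight vector a_r."""
    scores = []
    for relation, h_j in neighbors:
        a_r = relation_weights[relation]
        product = [p * q for p, q in zip(h_i, h_j)]   # element-wise term
        summed = [p + q for p, q in zip(h_i, h_j)]    # additive term
        scores.append(leaky_relu(dot(a_r, product) + dot(a_r, summed)))  # Eq. (11)

    peak = max(scores)                                # numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]                # Eq. (12)

    aggregated = [
        sum(alpha * h_j[d] for alpha, (_, h_j) in zip(alphas, neighbors))
        for d in range(len(h_i))
    ]                                                 # Eq. (13)
    return alphas, aggregated


h_i = [1.0, 0.0]
neighbors = [("in", [1.0, 0.0]), ("out", [0.0, 1.0])]
relation_weights = {"in": [0.5, 0.5], "out": [0.5, 0.5]}
alphas, aggregated = attend(h_i, neighbors, relation_weights)
print(alphas, aggregated)
```

Because the coefficients are normalized over the neighbors of node $v_i$ only, $\alpha_{ij}^{*}$ and $\alpha_{ji}^{*}$ generally differ, which is the asymmetry noted above.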
For global item representations, a global graph layer architecture based on graph convolutional networks (GCNs) is designed [66]. The attention weights for each node are obtained by using the idea of an attention mechanism [97], with LeakyReLU as the activation function. Additionally, attention scores perceived in the session are used to perform a linear combination. Finally, the learned item features are combined with the features learned from their neighbors, as shown in Equation (14):
$$h_{g,v}^{*} = \mathrm{swish}\left(W_2\left[h_{v}^{*} \,\Vert\, h_{N_v^{g}}^{*}\right]\right), \tag{14}$$
where swish is adopted as the activation function and $\Vert$ denotes concatenation. The final item representation, which depends on the item itself and its nearest neighbors, is obtained by a single aggregation layer. In other words, the item's representation is a mix of its initial representation and its neighbors' representation, thereby allowing more effective information from all existing sessions to be incorporated into the item's final representation. Beyond item transition, item dependency, and item representation, research on items in SR also covers the following aspects.
In [98], a general embedding smoothing framework is proposed, which is suitable for SR models. It utilizes sequential item relationships and semantic item relationships based on item attributes to construct a hybrid item graph. Then, graph convolution operations are performed on the hybrid graph to generate smooth item embeddings, thereby comprehensively considering both sequential information and semantic relationships among item attributes. In [99], a novel SR method called TRec is presented, which learns trend information of items from the implicit user interaction history and then incorporates this trend information into subsequent item recommendation tasks. In [100], a new position-aware graph neural network (PA-GNN) is proposed. This model constructs the session graph in a position-aware manner, thereby fully utilizing the positional information of items. To obtain more accurate item representations, the authors of [94] adopt GRUs to enhance the GNN, thereby addressing the insufficient modeling of sequential session information by the GNN.
Although most existing GNN-based methods in this field have achieved significant results, none of them emphasize the importance of repeated recommendations, which are a crucial part of session-based recommendations. In [101], the authors highlight the importance of repeated recommendations and propose a new model called ReGNN, which combines GNNs with a repetition exploration mechanism to provide excellent recommendations.
ReGNN primarily adopts a GNN to update the initial item embeddings and obtain latent vectors for graph nodes. A GNN is suitable for item representation learning and can extract the features of session graphs by capturing transitions among graph nodes. As shown in Figure 6, the repetition exploration mechanism is divided into two modes: repeat mode and explore mode.
Session-based recommendation systems based on GNNs can typically only recommend items that exist in the user's historical sessions. Therefore, these GNN models cannot recommend items that the user has never interacted with, such as new items, thereby resulting in an information closure phenomenon. Moreover, if there are no interactions between new items and users, the new items cannot appear in the session graph on which a GNN-based session recommendation system is built. To address this issue, researchers draw on the concept of zero-shot learning (ZSL) [102], which adopts the attributes of new items to infer their representations in the GNN space. By outputting the probabilities of new items along with the corresponding item recommendation scores, high-scoring new items can be recommended to users.
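The split into repeat mode and explore mode can be illustrated with a small sketch. It does not reproduce ReGNN [101]: it assumes that repeat mode distributes probability only over items already in the session, explore mode only over unseen items, and that the two distributions are mixed by a mode probability `p_repeat`, which a real model would predict and which is passed in here as a hypothetical input.

```python
import math


def masked_softmax(scores, allowed):
    """Softmax over `scores` restricted to `allowed`; other items get 0."""
    kept = [item for item in scores if item in allowed]
    if not kept:
        return {item: 0.0 for item in scores}
    peak = max(scores[item] for item in kept)
    exps = {item: math.exp(scores[item] - peak) for item in kept}
    total = sum(exps.values())
    return {item: exps.get(item, 0.0) / total for item in scores}


def repeat_explore(scores, session_items, p_repeat):
    """Mix a repeat-mode and an explore-mode distribution over items."""
    seen = set(session_items)
    unseen = set(scores) - seen
    repeat = masked_softmax(scores, seen)      # items already in the session
    explore = masked_softmax(scores, unseen)   # items not yet interacted with
    return {
        item: p_repeat * repeat[item] + (1.0 - p_repeat) * explore[item]
        for item in scores
    }


scores = {"a": 2.0, "b": 1.0, "c": 0.5, "d": 0.1}
probs = repeat_explore(scores, session_items=["a", "b", "a"], p_repeat=0.7)
print(probs)
```

In this example, the items already in the session ("a" and "b") share a total probability of 0.7, and the unseen items share the remaining 0.3.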
User Interests and Preferences: In [103,104], the authors focus on modeling short-term user interests, thereby neglecting long-term dependencies in item sequences. However, long-term dependencies are crucial for achieving accurate recommendations [105]. With the number of users and items increasing significantly, SRSs face challenges in modeling short-term user interests and capturing long-term user interests. In [106], a memory augmented graph neural network (MA-GNN) is proposed to capture both long-term and short-term user interests. Specifically, a GNN is applied to model the contextual information of items in the short term, and a shared memory network is adopted to capture long-distance dependencies among items. To effectively integrate short-term and long-term interests, a gating mechanism is introduced in the GNN framework to adaptively combine the two hidden representations. The architecture of the MA-GNN is shown in Figure 7.
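The idea of adaptively combining two hidden representations with a gate can be sketched as follows. This is a generic element-wise gate, not the exact formulation of [106]: the gate parameters `w_short`, `w_long`, and `bias` are hypothetical stand-ins for learned weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(short_term, long_term, w_short, w_long, bias=0.0):
    """Element-wise gate g = sigmoid(w_s*s + w_l*l + b);
    output = g*s + (1-g)*l, a convex combination of both interests."""
    fused = []
    for s, l, ws, wl in zip(short_term, long_term, w_short, w_long):
        gate = sigmoid(ws * s + wl * l + bias)  # value in (0, 1)
        fused.append(gate * s + (1.0 - gate) * l)
    return fused


short_term = [0.9, -0.2, 0.4]   # e.g., output of the GNN on recent items
long_term = [0.1, 0.6, 0.4]     # e.g., output of the memory network
fused = gated_fusion(short_term, long_term,
                     w_short=[1.0, 1.0, 1.0], w_long=[1.0, 1.0, 1.0])
print(fused)
```

Because the gate lies strictly between 0 and 1, each fused component stays between the corresponding short-term and long-term values, and the gate decides which side dominates.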
Rich historical user behaviors are frequently implicit and noisy, thereby failing to fully reflect the user's actual preferences. Meanwhile, users' dynamic preferences can change rapidly over time, which makes them difficult to capture. To address these issues, SURGE integrates different types of preferences from long-term user behavior into clusters in a constructed interest graph. This helps clearly distinguish the user's core interests, with each cluster representing a core interest. Specifically, cluster-aware and query-aware GCN propagation and graph pooling are performed on the constructed graph. Consequently, SURGE dynamically fuses and extracts the user's currently activated core interests from the noisy user behavior sequences.
When the time span is large, user interests tend to drift, which makes them difficult to model accurately. Previous work on dynamic user interests has fallen short in terms of efficiency and of capturing user interests in a timely manner. For instance, researchers propose a Wasserstein reservoir that preserves samples of the worst-predicted sequences [107], which is then used to retrain the entire model. This method can learn new user preferences by retraining on new data. However, its updating approach fails to capture new item transitions in time. To address this issue, a time-augmented GNN for session-based recommendations (TASRec) is proposed to model dynamic user interests over long periods [108]. TASRec constructs a graph for each day to model the relationships among items. The same item may therefore have different neighbors on different days, which corresponds to shifts in user interest. A customized GNN is designed to embed the dynamic item graph and learn time-enhanced item representations. Finally, a sequential neural network structure is adopted to predict the next item in a given sequence.
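The per-day graph construction described for TASRec can be sketched with standard containers. The sketch assumes that each session carries a day label and that an edge links each item to the item clicked next; the function name, item names, and dates are hypothetical.

```python
from collections import defaultdict


def build_daily_graphs(sessions):
    """sessions: list of (day, [items in click order]).
    Returns {day: {item: set of items clicked right after it}}."""
    graphs = defaultdict(lambda: defaultdict(set))
    for day, items in sessions:
        for u, v in zip(items, items[1:]):
            graphs[day][u].add(v)
    # Convert to plain dicts so lookups do not create empty entries.
    return {day: dict(graph) for day, graph in graphs.items()}


sessions = [
    ("2017-11-25", ["phone", "case"]),
    ("2017-11-25", ["phone", "charger"]),
    ("2017-11-26", ["phone", "headphones"]),
]
daily_graphs = build_daily_graphs(sessions)

print(daily_graphs["2017-11-25"]["phone"])
print(daily_graphs["2017-11-26"]["phone"])
```

The item "phone" has different neighbors on the two days, which is the kind of day-to-day change in neighborhood that the text associates with user interest shifts.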
Most GNN-based SRSs model user interaction sequences as flat graphs, thereby ignoring the diversity of user preferences. To address this issue, researchers propose a novel SRS [109] that adopts hierarchical graph neural networks (HGNNs) to model user preferences. Firstly, it constructs the target user's graph structure by using a time-span-aware sequential graph (TSG), which contains the time spans among interacted items. Thereafter, all original nodes in the TSG are softly clustered into factor nodes, with each factor node representing a factor of user preferences. Finally, the representations of all factor nodes are jointly adopted to predict the SR results.
Most existing methods can only model user interests from each user's own sequence, thereby ignoring dynamic collaborative signals among different user sequences and failing to fully explore user preferences. In [110], a new method, a dynamic graph neural network for SR (DGSR), is proposed, which connects different user sequences through a dynamic graph structure to explore the interactions between users and items together with temporal and sequential information.
User Intent: This describes the motivation behind a user’s behavior, which is closely related to the next item that the user is likely to prefer.
To improve prediction accuracy, existing methods mainly adopt two paradigms: sequential patterns and co-occurrence patterns. Although these methods have achieved excellent results, they still fail to accurately model users' real intentions because they do not consider the complex transition patterns among interacted items. In [111], a novel enhanced graph neural network (E-GNN) framework is proposed for session-based recommendations. All anonymous users' interaction sequences are modeled as a weighted global item graph (WGIG), and the target user's current interaction session is modeled as a local session graph (LSG). The overall framework of the E-GNN is shown in Figure 8.
In Figure 8, the model is constructed based on a WGIG and an LSG. Specifically, the user’s preference is modeled and recommendations can be made based on the E-GNN. In GNN-based SRSs, it is often assumed that the transitions of interacting items correspond one-to-one with the evolution of users’ interaction intentions. However, in reality, the interaction patterns are variable, including sequential and co-occurrence patterns. To recognize and integrate multiple patterns, an E-GNN is constructed based on the WGIG and the LSG. The input of the E-GNN is the weight values of all edges in the WGIG and LSG. According to the edge connections in the LSG, the algorithm extracts the corresponding weights from the WGIG to represent the strength of edge directivity. Finally, a new adjacency matrix is obtained to better present the actual interaction sequence of users.
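The step of looking up LSG edges in the WGIG can be sketched as follows. The sketch assumes that the weight of a WGIG edge is the number of times the transition occurs across all sessions; the text does not state how the weights are defined, so this choice and both function names are hypothetical.

```python
from collections import Counter


def build_wgig(all_sessions):
    """Weighted global item graph: count each transition (u, v)
    over all anonymous sessions (counting is an assumption here)."""
    weights = Counter()
    for session in all_sessions:
        weights.update(zip(session, session[1:]))
    return weights


def weighted_session_adjacency(current_session, wgig):
    """Adjacency matrix of the local session graph whose entries are
    the global weights of the edges present in the current session."""
    items = sorted(set(current_session))
    index = {item: k for k, item in enumerate(items)}
    adjacency = [[0] * len(items) for _ in items]
    for u, v in zip(current_session, current_session[1:]):
        adjacency[index[u]][index[v]] = wgig.get((u, v), 0)
    return items, adjacency


all_sessions = [["a", "b", "c"], ["a", "b"], ["b", "c", "a"]]
wgig = build_wgig(all_sessions)
items, adjacency = weighted_session_adjacency(["a", "b", "c"], wgig)

print(items)
for row in adjacency:
    print(row)
```

Edges that are frequent across all sessions receive larger entries, so the resulting adjacency matrix reflects global transition strength rather than treating every local edge equally.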
Most existing methods fail to model latent user intentions, which are reflected by related items. In [112], a model with an intention-aware graph neural network (Int-GNN) is proposed, which aims to capture user intentions. This model comprehensively considers the frequency of item occurrences in user sessions and explores potential user intentions through the long intervals between repeated interactions with an item. Additionally, user preferences are considered during the prediction phase. Moreover, existing methods frequently ignore the noise signals present in session sequences. In [113], a novel intention-aware denoising graph neural network (ID-GNN) is proposed for session-based recommendations, highlighting the importance of user intentions with implicit user preferences.

2.3. Application of Attention Mechanisms in SRSs

This section summarizes the existing applications of attention mechanisms in SR, focusing on user interests and preferences, user intentions, and item and user representations.
User Interests and Preferences: In [27], researchers adopt a target-aware attention network to activate specific user interests, which are associated with target items. The interest representation vectors vary from one target item to another. Moreover, the flowchart of the target-attention graph neural network (TAGNN) is shown in Figure 9.
The GNN is first utilized to capture the complex item transitions in the session graph. Then, the target-aware attention network is applied to the resulting item embeddings to activate user interests with respect to each target item.
To achieve inductive and transferable capabilities [114], a relational attention GNN is trained on local subgraphs extracted from user–item pairs. Moreover, long-term and short-term temporal patterns of user preferences are encoded by the proposed sequential self-attention mechanism.
In [115], an attention mechanism is applied to learn the long-term interests of users, thereby incorporating other sessions to learn their short-term preferences. In [11], a network based on attention mechanisms adaptively aggregates the importance of long-term and short-term interests for prediction. In [116], the model adopts an attention mechanism to integrate long-term and short-term interests, which adapts users’ personalized preferences into an attention network, thereby learning user interest representations in SR.
In [117], the authors adopt cross-domain recommendation (CDR) to address the issue of data sparsity [118,119] and propose a cross-domain attentive SR (CD-ASR) model based on general and current user preferences, which applies a self-attentive sequential model to obtain the user's preferences in the target domain. The information learned from the source domain represents the general preferences, whereas the information learned from the target domain represents the user's current preferences. The importance of item $A_i$ to user $U_{k,h}$ is computed as shown in Equation (15).
$$s_{A_i}^{U_{k,h}} = f\left(e_{U_{k,h}}, e_{A_i}\right), \tag{15}$$
where $f(\cdot)$ is a pairwise similarity measure. The cosine similarity function is adopted in this work. Then, the attention weight of each item $A_i$ can be obtained by the following normalization operation, as shown in Equation (16):
$$\gamma_{A_i}^{U_{k,h}} = \frac{\exp\left(s_{A_i}^{U_{k,h}}\right)}{\sum_{A_i \in N_{A_i}^{U_{k,h}}} \exp\left(s_{A_i}^{U_{k,h}}\right) + \sum_{B_j \in N_{B_j}^{U_{k,h}}} \exp\left(s_{B_j}^{U_{k,h}}\right) + s_{U_{k,h}}^{U_{k,h}}}, \tag{16}$$
where $s_{B_j}^{U_{k,h}}$ and $s_{U_{k,h}}^{U_{k,h}}$ are the weights of the importance of each item $B_j$ to $U_{k,h}$ in domain B and of the self-connection of $U_{k,h}$, respectively. By replacing the corresponding user or item embeddings, $\gamma_{B_j}^{U_{k,h}}$ and $\gamma_{U_{k,h}}^{U_{k,h}}$ can be obtained in a similar way.
The second one is the sequence-aware attention mechanism. To distinguish the importance of linked users and items, another attention mechanism is developed to measure their sequential dependency on the target item, which can be defined as shown in Equations (17)–(19):
$$s_{A_{i-1}}^{A_i} = f\left(e_{A_{i-1}}, e_{A_i}\right), \tag{17}$$
$$s_{U_{k,h}}^{A_i} = f\left(e_{U_{k,h}}, e_{A_i}\right), \tag{18}$$
$$s_{A_i}^{A_i} = f\left(e_{A_i}, e_{A_i}\right), \tag{19}$$
where $s_{A_{i-1}}^{A_i}$ and $s_{U_{k,h}}^{A_i}$ denote the importance of item $A_{i-1}$ to $A_i$ and the importance of user $U_{k,h}$ to item $A_i$, respectively. $s_{A_i}^{A_i}$ is the importance of the self-connection of $A_i$. Then, the attention weight can be obtained using Equation (20):
$$\gamma_{A_{i-1}}^{A_i} = \frac{\exp\left(s_{A_{i-1}}^{A_i}\right)}{\sum_{A_{i-1} \in N_{A_{i-1}}^{A_i}} \exp\left(s_{A_{i-1}}^{A_i}\right) + \sum_{U_{k,h} \in N_{U_{k,h}}^{A_i}} \exp\left(s_{U_{k,h}}^{A_i}\right) + s_{A_i}^{A_i}}. \tag{20}$$
Moreover, the specific method is the same as the first attention mechanism module.
Considering the lack of time sensitivity of traditional self-attention mechanisms, researchers adopted a temporal self-attention network to model dynamic user preferences [36]. This network combines temporal information with self-attention mechanisms for next-item recommendation.
Conventional methods address users’ long-term sequential behavior in a left-to-right order, which may ignore some useful information. In addition, these methods ignore the fact that each user pays different amounts of attention to different items. In [120], the authors propose a novel hybrid model called an attentive hybrid recurrent neural network (AHRNN), which is designed to capture users’ generalized preferences and latest intentions. In addition, the first module of the model is a bidirectional long short-term memory network (Bi-LSTM), and the second module is a GRU. Both are equipped with a user-based attention mechanism.
User Intention: Previous recommendation methods ignore the user's intention in choosing an item, which is driven by specific factors of the item. In [121], a novel method called a disentangled graph neural network (Disen-GNN) is proposed to capture the session purpose, which assigns attention to each factor of an item. By representing each item with independent factors, it designs attention mechanisms to learn the user's intentions towards the different factors of each item in the session. Thereafter, the attention weights of each item factor are adopted to aggregate item embeddings, thereby generating session embeddings.
To recommend novel items to users, researchers designed a dual-intention network to learn user intentions from attention mechanisms and historical data distributions [102], respectively, simulating the decision-making process of users when interacting with novel items.
Changes in user preferences may introduce noise into the graph structure. In [95], the authors utilize a graph structure with an assisted attention mechanism to filter out noise information in each session, thereby generating accurate representations of user intentions.
Items and User Representation: In [89], an approach called the temporal interest attention network (TIAN) is presented, in which items that fall within a similar time interval are categorized into the same user interest group. Since items in the same group share the same user interest, the information from each group is incorporated into its corresponding items to obtain better item representations.
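The grouping of items by time interval can be sketched as follows. The exact grouping rule of [89] is not given in the text, so the sketch assumes a simple rule: a new interest group starts whenever the gap between two consecutive interactions exceeds a threshold. The threshold `max_gap`, the function name, and the example data are hypothetical.

```python
def group_by_time_gap(interactions, max_gap):
    """interactions: list of (item, timestamp) sorted by timestamp.
    Starts a new group when the gap to the previous interaction
    exceeds `max_gap`."""
    groups = []
    previous_time = None
    for item, timestamp in interactions:
        if previous_time is None or timestamp - previous_time > max_gap:
            groups.append([])
        groups[-1].append(item)
        previous_time = timestamp
    return groups


# Timestamps in seconds since the start of the session.
clicks = [("shoes", 0), ("socks", 40), ("laptop", 900), ("mouse", 930)]
groups = group_by_time_gap(clicks, max_gap=300)
print(groups)  # [['shoes', 'socks'], ['laptop', 'mouse']]
```

Each resulting group can then be summarized, for example by averaging its item embeddings, and the summary added back to the items of that group.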
In [38], the authors propose a novel global-level item representation learning layer (GCE-GNN), which adopts a session-aware attention mechanism to recursively integrate the neighbor embeddings of each node on the global graph. Moreover, the GCE-GNN aggregates two levels of learned item representations by using a soft attention mechanism.
In [101], the authors adopt an attention mechanism to build accurate session representations, which is applied to capture the user's major purpose in the current session, thereby merging it into a unified session representation.
In [122], the authors designed a novel recommendation method based on position-enhanced and time-aware graph convolutional networks (PTGCNs). This method defines position-enhanced and time-aware graph convolutional operations, which adopt a self-attention aggregator to learn the dynamic representations of users and items on a bipartite graph, thereby modeling the sequential patterns and temporal dynamics among user–item interactions. Additionally, the PTGCN achieves higher-order connectivity between users and items by stacking multiple layers of graph convolution. In [96], dual attention transfer based on multi-dimensional integration is proposed, which extracts information about users' representations from different sessions in multiple domains and combines multi-level session representations with a soft attention mechanism. In [52], the fusion method of item attribute embedding in the self-attention mechanism is modified to obtain an accurate user representation.
Beyond the functions described above, attention mechanisms continue to find applications in the following areas.
In [93], the authors adopt an attention mechanism to learn the weights of different orders of propagation. For HGNNs, the representation of each order is obtained by mixed-order propagation. Thereafter, the attention mechanism is utilized to determine the weight of each order and construct the mixed-order item representation; thus, the final item representation is obtained after several updating steps. In [123], the authors design two temporal attention operations in a Poincaré ball space to explicitly capture order-dependent information. In [124], the authors propose a novel higher-order attention graph neural network (HA-GNN), in which sessions are modeled as graph-structured data. Specifically, a self-attention mechanism is adopted to capture the dependencies among items, and then a soft-attention mechanism is adopted to learn the higher-order relationships in the graph. Finally, a simple fully connected layer is utilized to update the item embeddings.
In [125], the authors propose a torus-attention-based sequence recommendation (AGSR) model that constructs a torus over a sequence of user behaviors. AGSR can explore local features by applying cyclotomic attention to sub-cyclotomic graphs. Meanwhile, conventional approaches have endeavored to capture the dynamics of sequential patterns, but they overlook the importance of dynamically capturing social influence, which results in sub-optimal performance. In [126], a novel concept called social SR is presented, where the challenge lies in dynamically modeling social influence and capturing transition patterns among items in a time-sensitive manner. A time-sensitive attention mechanism is designed to capture complex item transition patterns. Moreover, a socially aware attention mechanism is adopted to measure the importance of each friend for the current user's taste, which helps explain why a user enjoys a particular item at a particular timestamp.

3. Metrics, Datasets, and Application Scenarios

3.1. Metrics

In the realm of SRSs, the formulation of evaluation metrics necessitates a comprehensive consideration of two critical dimensions: ranking quality, encapsulated by metrics such as the hit ratio (HR), normalized discounted cumulative gain (NDCG), area under the curve (AUC), recall, precision, and mean average precision (MAP), and sequential dynamics, which are typically quantified by indicators including the generalized area under the curve (GAUC) and mean reciprocal rank (MRR). This section presents a detailed description of the most prevalent evaluation metrics employed in SRSs.
(1) HR@K: This represents the proportion of test items that appear in the top K positions of the recommendation list, as shown in Equation (21) below.
$$\mathrm{HR@}K = \frac{\mathrm{Hits@}K}{|TN|}. \tag{21}$$
(2) NDCG@K: This measures the ranking quality of recommended items in a list, thereby emphasizing the importance of placing relevant items higher in the ranking, as shown in Equation (22) below.
$$\mathrm{NDCG@}K = Z_k \sum_{i=1}^{K} \frac{2^{r_i} - 1}{\log_2(i + 1)}, \tag{22}$$
where K represents the number of recommended items and |TN| represents the number of items in the test set. In Equation (21), the numerator Hits@K is the cumulative number of test-set items that appear among the top K items recommended to each user. $r_i$ is the relevance at position i: if the item at position i is in the test set, $r_i = 1$; otherwise, $r_i = 0$. $Z_k$ is the normalization factor.
(3) AUC: This indicates the proportion of item pairs, each consisting of one relevant and one irrelevant item, in which the relevant item receives the higher predicted score, which reflects the ability of the model to rank the samples, as shown in Equation (23) below.
$$\mathrm{AUC}(u) = \frac{\sum_{i \in T(u)} \sum_{j \in I \setminus T(u)} \mathbb{I}\left(\hat{r}_i > \hat{r}_j\right)}{|T(u)|\,|I \setminus T(u)|}. \tag{23}$$
(4) GAUC: This is an enhanced version of the AUC, which is calculated as a weighted average of each user's AUC. The weight is the number of clicks that the user has made, so that the metric reflects the quality of personalized recommendations across different users.
(5) MRR@N: This averages the reciprocal of the rank position of the target item in the top-N recommendation list, thereby focusing on whether the target item appears near the top [127], as shown in Equation (24) below.
M R R @ N = 1 α u U f r r u , q g r u , q g ,
where $q_g$ is the ground-truth user behavior, and $r_{u,q_g}$ is the ranking score generated by the SR model for $q_g$ and user $u$.
(6) Recall@N: This denotes the proportion of ground-truth items that are contained in the top-N recommendation list [128,129], as shown in Equation (25) below.
$$\mathrm{Recall}\ (\text{True Positive Rate}) = \frac{TP}{TP + FN}, \tag{25}$$
where TP is the number of relevant items that are recommended, FN is the number of relevant items that are not recommended, and TP + FN is the total number of relevant items.
(7) Precision: This represents the proportion of recommended items that are relevant, as shown in Equation (26) below.
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{26}$$
where FP is the number of items that are recommended but not selected by the user.
(8) MAP: The average precision (AP) is computed using Equation (27).
$$AP = \frac{\sum_{N=1}^{|R(u)|} \mathrm{Precision@}N \cdot \mathrm{rel}(N)}{|R(u)|}, \tag{27}$$
where $\mathrm{rel}(N) = 1$ if the $N$-th item in $R(u)$ is also in $T(u)$, and $\mathrm{rel}(N) = 0$ otherwise. The MAP is the mean of the AP over a set of users, as shown in Equation (28) below.
$$\mathrm{MAP} = \frac{1}{n} \sum_{k=1}^{n} AP_k, \tag{28}$$
where $AP_k$ is the AP of user k, and n is the number of users.
(9) MRR: This measures the average of the reciprocals of the rank positions of the first correct answer in each result list [130,131], as shown in Equation (29) below.
$$MRR = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{rank_i}, \tag{29}$$
where $N$ is the number of retrieval results, and $rank_i$ refers to the rank position of the first relevant item for the $i$-th retrieval result.
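The ranking metrics in Equations (23) and (25)–(29), together with GAUC, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the dictionary-based inputs, and the toy data are assumptions made for the example. The AP function follows Equation (27) as written and divides by the list length $|R(u)|$; many libraries instead divide by the number of relevant items, so values may differ.

```python
from typing import Mapping, Sequence, Set


def auc(scores: Mapping[str, float], relevant: Set[str]) -> float:
    """Share of (relevant, non-relevant) item pairs that are ordered correctly."""
    pos = [i for i in scores if i in relevant]
    neg = [j for j in scores if j not in relevant]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one non-relevant item")
    correct = sum(scores[i] > scores[j] for i in pos for j in neg)
    return correct / (len(pos) * len(neg))


def gauc(user_auc: Mapping[str, float], clicks: Mapping[str, int]) -> float:
    """Click-weighted average of per-user AUC values."""
    total_clicks = sum(clicks[u] for u in user_auc)
    return sum(user_auc[u] * clicks[u] for u in user_auc) / total_clicks


def precision_at(ranked: Sequence[str], relevant: Set[str], n: int) -> float:
    """TP / (TP + FP) over the top-n recommended items."""
    return sum(item in relevant for item in ranked[:n]) / n


def recall_at(ranked: Sequence[str], relevant: Set[str], n: int) -> float:
    """TP / (TP + FN): relevant items found in the top n, over all relevant items."""
    return sum(item in relevant for item in ranked[:n]) / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Sum of Precision@N * rel(N), divided by the list length |R(u)|."""
    total = sum(
        precision_at(ranked, relevant, n)
        for n in range(1, len(ranked) + 1)
        if ranked[n - 1] in relevant
    )
    return total / len(ranked)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]],
                         targets: Sequence[str]) -> float:
    """Mean of 1 / rank of each target item; a missing target contributes 0."""
    rr = [
        1.0 / (list(ranked).index(t) + 1) if t in ranked else 0.0
        for ranked, t in zip(rankings, targets)
    ]
    return sum(rr) / len(rr)


# Toy data (hypothetical): one ranked list with two relevant items.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
print(auc(scores, relevant), precision_at(ranked, relevant, 2),
      recall_at(ranked, relevant, 2), average_precision(ranked, relevant))
```

In this toy case, item "c" is scored below the non-relevant item "b", so one of the four (relevant, non-relevant) pairs is misordered and the AUC drops below 1.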

3.2. Datasets

In the realm of SR, datasets are of great significance. Based on the feedback mechanism, they fall into two types: explicit and implicit feedback datasets. Explicit feedback datasets contain the ratings that users provide for items. Examples include movie ratings, which are affected by the plot, acting, and cinematography; hotel ratings, which are determined by the service, facilities, and location; and e-commerce product ratings, which consider functionality, durability, and cost-effectiveness. Conversely, implicit feedback datasets record user–item interaction behaviors. For example, click-throughs show initial interest, purchases indicate a strong preference, and listening actions (notably on music or audiobook platforms) reveal genre preferences. Both types of datasets provide rich resources for the in-depth study of SR. Below, we introduce commonly used datasets that suit different application scenarios and user behavior patterns, with the aim of promoting algorithm development and the progress of this research field [130,131].
(1) Amazon Beauty, Sports, Toys, Clothing, Books [132]: These datasets are obtained from Amazon review datasets [133], which contain product reviews with extensive metadata [132].
(2) Yelp: This dataset is obtained from a business platform, which is adapted for personal, educational, and academic purposes.
(3) Taobao: This dataset is collected from the largest e-commerce platform in China [134], which contains user behaviors, such as clicks, cart actions, and purchases from 25 November to 3 December 2017.
(4) ML-1M: ML-1M is a large benchmark dataset, which is adopted for movie recommendations.
(5) Reddit: This dataset captures user interactions with subscribed topics on the Reddit platform.
(6) MovieLens-20M: This dataset contains rating behaviors, which are gathered from a movie review website.
(7) Diginetica: Released for the CIKM Cup 2016, this dataset contains five months of anonymous users' transaction data collected from an e-commerce website.
(8) Nowplaying: This dataset contains one year of users' music-listening behaviors, which are extracted from social media tweets. It originates from [135].
(9) Yoochoose: This dataset contains six months of anonymous users' click-stream data collected from an e-commerce website. It was provided for the RecSys 2015 challenge.
(10) Gowalla: This is a widely adopted check-in dataset from a well-known location-based social networking website.
(11) Tmall: This dataset is from a competition in IJCAI, which contains anonymous users’ shopping logs from the Tmall online website.
(12) RetailRocket: This is originally from a Kaggle contest published by an e-commerce company and contains the browsing activity of anonymous users over six months.
(13) LastFM: This is a popular music dataset that has been adopted as a benchmark in numerous recommendation tasks.
(14) Kuaishou: This industrial dataset is collected from the Kuaishou app, one of the largest short-video platforms in China, on which users browse short videos uploaded by other users.
(15) Ml-20m: This contains numerous triples [136].
(16) KKBOX: This dataset is provided by a famous music service, KKBOX, and contains extensive historical records of users who listened to music in a given period.
(17) JDATA: This dataset is extracted from JD.com, which is a famous Chinese e-commerce website. Moreover, it contains a stream of user actions on JD.com within two months. The operation types include clicking, ordering, commenting, adding to cart, and favorites.
(18) Xing: This collects job postings from a social network platform and contains interactions on job postings for 770,000 users.
(19) Google Local: This is a POI-based dataset crawled from Google Local that contains user reviews and POI data [47].
(20) Luxury, Digital, Software: These are Amazon customer review datasets [133], which are widely accepted as stable benchmark datasets for recommendation systems.
(21) Amazon G&GF: This is a subset of the Amazon dataset released in 2018. Item attributes, including brands and prices, are obtained from the item data. The Amazon dataset also describes its merchandise with a hierarchical taxonomy.
(22) Aotm: This is a music playlist dataset that contains users’ playlists and music identifiers [137].
(23) Steam: This dataset is collected from Steam, a video game distribution platform. It contains reviews, timestamps, and genres for a large number of games.
(24) Instagram: This dataset contains user check-in records at locations in three major urban areas: New York, Los Angeles, and London, which were crawled via the Instagram API in 2015 [138].
(25) HVIDEO: This is collected from a smart TV platform and involves the watch logs of family accounts in two domains, the education domain (E-domain) and the video domain (V-domain), from October 2016 to June 2017. The E-domain contains educational videos for students of all ages and instructional videos on daily life. The V-domain comes from a video-on-demand platform and contains movies, cartoons, and TV series. Family accounts are typically shared by multiple members, so their watch logs consist of mixed behaviors. Thus, this dataset is suitable for the SCSR task.
(26) MYbank: This is a larger and more challenging dataset collected from Ant Group, describing users’ interactions with financial products, such as debt, trust, and accounting.
The detailed descriptions of metrics and datasets are shown in Appendix A.
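Implicit feedback logs such as those listed above are usually converted into per-user chronological item sequences before an SR model is trained. The sketch below shows one common way to do this, followed by a leave-one-out split (last item for testing, second-to-last for validation). It is an assumption for illustration, not a protocol prescribed by this survey: the `(user_id, item_id, timestamp)` tuple layout, the `min_len` filter, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed log layout: (user_id, item_id, timestamp)
Interaction = Tuple[str, str, int]


def build_sequences(log: Iterable[Interaction],
                    min_len: int = 3) -> Dict[str, List[str]]:
    """Group interactions by user and order each user's items by timestamp."""
    by_user: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for user, item, ts in log:
        by_user[user].append((ts, item))

    sequences: Dict[str, List[str]] = {}
    for user, events in by_user.items():
        # Stable sort: interactions with equal timestamps keep their log order.
        events.sort(key=lambda e: e[0])
        if len(events) >= min_len:  # drop users with too few interactions
            sequences[user] = [item for _, item in events]
    return sequences


def leave_one_out(seq: List[str]) -> Tuple[List[str], str, str]:
    """Return (training prefix, validation item, test item) for one sequence."""
    if len(seq) < 3:
        raise ValueError("sequence must contain at least three items")
    return seq[:-2], seq[-2], seq[-1]


# Toy log (hypothetical): user u2 is filtered out because it is too short.
log = [("u1", "a", 3), ("u1", "b", 1), ("u1", "c", 2), ("u2", "x", 1)]
sequences = build_sequences(log)
train, valid, test = leave_one_out(sequences["u1"])
print(sequences, train, valid, test)
```

Sorting by timestamp before splitting matters because the held-out item must be the most recent interaction; otherwise the model would be evaluated on predicting the past from the future.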

3.3. The Latest Application Scenarios of SRSs

SRSs are applied in a wide range of scenarios and have achieved abundant results.
On video platforms, the core objective of an SRS is to predict the content that users may be interested in next based on their historical viewing behavior, providing a personalized experience. Service providers such as TikTok and YouTube [97] use SR algorithms to analyze users' behavior history and preferences [22,115,123,125,139] and thereby recommend personalized movies, TV series, or other content to users [80].
Social media platforms such as Twitter and Facebook use SR technology to analyze user interaction behaviors, such as likes, comments, and shares, and then present users with new content that they may find interesting [23,24,140,141,142]. Applying an SRS on social media can effectively enhance the user experience, increase user engagement, and lengthen user dwell time on the platform. However, it also faces open challenges, such as how to balance novelty and accuracy and how to deliver real-time recommendations on large-scale data.
The application of SRSs in e-commerce is extensive and mature. Commonly, an SRS predicts the products or services that users may be interested in next based on their historical purchase behavior, browsing records, rating data, and other interaction information. An e-commerce platform typically combines various recommendation algorithms into a complex recommendation system, which must be iteratively optimized through user feedback to achieve higher accuracy and user satisfaction. Whether the security and privacy of user data can be adequately protected is also a research focus [31,32,47,48,143].
In news recommendation, SRSs analyze users' historical reading behavior to predict the news content that users may be interested in. This helps news platforms provide personalized services, increase user engagement, enhance the user experience, and help users discover more high-quality content [25,46,53,83,86].
The application of SRSs in intelligent transportation can enhance travel efficiency and user experience. Typically, an SRS recommends optimal travel routes based on users' historical travel trajectories and time patterns. By taking users' travel habits into account, it can recommend the most suitable mode of transportation and service. It can also recommend attractions, restaurants, and accommodations based on the sequence in which users visit travel destinations. In intelligent transportation applications, SRSs typically have to process complex spatiotemporal data and respond to changes in traffic conditions in real time. In this way, SRSs in intelligent transportation can bring extensive conveniences to citizens and city managers [49,52,78,85,116].

4. Future Development Trends

4.1. Explainability

Explaining DL models is a challenging task. Providing explanations enables users to understand the reasons behind the recommendations, which helps improve the transparency and persuasiveness of recommendation systems. Explanations also help researchers understand how the model works, which is beneficial for further debugging and improvement of the model. With the widespread application of GNNs, recent research has begun to investigate the interpretability of GNNs and recommendation systems [144,145,146].
Explainable recommendation aims to develop novel models that not only generate high-quality recommendation results but also provide intuitive explanations. These explanations can be post hoc or directly derived from an interpretable model [146]. Explainable recommendation systems still require further exploration.

4.2. Fairness

Seeking accuracy alone in a recommendation system may bring side effects, such as unfair and overly specialized recommendation results. A fair recommendation system gives consumers or providers unbiased recommendation lists. Typical issues in recommendation fairness can be divided into three groups: the first deals with popularity bias [56], the second tackles demographic bias [57], and the third imposes statistical parity on recommendations [58].
Some studies have been dedicated to exploring the fairness of recommendation systems [54]. One focus is calibrated recommendation for SR, which aims to provide fairer suggestions and enhance the diversity of recommendations. In [55], a novel fairness-aware SR task is proposed, together with a new metric, interaction fairness, which estimates how fairly users in different protected attribute groups interact with the recommended items. Because bias is prevalent in recommendation systems, further research on fairness is required.

4.3. Diversity

Some research has shown that diversity is an important factor that may affect the performance of recommendation systems, and that users may prefer more diverse recommendations [59]. Specifically, diversity requires a recommendation system to generate a list of items that covers a wider range of attributes.
Studies on diversification in traditional recommendation tasks typically improve diversity by reordering the items in the candidate list generated by a general recommendation model [60]. However, these methods do not apply to SR. Diversity has also been studied specifically for SRSs. In [62], an intent-aware diversity-promoting (IDP) loss is designed to supervise the learning of the implicit intent mining (IIM) module, which forces the model to take recommendation diversity into consideration during training. In [103], an IIM module is introduced to extract users' multiple interests, and an IDP decoder is then used to produce diversified recommendations that gradually satisfy those interests. We believe that further research should be conducted on diversity in SRSs.
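The candidate-list reordering used by traditional diversification methods can be illustrated with a greedy re-ranker. The sketch below uses a maximal-marginal-relevance style rule, which is one well-known instance of this idea and is chosen here only for illustration; it is not the method of [60], [62], or [103]. The attribute sets, the Jaccard similarity, and the `trade_off` parameter are assumptions made for the example.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Attribute overlap between two items, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank_for_diversity(scores: Dict[str, float],
                         attrs: Dict[str, Set[str]],
                         k: int,
                         trade_off: float = 0.5) -> List[str]:
    """Greedily pick k items, trading relevance against similarity to earlier picks.

    trade_off = 1.0 reproduces the pure relevance ranking; smaller values
    penalize items whose attributes overlap with already selected items.
    """
    selected: List[str] = []
    remaining = set(scores)
    while remaining and len(selected) < k:

        def marginal(item: str) -> float:
            max_sim = max(
                (jaccard(attrs[item], attrs[s]) for s in selected),
                default=0.0,
            )
            return trade_off * scores[item] - (1.0 - trade_off) * max_sim

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy candidates (hypothetical): two similar high-score items and one different item.
scores = {"a": 0.9, "b": 0.85, "c": 0.6}
attrs = {"a": {"rock"}, "b": {"rock"}, "c": {"jazz"}}
diverse = rerank_for_diversity(scores, attrs, k=2, trade_off=0.5)
relevance_only = rerank_for_diversity(scores, attrs, k=2, trade_off=1.0)
print(diverse, relevance_only)
```

Because this kind of post-processing only reorders an already generated candidate list, it cannot influence how the sequence model itself is trained, which is one reason end-to-end approaches with a diversity-aware loss have been explored for SR.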

4.4. Cross-Domain SR

With the development of deep neural networks and graph learning techniques, CDR has attracted increasing research attention. Most importantly, several existing studies have shown that DL is effective in capturing cross-domain generalizations and trends, which can also generate better recommendations on cross-domain platforms [72]. Additionally, CDR can transfer information from multiple source domains to address the sparsity of the target domain. Existing CDR approaches are divided into four types: single-target CDR, multi-domain recommendation, dual-target CDR, and multi-target CDR [73].
SR can model temporal information and thereby extract the user's current preferences. However, data sparsity is a major challenge for SR models and results in poor performance and inaccurate recommendations. Because CDR can alleviate sparsity by transferring information from other domains, cross-domain SR is a promising research direction.

4.5. The Dynamics in SRS

In an SRS, users, items, and other entities are time-varying, and their relationships also evolve. Therefore, the recommendation system requires continuous iteration and updating with new information. Recently proposed recommendation methods primarily focus on static user–item interactions [74,75]. However, user preferences are dynamic, so a user's current preference may differ from what the historical interactions with items suggest. Additionally, when a GNN is applied to capture the transition relationships between items, the graph used by the GNN is constructed in a static manner. If user preferences change, this may introduce noise into the graph's structure [95]. Research on dynamic graphs in SR is still insufficient and is worth further exploration.
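To make the notion of a statically constructed graph concrete, the following sketch builds a directed item-transition graph from a single interaction sequence, with edge weights counting consecutive transitions and an optional normalization by outgoing weight. It is a simplified illustration under stated assumptions, not the construction used by any specific model cited here; the function names and the toy session are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source item, destination item)


def build_session_graph(session: List[str]) -> Dict[Edge, int]:
    """Count how often each consecutive item transition occurs in the session."""
    return dict(Counter(zip(session, session[1:])))


def normalize_outgoing(edges: Dict[Edge, int]) -> Dict[Edge, float]:
    """Scale edge weights so that each node's outgoing weights sum to 1."""
    out_total: Counter = Counter()
    for (src, _), weight in edges.items():
        out_total[src] += weight
    return {(src, dst): w / out_total[src] for (src, dst), w in edges.items()}


# Toy session (hypothetical item identifiers).
session = ["a", "b", "a", "b", "c"]
graph = build_session_graph(session)
weights = normalize_outgoing(graph)
print(graph, weights)
```

The resulting graph records that "a" was followed by "b" twice, but not when those transitions happened. An early transition and a recent one contribute equally, which is the kind of information loss that motivates dynamic graph construction.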

5. Conclusions

This review first provides an overview of SRSs, briefly reviews traditional recommendation systems, and elucidates the significance of DL in SRSs. Thereafter, this paper summarizes the most widely utilized DL models in current SR, which can be divided into three types: CL, GNNs, and attention mechanisms. More importantly, this paper presents a detailed explanation of these categories and highlights a series of influential research models. Additionally, we summarize the adopted datasets and metrics in these models. Finally, this paper discusses the future directions for the development of an SRS. We hope that this survey gives readers a comprehensive understanding of the critical aspects of the latest developments in this field, which may provide some insights for future research.
Although this survey provides a summary and overview of SRSs, some deficiencies remain. First, the exploration of multimodal data fusion is still inadequate. For example, traditional methods typically concatenate multimodal data with sequential data, which entangles dynamic user interests with static features. One possible remedy is to construct a ternary temporal heterogeneous graph of users, items, and modalities. Second, the application of large models in DL-based recommendation systems requires deeper exploration, for instance, long-tail semantic distillation driven by large models. Specifically, there is a lack of in-depth discussion on how to use the language understanding and generation capabilities of LLM-based recommendation systems to enhance recommendation effectiveness, for example, through dynamic intent parsing and user portrait generation, generative multi-round dialogue-based enhanced recommendation, and knowledge-enhanced cross-modal sequential alignment.
In future research, we will concentrate on SRSs and explore them across multiple key aspects. For multimodal data fusion, we will seek effective ways to combine image, audio, and text data in sequential scenarios, which helps us better map users' time-varying interests. Regarding large-model applications, we will analyze how large language models interact with sequential features to improve recommendation accuracy across different time sequences. Fairness and interpretability are also crucial: we will mitigate biases arising from user traits or sequence-related factors and find ways to explain recommendations, thus enhancing user trust. We will also focus on cross-domain integration, borrowing methods from other fields to boost the system's capabilities. Finally, we will develop advanced algorithms and models that capture and apply key features that change over time, optimizing the recommendation decision-making process. Our goal is to offer valuable ideas for the intelligent evolution of SRSs and to fuel continuous innovation in this field.

Author Contributions

Conceptualization, P.W. and J.G.; methodology, H.S.; software, Y.L. and X.D.; validation, J.G., Y.L. and Z.L.; formal analysis, P.W. and W.S.; investigation, P.W., T.C. and Y.L.; resources, J.G. and C.H.; data curation, J.G. and Z.H.; writing—original draft preparation, P.W.; writing—review and editing, P.W. and W.Q.; visualization, P.W., Y.D. and Y.L.; supervision, J.G., Z.L. and H.S.; project administration, H.S.; funding acquisition, P.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Funded Postdoctoral Research Program (GZC20241900), the Natural Science Foundation Program of Xinjiang Uygur Autonomous Region (2024D01A141), the Tianchi Talents Program of Xinjiang Uygur Autonomous Region and Postdoctoral Fund of Xinjiang Uygur Autonomous Region (Li Zhibin), the Sichuan University students innovation and entrepreneurship training program (S202410621082), the Chengdu University of Information Technology key project of education reform (JYJG2024206), the open project of Dazhou Key Laboratory of Government Data Security under Grant (ZSAQ202401/ZSAQ202409/ZSAQ202414/ZSAQ202422/ZSAQ202423), and Key Laboratory of Remote Sensing Application and Innovation (LRSAI-2025004).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the corresponding author on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

RS: Recommendation Systems
CF: Collaborative Filtering
CBF: Content-Based Filtering
KBF: Knowledge-Based Filtering
SRS: Sequential Recommendation System
ARIMA: Auto-Regressive Integrated Moving Average
SVM: Support Vector Machine
GBRT: Gradient Boosting Regression Tree
HMM: Hidden Markov Model
DL: Deep Learning
CL: Contrastive Learning
GNN: Graph Neural Network
DCRec: De-biased Contrastive learning paradigm for Recommendation system
MCLSR: Multi-level Contrastive Learning framework for Sequential Recommendation
KV-MN: Key-Value Memory Network
KB: Knowledge Base
GGNN: Gated Graph Neural Network
HG-GNN: Heterogeneous Global Graph Neural Network
GCE-GNN: Global Context-Enhanced Graph Neural Network
SGNN-HN: Star GNN with Highway Networks
TE-GNN: Time-Enhanced Graph Neural Network
RN-GNN: Recurrent Neural Graph Neural Network
LESSR: Lossless Edge-order preserving aggregation and Shortcut graph attention for Session-based Recommendation
EOPA: Edge-Order Preserving Aggregation
SGAT: Shortcut Graph Attention
FGNN: Fully connected Graph Neural Network
WGAT: Weighted Graph Attention
HGNN: Hybrid sequential gated GNN
SR-GNN: Session-based Recommendation with Graph Neural Networks
DGS-MGNN: Dynamic Global Structure-enhanced Multi-channel Graph Neural Network
DAT-MDI: Dual Attention Transfer based on Multi-Dimensional Integration
GCN: Graph Convolutional Networks
GRU: Gated Recurrent Units
ReGNN: GNNs with a Repetition exploration mechanism
ZSL: Zero-Shot Learning
MA-GNN: Memory Augmented Graph Neural Network
TASRec: Time-Augmented GNN for Session-based Recommendations
TSG: Time span-aware Sequential Graph
DGSR: Dynamic Graph neural network for SR
E-GNN: Enhanced Graph Neural Network
WGIG: Weighted Global Item Graph
LSG: Local Session Graph
Int-GNN: Intention-aware Graph Neural Network
ID-GNN: Intention-aware Denoising Graph Neural Network
TAGNN: Target-Attention Graph Neural Network
CDR: Cross-Domain Recommendation
CD-ASR: Cross-Domain Attentive SR
AHRNN: Attentive Hybrid Recurrent Neural Network
Bi-LSTM: Bidirectional Long Short-Term Memory
Disen-GNN: Disentangled Graph Neural Network
TIAN: Temporal Interest Attention Network
HA-GNN: Higher-order Attention Graph Neural Network

Appendix A. The Descriptions of Metrics and Datasets

| Reference | Baseline | Metric | Dataset |
| --- | --- | --- | --- |
| [15] | Pop, BPR-MF, NCF, GRU4Rec+, SASRec, GC-SAN, S3-RecMIP | HR, NDCG | Beauty, Sports, Yelp, ML-1M |
| [52] | BPR-MF, GRU4Rec, Caser, SASRec, FDSA, S3Rec, CL4SRec, ICLRec | HR, NDCG | Beauty, Sports, ML-1M |
| [53] | BPR, FPMC, GRU4Rec, Time-LSTM, Caser, TiSASRec, CL4SRec, FMLPRec, DuoRec | HR, NDCG | MovieLens, Beauty, Video Games, CDs&Vinyl, Movies&TV |
| [65] | BPR, NCF, GC-MC, LightGCN, SGL, CKE, RippleNet, KGCN, KGAT, KGIN, CKAN, MVIN | Recall, NDCG | Yelp2018, Amazon-book, MIND |
| [8] | BPR-MF, Caser, GRU4Rec, SASRec, BERT4Rec | HR, NDCG | Beauty, Sports, Toys, Yelp |
| [9] | PopRec, GRU4Rec, Caser, BERT4Rec, SASRec, DSSRec, S3-RecMIP,SP, CL4SRec, CoSeRec | HR, NDCG | Sports, Beauty, Yelp, Toys |
| [12] | PopRec, BPR-MF, GRU4Rec, SASRec, Bert4Rec, S3-Rec, CL4SRec, CoSeRec | HR, NDCG | Beauty, Sports, Yelp, Toys, VideoGames, Health, Apps, Tmall |
| [78] | PopRec, FPMC, GRU4Rec, Caser, SR-GNN, SASRec, BERT4Rec, SSE-PT, DGCF, PTGCN | Recall, NDCG | MovieLens, CDs, Beauty |
| [116] | NCF, DIN, LightGCN, Caser, GRU4Rec, DIEN, CLSR | AUC, GAUC, MRR, NDCG | Taobao, Amazon, Yelp |
| [10] | DIN, Caser, GRU4REC, DIEN, SASRec, SLi-Rec | AUC, MRR, NDCG, WAUC | Taobao, Amazon Toys |
| [13] | GRU4Rec, GC-SAN, SASRec, S3Rec(MLP), CL4Rec, DuoRec, GEC4SRec | HR, NDCG | Beauty, Sports, ML-1M |
| [83] | BPR-MF, GRU4Rec, Caser, SASRec, BERT4Rec, S3Rec(MLP), CL4SRec, DuoRec | HR, NDCG | Beauty, Clothing, Sports, ML-1M |
| [14] | Caser, GRU4Rec, SASRec, BERT4Rec, SR-GNN, GCSAN, SURGE, S3-Rec, CL4SRec, DuoRec, ICLRec | HR, NDCG | Reddit, Beauty, Sports, Movielens-20M |
| [84] | Mult-VAE, DNN+SSL, BUIR, MixGCL | Recall, NDCG | Douban-Book, Yelp2018, Amazon-Book |
| [37] | POP, GRU4REC, NARM, RNN-KNN, STAN, CSRM, SR-GNN, NISER+, GCE-GNN | Recall, MRR | Diginetica, Nowplaying, Yoochoose |
| [26] | POPRec, GRU4Rec, SASRec, ComiRec-SA, GCSAN, S3-RecMIP, CL4SRec, DuoRec, MCLSR | Recall, NDCG, Hit | Amazon, Gowalla |
| [85] | FPMC, GRU4REC, NARM, STAMP, SASRec, BERT4Rec, SR-GNN, CSRM, FGNN, GC-SAN, GCE-GNN, TASRec, S2-DHCN | HR, MRR | Tmall, Diginetica, Gowalla, RetailRocket, Nowplaying, LastFM |
| [86] | GRU4Rec, Caser, SASRec, S3RecMIP, CL4SRec, DuoRec, CFIT4SRec | HR, NDCG | Beauty, Clothing, Sports, ML-1M |
| [87] | GRU4Rec, Caser, NItNet, SASRec, GRU-SQN, Caser-SQN, NItNet-SQN, SASRec-SQN, CP4Rec, CP4Rec-SQN, ICM, GIRIL, EMI, DAM | HR, NDCG | RC15, RetailRocket |
| [11] | NCF, DIN, LightGCN, Caser, GRU4REC, DIEN, SASRec, SURGE, SLi-Rec | AUC, GAUC, MRR, NDCG | Taobao, Kuaishou |
| [29] | BPR-MF, FPMC, GRU4REC, GRU4REC+, NARM, STAMP, SR-GNN, KSR | Recall, NDCG | Ml-20m, Ml-1m, Book |
| [27] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, SR-GNN | Precision, MRR | Diginetica, Yoochoose |
| [88] | FPMC, GRU4REC+BPR, GRU4REC+CE, NARM, STAMP, SR-GNN, RIB, KM-SR, M(GRU)-SR, M(GGNN)-SR, M(GGNNx2)-SR, M-SR | Hit, MRR | KKBOX, JDATA |
| [34] | ItemKNN, GRU4Rec, NARM, SR-GNN, LESSR, GCE-GNN, H-RNN, A-PGNN | HR, MRR | LastFM, Xing, Reddit |
| [33] | S-POP, FPMC, GRU4REC, NARM, CSRM, STAMP, SR-IEM, SR-GNN, NISER+ | Precision, MRR | Yoochoose, Diginetica |
| [35] | POP, ItemKNN, FPMC, GRU4REC, NARM, STAMP, SR-GNN, DGTN, LESSR, TAGNN | MRR, Precision | Diginetica, Yoochoose |
| [89] | POP, Item-KNN, FPMC, GRU4REC, NARM, STAMP, CSRM, SR-IEM, SR-GNN, TAGNN, GCE-GNN | Precision, MRR | Diginetica, Tmall, Nowplaying, Retailrocket |
| [37] | POP, Item-KNN, GRU4REC, NARM, RNN-KNN, STAN, SR-GNN, NISER+, GCE-GNN | Recall, MRR | Diginetica, Nowplaying, Yoochoose |
| [40] | Item-KNN, FPMC, NextItNet, NARM, FGNN, SR-GNN, GC-SAN, LESSR | HR, MRR | Diginetica, Gowalla, LastFM |
| [91] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, SR-GNN, FGNN-SG-Gated, FGNN-SG-ATT, FGNN-SG, FGNN-BCS-0, FGNN-BCS-1, FGNN-BCS-2, FGNN-BCS-3 | Recall, MRR | Yoochoose, Diginetica |
| [92] | ItemKNN, ItemKNN(geo), FPMC, NextItNet, NARM, STAMP, SR-GNN, SSRM, SNextItNet, SNARM, SSTAMP, SSR-GNN, SSSRM | HR, MRR | Gowalla, Delicious, Foursquare |
| [93] | Item-KNN, FPMC, PRME, GRU4REC, NextItNet, NARM, STAMP, SR-GNN, GC-SAN, FGNN, SR-HGNN | Precision, MRR | Yoochoose, Diginetica |
| [42] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, SR-GNN | Precision, MRR | Yoochoose, Diginetica |
| [94] | POP, S-POP, Item-KNN, FPMC, GRU4Rec, SKNN, NARM, STAMP, SRGNN, TAGNN | Recall, MRR | Retailrocket, Yoochoose, Diginetica, Xing, Reddit |
| [95] | POP, Item-KNN, FPMC, GRU4REC, NARM, STAMP, CSRM, DSAN, SR-GNN, TAGNN, COTREC, GCE-GNN | Precision, MRR | Diginetica, Yoochoose, Retailrocket |
| [36] | POP, BPRMF, FPMC, GRU4REC, SASRec, TiSASRec, SR-GNN, CatGCN, CTGNN | NDCG, Recall | Taobao, Diginetica, Amazon |
| [96] | POP, Item-KNN, FPMC, GRU4Rec, NARM, STAMP, SR-GNN, FGNN, GC-SAN, GCE-GNN | Precision, MRR | Diginetica, Yoochoose, Gowalla, LastFM |
| [98] | MP, BPR, Mult-DAE, LightGCN, FPMC, TransRec, GRU4Rec, NARM, Caser, SASRec, MCF, CKE, LightGCN+, MoHR | HR, NDCG, MRR | Amazon, Books, Yelp, Google Local |
| [99] | BPR-MF, FPMC, GRU4Rec, AttRec, Caser, HGN | Recall, NDCG | ML100k, Luxury, Digital, Software |
| [100] | POP, FPMC, Item-KNN, GRU4REC, NARM, STAMP, SR-GNN, DHCN, GCE-GNN | Precision, MRR | Diginetica, Tmall, Yoochoose |
| [94] | POP, S-POP, ItemKNN, FPMC, GRU4Rec, SKNN, NARM, STAMP, SRGNN, TAGNN | Recall, MRR | Retailrocket, Yoochoose, Diginetica, Xing, Reddit |
| [101] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, RepeatNet, SR-GNN | Precision, MRR | Yoochoose, Diginetica |
| [102] | SR-GNN, SR-GNN-ATT, GC-SAN, GCE-GNN, COTREC | Precision, MRR | AmazonG&GF, Yelpsmall |
| [106] | BPRMF, GRU4Rec, GRU4Rec+, GC-SAN, Caser, SASRec, MARank | Recall, NDCG | CDs, Books, Children, Comics, ML20M |
| [107] | POP, S-POP, BPR-MF, GRU4REC, NARM, FGNN, SSRM | Recall, MRR | LastFM, Gowalla |
| [108] | GRU4REC, SR-GNN, CSRM, LESSR, TASRec | Recall, NDCG | Aotm, Diginetica, Retailrocket |
| [109] | FPMC, GRU4REC+, SASRec, SR-GNN, GC-SAN, FGNN, RetaGNN, HGNN-GAT1, HGNN-GAT2, HGNN-T, HGNN-En | Hit, RR | Steam, MovieLens |
| [110] | BPR-MF, FPMC, GRU4Rec+, Caser, SASRec, SR-GNN, HGN, TiSASRec, GCE-GNN, SERec, HyperRec | NDCG, Hit | Ablation, Beauty, Games, CDs, ML-1M |
| [111] | Item-POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, SR-GNN | Recall, MRR | Yoochoose, Diginetica |
| [112] | POP, GRU4Rec, NARM, STAMP, SR-GNN, NISER, LESSR, GCE-GNN, DSAN, DHCN, COTREC | Precision, MRR | Diginetica, Tmall, RetailRocket |
| [113] | FPMC, STAMP, GC-SAN, GCE-GNN, DHCN, ID-GNN | HR, MRR, NDCG | Tmall, Yoochoose |
| [27] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, SR-GNN | Precision, MRR | Diginetica, Yoochoose |
| [114] | HGN, SASRec, HAM, MA-GNN, HGAN, GCMC, IGMC | NDCG, Recall, Precision | Instagram, MovieLens, Book-Crossing |
| [100] | POP, FPMC, Item-KNN, GRU4REC, NARM, STAMP, SR-GNN, DHCN, GCE-GNN | P, MRR | Diginetica, Tmall, Yoochoose |
| [11] | NCF, DIN, LightGCN, Caser, GRU4REC, DIEN, SASRec, SURGE | AUC, GAUC, MRR, NDCG | Taobao, Kuaishou |
| [116] | NCF, DIN, LightGCN, Caser, GRU4REC, DIEN, CLSR | AUC, GAUC, MRR, NDCG | Taobao, Amazon-Movie and TV, Yelp |
| [117] | Pop, BPR-MF, FPMC, SASRec, FISSA-lg, CoNet, CD-SASRec, CD-ASR | HR, NDCG | Books, Movies, Music |
| [147] | Item-KNN, BPR-MF, NCF, LightGCN, VUI-KNN, NCF-MLP++, Conet, GRU4REC, HGRU4REC, NAIS, Time-LSTM, TGSRec, π-Net, PSJNet, DA-GCN | MRR, Recall | HAMAZON, HVIDEO |
| [36] | POP, BPRMF, FPMC, GRU4REC, SASRec, TiSASRec, SR-GNN, CatGCN, CTGNN | NDCG, Recall | Taobao, Diginetica, Amazon |
| [120] | POP, ItemKNN, BPRMF, GRU4Rec, AFM, RBM, Caser, TransRec, SASRec | Recall | MovieLens1M, Tmall |
| [121] | POP, Item-KNN, FPMC, GRU4REC, NARM, STAMP, SR-GNN, TAGNN | Precision, MRR | Diginetica, Yoochoose 1, Nowplaying |
| [102] | SR-GNN, SR-GNN-ATT, GC-SAN, GCE-GNN, COTREC | Precision, MRR | AmazonG&GF, Yelpsmall |
| [95] | POP, Item-KNN, FPMC, GRU4REC, NARM, STAMP, CSRM, DSAN, SR-GNN, TAGNN, COTREC, GCE-GNN | Precision, MRR | Diginetica, Yoochoose, Retailrocket |
| [89] | POP, Item-KNN, FPMC, GRU4REC, NARM, STAMP, CSRM, SR-IEM, SR-GNN, TAGNN, GCE-GNN | Precision, MRR | Diginetica, Tmall, Nowplaying, Retailrocket |
| [38] | POP, Item-KNN, FPMC, GRU4Rec, NARM, STAMP, CSRM, SR-GNN, FGNN, FGNN | Precision, MRR | Diginetica, Tmall, Nowplaying |
| [101] | POP, S-POP, Item-KNN, BPR-MF, FPMC, GRU4REC, NARM, STAMP, RepeatNet, SR-GNN | Precision, MRR | Yoochoose, Diginetica |
| [122] | BPR, FPMC, GRU4Rec+, Caser, SASRec, BERT4Rec, DHCN, TiSASRec, DGCF | Recall, NDCG | MovieLens, Amazon CDs_and_Vinyl, Amazon Movies_and_TV |
| [96] | POP, Item-KNN, FPMC, GRU4Rec, NARM, STAMP, SR-GNN, FGNN, GC-SAN, GCE-GNN | Precision, MRR | Diginetica, Yoochoose, Gowalla, LastFM |
| [52] | BPR-MF, GRU4Rec, Caser, SASRec, FDSA, S3Rec, CL4SRec, ICLRec | HR, NDCG | Beauty, Sports, ML-1M |
| [93] | Item-KNN, FPMC, PRME, GRU4REC, NextItNet, NARM, STAMP, SR-GNN, GC-SAN, FGNN, SR-HGNN | Precision, MRR | Yoochoose, Diginetica |
| [123] | FPMC, FOSSIL, GRU4Rec, NARM, HGN, SASRec, LightSANs, HME, SRGNN, GCSAN, LESSR | HR, NDCG, MAP | Beauty, Pet, TH, MYbank |
| [124] | POP, FPMC, Item-KNN, GRU4Rec, NARM, STAMP, SR-GNN, TAGNN, ICM-SR | MRR, Precision | Yoochoose, Diginetica |
| [125] | POP, BPRFMC, FPMC, Fossil, GRU4Rec, Caser | Prec, Recall, MAP | MovieLens, Gowalla |

References

1. Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, Hong Kong, China, 1–5 May 2001; pp. 285–295.
2. Pazzani, M.J.; Billsus, D. Content-based recommendation systems. In The Adaptive Web: Methods and Strategies of Web Personalization; Springer: Berlin/Heidelberg, Germany, 2007; pp. 325–341.
3. Wang, S.; Hu, L.; Wang, Y.; Cao, L.; Sheng, Q.; Orgun, M. Sequential recommender systems: Challenges, progress and prospects. arXiv 2019, arXiv:2001.04830.
4. Wang, S.; Cao, L.; Wang, Y.; Sheng, Q.; Orgun, M.; Lian, D. A survey on session-based recommender systems. ACM Comput. Surv. 2021, 54, 1–38.
5. Burke, R. Hybrid recommender systems: Survey and experiments. User Model. User-Adapt. Interact. 2002, 12, 331–370.
6. Adomavicius, G.; Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 2005, 17, 734–749.
7. Bachman, P.; Hjelm, R.D.; Buchwalter, W. Learning representations by maximizing mutual information across views. In Proceedings of the Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; p. 32.
8. Chen, Y.; Liu, Z.; Li, J.; McAuley, J.; Xiong, C. Intent contrastive learning for sequential recommendation. In Proceedings of the ACM Web Conference, Lyon, France, 25–29 April 2022; pp. 2172–2182.
9. Li, X.; Sun, A.; Zhao, M.; Yu, J.; Zhu, K.; Jin, D.; Yu, M.; Yu, R. Multi-intention oriented contrastive learning for sequential recommendation. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining, Singapore, 27 February–3 March 2023; pp. 411–419.
10. Lin, G.; Gao, C.; Li, Y.; Zheng, Y.; Li, Z.; Jin, D.; Li, Y. Dual contrastive network for sequential recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 2686–2691.
11. Zheng, Y.; Gao, C.; Chang, J.; Niu, Y.; Song, Y.; Jin, D.; Li, Y. Disentangling long and short-term interests for recommendation. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 2256–2267.
12. Wei, Z.; Wu, N.; Li, F.; Wang, K.; Zhang, W. MoCo4SRec: A momentum contrastive learning framework for sequential recommendation. Expert Syst. Appl. 2023, 223, 119911.
13. Yang, X.Y.; Xu, F.; Yu, J.; Li, Z.; Wang, D. Graph neural network-guided contrastive learning for sequential recommendation. Sensors 2023, 23, 5572.
14. Yang, Y.; Huang, C.; Xia, L.; Huang, C.; Luo, D.; Lin, K. Debiased contrastive learning for sequential recommendation. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 1063–1073.
15. Xie, X.; Sun, F.; Liu, Z.; Wu, S.; Gao, J.; Ding, B.; Cui, B. Contrastive learning for sequential recommendation. In Proceedings of the 2022 IEEE 38th International Conference on Data Engineering (ICDE), Kuala Lumpur, Malaysia, 9–12 May 2022; pp. 1259–1273.
16. Zuva, T.; Ojo, S.O.; Ngwira, S.; Zuva, K. A survey of recommender systems techniques, challenges and evaluation metrics. Int. J. Emerg. Technol. Adv. Eng. 2012, 2, 382–386.
17. Hu, Y.; Koren, Y.; Volinsky, C. Collaborative filtering for implicit feedback datasets. In Proceedings of the 2008 8th IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 263–272.
18. Koren, Y.; Rendle, S.; Bell, R. Advances in collaborative filtering. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2021; pp. 91–142.
19. Salakhutdinov, R.; Mnih, A. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 880–887.
20. Ling, G.; Lyu, M.R.; King, I. Ratings meet reviews, a combined approach to recommend. In Proceedings of the 8th ACM Conference on Recommender Systems, Silicon Valley, CA, USA, 6–10 October 2014; pp. 105–112.
21. McAuley, J.; Leskovec, J. Hidden factors and hidden topics: Understanding rating dimensions with review text. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013; pp. 165–172.
22. Ricci, F. Mobile recommender systems. Inf. Technol. Tour. 2010, 12, 205–231.
23. Gemmis, M.D.; Iaquinta, L.; Lops, P.; Musto, C.; Narducci, F.; Semeraro, G. Preference learning in recommender systems. Prefer. Learn. 2009, 41, 41–55.
24. Ghazanfar, M.; Prugel-Bennett, A. An improved switching hybrid recommender system using naive Bayes classifier and collaborative filtering. In Proceedings of the 2010 IAENG International Conference on Data Mining and Applications, Hong Kong, China, 17–19 March 2010.
25. Melville, P.; Sindhwani, V. Recommender systems. Encycl. Mach. Learn. 2010, 1, 829–838.
  26. Wang, Z.; Liu, H.; Wei, W.; Hu, Y.; Mao, X.; He, S.; Fang, R.; Chen, D. Multi-level contrastive learning framework for sequential recommendation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 2098–2107. [Google Scholar]
  27. Yu, F.; Zhu, Y.; Liu, Q.; Wu, S.; Wang, L.; Tan, T. TAGNN: Target attentive graph neural networks for session-based recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 1921–1924. [Google Scholar]
  28. Li, Y.; Tarlow, D.; Brockschmidt, M.; Zemel, R. Gated graph sequence neural networks. arXiv 2015, arXiv:1511.05493. [Google Scholar]
  29. Wang, B.; Cai, W. Knowledge-enhanced graph neural networks for sequential recommendation. Information 2020, 11, 388. [Google Scholar] [CrossRef]
  30. Ying, H.; Zhuang, F.; Zhang, F.; Liu, Y.; Xu, G.; Xie, X.; Xiong, H.; Wu, J. Sequential recommender system based on hierarchical attention network. In Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 3926–3932. [Google Scholar]
  31. Hidasi, B.; Karatzoglou, A.; Baltrunas, L.; Tikk, D. Session-based recommendations with recurrent neural networks. arXiv 2015, arXiv:1511.06939. [Google Scholar]
  32. Sun, F.; Liu, J.; Wu, J.; Pei, C.; Lin, X.; Ou, W.; Jiang, P. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 1441–1450. [Google Scholar]
  33. Pan, Z.; Cai, F.; Chen, W.; Chen, H.; De Rijke, M. Star graph neural networks for session-based recommendation. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, 19–23 October 2020; pp. 1195–1204. [Google Scholar]
  34. Pang, Y.; Wu, L.; Shen, Q.; Zhang, Y.; Wei, Z.; Xu, F.; Chang, E.; Long, B.; Pei, J. Heterogeneous global graph neural networks for personalized session-based recommendation. In Proceedings of the 15th ACM International Conference on Web Search and Data Mining, Virtual Event, 21–25 February 2022; pp. 775–783. [Google Scholar]
  35. Feng, L.; Cai, Y.; Wei, E.; Li, J. Graph neural networks with global noise filtering for session-based recommendation. Neurocomputing 2022, 472, 113–123. [Google Scholar] [CrossRef]
  36. Hao, Y.; Ma, J.; Zhao, P.; Liu, G.; Xian, X.; Zhao, L.; Sheng, V. Multi-dimensional Graph Neural Network for Sequential Recommendation. Pattern Recognit. 2023, 139, 109504. [Google Scholar] [CrossRef]
  37. Wang, J.; Xie, H.; Wang, F.L.; Lee, L.; Wei, M. Jointly modeling intra-and inter-session dependencies with graph neural networks for session-based recommendations. Inf. Process. Manag. 2023, 60, 103209. [Google Scholar] [CrossRef]
  38. Wang, Z.; Wei, W.; Cong, G.; Li, X.; Mao, X.; Qiu, M. Global context enhanced graph neural networks for session-based recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 169–178. [Google Scholar]
  39. Qiu, R.; Li, J.; Huang, Z.; Yin, H. Rethinking the item order in session-based recommendation with graph neural networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 579–588. [Google Scholar]
  40. Chen, T.; Wong, R.C.W. Handling information loss of graph neural networks for session-based recommendation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; pp. 1172–1180. [Google Scholar]
  41. Li, Q.; Han, Z.; Wu, X.M. Deeper insights into graph convolutional networks for semi-supervised learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  42. Wu, S.; Tang, Y.; Zhu, Y.; Wang, L.; Xie, X.; Tan, T. Session-based recommendation with graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 346–353. [Google Scholar]
  43. Zhang, G.P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003, 50, 159–175. [Google Scholar] [CrossRef]
  44. Noble, W. What is a support vector machine? Nat. Biotechnol. 2006, 24, 1565–1567. [Google Scholar] [CrossRef]
  45. Wang, L.; Zhang, Y.; Yao, Y.; Xiao, Z.; Shang, K.; Guo, X.; Yang, J.; Hue, S.; Wang, J. GBRT-Based Estimation of Terrestrial Latent Heat Flux in the Haihe River Basin from Satellite and Reanalysis Datasets. Remote Sens. 2021, 13, 1054. [Google Scholar] [CrossRef]
  46. Rendle, S.; Freudenthaler, C.; Schmidt-Thieme, L. Factorizing personalized Markov chains for next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 811–820. [Google Scholar]
  47. He, R.; Kang, W.C.; McAuley, J. Translation-based recommendation. In Proceedings of the 11th ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; pp. 161–169. [Google Scholar]
  48. He, R.; Fang, C.; Wang, Z.; McAuley, J. Vista: A visually, socially, and temporally-aware model for artistic recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, 15–19 September 2016; pp. 309–316. [Google Scholar]
  49. Dash, A.; Chakraborty, A.; Ghosh, S.; Mukherjee, A.; Gummadi, K. FaiRIR: Mitigating Exposure Bias From Related Item Recommendations in Two-Sided Platforms. IEEE Trans. Comput. Soc. Syst. 2023, 10, 1301–1313. [Google Scholar] [CrossRef]
  50. Li, Q.; Peng, H.; Li, J.; Xia, C.; Yang, R.; Sun, L.; Yu, P.S.; He, L. A survey on text classification: From traditional to deep learning. ACM Trans. Intell. Syst. Technol. 2022, 13, 1–41. [Google Scholar] [CrossRef]
  51. Covington, P.; Adams, J.; Sargin, E. Deep neural networks for YouTube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, 15–19 September 2016; pp. 191–198. [Google Scholar]
  52. Yan, B.; Wang, H.; Ouyang, Z.; Chen, C.; Xia, A. Item attribute-aware contrastive learning for sequential recommendation. IEEE Access 2023, 11, 70795–70804. [Google Scholar] [CrossRef]
  53. Wang, J.; Shi, Y.; Yu, H.; Zhang, K.; Wang, X.; Yan, Z.; Li, H. Temporal density-aware sequential recommendation networks with contrastive learning. Expert Syst. Appl. 2023, 211, 118563. [Google Scholar] [CrossRef]
  54. Chen, J.; Wu, W.; Shi, L.; Ji, Y.; Hu, W.; Chen, X.; Zheng, W.; He, L. DACSR: Decoupled-aggregated end-to-end calibrated sequential recommendation. Appl. Sci. 2022, 12, 11765. [Google Scholar] [CrossRef]
  55. Li, C.T.; Hsu, C.; Zhang, Y. FairSR: Fairness-aware sequential recommendation through multi-task learning with preference graph embeddings. ACM Trans. Intell. Syst. Technol. 2022, 13, 1–21. [Google Scholar] [CrossRef]
  56. Abdollahpouri, H.; Mansoury, M.; Burke, R.; Mobasher, B. The impact of popularity bias on fairness and calibration in recommendation. arXiv 2019, arXiv:1910.05755. [Google Scholar]
  57. Rastegarpanah, B.; Gummadi, K.P.; Crovella, M. Fighting fire with fire: Using antidote data to improve polarization and fairness of recommender systems. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 231–239. [Google Scholar]
  58. Jiang, Y.; Yang, Y.; Xia, L.; Huang, C. DiffKG: Knowledge graph diffusion model for recommendation. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Mérida, Mexico, 4–8 March 2024; pp. 313–321. [Google Scholar]
  59. Zhu, Z.; Hu, X.; Caverlee, J. Fairness-aware tensor-based recommendation. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 1153–1162. [Google Scholar]
  60. Zhang, M.; Hurley, N. Avoiding monotony: Improving the diversity of recommendation lists. In Proceedings of the 2008 ACM Conference on Recommender Systems, Lausanne, Switzerland, 23–25 October 2008; pp. 123–130. [Google Scholar]
  61. Wu, Q.; Liu, Y.; Miao, C.; Zhao, Y.; Guan, L.; Tang, H. Recent advances in diversified recommendation. arXiv 2019, arXiv:1905.06589. [Google Scholar]
  62. Chen, W.; Ren, P.; Cai, F.; Sun, F.; De Rijke, M. Multi-interest diversification for end-to-end sequential recommendation. ACM Trans. Inf. Syst. 2021, 40, 1–30. [Google Scholar] [CrossRef]
  63. Chen, W.; Ren, P.; Cai, F.; De Rijke, M. Improving end-to-end sequential recommendations with intent-aware diversification. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, 19–23 October 2020; pp. 175–184. [Google Scholar]
  64. Deng, Y.; Hou, X.; Li, B.; Wang, J.; Zhang, Y. A novel method for improving optical component smoothing quality in robotic smoothing systems by compensating path errors. Opt. Express 2023, 31, 30359–30378. [Google Scholar] [CrossRef]
  65. Yang, Y.; Huang, C.; Xia, L.; Li, C. Knowledge graph contrastive learning for recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 1434–1443. [Google Scholar]
  66. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
  67. Monti, F.; Bronstein, M.; Bresson, X. Geometric matrix completion with recurrent multi-graph neural networks. In Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  68. Li, Z.; Li, S.; Francis, A.; Luo, X. A novel calibration system for robot arm via an open dataset and a learning perspective. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 5169–5173. [Google Scholar] [CrossRef]
  69. Wu, Y.; DuBois, C.; Zheng, A.X.; Ester, M. Collaborative denoising auto-encoders for top-n recommender systems. In Proceedings of the 9th ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA, 22–25 February 2016; pp. 153–162. [Google Scholar]
  70. Hinton, G.E. A practical guide to training restricted Boltzmann machines. In Neural Networks: Tricks of the Trade, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 599–619. [Google Scholar]
  71. Donkers, T.; Loepp, B.; Ziegler, J. Sequential user-based recurrent neural network recommendations. In Proceedings of the 11th ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; pp. 152–160. [Google Scholar]
  72. Li, Z.; Li, S.; Bamasag, O.O.; Alhothali, A.; Luo, X. Diversified regularization enhanced training for effective manipulator calibration. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 8778–8790. [Google Scholar] [CrossRef] [PubMed]
  73. Zhu, F.; Wang, Y.; Chen, C.; Zhou, J.; Li, L.; Liu, G. Cross-domain recommendation: Challenges, progress, and prospects. arXiv 2021, arXiv:2103.01696. [Google Scholar]
  74. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T. Neural graph collaborative filtering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 165–174. [Google Scholar]
  75. He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182. [Google Scholar]
  76. Man, T.; Shen, H.; Jin, X.; Cheng, X. Cross-domain recommendation: An embedding and mapping approach. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; Volume 17, pp. 2464–2470. [Google Scholar]
  77. Sahu, A.K.; Dwivedi, P. User profile as a bridge in cross-domain recommender systems for sparsity reduction. Appl. Intell. 2019, 49, 2461–2481. [Google Scholar] [CrossRef]
  78. Duan, H.; Zhu, Y.; Liang, X.; Zhu, Z.; Liu, P. Multi-feature fused collaborative attention network for sequential recommendation with semantic-enriched contrastive learning. Inf. Process. Manag. 2023, 60, 103416. [Google Scholar] [CrossRef]
  79. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to sequence learning with neural networks. In Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27. [Google Scholar]
  80. Zhang, Y.; Bai, Y.; Chang, J.; Zhang, X.; Lu, S.; Lu, J.; Feng, F.; Niu, Y.; Song, Y. Leveraging watch-time feedback for short-video recommendations: A causal labeling framework. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 4952–4959. [Google Scholar]
  81. Deng, Y.; Hou, X.; Li, B.; Wang, J.; Zhang, Y. A highly powerful calibration method for robotic smoothing system calibration via using adaptive residual extended Kalman filter. Robot. Comput.-Integr. Manuf. 2024, 86, 102660. [Google Scholar] [CrossRef]
  82. Wu, Y.; Xie, R.; Zhu, Y.; Zhuang, F.; Zhang, X.; Lin, L. Personalized prompt for sequential recommendation. IEEE Trans. Knowl. Data Eng. 2024, 36, 3376–3389. [Google Scholar] [CrossRef]
  83. Wang, L.; Lim, E.P.; Liu, Z.; Zhao, T. Explanation guided contrastive learning for sequential recommendation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 2017–2027. [Google Scholar]
  84. Yu, J.; Yin, H.; Xia, X.; Chen, T.; Cui, L.; Nguyen, Q. Are graph augmentations necessary? Simple graph contrastive learning for recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 1294–1303. [Google Scholar]
  85. Wan, Z.; Liu, X.; Wang, B.; Qiu, J.; Li, B.; Guo, T.; Chen, G.; Wang, Y. Spatio-temporal contrastive learning enhanced GNNs for session-based recommendation. ACM Trans. Inf. Syst. 2023, 42, 1–26. [Google Scholar] [CrossRef]
  86. Zhang, Y.; Yin, G.; Dong, Y. Contrastive learning with frequency domain for sequential recommendation. Appl. Soft Comput. 2023, 144, 110481. [Google Scholar] [CrossRef]
  87. Liu, Z.; Ma, Y.; Hildebrandt, M.; Ouyang, Y.; Xiong, Z. CDARL: A contrastive discriminator-augmented reinforcement learning framework for sequential recommendations. Knowl. Inf. Syst. 2022, 64, 2239–2265. [Google Scholar] [CrossRef]
  88. Meng, W.; Yang, D.; Xiao, Y. Incorporating user micro-behaviors and item knowledge into multi-task learning for session-based recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 1091–1100. [Google Scholar]
  89. Tang, G.; Zhu, X.; Guo, J.; Dietze, S. Time enhanced graph neural networks for session-based recommendation. Knowl.-Based Syst. 2022, 251, 109204. [Google Scholar] [CrossRef]
  90. Yuan, F.; Karatzoglou, A.; Arapakis, I.; Jose, J.; He, X. A simple convolutional generative network for next item recommendation. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 582–590. [Google Scholar]
  91. Qiu, R.; Huang, Z.; Li, J.; Yin, H. Exploiting cross-session information for session-based recommendation with graph neural networks. ACM Trans. Inf. Syst. 2020, 38, 1–23. [Google Scholar] [CrossRef]
  92. Chen, T.; Wong, R.C.W. An efficient and effective framework for session-based social recommendation. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Virtual Event, 8–12 March 2021; pp. 400–408. [Google Scholar]
  93. Chen, Y.H.; Huang, L.; Wang, C.D.; Lai, J. Hybrid-order gated graph neural network for session-based recommendation. IEEE Trans. Ind. Inform. 2021, 18, 1458–1467. [Google Scholar] [CrossRef]
  94. Zhang, C.; Zheng, W.; Liu, Q.; Nie, J.; Zhang, H. SEDGN: Sequence enhanced denoising graph neural network for session-based recommendation. Expert Syst. Appl. 2022, 203, 117391. [Google Scholar] [CrossRef]
  95. Zhu, X.; Tang, G.; Wang, P.; Li, C.; Guo, J.; Dietze, S. Dynamic global structure enhanced multi-channel graph neural network for session-based recommendation. Inf. Sci. 2023, 624, 324–343. [Google Scholar] [CrossRef]
  96. Chen, C.; Guo, J.; Song, B. Dual attention transfer in session-based recommendation with multi-dimensional integration. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 11–15 July 2021; pp. 869–878. [Google Scholar]
  97. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
  98. Zhu, T.; Sun, L.; Chen, G. Graph-based embedding smoothing for sequential recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35, 496–508. [Google Scholar] [CrossRef]
  99. Tao, Y.; Wang, C.; Yao, L.; Li, W.; Yu, Y. Item trend learning for sequential recommendation system using gated graph neural network. Neural Comput. Appl. 2023, 35, 13077–13092. [Google Scholar] [CrossRef]
  100. Sang, S.; Yuan, W.; Li, W.; Yang, Z.; Zhang, Z. Position-aware graph neural network for session-based recommendation. Knowl.-Based Syst. 2023, 262, 110201. [Google Scholar] [CrossRef]
  101. Xian, X.; Fang, L.; Sun, S. ReGNN: A repeat aware graph neural network for session-based recommendations. IEEE Access 2020, 8, 98518–98525. [Google Scholar] [CrossRef]
  102. Jin, D.; Wang, L.; Zheng, Y.; Song, G.; Jiang, F.; Li, X.; Lin, W.; Pan, S. Dual intent enhanced graph neural network for session-based new item recommendation. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 684–693. [Google Scholar]
  103. Yu, L.; Zhang, C.; Liang, S.; Zhang, X. Multi-order attentive ranking model for sequential recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 5709–5716. [Google Scholar]
  104. Tang, J.; Wang, K. Personalized top-n sequential recommendation via convolutional sequence embedding. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; pp. 565–573. [Google Scholar]
  105. Belletti, F.; Chen, M.; Chi, E.H. Quantifying long range dependence in language and user behavior to improve RNNs. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1317–1327. [Google Scholar]
  106. Ma, C.; Ma, L.; Zhang, Y.; Sun, J.; Liu, X.; Coates, M. Memory augmented graph neural networks for sequential recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 5045–5052. [Google Scholar]
  107. Qiu, R.; Yin, H.; Huang, Z.; Chen, T. GAG: Global attributed graph neural network for streaming session-based recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 669–678. [Google Scholar]
  108. Zhou, H.; Tan, Q.; Huang, X.; Zhou, K.; Wang, X. Temporal augmented graph neural networks for session-based recommendations. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 11–15 July 2021; pp. 1798–1802. [Google Scholar]
  109. Xue, L.; Yang, D.; Xiao, Y. Factorial user modeling with hierarchical graph neural network for enhanced sequential recommendation. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, 18–22 July 2022; pp. 1–6. [Google Scholar]
  110. Zhang, M.; Wu, S.; Yu, X.; Liu, Q.; Wang, L. Dynamic graph neural networks for sequential recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35, 4741–4753. [Google Scholar] [CrossRef]
  111. Sheng, Z.; Zhang, T.; Zhang, Y.; Gao, S. Enhanced graph neural network for session-based recommendation. Expert Syst. Appl. 2023, 213, 118887. [Google Scholar] [CrossRef]
  112. Xu, G.; Yang, J.; Guo, J.; Huang, Z.; Zhang, B. Int-GNN: A user intention aware graph neural network for session-based recommendation. In Proceedings of the ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar]
  113. Hua, S.; Gan, M. Intention-aware denoising graph neural network for session-based recommendation. Appl. Intell. 2023, 53, 23097–23112. [Google Scholar] [CrossRef]
  114. Hsu, C.; Li, C.T. RetaGNN: Relational temporal attentive graph neural networks for holistic sequential recommendation. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 2968–2979. [Google Scholar]
  115. Ricci, F.; Rokach, L.; Shapira, B. Recommender systems: Introduction and challenges. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2015; pp. 1–34. [Google Scholar]
  116. Li, Y.; Yang, C.; Ni, T.; Zhang, Y.; Liu, A. Long and short-term interest contrastive learning under filter-enhanced sequential recommendation. IEEE Access 2023, 11, 95928–95938. [Google Scholar] [CrossRef]
  117. Alharbi, N.; Caragea, D. Cross-domain attentive sequential recommendations based on general and current user preferences (CD-ASR). In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Melbourne, Australia, 14–17 December 2021; pp. 48–55. [Google Scholar]
  118. Lian, J.; Zhang, F.; Xie, X.; Sun, G. CCCFNet: A content-boosted collaborative filtering neural network for cross domain recommender systems. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 3–7 April 2017; pp. 817–818. [Google Scholar]
  119. Zhong, S.T.; Huang, L.; Wang, C.D.; Lai, J.; Yu, P. An autoencoder framework with attention mechanism for cross-domain recommendation. IEEE Trans. Cybern. 2022, 52, 5229–5241. [Google Scholar] [CrossRef]
  120. Zhang, L.; Wang, P.; Li, J.; Xiao, Z.; Shi, H. Attentive hybrid recurrent neural networks for sequential recommendation. Neural Comput. Appl. 2021, 33, 11091–11105. [Google Scholar] [CrossRef]
  121. Li, A.; Cheng, Z.; Liu, F.; Gao, Z.; Guan, W.; Peng, Y. Disentangled graph neural networks for session-based recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35, 7870–7882. [Google Scholar] [CrossRef]
  122. Huang, L.; Ma, Y.; Liu, Y.; Du, B.; Wang, S.; Li, D. Position-enhanced and time-aware graph convolutional network for sequential recommendations. ACM Trans. Inf. Syst. 2023, 41, 1–32. [Google Scholar] [CrossRef]
  123. Guo, N.; Liu, X.; Li, S.; Ma, Q.; Gao, K.; Han, B.; Zheng, L.; Guo, S.; Guo, X. Poincaré Heterogeneous Graph Neural Networks for Sequential Recommendation. ACM Trans. Inf. Syst. 2023, 41, 1–26. [Google Scholar] [CrossRef]
  124. Sang, S.; Liu, N.; Li, W.; Zhang, Z.; Qin, Q.; Yuan, W. High-order attentive graph neural network for session-based recommendation. Appl. Intell. 2022, 52, 16975–16989. [Google Scholar] [CrossRef]
  125. Hao, J.; Dun, Y.; Zhao, G.; Wu, Y.; Qian, X. Annular-graph attention model for personalized sequential recommendation. IEEE Trans. Multimed. 2021, 24, 3381–3391. [Google Scholar] [CrossRef]
  126. Wu, B.; He, X.; Wu, L.; Zhang, X.; Ye, Y. Graph-augmented co-attention model for socio-sequential recommendation. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 4039–4051. [Google Scholar] [CrossRef]
  127. Vargas, S.; Castells, P. Rank and relevance in novelty and diversity metrics for recommender systems. In Proceedings of the 5th ACM Conference on Recommender Systems, Chicago, IL, USA, 23–27 October 2011; pp. 109–116. [Google Scholar]
  128. Shani, G.; Gunawardana, A. Evaluating recommendation systems. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2010; pp. 257–297. [Google Scholar]
  129. Avazpour, I.; Pitakrat, T.; Grunske, L.; Grundy, J. Dimensions and metrics for evaluating recommendation systems. In Recommendation Systems in Software Engineering; Springer: Berlin/Heidelberg, Germany, 2013; pp. 245–273. [Google Scholar]
  130. Boka, F.T.; Niu, Z.; Neupane, R.B. A survey of sequential recommendation systems: Techniques, evaluation, and future directions. Inf. Syst. 2024, 125, 102427. [Google Scholar] [CrossRef]
  131. Nasir, M.; Ezeife, C.I. A survey and taxonomy of sequential recommender systems for e-commerce product recommendation. SN Comput. Sci. 2023, 4, 708. [Google Scholar] [CrossRef]
  132. He, R.; McAuley, J. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In Proceedings of the 25th International Conference on World Wide Web, Montréal, QC, Canada, 11–15 April 2016; pp. 507–517. [Google Scholar]
  133. McAuley, J.; Targett, C.; Shi, Q.; Hengel, A. Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, 9–13 August 2015; pp. 43–52. [Google Scholar]
  134. Zhu, H.; Chang, D.; Xu, Z.; Zhang, P.; Li, X.; He, J.; Li, H.; Xu, J.; Gai, K. Joint optimization of tree-based index and deep model for recommender systems. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32, pp. 1–10. [Google Scholar]
  135. Zangerle, E.; Pichl, M.; Gassler, W.; Specht, G. #Nowplaying music dataset: Extracting listening behavior from Twitter. In Proceedings of the 1st International Workshop on Internet-Scale Multimedia Management, Orlando, FL, USA, 7 November 2014; pp. 21–26. [Google Scholar]
  136. Harper, F.M.; Konstan, J.A. The MovieLens datasets: History and context. ACM Trans. Interact. Intell. Syst. 2015, 5, 1–19. [Google Scholar] [CrossRef]
  137. McFee, B.; Lanckriet, G.R.G. Hypergraph models of playlist dialects. In Proceedings of the International Society for Music Information Retrieval, Porto, Portugal, 8–12 October 2012; Volume 12, pp. 343–348. [Google Scholar]
  138. Zhang, Y.; Humbert, M.; Rahman, T.; Li, C.; Pang, J.; Backes, M. Tagvisor: A privacy advisor for sharing hashtags. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 287–296. [Google Scholar]
  139. Jannach, D.; Zanker, M.; Felfernig, A.; Friedrich, G. Recommender Systems: An Introduction; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
  140. Wang, S.; Hu, L.; Wang, Y.; He, X.; Sheng, Q.; Orgun, M.; Cao, L.; Wang, N.; Ricci, F.; Yu, P. Graph learning approaches to recommender systems: A review. arXiv 2020, arXiv:2004.11718. [Google Scholar]
  141. Berg, R.; Kipf, T.N.; Welling, M. Graph convolutional matrix completion. arXiv 2017, arXiv:1706.02263. [Google Scholar]
  142. Ying, R.; He, R.; Chen, K.; Eksombatchai, P.; Hamilton, W.; Leskovec, J. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 974–983. [Google Scholar]
  143. Yin, Z.; Han, K.; Wang, P.; Hu, H. Multi global information assisted streaming session-based recommendation system. IEEE Trans. Knowl. Data Eng. 2023, 35, 8245–8256. [Google Scholar] [CrossRef]
  144. Yuan, H.; Yu, H.; Gui, S.; Ji, S. Explainability in graph neural networks: A taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5782–5799. [Google Scholar] [CrossRef]
  145. Yang, Z.; Dong, S.; Hu, J. GFE: General knowledge enhanced framework for explainable sequential recommendation. Knowl.-Based Syst. 2021, 230, 107375. [Google Scholar] [CrossRef]
  146. Zhang, Y.; Chen, X. Explainable Recommendation: A Survey and New Perspectives. Foundations and Trends® in Information Retrieval; Now Publishers Inc.: Hanover, MA, USA, 2020; Volume 14, pp. 1–101. [Google Scholar]
  147. Guo, L.; Zhang, J.; Tang, L.; Chen, T.; Zhu, L.; Yin, H. Time interval-enhanced graph neural network for shared-account cross-domain sequential recommendation. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 4002–4016. [Google Scholar] [CrossRef]
Figure 1. The schematic diagram of CL.
Figure 2. The overall framework of the de-biased CL paradigm for the recommendation system.
Figure 3. The overall framework of sequence recommendation based on a GNN.
Figure 4. The workflow diagram of LESSR.
Figure 5. The workflow diagram of the session graph.
Figure 6. The overview of the ReGNN model.
Figure 7. The architecture of the MA-GNN model.
Figure 8. The overall framework of the E-GNN model.
Figure 9. The flowchart of the TAGNN model.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wei, P.; Shu, H.; Gan, J.; Deng, X.; Liu, Y.; Sun, W.; Chen, T.; Hu, C.; Hu, Z.; Deng, Y.; et al. Sequential Recommendation System Based on Deep Learning: A Survey. Electronics 2025, 14, 2134. https://doi.org/10.3390/electronics14112134
