NRDPA: Review-Aware Neural Recommendation with Dynamic Personalized Attention

Sun, Qinghao; Li, Ziyang; Yu, Jiong; Li, Xue; Wang, Xin

doi:10.3390/electronics14010033

by Qinghao Sun ¹, Ziyang Li ¹,*, Jiong Yu ², Xue Li ² and Xin Wang ¹

¹ School of Software, Xinjiang University, Urumqi 830046, China

² School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(1), 33; https://doi.org/10.3390/electronics14010033

Submission received: 3 October 2024 / Revised: 17 December 2024 / Accepted: 24 December 2024 / Published: 25 December 2024

Abstract

Review-based recommendation can utilize user and item features extracted from review text to alleviate the problems of data sparsity and poor interpretability. However, most existing methods focus on static modeling of user personality and item attributes while ignoring the dynamic changes of user and item features. Therefore, this paper proposes a neural recommendation method with dynamic personalized attention (NRDPA). First, this method captures the changes in user behavior at the word level and review level and models the personalized features of users and items by dynamically highlighting key words and important reviews. Second, the method considers information interaction in the process of user and item modeling and adjusts the feature representations of the interacting parties according to the user’s preferences for different items. Finally, experiments on five public datasets from Amazon demonstrate that the proposed NRDPA model has superior performance, with improvements of up to 10% in MSE and 6.3% in MAE compared to state-of-the-art models.

Keywords: review-based recommendation; attention mechanism; personalized features; user personality and item attributes

1. Introduction

Nowadays, recommender systems are widely used on e-commerce and social media platforms to help people cope with information overload. Faced with the ever-growing volume of products, pictures, videos, and other resources on the internet, recommender systems can take user preferences into account, analyze the available resources, and assist users in making effective decisions [1,2,3].

Past work on recommender systems has focused on analyzing numerical ratings with collaborative filtering methods to infer user preferences for items [4,5]. The techniques range from traditional matrix factorization to current deep neural network variants [6,7]. However, ratings are inherently sparse, and they reflect only a user’s overall satisfaction with an item without giving specific reasons for the assigned rating. To overcome this limitation, researchers have turned to textual reviews, which contain user preferences and product attributes [8,9,10]. Early research in this direction combined matrix factorization with probabilistic topic modeling [11]. Later, neural recommendation was applied to extract review features and further improve recommendation performance. Neural recommendation refers to recommendation models that use deep learning techniques and can be trained end-to-end; such models generally have strong feature extraction capabilities. Recently, researchers have begun to focus on user personalization and have made significant progress using attention mechanisms [12,13].

Although the above studies have made effective progress in modeling the features of users and items using reviews, there are still some issues in the effectiveness and interpretability of recommendations that have not been thoroughly investigated. Existing works fail to dynamically capture personalized features of users and items in reviews, which results in suboptimal recommendation accuracy and diminished user trust. As shown in Figure 1, two examples are used to illustrate scenarios where user features require dynamic modeling. In the upper part of Figure 1a, the user’s purchase behavior shows that he tends to buy low-priced and high-quality consumer electronics. Accordingly, the recommendation model mines products with this attribute in the reviews to recommend to the user. In the lower part, the user’s subsequent purchase behavior indicates a shift in preference, prioritizing quality over price. In this case, the model must dynamically adjust to focus on quality features for recommendation. In Figure 1b, while the user considers both appearance and quality for a mobile phone and camera, his preferences for different items are still different. For the mobile phone, quality plays a more critical role in the feature extraction process, whereas for the camera appearance is the primary focus. Therefore, the model should make corresponding feature adjustments based on users’ different preferences.

In this paper, we propose a Neural Recommendation model with Dynamic Personalized Attention (NRDPA). The model comprehensively solves the dynamic personalization problem in review-based recommendations through multiple attention mechanisms. On the one hand, the model implements dynamic personalized word-level and review-level attention mechanisms to dynamically select words and reviews based on past experience. On the other hand, the model implements personalized interactive attention mechanisms to dynamically adjust the modeling of users and items through user preferences for different items.

The contributions of this paper are summarized as follows:

We propose a review-based recommendation framework called NRDPA to solve the dynamic personalization problem. The proposed framework integrates multiple attention mechanisms to dynamically mine the personalized features of users and items in reviews.
We propose a dynamic personalized word-level and review-level attention mechanism that can perceive the changes in the personalized features of users and items, capturing the current user personality and item attributes through the user’s historical rating behavior to dynamically construct users and items.
We propose a personalized interactive attention mechanism that can adjust the features of both sides of the current interaction and dynamically obtain more accurate personalized features of users and items by perceiving the different preferences of users for items.
Our experiments on five public datasets from Amazon demonstrate that the rating prediction accuracy of our model exceeds that of the baseline models. Ablation experiments and interpretability analysis further validate the superiority of the proposed NRDPA.

The remainder of this paper is organized as follows: Section 2 reviews related work; Section 3 describes the NRDPA model; Section 4 presents a detailed experimental analysis of NRDPA; and Section 5 concludes the paper.

2. Related Work

In this section, we provide an overview of recent research on review-based recommendation. Based on the utilization of review texts, review-based recommendation can be categorized into three main approaches: feature extraction-based recommendation, aspect mining-based recommendation, and graph construction-based recommendation.

2.1. Feature Extraction-Based Recommendation

Feature extraction-based recommendation directly leverages deep learning techniques to extract features from review text by modeling both the user and the item [14,15,16,17,18,19,20,21,22]. Early work in this area included the ConvMF model proposed by Kim et al. [23], which combines convolutional neural networks with probabilistic matrix factorization and can effectively utilize item reviews. Later, Zheng et al. [24] proposed the DeepCoNN model, which utilizes two parallel convolutional neural networks to jointly extract features from user reviews and item reviews, achieving strong predictive performance. Building on DeepCoNN, Catherine et al. [25] introduced the TransNets model, which enables feature transformation, while Wu et al. [26] proposed the PARL model to supplement extracted features. As the integration of attention mechanisms into recommendation systems has consistently shown effectiveness, these mechanisms have been widely adopted in review-based recommendations. Seo et al. [27] proposed the D-Attn model, which incorporates a word-level attention mechanism to differentiate the importance of words during feature extraction. Similarly, Chen et al. [28] developed the NARRE model, which applies a review-level attention mechanism to assign varying weights to different reviews. Subsequently, various attention mechanism-based models such as TARMF [29], MPCN [30], and CARL [31] have been proposed, further advancing the performance and interpretability of review-based recommendation systems.

Recently, Liu et al. [32] introduced the DAML model, which builds upon CAML by utilizing a two-layer word-level attention mechanism to capture precise word semantics. Extending the NARRE model, Liu et al. [33] further proposed the NRPA model, which significantly enhances the personalization of recommendations by combining word-level and review-level attention mechanisms. Building on the NRPA model, Luo et al. [34] developed the NRCMA model, which emphasizes the interaction between review and rating information to refine feature extraction. Meanwhile, Li et al. [35] proposed the ARPCNN model, which establishes trust relationships to enhance feature complementarity.

Although these feature extraction-based recommendation methods incorporate various attention mechanisms to address user personalization and demonstrate strong performance in terms of recommendation accuracy and interpretability, their key limitation lies in the reliance on static feature extraction. When compared to the NRDPA model proposed in this study, these methods fail to fully capture dynamic user preferences, which are essential for accurately modeling user personality and improving recommendation effectiveness.

2.2. Aspect Mining-Based Recommendation

Aspect mining-based recommendation focuses on identifying aspects in reviews to enhance the granularity and interpretability of recommendations. Compared to traditional explicit aspects extracted using rules or statistical methods, recent studies have employed deep learning techniques to uncover more implicit aspects from reviews. For instance, Chin et al. [36] proposed the attention-based ANR model, which learns the importance of aspect-level features for users and items. Cheng et al. [37] introduced the A3NCF model, which is able to capture users’ attention to different aspects of various items. Building on this idea, Li et al. [38] proposed the CARP model based on capsule networks, which combines user opinions and item aspects for sentiment analysis. Meanwhile, Guan et al. [39] introduced the AARM model, which models the interactions between similar and distinct aspects. Recently, Da’u et al. [40] developed an aspect-based weighted opinion recommendation model which reduces model parameters through tensor decomposition. Ding et al. [41] further advanced this field by proposing an enhanced latent semantic model that captures the temporal evolution of topics via regularization factors.

Although the above-mentioned aspect-based review recommendation methods have made significant progress in fine-grained recommendations, they still face challenges when compared to the NRDPA model; specifically, these methods struggle to comprehensively address all aspects and lack the ability to dynamically extract aspect preferences, which limits their effectiveness in user modeling.

2.3. Graph Construction-Based Recommendation

Graph-based recommendation methods build graph structures from review information to model complex interactions. For instance, Wu et al. [42] proposed the RMG method, which combines review information with user–item graphs to learn rich user and item representations. Gao et al. [43] introduced the SSG model, which incorporates review semantics into the information propagation on the user–item graph to simulate multifaceted high-order relationships. Cai et al. [44] later proposed the RI-GCN model, which captures word context features from a graph constructed from reviews. Shuai et al. [45] developed the Review-aware Graph Contrastive Learning (RGCL) framework for recommendation, which learns user–item interactions based on review-aware user–item graphs. More recently, Liu et al. [46] introduced the RGNN model, which builds a review graph for each user–item pair and learns word semantic embeddings through a type-aware graph attention mechanism. Additionally, Liu et al. [47] proposed the ERUR model, which learns social relationships between users based on their review representations, allowing for enhanced user representations under the supervision of a generative adversarial network.

Although the graph-based review recommendation methods mentioned above can capture complex interactions, the computational complexity of these models remains a significant concern. In contrast, NRDPA is sufficiently lightweight, utilizing a calculation method that combines multiple attention mechanisms.

3. Method

In this section, we describe the framework of NRDPA in detail, as illustrated in Figure 2. The model consists of two main components: a user network for modeling users, and an item network for modeling items. Specifically, it is divided into three key modules. First, the review encoder learns word features to represent reviews, where we implement a dynamic personalized word-level attention mechanism. Second, the user and item encoder learns review features to represent users and items, incorporating a dynamic personalized review-level attention mechanism. Finally, the rating prediction module applies a personalized interactive attention mechanism to adjust user and item features and predict potential user–item associations. Due to the similarity in user and item modeling, in the following sections we describe the process for users.

3.1. Review Encoder

In this module, we efficiently capture word features by considering user interests and item properties and use them to construct reviews. First, we obtain the word vector matrix of the reviews via the word embedding method. Then, semantic features between words are extracted by a convolutional neural network. Finally, we obtain the feature representation of the review by fusing the word features. Considering the varying importance of different words in representing a review, a personalized word-level attention mechanism must be applied. Moreover, because user personality is dynamically changing, the selection of important words must also be dynamic. Thus, we utilize the proposed dynamic personalized word-level attention mechanism to highlight the significance of different words and aggregate them into reviews.

Word Embedding: Using word2vec technology, we embed any user’s reviews into a low-dimensional vector matrix. Given user u’s review $s_{u,i} = \{w_{1}, w_{2}, \dots, w_{T}\}$ of item i, we convert any word $w_{t}$ to a low-dimensional vector. Then, $s_{u,i}$ is mapped to a word vector matrix $D_{u,i} = [w_{1}, w_{2}, \cdots, w_{T}]$. Note that the length of the review text should be standardized.
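The text does not say how the review length is standardized. A minimal sketch, assuming the common choice of truncating long reviews and right-padding short ones to a fixed length T (the function name and the `<pad>` token are hypothetical, not taken from the paper):

```python
def standardize_length(tokens, target_len, pad_token="<pad>"):
    """Truncate or right-pad a token list so it has exactly target_len items."""
    clipped = tokens[:target_len]
    return clipped + [pad_token] * (target_len - len(clipped))


review = ["great", "sound", "for", "the", "price"]
short = standardize_length(review, 3)   # longer than T: truncated
long = standardize_length(review, 7)    # shorter than T: padded
```

With every review mapped to the same T, the word vector matrices $D_{u,i}$ share one shape and can be batched.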

Convolution: To acquire comprehensive review features, we extract semantic features by applying a convolution operation to each word and its context. For the word vector matrix $D_{u,i}$, we derive the feature vector matrix $F \in \mathbb{R}^{K \times T}$ through feature extraction, denoted as

$$F_{j} = \mathrm{ReLU}(W_{j} * D_{u,i} + b_{j}), \quad 1 \leq j \leq K, \tag{1}$$

where ReLU is the activation function, $W_{j}$ denotes the weight matrix of the j-th convolution kernel, $b_{j}$ is the corresponding bias, and ∗ symbolizes the convolution operator.
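Equation (1) can be illustrated with a small pure-Python sketch. It assumes zero padding at the review boundaries so that each kernel yields exactly T outputs (consistent with $F \in \mathbb{R}^{K \times T}$); the window size, the padding scheme, and all numbers below are illustrative assumptions rather than the paper's settings.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def conv_features(D, kernels, biases):
    """D: T word vectors of length d. kernels: K kernels, each a list of
    `window` weight vectors of length d. Returns F as K rows of length T."""
    T, d = len(D), len(D[0])
    F = []
    for W, b in zip(kernels, biases):
        window = len(W)
        half = window // 2
        row = []
        for t in range(T):
            total = b
            for k in range(window):
                pos = t + k - half  # context position; treated as zero outside the review
                if 0 <= pos < T:
                    total += sum(W[k][c] * D[pos][c] for c in range(d))
            row.append(relu(total))
        F.append(row)
    return F


D = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # T = 3 words, d = 2
kernels = [
    [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]],      # responds to the centre word only
    [[-1.0, -1.0], [0.0, 0.0], [0.0, 0.0]],    # responds to the left neighbour, negated
]
F = conv_features(D, kernels, biases=[0.0, 0.5])
```

Each row of `F` is one kernel's response across the review, and each column is the K-dimensional feature of one word, which is what the attention step consumes.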

Dynamic Personalized Word-Level Attention Mechanism: This step employs a dynamic personalized word-level attention mechanism to focus on the dynamic nature of user preferences by assigning weights to different words. As illustrated in Figure 3, the user ID serves as a personalized identifier reflecting the user’s inherent characteristics, making it a critical input to the attention mechanism. Furthermore, historical interactions, including item IDs and their corresponding ratings, capture the user’s evolving interests over time. By integrating the user ID with these historical interactions and feeding their low-dimensional embedding vectors into a Multi-Layer Perceptron (MLP) model, we derive dynamic personalized vectors for different words in user u’s reviews:

$$v_{u} = u_{id} \oplus i_{ids}\, r, \tag{2}$$

$$p_{u}^{w} = \mathrm{ReLU}(W_{w} v_{u} + b_{w}), \tag{3}$$

where ⊕ represents the concatenation operation, $u_{id}$ stands for the ID embedding of user u, $i_{ids}$ represents the ID embedding of user’s history rating items and r the corresponding ratings, $W_{w}$ denotes the dynamic word-level attention parameter matrix, and $b_{w}$ is the bias. Similarly, item i possesses a unique attention vector denoted as $p_{i}^{w}$.
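The following sketch shows how a query vector of this kind shifts as the rating history grows. Equation (2) does not spell out how the item embeddings and ratings are combined, so the code assumes a rating-weighted mean of the historical item embeddings; that pooling choice, the helper names, and all numbers are assumptions made for illustration only.

```python
def relu_vec(v):
    return [x if x > 0.0 else 0.0 for x in v]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def dynamic_vector(user_emb, item_embs, ratings):
    """v_u: user embedding concatenated with pooled history.
    ASSUMPTION: history is pooled as a rating-weighted mean of item embeddings."""
    total = sum(ratings)
    dim = len(item_embs[0])
    pooled = [sum(r * e[c] for r, e in zip(ratings, item_embs)) / total
              for c in range(dim)]
    return user_emb + pooled  # list concatenation plays the role of the ⊕ operator


def personalized_query(W, b, v_u):
    """Eq. (3): one ReLU layer applied to v_u."""
    return relu_vec([x + bias for x, bias in zip(matvec(W, v_u), b)])


u_id = [0.5, -0.5]
history = [[1.0, 0.0], [0.0, 1.0]]
v_before = dynamic_vector(u_id, history, ratings=[5.0, 1.0])
# The user later rates a third item highly, which shifts the pooled history.
v_after = dynamic_vector(u_id, history + [[0.0, 1.0]], ratings=[5.0, 1.0, 4.0])

W_w = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
p_before = personalized_query(W_w, [0.0, 0.0], v_before)
p_after = personalized_query(W_w, [0.0, 0.0], v_after)
```

The user ID embedding is unchanged between the two calls; only the history differs, yet the resulting attention query moves, which is the sense in which the personalization is dynamic.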

Importantly, not all words in a review hold equal importance in conveying its meaning. In order to highlight the semantic importance of key words, personalized attention values representing the importance of each word can be derived:

$$f_{t} = p_{u}^{w} A d_{t}, \tag{4}$$

$$\alpha_{t} = \frac{\exp(f_{t})}{\sum_{j=1}^{T} \exp(f_{j})}, \quad \alpha_{t} \in (0, 1), \tag{5}$$

where $A$ represents the word-level attention harmony matrix and $d_{t}$ stands for the t-th column of the feature vector matrix $F_{u,i}$, which denotes the representation of the t-th word.

After scaling the semantic features $d_{j}$ of each word in the review by its attention value $\alpha_{j}$, we derive the feature representation of the i-th review of user u as follows:

$$s_{u,i} = \sum_{j=1}^{T} \alpha_{j} d_{j}. \tag{6}$$
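Equations (4)–(6) amount to a bilinear score, a softmax, and a weighted sum. The sketch below walks through them on toy inputs; the dimensions, the identity harmony matrix, and the function names are illustrative assumptions.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, A, columns):
    """query: personalized vector p; A: harmony matrix (len(query) x K);
    columns: T word features d_t, each of length K."""
    K = len(A[0])
    qA = [sum(query[r] * A[r][c] for r in range(len(query))) for c in range(K)]
    scores = [sum(a * b for a, b in zip(qA, d)) for d in columns]   # Eq. (4)
    alphas = softmax(scores)                                        # Eq. (5)
    pooled = [sum(alphas[t] * columns[t][k] for t in range(len(columns)))
              for k in range(K)]                                    # Eq. (6)
    return alphas, pooled


words = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # T = 3 word features, K = 2
A = [[1.0, 0.0], [0.0, 1.0]]
alphas, review_vec = attend([2.0, 0.0], A, words)
```

Here the query favours the first feature dimension, so the first word receives the largest weight while the other two words, which score equally, share the remainder.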

3.2. User and Item Encoder

Upon completing the review encoder, we acquire the personalized feature of all the reviews for a given user. The next step is to further abstract the review features to obtain the user representations. Similar to the review encoder, recognizing the varying requirements for review information among different users and the dynamic changes in user personalization, we employ a dynamic personalized review-level attention mechanism to aggregate review feature to represent users. In this way, we focus on changes in user personality and highlight reviews that provide more semantic information.

Dynamic Personalized Review-Level Attention Mechanism: Similar to the dynamic personalized word-level attention mechanism, we feed the dynamic embedding vector $v_{u}$ of the user’s ID and history rating record into another MLP layer to obtain the dynamic personalized vectors of user u for different reviews, as follows:

$$p_{u}^{r} = \mathrm{ReLU}(W_{r} v_{u} + b_{r}), \tag{7}$$

where $W_{r}$ represents the parameter matrix and $b_{r}$ stands for the bias term.

In the obtained feature representation $s_{u} = \{s_{u,1}, s_{u,2}, \dots, s_{u,H}\}$ of multiple reviews, we prioritize highlighting those reviews containing user interests while fading out the less meaningful ones. Specifically, the dynamic personalized attention value $\beta_{h}$ of the h-th review of user u can be computed as follows:

$$g_{h} = p_{u}^{r} B s_{u,h}, \tag{8}$$

$$\beta_{h} = \frac{\exp(g_{h})}{\sum_{k=1}^{H} \exp(g_{k})}, \quad \beta_{h} \in (0, 1), \tag{9}$$

where $B$ represents the review-level attention harmony matrix and $s_{u,h}$ denotes the representation of the h-th review of user u.

Upon scaling the user’s multiple review features $s_{u}$ by their attention values $\beta$, we obtain a feature representation $q_{u}^{r}$ of user u, denoted as

$$q_{u}^{r} = \sum_{k=1}^{H} \beta_{k} s_{u,k}. \tag{10}$$

The feature representation $q_{i}^{r}$ of item i can be obtained in a similar way.
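As a small worked instance of Equations (8)–(10), suppose a user has H = 2 reviews whose scores happen to be $g_{1} = 1$ and $g_{2} = 0$ (values chosen purely for illustration):

```latex
\beta_{1} = \frac{e^{1}}{e^{1} + e^{0}} = \frac{e}{e + 1} \approx 0.731, \qquad
\beta_{2} = \frac{e^{0}}{e^{1} + e^{0}} = \frac{1}{e + 1} \approx 0.269,

q_{u}^{r} = \beta_{1} s_{u,1} + \beta_{2} s_{u,2} \approx 0.731\, s_{u,1} + 0.269\, s_{u,2}.
```

A one-unit gap in score therefore already gives the first review close to three times the weight of the second in the user representation.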

3.3. Rating Prediction

Through the above methods, we finally obtain the feature representations that can fully reflect the user’s personality and item attributes. Next, we fuse the user features and item features to predict their corresponding ratings. However, it is important to note that the same user may have different preferences for different items. Therefore, a personalized interactive attention mechanism is required to dynamically adjust the obtained user and item feature representations.

Personalized Interactive Attention Mechanism: In this step, we consider the differences in the preferences of the same user for different items and adjust the user personality and item attributes accordingly. As shown in Figure 4, we combine the user ID, item ID, user features, and item features to obtain the attention weight that contains interaction information through corresponding feature transformations. The calculation is as follows:

$$q_{f} = \mathrm{ReLU}(W_{f}(q_{u}^{r} \oplus q_{i}^{r}) + b_{f}), \tag{11}$$

$$p_{u}^{i} = \mathrm{ReLU}(W_{id}(u_{id} \oplus i_{id}) + b_{id}), \tag{12}$$

$$\gamma_{u} = \sigma(p_{u}^{i} C \odot q_{f}), \quad \gamma_{u} \in (0, 1), \tag{13}$$

where $W_{f}$ and $W_{id}$ denote the weight matrices, $b_{f}$ and $b_{id}$ represent the bias terms, $\sigma$ denotes the activation function, $C$ represents the linear transformation matrix for the interactive attention, and the symbol ⊙ denotes the element-wise product.

By adjusting the user features $q_{u}^{r}$ through the attention value $\gamma_{u}$, we obtain the representation of user u for a specific item i, as follows:

$$\hat{q}_{u}^{r} = \gamma_{u} \odot q_{u}^{r}. \tag{14}$$

The representation $\hat{q}_{i}^{r}$ of item i for a specific user u can be derived in a similar way.
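Equations (11)–(14) describe an element-wise gate on the user features. The sketch below traces them on toy inputs under two assumptions not stated in the text: $\sigma$ is taken to be the logistic sigmoid (which matches the stated range (0, 1)), and $q_{f}$ has the same dimension as $q_{u}^{r}$ so that the element-wise products are defined. All weights and helper names are made up for illustration.

```python
import math


def relu_vec(v):
    return [x if x > 0.0 else 0.0 for x in v]


def affine(W, x, b):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def interactive_gate(q_u, q_i, u_id, i_id, W_f, b_f, W_id, b_id, C):
    q_f = relu_vec(affine(W_f, q_u + q_i, b_f))        # Eq. (11), + is concatenation
    p = relu_vec(affine(W_id, u_id + i_id, b_id))      # Eq. (12)
    pC = [sum(p[r] * C[r][c] for r in range(len(p))) for c in range(len(C[0]))]
    gamma = [sigmoid(a * b) for a, b in zip(pC, q_f)]  # Eq. (13), sigma ASSUMED sigmoid
    adjusted = [g * x for g, x in zip(gamma, q_u)]     # Eq. (14)
    return gamma, adjusted


q_u, q_i = [1.0, 2.0], [0.5, 0.5]
gamma, q_u_hat = interactive_gate(
    q_u, q_i, u_id=[1.0], i_id=[1.0],
    W_f=[[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], b_f=[0.0, 0.0],
    W_id=[[1.0, 0.0], [0.0, 1.0]], b_id=[0.0, 0.0],
    C=[[1.0, 0.0], [0.0, 1.0]],
)
```

Because the gate depends on both the user–item ID pair and the fused review features, the same user vector is rescaled differently for each candidate item.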

Prediction Rating: Here, we fuse the adjusted user features $\hat{q}_{u}^{r}$ and item features $\hat{q}_{i}^{r}$ and input them into the factorization machine to simulate their interaction for rating prediction:

$$z = \hat{q}_{u}^{r} \oplus \hat{q}_{i}^{r}, \tag{15}$$

$$\hat{R}_{u,i} = w_{0} + \sum_{i=1}^{|z|} w_{i} z_{i} + \sum_{i=1}^{|z|} \sum_{j=i+1}^{|z|} \langle x_{i}, x_{j} \rangle z_{i} z_{j}, \tag{16}$$

where $\langle x_{i}, x_{j} \rangle$ is the inner product of the latent vectors of the i-th and j-th features and weights their second-order interaction, $w_{0}$ denotes the global bias, and $w_{i}$ denotes the first-order weight of the i-th feature in the factorization machine. Note that the summation indices i and j in Equation (16) range over the components of z and are unrelated to item i.
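Equation (16) can be evaluated directly with a double loop over feature pairs. The sketch below does so for a three-dimensional z; the parameter values are arbitrary, and a practical implementation would typically use the linear-time reformulation of the pairwise term rather than this quadratic loop.

```python
def fm_predict(z, w0, w, X):
    """Second-order factorization machine; X[i] is the latent vector x_i of feature i."""
    linear = sum(wi * zi for wi, zi in zip(w, z))
    pairwise = 0.0
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            dot = sum(a * b for a, b in zip(X[i], X[j]))  # <x_i, x_j>
            pairwise += dot * z[i] * z[j]
    return w0 + linear + pairwise


z = [1.0, 2.0] + [0.5]  # adjusted user features concatenated with item features
rating = fm_predict(
    z, w0=3.0, w=[0.1, 0.2, 0.4],
    X=[[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]],
)
```

With these numbers the linear term contributes 0.7 and the pairwise term 2.0, so the predicted rating is 5.7.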

4. Experiments

This section first presents the datasets and evaluation metrics used in the experiment, then describes the baseline models used for comparison along with the corresponding experimental environment and parameter settings, and finally provides a comprehensive analysis of the model’s performance through multiple experiments.

4.1. Datasets

To effectively verify the performance and generalization of our proposed model, we selected five subsets from the Amazon review dataset updated in 2018, which includes reviews from 1996 to 2018. These subsets are from several different domains: Digital Music, Musical Instruments, Video Games, Office Products, and Grocery and Gourmet Food. We refer to these as Music, Instrument, Game, Office, and Food, respectively. Experiments conducted on these public datasets of varying sizes across different fields can objectively reflect the real performance of different models.

Although the dataset has been briefly screened by the data publisher, there are still cases where the number and length of reviews vary significantly. To address the resulting long-tail effect, we preprocessed the data by limiting both the number and length of reviews. As shown in Table 1, the processed data mitigate the long-tail effect and exhibit a notable difference in sparsity. Experiments on these datasets can provide a more comprehensive and realistic evaluation of the model’s generalization.

During model training, we randomly split each dataset into a training set, validation set, and test set with a ratio of 80%, 10%, and 10%, respectively. During training, it is important to exclude the reviews attached to interactions in the validation and test sets. Additionally, during validation and testing, we excluded the user’s review of the current item, as such a review is not available in real recommendation scenarios.
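The split and the leakage rule can be sketched as follows. The tuple layout `(user, item, review_text)`, the seed, and the function names are hypothetical; the paper does not describe its preprocessing code.

```python
import random


def split_interactions(interactions, seed=0):
    """Random 80/10/10 split of (user, item, review_text) tuples."""
    items = list(interactions)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_valid = len(items) // 10
    return (items[:n_train],
            items[n_train:n_train + n_valid],
            items[n_train + n_valid:])


def visible_reviews(train, user, item):
    """Reviews usable as model input when predicting (user, item): training
    reviews by this user or about this item, minus the target pair itself."""
    return [r for r in train
            if (r[0] == user or r[1] == item)
            and not (r[0] == user and r[1] == item)]


data = [(u, i, f"review {u}-{i}") for u in range(10) for i in range(10)]
train, valid, test = split_interactions(data)
target_user, target_item, _ = test[0]
inputs = visible_reviews(train, target_user, target_item)
```

Drawing model inputs only from the training split keeps validation and test reviews out of the user and item documents, and the explicit pair filter also removes the target review when the pair being predicted is itself a training interaction.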

4.2. Evaluation Metrics

To evaluate the performance of all experimental methods, this paper uses the Mean Squared Error (MSE) and Mean Absolute Error (MAE) as the metrics to measure the prediction accuracy. These metrics are defined as follows:

$$\mathrm{MSE} = \frac{1}{K} \sum_{(u,i) \in R} (\hat{R}_{u,i} - R_{u,i})^{2}, \tag{17}$$

$$\mathrm{MAE} = \frac{1}{K} \sum_{(u,i) \in R} |\hat{R}_{u,i} - R_{u,i}|, \tag{18}$$

where $\hat{R}_{u,i}$ and $R_{u,i}$ denote the predicted and actual ratings, respectively, while K represents the number of ratings.
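Both metrics are simple to compute; the short example below uses four made-up rating pairs.

```python
def mse(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def mae(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
mse_value = mse(predicted, actual)   # squared errors 1, 0, 1, 4
mae_value = mae(predicted, actual)   # absolute errors 1, 0, 1, 2
```

The single two-point miss accounts for two thirds of the MSE but only half of the MAE, which is why the two metrics can rank models differently.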

4.3. Baselines

We selected the following methods as baselines to evaluate our proposed NRDPA method:

DeepCoNN [24]: Uses simple user and item networks to respectively extract user and item features from reviews. This is a classic deep learning-based model for review recommendation.
DAML [32]: Focuses on word meaning relevance in reviews both locally and globally during feature extraction, and combines rating and review features. This model explores word-level attention mechanisms in depth.
NRPA [33]: Implements dual personalized attention mechanisms at the word and review levels, emphasizing static personalization of user interests and item attributes. This model combines word-level and review-level attention.
NRCMA [34]: Implements cross-modal mutual attention, focusing on the information interaction between users and items during feature modeling at both the word and review levels. This model introduces information interaction into the modeling process.
ARPCNN [35]: Focuses on the sparsity of static personalized review data and establishes an auxiliary network by defining trust relationships, representing a relatively new approach in this field.
ERUR [47]: A novel user representation model for learning social graph recommendations based on enhanced reviews, which effectively integrates user reviews and social relationships. This research branch has gained significant attention recently.

In addition, Table 2 provides detailed statistics on the characteristics and differences between the baseline models and NRDPA.

4.4. Experimental Configuration and Parameter Settings

The details of our experimental configuration are shown in Table 3. The models were implemented in Python using PyTorch, with Adam as the optimizer. To ensure a fair comparison of model performance, we fixed the word embedding dimension at 100 during the experiment. For the model hyperparameters, we used a grid search with the following settings: ID embedding dimension in {16, 32, 48, 64}, attention size in {8, 16, 32, 48}, dropout in {0.1, 0.2, 0.4, 0.6}, and learning rate in {0.001, 0.01, 0.05, 0.1}. For the baseline models, we followed the same design choices as in the original papers.
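An exhaustive search over this grid covers 4 × 4 × 4 × 4 = 256 configurations. The sketch below enumerates them; `evaluate` stands for a hypothetical callback that would train a model and return its validation error, and the toy objective used here is an arbitrary stand-in, not a result from the paper.

```python
from itertools import product

grid = {
    "id_embedding_dim": [16, 32, 48, 64],
    "attention_size": [8, 16, 32, 48],
    "dropout": [0.1, 0.2, 0.4, 0.6],
    "learning_rate": [0.001, 0.01, 0.05, 0.1],
}


def configurations(grid):
    """Yield every combination of hyperparameter values as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(grid, evaluate):
    """Return the configuration with the lowest value of evaluate(config)."""
    return min(configurations(grid), key=evaluate)


def toy_objective(config):
    # Arbitrary stand-in for "train the model and return validation MSE".
    return (abs(config["id_embedding_dim"] - 32)
            + abs(config["attention_size"] - 16)
            + abs(config["dropout"] - 0.2)
            + config["learning_rate"])


configs = list(configurations(grid))
best = grid_search(grid, toy_objective)
```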

4.5. Performance Comparison

Table 4 and Figure 5 present the performance results of the NRDPA model and baseline models. Table 4 shows the experimental results for each model, with the best performance shown in bold. Figure 5 shows the MSE and MAE ratios, which measure the performance of each model relative to the best model. Based on these results, we draw the following conclusions.

First, the attention-based models (e.g., DAML, NRPA, NRCMA, ARPCNN, and NRDPA) outperform DeepCoNN, demonstrating that attention mechanisms effectively improve recommendation performance in review-based tasks.

Second, DAML, which uses only a word-level attention mechanism, performs worse than NRPA, NRCMA, ARPCNN, and NRDPA across multiple datasets, demonstrating the limitations of focusing solely on personalized features at the word level.

In addition, the strong performance of ERUR demonstrates that the recommendation performance achieved by constructing a graph to capture high-order interaction information between users and items is sufficient to surpass various static attention mechanisms.

Finally, the excellent performance of NRDPA across multiple datasets confirms our idea of focusing on dynamic personalized features. Although static personalization is no longer able to outperform recent graph-based review recommendation models, capturing dynamic features to better simulate user-item interactions can still improve recommendation accuracy.

To enhance the rigor and credibility of this experiment, we employed the Kolmogorov-Smirnov Predictive Accuracy (KSPA) test [48] to compare the accuracy between two prediction groups. KSPA determines statistical significance by analyzing the distribution of prediction errors between two groups and evaluates whether the model with smaller prediction errors has a significant advantage. Figure 6 and Figure 7 present the histograms and cumulative distribution functions (CDF) of the prediction errors for the NRDPA and ERUR models on the Food dataset. Analysis of these figures confirms that the NRDPA model significantly outperforms the ERUR model on the Food dataset.
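The quantity underlying such a comparison is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical CDFs of the two models' errors. The sketch below computes only that statistic on absolute errors, with made-up predictions; it omits the p-value and is an illustration of the idea, not the paper's evaluation code.

```python
def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the empirical CDFs of two samples."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


actual = [5.0, 4.0, 3.0, 5.0, 2.0]
pred_a = [4.8, 4.1, 3.2, 4.9, 2.1]   # hypothetical model A, small errors
pred_b = [4.0, 3.0, 4.0, 4.0, 3.0]   # hypothetical model B, errors of one star
err_a = [abs(p - t) for p, t in zip(pred_a, actual)]
err_b = [abs(p - t) for p, t in zip(pred_b, actual)]
d = ks_statistic(err_a, err_b)
```

In this toy case every error of model A is smaller than every error of model B, so the two CDFs are maximally separated and the statistic reaches its upper bound of 1.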

4.6. Ablation Study

To investigate the reasons for the superior performance of NRDPA in greater detail, we developed several variants by modifying the components of NRDPA, as briefly described below:

NRDPA-I: Extracts features of reviews using a basic network that is not personalized.
NRDPA-II: Adds a dynamic personalized word-level attention mechanism to NRDPA-I.
NRDPA-III: Adds a dynamic personalized review-level attention mechanism to NRDPA-II.

Based on the presented results of these NRDPA variants on multiple datasets in Figure 8, we make the following analysis.

First, the prediction errors on multiple datasets decrease from the variant NRDPA-I to the complete NRDPA model, which supports the effectiveness of our proposed components.

Second, NRDPA-II shows a large performance improvement over NRDPA-I on multiple datasets. This suggests that paying attention to dynamically changing user interests and item properties through the dynamic personalized word-level attention mechanism effectively improves recommendation accuracy.

In addition, NRDPA-III achieves further performance improvements by implementing a dynamic personalized review-level attention mechanism. However, the improvement is only slight, indicating that the gains from modeling user and item personalization through semantic information alone have largely saturated.

Finally, in comparing NRDPA and NRDPA-III, we find that the improvement of the complete model over NRDPA-III is greater than the improvement of NRDPA-III over NRDPA-II. This shows that dynamically adjusting personalized features according to users and items is an approach that review-based recommendation systems should consider.

4.7. Parameter Sensitivity Analysis

For the effects of ID embedding dimension, attention size, and dropout ratio on model NRDPA shown in Figure 9 and Figure 10, we make the following analysis.

First, the ID embedding dimension affects the generation of attention vectors, which changes the model’s focus on dynamic personalization at both the word and review levels. As Figure 9a and Figure 10a show, as the ID embedding dimension increases, the attention vector transitions from being unable to fully learn the features of the review to learning too many unimportant features. Therefore, setting the ID embedding dimension to 32 is more consistent with the learning ability of the attention vectors.

Second, the attention size affects the generation of attention weights, which in turn affects how well the model simulates the real world. Observing Figure 9b and Figure 10b, as attention size increases, the distribution of attention weights ranges from not conforming to the modeling rules of users and items to overfitting; therefore, setting the attention size to 16 is more in line with overall model performance.

Finally, the dropout ratio directly affects the fitting degree of the model after training. Observing Figure 9c and Figure 10c, the model is more affected by dropout on larger or smaller datasets. This could be due to the fact that smaller datasets are prone to overfitting and larger datasets are prone to underfitting. Therefore, taking all cases into account, we set the dropout to 0.2.

4.8. Interpretability Analysis

To explore the interpretability of the NRDPA model, we visualized the dynamic personalized word-level attention mechanism that captures important words. As shown in Figure 11, we use word weights obtained through the attention mechanism, which represent user personalization, to highlight keywords to varying degrees. In the visualization, the depth of a word’s background color corresponds to its weight. If the weight is too low, the background is omitted.
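
The highlighting rule described above can be sketched as follows. This is an illustrative rendering only: the function name `highlight`, the threshold value, the red background color, and the sample words and weights are assumptions, not values taken from the paper or from Figure 11.

```python
from html import escape


def highlight(words, weights, threshold=0.05):
    """Render words as HTML, shading each background by its attention weight.

    Words whose weight falls below `threshold` get no background at all.
    The opacity is scaled so that the largest weight maps to full depth.
    """
    if len(words) != len(weights):
        raise ValueError("words and weights must have the same length")
    peak = max(weights) if weights else 0.0
    spans = []
    for word, weight in zip(words, weights):
        text = escape(word)
        if peak <= 0.0 or weight < threshold:
            spans.append(text)  # weight too low: background omitted
        else:
            alpha = weight / peak  # deeper color for a larger weight
            spans.append(
                f'<span style="background: rgba(255, 0, 0, {alpha:.2f})">{text}</span>'
            )
    return " ".join(spans)


# Made-up example weights for a short review.
html_line = highlight(["great", "sound", "but", "fragile"], [0.50, 0.10, 0.02, 0.38])
```

Scaling by the largest weight keeps the most important word at full color depth regardless of review length; other normalizations would be equally reasonable.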

In Figure 11, using the dynamic personalized word-level attention mechanism, we obtain important words that reflect the personalized preferences of users A and B. For the same review, the attention mechanism highlights different important words at different times. For user A, the highlighted words indicate that the user's attitude toward the reviewed items has changed from positive to negative. For user B, although the highlighted words have changed, the user's attitude toward the reviewed items remains positive. User preferences are therefore not static, and the change in highlighted words demonstrates the ability of the dynamic personalized word-level attention mechanism to capture changes in user preferences.

5. Conclusions

In this paper, we propose a neural recommendation method that focuses on the dynamic features of users and items. The proposed method captures dynamic changes in user interests and item attributes through a dynamic personalized attention mechanism and dynamically emphasizes personalized features in various situations using a personalized interactive attention mechanism. Specifically, we employ a dynamic personalized word-level attention mechanism to highlight keywords reflecting personalized interests, a dynamic personalized review-level attention mechanism to emphasize important reviews containing explicit information, and a personalized interactive attention mechanism to align item features with user goals. Experiments on five datasets validate the superiority of NRDPA in predicting user ratings. Ablation experiments and interpretability analyses further demonstrate the critical role of the model components in capturing the personalization of real users.

The proposed NRDPA model represents users and products based on comment features. Therefore, in e-commerce or video recommendation scenarios where user comments are widely available, NRDPA can leverage the rich personalized features in the comments to make the final recommendation results more aligned with user preferences. At the same time, the utility of NRDPA’s various attention mechanisms in other research areas is worth exploring. For example, with appropriate modifications, the dynamic personalized word-level attention mechanism could also be applied to non-text data such as images or video. It must be noted that the performance of NRDPA will be significantly affected in recommendation settings with sparse review text. Additionally, the NRDPA model places a strong emphasis on further processing of the extracted features; thus, variations in feature extraction capabilities, including feature extraction techniques tailored to different languages, can greatly influence the effectiveness of the attention mechanism.

In the future, we will focus on the sparsity of comment texts and work to improve recommendation performance in scenarios where comment texts are missing or unreliable. We intend to leverage existing comment texts to supplement features based on the similarities between comments from different users. We also plan to gradually implement dynamic feature extraction and fusion of multimodal data in order to integrate network data resources, design a more comprehensive model, and enhance the accuracy and interpretability of the resulting recommendations.

Author Contributions

Conceptualization, Q.S.; methodology, Q.S.; software, Q.S.; validation, Q.S.; formal analysis, Q.S.; writing—original draft preparation, Q.S.; writing—review and editing, Z.L.; supervision, X.L. and X.W.; funding acquisition, Z.L. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region of China under Grant 2022D01C56, the National Natural Science Foundation of China under Grants 62466057 and 62262064, and the Key R&D Project of Xinjiang Uygur Autonomous Region under Grant 2022B01006.

Data Availability Statement

The data that support the findings of this study are openly available at https://nijianmo.github.io/amazon/ (accessed on 10 September 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Scenarios that require dynamic modeling of user features.

Figure 2. The framework of the proposed NRDPA model.

Figure 3. Calculation process of the dynamic personalized word-level and review-level attention mechanism.

Figure 4. Calculation process of the dynamic personalized interactive attention mechanism.

Figure 5. MSE and MAE ratios for the performance of all models on the five datasets.

Figure 6. Histogram drawn by KSPA test method.

Figure 7. Cumulative distribution function drawn by KSPA test method.

Figure 8. Performance comparison of NRDPA variants containing different components on the five datasets.

Figure 9. MSE performance for different dimensions of ID embedding, attention size, and dropout on five datasets.

Figure 10. MAE performance for different dimensions of ID embedding, attention size, and dropout on five datasets.

Figure 11. Visualization of the word weights assigned by the dynamic personalized word-level attention mechanism for users A and B under different interactions.

Table 1. Statistics of the five publicly available datasets.

Dataset	#Users	#Items	#Ratings	Density
Music	1909	742	12,099	0.854%
Instrument	3113	1002	23,711	0.760%
Game	5080	2071	45,422	0.432%
Office	8361	3056	75,046	0.294%
Food	16,813	6386	178,206	0.166%
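
The Density column is consistent with the share of the user-item matrix that holds a rating, that is, #Ratings / (#Users × #Items). The check below is a sketch that assumes this standard formula (the helper name `density_percent` is ours); under that assumption it reproduces the table values to three decimal places.

```python
# (users, items, ratings) copied from Table 1.
datasets = {
    "Music": (1909, 742, 12099),
    "Instrument": (3113, 1002, 23711),
    "Game": (5080, 2071, 45422),
    "Office": (8361, 3056, 75046),
    "Food": (16813, 6386, 178206),
}


def density_percent(users, items, ratings):
    """Percentage of user-item pairs that have a rating (assumed definition)."""
    return 100.0 * ratings / (users * items)


densities = {
    name: round(density_percent(*stats), 3) for name, stats in datasets.items()
}
for name, value in densities.items():
    print(f"{name}: {value:.3f}%")
```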

Table 2. Characteristics of all models.

Methods	Deep Learning	Review Text	Word-Level Attention	Review-Level Attention	Interactive Attention	Rating Matrix	Auxiliary Feature
DeepCoNN	✓	✓	×	×	×	×	×
DAML	✓	✓	✓	×	×	✓	×
NRPA	✓	✓	✓	✓	×	×	×
NRCMA	✓	✓	✓	✓	×	×	×
ARPCNN	✓	✓	✓	✓	×	×	✓
ERUR	✓	✓	×	×	×	×	✓
NRDPA	✓	✓	✓	✓	✓	×	×

Table 3. Experimental environment configuration.

Resource	Configuration
OS	Ubuntu 20.04.6 LTS
NVIDIA Driver	550.90.07
CPU	Intel i7-13700KF
GPU	GeForce RTX 3090
RAM	94 GB
CUDA	12.4
cuDNN	11.3
Python	3.7.16
PyTorch	1.12.0
Optimizer	Adam

Table 4. Performance of all models on the five datasets.

Methods	Music MSE	Music MAE	Instrument MSE	Instrument MAE	Game MSE	Game MAE	Office MSE	Office MAE	Food MSE	Food MAE
DeepCoNN	0.664	0.641	0.943	0.761	1.363	0.914	0.963	0.722	1.190	0.916
DAML	0.592	0.566	0.901	0.712	1.355	0.902	0.943	0.729	1.133	0.897
NRPA	0.610	0.578	0.868	0.725	1.344	0.904	0.854	0.707	1.128	0.860
NRCMA	0.606	0.569	0.860	0.711	1.342	0.901	0.862	0.711	1.125	0.821
ARPCNN	0.595	0.567	0.857	0.697	1.335	0.898	0.852	0.698	1.105	0.791
ERUR	0.545	0.557	0.852	0.657	1.330	0.891	0.851	0.693	1.086	0.751
NRDPA	0.490	0.522	0.844	0.662	1.322	0.895	0.843	0.694	1.047	0.726
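
As a rough check on the headline improvement figures, the sketch below computes the relative error reduction of NRDPA against ERUR, which has the lowest error among the baselines in every column of Table 4. The formula (baseline − model) / baseline is an assumption about how the percentages were derived, and `relative_improvement` is a hypothetical helper name. Under that assumption, the Music column gives about 10.1% for MSE and 6.3% for MAE, in line with the figures quoted in the abstract; a negative value means NRDPA's error is higher than ERUR's on that metric.

```python
# (MSE, MAE) pairs copied from Table 4.
erur = {
    "Music": (0.545, 0.557),
    "Instrument": (0.852, 0.657),
    "Game": (1.330, 0.891),
    "Office": (0.851, 0.693),
    "Food": (1.086, 0.751),
}
nrdpa = {
    "Music": (0.490, 0.522),
    "Instrument": (0.844, 0.662),
    "Game": (1.322, 0.895),
    "Office": (0.843, 0.694),
    "Food": (1.047, 0.726),
}


def relative_improvement(baseline, model):
    """Percentage reduction in error relative to the baseline (assumed formula)."""
    return 100.0 * (baseline - model) / baseline


improvements = {
    name: tuple(
        round(relative_improvement(b, m), 1)
        for b, m in zip(erur[name], nrdpa[name])
    )
    for name in erur
}
for name, (mse_gain, mae_gain) in improvements.items():
    print(f"{name}: MSE {mse_gain:+.1f}%, MAE {mae_gain:+.1f}%")
```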

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
