Article

Improving Graph-Based Movie Recommender System Using Cinematic Experience †

1
Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
2
Department of Data Engineering, Buzzvil, Seoul 05623, Korea
3
Intelligent Convergence Research Laboratory, Electronics and Telecommunications Research Institute, Daejeon 34129, Korea
*
Authors to whom correspondence should be addressed.
A preliminary version of this paper was presented at IEEE International Conference on Big Data and Smart Computing (BigComp 2022), held in Daegu, South Korea from 17 to 20 January.
Appl. Sci. 2022, 12(3), 1493; https://doi.org/10.3390/app12031493
Submission received: 20 December 2021 / Revised: 21 January 2022 / Accepted: 28 January 2022 / Published: 29 January 2022

Abstract

With the advent of many movie content platforms, users face a flood of content and consequent difficulties in selecting appropriate movie titles. Although much research has been conducted on developing effective recommender systems that provide personalized recommendations based on customers’ past preferences and behaviors, little attention has been paid to leveraging users’ sentiments and emotions together. In this study, we built a new graph-based movie recommender system that utilizes sentiment and emotion information along with user ratings, and evaluated its performance against well-known conventional models and state-of-the-art graph-based models. The sentiment and emotion information was extracted using fine-tuned BERT. We used a Kaggle dataset created by crawling movies’ metadata and review data from the Rotten Tomatoes website, together with Amazon product data. The results show that the proposed IGMC-based models coupled with emotion and sentiment are superior to the compared models. The findings highlight the significance of using sentiment and emotion information in movie recommendation.

1. Introduction

In recent years, users have been exposed to a large volume of multimedia content due to the emergence of Over-the-Top (OTT) platforms such as Netflix and Disney Plus. This highlights the importance of developing content recommender systems that identify users’ tastes from their daily usage logs and recommend serendipitous content that users may potentially like, so that users stay tuned to the OTT platform services for longer.
The best-known recommendation strategy of commercial OTT services is to feed users’ rating data into collaborative filtering [1]. However, taste in movies can be expressed through various kinds of experience other than rating scores, and those experiences are often defined as Cinematic Experience (CX) [2]. Generally, CX refers to the interplay between inner mental reality and outer material reality when watching a movie. Specifically, it refers to the environment in which the viewer is immersed in a movie, or the psychological context and aftertaste of watching it. To be used for content recommendation, CX therefore needs to be expressed in numerical and categorical forms, such as the sentiment and emotion of the users.
Sentiment and emotion are representative data that can reflect the user’s CX, and OTT platforms collect and use them individually for personalized recommendations. For instance, YouTube collects not only sentiment feedback on a five-point scale from negative to positive but also emotion feedback, by asking users about their feelings after watching the recommended item. Similarly, some previous studies have shown that sentiment or emotion information improves recommendation performance [1,3,4]. However, to the best of our knowledge, no studies have used both of these CX elements simultaneously in a recommender system.
In fact, sentiment and emotion are measured on two clearly distinct scales: ordinal and categorical. First, sentiment typically divides the user’s positive and negative feelings into five levels (negative, somewhat negative, neutral, somewhat positive, and positive) and is characterized by having an order. Second, emotion commonly divides the user’s psychological feelings into six categories (anger, fear, joy, love, sadness, and surprise) and is characterized by having no order. This difference in dimension makes it challenging to combine the two kinds of data for recommendation, and motivates new approaches that embed multiple CX features into one dimension to enhance user retention on OTT platforms.
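The distinction between the two scales can be made concrete with a minimal sketch. The function names below are hypothetical and are not taken from the paper; the label lists follow the five sentiment levels and six emotion categories named above. An ordinal variable can be mapped to integers whose order carries meaning, whereas a categorical variable is usually one-hot encoded so that no category is treated as larger than another.

```python
# Label sets as listed in the text; the encoders themselves are illustrative.
SENTIMENT_LEVELS = ["negative", "somewhat negative", "neutral",
                    "somewhat positive", "positive"]
EMOTION_CATEGORIES = ["anger", "fear", "joy", "love", "sadness", "surprise"]


def encode_sentiment(label: str) -> int:
    """Ordinal code from 0 (negative) to 4 (positive); the order is meaningful."""
    return SENTIMENT_LEVELS.index(label)


def encode_emotion(label: str) -> list:
    """One-hot code; the position carries no notion of 'more' or 'less'."""
    return [1 if label == category else 0 for category in EMOTION_CATEGORIES]


print(encode_sentiment("somewhat positive"))
print(encode_emotion("joy"))
```

Comparing two sentiment codes with `<` is meaningful, while comparing two emotion vectors in that way is not, which is why a simple rescaling that works for sentiment does not carry over to emotion.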
Recently, recommendation methods that represent the interactions between users and movies as a graph and reflect metadata and relations through a graph neural network (GNN) have become popular [5,6,7,8]. In this paper, we propose a new graph-based movie recommender system with BERT, which utilizes all of the rating, sentiment, and emotion information while accounting for the different dimensions of this information. Inductive Graph-based Matrix Completion (IGMC) [6] is a link prediction method that transforms the user–item rating matrix into a bipartite graph and predicts missing ratings using subgraph embeddings. The proposed model obtains the user’s psychological context using BERT models fine-tuned for sentiment and emotion, and predicts final ratings using IGMC by combining the BERT results and rating information into a subgraph. Using Rotten Tomatoes movie data (https://www.rottentomatoes.com/, data accessed 13 September 2021) and Amazon review data (http://jmcauley.ucsd.edu/data/amazon/, data accessed 20 September 2021) [9,10], we conducted an experiment against baselines widely used in existing recommender systems. In the experiment, we observed that the IGMC-based models with CX information outperformed the baselines.
Our contributions in this paper can be summarized as follows:
  • We build a stacking ensemble recommender system that exploits both the ordinal and categorical CX features by combining the BERT and IGMC approaches.
  • We further investigate individual effects of sentiment and emotion features by adding the features into the proposed model sequentially.
  • By adopting the stacking ensemble approach that predicts the sentiment and emotions from multiple fine-tuned BERTs, and further exploits the results for the final prediction, we achieve higher effectiveness compared to the state-of-the-art recommendation baselines.
The remainder of this paper is structured as follows: Section 2 reviews related studies and Section 3 describes the procedure of the proposed model for content recommendation. Section 4 demonstrates the effectiveness of the proposed model through experiments and presents the results. Section 5 discusses the implications and limitations. Lastly, Section 6 concludes the study and outlines future work.

2. Related Work

Recommender systems select the most appropriate items for a specific user based on user profiles, item profiles, and user–item interactions. Recommender systems use different types of filtering, such as content-based, collaborative, demographic-based, and hybrid [11]. Recently, many studies have adopted neural network architectures to mine complex and nonlinear patterns of user–item interactions [12].
There are broadly two types of approaches to content recommendation: rating-only and multi-relational recommendation.
First, the recommendation models in [13,14] predicted a user’s future rating by using the user reviews as input data. In such methods, sophisticated recommendation became difficult when there was a substantial number of neutral ratings. As the number of user reviews was too limited to track individual user interests, various sources such as user tags [15] and user behavior logs [16] were further exploited for content recommendation.
Second, in [17], sentiment analysis was conducted in a statistical manner based on a sentiment dictionary, text mining, and natural language processing (NLP) [18], but only two levels (positive, negative) or three levels (positive, neutral, negative) were utilized. It is essential to further subdivide the user’s psychological information for better recommendation. However, it remains difficult to extract the user’s psychology in detail by dividing sentiment into finer levels.
Even if sentiment analysis is successfully conducted at a finer level, it is still difficult to use the data together because the dimensions of sentiment and emotion are different. For example, in Hyun et al.’s study [19], positive and negative signals were scaled to the range −0.5 to 0.5 to reflect the results of sentiment analysis, combined with the rating, and then used in the recommender system. If emotion information is to be reflected in the recommendation in the same way, scaling becomes problematic because emotion is not ordinal but categorical. This suggests that a new method should be considered, because a naive linear combination loses effectiveness due to the dimension difference between the multiple relations. Previous studies have shown that sentiment or emotion information improves recommendation performance [1,3,4,20,21]. To the best of our knowledge, however, no studies have used the two in combination for recommender systems.
Existing recommender systems range from traditional statistical methods to deep learning methods. Traditional methods can be further divided into memory-based and model-based approaches. In the memory-based approach, collaborative filtering (CF) [22,23] is a widely used algorithm and can be divided into user-based and item-based CF according to the criterion for similarity comparison. User-based CF identifies a user’s neighbors, that is, users with similar preferences, using their historical information. Item-based CF, on the other hand, identifies the neighbors of items.
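As a minimal sketch of the difference between the two CF variants, the snippet below computes cosine similarity over the rows of a toy rating matrix (user-based) and over its columns (item-based). The matrix values and the helper name `cosine_similarity_rows` are illustrative assumptions, and treating unrated entries as zero is a simplification that practical implementations usually avoid.

```python
import numpy as np

# Toy rating matrix: rows are users, columns are items, 0 means "not rated".
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
])


def cosine_similarity_rows(M: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty rows
    unit = M / norms
    return unit @ unit.T


user_sim = cosine_similarity_rows(R)     # user-based CF: compare users
item_sim = cosine_similarity_rows(R.T)   # item-based CF: compare items

print(user_sim.round(3))
print(item_sim.round(3))
```

In this toy example the first two users rate the same items similarly, so their similarity is much higher than that between the first and third users.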
In the model-based approach, matrix factorization (MF) [24,25] predicts the missing ratings of the sparse matrix by factorizing the rating matrix into a latent user matrix and a latent item matrix. As one of the representative linear algebra techniques for matrix decomposition, singular value decomposition (SVD) [26] predicts the missing ratings by decomposing the rating matrix into an m × m orthogonal matrix, an m × n rectangular diagonal matrix (of singular values), and an n × n orthogonal matrix. Here, m is the number of users and n is the number of movies. Furthermore, SVD++ extends SVD to enhance the accuracy of prediction by incorporating implicit feedback.
Recently, recommendation methods based on graph neural networks (GNNs) have gained attention. GNN-based methods learn user–item relations from graph data and predict new relations. The graph convolutional network is a powerful tool for non-Euclidean data such as social networks, knowledge graphs, and user–item interaction graphs. A graph convolutional network generates user and item embeddings from the graph [12], and these embeddings can be used to predict the ratings of user–item pairs.
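The shapes of the three factors can be checked with a short NumPy sketch. The rating matrix here is random toy data, and the rank `k` is an arbitrary illustrative choice. Note that this is the textbook decomposition of a dense matrix; recommender libraries typically fit latent factors on the observed ratings only, so this is not a description of how the baselines in this paper were trained.

```python
import numpy as np

m, n = 4, 3                                  # m users, n movies (toy sizes)
rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(m, n)).astype(float)

# Full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values.
U, s, Vt = np.linalg.svd(R, full_matrices=True)

# Place the singular values on the diagonal of an m x n rectangular matrix.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)
R_full = U @ Sigma @ Vt                      # exact reconstruction

# Rank-k approximation: keep only the k largest singular values.
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(U.shape, Sigma.shape, Vt.shape)
print(np.allclose(R_full, R))
```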
Graph convolution matrix completion (GCMC) [5] is a link prediction method that transforms the rating matrix of user–item into a bipartite graph and predicts missing rating using a graph auto-encoder. On the basis of this idea, recent studies performed recommendations using a bipartite graph and link prediction method [27,28]. In [29], it was confirmed that link prediction performance was improved when text information was reflected and node embedding was performed with GCN. In GCMC, node embedding and relationship information between nodes were used through the GCN method, and by comparing RMSE, it was confirmed that user and item feature information could also be used as an important factor in recommendation.
Recently, IGMC [6], a model that improves on GCMC, was introduced. It is a current state-of-the-art GNN model that most accurately predicts rating scores using a bipartite graph, by conducting subgraph-level embedding without side information. Figure 1 shows the link prediction process of IGMC. Sub-bipartite graphs are used as inputs, and the model outputs subgraph embeddings through the message passing process. Finally, a Multi-Layer Perceptron (MLP) predicts ratings by reconstructing the rating matrix from the subgraph embeddings. Although the model is effective without side information, the performance of IGMC may be further improved by exploiting CX as side information, which has yet to be explored.

3. Method

In this section, we propose our model that considers both the rating and the CX information of sentiment and emotion together. Extracting the relations between the users and items can be achieved by passing the user reviews through the IGMC model. In the subsequent sections, we first describe the overview of the proposed model, which combines the predictions through multiple BERTs for the final prediction with IGMC. We then proceed to describe the BERTs and IGMC model that we developed for this study.
A. Overview of the Proposed Model
Figure 2 shows the overview of the proposed model. First, movie and review data from Rotten Tomatoes and Amazon Product were preprocessed to exclude cold-start cases and divided into training and testing sets. Second, the Rotten Tomatoes rating scores were normalized, because they were recorded in various formats. The refined scores ranged from 1 to 5 at intervals of 0.5. The Amazon data had already been normalized. Third, the user reviews were fed into independent fine-tuned BERTs, which output the predicted sentiment and emotion. The detailed operation process is presented in Algorithm 1, which describes how the relationship between user u and item v is added as a bipartite graph. Fourth, the three graphs of rating, sentiment, and emotion edges are entered into the graph model. Figure 3 shows the specific embedding process of the graph model. The final embedding was obtained by summing the three subgraph embedding vectors containing the three types of information, producing richer information than the rating alone. The corresponding operation process is presented in Algorithm 2. In IGMC, graphs were embedded using RGCN. When the three graphs were input to the RGCN, three vectors were output and combined to generate an aggregated vector. Finally, we used a Multi-Layer Perceptron (MLP) to predict missing ratings for recommendation. The framework is based on the assumption, as reported in [19], that the performance of the model can be further improved by exploiting CX data as side information.
Algorithm 1. Generating graphs through BERT (shown as an image in the original article).
Algorithm 2. Aggregating embedding (shown as an image in the original article).
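Since Algorithm 2 is only available as an image, the following is a rough sketch of the aggregation step as described in the text: each of the three graphs (rating, sentiment, emotion) is embedded separately, and the three subgraph embedding vectors are summed. Everything concrete here is an assumption made for illustration: the single simplified relational layer (without the self-loop term of a full R-GCN), the random adjacency matrices and weights, the random node features, and the function names. It is not the authors' implementation.

```python
import numpy as np


def relational_layer(x, adj_by_type, weights):
    """One simplified relational graph-convolution step.

    x:           (num_nodes, d_in) node features
    adj_by_type: list of (num_nodes, num_nodes) adjacency matrices, one per edge type
    weights:     list of (d_in, d_out) weight matrices, one per edge type
    """
    out = np.zeros((x.shape[0], weights[0].shape[1]))
    for adj, w in zip(adj_by_type, weights):
        deg = adj.sum(axis=1, keepdims=True)
        deg[deg == 0] = 1.0                  # isolated nodes keep a zero message
        out += (adj / deg) @ x @ w           # mean over neighbours of this edge type
    return np.tanh(out)


def subgraph_embedding(x, adj_by_type, weights, user_idx=0, item_idx=1):
    """Concatenate the hidden states of the target user and target item."""
    h = relational_layer(x, adj_by_type, weights)
    return np.concatenate([h[user_idx], h[item_idx]])


def aggregate_embeddings(embeddings):
    """Sum the per-graph embeddings into one vector (the step named in the text)."""
    return np.sum(np.stack(embeddings), axis=0)


rng = np.random.default_rng(0)
num_nodes, d_in, d_out = 4, 3, 2
x = rng.normal(size=(num_nodes, d_in))       # random stand-in for node features


def random_graph(num_edge_types):
    """Random toy adjacency matrices and weights for one relation graph."""
    adjs = [rng.integers(0, 2, size=(num_nodes, num_nodes)).astype(float)
            for _ in range(num_edge_types)]
    ws = [rng.normal(size=(d_in, d_out)) for _ in range(num_edge_types)]
    return adjs, ws


# 9 rating levels (1 to 5 in steps of 0.5), 5 sentiment levels, 6 emotion categories.
embeddings = [subgraph_embedding(x, *random_graph(k)) for k in (9, 5, 6)]
final_embedding = aggregate_embeddings(embeddings)
print(final_embedding.shape)
```

Summation requires the three embeddings to share one dimension, which is how graphs built from ordinal and categorical edge types end up in a common space before the MLP.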
B. Predictions for sentiment and emotion by BERT
We used the BERT model to extract sentiment and emotion information from the review text. BERT is a state-of-the-art transformer model that, after fine-tuning, has been shown to perform strongly on various tasks such as sentiment analysis, preference analysis, and conversation purpose classification [18]. To fine-tune the BERT models, we used training datasets for sentiment analysis (https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data, accessed on 12 December 2021) and emotion analysis (https://www.kaggle.com/praveengovi/emotions-dataset-for-nlp, accessed on 12 December 2021), both available on Kaggle. On the testing sets, the accuracy of the BERT models was 70% for sentiment analysis and 90.4% for emotion analysis. The labels produced by the fine-tuned BERTs were added as relations on the user–item graph. The correlation between the outputs of the BERT models and the rating is described in the Experiment section.
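The way predicted labels become graph relations can be sketched as follows. This is a hedged illustration of the idea behind Algorithm 1, not a reproduction of it: `Review`, `build_relation_graphs`, and the two keyword-based predictors are hypothetical names, and the predictors are trivial stand-ins for the fine-tuned BERT classifiers.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Review(NamedTuple):
    user: str
    item: str
    rating: float
    text: str


def build_relation_graphs(
    reviews: List[Review],
    predict_sentiment: Callable[[str], int],
    predict_emotion: Callable[[str], str],
) -> Dict[str, List[Tuple[str, str, object]]]:
    """Return one (user, item, label) edge list per relation type."""
    graphs = {"rating": [], "sentiment": [], "emotion": []}
    for r in reviews:
        graphs["rating"].append((r.user, r.item, r.rating))
        graphs["sentiment"].append((r.user, r.item, predict_sentiment(r.text)))
        graphs["emotion"].append((r.user, r.item, predict_emotion(r.text)))
    return graphs


# Stand-in predictors (assumption): a real pipeline would call the two
# fine-tuned BERT models here instead of matching a keyword.
def toy_sentiment(text: str) -> int:
    return 4 if "great" in text else 0


def toy_emotion(text: str) -> str:
    return "joy" if "great" in text else "sadness"


reviews = [
    Review("u1", "m1", 4.5, "a great film"),
    Review("u2", "m1", 1.5, "dull and slow"),
]
graphs = build_relation_graphs(reviews, toy_sentiment, toy_emotion)
print(graphs["sentiment"])
print(graphs["emotion"])
```

Each review thus contributes one edge to each of the three bipartite graphs, with the edge label taken from the rating, the predicted sentiment, and the predicted emotion, respectively.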
C. IGMC model
As shown in Figure 1, the existing IGMC model [6] transforms the user–item rating matrix into a bipartite graph and uses an MLP to predict the unobserved interaction between a user and an item from a subgraph embedding. IGMC learns the patterns of subgraphs rather than of the entire graph, and the model is inductive, so it can predict ratings even for data not seen during training. The model consists of four main steps.
  • Enclosing subgraph extraction:
    Starting from a specific user and item in the entire bipartite graph, the neighborhood is expanded to create a subgraph. The nodes within h hops of the target user and item are added to the subgraph node set. Thereafter, edges associated with nodes that do not belong to this set are removed. Finally, the edge between the target user and item, whose rating is to be predicted, is deleted.
  • Node labeling:
    For the model to be inductive, it must discard the global context (such as node identities) so that it generalizes. Therefore, a new label is attached to each node so that the model reads only the subgraph’s pattern.
  • Passing graph neural networks:
    When the features of the subgraph are extracted using a GNN, each node obtains an embedding that starts from its assigned label. In the proposed model, a relational GNN (R-GNN), which extends the GNN to capture different features according to the type of edge, is used. The subgraph passes through several R-GNN layers, and the outputs of all layers are concatenated to form the final representation of each node.
  • Rating prediction:
    Among the node feature vectors produced by the R-GNN, those of the target user and target item, for which we want to predict the score, are concatenated and defined as the final representation of the subgraph. Finally, the rating is obtained by passing this representation through the MLP.
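The first two steps can be sketched in plain Python. The function name and the toy edge list are illustrative. The labeling scheme used here (0 for the target user, 1 for the target item, 2i for users and 2i + 1 for items first reached at hop i) follows the scheme described for IGMC in [6] as we understand it; it is not spelled out in the text above and should be read as an assumption.

```python
from collections import defaultdict


def extract_enclosing_subgraph(edges, target_user, target_item, h=1):
    """Return (user_labels, item_labels, sub_edges) for the h-hop enclosing subgraph.

    edges: iterable of (user, item, rating) triples of the full bipartite graph.
    """
    edges = list(edges)
    user_nbrs, item_nbrs = defaultdict(set), defaultdict(set)
    for u, v, _ in edges:
        user_nbrs[u].add(v)
        item_nbrs[v].add(u)

    # Node labeling: the target pair gets 0 and 1; hop-i users get 2i, items 2i + 1.
    user_label = {target_user: 0}
    item_label = {target_item: 1}
    user_fringe, item_fringe = {target_user}, {target_item}
    for hop in range(1, h + 1):
        new_users = {u for v in item_fringe for u in item_nbrs[v]} - set(user_label)
        new_items = {v for u in user_fringe for v in user_nbrs[u]} - set(item_label)
        for u in new_users:
            user_label[u] = 2 * hop
        for v in new_items:
            item_label[v] = 2 * hop + 1
        user_fringe, item_fringe = new_users, new_items

    # Keep edges inside the node set, and drop the target edge to be predicted.
    sub_edges = [
        (u, v, r) for u, v, r in edges
        if u in user_label and v in item_label
        and not (u == target_user and v == target_item)
    ]
    return user_label, item_label, sub_edges


edges = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4),
         ("u2", "i3", 2), ("u3", "i3", 1)]
users, items, sub_edges = extract_enclosing_subgraph(edges, "u1", "i1", h=1)
print(users, items, sub_edges)
```

Because the labels depend only on each node's position relative to the target pair, the same subgraph pattern looks identical to the model regardless of which user or item identifiers it involves.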

4. Experiment

A. Datasets
We used two movie datasets: Rotten Tomatoes and Amazon Review. Because the two datasets use different rating systems, we used both to see whether the proposed model can handle a standard rating system as well as an irregular one. The Amazon dataset contains ratings by general users on a scale from 1 to 5. The Rotten Tomatoes dataset, in contrast, was crawled by an individual researcher and shared on Kaggle, and has unstructured ratings such as categorical types (A, B, C, D), continuous types (0.75, 0.573), and fractional types (1/5, 35/70). The preprocessing of each dataset is presented below.
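The paper does not state the exact rules used to normalize these heterogeneous ratings, so the following is only a sketch under stated assumptions: letter grades are mapped to fractions of the full scale by a hypothetical table, fractional ratings are evaluated as numerator over denominator, continuous values are assumed to lie in [0, 1], and the result is mapped to the 1 to 5 range and rounded to the nearest 0.5.

```python
# Hypothetical letter-grade mapping (assumption, not from the paper).
LETTER_TO_UNIT = {"A": 1.0, "B": 0.75, "C": 0.5, "D": 0.25}


def to_unit_interval(raw: str) -> float:
    """Convert a raw rating string to a fraction of the full scale."""
    raw = raw.strip()
    if raw.upper() in LETTER_TO_UNIT:
        return LETTER_TO_UNIT[raw.upper()]
    if "/" in raw:
        numerator, denominator = raw.split("/")
        return float(numerator) / float(denominator)
    return float(raw)  # assumed to be already expressed in [0, 1]


def normalize_rating(raw: str, low: float = 1.0, high: float = 5.0,
                     step: float = 0.5) -> float:
    """Map a raw rating to the [low, high] scale, rounded to a multiple of step."""
    unit = min(max(to_unit_interval(raw), 0.0), 1.0)
    scaled = low + unit * (high - low)
    return round(scaled / step) * step


for raw in ["A", "0.75", "0.573", "1/5", "35/70"]:
    print(raw, "->", normalize_rating(raw))
```

Other choices (for example, a different letter mapping or a different treatment of values outside [0, 1]) are equally plausible; the point is only that all three formats can be brought onto one scale with a 0.5 interval.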
Rotten Tomatoes movie dataset. We used a Kaggle dataset (https://www.kaggle.com/stefanoleone992/rotten-tomatoes-movies-and-critic-reviews-dataset, last accessed: 15 October 2021), which was constructed by crawling movies’ metadata and review data from the Rotten Tomatoes website. The dataset was divided into movie data and rating data, where the number of user ratings and reviews was 758,709 in total, accounting for 67% of the whole dataset. There were three considerations when building the dataset. First, users with more than 20 ratings were selected to minimize the effect of the long tail, as shown in Figure 4. Second, the training and testing sets were divided based on the timestamps, to find out whether the model successfully predicts future ratings from historical data. Third, only the user_ids and movie_ids that existed in the training set were retained in the testing set, for consistent observation.
In the end, 1112 user_ids and 8521 movie_ids were used. For the node embedding of GCMC, ‘title, year, and genres’ were used in the movie data, and ‘top_critic and publisher_name’ were used in the user data. Because movie_id and user_id did not exist in the original dataset, movie_ids were assigned in index order in the movie data, and user_ids were assigned to groups of ‘critic_name, top_critic, and publisher_name’ in the user data.
Amazon Review dataset. We used Amazon product data, a popular benchmark dataset [30,31,32,33] containing product reviews and metadata from Amazon from 1996 to 2018. It includes various categories such as Books, Electronics, Movies and TV, and Beauty. Among the categories, we used the Movies and TV data, given the purpose of our experiment. First, we used the smaller version of the dataset (1,697,533 reviews) and chose users with more than 50 reviews to exclude the cold-start problem. Figure 5 shows a long-tail distribution, as in Rotten Tomatoes. Second, the training and testing data were divided at a ratio of 8:2 according to the timestamp. Finally, 3264 user_ids and 14,124 movie_ids were used. Because the Amazon product data does not provide meta information on movies, the GCMC model does not use node features for the Amazon product data.
Feature generation using fine-tuned BERT. To fine-tune BERT, data from ‘Sentiment Analysis on Movie Reviews’ and ‘Emotions Dataset for NLP’ on Kaggle were used. After fine-tuning two independent BERTs on the training data sets, the review data were entered into the BERTs, and the predicted results were used as the labels of the edges between user_id and movie_id. It was observed that the sentiment data were highly correlated with the rating. In the emotion data, on the other hand, no clear correlation was seen. Figure 6 shows the correlation between sentiment and rating. The more positive the sentiment, the higher the rating, indicating a positive correlation between the two variables. Accordingly, when ordered features are combined, the recommendation performance may be expected to increase.
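The preprocessing steps described for the two datasets can be sketched together as below. The function name and the toy records are hypothetical. Two details are assumptions because the text does not specify them: the chronological split is applied globally rather than per user, and the restriction of the testing set to ids seen in training (stated for Rotten Tomatoes) is applied unconditionally.

```python
from collections import Counter


def filter_and_split(records, min_reviews=50, train_ratio=0.8):
    """Filter sparse users, then split chronologically.

    records: list of (user_id, movie_id, rating, timestamp) tuples.
    Users are kept only if they have more than `min_reviews` records.
    """
    counts = Counter(r[0] for r in records)
    kept = sorted((r for r in records if counts[r[0]] > min_reviews),
                  key=lambda r: r[3])

    cut = int(len(kept) * train_ratio)       # earliest 80% for training
    train, test = kept[:cut], kept[cut:]

    # Keep only test records whose user and movie both appear in training.
    train_users = {r[0] for r in train}
    train_items = {r[1] for r in train}
    test = [r for r in test if r[0] in train_users and r[1] in train_items]
    return train, test


toy_records = [
    ("a", "m1", 4.0, 1), ("a", "m2", 3.0, 2), ("a", "m1", 5.0, 3),
    ("b", "m1", 2.0, 4), ("a", "m3", 1.0, 5), ("a", "m2", 2.0, 6),
]
train, test = filter_and_split(toy_records, min_reviews=2, train_ratio=0.8)
print(len(train), test)
```

With the thresholds from the text, `min_reviews` would be 20 for Rotten Tomatoes and 50 for Amazon.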
Meanwhile, Figure 7 shows the correlation between the emotion and rating, and shows that there is no clear correlation. By observing the dataset, we found that, when users felt the emotion of ‘joy, love, surprise’, they left high ratings, and when they felt the emotion of ‘sadness’, they tended to leave low ratings. On the other hand, when they felt the emotion of ‘fear’, they often left relatively high ratings.
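The two kinds of analysis described above can be reproduced in outline with NumPy. The arrays below are made-up toy values chosen only to illustrate the computation; they are not the paper's data and do not reproduce Figure 6 or Figure 7. For an ordinal feature, a correlation coefficient is meaningful, whereas for a categorical feature one can only compare the mean rating per category.

```python
from collections import defaultdict

import numpy as np

# Toy data (illustrative only): ordinal sentiment codes 0..4, ratings, emotions.
sentiment = np.array([0, 1, 2, 3, 4, 4, 1, 3])
rating = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 4.5, 1.5, 3.5])
emotion = ["sadness", "anger", "fear", "joy", "love", "joy", "sadness", "surprise"]

# Ordinal feature: Pearson correlation with the rating.
pearson_r = np.corrcoef(sentiment, rating)[0, 1]

# Categorical feature: mean rating per emotion category.
ratings_by_emotion = defaultdict(list)
for e, r in zip(emotion, rating):
    ratings_by_emotion[e].append(r)
mean_rating = {e: float(np.mean(v)) for e, v in ratings_by_emotion.items()}

print(round(float(pearson_r), 3))
print(mean_rating)
```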
B. Compared methods
In the experiment, we chose multiple traditional and graph-based models for comparison. For the CF model, we investigated both item-based and user-based CF. As matrix factorization has been widely adopted for commercial use in recent years, we also included SVD-based models (SVD and SVD++). Because the proposed model is a graph-based approach to recommendation, the current state-of-the-art graph-based models should also be included in the experiment. For this purpose, the GCMC and IGMC models were selected, and their hyperparameters were determined by grid search to ensure their best performance.
Traditional models. KNNWithMeans, SVD, and SVDpp from the Surprise library were used to implement the traditional models. KNNWithMeans is a variant of CF and was used as both user-based and item-based CF through the user_based option. Hyperparameters were chosen by grid search to obtain the best performance of each model. For KNNWithMeans, the candidate values of k were 10, 15, and 25, and the candidate similarity measures (name) were pearson_baseline and cosine. For SVD and SVDpp, the candidate learning rates were 0.005, 0.006, and 0.007.
GCMC model. To examine how the model’s performance changes with its hyperparameters, the seed was fixed and a grid search was conducted for hyperparameter tuning. The values searched were as follows: the dimension was 75, 150, or 225; dropout was 0.3, 0.5, or 0.7; and the learning rate was 0.005, 0.01, 0.02, or 0.03.
C. Loss definition in IGMC
IGMC uses a loss that consists of the Mean Square Error (MSE) and the Adjacent Rating Regularization (ARR) loss. The MSE loss is used in general regression-based analysis and adjusts the weights of the model by learning from the squared difference between the predicted and actual values. However, graph neural network models such as GCMC and IGMC compute the final output for each rating level with Softmax and then transform it into a regression form, so if only the MSE loss is used, the model can be trained without reflecting the similarity between rating levels. For example, if ratings range from 1 to 5, scores 1 and 2 are more similar than scores 1 and 5, but this distance between scores would not be reflected in the model.
Equation (1) shows the MSE loss function, which drives training by reducing the difference between the predicted and actual ratings. Equation (2) shows the ARR loss function, which regularizes the weights of adjacent ratings: the closer two rating scores are, the more similar their weight matrices, such as $W_{r_i}$ and $W_{r_{i+1}}$, are encouraged to be. Here, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. Equation (3) is the loss used when only the rating is used in the IGMC method. However, because our model has three types of data, we need to define a new loss that reflects the characteristics of the IGMC model. As sentiment is ordinal data, Equation (4) is defined for our model by applying the ARR loss to two data types, ratings ($L_{ARR_r}$) and sentiments ($L_{ARR_s}$). Because emotion is categorical, it is not included. Compared with the existing IGMC loss, the performance increased slightly, but the effect was not significant.
$$L_{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{1}$$
$$L_{ARR} = \sum_{i=1,2,\ldots,|R|-1}\left\| W_{r_{i+1}} - W_{r_i} \right\|_F^2 \tag{2}$$
$$L_{IGMC} = L_{MSE} + L_{ARR_r} \tag{3}$$
$$L_{IGMC+CX} = L_{MSE} + L_{ARR_r} + L_{ARR_s} \tag{4}$$
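The loss terms can be written out as a small NumPy sketch. The function names and the toy weights are illustrative, and the per-level weight matrices are assumed to be stacked in rating (or sentiment) order along the first axis. The equations as given carry no weighting coefficient on the ARR terms, so none is applied here.

```python
import numpy as np


def mse_loss(y_true, y_pred):
    """Mean squared error between actual and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def arr_loss(weights):
    """Adjacent rating regularization.

    weights: array of shape (num_levels, d_in, d_out), ordered by level.
    Returns the sum of squared Frobenius norms of adjacent differences.
    """
    w = np.asarray(weights, dtype=float)
    diffs = w[1:] - w[:-1]
    return float(np.sum(diffs ** 2))


def igmc_cx_loss(y_true, y_pred, w_rating, w_sentiment):
    """MSE plus ARR on rating weights and on sentiment weights.

    Emotion weights are left out because emotion categories have no order.
    """
    return mse_loss(y_true, y_pred) + arr_loss(w_rating) + arr_loss(w_sentiment)


# Toy weights: three levels of 2 x 2 matrices each.
w_rating = np.stack([np.zeros((2, 2)), np.ones((2, 2)), np.ones((2, 2))])
w_sentiment = np.stack([np.zeros((2, 2)), np.zeros((2, 2)), np.zeros((2, 2))])
print(igmc_cx_loss([1.0, 2.0], [1.0, 4.0], w_rating, w_sentiment))
```

In the toy example only the first pair of adjacent rating matrices differs, so the ARR term equals the squared Frobenius norm of a 2 × 2 matrix of ones.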
D. Results
Through the experiment, we investigated whether the recommendation performance could be improved by using additional relational data, such as CX data, in the GNN method. The results for the Rotten Tomatoes and Amazon data are presented in Table 1 and Table 2, respectively.
Table 1 shows the best RMSE of Rotten Tomatoes data using traditional models and GNN-based models. The conventional and GCMC models used only ratings, while IGMC models used ratings and additional data. In the traditional models, user-based CF produced the worst results, and its performance was lower than item-based CF, supporting the previous observations that item-based CF often outperforms user-based CF [19]. The SVD model performed better than the CF, and the SVD++ model outperformed the other traditional models.
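For reference, the RMSE metric used throughout the tables can be computed as in the sketch below. The function name and the toy values are illustrative and are unrelated to the numbers reported in the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


print(rmse([4.0, 3.5, 1.0], [3.5, 3.5, 2.0]))
```

A lower RMSE means that predicted ratings are closer to the actual ratings on average, with larger errors penalized more heavily than smaller ones.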
As the graph reflects the complex information between users and items, GNN-based models performed better than traditional models such as CF and SVD. Although the IGMC model did not use side information, it outperformed the GCMC model, which supports its status as the current state-of-the-art GNN model. Finally, the IGMC model with Cinematic Experience (CX) outperformed the basic IGMC. According to [6], using side information in IGMC could further improve performance, and our study shows that the model’s performance can indeed be improved with CX information.
Table 2 shows the best RMSE of Amazon movie data using traditional models and GNN-based models. Unlike Rotten Tomatoes’ results, user-based CF performed better than item-based CF, but the difference of RMSE is slight. As observed in the Rotten Tomatoes dataset, the SVD model performed better than the CF, and the SVD++ model outperformed the other traditional models. We observed that the SVD++ model was performing better than GCMC. Although GNN-based models generally perform better than traditional models, we confirmed that traditional models can be effective depending on the characteristics of the data. Yet, the IGMC model with Cinematic Experience (CX) performed better than the basic IGMC.
We attribute the superior outcome to the fact that the graph features interact through various relations. In a GNN, the features of the nodes and edges are transmitted through message passing. As a result, more diverse information is delivered within the model according to the combination of edge types and node features. Moreover, the proposed model extracts features that are not revealed in the rating from the review data by representing them as graphs. In the end, a multi-modal recommendation can achieve improved performance by transforming graphs with different dimensions, such as sentiment and emotion, into embeddings of the same dimension.
E. Ablation study of IGMC model
We conducted an ablation study as shown in Table 3 to discover features that affected the recommendation performance. The aforementioned experiment results showed that the rating-based recommendation approach is limited, and the CX information can be used as boosting features to improve performance.
Since sentiment clearly showed a positive correlation with rating in the experiment, it produced a large performance improvement when combined with rating. Emotion as a whole, on the other hand, did not correlate with rating. Yet, individual emotion types such as sadness and joy did correlate with rating, which contributed to improved performance.
The results of the ablation study also showed better performance when combining sentiments, rather than emotions, with ratings. However, when sentiment and emotion were used together, the performance was the best, implying that they complement one another. In the graph between sentiment and rating, cases of ‘negative review and high rating’ or ‘positive review and low rating’ could somewhat hinder the performance of the recommendation, but such information should be retained to obtain results comparable to those of existing industrial recommender systems. Therefore, when experimenting in such an environment, it is plausible that exploiting both sentiment and emotion produces the best results.
In summary, our model aggregated a multi-relational graph by performing GNN’s operations for each edge type. Latent vectors were used to predict missing ratings, and when combined with CX information, they showed improved results.
F. The impact of parameter
GCMC model. For the final model run, dropout was set to 0.5, the dimension was set to 256, and the learning rate was set to 0.006. However, changing the hyperparameters had no significant effect, because the range of changes in RMSE was small.
Surprise. For both user-based and item-based KNNWithMeans, the performance was the best when k was 25 and name was cosine, suggesting that a higher k helps because more neighbors can be considered. For both SVD and SVDpp, the performance was the best when learning_rate was 0.006.
G. Visualization of subgraph
Inspired by [6], we visualized subgraphs of testing data with the lowest and highest predicted ratings for Rotten Tomatoes, as shown in Figure 8 and Figure 9, respectively. As IGMC produces distinct subgraph patterns for each rating, it is helpful to look closely at those patterns for further investigation. In each subgraph, nodes on the left are users, and nodes on the right are 100 neighbor items that the model referred to when producing the predicted rating for the target item. In Figure 8, the model produced a low score (i.e., 0.73) for the target item, and the items the model referred to tended to have relatively low ratings, colored in blue. Figure 9, on the other hand, visualizes a case with a high predicted score (i.e., 4.68), where the actual ratings of the neighboring nodes were also high, colored in red.

5. Discussion

This study shows that recommendation performance can be improved when information not reflected in the rating is extracted from the review and combined with the rating. In addition, the study is relevant to real-world settings, in that the recommender system industry also requires CX information for personalized recommendation.
One limitation is that only information extracted directly from the review was used as features, so other meaningful features included in the review were not considered. In addition, the use of several graphs increased the time complexity, confirming that both performance and efficiency should be properly considered in a recommender system.
The future work is to embed the review itself to generate vectors with richer information. Recently, graph-based research has drawn attention, and several research cases reflecting CX in the graph are expected to emerge in the future.

6. Conclusions

In this paper, we propose an improved graph-based movie recommender system that uses Cinematic Experience with BERT by building a novel stacking ensemble architecture. We demonstrate that natural language information such as review data can be used in a graph-based recommender system by stacking BERT. Furthermore, our experiments show that Cinematic Experience (CX), manifested via sentiment and emotion, can contribute to improving recommendation performance. Because the graph structure reflects the multi-relational information between users and items, CX enhances performance through a GNN method. Lastly, the multi-relational graphs used in this study can be replaced by other meta-information between users and items, opening up the possibility of extending the proposed model to domains such as shopping and social networks, which rely on user reviews as the primary information on user experience (UX).
The limitation of the proposed model is that only information extracted directly from the review was used as features. Thus, other potentially meaningful features included in the review were not considered. In addition, the use of several graphs increased the time complexity. The trade-off between performance and efficiency should be addressed in a future study. Future work might consider embedding the review itself to generate vectors with richer information. Furthermore, although various features affect the user's CX, such as the viewer's individual memory, we focused on only two representative features to see whether these primary aspects of CX can contribute to improving the performance of recommender systems. In future work, more personalized CX features need to be extracted and analyzed to improve the proposed recommendation model further.
Nonetheless, this study has shown that recommendation performance can be improved by exploiting additional information from the combination of reviews and ratings. It also suggests that, by exploiting various CX information, recommender services can be further improved to provide more personalized experiences to their users.

Author Contributions

Conceptualization, C.L. and K.H.; methodology, C.L., D.H., and K.H.; software, C.L. and D.H.; validation, K.H. and M.Y.; investigation, M.Y.; resources, K.H. and M.Y.; data curation, D.H.; writing-original draft preparation, C.L. and K.H.; writing-review and editing, K.H. and M.Y.; visualization, C.L. and D.H.; supervision, K.H. and M.Y.; project administration, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ullah, R.; Zeb, A.; Kim, W. The impact of emotions on the helpfulness of movie reviews. J. Appl. Res. Technol. 2015, 13, 359–363.
  2. Kuhn, A. Cinematic experience, film space, and the child’s world. Can. J. Film. Stud. 2010, 19, 82–98.
  3. Lim, H.; Kim, H.J. Tensor-based tag emotion aware recommendation with probabilistic ranking. KSII Trans. Internet Inf. Syst. (TIIS) 2019, 13, 5826–5841.
  4. Kumar, S.; De, K.; Roy, P.P. Movie recommendation system using sentiment analysis from microblogging data. IEEE Trans. Comput. Soc. Syst. 2020, 7, 915–923.
  5. van den Berg, R.; Kipf, T.N.; Welling, M. Graph convolutional matrix completion. arXiv 2017, arXiv:1706.02263.
  6. Zhang, M.; Chen, Y. Inductive matrix completion based on graph neural networks. arXiv 2019, arXiv:1904.12058.
  7. Shen, W.; Zhang, C.; Tian, Y.; Zeng, L.; He, X.; Dou, W.; Xu, X. Inductive Matrix Completion Using Graph Autoencoder. arXiv 2021, arXiv:2108.11124.
  8. Ma, X.; Dong, L.; Wang, Y.; Li, Y.; Sun, M. AIRC: Attentive Implicit Relation Recommendation Incorporating Content Information for Bipartite Graphs. Mathematics 2020, 8, 2132.
  9. He, R.; McAuley, J. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In Proceedings of the 25th International Conference on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; pp. 507–517.
  10. McAuley, J.; Targett, C.; Shi, Q.; Van Den Hengel, A. Image-based recommendations on styles and substitutes. In Proceedings of the 38th international ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, 9–13 August 2015; pp. 43–52.
  11. Hussien, M.M.; Helmy, K.M.; Hasan, I.M. Recommender Systems Challenges and Solutions Survey. In Proceedings of the 2019 International Conference on Innovative Trends in Computer Engineering (ITCE), Aswan, Egypt, 2–4 February 2019; pp. 149–155.
  12. Shuai, Z.; Lina, Y.; Aixin, S.; Yi, T. Deep Learning Based Recommender System: A Survey and New Perspectives. ACM Comput. Surv. 2019, 52, 1–38.
  13. Cuizon, J.C.; Giovanni, A.C. Sentiment Analysis for Review Rating Prediction in a Travel Journal. In Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, Association for Computing Machinery, NLPIR, Seoul, Korea, 18–20 December 2020; pp. 70–74.
  14. Bahar, G.; Necva, B.; Ayça, T.; Burcu, C. Neural Sentiment Analysis of User Reviews to Predict User Ratings. In Proceedings of the 2019 4th International Conference on Computer Science and Engineering (UBMK), Samsun, Turkey, 11–15 September 2019; pp. 629–634.
  15. Han, K.; Yi, M.Y.; Kim, J. Search Personalization in Folksonomy by Exploiting Multiple and Temporal Aspects of User Profiles. IEEE Access 2019, 7, 95610–95619.
  16. Qiwei, C.; Huan, Z.; Wei, L.; Pipei, H.; Wenwu, O. Behavior Sequence Transformer for E-Commerce Recommendation in Alibaba. In Proceedings of the 1st International Workshop on Deep Learning Practice for High-Dimensional Sparse Data (DLP-KDD), Anchorage, AK, USA, 4 August 2019; pp. 1–4.
  17. Ziani, A.; Azizi, N.; Schwab, D.; Aldwairi, M.; Chekkai, N.; Zenakhra, D.; Cheriguene, S. Recommender system through sentiment analysis. In Proceedings of the 2nd International Conference on Automatic Control, Telecommunications and Signals, Annaba, Algeria, 11–12 December 2017; pp. 1–7.
  18. Yeon Park, H.; Jae Kim, K. Recommender system using BERT sentiment analysis. J. Intell. Inf. Syst. 2021, 27, 1–15.
  19. Hyun, J.; Ryu, S.; Lee, S.Y.T. How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores. J. Intell. Inf. Syst. 2019, 25, 219–239.
  20. Vincenzo, M.; Antonio, P.; Sperlí Giancarlo, J. An Emotional Recommender System for Music. IEEE Intell. Syst. 2021, 36, 57–68.
  21. Asani, E.; Vahdat-Nejad, H.; Sadri, J. Restaurant recommender system based on sentiment analysis. Mach. Learn. Appl. 2021, 6, 100114.
  22. Sewar, K.; Amjed, A.A.-M. A Book Recommender System Using Collaborative Filtering Method. In Proceedings of the International Conference on Data Science, E-Learning and Information Systems (DATA), Ma’an, Jordan, 5–7 April 2021; pp. 131–135.
  23. Chen, X.; Li, L.; Pan, W.; Ming, Z. A Survey on Heterogeneous One-Class Collaborative Filtering. ACM Trans. Inf. Syst. 2020, 38, 1–54.
  24. Koren, Y.; Bell, R.; Volinsky, C. Matrix factorization techniques for recommender systems. Computer 2009, 42, 30–37.
  25. Raúl, L.-C.; Ángel, G.-P.; Fernando, O. Deep Matrix Factorization Approach for Collaborative Filtering Recommender Systems. Appl. Sci. 2020, 10, 4926.
  26. Wall, M.E.; Rechtsteiner, A.; Rocha, L.M. Singular value decomposition and principal component analysis. In A Practical Approach to Microarray Data Analysis; Springer: Berlin/Heidelberg, Germany, 2003; pp. 91–109.
  27. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T. Neural Graph Collaborative Filtering. In Proceedings of the 42nd International ACM SIGIR Conference on SIGIR’19: Research and Development in Information Retrieval, Paris, France, 21–28 January 2019; pp. 165–174.
  28. Zhang, L.; Li, J.; Zhang, Q.; Meng, F.; Teng, W. Domain knowledge-based link prediction in customer-product bipartite graph for product recommendation. Int. J. Inf. Technol. Decis. Mak. 2019, 18, 311–338.
  29. Shi, M.; Zhuang, Y.; Tang, Y.; Lin, M.; Zhu, X.; Liu, J. Web Service Network Embedding based on Link Prediction and Convolutional Learning. IEEE Trans. Serv. Comput. 2021.
  30. Babak, M.B.; Nasseh, T. Customer Reviews Analysis With Deep Neural Networks for E-Commerce Recommender Systems. IEEE Access 2019, 7, 119121–119130.
  31. Fu, W.; Peng, Z.; Wang, S.; Xu, Y.; Li, J. Deeply Fusing Reviews and Contents for Cold Start Users in Cross-Domain Recommendation Systems. Proc. AAAI Conf. Artif. Intell. 2019, 33, 94–101.
  32. Chen, C.; Zhang, M.; Ma, W.; Liu, Y.; Ma, S. Jointly Non-Sampling Learning for Knowledge Graph Enhanced Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 January 2020; pp. 189–198.
  33. Wu, C.; Wu, F.; Qi, T.; Huang, Y.; Xie, X. Fastformer: Additive Attention Can Be All You Need. arXiv 2021, arXiv:2108.09084.
Figure 1. Link prediction process of IGMC.
Figure 2. Overview of the proposed framework.
Figure 3. Aggregated subgraph embedding reflecting Cinematic Experience.
Figure 4. Long-tail graph in user dataset of Rotten Tomatoes.
Figure 5. Long-tail graph in user dataset of Amazon.
Figure 6. Relation between sentiment and rating.
Figure 7. Relation between emotion and rating.
Figure 8. A visualized subgraph when the predicted score is low.
Figure 9. A visualized subgraph when the predicted score is high.
Table 1. RMSE Table of Rotten Tomatoes data.
| Model Type | Best RMSE |
|---|---|
| CF (user-based) | 0.9233 |
| CF (item-based) | 0.8818 |
| SVD | 0.8425 |
| SVD++ | 0.8328 |
| GCMC | 0.8135 |
| IGMC | 0.8051 |
| IGMC + CX (ours) | 0.8004 |
Table 2. RMSE Table of Amazon movie data.
| Model Type | Best RMSE |
|---|---|
| CF (item-based) | 1.0840 |
| CF (user-based) | 1.0528 |
| SVD | 0.9872 |
| GCMC | 0.9852 |
| SVD++ | 0.9798 |
| IGMC | 0.9657 |
| IGMC + CX (ours) | 0.9621 |
Table 3. RMSE Table for Cinematic Experience’s influence.
| Model Type | Best RMSE |
|---|---|
| Rating | 0.8051 |
| Rating, Sentiment | 0.8029 |
| Rating, Emotion | 0.8036 |
| Rating, Sentiment, Emotion (ours) | 0.8004 |
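As a reading aid for Table 3, the RMSE values can be converted into relative reductions with respect to the rating-only model. The short sketch below uses only the figures reported in the table; the helper name relative_reduction is an assumption introduced for illustration.

```python
# Best RMSE values copied from Table 3.
rmse = {
    "Rating": 0.8051,
    "Rating, Emotion": 0.8036,
    "Rating, Sentiment": 0.8029,
    "Rating, Sentiment, Emotion": 0.8004,
}

def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline

baseline = rmse["Rating"]
reductions = {name: relative_reduction(baseline, value)
              for name, value in rmse.items() if name != "Rating"}

for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}% lower RMSE than the rating-only model")
```

Each reduction is well below one percent, with the largest obtained when sentiment and emotion are used together.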
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lee, C.; Han, D.; Han, K.; Yi, M. Improving Graph-Based Movie Recommender System Using Cinematic Experience. Appl. Sci. 2022, 12, 1493. https://doi.org/10.3390/app12031493

AMA Style

Lee C, Han D, Han K, Yi M. Improving Graph-Based Movie Recommender System Using Cinematic Experience. Applied Sciences. 2022; 12(3):1493. https://doi.org/10.3390/app12031493

Chicago/Turabian Style

Lee, CheonSol, DongHee Han, Keejun Han, and Mun Yi. 2022. "Improving Graph-Based Movie Recommender System Using Cinematic Experience" Applied Sciences 12, no. 3: 1493. https://doi.org/10.3390/app12031493

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
