Mathematics
  • Article
  • Open Access

5 January 2023

Improving Data Sparsity in Recommender Systems Using Matrix Regeneration with Item Features

1
Department of Computer Science, Gyeongsang National University, Jinju-si 52828, Republic of Korea
2
Manager S/W Development Wellxecon Corp., Seoul 06168, Republic of Korea
3
Department of Computer Science, Yonsei University, Seoul 03722, Republic of Korea
4
Department of Computer Science and Engineering, Kangwon National University, Chuncheon 24341, Republic of Korea
This article belongs to the Section E1: Mathematics and Computer Science

Abstract

With the development of the Web, users spend more time accessing information that they seek. As a result, recommendation systems have emerged to provide users with preferred contents by filtering abundant information, along with providing means of exposing search results to users more effectively. These recommendation systems operate based on the user reactions to items or on the various user or item features. It is known that recommendation results based on sparse datasets are less reliable because recommender systems operate according to user responses. Thus, we propose a method to reduce the dataset sparsity and increase the accuracy of the prediction results by using item features together with user responses. A method based on the content-based filtering concept is proposed to extract category rates from the user–item matrix according to the user preferences and to organize these into vectors. Thereafter, we present a method to filter the user–item matrix using the extracted vectors and to regenerate the input matrix for collaborative filtering (CF). We compare the prediction results of our approach and conventional CF using the mean absolute error and root mean square error. Moreover, we calculate the sparsity of the regenerated matrix and the existing input matrix, and demonstrate that the regenerated matrix is denser than the existing one. By computing the Jaccard similarity between the item sets in the regenerated and existing matrices, we verify that the matrices are distinct. The results confirm that if the regenerated matrix is used as the CF input, a denser matrix with higher predictive accuracy can be constructed than when using conventional methods. The validity of the proposed method was verified by analyzing the effect of an input matrix composed of high average ratings on the CF prediction performance. The low sparsity and high prediction accuracy of the proposed method are verified by comparisons with the results of conventional methods. Improvements of approximately 16% based on K-nearest neighbor and 15% based on singular value decomposition, and a threefold improvement in the sparsity of the regenerated matrix relative to the original matrix, are obtained. We propose a matrix reconstruction method that can improve the performance of recommendations.

1. Introduction

With the development of the Web and the extensive use of various smart devices, people can provide substantial information to the Web in real-time, while simultaneously consuming information. Users who access the Web with the main focus on information consumption face large amounts of available information. The information to which users are exposed contains not only the information they seek, but also spam or a lot of information that they do not want. As users are spending increasing time accessing information they seek on the Web, recommender systems have emerged to provide users with preferred content by filtering large amounts of information, along with providing means of exposing the search results to users more effectively.
Recommender systems generally operate based on collaborative filtering (CF) and content-based filtering (CBF) [1,2,3,4]. CF operates according to memory-based and model-based methods [1,4,5]. Both methods use the user–item matrix, which is a matrix that indicates the preference information evaluated by the user for the item [1,4,5]. The memory-based method first calculates similar users or items in the user–item matrix and applies similar user or item information to predict the user preferences for items to determine recommendation lists [1]. Matrix factorization (MF) is a typical example of the model-based approach [5]. The MF method consists of factorizing the user–item matrix and learning the user propensity based on the decomposed matrix to derive the predictive preference. CBF is a method that involves classifying and recommending users or items by analyzing the user demographic information or item features [4,6,7]. Numerous types of methods exist for CBF, as the available information differs depending on the domain.
Recommender systems have been proposed with a variety of approaches, using deep learning as well as MF. First, the neural collaborative filtering (NCF) model extends MF into a deep neural network [8]. Other methods use autoencoders and item2vec [9,10,11]. These approaches leverage dimension reduction or embedding to produce recommendation results. Since deep learning requires embedding the user preferences or item information, it has been suggested to embed the input matrix as well as the item features or the similarity of the user and item [12,13]. In addition, there are studies that use the autoencoder structure to improve top-k recommendation performance, sequential recommendation, or learning of the recommendation structure [14,15,16]. However, these deep neural network-based recommendation models do not always provide more precise recommendation results than MF in all situations since the models have a non-linear structure [17].
Recommendation systems offer the advantage of providing appropriate information to users rapidly. However, several disadvantages exist. First, because CF-based recommender systems operate based on information regarding the user using a specific item, such as a user–item matrix, the cold-start problem may exist, which decreases the recommendation reliability when little or no information regarding the user or item is available [18,19]. Moreover, magic-barrier issues may arise when predicting the user preferences based on numerical information, which make it difficult to reflect 100% of the user preferences [20,21]. Accuracy problems are a field of continuous research in recommender systems [1,5,22,23,24]. Problems such as cold-start do not exist in certain CBF approaches as they are not based on the user action history for items such as preferences. However, because CBF operates based on metadata, the recommendation accuracy cannot be guaranteed. That is, the extraction of significant features from various metadata requires a more complex process than CF and it is difficult to ensure the reliability of the predicted user preferences based on the results [4,25,26,27]. We use the advantages of CBF to extract the user preference data and propose a method that can improve existing CF. Thus, we derive a means of improving the existing recommendation accuracy by applying the CBF perspective in CF.
We propose a method for reducing the sparsity of the user–item matrix and increasing the prediction accuracy by using the features of the items selected by the users. For this purpose, we apply a hybrid method incorporating CBF and CF. The existing CF method extracts the user information from the user–item matrix (similar users or items for the memory-based approach and latent factors for the model-based approach) and derives the prediction results from it. Before applying the existing user–item matrix to the CF as input, we regenerate the matrix based on item categories.
To achieve this, we first extract the category ratio of the selected items for each user. Thereafter, we apply the extracted information, namely the category ratio, as a filter for the user–item matrix and regenerate the matrix. Therefore, the existing user–item matrix is regenerated based on the category ratio. This regenerated matrix (RM) is assumed to be more user appropriate than the conventional matrix and is applied to the CF. The results are compared with those obtained through the existing matrix. Moreover, we test the sparsity of the regenerated and original matrices, and demonstrate the improvement in the sparsity and accuracy in our approaches. We use the MovieLens dataset (https://grouplens.org/datasets/movielens/, accessed on 26 November 2020) and consider the genre information of the movie as category information in the experiments.
We verify the significance of the RM through various experiments, thereby demonstrating that our approaches are superior compared to the recommendation results derived by conventional CF. Our main research questions can be summarized as follows:
  • Can we reconstruct the original input for collaborative filtering into a denser matrix using user preferences and item features?
  • Can the reconstructed matrix alleviate the sparsity problem of the original input?
  • Are the results derived through the reconstructed matrix based on various collaborative filtering approaches as accurate as the results of the original matrix?
The remainder of this paper is organized as follows. Related works are introduced in Section 2. The proposed algorithm is presented in Section 3. The experiments and results are detailed in Section 4, and Section 5 provides concluding remarks.

3. Our Approach

Prior to applying the original matrix (OM) to collaborative filtering (CF), we filter the OM based on the item category information and regenerate the matrix into a more suitable form for the user. In the conventional method, the OM is applied to CF as input and the prediction results are derived. Figure 1 presents the differences between the conventional CF and proposed method. Compared to the conventional CF, we analyze the user selection propensity and extract a matrix that reflects the user preferences from the OM.
Figure 1. Differences between conventional CF and our approach.
We first extract the category ratio of the selected items by user and regenerate the OM based on the category percentage. The regenerated matrix (RM) is assumed to be more user-appropriate than the OM and is applied to the CF. We conduct experiments using the MovieLens database, and consider the genres that exist in movie information as the category information. Accordingly, we take the movie database as an example to explain the proposed method.

3.1. Database

We employ the MovieLens dataset (https://grouplens.org/datasets/movielens/, accessed on 26 November 2020), as indicated in Table 1, which comprises 9125 movies and 671 users. The movie database provides genre information as an item feature. All movies in the database have at least one genre and each movie has a genre combination. For example, the genres for “Toy Story” are classified as Animation, Children’s, and Comedy. Table 2 presents the 18 genres of the database.
Table 1. MovieLens database.
Table 2. 18 genres.

3.2. Matrix Regeneration Based on User Preference Filter

Recommendation systems generally use the user–item matrix as an input. The user–item matrix consists of item ratings evaluated by the user. Let $U = \{u_1, u_2, \ldots, u_n\}$ be the set of users and $I = \{i_1, i_2, \ldots, i_m\}$ be the set of items, where there are $n$ users and $m$ items. Let $R$ be the set of ratings, where $r_{i,j}$ is the rating provided by user $u_i$ for item $i_j$. The user–item matrix is composed of $U$, $I$, and $R$. We consider the user–item matrix as the OM in this study.
We extract the user preference vector (PV) from the OM. Thereafter, we filter the OM using the PV and perform matrix regeneration.

3.2.1. Extracting User PV

To extract the user PVs, we first extract genres from the items evaluated by users in the OM and calculate the percentage of genres that have been evaluated by users. The item is composed of various features, including the year, actor, genre, and country. We select only the genre among these and calculate the user selection ratio. In a movie, a genre is information selected by a group of experts that can serve as a standard for the characteristics of an item, similar to a category in general e-commerce [28]. A total of 18 genres are included in the MovieLens dataset. Thus, the calculated vectors have a total of 18 dimensions, each with a user-selected ratio for each genre. Figure 2 presents the process of extracting the PVs for each user from the OM.
Figure 2. Process of extracting PVs for each user from OM.
In Figure 2, we extract the items that have user preferences, namely ratings, from the OM. For example, in Figure 2, user $u_1$ evaluated a total of $q$ items, from $i_1$ to $i_q$. We count how often each genre appears among these evaluated items, which yields a genre selection frequency vector for each user. Because the database contains 18 genres, each frequency vector has at most 18 non-zero dimensions; a user who never selected a particular genre has fewer. The frequency vector of each user is subsequently converted to percentages (percentile normalization) to obtain the user PV.
In Figure 2, assume that users $u_1$ and $u_2$ have the same frequency for genre $G_1$. The value of $G_1$ may nevertheless differ in each user PV. For instance, if both users selected $G_1$ 10 times but the total numbers of genre selections made by $u_1$ and $u_2$ are 100 and 20, respectively, then $u_1$ has a preference of 10% for $G_1$, whereas $u_2$ has a preference of 50% for $G_1$.
Normalization is applied to the frequency vectors because, depending on the total number of selected genres, the preference ratio may vary for each user. We derive the PVs taking into account the proportion of the selection-based preferences of these users. Therefore, the preferences can be considered as a percentage of the genre selected by the user.
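The PV extraction described above can be sketched as follows. This is a minimal illustration, assuming that each rated movie contributes one count to every genre it carries and that the normalization divides each count by the user's total; the function name `preference_vector` and the toy movies are hypothetical and are not taken from the paper or from MovieLens.

```python
from collections import Counter


def preference_vector(rated_items, item_genres, genres):
    """Return the share of each genre among the genres of a user's rated items."""
    # Count every genre occurrence over the items the user has rated.
    counts = Counter(g for item in rated_items for g in item_genres[item])
    total = sum(counts.values())
    if total == 0:
        # A user without ratings has no measurable preference (assumption).
        return {g: 0.0 for g in genres}
    # Normalize the frequencies so that the vector sums to 1.
    return {g: counts[g] / total for g in genres}


# Toy data (hypothetical).
genres = ["Animation", "Comedy", "Drama"]
item_genres = {
    "m1": ["Animation", "Comedy"],
    "m2": ["Comedy"],
    "m3": ["Drama"],
}
pv_u1 = preference_vector(["m1", "m2", "m3"], item_genres, genres)
pv_u2 = preference_vector(["m1"], item_genres, genres)
```

Both toy users selected Animation once, yet the resulting shares differ (0.25 for `pv_u1` and 0.5 for `pv_u2`), which mirrors the argument for normalizing the frequency vectors.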

3.2.2. Matrix Regeneration Using PV

We calculate $GlobalPV$ based on the PVs extracted for each user. $GlobalPV$ is derived by averaging the PVs over all users. Equation (1) gives the average rate for a genre in the PV.
$$AvgG_n = \frac{\sum_{v_i \in U} v_i}{|U|}, \qquad (1)$$
where $G_n$ is the $n$th genre in the dataset, $U$ is the set of users, and $v_i$ is the preference rate of user $i$ for genre $G_n$. Therefore, the result of Equation (1) is the average of the preference rates of all users for genre $G_n$.
We apply Equation (1) to all genres (from $G_1$ to $G_{18}$) and derive $GlobalPV$. Figure 3 presents the process for deriving $GlobalPV$.
Figure 3. Process for deriving $GlobalPV$.
In Figure 3, a genre preference ratio exists for each user. We derive the average of each column. That is, the average of each genre preference ratio from $G_1$ to $G_{18}$ is calculated over the user PVs obtained as depicted in Figure 2 to derive $GlobalPV$.
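The column-wise averaging of Equation (1) can be illustrated with a short sketch. It assumes that every user PV is stored as a dictionary over the same genre list; the name `global_pv` and the two-genre toy data are hypothetical.

```python
def global_pv(user_pvs, genres):
    """Average the preference rate of every genre over all users (Equation (1))."""
    n_users = len(user_pvs)
    return {g: sum(pv[g] for pv in user_pvs.values()) / n_users for g in genres}


# Toy PVs for two users and two genres (hypothetical values).
user_pvs = {
    "u1": {"G1": 0.10, "G2": 0.90},
    "u2": {"G1": 0.50, "G2": 0.50},
}
gpv = global_pv(user_pvs, ["G1", "G2"])
```

Here `gpv["G1"]` is approximately 0.30 and `gpv["G2"]` approximately 0.70, so the averaged vector still sums to 1.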
We use $GlobalPV$ to extract a more user-appropriate matrix from the OM. Thus, we consider $GlobalPV$ as a filter and apply it to the OM to regenerate the matrix in which the user preferences are considered. Figure 4 depicts the process of constructing the RM by applying the PV filter to the OM.
Figure 4. Process of constructing RM by applying PV filter to OM.
In Figure 4, for the matrix regeneration, we first classify the items by genre in the OM. Thereafter, we reconstruct the items in the OM based on $GlobalPV$. For example, suppose that there are 100 items in $G_1$ classified from the OM. Assume that the ratio of $G_1$ in $GlobalPV$ is 10%. Based on this ratio, we extract 10 items, which is 10% of the total of 100 items in $G_1$ of the OM. We apply this to every genre up to $G_{18}$, so that the items to which the genre ratios of $GlobalPV$ are applied are extracted from the OM.
When the set of items in the OM is $I$ and the set of items extracted based on $GlobalPV$ from the OM is $I'$, $|I| \geq |I'|$. Suppose that a user $u_a$ has evaluated all items in the OM. When the set of items extracted from the OM based on the PV for $u_a$ is $I_a$, $|I| = |I_a|$. In all cases except this, $|I| > |I'|$.
Thereafter, all users who evaluated the set of items $I'$ are extracted. If the set of users existing in the OM is $U$ and the set of users extracted based on $I'$ from the OM is $U'$, $|U| \geq |U'|$. In the extracted user set, similar to the item set, $|U| = |U'|$ for users who have evaluated all items. For all cases except this, $|U| > |U'|$.
Using the extracted item set $I'$ and the extracted user set $U'$, we can then construct a new matrix $RM$.
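The final construction step can be sketched as below. The sketch assumes that the rating matrix is stored as a dictionary of per-user rating dictionaries and that a user belongs to the extracted user set when at least one of their ratings falls on an extracted item; both choices, and the name `regenerate_matrix`, are assumptions made for illustration.

```python
def regenerate_matrix(ratings, selected_items):
    """Keep only the ratings on selected items; users left without ratings drop out."""
    rm = {}
    for user, user_ratings in ratings.items():
        kept = {item: r for item, r in user_ratings.items() if item in selected_items}
        if kept:  # the user rated at least one extracted item
            rm[user] = kept
    return rm


# Toy OM with three users and three items (hypothetical values).
ratings = {
    "u1": {"i1": 4.0, "i2": 3.0},
    "u2": {"i2": 5.0},
    "u3": {"i3": 2.0},
}
rm = regenerate_matrix(ratings, {"i1", "i2"})
```

In this toy case the regenerated matrix keeps two of the three items and two of the three users, consistent with the regenerated item and user sets being no larger than those of the OM.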

4. Experiments

We applied the regenerated matrix (RM) and original matrix (OM) to collaborative filtering (CF) and analyzed the results. Figure 5 presents the entire experimental process. We do not report the experimental environment because the approaches evaluated here, including CF, are not subject to real-time constraints; instead, we describe the experimental process and report the test results. We first describe the CF approaches used in our experiments and then present and analyze the results obtained with the RM and OM as input to each CF approach.
Figure 5. Experimental process.
We first introduce the CF methods used in the experiment. Thereafter, we present the experimental design and the method used to compare the results. Finally, the results and analyses are provided. We used the mean absolute error (MAE) and root mean square error (RMSE) to verify the accuracy in the experiments. We calculated the sparsity of the input matrices and analyzed the results. Moreover, the differentiation of the results was verified through the Jaccard similarity of the set items in each matrix.

4.1. CF Approaches Used in Experiments

For the experiments, we use conventional CF approaches that are widely applied in real-world services. Conventional CF can be divided into memory-based and model-based methods. The memory-based approach is neighborhood based: it identifies similar users and uses them to derive recommendation results [1]. Model-based methods are latent factor models, typified by matrix factorization models [5]. We applied the RM and OM to both types of methods to compare the results. The following CF approaches were used in the experiments:

4.1.1. K-Nearest Neighbor (KNN) Approaches

KNN approaches [62] measure the similarity between users or items in the OM and select a similar user or item. The selected similar users or items are referred to as neighbors. Cosine similarity [1] or the Pearson correlation coefficient [63] is used to calculate the similarity. The similarity calculations can be carried out on a user or an item basis. In this paper, the process is explained on a user basis.
After selecting a similar user, namely a neighbor, the prediction results are calculated based on the existing ratings of the neighbor. We used the following four prediction methods for the experiments in this study.
  • KNN (Basic): this method obtains the prediction results through the weighted average of the neighbor ratings, for which we use Equation (2).
    $$\hat{r}_{u,i} = \frac{\sum_{v \in N_i^k(u)} sim(u, v) \cdot r_{v,i}}{\sum_{v \in N_i^k(u)} sim(u, v)}, \qquad (2)$$
    where $u$ is a user and $i$ is an item to predict for $u$. Furthermore, $N_i^k(u)$ is the set of the $k$ users most similar to user $u$ that is used when predicting item $i$, $v$ is one of these similar users, $sim(u, v)$ indicates the similarity between users $u$ and $v$, and $r_{v,i}$ is the rating for item $i$ by user $v$.
  • KNN (Means): this method obtains the prediction results by considering the average of the neighbor ratings, for which we use Equation (3).
    $$\hat{r}_{u,i} = \mu(u) + \frac{\sum_{v \in N_i^k(u)} sim(u, v) \cdot (r_{v,i} - \mu(v))}{\sum_{v \in N_i^k(u)} sim(u, v)}, \qquad (3)$$
    where $\mu(u)$ is the average rating of user $u$. The other variables are the same as in Equation (2).
  • KNN (Zscore): this method obtains the prediction results by considering the z-score normalization of the neighbor ratings, for which we use Equation (4).
    $$\hat{r}_{u,i} = \mu(u) + \sigma_u \frac{\sum_{v \in N_i^k(u)} sim(u, v) \cdot (r_{v,i} - \mu(v)) / \sigma_v}{\sum_{v \in N_i^k(u)} sim(u, v)}, \qquad (4)$$
    where $\sigma_u$ and $\sigma_v$ are the standard deviations of the ratings of users $u$ and $v$, respectively. The other variables are the same as in Equation (3).
  • KNN (Baseline): this method is similar to KNN (Means); however, it uses the baseline estimate instead of the user average. For this purpose, we use Equation (5).
    $$\hat{r}_{u,i} = b_{u,i} + \frac{\sum_{v \in N_i^k(u)} sim(u, v) \cdot (r_{v,i} - b_{v,i})}{\sum_{v \in N_i^k(u)} sim(u, v)}, \qquad (5)$$
    where $b_{u,i}$ and $b_{v,i}$ are the baselines of users $u$ and $v$ for item $i$, respectively, and $b_{u,i}$ is defined by Equation (6).
    $$\hat{r}_{u,i} = b_{u,i} = \mu + b_u + b_i, \qquad (6)$$
    where $\mu$ is the global average rating of the OM, and $b_u$ and $b_i$ are the bias (or baseline) for the user and item, respectively.
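The neighbor-based predictors above can be written compactly once the neighbors of a user have been selected. The sketch below assumes that the similarities, ratings, means, and standard deviations of the neighbors are already available as tuples; the function names and toy numbers are hypothetical, and neighbor search itself is omitted.

```python
def knn_basic(neighbors):
    """Equation (2). neighbors: (similarity, rating) pairs of users who rated item i."""
    den = sum(sim for sim, _ in neighbors)
    return sum(sim * r for sim, r in neighbors) / den


def knn_means(mu_u, neighbors):
    """Equation (3). neighbors: (similarity, rating, neighbor mean) triples."""
    den = sum(sim for sim, _, _ in neighbors)
    return mu_u + sum(sim * (r - mu_v) for sim, r, mu_v in neighbors) / den


def knn_zscore(mu_u, sigma_u, neighbors):
    """Equation (4). neighbors: (similarity, rating, neighbor mean, neighbor std)."""
    den = sum(sim for sim, _, _, _ in neighbors)
    num = sum(sim * (r - mu_v) / sd_v for sim, r, mu_v, sd_v in neighbors)
    return mu_u + sigma_u * num / den


# Two equally similar neighbors (hypothetical values).
basic = knn_basic([(0.5, 4.0), (0.5, 2.0)])
means = knn_means(3.0, [(0.5, 4.0, 3.0), (0.5, 2.0, 2.0)])
zscore = knn_zscore(3.0, 2.0, [(0.5, 4.0, 3.0, 1.0), (0.5, 2.0, 2.0, 1.0)])
```

With these inputs the three predictors give 3.0, 3.5, and 4.0, respectively. KNN (Baseline) has the same shape as `knn_means`, with the baselines $b_{u,i}$ and $b_{v,i}$ in place of the user means.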

4.1.2. MF Approaches

MF [64] determines the values of the decomposed matrix through learning in the process of factoring and recombining the user–item matrix; that is, the input matrix. In this case, it is known as a latent factor model, as the intentions of the users or items can be understood in the process of identifying the decomposed matrix. The prediction results are derived through the process of adding these latent factors. We used two MF approaches, namely SVD and non-negative MF (NMF), in this paper. SVD and NMF both use stochastic gradient descent for learning.
  • SVD: this is one methodology of MF and can be expressed in the form $R = U \Sigma V^T$. In this case, $R$ is the input matrix, $U$ is a matrix of size $m \times m$, $\Sigma$ is a matrix of size $m \times n$ whose off-diagonal components are 0, and $V$ is a matrix of size $n \times n$. This constitutes probabilistic MF and the prediction results are derived through Equation (7).
    $$\hat{r}_{u,i} = \mu + b_u + b_i + q_i^T p_u, \qquad (7)$$
    where $\mu$ is the global average rating, $b_u$ and $b_i$ are the bias of the user and item, respectively, and $q_i$ and $p_u$ are the latent vectors for the item and user, respectively.
  • NMF: This method is similar to SVD, and we use Equation (8) to derive the prediction results.
    $$\hat{r}_{u,i} = q_i^T p_u \qquad (8)$$
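Once the latent vectors and biases have been learned, the two prediction rules reduce to a few arithmetic operations. The sketch below shows only the prediction step; the training by stochastic gradient descent is omitted, and the function names and parameter values are hypothetical.

```python
def dot(q, p):
    """Inner product of two latent vectors of equal length."""
    return sum(a * b for a, b in zip(q, p))


def predict_svd(mu, b_u, b_i, q_i, p_u):
    """Equation (7): global mean plus user and item biases plus latent interaction."""
    return mu + b_u + b_i + dot(q_i, p_u)


def predict_nmf(q_i, p_u):
    """Equation (8): latent interaction only."""
    return dot(q_i, p_u)


# Hypothetical learned parameters with two latent factors.
q_i, p_u = [1.0, 0.5], [0.5, 1.0]
svd_hat = predict_svd(3.5, 0.25, -0.5, q_i, p_u)
nmf_hat = predict_nmf(q_i, p_u)
```

For these values the latent interaction is 1.0, so `svd_hat` is 4.25 and `nmf_hat` is 1.0; the difference between the two comes entirely from the mean and bias terms.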

4.2. Experimental Design

We conducted the experiments using the database introduced in Table 1. Three methods were used as the basis for item composition in the process of reconstructing the matrix, as follows:
  • Selection based (RM-1): a method of extracting items in the order of user selection in the matrix reconstruction. Figure 6 presents the selection-based matrix reconstruction.
    Figure 6. Process of selection-based matrix reconstruction.
    For example, in Figure 6, $GlobalPV$ has 10% of $G_1$. Suppose that $G_1$ has a total of 100 items in the OM. Then, we sort the 100 items in descending order based on the user selection frequency. Thereafter, we extract the top 10 items from the sorted items and use them as the items of $G_1$ in the RM.
  • Average based (RM-2): a method of extracting items in the order of the average user ratings in the matrix reconstruction. Figure 7 depicts the average-based matrix reconstruction.
    Figure 7 is similar to Figure 6 except for the item selection process: the $G_1$ items in the OM are sorted in descending order based on the average rating. We applied this process to all genres.
    Figure 7. Process of average-based matrix reconstruction.
  • Random based (RM-3): a method of extracting items randomly according to the ratio of $GlobalPV$ in the matrix reconstruction.
We selected RM-1 and RM-2 as the proposed methods for the RM composition. For comparison testing, RM-3 was added, with random extraction at the same $GlobalPV$ rate used by RM-1 and RM-2. The comparative experiments with RM-3 were intended to demonstrate the significance of RM-1 and RM-2. We conducted the experiments with RM-1, RM-2, and RM-3 as the CF input.
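The three item-selection rules differ only in how the items of a genre are ordered before the top share is taken. The following sketch illustrates this for a single genre; the rounding of the item count, the fixed random seed, and the name `select_items` are assumptions made for the example.

```python
import random


def select_items(genre_items, ratio, strategy, counts=None, means=None, seed=0):
    """Pick round(ratio * len(genre_items)) items of one genre by the given strategy."""
    k = round(ratio * len(genre_items))
    if strategy == "selection":    # RM-1: most frequently rated items first
        ranked = sorted(genre_items, key=lambda i: counts[i], reverse=True)
    elif strategy == "average":    # RM-2: highest average rating first
        ranked = sorted(genre_items, key=lambda i: means[i], reverse=True)
    elif strategy == "random":     # RM-3: random order, same number of items
        ranked = random.Random(seed).sample(genre_items, len(genre_items))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]


# Toy genre with four items (hypothetical statistics).
items = ["a", "b", "c", "d"]
counts = {"a": 10, "b": 40, "c": 20, "d": 30}
means = {"a": 4.5, "b": 3.0, "c": 4.0, "d": 2.5}
rm1 = select_items(items, 0.5, "selection", counts=counts)
rm2 = select_items(items, 0.5, "average", means=means)
rm3 = select_items(items, 0.5, "random")
```

With a 50% ratio, the selection-based rule keeps `b` and `d`, the average-based rule keeps `a` and `c`, and the random rule keeps two arbitrary items, so the three variants can produce different item sets of the same size.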
Moreover, for further comparative experiments, the OM and the following method were used as the CF input.
  • Method for comparative experiments: we extracted a user-level RM ($rm_u$) for each user and concatenated these matrices into $RM_c$, based on which the CF results were derived. The concatenated matrix is the result of adding the items ($I_u$) of the $rm_u$ derived for each user. The set of items in $RM_c$ can therefore be considered the result of combining the genre preferences of each user in the OM. The difference is that $RM$, extracted through the $GlobalPV$ filter, reflects the average genre preferences of the OM users, whereas $RM_c$ results from adding the genre preferences of each user. Figure 8 depicts the process of generating $RM_c$. We used two methods for each user-level $rm_u$, namely the selection-based and average-based methods, as in the case of deriving $RM$.
    Figure 8. Process of generating $RM_c$.
Table 3 summarizes the matrices used as the CF input in the experiments.
Table 3. Matrices used in experiments.
We used the MAE and RMSE to compare the accuracy of the methods. Equations (9) and (10) express the MAE and RMSE, respectively.
$$MAE = \frac{1}{|T|} \sum_{n \in T} |r_n - \hat{r}_n|, \qquad (9)$$
$$RMSE = \sqrt{\frac{1}{|T|} \sum_{n \in T} (r_n - \hat{r}_n)^2}, \qquad (10)$$
where $T$ is the test set of items and $n$ is one of the test items. Furthermore, $r_n$ and $\hat{r}_n$ denote the real rating and predicted rating for item $n$, respectively.
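Both error measures can be computed directly from paired lists of real and predicted ratings. The sketch below is a plain-Python illustration with hypothetical ratings; the function names are chosen for the example.

```python
import math


def mae(actual, predicted):
    """Mean absolute error over the test ratings (Equation (9))."""
    return sum(abs(r - p) for r, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error over the test ratings (Equation (10))."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(actual, predicted)) / len(actual))


# Hypothetical real and predicted ratings for four test items.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.0, 3.0, 4.0, 4.0]
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

The absolute errors here are 1, 0, 1, and 2, so the MAE is 1.0, while the RMSE is the square root of 1.5 (about 1.22); the RMSE is larger because squaring penalizes the single large error more heavily.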
For a more precise and varied analysis, we evaluated each matrix using 10-fold cross-validation [63,65,66] and derived the MAE and RMSE results. In general, k-fold cross-validation provides more reliable experimental results when only a small set of data is available. With 10-fold cross-validation, the same input yields 10 experimental results, each obtained on a different test set, which allows us to derive more experimental results from limited input data.
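The splitting scheme behind k-fold cross-validation can be sketched as follows. This is a generic illustration rather than the exact procedure used in the experiments: it assumes contiguous folds over rating indices and omits the shuffling that is usually applied first; the name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; every sample is used for testing exactly once."""
    indices = list(range(n_samples))
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 25 hypothetical ratings split into 10 folds.
folds = list(k_fold_indices(25, k=10))
```

Each of the 10 folds serves once as the test set while the remaining ratings form the training set, which is how one input matrix yields 10 MAE and RMSE values.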

4.3. Experimental Results

4.3.1. Analysis of Accuracy

Table 4 lists the CF methods used in our experiments and their IDs. These IDs are used in Table 5 and Table 6, which display the results of the 10-fold cross-validation with each CF approach. In the tables, $Fold_n$ indicates each dataset for the 10-fold cross-validation, whereas $Mean$ and $St.dev.$ denote the average and standard deviation of the error (MAE in Table 5 and RMSE in Table 6) over the datasets, respectively.
Table 4. CF method and ID.
Table 5. MAE results for 10-fold cross-validation with each CF approach.
Table 6. RMSE results for 10-fold cross-validation with each CF approach.
It can be observed from Table 5 that the MAE of RM-2 in almost all folds, namely in each dataset, was better than those of the other matrices. RM-1 also exhibited good results among our approaches. In the $Mean$ column, RM-1 and RM-2 exhibited superior results to the OM, which means that our approaches could derive better prediction results. In comparison, the results of RM-3 were less accurate than those of the OM. Thus, the extraction methods of RM-1 and RM-2 were significant. Moreover, RMc-1 and RMc-2 exhibited better performance than the OM, but were worse than RM-1 and RM-2, respectively.
In Table 5, from the perspective of each CF approach, it can be observed that RM-1 and RM-2 still derived more accurate results than the OM. Among the four KNN approaches, RM-2 exhibited the best performance, and the same results were achieved by SVD and NMF, which are MF approaches. Thus, RM-1 and RM-2 yielded higher accuracy in all methodologies than the results derived from the OM as input.
Similar results were demonstrated in the RMSE cases. It can be observed from Table 6 that RM-1 and RM-2 yielded higher accuracy when applying CF compared to the OM. There was no significant difference between RMc-1 and RMc-2, but the overall results of RM-1 and RM-2 were more accurate, respectively. Thus, the prediction results of applying the average ratio to the user genre preferences were more accurate than the prediction results obtained by concatenating each user ratio. Furthermore, it can be observed that the method using the PV filter yielded more accurate results than the OM.
Figure 9 and Figure 10 present the means in Table 5 and Table 6, respectively.
Figure 9. Average MAE of 10-fold cross-validation for each input matrix.
Figure 10. Average RMSE of 10-fold cross-validation for each input matrix.
The results of RM-2 and RMc-2, which selected items based on high average ratings, exhibited very high accuracy. Thus, the following hypothesis can be stated: “If the average of the ratings constituting the matrix, that is, of the $r_{i,j}$ in the matrix, is high, the accuracy of the predictions is high”. To confirm this hypothesis, we computed the average of the $r_{i,j}$ making up RM-1, RM-2, RM-3, and OM. Table 7 displays the averages and standard deviations of the ratings in each matrix.
Table 7. Averages and standard deviations of ratings in each matrix.
According to Table 7, the highest average was provided by RM-2. This is a reasonable result because the items were extracted from the OM in the order of high average ratings when regenerating RM-2. In comparison, the other three methodologies, namely RM-1, RM-3, and OM, did not extract the items in the order of average ratings, so it can be confirmed that these produced relatively lower averages than RM-2.
Based on the low MAE and RMSE, it can be observed that RM-1, which exhibited the best results apart from RM-2, had higher average ratings than the other two methods. In the case of OM, which had the next highest MAE and RMSE, the lowest average value was observed, and RM-3, which exhibited the worst accuracy as per Table 5 and Table 6, resulted in a higher average than the OM. Thus, a high average rating does not guarantee prediction accuracy. Furthermore, Figure 11 and Table 8 present the changes in the average, MAE, and RMSE of the different methodologies based on the OM in percentages. We calculated the change rates using Equation (11).
$$d_n = \frac{|val(RM_n) - val(OM)|}{val(OM)} \times 100, \qquad (11)$$
where $val(RM_n)$ and $val(OM)$ indicate a value, such as the average, MAE, or RMSE, for RM-$n$ and the OM, respectively.
Figure 11. Change percentages for average, MAE, and RMSE of each method (OM-based).
Table 8. Change percentages for average, MAE, and RMSE of each method (OM-based).
In Figure 11, the x-axis and y-axis indicate the comparison criterion and change percentage, respectively. RM-1&OM represents the change of RM-1 based on the OM. For example, when the average rating of the OM was 3.29 and the average rating of RM-1 was 3.51, the change in the average for RM-1&OM was $|3.29 - 3.51| / 3.29 \times 100$, which is approximately 6.7% (it could be rounded to 7%). Similarly, if the average of RM-2 were 3.4, the change in the average for RM-2&OM would be $|3.29 - 3.4| / 3.29 \times 100$, which is approximately 3.3%; in that case, the average of RM-2 would be closer to that of the OM than the average of RM-1 is. The same process was applied to each methodology for the MAE and RMSE. RM-2&OM and RM-3&OM represent the changes of RM-2 and RM-3, respectively, based on the OM.
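The change percentage of Equation (11) can be checked with a few lines of code. The sketch below reuses the averages 3.29, 3.51, and 3.4 from the illustration above; the function name `change_rate` is hypothetical.

```python
def change_rate(val_rm, val_om):
    """Absolute change of an RM value relative to the OM value, in percent (Equation (11))."""
    return abs(val_rm - val_om) / val_om * 100


# Averages taken from the worked example in the text.
d_first = change_rate(3.51, 3.29)   # roughly 6.7
d_second = change_rate(3.4, 3.29)   # roughly 3.3
```

Because the absolute value is taken, the measure reports only the size of the deviation from the OM and not its direction.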
Comparing the change rates of the MAE and RMSE in Figure 11 and Table 8, it can be observed that, in all methodologies, the change rate for the average differed from that for the accuracy. The change rate of the average for RM-1&OM was approximately 7%, whereas the change rates of the MAE and RMSE varied between approximately 1% and 5%. Moreover, the change rate of the average for RM-2&OM was approximately 35%, whereas the change rates of the MAE and RMSE ranged from approximately 9% to 16%. It can be observed from the comparison results that the change in the prediction results was small compared to the change in the average values.
In the case of RM-3&OM, the change rate for the average was approximately 2%, whereas the change rate for the MAE and RMSE varied between approximately 9% and 16%. Accordingly, it can be confirmed that the change rate of the prediction accuracy was greater than that of the average, which means that the difference in the average rating has less of an effect on the accuracy.

4.3.2. Analysis of Sparsity

Our approach can not only improve the recommendation accuracy but also alleviate the data sparsity. Table 9 and Figure 12 present the data sparsity of the OM and RM. In Table 9, user size and item size indicate the number of users and items in each matrix, respectively. Moreover, # of ratings denotes the number of ratings in each matrix, and sparsity indicates the amount of data sparsity in each matrix as the result of Equation (12).
$$SP = \frac{\#\ \text{of ratings}}{\text{user size} \times \text{item size}} \tag{12}$$
Table 9. Data sparsity of OM and RM.
Figure 12. Data sparsity of OM and RM according to the various combinations of method.
It can be observed that the sparsity values of RM-1 and RM-2 were higher than that of the OM. Because Equation (12) gives the fraction of user–item cells that contain a rating, a higher value indicates a denser matrix. This means that RM-1 and RM-2 had denser matrices than the OM; that is, when using these two RMs, we can obtain more ratings and can apply CF based on a matrix with more ratings than the OM. Thus, through RM extraction, we can construct a denser matrix, which can alleviate the sparsity of the OM.
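To make the reading of Equation (12) concrete, the following sketch computes the value for a toy rating dictionary. The data, the function names, and the filtering step (dropping items with a single rating) are illustrative assumptions and are not the matrices or the regeneration rule used in the experiments.

```python
def sp_value(num_ratings: int, user_size: int, item_size: int) -> float:
    """Equation (12): ratings divided by the number of user-item cells."""
    if user_size <= 0 or item_size <= 0:
        raise ValueError("user_size and item_size must be positive")
    return num_ratings / (user_size * item_size)


def sp_from_ratings(ratings: dict) -> float:
    """Compute Equation (12) from a {(user, item): rating} dictionary."""
    users = {user for user, _ in ratings}
    items = {item for _, item in ratings}
    return sp_value(len(ratings), len(users), len(items))


# Toy original matrix: 3 users, 4 items, 6 ratings.
toy_om = {
    ("u1", "i1"): 4.0, ("u1", "i2"): 3.0,
    ("u2", "i1"): 5.0, ("u2", "i3"): 2.0,
    ("u3", "i2"): 4.0, ("u3", "i4"): 1.0,
}
# Toy filtered matrix: items i3 and i4 (one rating each) are dropped.
toy_rm = {key: r for key, r in toy_om.items() if key[1] in {"i1", "i2"}}

print(sp_from_ratings(toy_om))  # 6 / (3 * 4) = 0.5
print(sp_from_ratings(toy_rm))  # 4 / (3 * 2), about 0.667
```

In this toy case the filtered matrix holds fewer ratings in absolute terms but a larger share of filled cells, which is the sense in which a higher value of Equation (12) corresponds to a denser matrix.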

4.3.3. Analysis of Jaccard Similarity

We determined the similarities of the items composing each matrix to verify the differentiation of the results. For the verification, we used the Jaccard similarity, which can be derived according to Equation (13) [67].
$$J(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}, \tag{13}$$
where X and Y indicate sets. The Jaccard similarity is 1 when the two sets are the same and 0 when they have no common element. Therefore, the result of Equation (13) represents the ratio of the elements shared by two sets as a real number between 0 and 1.
For example, suppose that the sets of items in RM-1 and RM-2 are $I_1$ and $I_2$, respectively. If the elements of both sets are the same, the Jaccard similarity is 1; if all elements are different, it is 0.
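Equation (13) maps directly onto Python set operations. The sketch below uses made-up item identifiers; returning 1.0 for two empty sets is a convention we assume here, since the equation itself is undefined in that case.

```python
def jaccard(x: set, y: set) -> float:
    """Jaccard similarity of two sets, as in Equation (13)."""
    union = x | y
    if not union:
        return 1.0  # assumed convention: two empty sets are treated as identical
    return len(x & y) / len(union)


# Hypothetical item sets standing in for the item lists of two matrices.
items_1 = {"a", "b", "c"}
items_2 = {"c", "d"}

print(jaccard(items_1, items_2))     # one shared item out of four -> 0.25
print(jaccard(items_1, items_1))     # identical sets -> 1.0
print(jaccard(items_1, {"x", "y"}))  # no common element -> 0.0
```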
Table 10 presents the results of the Jaccard similarity between the item sets of each matrix. It can be observed that the Jaccard similarity between RM-1 and RM-2, which had higher accuracy than the OM, was approximately 0.087. This means that the item lists of the two matrices shared approximately 9% of their items. RM-1 and RM-2 exhibited superior performance over the OM, and Table 10 indicates that the ratio of items shared by these two matrices was actually smaller than that of the other pairs. Therefore, it can be considered that the results obtained through the two matrices were not derived from matrices with similar contents.
Table 10. Jaccard similarity between item lists in each matrix.
In conclusion, the high prediction accuracy and low sparsity of our approach were verified by comparisons with the OM results. The proposed method improved the prediction accuracy by approximately 16% and 15% for KNN and SVD, respectively, and yielded a threefold improvement in sparsity for RM-1&OM. Although our approach can improve existing methods by utilizing regenerated input, an input matrix cannot be regenerated for a domain that lacks metadata or users’ reaction information. Therefore, our experimental results can be derived for domains where both users’ reaction data and metadata exist.

5. Conclusions

Recommendation systems operate based on the various user reactions to items. As such systems operate according to user responses, problems exist whereby recommendations are difficult to apply for new or less responsive items. Moreover, it is known that recommendations based on sparse datasets are less reliable.
Thus, we have proposed improving the sparseness and increasing the accuracy of the user–item matrix by using item features selected by users. Based on the content-based filtering (CBF) concept, the collaborative filtering (CF) input matrix was regenerated from the original user–item matrix by using item features, such as the category. That is, prior to applying the original user–item matrix to CF, we regenerated the matrix as the CF input.
We first extracted the category ratios of the items selected by each user. We then proposed a method for regenerating the original user–item matrix based on the extracted ratios. We assumed that the regenerated matrix (RM) reflects the user preferences better than the original matrix (OM) and applied it to CF.
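As a rough illustration of these two steps, the sketch below builds a per-user category-rate vector and then keeps only the ratings whose item category reaches a minimum rate for that user. It is a simplified reading and not the exact procedure of this paper: the single category per item, the `min_rate` threshold, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict


def preference_vectors(ratings: dict, item_category: dict) -> dict:
    """Return {user: {category: share of the user's rated items in that category}}."""
    counts = defaultdict(Counter)
    for user, item in ratings:
        counts[user][item_category[item]] += 1
    vectors = {}
    for user, counter in counts.items():
        total = sum(counter.values())
        vectors[user] = {cat: n / total for cat, n in counter.items()}
    return vectors


def regenerate(ratings: dict, item_category: dict, min_rate: float) -> dict:
    """Keep a rating only if its item's category rate for that user is >= min_rate.

    The threshold rule is an assumption made for this sketch.
    """
    vectors = preference_vectors(ratings, item_category)
    return {
        (user, item): rating
        for (user, item), rating in ratings.items()
        if vectors[user][item_category[item]] >= min_rate
    }


# Toy data: one category per item (a simplification).
item_category = {"i1": "Action", "i2": "Action", "i3": "Drama", "i4": "Comedy"}
ratings = {
    ("u1", "i1"): 5.0, ("u1", "i2"): 4.0, ("u1", "i3"): 2.0,
    ("u2", "i1"): 3.0, ("u2", "i4"): 4.0,
}

regenerated = regenerate(ratings, item_category, min_rate=0.5)
print(sorted(regenerated))  # u1's Drama rating is filtered out
```

The output keeps the same {(user, item): rating} structure as the input, so it can be passed to a CF algorithm in place of the original matrix.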
Our contributions can be divided into academic and industrial ones. The academic contributions, which address our research question, are summarized as follows:
  • We have proposed a novel approach that can regenerate the input from the OM for CF by constructing user PVs based on category selection rates and filtering the OM through user PVs.
  • The accuracy was verified by applying the regenerated input matrices to a total of six CF approaches. The prediction accuracy of the proposed method was verified through comparative experiments using the OM as input.
  • We have demonstrated that the results obtained by our approach are more precise than those of conventional CF approaches.
  • The low sparsity and high prediction accuracy of the proposed method were verified by comparisons with the OM results. (Improvements of approximately 16% based on KNN (MAE) and 15% based on SVD (MAE), and a three times improvement in the sparsity based on RM-1&OM were obtained.)
Recommender systems based on collaborative filtering approaches are currently deployed in various web services, such as Amazon and Netflix [1,2,5]. These approaches utilize the user–item rating matrix as input to generate recommendation results. For this reason, if we can construct a matrix with the same shape as the input matrix, we can utilize the constructed matrix as the input of the collaborative filtering approaches.
In our approach, we have reconstructed the input with the same structure by utilizing the original input matrix. That is, if the original matrix has users as rows and items as columns, the reconstructed matrix has the same row and column layout. For this reason, our approach can easily be applied to conventional collaborative filtering approaches. The industrial contributions of our approach are summarized as follows:
  • Most e-commerce and media content recommendation systems suffer from sparsity problems in their input data. The matrix reconstruction scheme proposed in this paper can alleviate these sparsity problems for real inputs.
  • Furthermore, based on the input matrix with reduced sparsity, we can derive a higher prediction accuracy than with the existing input with respect to the average ratings.
  • Through our approach, it is possible to provide more reliable recommendation results to online service users with less input.
  • In addition, online service providers can build more reliable recommender systems based on less data.
We regenerated the matrix using the number of selections and average ratings of the items. The results were compared using the random-based RM and OM as inputs for the CF. We tested our approach using the MAE and RMSE. Moreover, we confirmed that the RM produced higher recommendation accuracy than the results obtained through the OM.
The sparsity of each matrix was calculated and the proposed matrix was verified to be denser than the OM. The differences in the items contained in the RM were demonstrated by calculating the Jaccard similarity of the set of items in each matrix. On this basis, we verified the differentiation of the RM derived from each methodology, and, finally, a method for constructing a denser input matrix, as well as a method for deriving high accuracy were presented.
We have proposed and simulated our approach based on the MovieLens dataset; however, the regeneration method can also be applied to other domains such as music, books, and e-commerce items. Namely, if a domain has category information and users’ reactions to its items, we can regenerate the input matrix, since our approach relies on item features. Thus, our approach can potentially be applied to various types of domains that have item features such as category information.
In future work, we will apply this approach to more diverse databases with various item features to analyze the results. Furthermore, we will apply the user PVs obtained through the item features to cross-domain recommender systems to verify the usability.
In addition, as deep learning-based recommender system studies progress, embedding approaches have been proposed in various ways using autoencoders or item2vec. We have introduced a data preprocessing process based on item features that can be used with deep learning recommendation models. In other words, we have suggested a method of regenerating the input matrix into a denser form by using the item features. The regenerated matrix can be used as an input to various deep learning recommendation models.

Author Contributions

Conceptualization, S.-M.C.; Methodology, S.-M.C., D.L., C.P. and S.L.; Software, S.-M.C., D.L. and K.J.; Formal analysis, S.-M.C. and D.L.; Investigation, S.-M.C., D.L. and K.J.; Data curation, S.-M.C.; Writing–original draft, S.-M.C. and C.P.; Supervision, S.L.; Project administration, C.P. and S.L.; Funding acquisition, S.-M.C., C.P. and S.L.; All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by Regional Innovation Strategy through the National Research Foundation of Korea funded by the Ministry of Education (grant number: 2021RIS-003) and this work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2022-00165785). Also, this study was supported by 2022 Research Grant from Kangwon National University and this research was supported by "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE)(2022RIS-005).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sarwar, B.M.; Karypis, G.; Konstan, J.A.; Riedl, J. Item-based Collaborative Filtering Recommendation Algorithms. In Proceedings of the 10th International World Wide Web Conference (WWW ’01), Hong Kong, China, 1–5 May 2001; pp. 285–295. [Google Scholar]
  2. Herlocker, J.L.; Konstan, J.; Borchers, A.; Riedl, J. An Algorithmic Framework for Performing Collaborative Filtering. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’99), Berkeley, CA, USA, 15–19 August 1999; pp. 230–237. [Google Scholar]
  3. Tkalcic, M.; Odic, A.; Kosir, A.; Tasic, J.F. Affective Labeling in a Content-Based Recommender System for Images. IEEE Trans. Multimedia 2013, 15, 391–400. [Google Scholar] [CrossRef]
  4. Ricci, F.; Rokach, L.; Shapira, B.; Kantor, P.B. Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  5. Koren, Y.; Bell, R.M.; Volinsky, C. Matrix Factorization Techniques for Recommender Systems. IEEE Comput. 2009, 42, 30–37. [Google Scholar] [CrossRef]
  6. de Campos, L.; Fernández-Luna, J.; Huete, J.; Rueda-Morales, M. Combining content-based and collaborative recommendations: A hybrid approach based on Bayesian networks. Int. J. Approx. Reason. 2010, 51, 785–799. [Google Scholar] [CrossRef]
  7. Çano, E.; Morisio, M. Hybrid Recommender Systems: A Systematic Literature Review. Intell. Data Anal. 2017, 21, 1487–1524. [CrossRef]
  8. He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web (WWW 17), Perth, Australia, 3–7 April 2017; pp. 173–182. [Google Scholar]
  9. Liang, D.; Krishnan, R.G. Variational Autoencoders for Collaborative Filtering. In Proceedings of the 2018 World Wide Web Conference (WWW 18), Lyon, France, 23–27 April 2018; pp. 689–698. [Google Scholar]
  10. Duong, T.N.; Vuong, T.A.; Nguyen, D.M.; Dang, Q.H. Utilizing an Autoencoder-Generated Item Representation in Hybrid Recommendation System. IEEE Access 2020, 8, 75094–75104. [Google Scholar] [CrossRef]
  11. Barkan, O.; Koenigstein, N. Item2Vec: Neural Item Embedding for Collaborative Filtering. CoRR 2016, abs/1603.04259. Available online: https://arxiv.org/abs/1603.04259 (accessed on 20 February 2017).
  12. Chen, C.; Wang, C.; Tsai, M.; Yang, Y. Collaborative Similarity Embedding for Recommender Systems. In Proceedings of the World Wide Web Conference (WWW 2019), Thessaloniki, Greece, 14–17 October 2019; pp. 2637–2643. [Google Scholar]
  13. Zhao, X.; Liu, H.; Liu, H.; Tang, J.; Guo, W.; Shi, J.; Wang, S.; Gao, H.; Long, B. AutoDim: Field-aware Embedding Dimension Search in Recommender Systems. In Proceedings of the WWW ’21: The Web Conference 2021, Virtual, 12–23 April 2021; pp. 3015–3022. [Google Scholar]
  14. Zhu, Z.; Wang, J.; Caverlee, J. Improving Top-K Recommendation via Joint Collaborative Autoencoders. In Proceedings of the World Wide Web Conference (WWW 2019), San Francisco, CA, USA, 13–17 May 2019; pp. 3482–3483. [Google Scholar]
  15. Khawar, F.; Poon, L.K.M.; Zhang, N.L. Learning the Structure of Auto-Encoding Recommenders. In Proceedings of the WWW ’20: The Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 519–529. [Google Scholar]
  16. Xie, Z.; Liu, C.; Zhang, Y.; Lu, H.; Wang, D.; Ding, Y. Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation. In Proceedings of the WWW ’21: The Web Conference 2021, Virtual, 12–23 April 2021; pp. 449–459. [Google Scholar]
  17. Rendle, S.; Krichene, W.; Zhang, L.; Anderson, J.R. Neural Collaborative Filtering vs. Matrix Factorization Revisited. In Proceedings of the RecSys 2020: Fourteenth ACM Conference on Recommender Systems (RecSys ’20), Virtual, 22–26 September 2020; pp. 240–248. [Google Scholar]
  18. Schein, A.I.; Popescul, A.; Ungar, L.H.; Pennock, D.M. Methods and metrics for cold-start recommendations. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’02), Tampere, Finland, 11–15 August 2002; pp. 253–260. [Google Scholar]
  19. Ishikawa, M.; Géczy, P.; Izumi, N.; Morita, T.; Yamaguchi, T. Information Diffusion Approach to Cold-Start Problem. In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence and International Conference on Intelligent Agent Technology–Workshops (WI-IAT ’07), Silicon Valley, CA, USA, 5–12 November 2007; pp. 129–132. [Google Scholar]
  20. Said, A.; Jain, B.; Narr, S.; Plumbaum, T. Users and Noise: The Magic Barrier of Recommender Systems. In Proceedings of the 20th Conference on User Modelling, Adaptation, and Personalization, Montreal, QC, Canada, 16–20 July 2012; Volume 7379. [Google Scholar]
  21. Bellogín, A.; Said, A.; de Vries, A. The Magic Barrier of Recommender Systems–No Magic, Just Ratings. In Proceedings of the 22nd International Conference on User Modelling, Adaptation, and Personalization, Aalborg, Denmark, 7–11 July 2014; pp. 25–36. [Google Scholar]
  22. Sarwar, B.M.; Karypis, G.; Konstan, J.A.; Riedl, J. Analysis of recommendation algorithms for e-commerce. In Proceedings of the 2nd ACM Conference on Electronic Commerce (EC ’00), Minneapolis, MN, USA, 17–20 October 2000; pp. 158–167. [Google Scholar]
  23. Bell, R.M.; Koren, Y. Lessons from the Netflix prize challenge. Sigkdd Explor. 2007, 9, 75–79. [Google Scholar] [CrossRef]
  24. Levy, O.; Goldberg, Y. Neural Word Embedding as Implicit Matrix Factorization. In Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, QC, Canada, 8–13 December 2014; pp. 2177–2185. [Google Scholar]
  25. Wei, K.; Huang, J.; Fu, S. A Survey of E-Commerce Recommender Systems. In Proceedings of the 2007 International Conference on Service Systems and Service Management, Chengdu, China, 9–11 June 2007; pp. 1–5. [Google Scholar]
  26. Bobadilla, J.; Ortega, F.; Hernando, A.; Gutiérrez, A. Recommender systems survey. Knowl.-Based Syst. 2013, 46, 109–132. [Google Scholar] [CrossRef]
  27. Ronen, R.; Koenigstein, N.; Ziklik, E.; Nice, N. Selecting Content-Based Features for Collaborative Filtering Recommenders. In Proceedings of the 7th ACM Conference on Recommender Systems (RecSys ’13), Hong Kong, China, 12–16 October 2013; pp. 407–410. [Google Scholar]
  28. Choi, S.M.; Ko, S.K.; Han, Y.S. A movie recommendation algorithm based on genre correlations. Expert Syst. Appl. 2012, 39, 8079–8085. [Google Scholar] [CrossRef]
  29. Pirasteh, P.; Jung, J.J.; Hwang, D. Item-Based Collaborative Filtering with Attribute Correlation: A Case Study on Movie Recommendation. In Proceedings of the Intelligent Information and Database Systems–6th Asian Conference (ACIIDS ’14), Bangkok, Thailand, 7–9 April 2014; pp. 245–252. [Google Scholar]
  30. Zhang, J.; Peng, Q.; Sun, S.; Liu, C. Collaborative filtering recommendation algorithm based on user preference derived from item domain features. Phys. Stat. Mech. Its Appl. 2014, 396, 66–76. [Google Scholar] [CrossRef]
  31. Christensen, I.; Schiaffino, S. A Hybrid Approach for Group Profiling in Recommender Systems. J. Univers. Comput. Sci. 2014, 20, 507–533. [Google Scholar]
  32. Lekakos, G.; Giaglis, G. A hybrid approach for improving predictive accuracy of collaborative filtering algorithms. User Model. User-Adapt. Interact. 2007, 17, 5–40. [Google Scholar] [CrossRef]
  33. Çano, E.; Morisio, M. Hybrid Recommender Systems: A Systematic Literature Review. CoRR 2019, abs/1901.03888. Available online: https://arxiv.org/abs/1901.03888 (accessed on 12 January 2019).
  34. Rojsattarat, E.; Soonthornphisaj, N. Hybrid Recommendation: Combining Content-Based Prediction and Collaborative Filtering. In Proceedings of the Intelligent Data Engineering and Automated Learning; Springer: Berlin/Heidelberg, Germany, 2003; pp. 337–344. [Google Scholar]
  35. Lang, K. NewsWeeder: Learning to Filter Netnews. In Proceedings of the Twelfth International Conference on Machine Learning, Tahoe City, CA, USA, 9–12 July 1995; pp. 331–339. [Google Scholar]
  36. Krulwich, B. Learning user interests across heterogeneous document databases. In Proceedings of the 1995 AAAI Spring Symposium Series, Palo Alto, CA, USA, 27–29 March 1995; pp. 106–110. [Google Scholar]
  37. Chughtai, M.W.; Selamat, A.; Ghani, I.; Jung, J. E-Learning Recommender Systems Based on Goal-Based Hybrid Filtering. Int. J. Distrib. Sens. Netw. 2014, 2014. [Google Scholar] [CrossRef]
  38. Burke, R. Hybrid Recommender Systems: Survey and Experiments. User Model.-User-Adapt. Interact. 2002, 12, 331–370. [Google Scholar] [CrossRef]
  39. Lika, B.; Kolomvatsos, K.; Hadjiefthymiades, S. Facing the cold start problem in recommender systems. Expert Syst. Appl. 2014, 41, 2065–2073. [Google Scholar] [CrossRef]
  40. Carrer-Neto, W.; Hernández-Alcaraz, M.L.; Valencia-García, R.; García-Sánchez, F. Social knowledge-based recommender system. Application to the movies domain. Expert Syst. Appl. 2012, 39, 10990–11000. [Google Scholar] [CrossRef]
  41. Ghazanfar, M.A.; Prügel-Bennett, A. The Advantage of Careful Imputation Sources in Sparse Data-Environment of Recommender Systems: Generating Improved SVD-based Recommendations. Informatica (Slovenia) 2013, 37, 61–92. [Google Scholar]
  42. Choi, S.M.; Han, Y.S. Identifying representative ratings for a new item in recommendation system. In Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication (ICUIMC ’13), Kota Kinabalu, Malaysia, 17–19 January 2013; p. 64. [Google Scholar]
  43. Gantner, Z.; Drumond, L.; Freudenthaler, C.; Rendle, S.; Schmidt-Thieme, L. Learning Attribute-to-Feature Mappings for Cold-Start Recommendations. In Proceedings of the 10th IEEE International Conference on Data Mining (ICDM ’10), Sydney, Australia, 13–17 December 2010; pp. 176–185. [Google Scholar]
  44. Sun, D.; Luo, Z.; Zhang, F. A novel approach for collaborative filtering to alleviate the new item cold-start problem. In Proceedings of the 11th International Symposium on Communications and Information Technologies (ISCIT ’11), Hangzhou, China, 12–14 October 2011; pp. 402–406. [Google Scholar]
  45. Volkovs, M.; Yu, G.W.; Poutanen, T. Content-based Neighbor Models for Cold Start in Recommender Systems. In Proceedings of the Recommender Systems Challenge 2017 (RecSys Challenge ’17), Como, Italy, 27–31 August 2017; pp. 7:1–7:6. [Google Scholar] [CrossRef]
  46. Deng, Y.; Wu, Z.; Tang, C.; Si, H.; Xiong, H.; Chen, Z. A Hybrid Movie Recommender Based on Ontology and Neural Networks. In Proceedings of the 2010 IEEE/ACM International Conference on Green Computing and Communications International Conference on Cyber, Physical and Social Computing, Washington, DC, USA, 18–20 December 2010; pp. 846–851. [Google Scholar]
  47. Wen, H.; Fang, L.; Guan, L. A hybrid approach for personalized recommendation of news on the Web. Expert Syst. Appl. Int. J. 2012, 39, 5806–5814. [Google Scholar] [CrossRef]
  48. Meel, P.; Bano, F.; Goswami, A.; Gupta, S. Movie Recommendation Using Content-Based and Collaborative Filtering. In Proceedings of the International Conference on Innovative Computing and Communications (ICICC ’21); Springer: Singapore, 2021; pp. 301–316. [Google Scholar]
  49. Chen, S.; Huang, L.; Lei, Z.; Wang, S. Research on personalized recommendation hybrid algorithm for interactive experience equipment. Comput. Intell. 2020, 36, 1348–1373. [Google Scholar] [CrossRef]
  50. Mehrabani, M.M.; Mohayeji, H.; Moeini, A. A Hybrid Approach to Enhance Pure Collaborative Filtering Based on Content Feature Relationship. Available online: https://arxiv.org/abs/2005.08148 (accessed on 17 May 2020).
  51. Zhao, W.; Tian, H.; Wu, Y.; Cui, Z.; Feng, T. A New Item-Based Collaborative Filtering Algorithm to Improve the Accuracy of Prediction in Sparse Data. Int. J. Comput. Intell. Syst. 2022, 15, 1–15. [Google Scholar] [CrossRef]
  52. Althbiti, A.; Alshamrani, R.; Alghamdi, T.; Lee, S.; Ma, X. Addressing Data Sparsity in Collaborative Filtering Based Recommender Systems Using Clustering and Artificial Neural Network. In Proceedings of the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), Virtual, 27–30 January 2021; pp. 0218–0227. [Google Scholar] [CrossRef]
  53. Jiang, B.; Yang, J.; Qin, Y.; Wang, T.; Wang, M.; Pan, W. A Service Recommendation Algorithm Based on Knowledge Graph and Collaborative Filtering. IEEE Access 2021, 9, 50880–50892. [Google Scholar] [CrossRef]
  54. Ahmadian, S.; Joorabloo, N.; Jalili, M.; Ahmadian, M. Alleviating data sparsity problem in time-aware recommender systems using a reliable rating profile enrichment approach. Expert Syst. Appl. 2022, 187, 115849. [Google Scholar] [CrossRef]
  55. Chen, H.; Qian, F.; Chen, J.; Zhao, S.; Zhang, Y. Attribute-based Neural Collaborative Filtering. Expert Syst. Appl. 2021, 185, 115539. [Google Scholar] [CrossRef]
  56. Ajaegbu, C. An optimized item-based collaborative filtering algorithm. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 10629–10636. [Google Scholar] [CrossRef]
  57. Khaledian, N.; Mardukhi, F. CFMT: A collaborative filtering approach based on the nonnegative matrix factorization technique and trust relationships. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 2667–2683. [Google Scholar] [CrossRef]
  58. Zhou, Q.; Zhuang, W.; Ren, H.; Chen, Y.; Yu, B.; Lou, J.; Wang, Y. Hybrid Collaborative Filtering Model for Consumer Dynamic Service Recommendation Based on Mobile Cloud Information System. Inf. Process. Manag. 2022, 59, 102871. [Google Scholar] [CrossRef]
  59. Liu, H.; Guo, L.; Li, P.; Zhao, P.; Wu, X. Collaborative filtering with a deep adversarial and attention network for cross-domain recommendation. Inf. Sci. 2021, 565, 370–389. [Google Scholar] [CrossRef]
  60. Lin, Z.; Tian, C.; Hou, Y.; Zhao, W.X. Improving Graph Collaborative Filtering with Neighborhood-Enriched Contrastive Learning. In Proceedings of the ACM Web Conference 2022 (WWW ’22), Athens, Greece, 26–29 June 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 2320–2329. [Google Scholar]
  61. Aljunid, M.F.; Huchaiah, M.D. IntegrateCF: Integrating explicit and implicit feedback based on deep learning collaborative filtering algorithm. Expert Syst. Appl. 2022, 207, 117933. [Google Scholar] [CrossRef]
  62. Surprise. k-NN Inspired Algorithms. Available online: https://surprise.readthedocs.io/en/stable/knn_inspired.html (accessed on 25 September 2017).
  63. Bulmer, M.G. Principles of Statistics; Dover Publications: New York, NY, USA, 1979. [Google Scholar]
  64. Surprise. Matrix Factorization-Based Algorithms. Available online: https://surprise.readthedocs.io/en/stable/matrix_factorization.html (accessed on 17 March 2017).
  65. Choi, S.M.; Cha, J.W.; Han, Y.S. Identifying representative reviewers in internet social media. In Proceedings of the Second International Conference on Computational Collective Intelligence: Technologies and Applications–Volume Part II (ICCCI ’10), Kaohsiung, Taiwan, 10–12 November 2010; pp. 22–30. [Google Scholar]
  66. Choi, S.M.; Cha, J.W.; Kim, L.; Han, Y.S. Reliability of Representative Reviewers on the Web. In Proceedings of the International Conference on Information Science and Applications–ICISA, Jeju Island, Republic of Korea, 26–29 April 2011; pp. 1–5. [Google Scholar]
  67. Jaccard, P. The distribution of the flora in the alpine zone. New Phytol. 1912, 11, 37–50. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
