# On Producing Accurate Rating Predictions in Sparse Collaborative Filtering Datasets

^{1}

^{2}

^{3}

^{*}

## Abstract

**:**

## 1. Introduction

_{u,i}for the rating that user u would assign to item i, that u has not rated yet, the actual ratings of u‘s NN users v

_{k}to the item i are used. This idea follows the rationale that, since in the real world humans trust people close to them (e.g., family and close friends) when suggestions for a new commodity/experience are required (from making an online pizza order that costs EUR 10 to buying their next car that costs EUR 100,000) [5,6], an analogous setting would hold for computer-generated suggestions.

**RQ1:**What is the importance of the number of NNs for the rating prediction accuracy? If of importance, how many NNs ensure a more accurate rating prediction?

**RQ2:**Is there any connection between the average ratings value of users and the accuracy computed for rating predictions made to them?

**RQ3:**Is there any connection between the average ratings value of an item and the accuracy computed for rating predictions considering this item?

**RQ4:**What is the correlation between the accuracy of a rating prediction for a user and the number of items that the user has rated? If correlated, what is the number of user ratings to ensure better accuracy in the rating prediction to the user?

**RQ5:**Does a rating prediction formulated for an item that has been rated by several users have greater accuracy? If so, how many user ratings would ensure more accurate rating prediction for an item?

## 2. Related Work

## 3. Prerequisites

- The number of NNs participating in the rating prediction; for this feature, we consider the NNs that have actually rated item i and not the overall number of NNs in the user’s NN set;
- The user’s U average ratings value (U
_{avg}); - The item’s i average ratings value (I
_{avg}); - The number of items user U has rated (U
_{N}); - The number of users that have rated item i (I
_{N}).

## 4. Exploring Rating Prediction Features

- For the exact number of NNs participating in the prediction feature (NNs features), values from 1 up to 9, plus an extra case of NNs ≥ 10, are examined.
- For the user’s U average ratings value (U
_{avg}–feature), the values from the minimum to the maximum rating values are examined, with the increment step being equal to 0.5, i.e., the ranges of [1–1.5), [1.5–2), [2, 2.5), [2.5–3), [3–3.5), [4–4.5) and [4.5–5]. Notably, in all datasets used in this study, the ratings ranged from 1 to 5. - For the item’s i average ratings value (I
_{avg}–feature), the values from the minimum to the maximum rating values are again examined, with the increment step being equal to 0.5, exactly as in the previous case. - For the number of items user U has rated (U
_{N}–feature), the smallest value used is 5 (since the datasets were 5-core), and increments with a step of 3 were considered up to the value of 25, from which point onwards all cases are classified under a range “>25”. Effectively, the following ranges are considered: [5, 7], [8, 10], [11, 13], [14, 16], [17, 19], [20, 22], [23, 25] and >25. - For the number of users that have rated item i (I
_{N}–feature), the value ranges used are the same as with the previous case, i.e., [5, 7], [8, 10], [11, 13], [14, 16], [17, 19], [20, 22], [23, 25] and >25.

- An algorithm that considers rating variability [20] to improve rating prediction accuracy;
- A CF unveiling and exploiting causal relations between ratings [21];
- A temporal pattern-aware CF algorithm [22];
- A sequential CF recommender algorithm [23];
- A CF algorithm exploiting common histories up to the item review time [24].

#### 4.1. Number of NNs Features

#### 4.2. U_{avg}–Feature

_{avg}) feature. It is noted that the ranges (x–axis) are not the same with the previous factor (and the next ones), since they represent metrics that take not only different values, but also types of values (for example the NNs factor represents the number of near neighbour users that take part in the rating prediction calculation—integer value, while the U

_{avg}represents the average ratings value of the user for whom the rating prediction is being formulated—decimal value).

_{avg}–feature is close to the middle of the rating scale. More specifically, the average MAE reduction from 2.5 ≤ U

_{avg}≤ 3 to 4.5 ≤ U

_{avg}≤ 5 equals 61.2%. The respective average RMSE reduction equals 45.2%. Correspondingly, the average MAE reduction from 2.5 ≤ U

_{avg}≤ 3 to 1 ≤ U

_{avg}≤ 1.5 is equal to 57.92%, and the relevant RMSE reduction is 68.56%.

_{avg}–feature is close to the middle of the rating scale.

_{avg}–feature is correlated with rating prediction error reduction.

#### 4.3. I_{avg}–Feature

_{avg})–feature when using the PCC similarity metric.

_{avg}–feature is close to the middle of the rating scale. More specifically, the average MAE reduction from 2.5 ≤ I

_{avg}≤ 3 to 4.5 ≤ I

_{avg}≤ 5 equals to 59.8%; the respective average RMSE reduction equals to 50.1%. Correspondingly, the average MAE reduction from 2.5 ≤ U

_{avg}≤ 3 to 1 ≤ U

_{avg}≤ 1.5 is equal to 37.74%, and the relevant RMSE reduction is 30.76%.

_{avg}–feature is close to the middle of the rating scale.

_{avg}–feature is correlated with rating prediction error reduction.

#### 4.4. U_{N}–Feature

_{N}–feature) when using the PCC similarity metric.

_{N}–feature increases:

- For one dataset (“Digital music”), the MAE drops by more than 20% (23.82%) when U
_{N}increases from “5–7” to “>25”. - For five datasets (“Musical instruments”, “CDs and Vinyl”, “Sports”, “Home” and “Electronics”), the MAE drops by 10–20% (10.70%, 11.26%, 11.99%, 15.03% and 14.74%, respectively).
- For five datasets (“Videogames”, “Movies and TV”, “Books”, “Clothing” and “Epinions”), the MAE drops by 1–10% (8.34%, 4.02%, 6.52%, 2.19% and 1.96%, respectively).
- For the CiaoDVD dataset, the MAE deteriorates by 10.31%.

_{N}–feature and the rating prediction accuracy. Dataset-specific measurements need to be considered to determine whether higher confidence can be associated with predictions computed for users that have rated a higher number of items, as compared to predictions computed for users that have rated a low number of items (this issue will be investigated in our future work).

#### 4.5. I_{N}–Feature

_{N})–feature.

_{N}–feature increases:

- For four datasets (“Digital Music”, “CDs and Vinyl”, “Home” and “Electronics”), the MAE drops by 10–20% (16.52%, 10.53%, 12.16% and 11.59%, respectively).
- For six datasets (“Videogames”, “Musical Instruments”, “Sports”, “Movies and TV”, “Books”, “Electronics” and “Epinions”), the MAE drops by 1–10% (8.98%, 8.11%, 9.42%, 5.20%, 8.98% and 3.74%, respectively).
- For two datasets, “Clothing” and “Epinions”, the MAE deteriorates by 0.25% and 2.27%, respectively.

_{N}–feature and the rating prediction accuracy. Dataset-specific measurements need to be considered to determine whether a higher confidence can be associated with predictions computed for items that have received a higher number of ratings as compared to predictions computed for items that have received a low number of ratings (this issue will be investigated in our future work).

#### 4.6. Discussion of the Results and Complexity Analysis

- The number of NNs used for the formulation of the rating prediction is ≥4: a CF rating prediction formulated by taking into account ≥4 NNs is considered sounder than a prediction based on, for example, just 1 NN, since, as in real life, a recommendation based on very few opinions (friends, family members, etc.) bears a high probability of failure.
- The rating average of the user for whom the prediction is generated is close to the boundaries of the rating scale: a user with a rating average close to the maximum rating value (similarly, close to the minimum rating value) is a user who practically enters almost only excellent evaluations (similarly, bad evaluations), and hence it is easier for a rating prediction system to predict his/her next rating.
- The rating average of the item concerning the prediction is close to the boundaries of the rating scale: an item with a rating average close to the maximum rating value (or, similarly, close to the minimum rating value) is an item practically considered widely acceptable (similarly, widely unacceptable), and hence the probability that the (high value) rating prediction will be close to the (high) real user rating is relatively high.

## 5. Conclusions and Future Work

- When the number of NNs used for the formulation of the rating prediction is ≥4, the prediction accuracy was found to be significantly higher (on average, we obtained a lower prediction error by 15%);
- When the rating average of the user for whom the prediction is generated is close to the boundaries of the rating scale, the rating prediction accuracy is very high (on average, we obtained a rating prediction error lower by 56% as compared to the accuracy obtained for users whose average is near the middle of the scale);
- When the rating average of the item concerning the prediction is close to the boundaries of the rating scale, the prediction accuracy is very high (on average, we obtain a lower rating prediction error by 57% as compared to the accuracy obtained for items whose respective average is near the middle of the scale).

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Cui, Z.; Xu, X.; Xue, F.; Cai, X.; Cao, Y.; Zhang, W.; Chen, J. Personalized Recommendation System Based on Collaborative Filtering for IoT Scenarios. IEEE Trans. Serv. Comput.
**2020**, 13, 685–695. [Google Scholar] [CrossRef] - Lara-Cabrera, R.; González-Prieto, Á.; Ortega, F. Deep Matrix Factorization Approach for Collaborative Filtering Recommender Systems. Appl. Sci.
**2020**, 10, 4926. [Google Scholar] [CrossRef] - Balabanović, M.; Shoham, Y. Fab: Content-Based, Collaborative Recommendation. Commun. ACM
**1997**, 40, 66–72. [Google Scholar] [CrossRef] - Cechinel, C.; Sicilia, M.-Á.; Sánchez-Alonso, S.; García-Barriocanal, E. Evaluating Collaborative Filtering Recommendations inside Large Learning Object Repositories. Inf. Process. Manag.
**2013**, 49, 34–50. [Google Scholar] [CrossRef] - Herlocker, J.L.; Konstan, J.A.; Terveen, L.G.; Riedl, J.T. Evaluating Collaborative Filtering Recommender Systems. ACM Trans. Inf. Syst.
**2004**, 22, 5–53. [Google Scholar] [CrossRef] - Lops, P.; Narducci, F.; Musto, C.; de Gemmis, M.; Polignano, M.; Semeraro, G. Recommendations Biases and Beyond-Accuracy Objectives in Collaborative Filtering. In Collaborative Recommendations; World Scientific: Singapore, 2018; pp. 329–368. [Google Scholar]
- Singh, P.K.; Sinha, M.; Das, S.; Choudhury, P. Enhancing Recommendation Accuracy of Item-Based Collaborative Filtering Using Bhattacharyya Coefficient and Most Similar Item. Appl. Intell.
**2020**, 50, 4708–4731. [Google Scholar] [CrossRef] - Guo, J.; Deng, J.; Ran, X.; Wang, Y.; Jin, H. An Efficient and Accurate Recommendation Strategy Using Degree Classification Criteria for Item-Based Collaborative Filtering. Expert Syst. Appl.
**2021**, 164, 113756. [Google Scholar] [CrossRef] - Veras De Sena Rosa, R.E.; Guimaraes, F.A.S.; Mendonca, R.; da Lucena, V.F. Improving Prediction Accuracy in Neighborhood-Based Collaborative Filtering by Using Local Similarity. IEEE Access
**2020**, 8, 142795–142809. [Google Scholar] [CrossRef] - Ramezani, M.; Moradi, P.; Akhlaghian, F. A Pattern Mining Approach to Enhance the Accuracy of Collaborative Filtering in Sparse Data Domains. Phys. A Stat. Mech. Appl.
**2014**, 408, 72–84. [Google Scholar] [CrossRef] - Feng, C.; Liang, J.; Song, P.; Wang, Z. A Fusion Collaborative Filtering Method for Sparse Data in Recommender Systems. Inf. Sci.
**2020**, 521, 365–379. [Google Scholar] [CrossRef] - Li, K.; Zhou, X.; Lin, F.; Zeng, W.; Wang, B.; Alterovitz, G. Sparse Online Collaborative Filtering with Dynamic Regularization. Inf. Sci.
**2019**, 505, 535–548. [Google Scholar] [CrossRef] - Sarwar, B.; Karypis, G.; Konstan, J.; Reidl, J. Item-Based Collaborative Filtering Recommendation Algorithms. In Proceedings of the Tenth International Conference on World Wide Web—WWW ’01, Hong Kong, 1–5 May 2001; ACM Press: New York, NY, USA, 2001; pp. 285–295. [Google Scholar]
- Li, Y.; Hu, J.; Zhai, C.; Chen, Y. Improving One-Class Collaborative Filtering by Incorporating Rich User Information. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management—CIKM ’10, Toronto, ON, Canada, 26–30 October 2010; ACM Press: New York, NY, USA, 2010; p. 959. [Google Scholar]
- Herlocker, J.L.; Konstan, J.A.; Borchers, A.; Riedl, J. An Algorithmic Framework for Performing Collaborative Filtering. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval—SIGIR ’99, Berkeley, CA, USA, 15–19 August 1999; ACM Press: New York, NY, USA, 1999; pp. 230–237. [Google Scholar]
- Margaris, D.; Vassilakis, C.; Spiliotopoulos, D. What Makes a Review a Reliable Rating in Recommender Systems? Inf. Process. Manag.
**2020**, 57, 102304. [Google Scholar] [CrossRef] - Margaris, D.; Vassilakis, C.; Spiliotopoulos, D. Handling Uncertainty in Social Media Textual Information for Improving Venue Recommendation Formulation Quality in Social Networks. Soc. Netw. Anal. Min.
**2019**, 9, 64. [Google Scholar] [CrossRef] - Herlocker, J.L.; Konstan, J.A.; Borchers, A.; Riedl, J. An Algorithmic Framework for Performing Collaborative Filtering. ACM SIGIR Forum
**2017**, 51, 227–234. [Google Scholar] [CrossRef] - Schafer, J.B.; Frankowski, D.; Herlocker, J.; Sen, S. Collaborative Filtering Recommender Systems. In The Adaptive Web; Springer: Berlin/Heidelberg, Germany, 2007; pp. 291–324. [Google Scholar]
- Margaris, D.; Vassilakis, C. Improving Collaborative Filtering’s Rating Prediction Accuracy by Considering Users’ Rating Variability. In Proceedings of the 4th IEEE International Conference on Big Data Intelligence and Computing, Athens, Greece, 12–15 August 2018; IEEE: Athens, Greece, 2018; pp. 1022–1027. [Google Scholar]
- Zhang, J.; Chen, X.; Zhao, W.X. Causally Attentive Collaborative Filtering. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Online, 1–5 November 2021; ACM: New York, NY, USA, 2021; pp. 3622–3626. [Google Scholar]
- Chai, Y.; Liu, G.; Chen, Z.; Li, F.; Li, Y.; Effah, E.A. A Temporal Collaborative Filtering Algorithm Based on Purchase Cycle. In Proceedings of the ICCCS 2018: Cloud Computing and Security, Haikou, China, 8–10 June 2018; Xingming, S., Pan, Z., Bertino, E., Eds.; Springer: Cham, Switzerland, 2018; pp. 191–201. [Google Scholar]
- Li, J.; Wang, Y.; McAuley, J. Time Interval Aware Self-Attention for Sequential Recommendation. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 20 January 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 322–330. [Google Scholar]
- Margaris, D.; Spiliotopoulos, D.; Vassilakis, C.; Vasilopoulos, D. Improving Collaborative Filtering’s Rating Prediction Accuracy by Introducing the Experiencing Period Criterion. Neural Comput. Appl. Spec. Issue Inf. Intell. Syst. Appl.
**2020**. [Google Scholar] [CrossRef] - Fernández-García, A.J.; Iribarne, L.; Corral, A.; Criado, J.; Wang, J.Z. A Recommender System for Component-Based Applications Using Machine Learning Techniques. Knowl.-Based Syst.
**2019**, 164, 68–84. [Google Scholar] [CrossRef] - Forestiero, A. Heuristic Recommendation Technique in Internet of Things Featuring Swarm Intelligence Approach. Expert Syst. Appl.
**2022**, 187, 115904. [Google Scholar] [CrossRef] - Sahu, S.; Kumar, R.; Pathan, M.S.; Shafi, J.; Kumar, Y.; Ijaz, M.F. Movie Popularity and Target Audience Prediction Using the Content-Based Recommender System. IEEE Access
**2022**, 10, 42044–42060. [Google Scholar] [CrossRef] - Aivazoglou, M.; Roussos, A.O.; Margaris, D.; Vassilakis, C.; Ioannidis, S.; Polakis, J.; Spiliotopoulos, D. A Fine-Grained Social Network Recommender System. Soc. Netw. Anal. Min.
**2020**, 10, 8. [Google Scholar] [CrossRef] - Bouazza, H.; Said, B.; Zohra Laallam, F. A Hybrid IoT Services Recommender System Using Social IoT. J. King Saud Univ. Comput. Inf. Sci.
**2022**, in press. [Google Scholar] [CrossRef] - Zhang, Z.-P.; Kudo, Y.; Murai, T.; Ren, Y.-G. Enhancing Recommendation Accuracy of Item-Based Collaborative Filtering via Item-Variance Weighting. Appl. Sci.
**2019**, 9, 1928. [Google Scholar] [CrossRef] [Green Version] - Zhang, L.; Wei, Q.; Zhang, L.; Wang, B.; Ho, W.-H. Diversity Balancing for Two-Stage Collaborative Filtering in Recommender Systems. Appl. Sci.
**2020**, 10, 1257. [Google Scholar] [CrossRef] [Green Version] - Yan, H.; Tang, Y. Collaborative Filtering Based on Gaussian Mixture Model and Improved Jaccard Similarity. IEEE Access
**2019**, 7, 118690–118701. [Google Scholar] [CrossRef] - Jiang, L.; Cheng, Y.; Yang, L.; Li, J.; Yan, H.; Wang, X. A Trust-Based Collaborative Filtering Algorithm for E-Commerce Recommendation System. J. Ambient Intell. Humaniz. Comput.
**2019**, 10, 3023–3034. [Google Scholar] [CrossRef] [Green Version] - Iftikhar, A.; Ghazanfar, M.A.; Ayub, M.; Mehmood, Z.; Maqsood, M. An Improved Product Recommendation Method for Collaborative Filtering. IEEE Access
**2020**, 8, 123841–123857. [Google Scholar] [CrossRef] - Natarajan, S.; Vairavasundaram, S.; Natarajan, S.; Gandomi, A.H. Resolving Data Sparsity and Cold Start Problem in Collaborative Filtering Recommender System Using Linked Open Data. Expert Syst. Appl.
**2020**, 149, 113248. [Google Scholar] [CrossRef] - Shahbazi, Z.; Hazra, D.; Park, S.; Byun, Y.C. Toward Improving the Prediction Accuracy of Product Recommendation System Using Extreme Gradient Boosting and Encoding Approaches. Symmetry
**2020**, 12, 1566. [Google Scholar] [CrossRef] - Yang, X.; Liang, C.; Zhao, M.; Wang, H.; Ding, H.; Liu, Y.; Li, Y.; Zhang, J. Collaborative Filtering-Based Recommendation of Online Social Voting. IEEE Trans. Comput. Soc. Syst.
**2017**, 4, 1–13. [Google Scholar] [CrossRef] - Jalali, S.; Hosseini, M. Social Collaborative Filtering Using Local Dynamic Overlapping Community Detection. J. Supercomput.
**2021**, 77, 11786–11806. [Google Scholar] [CrossRef] - Zhang, T. Research on Collaborative Filtering Recommendation Algorithm Based on Social Network. Int. J. Internet Manuf. Serv.
**2019**, 6, 343. [Google Scholar] [CrossRef] - Guo, L.; Liang, J.; Zhu, Y.; Luo, Y.; Sun, L.; Zheng, X. Collaborative Filtering Recommendation Based on Trust and Emotion. J. Intell. Inf. Syst.
**2019**, 53, 113–135. [Google Scholar] [CrossRef] - Herce-Zelaya, J.; Porcel, C.; Bernabé-Moreno, J.; Tejeda-Lorente, A.; Herrera-Viedma, E. New Technique to Alleviate the Cold Start Problem in Recommender Systems Using Information from Social Media and Random Decision Forests. Inf. Sci.
**2020**, 536, 156–170. [Google Scholar] [CrossRef] - Margaris, D.; Vassilakis, C. Exploiting Rating Abstention Intervals for Addressing Concept Drift in Social Network Recommender Systems. Informatics
**2018**, 5, 21. [Google Scholar] [CrossRef] [Green Version] - Verstrepen, K.; Goethals, B. Unifying Nearest Neighbors Collaborative Filtering. In Proceedings of the 8th ACM Conference on Recommender systems—RecSys ’14, Foster City, CA, USA, 6–10 October 2014; ACM Press: New York, NY, USA, 2014; pp. 177–184. [Google Scholar]
- Logesh, R.; Subramaniyaswamy, V.; Malathi, D.; Sivaramakrishnan, N.; Vijayakumar, V. Enhancing Recommendation Stability of Collaborative Filtering Recommender System through Bio-Inspired Clustering Ensemble Method. Neural Comput. Appl.
**2020**, 32, 2141–2164. [Google Scholar] [CrossRef] - Schwarz, M.; Lobur, M.; Stekh, Y. Analysis of the Effectiveness of Similarity Measures for Recommender Systems. In Proceedings of the 2017 14th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM), Lviv, Ukraine, 21–25 February 2017; IEEE: Lviv, Ukraine, 2017; pp. 275–277. [Google Scholar]
- Sheugh, L.; Alizadeh, S.H. A Note on Pearson Correlation Coefficient as a Metric of Similarity in Recommender System. In Proceedings of the 2015 AI & Robotics (IRANOPEN), Qazvin, Iran, 12 April 2015; IEEE: Qazvin, Iran, 2015; pp. 1–6. [Google Scholar]
- Luo, C.; Zhan, J.; Xue, X.; Wang, L.; Ren, R.; Yang, Q. Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks. In Proceedings of the 2018 Conference on Artificial Neural Networks and Machine Learning—ICANN 2018, Rhodes, Greece, 4–7 October 2018; Springer: Cham, Switzerland, 2018; pp. 382–391. [Google Scholar]
- Jin, R.; Chai, J.Y.; Si, L. An Automatic Weighting Scheme for Collaborative Filtering. In Proceedings of the 27th annual international conference on Research and development in information retrieval—SIGIR ’04, Sheffield, UK, 25–29 July 2004; ACM Press: New York, NY, USA, 2004; p. 337. [Google Scholar]
- Liu, H.; Hu, Z.; Mian, A.; Tian, H.; Zhu, X. A New User Similarity Model to Improve the Accuracy of Collaborative Filtering. Knowl.-Based Syst.
**2014**, 56, 156–166. [Google Scholar] [CrossRef] [Green Version] - Barkan, O.; Fuchs, Y.; Caciularu, A.; Koenigstein, N. Explainable Recommendations via Attentive Multi-Persona Collaborative Filtering. In Proceedings of the Fourteenth ACM Conference on Recommender Systems, Virtual Event, Brazil, 22–26 September 2020; ACM: New York, NY, USA, 2020; pp. 468–473. [Google Scholar]
- Wang, Q.; Yin, H.; Wang, H.; Nguyen, Q.V.H.; Huang, Z.; Cui, L. Enhancing Collaborative Filtering with Generative Augmentation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 25 July 2019; ACM: New York, NY, USA, 2019; pp. 548–556. [Google Scholar]
- Ni, J.; Li, J.; McAuley, J. Justifying Recommendations Using Distantly-Labeled Reviews and Fine-Grained Aspects. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, 3–7 November 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 188–197. [Google Scholar]
- He, R.; McAuley, J. Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering. In Proceedings of the 25th International Conference on World Wide Web, Montréal, QC, Canada, 11–15 April 2016; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2016; pp. 507–517. [Google Scholar]
- Guo, G.; Zhang, J.; Thalmann, D.; Yorke-Smith, N. ETAF: An Extended Trust Antecedents Framework for Trust Prediction. In Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, 17–20 August 2014; IEEE: Beijing, China, 2014; pp. 540–547. [Google Scholar]
- Meyffret, S.; Guillot, E.; Médini, L.; Laforest, F. RED: A Rich Epinions Dataset for Recommender Systems; LIRIS, ⟨hal-01010246⟩; HAL Open Science: Leiden, The Netherlands, 2014. [Google Scholar]
- Candillier, L.; Meyer, F.; Boullé, M. Comparing State-of-the-Art Collaborative Filtering Systems. In Machine Learning and Data Mining in Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2007; pp. 548–562. [Google Scholar]
- Candillier, L.; Meyer, F.; Fessant, F. Designing Specific Weighted Similarity Measures to Improve Collaborative Filtering Systems. In Advances in Data Mining. Medical Applications, E-Commerce, Marketing, and Theoretical Aspects; Springer: Berlin/Heidelberg, Germany, 2008; pp. 242–255. [Google Scholar]
- Yu, K.; Schwaighofer, A.; Tresp, V.; Xu, X.; Kriegel, H. Probabilistic Memory-Based Collaborative Filtering. IEEE Trans. Knowl. Data Eng.
**2004**, 16, 56–69. [Google Scholar] [CrossRef] - Wang, J.; Lin, K.; Li, J. A Collaborative Filtering Recommendation Algorithm Based on User Clustering and Slope One Scheme. In Proceedings of the 2013 8th International Conference on Computer Science & Education, Colombo, Sri Lanka, 26–28 April 2013; IEEE: Colombo, Sri Lanka, 2013; pp. 1473–1476. [Google Scholar]
- Massa, P.; Avesani, P. Trust-Aware Collaborative Filtering for Recommender Systems. In Proceedings of the OTM 2004: On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE, Agia Napa, Cyprus, 25–29 October 2004; Meersman, R., Tari, Z., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 492–508. [Google Scholar]
- Lu, L.; Yuan, Y.; Chen, X.; Li, Z. A Hybrid Recommendation Method Integrating the Social Trust Network and Local Social Influence of Users. Electronics
**2020**, 9, 1496. [Google Scholar] [CrossRef] - Wang, Y.; Deng, J.; Gao, J.; Zhang, P. A Hybrid User Similarity Model for Collaborative Filtering. Inf. Sci.
**2017**, 418–419, 102–118. [Google Scholar] [CrossRef] - Hu, R.; Pu, P. Enhancing Collaborative Filtering Systems with Personality Information. In Proceedings of the Fifth ACM Conference on Recommender Systems—RecSys ’11, Chicago, IL, USA, 23–27 October 2011; ACM Press: New York, NY, USA, 2011; p. 197. [Google Scholar]
- Lyon, G.F. Understanding and Customizing Nmap Data Files. In Nmap Network Scanning: The Official Nmap Project Guide to Network Discovery and Security Scanning; Insecure.Com LLC: Sunnyvale, CA, USA, 2009; p. 464. [Google Scholar]

**Figure 1.**Effect of the number of NNs features in the rating prediction MAE when using the PCC user similarity metric.

**Figure 2.**Effect of the U

_{avg}–feature in the rating prediction MAE when using the PCC user similarity metric.

**Figure 3.**Effect of the I

_{avg}–feature in the rating prediction MAE when using the PCC user similarity metric.

**Figure 4.**Effect of the U

_{N}–feature in the rating prediction MAE when using the PCC user similarity metric.

**Figure 5.**Effect of the I

_{N}–feature in the rating prediction MAE when using the PCC user similarity metric.

Publisher | Dataset # | Dataset Name | Density | #Users | #Items | #Ratings |
---|---|---|---|---|---|---|

Amazon Datasets [52,53] | 1 | Digital Music | 0.08% | 17 K | 12 K | 170 K |

2 | Videogames | 0.006% | 55 K | 17 K | 498 K | |

3 | Musical Instruments | 0.075% | 28 K | 11 K | 231 K | |

4 | CDs and Vinyl | 0.017% | 112 K | 74 K | 1.44 M | |

5 | Sports | 0.008% | 332 K | 105 K | 2.8 M | |

6 | Movies & TV | 0.019% | 298 K | 60 K | 3.4 M | |

7 | Home | 0.0047% | 777 K | 189 K | 6.9 M | |

8 | Books | 0.0021% | 1.85 M | 685 K | 27 M | |

9 | Clothing | 0.0025% | 1.2 M | 377 K | 11.3 M | |

10 | Electronics | 0.0057% | 729 K | 160 K | 6.7 M | |

Ciao [54] | 11 | CiaoDVD | 0.073% | 30 K | 73 K | 1.6 M |

Epinions [55] | 12 | Epinions (665 K) | 0.012% | 40 K | 140 K | 665 K |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Margaris, D.; Vassilakis, C.; Spiliotopoulos, D.
On Producing Accurate Rating Predictions in Sparse Collaborative Filtering Datasets. *Information* **2022**, *13*, 302.
https://doi.org/10.3390/info13060302

**AMA Style**

Margaris D, Vassilakis C, Spiliotopoulos D.
On Producing Accurate Rating Predictions in Sparse Collaborative Filtering Datasets. *Information*. 2022; 13(6):302.
https://doi.org/10.3390/info13060302

**Chicago/Turabian Style**

Margaris, Dionisis, Costas Vassilakis, and Dimitris Spiliotopoulos.
2022. "On Producing Accurate Rating Predictions in Sparse Collaborative Filtering Datasets" *Information* 13, no. 6: 302.
https://doi.org/10.3390/info13060302