# Dirichlet Matrix Factorization: A Reliable Classification-Based Recommender System


## Abstract


## 1. Introduction

#### Our Contribution

- A new model, named DirMF, is introduced, together with a complete description of its mathematical formulation, training procedure and prediction protocol;
- DirMF introduces a new probabilistic interpretation of the rating process. Instead of treating the rating as a continuous random variable or each vote as an independent feature, as in the existing literature, the method models the ratings as interrelated discrete random variables. To this end, it relies on the Dirichlet probability distribution to model the user’s rating behavior;
- The performance of the model is evaluated through an extensive collection of experiments. The results show that DirMF achieves high performance in terms of both recommendation quality and management of reliability. Implementing reliability as an intrinsic part of the algorithm leads to better performance than preexisting algorithms that do not treat reliability in this way;
- Additionally, DirMF shows results similar to those of BeMF, the other method that embeds reliability in its training process. Furthermore, thanks to its more flexible nature, DirMF exhibits a more conservative forecasting trend than BeMF, which leads to better predictions in scenarios in which failure is highly penalized.

## 2. Proposed Model

#### 2.1. Dirichlet Factorization

#### 2.2. Prediction

- The final prediction $\widehat{R}_{u,i}$, which is the mode of the aforementioned distribution, that is, $$\widehat{R}_{u,i}=\mathrm{argmax}_{s}\,\mathbb{E}\left(\mathbf{R}_{u,i}^{s}\right).$$
- The reliability $\varrho_{u,i}$ in the prediction, which is the probability attained at the mode of the distribution, that is, $$\varrho_{u,i}=\max_{s}\,\mathbb{E}\left(\mathbf{R}_{u,i}^{s}\right).$$

- Collect the rating matrix R of shape $N\times M$ with the known votes per user and item;
- Choose hyperparameters K (number of latent factors), $\gamma $ (learning rate), $\eta $ (regularization), m (number of iterations) and $\vartheta $ (reliability threshold, for the exploitation phase);
- Execute the training algorithm for DirMF (Algorithm 1) with the chosen hyperparameters. The output of the training is a collection of pairs of matrices $({P}^{s}=({P}_{1}^{s},\dots ,{P}_{N}^{s}),{Q}^{s}=({Q}_{1}^{s},\dots ,{Q}_{M}^{s}))$ for each possible vote s;
- Given a new pair $(u,i)$ of a user $1\le u\le N$ and an item $1\le i\le M$ to be predicted, compute the quantities $$\mathbb{E}\left(\mathbf{R}_{u,i}^{s}\right)=\frac{\omega\left(P_{u}^{s}\cdot Q_{i}^{s}\right)}{\displaystyle\sum_{s\in\mathcal{S}}\omega\left(P_{u}^{s}\cdot Q_{i}^{s}\right)};$$
- The prediction ${\widehat{R}}_{u,i}$ is the vote ${s}_{0}$ for which $\mathbb{E}\left({\mathbf{R}}_{u,i}^{{s}_{0}}\right)$ is maximum. The reliability is the value ${\varrho}_{u,i}=\mathbb{E}\left({\mathbf{R}}_{u,i}^{{s}_{0}}\right)$.
- If ${\varrho}_{u,i}\ge \vartheta $, then return prediction ${\widehat{R}}_{u,i}$; otherwise return that no reliable prediction can be issued.
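
The prediction steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the trained factors are stored as arrays `P` of shape `(S, N, K)` and `Q` of shape `(S, M, K)` (one slice per possible vote), that $\omega$ is the logistic function as in Algorithm 1, and it uses random factors in place of a trained model. The function name `predict` and the choice of returning `None` for an unreliable prediction are hypothetical.

```python
import numpy as np


def logistic(x):
    """Logistic function, used here as the link function omega."""
    return 1.0 / (1.0 + np.exp(-x))


def predict(P, Q, votes, u, i, threshold):
    """Return (prediction, reliability, distribution) for the pair (u, i).

    P: array (S, N, K) with the user factors for each of the S votes.
    Q: array (S, M, K) with the item factors for each of the S votes.
    votes: sequence of the S possible vote values.
    threshold: reliability threshold (theta in the text).
    """
    # omega(P_u^s . Q_i^s) for every vote s
    scores = logistic(np.einsum("sk,sk->s", P[:, u, :], Q[:, i, :]))
    # Normalize so that the expected values add up to one
    expected = scores / scores.sum()
    s0 = int(np.argmax(expected))
    reliability = float(expected[s0])
    # Hypothetical convention: None means "no reliable prediction"
    prediction = votes[s0] if reliability >= threshold else None
    return prediction, reliability, expected


# Illustrative data only: random factors stand in for a trained model.
rng = np.random.default_rng(0)
votes = [1, 2, 3, 4, 5]
N, M, K = 4, 6, 3
P = rng.random((len(votes), N, K))
Q = rng.random((len(votes), M, K))

pred, rel, dist = predict(P, Q, votes, u=0, i=2, threshold=0.3)
print(pred, rel, dist)
```

Raising `threshold` trades coverage for accuracy: fewer pairs receive a prediction, but those that do are the ones the model is most certain about.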

#### 2.3. Computational Complexity

Algorithm 1: DirMF model fitting algorithm fixing $\omega$ for the logistic function and shape parameters $\beta_{u,i}=R_{u,i}$.

#### 2.4. Running Example

## 3. Evaluation of Dirichlet Matrix Factorization

#### 3.1. Experimental Setup

#### 3.2. Experimental Results

#### 3.2.1. MAE vs. Coverage

#### 3.2.2. Recommendation Quality

#### 3.2.3. Time and Space Complexity

## 4. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest


**Figure 2.** Quality of the predictions measured by MAE and coverage. Predictions with lower reliability than those indicated on the x-axis are filtered out.

**Figure 3.** Quality of the recommendations measured by precision and recall. Predictions with lower reliability than those indicated on the x-axis are filtered out.

${\mathit{R}}_{\mathit{u},\mathit{i}}$ | ${\mathit{i}}_{1}$ | ${\mathit{i}}_{2}$ | ${\mathit{i}}_{3}$ | ${\mathit{i}}_{4}$ | ${\mathit{i}}_{5}$ |
---|---|---|---|---|---|

${u}_{1}$ | ✖ | ✔ | ✔ | ||

${u}_{2}$ | ✔ | ✔ | ✖ | ||

${u}_{3}$ | ✖ | ✔ | ✔ |

${\mathit{R}}_{\mathit{u},\mathit{i}}^{✔}$ | ${\mathit{i}}_{\mathit{1}}$ | ${\mathit{i}}_{\mathit{2}}$ | ${\mathit{i}}_{\mathit{3}}$ | ${\mathit{i}}_{\mathit{4}}$ | ${\mathit{i}}_{\mathit{5}}$ | ${\mathit{R}}_{\mathit{u},\mathit{i}}^{✖}$ | ${\mathit{i}}_{\mathit{1}}$ | ${\mathit{i}}_{\mathit{2}}$ | ${\mathit{i}}_{\mathit{3}}$ | ${\mathit{i}}_{\mathit{4}}$ | ${\mathit{i}}_{\mathit{5}}$ |

${u}_{1}$ | 0.27 | 0.73 | 0.73 | ${u}_{1}$ | 0.73 | 0.27 | 0.27 | ||||

${u}_{2}$ | 0.73 | 0.73 | 0.27 | ${u}_{2}$ | 0.27 | 0.27 | 0.73 | ||||

${u}_{3}$ | 0.27 | 0.73 | 0.73 | ${u}_{3}$ | 0.73 | 0.27 | 0.27 |

${\mathit{P}}_{\mathit{u},\mathit{k}}^{✔}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ | ${\mathit{P}}_{\mathit{u},\mathit{k}}^{✖}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ |
---|---|---|---|---|---|---|---|
${u}_{1}$ | 0.06 | 0.04 | 0.58 | ${u}_{1}$ | 0.73 | 0.85 | 0.88 |
${u}_{2}$ | 0.78 | 0.04 | 0.66 | ${u}_{2}$ | 0.95 | 0.23 | 0.53 |
${u}_{3}$ | 0.40 | 0.77 | 0.36 | ${u}_{3}$ | 0.90 | 0.87 | 0.75 |

${\mathit{Q}}_{\mathit{i},\mathit{k}}^{✔}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ | ${\mathit{Q}}_{\mathit{i},\mathit{k}}^{✖}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ |
---|---|---|---|---|---|---|---|
${i}_{1}$ | 0.89 | 0.84 | 0.54 | ${i}_{1}$ | 0.14 | 0.95 | 0.46 |
${i}_{2}$ | 0.06 | 0.42 | 0.84 | ${i}_{2}$ | 0.96 | 0.66 | 0.77 |
${i}_{3}$ | 0.84 | 0.35 | 0.91 | ${i}_{3}$ | 0.13 | 0.75 | 0.25 |
${i}_{4}$ | 0.83 | 0.93 | 0.19 | ${i}_{4}$ | 0.26 | 0.33 | 0.51 |
${i}_{5}$ | 0.08 | 0.31 | 0.14 | ${i}_{5}$ | 0.43 | 0.69 | 0.84 |

${\mathit{P}}_{\mathit{u},\mathit{k}}^{✔}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ | ${\mathit{P}}_{\mathit{u},\mathit{k}}^{✖}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ |
---|---|---|---|---|---|---|---|
${u}_{1}$ | 0.10 | 0.08 | 0.59 | ${u}_{1}$ | 0.72 | 0.84 | 0.87 |
${u}_{2}$ | 0.79 | 0.06 | 0.67 | ${u}_{2}$ | 0.94 | 0.22 | 0.52 |
${u}_{3}$ | 0.41 | 0.78 | 0.37 | ${u}_{3}$ | 0.89 | 0.85 | 0.73 |

${\mathit{Q}}_{\mathit{i},\mathit{k}}^{✔}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ | ${\mathit{Q}}_{\mathit{i},\mathit{k}}^{✖}$ | ${\mathit{k}}_{\mathit{1}}$ | ${\mathit{k}}_{\mathit{2}}$ | ${\mathit{k}}_{\mathit{3}}$ |
---|---|---|---|---|---|---|---|
${i}_{1}$ | 0.90 | 0.85 | 0.56 | ${i}_{1}$ | 0.12 | 0.93 | 0.44 |
${i}_{2}$ | 0.08 | 0.42 | 0.85 | ${i}_{2}$ | 0.95 | 0.66 | 0.77 |
${i}_{3}$ | 0.84 | 0.35 | 0.91 | ${i}_{3}$ | 0.13 | 0.75 | 0.25 |
${i}_{4}$ | 0.83 | 0.93 | 0.21 | ${i}_{4}$ | 0.26 | 0.32 | 0.51 |
${i}_{5}$ | 0.09 | 0.33 | 0.15 | ${i}_{5}$ | 0.42 | 0.68 | 0.83 |
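
As a worked check of the prediction formula, the following sketch takes the rows for $u_1$ and $i_1$ from the last pair of factor tables above and applies the logistic normalization by hand. Treating those tables as the factors to predict from is an assumption, since the tables carry no captions here, and the choice of the pair $(u_1, i_1)$ is purely illustrative. $\omega$ is taken to be the logistic function, as in Algorithm 1.

```python
import math


def logistic(x):
    """Logistic function, used here as the link function omega."""
    return 1.0 / (1.0 + math.exp(-x))


# Rows for user u1 and item i1, copied from the last pair of factor tables.
# Which tables hold the final factors is an assumption of this sketch.
P_u1 = {"✔": [0.10, 0.08, 0.59], "✖": [0.72, 0.84, 0.87]}
Q_i1 = {"✔": [0.90, 0.85, 0.56], "✖": [0.12, 0.93, 0.44]}

# omega(P_u^s . Q_i^s) for each vote s
scores = {
    s: logistic(sum(p * q for p, q in zip(P_u1[s], Q_i1[s])))
    for s in P_u1
}

# Normalize to obtain the expected value of each vote
total = sum(scores.values())
expected = {s: v / total for s, v in scores.items()}

# The prediction is the mode; the reliability is its probability
prediction = max(expected, key=expected.get)
reliability = expected[prediction]
print(prediction, round(reliability, 3))
```

With these values the dot products are 0.4884 for ✔ and 1.2504 for ✖, so the sketch predicts ✖ with a reliability of roughly 0.56. With only two possible votes the reliability can never fall below 0.5, which is worth keeping in mind when choosing $\vartheta$.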

Dataset | No. Users | No. Items | No. Ratings | No. Test Ratings | Rating Scale |
---|---|---|---|---|---|
MovieLens1M | 6040 | 3706 | 911,031 | 89,178 | 1 to 5 |
FilmTrust | 1508 | 2071 | 32,675 | 2819 | 0.5 to 4.0 |
MyAnimeList | 69,600 | 9927 | 5,788,207 | 549,027 | 1 to 10 |
Netflix Prize | 480,189 | 17,770 | 99,945,049 | 535,458 | 1 to 5 |

Method | MovieLens | FilmTrust | MyAnimeList | Netflix |
---|---|---|---|---|
PMF | $\mathrm{factors}=8$, $\gamma =0.01$, $\lambda =0.045$ | $\mathrm{factors}=4$, $\gamma =0.015$, $\lambda =0.1$ | $\mathrm{factors}=10$, $\gamma =0.005$, $\lambda =0.085$ | $\mathrm{factors}=8$, $\gamma =0.01$, $\lambda =0.06$ |
NCF | $\mathrm{factors}=5$, $\mathrm{epochs}=10$ | $\mathrm{factors}=5$, $\mathrm{epochs}=8$ | $\mathrm{factors}=7$, $\mathrm{epochs}=15$ | $\mathrm{factors}=6$, $\mathrm{epochs}=4$ |
GMF | $\mathrm{factors}=5$, $\mathrm{epochs}=10$ | $\mathrm{factors}=5$, $\mathrm{epochs}=15$ | $\mathrm{factors}=7$, $\mathrm{epochs}=20$ | $\mathrm{factors}=5$, $\mathrm{epochs}=4$ |
BeMF | $\mathrm{factors}=2$, $\gamma =0.006$, $\lambda =0.16$, $m=100$ | $\mathrm{factors}=2$, $\gamma =0.02$, $\lambda =0.06$, $m=75$ | $\mathrm{factors}=4$, $\gamma =0.004$, $\lambda =0.1$, $m=100$ | $\mathrm{factors}=6$, $\gamma =0.0006$, $\lambda =0.02$, $m=50$ |
DirMF | $\mathrm{factors}=6$, $\gamma =0.01$, $\eta =0.022$, $m=50$ | $\mathrm{factors}=8$, $\gamma =0.015$, $\eta =0.09$, $m=100$ | $\mathrm{factors}=10$, $\gamma =0.02$, $\eta =0.01$, $m=100$ | $\mathrm{factors}=10$, $\gamma =0.02$, $\eta =0.02$, $m=50$ |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Lara-Cabrera, R.; González, Á.; Ortega, F.; González-Prieto, Á.
Dirichlet Matrix Factorization: A Reliable Classification-Based Recommender System. *Appl. Sci.* **2022**, *12*, 1223.
https://doi.org/10.3390/app12031223
