Article

Comparison of Selected Algorithms in Movie Recommender System

Department of Informatics and Computers, University of Ostrava, 30. dubna 22, 701 03 Ostrava, Czech Republic
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(17), 9518; https://doi.org/10.3390/app15179518
Submission received: 28 July 2025 / Revised: 22 August 2025 / Accepted: 25 August 2025 / Published: 29 August 2025
(This article belongs to the Special Issue Advanced Models and Algorithms for Recommender Systems)

Abstract

Recommender systems are currently very popular; their main goal is to propose relevant content to users based on various parameters. The main goal of this paper is to provide a comprehensive comparison of selected algorithms for movie recommender systems. The recommender system works with the MovieLens database. The main output of the proposed comparison is identifying the algorithm that selects the movies most relevant to user preferences. The paper experimentally verifies the performance of the compared algorithms, with an emphasis on their evaluation using metrics such as Precision, Recall, and F1-score. The goal of the evaluation is to assess how well each algorithm generates accurate and relevant recommendations. The testing process includes an analysis of the results achieved on a test set of users.

1. Introduction

Recommender systems are currently very popular, and their main goal is to propose relevant content to users based on various parameters. A recommender system is an information system designed to support user decision-making by recommending suitable products, information, or services. These systems are widely used in e-commerce, streaming platforms, online dating services, and many other industries [1,2].
Recommender systems analyze specific types of data to predict a user’s rating for individual items. Based on this analysis, they generate recommendations and adjust the content displayed on a page to align as closely as possible with the user’s preferences [1]. This is one of the key reasons why, in recent years, many companies and web applications have implemented systems that study user behavior to provide the most relevant product, service, or information recommendations.
The Fortune Global 500 list of the world's largest companies includes many companies, such as Amazon, Google, Meta, and Netflix, that use recommender systems as an essential part of their business [3,4,5,6]. Amazon, one of the first companies to implement a recommender system on a mass scale, reported a 29% increase in sales after deploying its recommender system [7]. Google implemented recommender systems in its advertising platform Google Ads [8], its news aggregator Google News [9], and its video/social media platform YouTube [10]. Even its famous search tool, with many factors influencing the results, may be viewed to some extent as a hybrid system combining textual search with personalized recommendations. Recommender systems have repeatedly been shown to have a positive impact on company revenues [11]. They are used by a variety of businesses, including entertainment [3,12,13], social media [14,15], news media [2], e-commerce [5,16], advertisement [17,18], e-tourism services [19,20], and movies in particular [21,22,23,24].
In this article, we focus on comparing recommender algorithms in a movie recommender system.

2. Related Work and Current State of Recommender Systems

In the era of digital expansion, where an unlimited amount of data and information is available, Recommender systems have proven to be essential for web platforms. Companies like Netflix and e-commerce giants such as Amazon implement these algorithms to process vast amounts of data and deliver content tailored to individual user preferences. The primary goal of these recommendation algorithms is to drive sales growth by suggesting items that best match users’ interests. These systems operate based on several key principles aimed at maximizing value for the user [1,2,25,26,27]:
  • Increase in item sales: A key function of commercial recommender systems is allowing the sale of additional items beyond the commonly sold ones.
  • Sale of more diverse items: The system allows users to select items that may not have been visible without precise recommendations.
  • Better understanding of user needs: Systems can describe the tastes and preferences of a given user.
Recommender systems pursue several specific goals, including increasing sales, enhancing user satisfaction, and gaining a better understanding of customer preferences. These objectives highlight a dual benefit: an immediate boost in revenue for businesses and the development of stronger long-term relationships between users and the services they engage with. One of the primary aims is to support business strategies and technological innovation by improving user satisfaction. Regularly receiving relevant recommendations can significantly enhance user experience, naturally fostering greater loyalty to the platform and increasing the likelihood of repeat purchases [1,2,25].

2.1. Movie Recommender Systems

Recommender systems for relevant movies are developed to suggest the most suitable movies for a user based on their preferences, favorite genres, actors, directors, and other parameters. The Netflix Movie Recommender System also adds an explanation of why given movies have been recommended [28]. Presenting reasonable explanations helps the user understand why a given movie should be interesting for them. This approach helps increase system credibility and user loyalty [25]. Such an approach has also been implemented in MovieExplain [29]. An interesting approach is also a proposal of suitable movies based on user emotions. The user marks three colors representing emotions (joy, anger, sadness, etc.) and the system then proposes suitable movies [30]. Movie recommender systems have been proposed using various methods and approaches. There are many implemented recommender systems, for instance [23,31,32,33].
In the design and implementation of recommender systems, four main approaches are currently used [1,26,27,34,35]:
  • Content-based recommender systems
  • Collaborative filtering recommender systems
  • Knowledge-based recommender systems
  • Hybrid recommender systems
In this article, we focus mainly on content-based, collaborative filtering, and hybrid recommender systems. Based on these approaches, we selected the algorithms for further comparison.

2.2. Strengths and Weaknesses of Current Recommender Systems

Modern recommender systems represent a key technology in the world of e-commerce. These systems are generally categorized into content-based filtering, collaborative filtering, and hybrid approaches that combine elements of both methods. Each of these approaches has its strengths and weaknesses, which influence their performance and effectiveness across different applications and use cases [2,25,36].

2.2.1. Strengths

  • Personalization: Thanks to advanced technologies, these systems can offer tailored recommendations.
  • Improved user experience: Users can easily encounter new and precisely suitable content through these systems.
  • Increased sales and customer retention: Especially for online stores, these systems are becoming invaluable tools for boosting sales and ensuring customer loyalty.
In this way, recommendation algorithms provide significant benefits for both users and service providers, enhancing the user experience while supporting business objectives [2,25,27,36].

2.2.2. Weaknesses

Despite their positive impact on personalizing user experiences and supporting the business models of online platforms, Recommender systems face several challenges [1,25,27,37], such as:
  • Cold start problem: Without sufficient data, initial recommendations may be less accurate.
  • Lack of diversity: There is a risk that systems will repeatedly offer content that users are already familiar with.
  • Manipulation and bias: There is a danger that systems could be manipulated for commercial purposes.
Although Recommender systems offer significant benefits for the digitalization of user experiences and the economic aspects of online platforms, it is crucial to address these challenges [1,25,27,36].

2.3. Content-Based Recommender Systems

Content-based recommender systems select and suggest products to users based on product characteristics and user interests. These interests can be explicitly specified by the user, but more often, they are inferred from the user’s ratings and interactions with various products. A wide range of machine learning algorithms is used to adapt to user preferences, with the choice of a specific algorithm depending on how the content is represented [1,36,38].
In Figure 1, the principle of the content-based filtering approach is shown.
Content-based recommender systems offer several advantages [27,36,37,40]:
  • Personalization: They allow users to create their own profiles based on their ratings.
  • Independence from the number of users: These systems work even when extensive data from other users is not available.
  • Ability to recommend new items: They can identify and recommend new or underrated products.
There are also challenges, such as [27,36,40]:
  • Risk of over-specialization: Users may be recommended content that is too narrowly focused on their previous preferences.
  • Feedback collection: It can be more difficult to gather user feedback, which is essential for improving the accuracy and relevance of recommendations.
In Figure 2, the content-based filtering architecture is shown [27].
Although the accuracy of recommendations may sometimes be lower compared to collaborative filtering, content-based methods offer a fast and efficient solution, especially when detailed and comprehensive product information is available. In the domain of movie recommendations, the quality and accuracy of the system improve with the amount and granularity of the characteristics used to describe films [27,37].

2.4. Collaborative Filtering Recommender Systems

Collaborative filtering is one of the fundamental and earliest methods implemented in recommender systems. This approach suggests items to a user based on what has been favored by other users with similar tastes. The similarity in preferences between two users is determined by analyzing the similarity in their rating history. Collaborative filtering is considered the most popular and widely implemented technique in recommender systems [27,36,41].
In Figure 3, the collaborative filtering approach is shown [41].
Collaborative filtering is a well-known application of recommender systems based on estimating user preferences: users who agreed in their ratings in the past are likely to agree again in the future. In its user-based form, the algorithm connects users through the similarity of their ratings rather than through the content of specific items. The system relies solely on user behavior, excluding personal content and profile information. Its core principle is to suggest new items based on behavioral similarities among users [1,27,41]. Collaborative filtering differs from content-based filtering in several key aspects. The main distinction is that collaborative filtering does not require specific knowledge about product content, such as movie genres. This method enables efficient evaluation and recommendation of products based on user reviews, without needing a deep understanding of the product itself [27,40,41].
In Figure 4, the differences between the collaborative filtering and content-based filtering approaches are shown [25,36].
Another limitation of content-based filtering is the inaccuracy of keywords used in product descriptions, which can make it difficult to provide ideal recommendations for customers with a limited number of ratings. Additionally, content-based systems rely solely on a user’s current interests, which may restrict their ability to offer relevant recommendations that adapt to changing needs and preferences. Despite these challenges, collaborative filtering remains one of the most effective recommendation methods. However, it faces scalability issues when processing large datasets [25,36].

2.5. Hybrid Recommender Systems

Hybrid recommender systems combine multiple recommendation methods to enhance efficiency and overcome the limitations of individual approaches. They typically integrate collaborative filtering with other techniques to mitigate specific challenges, such as the cold start problem, where new products or users lack sufficient ratings for effective inclusion in the system [1,36,42,43].
Key advantages of hybrid systems [1,36,42,43]:
  • Solving the cold start problem: They effectively respond to new users or products without rating history by utilizing content-based information.
  • Higher recommendation accuracy: By leveraging the strengths of both approaches, hybrid systems provide users with more accurate and relevant recommendations.
  • Flexibility: They have the ability to adapt to various user needs by combining multiple data sources and methods.
Hybrid systems provide a more effective solution for personalized recommendations compared to purely collaborative or content-based filtering. One of their main advantages is their flexibility and ability to combine the best features of different techniques. For example, collaborative filtering can be used to analyze patterns among users, while content-based systems can generate recommendations by comparing detailed product attributes. By leveraging this combination, hybrid systems deliver more comprehensive and relevant recommendations to users [1,36,42,43].

3. Design and Specification of Metrics for Algorithm Comparison

In this section, we propose the metrics used for comparing recommender system algorithms.

3.1. Precision and Recall

Decision support in recommender systems involves evaluating each recommendation to determine whether it is correct or incorrect. By analyzing each recommended item and comparing it with the user’s actual consumption, four possible outcomes can be identified: the item may have been recommended or not, and the user may have consumed it or not. If the system recommends an item, it is considered a positive outcome; if the user actually consumes the item, it is regarded as a correct decision [1,25]:
  • True Positive (TP)—The item was recommended and consumed by the user.
  • False Positive (FP)—The item was recommended, but the user did not consume it.
  • False Negative (FN)—The item was not included in the recommendations, but the user consumed it.
  • True Negative (TN)—The item was not recommended, and the user did not consume it.
The test result in this format allows defining two different metrics [1,25]:
  • Precision—what portion of the recommended items was consumed by the user.
  • Recall—what proportion of all items that the user consumed was recommended.
Precision and Recall are calculated as [1,25]:
Precision = True Positives/(True Positives + False Positives)
Recall = True Positives/(True Positives + False Negatives)
Recommender systems are often designed to always present users with at least one option for what to purchase or watch next. It is crucial that a Top-N recommendation list include at least one relevant item. In many cases, optimizing precision is considered a priority, rather than ensuring that the user receives all possible relevant items [1,25].
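As an illustration, the four outcomes and the two metrics can be computed for a single user with a few lines of Python (a minimal sketch; the item ids and the recommended/consumed sets are hypothetical):

```python
def precision_recall(recommended, consumed):
    """Precision and Recall for one user, derived from the TP/FP/FN counts.

    recommended -- item ids in the Top-N recommendation list
    consumed    -- item ids the user actually consumed
    """
    recommended, consumed = set(recommended), set(consumed)
    tp = len(recommended & consumed)      # recommended and consumed
    fp = len(recommended - consumed)      # recommended, not consumed
    fn = len(consumed - recommended)      # consumed, not recommended
    precision = tp / (tp + fp) if recommended else 0.0
    recall = tp / (tp + fn) if consumed else 0.0
    return precision, recall

# Five items recommended, four consumed, three in the overlap:
p, r = precision_recall({1, 2, 3, 4, 5}, {3, 4, 5, 6})
print(p, r)  # 0.6 0.75
```

True Negatives do not appear in either formula, which is why Top-N evaluation can ignore the (typically huge) set of items that were neither recommended nor consumed.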

3.2. F1-Score

One way to create a metric that combines both Precision and Recall is the F1-score, which is the harmonic mean of precision and recall [27,37]. It is calculated as:
F1(t) = 2 · Precision(t) · Recall(t) / (Precision(t) + Recall(t))
Although the F1-score provides a better quantification than precision or recall alone, it still depends on the size t of the recommended list and, therefore, does not fully represent the trade-off between precision and recall. A more comprehensive way to analyze this trade-off is to vary the value of t and visualize precision against recall, allowing for a clearer examination of how recommendation effectiveness changes [1,25].
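A minimal sketch of the harmonic-mean computation, using illustrative Precision and Recall values:

```python
def f1_score(precision, recall):
    """Harmonic mean of Precision and Recall (defined as 0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# For Precision = 0.6 and Recall = 0.75:
f1 = f1_score(0.6, 0.75)
print(round(f1, 3))  # 0.667
```

Because the harmonic mean is dominated by the smaller of the two values, a recommender cannot achieve a high F1-score by optimizing Precision or Recall alone.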

3.3. Mean Absolute Error

When evaluating Recommender system algorithms, metrics such as Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are often used, providing important insights into the accuracy and behavior of these models. Using both metrics together offers a comprehensive view of the performance of individual algorithms. While MAE provides an overview of average errors, RMSE assigns greater weight to larger errors, which is crucial in applications where large errors are undesirable [1,25].
MAE = (1/|R̂|) · Σ_{r̂_ui ∈ R̂} |r_ui − r̂_ui|
where
  • Σ_{r̂_ui ∈ R̂}—the sum over all predictions r̂_ui in the set of predictions R̂.
  • r_ui—the actual rating given by user u for item i.
  • r̂_ui—the rating predicted by the algorithm for user u and item i.
Mean Absolute Error (MAE) is the average of the absolute deviations between predicted and actual values. Simply put, MAE indicates how far, on average, our predictions are from the actual values, considering all errors equally serious.
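The MAE formula translates directly into code (the rating pairs below are illustrative):

```python
def mae(pairs):
    """Mean Absolute Error over (actual, predicted) rating pairs."""
    return sum(abs(r - r_hat) for r, r_hat in pairs) / len(pairs)

# Absolute deviations 0.5, 1.0 and 0.0 average to 0.5:
err = mae([(4.0, 3.5), (2.0, 3.0), (5.0, 5.0)])
print(err)  # 0.5
```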

3.4. Root Mean Square Error

Root Mean Square Error (RMSE) is a key statistical metric used to measure prediction accuracy in the context of Recommender systems and predictive analysis in general. It provides a quantitative assessment of prediction errors and serves as a foundation for evaluating and comparing different predictive approaches [1,25].
RMSE = √( (1/|R̂|) · Σ_{r̂_ui ∈ R̂} (r_ui − r̂_ui)² )
where
  • Σ_{r̂_ui ∈ R̂}—the sum over all predictions r̂_ui in the set of predictions R̂.
  • (r_ui − r̂_ui)²—the square of the difference between the actual rating given by user u for item i and the predicted rating; squaring emphasizes larger errors.
Root Mean Square Error (RMSE) is the square root of the mean of the squared deviations between predicted and actual values. Unlike MAE, RMSE assigns greater weight to larger errors, meaning that significant prediction errors will have a greater impact on the overall evaluation than smaller errors.
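A corresponding sketch for RMSE, on illustrative rating pairs; note how the squared term amplifies the single 1.0-point error relative to MAE:

```python
import math

def rmse(pairs):
    """Root Mean Square Error over (actual, predicted) rating pairs."""
    return math.sqrt(sum((r - r_hat) ** 2 for r, r_hat in pairs) / len(pairs))

# Squared deviations 0.25, 1.0 and 0.0: RMSE = sqrt(1.25/3), which exceeds
# the MAE of the same pairs (0.5) because the larger error is weighted more.
err = rmse([(4.0, 3.5), (2.0, 3.0), (5.0, 5.0)])
print(round(err, 3))  # 0.645
```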

4. Implemented Recommender System Algorithms

This section describes the implemented algorithms used for the final comparison. The algorithms were implemented using the Surprise library—https://surprise.readthedocs.io/en/stable/ (accessed on 15 April 2025).
This Python library (v1.1.4) was developed to facilitate the development and testing of collaborative filtering algorithms [44]. Its goal is to provide developers with an intuitive platform for experimenting with recommendation methods and optimizing their performance. The library efficiently supports working with various types of datasets, including the popular MovieLens database. Each algorithm in the library is characterized by a specific set of features and parameters that influence its performance and accuracy.
The Surprise library was selected for several reasons. It offers a familiar, user-friendly interface, integrated cross-validation, and standard metrics such as RMSE and MAE, which facilitate robust model comparison, and it includes many ready-to-use algorithms. Based on these advantages, we selected the Surprise library for our research. Other libraries, such as LightFM, TensorFlow Recommenders, or RecBole, could also be used.
For the experimental results, we selected the MovieLens dataset [45,46,47], a database of personalized movie ratings from a large number of users, developed by a research lab at the University of Minnesota. Each rating is assigned to a particular user. We selected this dataset based on several key advantages:
  • High data quality—the MovieLens datasets are carefully curated by GroupLens, a reputable research group at the University of Minnesota. The data are consistently clean, de-duplicated, and well formatted, which minimizes preprocessing effort.
  • Rich metadata—movies are annotated with genres and titles, and some versions include timestamps and even user demographic data (in the 1M version).
  • Widely used and well documented—MovieLens is one of the most cited and used datasets in recommender system research, which makes it easier to compare models, reproduce results, and benchmark algorithms.
We selected these algorithms based on their properties, significance and usability with the selected MovieLens dataset:
  • BaselineOnly algorithm—useful for quick and simple predictions, used as benchmark model.
  • CoClustering algorithm—ability to provide personalized recommendations by analyzing relationships and grouping users and items into co-clusters.
  • KNNBaseline algorithm—robust mechanism for enhancing recommendation accuracy within collaborative filtering.
  • SVD algorithm—crucial algorithm for the development of modern recommender systems due to its ability to decompose data and uncover hidden patterns; one of the most popular collaborative filtering algorithms.
  • Non-negative Matrix Factorization (NMF) algorithm—similar to SVD algorithm but restricts factors to be non-negative, producing additive (parts-based) representations; often leads to more interpretable latent factors.
  • SlopeOne algorithm—collaborative filtering algorithm based on a simple derivation method from user ratings, characterized by low computational complexity and easy implementation; its ability to quickly generate reliable predictions with minimal complexity makes it an attractive option for a wide range of recommender systems.
These algorithms also appear in benchmarks of recommender algorithms [48]. The significance and relevance of all of them are further supported in [49,50], where all of the above algorithms are used in comparisons and experimental results.

4.1. BaselineOnly Algorithm

The algorithm determines the characteristics of recommender systems through baseline estimates, which respond to the unique tendencies of users and items in ratings. It is based on the average rating μ and adjusts it considering the effects of specific users and items, formulated as b_ui = μ + b_u + b_i, where b_u and b_i reflect deviations from the average. Thanks to advanced global optimization methods and the use of both explicit and implicit user feedback, the algorithm strives to predict preferred items with high accuracy [51,52].
μ̂ = (1/|R_train|) · Σ_{r_ui ∈ R_train} r_ui
σ̂ = √( Σ_{r_ui ∈ R_train} (r_ui − μ̂)² / |R_train| )
where
  • r_ui is the rating given by user u for item i in the training set R_train.
  • μ̂ and σ̂ are the estimated mean and standard deviation of the training ratings.
The algorithm predicts a random rating based on the distribution of the training dataset, assumed to be normal. The prediction is generated from a normal distribution, where the parameters μ (mean) and σ (standard deviation) are estimated from the training data using the Maximum Likelihood Estimation (MLE) method. If a user or item is unknown, their bias is assumed to be zero. This algorithm provides a baseline estimate for a given user and items [51,52].
  • Application
The algorithm is useful for quick and simple predictions, taking into account the general popularity of items and the tendency of users to rate either higher or lower than the average. It is particularly beneficial in situations where a fast estimate is needed without relying on more complex models, such as collaborative filtering or deep learning [52].
It is a simple algorithm that predicts user ratings based solely on baseline estimates. It does not use more complex models such as collaborative similarity, latent factor models, or neighborhood-based approaches. Instead, it serves as a benchmark against which the improvement brought by more complex algorithms can be measured. We used the BaselineOnly algorithm to gauge its efficiency in comparison with the more complex algorithms.
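The baseline idea b_ui = μ + b_u + b_i can be illustrated with a short sketch. Note that this is not Surprise's implementation, which fits the biases with ALS or SGD; the sketch below estimates them by plain averaging over a hypothetical toy dataset, purely to show how the three terms combine:

```python
from collections import defaultdict

def fit_baselines(ratings):
    """Estimate the global mean and user/item biases by plain averaging.

    ratings -- list of (user, item, rating) triples.
    Simple averaging is used here only to keep the sketch short; a real
    implementation would fit b_u and b_i by ALS or SGD.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)
    user_dev, item_dev = defaultdict(list), defaultdict(list)
    for u, _, r in ratings:
        user_dev[u].append(r - mu)
    b_u = {u: sum(d) / len(d) for u, d in user_dev.items()}
    for u, i, r in ratings:
        item_dev[i].append(r - mu - b_u[u])
    b_i = {i: sum(d) / len(d) for i, d in item_dev.items()}
    return mu, b_u, b_i

def baseline_predict(mu, b_u, b_i, u, i):
    # Unknown users or items get a zero bias, as described above.
    return mu + b_u.get(u, 0.0) + b_i.get(i, 0.0)

ratings = [("alice", "m1", 5), ("alice", "m2", 3),
           ("bob", "m1", 4), ("bob", "m3", 2)]
mu, b_u, b_i = fit_baselines(ratings)
pred = baseline_predict(mu, b_u, b_i, "alice", "m3")
print(pred)  # 3.5 (global mean) + 0.5 (alice's bias) − 1.0 (m3's bias) = 3.0
```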

4.2. CoClustering Algorithm

The Co-clustering algorithm, based on collaborative filtering, represents an innovative approach that focuses on the simultaneous clustering of users and items. Its key idea is to generate predictions based on average ratings created by co-clusters (groups of users and items), while also considering individual biases of users and items. This approach enables efficient real-time collaborative filtering, allowing the system to quickly process new users, items, and ratings [53,54].
The predicted rating r̂_ui is set as
r̂_ui = C̄_ui + (μ_u − C̄_u) + (μ_i − C̄_i)
In this approach, users and items are assigned to specific clusters C_u and C_i and to a joint co-cluster C_ui, where
  • C̄_ui is the average rating in the joint co-cluster.
  • C̄_u is the average rating in the user cluster.
  • C̄_i is the average rating in the item cluster.
  • μ_u and μ_i are the average ratings of user u and item i.
If the user is unknown, the prediction equals the average rating of the item (μ_i). If the item is unknown, the prediction equals the average rating of the user (μ_u). In cases where both the user and the item are unknown, the global average rating is used as the prediction [53].
  • Parameters
  • n_cltr_u (int): The number of user clusters. The default value is 3.
  • n_cltr_i (int): The number of item clusters. The default value is 3.
  • n_epochs (int): The number of iterations for the optimization loop. The default value is 20.
  • Application
The algorithm utilizes an optimization method similar to k-means to assign users and items into clusters, dynamically responding to changes in data while maintaining high recommendation accuracy. This approach allows for flexible adaptation to different consumption patterns and effectively predicts user preferences. One of its key advantages is its ability to provide personalized recommendations by analyzing relationships and grouping users and items into co-clusters. This sophisticated approach not only enhances recommendation accuracy but also improves the user experience by uncovering new trends and tendencies in consumer behavior. As an advanced collaborative filtering model, the algorithm efficiently handles large datasets, addressing scalability issues while delivering relevant and targeted recommendations [53,54].
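The prediction rule can be sketched as follows. All cluster assignments and cluster averages below are hypothetical fitted values, since the actual fitting is a k-means-style optimization; the sketch only shows how the fitted quantities combine into a prediction:

```python
def cocluster_predict(u, i, mu_u, mu_i, user_cl, item_cl,
                      avg_co, avg_ucl, avg_icl):
    """Prediction rule r̂_ui = C̄_ui + (μ_u − C̄_u) + (μ_i − C̄_i),
    assuming cluster assignments and averages have already been fitted."""
    cu, ci = user_cl[u], item_cl[i]
    return (avg_co[(cu, ci)]                 # C̄_ui: co-cluster average
            + (mu_u[u] - avg_ucl[cu])        # user deviation from C̄_u
            + (mu_i[i] - avg_icl[ci]))       # item deviation from C̄_i

# Hypothetical fitted values for one user/item pair:
pred = cocluster_predict(
    "alice", "m9",
    mu_u={"alice": 4.0}, mu_i={"m9": 3.0},    # per-user / per-item means
    user_cl={"alice": 0}, item_cl={"m9": 1},  # fitted cluster assignments
    avg_co={(0, 1): 3.6}, avg_ucl={0: 3.8}, avg_icl={1: 3.2},
)
print(round(pred, 2))  # 3.6 + (4.0 − 3.8) + (3.0 − 3.2) = 3.6
```

The user deviates +0.2 above their cluster and the item −0.2 below its cluster, so the two corrections cancel and the co-cluster average is returned unchanged.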

4.3. KNNBaseline Algorithm

This approach focuses on utilizing neighborhood relationships to predict unknown user ratings for specific items. Unlike traditional methods that rely on heuristic similarities between items or users, this approach models neighbor relationships by minimizing a global error function. It further enhances prediction accuracy by incorporating both explicit and implicit user feedback. Traditional models were limited by the need to compute similarity between all item or user pairs. This limitation is overcome by factorizing the neighborhood model, allowing both item-item and user-user implementations to scale linearly with the size of the data [52,55].
r̂_ui = ( Σ_{v ∈ N_i^k(u)} sim(u, v) · r_vi ) / ( Σ_{v ∈ N_i^k(u)} sim(u, v) )
The prediction r̂_ui is the similarity-weighted average rating of the k nearest neighbors—either users or items, depending on the user_based setting in the sim_options parameter.
  • Parameters
  • k: The maximum number of neighbors considered for aggregation. The default value is 40.
  • min_k: The minimum number of neighbors considered for aggregation. If there are not enough neighbors, the aggregation is set to zero. The default value is 1.
  • sim_options: A dictionary of options for computing similarity measures. It is recommended to use the pearson_baseline similarity measure.
  • Application
Its application proves to be effective in scenarios where it is crucial to consider not only specific user preferences but also the overall trend of ratings. This algorithm, combining baseline estimates with the KNN method, provides a robust mechanism for enhancing recommendation accuracy within collaborative filtering. It offers a simple yet effective solution for collaborative filtering, making it a suitable starting point for evaluating the effectiveness of more sophisticated recommendation methods [52,55].
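The neighborhood aggregation step can be sketched as below. This is a simplification: the similarities and ratings are hypothetical, and the baseline-centering that KNNBaseline adds on top of this weighted average is omitted:

```python
def knn_predict(u, i, k, sim, ratings):
    """Similarity-weighted average of the k most similar users' ratings
    for item i (aggregation step only; KNNBaseline would additionally
    center the ratings around baseline estimates).

    sim     -- dict mapping (u, v) to a precomputed similarity score
    ratings -- dict mapping (user, item) to a rating
    """
    neighbours = sorted(
        ((sim[(u, v)], r) for (v, j), r in ratings.items()
         if j == i and v != u and (u, v) in sim),
        reverse=True)[:k]                     # keep the k most similar
    denom = sum(s for s, _ in neighbours)
    return sum(s * r for s, r in neighbours) / denom if denom else None

# Illustrative similarities and ratings for a single target item "m1":
ratings = {("bob", "m1"): 4.0, ("carol", "m1"): 2.0, ("dave", "m1"): 5.0}
sim = {("alice", "bob"): 0.9, ("alice", "carol"): 0.1, ("alice", "dave"): 0.5}
pred = knn_predict("alice", "m1", k=2, sim=sim, ratings=ratings)
print(round(pred, 3))  # (0.9·4 + 0.5·5) / (0.9 + 0.5)
```

With k = 2, the weakly similar user "carol" is dropped, and the prediction leans toward the most similar neighbor's rating.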

4.4. SVD Algorithm

The algorithm known as Singular Value Decomposition (SVD) is used in the context of collaborative filtering to find a low-rank approximation of the user preference matrix [1,25].
In this study, SVD was used as the matrix factorization step within a collaborative filtering framework. Its role was to reduce the dimensionality of the user–item interaction matrix and capture latent preference patterns relevant to the movie recommendation task.
This process involves finding a low-rank factorization R̂ = UᵀV that minimizes the sum of squared distances to the observed entries of the target matrix R. One of the main challenges when applying this algorithm to real-world datasets is that most of them are sparse, meaning that a majority of the values in matrix R are missing. This turns the problem into a complex non-convex optimization challenge [56,57].
The predicted rating r̂_ui is calculated as:
r̂_ui = μ + b_u + b_i + q_iᵀ · p_u
where
  • μ is the global average of all ratings.
  • b_u and b_i are the biases (deviations) of user u and item i.
  • q_i and p_u are the latent factor vectors of item i and user u.
Stochastic Gradient Descent
Minimization is performed using stochastic gradient descent (SGD), where the bias parameters are updated as follows:
b_u ← b_u + γ · (e_ui − λ · b_u)
b_i ← b_i + γ · (e_ui − λ · b_i)
where
  • e_ui = r_ui − r̂_ui is the prediction error.
  • γ and λ are the learning rate and the regularization coefficient.
The learning rate (γ) and the regularization term (λ) can vary for each type of parameter. By default, the learning rates are set to 0.005, and the regularization terms are set to 0.02 [57].
  • Parameters
  • n_factors: The number of latent factors. The default value is 100.
  • n_epochs: The number of SGD iterations. The default value is 20.
  • lr_all: The global learning rate for all parameters. The default value is 0.005.
  • reg_all: The global regularization term for all parameters. The default value is 0.02.
  • Application
SVD and its derived methods have proven to be crucial for the development of modern recommender systems due to their ability to decompose data and uncover hidden patterns. This capability not only enhances prediction accuracy but also provides deeper insights into how user preferences are formed. Additionally, SVD enables the discovery and recommendation of content that a user may not have actively searched for but that aligns with their latent interests. This feature plays a key role in increasing user satisfaction and encourages content exploration, which is essential for maintaining user engagement on online platforms. In the broader context of collaborative filtering, this mathematical approach emerges as a powerful tool for building sophisticated and precise recommender systems, capable of adapting to individual user needs [57,58].
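The SGD updates can be sketched on a toy dataset. This is a minimal re-implementation for illustration, not Surprise's actual code; the factor-vector updates follow the same pattern as the bias updates, and the learning rate and regularization match the defaults quoted above:

```python
import random

def sgd_epoch(ratings, mu, b_u, b_i, p, q, gamma=0.005, lam=0.02):
    """One SGD pass for the biased model r̂_ui = μ + b_u + b_i + q_iᵀ·p_u."""
    for u, i, r in ratings:
        pred = mu + b_u[u] + b_i[i] + sum(pf * qf for pf, qf in zip(p[u], q[i]))
        e = r - pred                                  # e_ui = r_ui − r̂_ui
        b_u[u] += gamma * (e - lam * b_u[u])
        b_i[i] += gamma * (e - lam * b_i[i])
        p[u], q[i] = (                                # simultaneous update
            [pf + gamma * (e * qf - lam * pf) for pf, qf in zip(p[u], q[i])],
            [qf + gamma * (e * pf - lam * qf) for pf, qf in zip(p[u], q[i])],
        )

def model_mae(ratings, mu, b_u, b_i, p, q):
    return sum(abs(r - (mu + b_u[u] + b_i[i]
                        + sum(pf * qf for pf, qf in zip(p[u], q[i]))))
               for u, i, r in ratings) / len(ratings)

random.seed(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0)]
mu = sum(r for _, _, r in ratings) / len(ratings)
b_u, b_i = [0.0, 0.0], [0.0, 0.0, 0.0]
p = [[random.uniform(0, 0.1) for _ in range(2)] for _ in range(2)]  # 2 factors
q = [[random.uniform(0, 0.1) for _ in range(2)] for _ in range(3)]
mae_before = model_mae(ratings, mu, b_u, b_i, p, q)
for _ in range(20):                                   # n_epochs = 20 (default)
    sgd_epoch(ratings, mu, b_u, b_i, p, q)
mae_after = model_mae(ratings, mu, b_u, b_i, p, q)
print(mae_before, mae_after)                          # the error shrinks
```

Only two latent factors are used here to keep the sketch small; Surprise defaults to 100.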

4.5. NMF Algorithm

The Non-negative Matrix Factorization (NMF) algorithm represents a revolutionary tool in the field of collaborative filtering, offering a unique approach to decoding user preferences and item characteristics by decomposing rating matrices into non-negative factors. This method stands out for its ability to efficiently process highly sparse data while ensuring better interpretability of results by maintaining the non-negativity of elements. With innovations such as element-wise updates and the integration of regularization terms, this approach has become even more suitable for addressing the specific challenges of collaborative filtering [59].
The predicted rating $\hat{r}_{ui}$ is calculated as
$$\hat{r}_{ui} = q_i^\top p_u$$
Optimization is performed using regularized stochastic gradient descent (SGD) with a specific step size selection that ensures the non-negativity of factors, provided that their initial values are also positive.
  • Parameters
  • n_factors: The number of latent factors. The default value is 15.
  • n_epochs: The number of iterations for the SGD procedure. The default value is 50.
  • biased: Specifies whether to use biases (baseline estimates). The default value is False.
  • reg_pu, reg_qi: Regularization terms for user and item latent factors. The default value is 0.06.
  • Application
The algorithm is highly sensitive to initial values, and it is recommended to initialize user and item factors within the range of init_low and init_high to ensure non-negativity. The option to use a bias-enhanced version allows for the inclusion of baseline estimates in predictions, which can improve accuracy but also increase the risk of overfitting. To mitigate this, it may be beneficial to reduce the number of factors or increase regularization [59].
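To illustrate why positive initialization preserves non-negativity, the sketch below factorizes a small dense matrix with the classical Lee–Seung multiplicative updates. Note that this is a deliberately different (and simpler) procedure than the regularized element-wise SGD used by the NMF variant described above: multiplicative updates make the non-negativity guarantee easy to see, since each factor is only ever multiplied by non-negative ratios. All names and the toy matrix are illustrative.

```python
import random

def nmf(V, k=2, n_epochs=300, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (list of lists) as V ~ W @ H using
    classical multiplicative updates; factors stay non-negative as long as
    their initial values are positive."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(m)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(k)]

    def matmul(A, B):
        return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    def transpose(A):
        return [list(col) for col in zip(*A)]

    for _ in range(n_epochs):
        # H <- H * (W^T V) / (W^T W H), element-wise
        WT = transpose(W)
        num, den = matmul(WT, V), matmul(matmul(WT, W), H)
        H = [[H[a][b] * num[a][b] / (den[a][b] + eps) for b in range(n)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        HT = transpose(H)
        num, den = matmul(V, HT), matmul(W, matmul(H, HT))
        W = [[W[a][b] * num[a][b] / (den[a][b] + eps) for b in range(k)] for a in range(m)]
    return W, H, matmul(W, H)

V = [[5.0, 3.0, 1.0], [4.0, 1.0, 1.0], [1.0, 1.0, 5.0]]
W, H, V_hat = nmf(V)
```

Both factors remain non-negative throughout training, and the reconstruction W·H approximates the original matrix.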

4.6. SlopeOne Algorithm

The algorithm utilizes the principle of “popularity differential” between items, based on the average rating difference of one item compared to another among users who have rated both. The simplicity of the algorithm lies in computing and applying these differentials to predict unknown item ratings. It assumes that a user will rate an item similarly, in relation to the average rating difference between the two items [60].
The predicted rating $\hat{r}_{ui}$ is calculated as
$$\hat{r}_{ui} = \mu_u + \frac{1}{|R_i(u)|} \sum_{j \in R_i(u)} \mathrm{dev}(i, j)$$
where
  • $\mu_u$ is the average rating of user u.
  • $R_i(u)$ is the set of relevant items, i.e., the set of items j that user u has rated and that have at least one common user with item i.
  • $\mathrm{dev}(i, j)$ is the average difference between the ratings of item i and the ratings of item j.
  • Application
Slope One is a collaborative filtering algorithm based on a simple derivation method from user ratings, characterized by low computational complexity and easy implementation. This makes it an ideal choice for systems with limited resources. The algorithm averages the differences in scores between pairs of items rated by different users and applies these averages to predict missing ratings, allowing for efficient handling of large datasets. Its ability to quickly generate reliable predictions with minimal complexity makes it an attractive option for a wide range of recommender systems [60].
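The prediction rule above fits in a few lines of plain Python. The sketch below is an illustrative re-implementation, not the Surprise SlopeOne class; the dictionary-of-dictionaries rating format and the function names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

def slope_one(ratings):
    """ratings: {user: {item: rating}}. Returns a predict(u, i) function."""
    dev = defaultdict(float)  # summed, then averaged, rating differences
    cnt = defaultdict(int)    # number of users who rated both items
    for user_ratings in ratings.values():
        for i, j in combinations(user_ratings, 2):
            diff = user_ratings[i] - user_ratings[j]
            dev[(i, j)] += diff
            dev[(j, i)] -= diff
            cnt[(i, j)] += 1
            cnt[(j, i)] += 1
    for key in dev:
        dev[key] /= cnt[key]  # dev(i, j): average difference of item i over item j

    def predict(u, i):
        rated = ratings[u]
        mu_u = sum(rated.values()) / len(rated)
        relevant = [j for j in rated if cnt.get((i, j), 0) > 0]
        if not relevant:
            return mu_u  # fall back to the user's mean rating
        return mu_u + sum(dev[(i, j)] for j in relevant) / len(relevant)

    return predict

ratings = {"u1": {"A": 1.0, "B": 1.5},
           "u2": {"A": 2.0, "B": 2.5, "C": 3.0},
           "u3": {"B": 2.0}}
predict = slope_one(ratings)
```

In the toy data, items B and C differ on average by 0.5 among common raters, so user u3 (average rating 2.0) receives a prediction of 2.5 for item C.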

5. Results

In this section, the results of testing the implemented algorithms will be presented. The algorithms were tested using the MovieLens database [46,47]. This database consists of personalized ratings of various movies from a large number of users and is developed by the research lab at the University of Minnesota. To evaluate the models, the Precision, Recall, and F1-score metrics were used, allowing an assessment of their ability to identify relevant items.

5.1. Dataset Preparation

For experimental purposes, the well-known MovieLens dataset was chosen, which is a commonly used standard in the field of recommender systems. This dataset includes:
  • 100,000 ratings
  • Provided by 943 users
  • 1682 rated movies
  • Ratings range from 0.5 to 5
A higher rating indicates a greater user preference for a movie. Thanks to its rich rating scale and a representative sample of users, this dataset provides a strong foundation for testing and comparing the performance of recommendation algorithms. The dataset was split into training and test sets in a 75:25 ratio, which is a common practice in machine learning evaluation.
  • The training set is used to build the model
  • The test set is used to validate predictive performance on unseen data
This data split enables an objective evaluation of algorithms and provides an accurate measure of their performance.
To analyze the MovieLens dataset, we focused on genre preferences among directors. Figure 5 illustrates the distribution of these preferences based on the number of films in each genre. The data clearly show that the most popular genres are comedy and drama, with their dominance evident among both audiences and filmmakers. Drama and comedy significantly outnumber other genres in terms of representation.
This graph is part of a broader analysis, which includes user ratings and movie characteristics, providing insights into trends. When working with the data, we ensured consistent preprocessing across all analyzed algorithms:
  • Errors and incomplete records (including incorrectly formatted ratings) were removed.
  • The data were normalized to maintain a consistent value range for all applied algorithms.

5.2. Model Evaluation

This section provides an overall perspective on the evaluation process of algorithms, covering selection and configuration, training, final testing, and result analysis. We focus on methods for measuring algorithm runtime and efficiency within the given data model, allowing us to identify the most promising candidates for effective recommender systems.
In Algorithm 1 the function determines a user’s favorite and least favorite genres based on their movie ratings.
  • First, it records the occurrence count of each genre in the rated movies using a Counter data structure.
  • Then, it sorts the genres in descending order based on their frequency.
Algorithm 1: Obtaining lists of favorite and least favorite genres.
Require: user_ratings
Require: all_genres
1: genre_counts ← Counter()
2: for genres in user_ratings[‘genres’] do
3:    for genre in genres.split(‘|’) do
4:     genre_counts[genre] += 1
5: sorted_genres ← sorted(all_genres, key = lambda x: genre_counts.get(x, 0), reverse = True)
6: mid_point ← (len(sorted_genres) + 1)//2
7: return sorted_genres[:mid_point], sorted_genres[mid_point:]
Since the dataset contains 19 different genres, the method splits them into favorite and least favorite genres in a 10:9 ratio. The midpoint of the list determines the division between preferred and non-preferred genres.
This approach personalizes movie recommendations, ensuring they closely align with the user’s individual preferences.
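Algorithm 1 translates almost directly into Python. In the sketch below, the midpoint is rounded up so that 19 genres split 10:9 as described; the input format (one 'Genre1|Genre2' string per rated movie) mirrors the MovieLens genres column, and the function name follows the get_fav_unfav_genres function referenced later in the paper.

```python
from collections import Counter

def get_fav_unfav_genres(user_genre_strings, all_genres):
    """Split all_genres into favorite and least favorite halves for one user,
    based on how often each genre occurs among the user's rated movies."""
    genre_counts = Counter()
    for genres in user_genre_strings:
        for genre in genres.split('|'):
            genre_counts[genre] += 1
    # Sort genres by how often the user rated movies containing them.
    sorted_genres = sorted(all_genres, key=lambda g: genre_counts.get(g, 0),
                           reverse=True)
    mid_point = (len(sorted_genres) + 1) // 2  # round up: 19 genres -> 10 favorites
    return sorted_genres[:mid_point], sorted_genres[mid_point:]
```

For example, a user who rated mostly Action and Drama movies gets those genres in the favorite half, while unseen genres fall into the least favorite half.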
In Algorithm 2 the function calculates the relevance of recommended movies based on how well they match the user’s favorite and least favorite genres.
  • For each recommended movie, it determines how many of its genres are favorite and how many are least favorite.
  • Relevance is then computed as the difference between the number of favorite and least favorite genres.
  • A movie is considered relevant if this difference is positive.
Algorithm 2: Relevance of recommended movies.
Require: recommended_movies
Require: fav_genres, unfav_genres
1: relevant_movies ← 0
2: for each row in recommended_movies do
3:   genres ← row[‘genres’].split(‘|’)
4:   score ← sum(1 for genre in genres if genre in fav_genres) −
           sum(1 for genre in genres if genre in unfav_genres)
5:   if score > 0 then
6:     relevant_movies ← relevant_movies + 1
7: return relevant_movies
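Algorithm 2 is likewise directly runnable. The sketch below uses (title, genre-string) tuples, the same shape as the recommendation list shown later in Algorithm 7; the function name compute_relevance matches the description in Section 5.3.

```python
def compute_relevance(recommended_movies, fav_genres, unfav_genres):
    """Count recommended movies whose genres match more favorite
    than least favorite genres of the user."""
    relevant_movies = 0
    for _title, genres_str in recommended_movies:
        genres = genres_str.split('|')
        score = (sum(1 for g in genres if g in fav_genres)
                 - sum(1 for g in genres if g in unfav_genres))
        if score > 0:
            relevant_movies += 1
    return relevant_movies
```

A movie such as 'Casablanca (1942)' with genres 'Drama|Romance' scores +1 for Drama and −1 for Romance when those fall into opposite halves, giving a net score of 0, so it does not count as relevant.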
The list of algorithms used for modeling and evaluation is given in Algorithm 3. The selection includes standard algorithms such as:
  • SVD (Singular Value Decomposition)
  • CoClustering
  • NMF (Non-negative Matrix Factorization)
  • SlopeOne
  • KNNBaseline
  • BaselineOnly
Algorithm 3: List of algorithms for comparison.
Require: SVD
Require: CoClustering
Require: NMF
Require: SlopeOne
Require: KNNBaseline
Require: BaselineOnly
1: algorithms ← [SVD(), CoClustering(), NMF(), SlopeOne(), KNNBaseline(), BaselineOnly()]
The code snippet in Algorithm 4 demonstrates the process of splitting the dataset into training and test sets. The training set is used for building and learning the model, while the test set is used to evaluate its performance on data that was not used during training. In this case, the train_test_split method from the Surprise library is called with the parameters data and test_size = 0.25, meaning that 25% of the data is randomly selected for testing the model, while the remaining 75% is used for training.
Algorithm 4: Splitting the data into training and test sets.
Require: ratings_df
Require: Reader
Require: Dataset
Require: train_test_split
1: reader ← Reader(rating_scale=(0.5, 5))
2: data ← Dataset.load_from_df(ratings_df[[‘userId’, ‘movieId’, ‘rating’]], reader)
3: trainset, testset ← train_test_split(data, test_size=0.25)
Splitting data into training and test sets is a fundamental step in machine learning, providing the basis for objective model evaluation. Training data helps the algorithm learn patterns and relationships relevant to the given task. Test data allows verification of whether these patterns and relationships can be successfully applied to unseen data.
The code in Algorithm 5 demonstrates an iteration over different recommendation algorithms for their training and evaluation. For each algorithm, a timestamp is recorded before and after the evaluation process to measure the execution time of each evaluation. This is crucial for assessing the performance of the algorithm in terms of runtime efficiency.
Algorithm 5: Iteration through algorithms, training, and evaluation.
Require: algorithms
Require: trainset
Require: testset
Require: time
1: for each algorithm in algorithms do
2:    start_time ← time.time()
3:    algorithm.fit(trainset)
4:    predictions ← algorithm.test(testset)
5:    end_time ← time.time()
6:    eval_time ← end_time − start_time
Each prediction is an object with the following attributes:
  • uid: The user ID for whom the prediction was generated. This is the user identifier from the test set.
  • iid: The movie ID for which the prediction was generated. This is the item identifier from the test set.
  • r_ui: The actual rating the user gave to the movie. This value comes from the test set and is used for comparison with the predicted rating.
  • est: The value predicted by the algorithm. This is the predicted rating that the algorithm generated for the given user-movie combination.
This step is crucial, as it directly impacts the quality of recommendations by considering the user’s personal taste in movie genres. The code in Algorithm 6 illustrates how data is further processed to identify relevant genre preferences. After predicting a movie rating, the algorithm analyzes whether the movie contains genres that the user prefers or dislikes. This is achieved by comparing the user’s genre preferences from the dataset with the recommended movies that have been suggested to the user but have not yet been rated.
Algorithm 6: Identification of relevant genre preferences.
Require: predictions
Require: movies_df
Require: fav_genres
Require: unfav_genres
Require: is_genre_relevant
1: genre_relevance_info ← []
2: for each pred in predictions do
3:    movie_id ← pred.iid
4:    genres_str ← movies_df.loc[movies_df[‘movieId’] == movie_id, ‘genres’].values[0]
5:    genres ← genres_str.split(‘|’)
6:    is_relevant ← is_genre_relevant(genres, fav_genres, unfav_genres)
7:    genre_relevance_info.append(is_relevant)
8: relevant_predictions ← [pred for pred, is_relevant in zip(predictions, genre_relevance_info) if is_relevant]
The output in Algorithm 7 demonstrates the effectiveness of the algorithm based on the ratings of 20 randomly selected users from the test set. For these users, a set of favorite and least favorite genres is determined based on their previous ratings. The algorithm predicts 15 movies that the user might want to watch and evaluates their relevance according to genre preferences. The algorithm’s performance is assessed using Precision, Recall, and F1-score:
  • Precision measures how many of the recommended movies were actually relevant.
  • Recall evaluates how many relevant movies were successfully recommended.
  • F1-score provides a balanced measure of both metrics.
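These metrics reduce to a few lines of code. The sketch below computes Precision from relevance counts and combines Precision and Recall into the F1 score; plugging in the values reported for user 77 in Algorithm 7 (Precision 0.80, Recall 0.52) reproduces the stated F1 of 0.63. How Recall itself is obtained (the compute_recall_relevance step) is not detailed in the text, so only its combination with Precision is shown here.

```python
def precision_at_n(relevant_recommended, n_recommended):
    """Fraction of the n recommended movies that were relevant."""
    return relevant_recommended / n_recommended if n_recommended else 0.0

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 12 of 15 recommendations relevant, Precision is 0.80; combining that with a Recall of 0.52 yields an F1 score of about 0.63, matching Algorithm 7.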
Algorithm 7: Identification of relevant genre preferences—SVD algorithm.
Require: user_id
Require: fav_genres
Require: unfav_genres
Require: precision
Require: recall
Require: f1_score
Require: recommended_movies
1: user_id ← 77
2: fav_genres ← [‘Action’, ‘Adventure’, ‘Sci-Fi’, ‘Drama’, ‘Thriller’]
3: unfav_genres ← [‘Romance’, ‘Children’, ‘Animation’, ‘Western’, ‘Horror’]
4: precision ← 0.80
5: recall ← 0.52
6: f1_score ← 0.63
7: recommended_movies ← [
   (‘Usual Suspects, The (1995)’, ‘Crime|Mystery|Thriller’),
   (‘Shawshank Redemption, The (1994)’, ‘Crime|Drama’),
   (‘Forrest Gump (1994)’, ‘Comedy|Drama|Romance|War’),
   (‘Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb (1964)’, ‘Comedy|War’),
   (‘Godfather, The (1972)’, ‘Crime|Drama’),
   (‘Rear Window (1954)’, ‘Mystery|Thriller’),
   (‘Casablanca (1942)’, ‘Drama|Romance’),
   (‘Reservoir Dogs (1992)’, ‘Crime|Mystery|Thriller’),
   (‘Monty Python and the Holy Grail (1975)’, ‘Adventure|Comedy|Fantasy’),
   (‘Good, the Bad and the Ugly, The (1966)’, ‘Action|Adventure|Western’),
   (‘Lawrence of Arabia (1962)’, ‘Adventure|Drama|War’),
   (‘Apocalypse Now (1979)’, ‘Action|Drama|War’),
   (‘Goodfellas (1990)’, ‘Crime|Drama’),
   (‘Godfather: Part II, The (1974)’, ‘Crime|Drama’),
   (‘Fight Club (1999)’, ‘Action|Crime|Drama|Thriller’)]
The results indicate how effectively the algorithm identified suitable recommendations that match user preferences. A user-centered approach focused on genre preferences increases the likelihood that recommended movies will be appreciated and watched. The script illustrates the process of setting up and evaluating the recommendation model, including:
  • Data preparation
  • Algorithm selection
  • Model training and testing
  • Adjusting input data to account for user preferences

5.3. Experimental Results

In this section, we focus on the analysis and interpretation of data from experiments with various recommendation algorithms. To ensure objective results, a custom evaluation mechanism for genre relevance was applied. The model works with a database containing 19 genres, which are divided for each user into 10 favorites and 9 least favorites based on their past movie ratings. The relevance of a recommended movie is assessed based on whether it falls into the user’s favorite genres. To be classified as relevant, a movie must contain more favorite genres than least favorite genres. Relevance is evaluated using a weighted scoring mechanism, where each favorite genre adds a point, and each least favorite genre subtracts a point. This mechanism provides a better understanding and quantification of how accurately recommendations align with individual user preferences.
Below, pseudocode is displayed, outlining the key functions and principles used for evaluation.
Require: movies_df
1: all_genres ← empty set
2: for each row in movies_df[‘genres’] do
  a: genres ← split row by ‘|’
  b: add genres to all_genres
3: return all_genres as list
Pseudocode for a method to extract all genres from the dataset.
Require: user_ratings
Require: all_genres
1: genre_counts ← Counter()
2: for each row in user_ratings[‘genres’] do
  a: genres ← split row by ‘|’
  b: for each genre in genres do
    i: genre_counts[genre] += 1
3: sorted_genres ← sort all_genres by values in genre_counts descending
4: mid_point ← ⌈length(sorted_genres)/2⌉
5: fav_genres ← first mid_point genres in sorted_genres
6: unfav_genres ← remaining genres in sorted_genres
7: return fav_genres and unfav_genres
Pseudocode for a method to divide genres into favorite and least favorite.
The pseudocode above describes two functions. The extract_genres function extracts all unique genres from the movie dataset and returns them as a list. The get_fav_unfav_genres function identifies a user’s favorite and least favorite genres based on their ratings. The function analyzes the user’s ratings, counts the frequency of each genre, sorts them by popularity, and divides them into favorite and least favorite genres. This approach allows for a better understanding of the user’s preferences.
Require: recommended_movies
Require: fav_genres, unfav_genres
1: relevant_movies ← 0
2: for each film in recommended_movies do
  a: genres ← split film[‘genres’] by ‘|’
  b: score ← 0
  c: for each genre in genres do
     i: if genre in fav_genres then
       score ← score + 1
     ii: if genre in unfav_genres then
       score ← score − 1
  d: if score > 0 then
    relevant_movies ← relevant_movies + 1
3: return relevant_movies
Pseudocode for calculating the relevance of a given movie.
This pseudocode describes the compute_relevance function, which evaluates the relevance of recommended movies based on user preferences. The function analyzes the genres of each recommended movie and compares them with the user’s preferences. Based on whether the genres match the favorite or least favorite categories, the function calculates a score and determines the relevance of the movie. At the end, it returns the total number of relevant movies.
Require: None
1: Load and prepare data
  a: Load ‘movies.csv’ into movies_df
  b: Load ‘ratings.csv’ into ratings_df
  c: Join ratings_df and movies_df on ‘movieId’ into ratings_with_genres
  d: Extract unique genres into all_genres using extract_genres(movies_df)
2: Split data
  a: Split data into training (75%) and testing (25%) sets using train_test_split
3: Select algorithms and users
  a: Algorithms ← [SVD, CoClustering, NMF, SlopeOne, KNNBaseline, BaselineOnly]
  b: Randomly select 20 users from the test set into selected_users
4: Test algorithms
  for each algorithm in Algorithms do
    a: Train algorithm on the training set
    for each user in selected_users do
     i: Get user ratings from ratings_with_genres
     ii: Calculate favorite and unfavorite genres using get_fav_unfav_genres
     iii: Generate recommendations using an algorithm
     iv: Calculate relevance using compute_relevance
     v: Calculate recall using compute_recall_relevance
     vi: Calculate precision as the ratio of relevant films to total recommended films
     vii: Calculate F1 score as the harmonic mean of precision and recall
     viii: Store precision, recall and F1 score in lists
5: Evaluate results
  for each algorithm in Algorithms do
   a: Calculate average precision, recall and F1 score
   b: Print average precision, recall and F1 score for a given algorithm
Pseudocode for the entire mechanism of calculation and verification.
The pseudocode describes the complete process of experimenting with recommendation algorithms, from loading and preparing data to evaluating results. This pseudocode provides a detailed guide for conducting experiments with various recommendation algorithms. It outlines the entire process, including data loading and preparation, training and testing of algorithms, and performance evaluation using different metrics. This approach enables a systematic and comparable assessment of various recommendation algorithms.
These experiments focused on evaluating algorithm performance using the MovieLens dataset and were conducted repeatedly, specifically 10 times for each algorithm. To train the models, the dataset was split into 75% for training and 25% for testing using the train_test_split(data, test_size = 0.25) function. Next, 20 users were randomly selected from the test set, and each algorithm recommended the top 15 movies they had not yet rated.
Table 1, Table 2 and Table 3 show the comparison of algorithms based on different metrics.
The performance of each algorithm was evaluated based on key metrics, including RMSE, MAE, Precision, Recall, and F1-score. These metrics provide a comprehensive view of how well an algorithm predicts user ratings and how effectively it identifies relevant movies. In addition to these metrics, the time required to complete the experiment for each algorithm was also recorded. This adds another dimension of evaluation in terms of computational complexity. This aspect is crucial for practical applications, where the execution time of recommendation algorithms plays a significant role in real-world usability.
A performance comparison of recommender system algorithms shows that the SVD algorithm performed the best in terms of RMSE and MAE metrics, indicating its ability to accurately predict user ratings with minimal error. Low RMSE and MAE values are crucial indicators for applications where prediction accuracy is a priority. Compared to other algorithms, such as NMF or CoClustering, SVD delivers more consistent and precise results, as confirmed by an average score of 0.8128 across ten simulated trials. Additionally, SVD is among the fastest algorithms, making it a strong choice for efficient and high-accuracy recommender systems.
Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 show the comparison of algorithms based on different metrics.
SlopeOne and KNNBaseline demonstrated the ability to include a larger number of relevant movies, which is evident from their F1 score. They are well-suited for scenarios where maximizing coverage of potentially interesting movies is a priority, even though they exhibit longer processing times and sometimes lower accuracy. BaselineOnly stands out for its exceptional speed and solid accuracy, making it a good choice for cases where fast processing is a priority. Its ability to quickly generate accurate recommendations is significant, though its lower Recall and F1-score may limit its usefulness in scenarios requiring broader coverage. Our custom genre relevance model provided valuable insights into how different algorithms respond to individual user preferences, adding an extra dimension to the evaluation of their performance.
A graph of evaluation speed of algorithms is shown in Figure 11.
The graph comparing the computational times of individual runs of the tested recommendation algorithms reveals significant differences in their computational demands. While the SlopeOne algorithm, with a runtime of 5.81 s, ranks among the slowest due to the large number of pairwise item rating deviations it must compute, the BaselineOnly algorithm stands out with a speed of just 0.21 s, reflecting its simplicity and efficiency. Other algorithms, such as SVD and KNNBaseline, also demonstrate solid efficiency with execution times of 1.01 and 1.19 s, making them ideal for applications requiring fast and accurate recommendations. The CoClustering and NMF algorithms, with runtimes of 1.76 and 1.52 s, are similarly fast.
Choosing the right algorithm should take into account not only accuracy and the ability to generate relevant recommendations but also speed, which is crucial in environments where time is a critical factor. This graph plays a key role in assessing the practical applicability of these algorithms in real-time scenarios. Table 4 shows averages of key metrics for different algorithms.
As part of the experiment, ten performance measurements were conducted for various recommender system algorithms. The results show that the SVD algorithm achieves the best values for RMSE (0.8656) and MAE (0.6720), demonstrating high accuracy and reliability in predicting ratings. SVD also had the highest average accuracy of 0.8128 among the algorithms, placing it first based on the evaluated metrics.
The NMF and CoClustering algorithms show higher values for RMSE (0.9814, 0.9316) and MAE (0.7554, 0.7330), indicating lower prediction accuracy compared to SVD, while also exhibiting higher computational times. SlopeOne, with RMSE (0.9119) and MAE (0.7026) values, requires a significantly longer processing time (6.0221 s), which is the highest among the tested algorithms. KNNBaseline offers a good compromise between accuracy and computation time with relatively balanced metrics, although it has a low Recall (0.3511). BaselineOnly provides comparable accuracy to SVD with a significantly shorter computation time (0.2781 s), making it suitable for applications that require fast processing.
The experiment demonstrated that the choice of algorithm should be made based on the specific requirements of the application. Algorithms like SVD and BaselineOnly proved to be highly effective in terms of accuracy and processing speed, making them ideal for applications that require fast and accurate results. In contrast, SlopeOne and CoClustering showed longer processing times, which could pose a challenge for applications that require real-time interaction. The SVD algorithm appears to be the best choice for most applications due to its accuracy and speed, while other algorithms may be suitable in scenarios where different performance and efficiency requirements exist.

6. Conclusions

Our experimental evaluation demonstrated the effectiveness of recommendation approaches in addressing common challenges in recommender systems. The paper utilized multiple evaluation metrics, including Precision, Recall, and F1 score, to assess algorithm performance on the MovieLens dataset using the Surprise library. The analysis revealed that the SVD algorithm achieved superior performance metrics, with an average accuracy of 0.8128 and minimal RMSE and MAE values. This performance, combined with efficient evaluation speed, established SVD as a particularly effective baseline algorithm.

6.1. Practical Implications

The experimental results confirm that SVD is a particularly effective algorithm. As a collaborative filtering algorithm, it can be used:
  • As an efficient component of a hybrid recommender system that combines collaborative filtering with a content-based approach;
  • As a means of capturing underlying patterns in user preferences and item attributes, thanks to its factorization properties;
  • As an efficient general-purpose algorithm for implementing recommender systems in various areas.
The SVD algorithm is very useful in recommender systems across various domains. In addition to movie recommender systems, it can be used in e-commerce, e-tourism services, news portals, social media, and other domains, as mentioned in the Introduction. Its use is appropriate wherever a reasonable amount of user ratings is available, so that a user–item matrix can be constructed; under such conditions, the strengths of the algorithm can be fully exploited. The way the algorithm is implemented depends on the specific situation and domain; we have successfully implemented SVD in several recommender systems [61,62].

6.2. Future Work

In future work, we plan to focus on several areas.
The first area is the dynamic modeling of user preferences. It would be beneficial to not only identify favorite genres, actors, or directors but also those that users actively dislike. Such negative feedback could be integrated into the recommendation process to reduce the likelihood of recommending content that users might find unappealing. Over time, user preferences naturally evolve, so it will be important to continuously monitor and adapt the recommendation system to reflect these changes, ensuring the content remains relevant and engaging. Additionally, incorporating fuzzy logic into this modeling process could provide a more nuanced and flexible evaluation of user preferences, allowing for more personalized and precise recommendations.
Another area is the refinement of conflict resolution mechanisms in recommendations. Sometimes, a single movie might have both positive and negative traits—for example, a favored genre paired with a disliked actor. In such cases, the system needs to dynamically assess the weight of these factors and calculate an appropriate recommendation score. Developing more robust algorithms, potentially leveraging fuzzy logic to balance these conflicting inputs, will improve the overall user experience and increase the likelihood that users trust and rely on the recommendations.
Next, we aim to expand the recommendation models used in the system. By integrating additional models, such as more sophisticated collaborative filtering techniques or knowledge-based approaches, we can create a more robust and adaptable recommendation framework. This expanded set of models would enable the system to better capture subtle user preferences and provide more accurate suggestions.

Author Contributions

Conceptualization, B.W.; Data Curation, O.S.; Methodology, B.W. and O.S.; Project Administration, B.W.; Resources, O.S.; Software, O.S.; Supervision, B.W.; Validation, O.S.; Visualization, O.S.; Writing—Original Draft Preparation, B.W. and O.S.; Writing—Review and Editing, B.W. and O.S. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by a Student Grant SGS11/PŘF/2025 with student participation, supported by the Czech Ministry of Education, Youth and Sports.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Falk, K. Practical Recommender Systems; Manning Publications: Shelter Island, NY, USA, 2019. [Google Scholar]
  2. Raza, S.; Rahman, M.; Kamawal, S.; Toroghi, A.; Raval, A.; Navah, F.; Kazemeini, A. A comprehensive review of recommender systems: Transitioning from theory to practice. arXiv 2024, arXiv:2407.13699. [Google Scholar] [CrossRef]
  3. Goodrow, C. On YouTube’s Recommender System. YouTube Official Blog. 2021. Available online: https://blog.youtube/inside-youtube/on-youtubes-recommendation-system/ (accessed on 15 April 2025).
  4. Netflix. How Netflix’s Recommendations System Works. Netflix Help Center. 2023. Available online: https://help.netflix.com/en/node/100639 (accessed on 15 April 2025).
  5. Smith, B.; Linden, G. Two decades of recommender systems at Amazon.com. IEEE Internet Comput. 2017, 21, 12–18. [Google Scholar] [CrossRef]
  6. Zhou, R.; Khemmarat, S.; Gao, L. The impact of YouTube Recommender system on video views. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, Melbourne, Australia, 1–3 November 2010; pp. 404–410. [Google Scholar]
  7. Mangalindan, J.P. Amazon’s Recommendation Secret. Fortune. 30 July 2012. Available online: https://fortune.com/2012/07/30/amazons-recommendation-secret (accessed on 15 April 2025).
  8. Google Ads. Google Ads–Get Customers and Sell More with Online Advertising. 2024. Available online: https://ads.google.com/ (accessed on 15 April 2025).
  9. Google News. 2024. Available online: https://news.google.com/topstories (accessed on 15 April 2025).
  10. YouTube. 2024. Available online: https://www.youtube.com (accessed on 15 April 2025).
  11. Azaria, A.; Hassidim, A.; Kraus, S.; Eshkol, A.; Weintraub, O.; Netanely, I. Movie recommender system for profit maximization. In Proceedings of the 7th ACM conference on Recommender systems, Hong Kong, China, 12–16 October 2013; pp. 121–128. [Google Scholar]
  12. McLachlan, S. How the YouTube Algorithm Works in 2023: The Complete Guide; Hootsuite: Vancouver, BC, USA, 2023. [Google Scholar]
  13. Millecamp, M.; Htun, N.N.; Jin, Y.; Verbert, K. Controlling spotify recommendations: Effects of personal characteristics on music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, Singapore, 8–11 July 2018; pp. 101–109. [Google Scholar]
  14. Anandhan, A.; Shuib, L.; Ismail, M.A.; Mujtaba, G. Social media recommender systems: Review and open research issues. IEEE Access 2018, 6, 15608–15628. [Google Scholar] [CrossRef]
  15. Ilic, A.; Kabiljo, M. Recommending Items to More Than a Billion People. Engineering at Meta. 2018. Available online: https://engineering.fb.com/2015/06/02/core-infra/recommending-items-to-more-than-a-billion-people/ (accessed on 15 April 2025).
  16. Hwangbo, H.; Kim, Y.S.; Cha, K.J. Recommender system development for fashion retail e-commerce. Electron. Commer. Res. Appl. 2018, 28, 94–101. [Google Scholar] [CrossRef]
  17. Ahn, H.; Kim, K.J.; Han, I. Mobile advertisement recommender system using collaborative filtering: MAR-CF. In KGSF-Conference; The Korea Society of Management Information Systems: Seoul, Republic of Korea, 2006; Volume 2006, pp. 709–715. [Google Scholar]
  18. Broder, A.Z. Computational advertising and recommender systems. In Proceedings of the 2008 ACM Conference on Recommender Systems, Lausanne, Switzerland, 23–25 October 2008; pp. 1–2. [Google Scholar]
  19. Sebastia, L.; Garcia, I.; Onaindia, E.; Guzman, C. e-Tourism: A tourist recommendation and planning application. Int. J. Artif. Intell. Tools 2009, 18, 717–738. [Google Scholar] [CrossRef]
  20. Steinbauer, A.; Werthner, H. Consumer Behaviour in e-Tourism; Springer: Vienna, Austria, 2007; pp. 65–76. [Google Scholar]
  21. Darban, Z.Z.; Valipour, M.H. GHRS: Graph-based hybrid recommendation system with application to movie recommendation. Expert Syst. Appl. 2022, 200, 116850. [Google Scholar] [CrossRef]
  22. Lavanya, R.; Singh, U.; Tyagi, V. A comprehensive survey on movie recommendation systems. In Proceedings of the 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS), Coimbatore, India, 25–27 March 2021; pp. 532–536. [Google Scholar]
  23. Mu, Y.; Wu, Y. Multimodal movie recommendation system using deep learning. Mathematics 2023, 11, 895. [Google Scholar] [CrossRef]
  24. Sankareswaran, S.P.; Sugumaran, V.; Senthilnathan, H.; Wesley, J.S. Movie recommender system using hybrid filtering techniques. In AIP Conference Proceedings; AIP Publishing LLC: Melville, NY, USA, 2025; Volume 3175, p. 020015. [Google Scholar]
  25. Aggarwal, C.C. Recommender Systems: The Textbook; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  26. Adomavicius, G.; Tuzhilin, A. Towards the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 2005, 17, 734–749. [Google Scholar] [CrossRef]
27. Ricci, F.; Rokach, L.; Shapira, B. Introduction to recommender systems handbook. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Kantor, P.B., Eds.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
  28. Netflix and the Recommender System Medium. 2020. Available online: https://medium.com/swlh/netflix-and-the-recommendation-system-e806f062ba74 (accessed on 15 April 2025).
29. Symeonidis, P.; Nanopoulos, A.; Manolopoulos, Y. MoviExplain: A recommender system with explanations. In Proceedings of the Third ACM Conference on Recommender Systems, New York, NY, USA, 22–25 October 2009; pp. 317–320. [Google Scholar]
30. Ho, A.T.; Menezes, I.L.; Tagmouti, Y. E-mrs: Emotion-based movie recommender system. In Proceedings of the IADIS e-Commerce Conference, Barcelona, Spain, 9–11 December 2006; University of Washington Bothell: Bothell, WA, USA, 2006; pp. 1–8. [Google Scholar]
  31. Behera, G.; Nain, N. Collaborative filtering with temporal features for movie recommendation system. Procedia Comput. Sci. 2023, 218, 1366–1373. [Google Scholar] [CrossRef]
  32. Siet, S.; Peng, S.; Ilkhomjon, S.; Kang, M.; Park, D.S. Enhancing sequence movie recommendation system using deep learning and kmeans. Appl. Sci. 2024, 14, 2505. [Google Scholar] [CrossRef]
  33. Sridhar, S.; Dhanasekaran, D.; Latha, G. Content-Based Movie Recommendation System Using MBO with DBN. Intell. Autom. Soft Comput. 2023, 35, 3241–3257. [Google Scholar] [CrossRef]
  34. Burke, R. Hybrid web recommender systems. In The Adaptive Web; Springer: Berlin/Heidelberg, Germany, 2007; pp. 377–408. [Google Scholar]
  35. Lu, J.; Wu, D.; Mao, M.; Wang, W.; Zhang, G. Recommender system application developments: A survey. Decis. Support Syst. 2015, 74, 12–32. [Google Scholar] [CrossRef]
  36. Roy, D.; Dutta, M. A systematic review and research perspective on recommender systems. J. Big Data 2022, 9, 59. [Google Scholar] [CrossRef]
  37. Kumar, P.P. Recommender Systems Algorithms and Applications; CRC Press: Boca Raton, FL, USA, 2021. [Google Scholar]
  38. Pazzani, M.J.; Billsus, D. Content-based Recommender systems. In The Adaptive Web: Methods and Strategies of Web Personalization; Springer: Berlin/Heidelberg, Germany, 2007; pp. 325–341. [Google Scholar]
  39. Collaborative Filtering in Recommender Systems: Learn All You Need To Know. Recommender Systems. 2021. Available online: https://www.iteratorshq.com/blog/collaborative-filtering-in-recommender-systems/ (accessed on 15 April 2025).
  40. Hahm, E. DATA612 RD 1. RPubs. 2020. Available online: https://rpubs.com/ehahm/627319 (accessed on 15 April 2025).
  41. Kniazieva, Y. What Is a Movie Recommender system in ML? Label Your Data 2022, 2022, 2. [Google Scholar]
  42. Burke, R. Hybrid recommender systems: Survey and experiments. User Model. User-Adapt. Interact. 2002, 12, 331–370. [Google Scholar] [CrossRef]
  43. Chiny, M.; Chihab, M.; Bencharef, O.; Chihab, Y. LSTM, VADER and TF-IDF based hybrid sentiment analysis model. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 265–275. [Google Scholar] [CrossRef]
  44. Matrix Factorization-Based Algorithms. Surprise. 2015. Available online: https://surprise.readthedocs.io/en/stable/matrix_factorization.html (accessed on 15 April 2025).
  45. Exploratory Data Analysis—Movie Lens Dataset. Jovian. 2020. Available online: https://jovian.com/surendranjagadeesh/MovieLens-eda (accessed on 15 April 2025).
  46. Harper, F.M.; Konstan, J.A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. (TIIS) 2016, 5, 19. [Google Scholar] [CrossRef]
  47. MovieLens. 2019. Available online: https://movielens.org/ (accessed on 15 April 2025).
  48. Mokeddem, A.; Benharzallah, S.; Arar, C.; Dilekh, T.; Kahloul, L.; Moumen, H. Benchmarking Service Recommendations Made Easy with SerRec-Validator. In Proceedings of the 2024 1st International Conference on Innovative and Intelligent Information Technologies (IC3IT), Batna, Algeria, 3–5 December 2024; pp. 1–7. [Google Scholar]
  49. Saini, K.; Singh, A. A content-based recommender system using stacked LSTM and an attention-based autoencoder. Meas. Sens. 2024, 31, 100975. [Google Scholar] [CrossRef]
  50. Tran, D.T.; Huh, J.H. New machine learning model based on the time factor for e-commerce recommendation systems. J. Supercomput. 2023, 79, 6756–6801. [Google Scholar] [CrossRef]
  51. Basic Algorithms. Surprise. 2015. Available online: https://surprise.readthedocs.io/en/stable/basic_algorithms.html (accessed on 15 April 2025).
  52. Koren, Y. Factor in the neighbors: Scalable and accurate collaborative filtering. ACM Trans. Knowl. Discov. Data (TKDD) 2010, 4, 1–24. [Google Scholar] [CrossRef]
  53. Co-Clustering. Surprise. 2015. Available online: https://surprise.readthedocs.io/en/stable/co_clustering.html (accessed on 15 April 2025).
  54. George, T.; Srujana, M. A scalable collaborative filtering framework based on co-clustering. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM’05), Houston, TX, USA, 27–30 November 2005; p. 4. [Google Scholar]
  55. K-NN Inspired Algorithms. Surprise. 2015. Available online: https://surprise.readthedocs.io/en/stable/knn_inspired.html (accessed on 15 April 2025).
  56. Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. Incremental singular value decomposition algorithms for highly scalable recommender systems. In Proceedings of the Fifth International Conference on Computer and Information Science, Seoul, Republic of Korea, 28–29 November 2002; Volume 1, pp. 27–28. [Google Scholar]
  57. Simon Funk’s Netflix SVD Method in Tensorflow, Part 1. Temuge’s Blog. 2021. Available online: https://temugebatpurev.wordpress.com/2021/02/04/simon-funks-netflix-svd-method-in-tensorflow-part-1/ (accessed on 15 April 2025).
  58. Slope One. Surprise. 2015. Available online: https://surprise.readthedocs.io/en/stable/slope_one.html (accessed on 15 April 2025).
  59. Luo, X. An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems. IEEE Trans. Ind. Inform. 2014, 10, 1273–1284. [Google Scholar]
  60. Lemire, D.; Maclachlan, A. Slope one predictors for online rating-based collaborative filtering. In Proceedings of the 2005 SIAM International Conference on Data Mining, Newport Beach, CA, USA, 21–23 April 2005; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2005; pp. 471–475. [Google Scholar]
  61. Walek, B.; Fojtik, V. A hybrid recommender system for recommending relevant movies using an expert system. Expert Syst. Appl. 2020, 158, 113452. [Google Scholar] [CrossRef]
  62. Walek, B.; Fajmon, P. A hybrid recommender system for an online store using a fuzzy expert system. Expert Syst. Appl. 2023, 212, 118565. [Google Scholar] [CrossRef]
Figure 1. Content-based filtering approach [39].
Figure 2. Content-based filtering architecture [27].
Figure 3. Collaborative filtering approach [41].
Figure 4. Differences between the collaborative filtering and content-based filtering approaches [25].
Figure 5. Most popular genres in the MovieLens database.
Figure 6. Experimental results—metric F1.
Figure 7. Experimental results—metric MAE.
Figure 8. Experimental results—metric Precision.
Figure 9. Experimental results—metric Recall.
Figure 10. Experimental results—metric RMSE.
Figure 11. Graph of algorithm evaluation speed.
Table 1. Quantitative comparison of algorithm performance in the first experimental trial.

Algorithm       RMSE     MAE      Precision  Recall   F1 Score  Time
SVD             0.8633   0.6680   0.8275     0.3940   0.4832    0.7019
CoClustering    0.9501   0.7346   0.7876     0.3488   0.4346    1.8854
NMF             0.9350   0.7150   0.7824     0.3741   0.4077    1.4731
SlopeOne        0.9070   0.6925   0.7823     0.3897   0.4222    6.3382
KNNBaseline     0.8830   0.6744   0.7936     0.3206   0.4254    1.1207
BaselineOnly    0.8795   0.6783   0.8073     0.3379   0.4385    0.2948
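The Precision and Recall columns above are typically computed per user over the top-k recommended items and then averaged. As an illustrative sketch only (the rating threshold of 3.5, the cutoff k, and all function names are our assumptions, not details taken from the paper's code), Precision@k and Recall@k can be derived from (user, predicted rating, actual rating) triples like this:

```python
from collections import defaultdict

def precision_recall_at_k(predictions, k=10, threshold=3.5):
    """Per-user Precision@k and Recall@k.

    `predictions` is a list of (user_id, predicted_rating, true_rating)
    tuples; an item counts as "relevant" when its true rating meets
    `threshold`, and as "recommended" when its predicted rating does.
    """
    user_est_true = defaultdict(list)
    for uid, est, true_r in predictions:
        user_est_true[uid].append((est, true_r))

    precisions, recalls = {}, {}
    for uid, ratings in user_est_true.items():
        # Rank this user's items by predicted rating, best first.
        ratings.sort(key=lambda x: x[0], reverse=True)
        n_rel = sum(true_r >= threshold for _, true_r in ratings)
        n_rec_k = sum(est >= threshold for est, _ in ratings[:k])
        n_rel_and_rec_k = sum(
            est >= threshold and true_r >= threshold
            for est, true_r in ratings[:k]
        )
        precisions[uid] = n_rel_and_rec_k / n_rec_k if n_rec_k else 0.0
        recalls[uid] = n_rel_and_rec_k / n_rel if n_rel else 0.0
    return precisions, recalls

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

Averaging `precisions.values()` and `recalls.values()` over all test users, then combining the means with `f1`, yields per-trial scores of the kind reported in the tables.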
Table 2. Quantitative comparison of algorithm performance in the second experimental trial.

Algorithm       RMSE     MAE      Precision  Recall   F1 Score  Time
SVD             0.8625   0.6711   0.8149     0.4469   0.4967    0.5524
CoClustering    0.9491   0.7333   0.7642     0.3152   0.3889    1.8820
NMF             0.9243   0.7076   0.7744     0.3305   0.4033    1.6180
SlopeOne        0.9023   0.6886   0.7899     0.3852   0.4268    6.4808
KNNBaseline     0.8747   0.6694   0.7658     0.3473   0.4300    1.1989
BaselineOnly    0.8700   0.6714   0.7947     0.3469   0.4852    0.2271
Table 3. Quantitative comparison of algorithm performance in the third experimental trial.

Algorithm       RMSE     MAE      Precision  Recall   F1 Score  Time
SVD             0.8745   0.6727   0.8077     0.2839   0.3497    0.8724
CoClustering    0.9461   0.7323   0.7762     0.2132   0.2999    2.2950
NMF             0.9223   0.7066   0.7644     0.2275   0.3233    1.8180
SlopeOne        0.9023   0.6896   0.7539     0.2152   0.2938    3.2808
KNNBaseline     0.8757   0.6694   0.7758     0.2393   0.3080    1.3589
BaselineOnly    0.8720   0.6734   0.7839     0.2603   0.3062    0.1871
Table 4. Averages of key metrics for different algorithms (all values are means over the three experimental trials).

Algorithm       RMSE     MAE      Precision  Recall   F1 Score  Time
SVD             0.8656   0.6720   0.8128     0.3686   0.4014    0.7040
CoClustering    0.9316   0.7330   0.7717     0.3313   0.3787    1.9205
NMF             0.9814   0.7554   0.7557     0.3343   0.3685    1.5209
SlopeOne        0.9119   0.7026   0.7728     0.3516   0.3795    6.0221
KNNBaseline     0.8966   0.7072   0.7891     0.3153   0.3511    1.1523
BaselineOnly    0.8767   0.6760   0.7929     0.35312  0.3987    0.2781
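RMSE and MAE in the tables follow their standard definitions over (predicted, actual) rating pairs, and Table 4 averages each metric over the three trials. A minimal stdlib sketch of these computations (function names are ours, for illustration only, not taken from the paper's code):

```python
import math

def rmse(pairs):
    """Root-mean-square error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((est - true) ** 2 for est, true in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (predicted, actual) rating pairs."""
    return sum(abs(est - true) for est, true in pairs) / len(pairs)

def trial_mean(values):
    """Average one metric across experimental trials, as in Table 4."""
    return sum(values) / len(values)
```

Because RMSE squares each error before averaging, it penalizes large rating misses more heavily than MAE does, which is why the two columns rank the algorithms slightly differently.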
Citation: Walek, B.; Sládek, O. Comparison of Selected Algorithms in Movie Recommender System. Appl. Sci. 2025, 15, 9518. https://doi.org/10.3390/app15179518