Modeling Recommender Systems Using Disease Spread Techniques

He, Peixiong; Sun, Libo; Gao, Xian; Zhou, Yi; Qin, Xiao

doi:10.3390/info16080687


by Peixiong He ¹, Libo Sun ¹, Xian Gao ², Yi Zhou ²,* and Xiao Qin ¹,*

¹ Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
² TSYS School of Computer Science, Columbus State University, Columbus, GA 31907, USA
* Authors to whom correspondence should be addressed.

Information 2025, 16(8), 687; https://doi.org/10.3390/info16080687

Submission received: 8 July 2025 / Revised: 2 August 2025 / Accepted: 8 August 2025 / Published: 13 August 2025

(This article belongs to the Special Issue 2nd Edition of Modern Recommender Systems: Approaches, Challenges and Applications)


Abstract

Recommender systems on digital platforms profoundly influence user behavior through content dissemination, and their diffusion process resembles, to some extent, the spreading mechanism of infectious diseases. In this paper, we use a network-based susceptible-infected (SI) model to describe the propagation dynamics of recommended content, and systematically compare the propagation efficiency of three recommendation strategies: popularity-based, collaborative filtering, and content-based. We constructed scale-free user networks based on real-world clickstream data and dynamically adapted the SI model to reflect the realistic scenario of user engagement decaying over time. To enhance the understanding of the recommendation process, we further visualize the simulated propagation process to show how content spreads among users. The experimental results show that collaborative filtering performs best in the initial dissemination, but its dissemination effect decays rapidly over time and becomes weaker than that of the other two methods. This study provides new ideas for modeling and understanding recommender systems from an epidemiological perspective.

Keywords: recommender systems; network simulation; epidemiological modeling

1. Introduction

The explosion of online information and digital content today has greatly exacerbated the information overload problem faced by users. This trend highlights the importance of recommender systems in filtering and pushing relevant content, which becomes a key tool for guiding users’ attention, increasing interaction frequency, and prolonging platform dwell time. Nowadays, recommender systems optimize the user experience through personalized content recommendations [1]. However, despite the remarkable achievements in the industry, we still lack a deep and systematic theoretical understanding of how recommended content diffuses in the user community and how users interact with each other to facilitate clicking or consuming behaviors. Previous studies have focused on improving recommendation accuracy [2], optimizing sorting strategies [3], or introducing contextual information to enhance user satisfaction, but these approaches are often based on static evaluation metrics, ignoring the temporal characteristics of recommended content spreading in dynamic user networks and the changing process of user engagement. In fact, a recommendation algorithm is not only a content matching process [4], but also a complex information diffusion process with obvious propagation and evolution [5]. Therefore, it is of great theoretical and practical significance to understand how recommendation strategies diffuse content across user networks from a macro perspective, as this sheds light on the collective dynamics of user engagement, platform influence, and viral propagation.

The motivation for this study is to bridge the gap between recommender systems and information diffusion modeling by proposing an interdisciplinary analytical framework. We consider the recommendation process as analogous to the spread of an infectious disease in a social network [6]. This perspective not only captures the dynamic process of user reception and dissemination of content, but also quantifies the performance of different recommendation algorithms in terms of propagation speed, reach, and user coverage. Specifically, we integrate three classical recommendation strategies, namely popularity-based [7], collaborative filtering [8], and content-based recommendation [9], with an epidemic network model [10]. With the help of real user interaction data, we construct a scale-free network reflecting the real social structure [11], and introduce a time-varying propagation rate function to simulate the decay of user engagement over time, which brings the model closer to user behavioral patterns on online platforms. This design more realistically reflects how the recommendation effect evolves over a long running time, and also provides an interpretable theoretical basis for the study of user churn and cold start problems [12]. Based on this methodological framework, we not only systematically compare the differences in content dissemination efficiency between different recommendation strategies, but also explore the interaction mechanism between algorithm design and user network structure. The experimental results reveal the rapid diffusion ability of collaborative filtering in the initial propagation, but also point out its weak propagation sustainability; in contrast, the popularity-based approach maintains high propagation stability over a long period of time despite its limited initial influence. These findings provide empirical references for the dynamic scheduling of recommendation algorithms and the design of hybrid strategies.

The main contributions of this paper are as follows. First, we introduce the SI model from epidemiology into the study of recommender systems and construct an analytical framework that dynamically simulates the process of content diffusion, bringing the analysis closer to real-life changes in user behavior. Second, in a scale-free network environment constructed from real clickstream data, we systematically compare the propagation efficiency of three classical recommendation strategies: popularity-based, collaborative filtering, and content-based approaches. Third, we introduce a time-varying propagation rate to characterize the decay of user engagement over time, providing a more nuanced perspective for understanding the evolution of recommendation effectiveness. Finally, this interdisciplinary study not only enriches the theoretical connection between recommender systems and information diffusion mechanisms, but also provides practically valuable insights for optimizing recommendation algorithms to maintain user engagement.

The remainder of this paper is organized as follows. Section 2 briefly reviews the related literature. In Section 3, we clarify our research motivation and present research questions. Section 4 defines the problem of modeling recommendation diffusion and outlines key assumptions. Section 5 introduces our proposed approach, which integrates classical recommendation algorithms with a network-based SI model incorporating time-varying infection rates. In Section 6, we verify the model’s effectiveness and feasibility by testing its performance under various scenarios. Section 7 discusses the limitations of the current model and suggests that more sophisticated propagation mechanisms could be introduced in the future to enhance the realistic applicability of the model. Finally, Section 8 summarizes the key contributions of the article and outlines potential avenues for future research.

2. Related Work

We first review the fundamentals and development directions of mainstream recommendation algorithms, then turn to epidemic models and their use in information diffusion modeling, and close with information diffusion models that capture network structure and user interaction dynamics in content dissemination.

2.1. Recommendation Algorithms

The last twenty years have seen a significant rise in the development of recommender systems, which primarily aim to deliver personalized content suggestions that improve the user experience. In general, recommendation methods can be categorized into three main types: collaborative filtering, content-based recommendation, and hybrid recommendation [13]. Collaborative filtering methods, such as matrix factorization, use historical user–item interactions to predict future preferences from past similarities between users [14]. Content-based recommendation systems use characteristics of items and user profiles to suggest items that resemble those the user previously liked. Hybrid methods combine the two techniques [15] to enhance precision while mitigating typical problems such as cold start and incomplete data. Although these algorithms are successful, they are usually evaluated only with static performance measures such as precision, recall, and click-through rate, which fail to consider the constantly shifting patterns of user behavior and engagement.

2.2. Epidemic Models

Epidemiological models establish a systematic method to simulate disease transmission dynamics across a population throughout time [16]. The primary models used in epidemiology are the SI (Susceptible-Infected) [17], SIR (Susceptible-Infected-Recovered) [18], and SEIR (Susceptible-Exposed-Infected-Recovered) frameworks [19]. The SI model defines a process where susceptible individuals transition into an infected state and stay infected forever. The model excels at describing irreversible cumulative processes like user engagement in recommender systems because user interactions with content create permanent behavioral effects. The SIR model includes the Recovery (R) state, which enables individuals to leave the infected state, thus modeling decaying or saturating interest. The SI model’s straightforward nature and alignment with continuous digital behavior patterns make it especially appropriate for our study despite the more detailed dynamics in SIR and SEIR models. This approach facilitates straightforward representation of recommendation distribution across user networks without presuming disengagement or behavioral decline, which might not fit persistent recommendation scenarios.

2.3. Information Diffusion Models in Social Networks

The study of information dissemination among social network users remains a fundamental area of exploration across computational social science and recommender systems research [20]. Researchers have developed multiple information diffusion models to explain how users adopt behaviors or interact after being influenced by their peers. The Independent Cascade model [21] and the Linear Threshold model [22] stand out as the two primary examples in information diffusion modeling [23]. In the Independent Cascade model, a newly active node gets a single attempt to activate each of its neighbors, and each attempt succeeds independently with a constant probability, while in the Linear Threshold model a node activates when the aggregated influence from its neighbors surpasses its threshold value. These two models establish the theoretical groundwork and simulation structure for examining content dissemination, user influence ranking, and opinion evolution. As research progressed, numerous modifications were introduced to the foundational models, including node characteristics [24], time delay elements, diverse content types, and mechanisms that track the development of user interest [25], which enhanced model adaptability and expressiveness for complex network environments [26]. Research efforts have also integrated Graph Neural Networks from deep learning with diffusion techniques to learn information propagation paths and forecast user responses [27]. Despite these algorithmic advances and theoretical modeling improvements, most diffusion models remain concentrated on forecasting propagation range or pinpointing key nodes rather than examining the sequence of user behaviors activated by recommender system signals. This study extends this line of work by integrating diffusion mechanisms with recommendation strategies to examine the influence of various recommendation methods on content reach and user engagement online.
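To make the two activation rules concrete, the following is a minimal sketch rather than a reference implementation. The function names, the uniform neighbor weights in the threshold rule, and the toy path graph are illustrative assumptions, and the threshold test uses "greater than or equal", which is one common convention.

```python
import random


def independent_cascade(adj, seeds, p, rng):
    """Each newly activated node gets one attempt per inactive neighbor,
    and every attempt succeeds independently with probability p."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def linear_threshold(adj, seeds, thresholds):
    """A node activates once the share of its active neighbors reaches its
    threshold (assumption: every neighbor carries weight 1/degree)."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in adj.items():
            if v in active or not nbrs:
                continue
            influence = sum(1 for u in nbrs if u in active) / len(nbrs)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active


# Hypothetical toy path graph 0 - 1 - 2.
path = {0: [1], 1: [0, 2], 2: [1]}

if __name__ == "__main__":
    print(independent_cascade(path, [0], 0.5, random.Random(42)))
    print(linear_threshold(path, [0], {0: 0.5, 1: 0.5, 2: 0.5}))
```

In both rules activation is permanent, which is the same irreversibility assumption that the SI model makes.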

3. Research Statement

Recommendation algorithms determine content propagation in digital platforms and profoundly affect user behavior. Although most of the existing studies focus on algorithm performance and efficiency, there is still a lack of systematic modeling and empirical analysis on how content dynamically spreads among users and the influence of network structure.

To fill this gap, this study introduces an epidemiological modeling perspective and attempts to simulate the propagation process of recommendation signals through the Susceptible-Infected (SI) model. Figure 1 is a simple demonstration of SI model diffusion. This model provides us with a tool to portray the time-evolving behavior of recommendation content as an “infected” propagation unit, which gradually spreads in the user network. This approach not only breaks through the traditional evaluation of recommendation effects based on static indicators (e.g., accuracy and click-through rate), but also gives us a systemic perspective to re-understand the behavioral characteristics of recommender systems in terms of propagation breadth, speed, and sustainability.

In this work, we have two goals. First, we analyze whether the Susceptible-Infected (SI) model can realistically capture recommendation behavior, that is, whether SI diffusion can mimic the propagation induced by recommendation algorithms in terms of propagation speed, final coverage, and the time decay of user activity. Second, we build a realistic simulator based on a real dataset from a music streaming platform. With the simulator, we can sample scale-free networks, assign attributes such as language and city to users, and then compare different recommendation strategies to model diffusion effects under different settings in a more realistic, data-driven environment. In summary, we believe these two goals provide a general basis for interdisciplinary efforts both to advance the theory of recommender systems and to develop more engagement-adaptive recommendation algorithms.

To systematically study the dynamic dissemination of recommended content, we propose a series of research questions as a guide, focusing on evaluating the applicability of epidemiological models in recommender systems and designing close-to-real network structures to simulate the process of user interaction and information dissemination.

Research Questions

Can the SI epidemiological model effectively represent the dynamics of digital recommendation algorithms?
How can user interactions be used to construct networks that realistically represent the structure of digital recommendation systems?

To answer the above questions, we first construct a scale-free user network based on real clickstream data to reflect the typical heterogeneous connection characteristics in digital platforms. Then, we design a simulation environment to embed different types of recommendation strategies into the SI model and observe their propagation trajectories evolving over time. Finally, we introduce a dynamic propagation rate mechanism to simulate the real-world scenario of declining user engagement, so as to enhance the ecological fidelity and predictive capability of the model.

4. Problem Definition

Recommendation systems face a crucial challenge when it comes to predicting user preferences accurately and spreading those preferences effectively throughout networks. Traditional recommendation systems generally produce fixed recommendations which do not account for dynamism in user preferences. The dynamic nature of user behaviors allows preferences and adoption decisions to spread through networks in the same way infections move in disease models. Recommendation systems that overlook the process of dynamic propagation through networks will miss opportunities to leverage network effects, which restricts their real-world performance.

Formally, we define our research problem as follows. Given a set of users $U = \{u_{1}, u_{2}, \dots, u_{n}\}$ with associated preference labels $L = \{l_{1}, l_{2}, \dots, l_{n}\}$ and group attributes (such as language, city, and genre), our goal is to:

Develop a robust recommendation algorithm f integrating collaborative filtering with dynamic propagation mechanisms, defined as:

$\hat{L} = f (U, A, S)$

(1)

where $\hat{L}$ denotes predicted preferences, A is the set of group attributes, and S represents the user similarity matrix derived from collaborative filtering.
Model the recommendation propagation using an SI (Susceptible-Infected) epidemic framework. In this model, each user $u_{i} \in U$ transitions from susceptible (S) to infected (I) state based on the probability:

$P (u_{i} \text{ infected at } t + 1) = 1 - \prod_{u_{j} \in I_{t}} (1 - β_{i j} (t))$

(2)

where $β_{i j} (t)$ is the dynamically decreasing transmission rate between infected user $u_{j}$ and susceptible user $u_{i}$ at time t.
Investigate the impact of initial recommendation strength (collaborative filtering scores $\mathrm{CF} (u_{i})$) on the propagation efficiency over time, given by:

$β_{i j} (0) = α \times \mathrm{CF} (u_{i})$

(3)

where $α$ is a scaling factor controlling initial transmission strength.
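Equations (2) and (3) can be combined into a short numerical sketch. The function names are hypothetical, the exponential form of the decay is borrowed from Equation (5), and clipping the initial rate to [0, 1] is an added assumption so that it can be read as a probability.

```python
import math


def transmission_rate(cf_score, alpha, lam, t):
    """beta_ij(t): Equation (3) for the initial value, then exponential decay.

    Assumptions: the decay follows Equation (5), and the initial rate is
    clipped to [0, 1] so it remains a valid probability.
    """
    beta0 = min(1.0, max(0.0, alpha * cf_score))
    return beta0 * math.exp(-lam * t)


def infection_probability(betas):
    """Equation (2): one minus the probability of escaping every infected user.

    betas holds beta_ij(t) for each infected user u_j that can reach u_i.
    """
    escape = 1.0
    for beta in betas:
        escape *= 1.0 - beta
    return 1.0 - escape


if __name__ == "__main__":
    beta = transmission_rate(cf_score=0.8, alpha=0.5, lam=0.1, t=0)
    print(infection_probability([beta] * 3))
```

Because Equation (3) depends only on the susceptible user $u_{i}$, every factor in the product is identical at a given time, so with k infected contacts the probability reduces to $1 - (1 - β)^{k}$. In the usage example above, an initial rate of 0.4 and three infected contacts give roughly 0.784.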

Specifically, we aim to answer the following key questions:

How can collaborative filtering recommendations be integrated with an SI epidemic propagation model?
What impact does a transmission rate that decreases over time have on how recommendations are adopted by users during different periods?
How do the network structure and the attributes of user groups influence the success of recommendation propagation?

The above questions have been addressed from different perspectives in prior work, and here we seek to contribute further. To this end, we propose a combined approach of collaborative filtering and an epidemic model, specifically an SI model, to describe the diffusion of recommended content in the user network from a macroscopic perspective. This approach accounts for the microscopic matching of user–item preferences while also modeling the macroscopic propagation process. Varying the model parameters, the network topology, and the characteristics of the node groups lets us understand their impact on the diffusion process, thus supporting the development of more flexible and network-aware recommendation mechanisms.

5. Approach

Our approach bridges recommendation system algorithms with epidemic diffusion models to simulate and evaluate content propagation within a user network, and Figure 2 is the overview of the whole simulation. The methodology consists of three main components: network construction, recommendation strategy implementation, and dynamic SI-based simulation.

5.1. Network Construction

We use the Barabási–Albert (BA) model to generate a scale-free user network to mimic social systems at the structural level [28]. The model takes as input the number of users, $N = 27{,}244$, with each user represented by a node in the network. As each node is added to the network, it forms links with $m = 3$ existing nodes, which are selected with probability proportional to their degree. In other words, we simulate user-level propagation of content on a network in which newly added users tend to link to highly connected existing users. As with other random network models, we then map real user IDs onto the nodes of the generated network. The generated network has several characteristics of social systems, such as hubs and a power-law degree distribution. We verify this using several graph metrics. The mean degree of the network is ≈6.00, which shows that the network is sparse. The network density is ≈0.00022, which shows that only a tiny fraction of all possible user pairs are linked, and the clustering coefficient is ≈0.0024, which shows that friends of friends are unlikely to be friends.
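The construction and the three reported metrics can be reproduced approximately with the standard-library sketch below. It is an illustration under stated assumptions, not the authors' code: the seeding scheme (the first new node links to m isolated seed nodes) and all function names are choices made here, and the exact clustering value depends on the random seed.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an undirected graph by preferential attachment.

    Returns {node: set(neighbors)}. Assumption: the first new node links to
    m isolated seed nodes; later nodes pick m distinct targets with
    probability proportional to current degree.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets = list(range(m))
    repeated = []  # every node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        repeated.extend(targets)
        repeated.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # degree-proportional sampling, no duplicates
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return adj


def mean_degree_and_density(adj):
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * edges / n, 2 * edges / (n * (n - 1))


def average_clustering(adj):
    """Mean over all nodes of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(len(adj[u] & nbrs) for u in nbrs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    g = barabasi_albert(27244, 3, seed=0)
    print(mean_degree_and_density(g), average_clustering(g))
```

With this seeding the graph has exactly m(N - m) edges, so for N = 27,244 and m = 3 the mean degree is 2 · 3 · 27,241 / 27,244 ≈ 6.00 and the density is about 6 / 27,243 ≈ 0.00022, in line with the values reported above.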

5.2. Recommendation Algorithms

This research compares three traditional recommendation algorithms: popularity-based recommendation, collaborative filtering, and content-based recommendation. Under the popularity-based method, users receive recommendations for the most popular items according to historical click data, which captures the effect of general interest. Collaborative filtering methods operate on user–item interaction matrices: they evaluate similarities to pinpoint relevant items for users based on their past behaviors and recommend content that other similar users have liked, thereby delivering personalized recommendations [29]. Content-based recommendation uses metadata attributes of items, such as genre or creator, to generate suggestions. These methods produce personalized results by matching feature vectors that represent item characteristics against users’ past preferences and interests [30]. During the evaluation phase, we fix each algorithm’s recommendation approach at the start of the simulation, identify the initial target users, and then simulate how the recommended content spreads through the user network to examine the diffusion path, the coverage among users, and the dynamic performance changes over time.
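The three strategies can be sketched on binary click data as follows. This is a simplified illustration: the cosine similarity over item click sets, the genre-count scoring, the tie-breaking by item name, and the toy data are all assumptions introduced here rather than details given in the paper.

```python
from collections import Counter
from math import sqrt


def _top_k(scores, k):
    # Highest score first; ties broken by item name for determinism.
    return [i for i, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]


def popularity_top_k(clicks, k):
    """Rank items by how many users clicked them."""
    counts = Counter(item for items in clicks.values() for item in items)
    return _top_k(counts, k)


def item_cf_top_k(clicks, user, k):
    """Score unseen items by cosine similarity to the user's clicked items."""
    users_of = {}
    for u, items in clicks.items():
        for item in items:
            users_of.setdefault(item, set()).add(u)
    seen = clicks[user]
    scores = {}
    for cand, cand_users in users_of.items():
        if cand in seen:
            continue
        score = sum(
            len(cand_users & users_of[s]) / sqrt(len(cand_users) * len(users_of[s]))
            for s in seen
        )
        if score > 0:
            scores[cand] = score
    return _top_k(scores, k)


def content_top_k(clicks, item_genre, user, k):
    """Score unseen items by how often the user clicked the same genre."""
    seen = clicks[user]
    taste = Counter(item_genre[i] for i in seen)
    scores = {i: taste[g] for i, g in item_genre.items()
              if i not in seen and taste[g] > 0}
    return _top_k(scores, k)


# Hypothetical toy data: user -> clicked items, item -> genre.
clicks = {"u1": {"a", "b"}, "u2": {"a", "b", "c"}, "u3": {"a"}, "u4": {"c", "d"}}
item_genre = {"a": "pop", "b": "pop", "c": "rock", "d": "rock"}

if __name__ == "__main__":
    print(popularity_top_k(clicks, 2))
    print(item_cf_top_k(clicks, "u3", 1))
    print(content_top_k(clicks, item_genre, "u3", 1))
```

In a simulation, the users who receive a given item from one of these functions would form that item's initial exposure set.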

5.3. Dynamic SI Simulation

Our simulation of content propagation relies on an adjusted Susceptible-Infected (SI) model. At the beginning of the simulation, a selected group of users becomes “infected” after encountering recommended content. During each time step, infected users attempt to infect their neighbors with a probability determined by the initial transmission rate and a time-dependent decay factor representing declining engagement. Users who do not interact with content regardless of exposure are not considered in the propagation model. The quantities $S (t)$ and $I (t)$ denote the number of susceptible and infected users at time t in this system. The overall population size remains unchanged, $N = S (t) + I (t)$. The classical SI model describes how the infection spreads through the population:

$\frac{d I (t)}{d t} = β (t) \cdot \frac{S (t) \cdot I (t)}{N}$

(4)

To simulate declining user interest over time, we define the transmission rate $β (t)$ as an exponentially decaying function:

$β (t) = β_{0} \cdot e^{- λ t}$

(5)

where $β_{0}$ is the initial transmission rate, and $λ$ is the decay rate parameter controlling the rate at which recommendation effectiveness decreases. This dynamic SI formulation enables us to account for temporal variations in user receptiveness to recommendations, making the diffusion model more realistic for online environments.
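As a worked consequence of Equations (4) and (5), the fully mixed model can be solved in closed form. The derivation below is a sketch for the mean-field case only, with $i(t) = I(t)/N$ introduced here as shorthand; it does not account for network structure.

```latex
% With i(t) = I(t)/N and S(t)/N = 1 - i(t), Equations (4) and (5) give
\frac{d i}{d t} = \beta_{0} e^{-\lambda t} \, i \, (1 - i)

% Since \frac{d}{dt} \ln\frac{i}{1 - i} = \frac{1}{i (1 - i)} \frac{d i}{d t},
% integrating from 0 to t yields
\ln\frac{i(t)}{1 - i(t)} = \ln\frac{i(0)}{1 - i(0)}
    + \frac{\beta_{0}}{\lambda} \left( 1 - e^{-\lambda t} \right)

% Long-run limit
\lim_{t \to \infty} \ln\frac{i(t)}{1 - i(t)}
    = \ln\frac{i(0)}{1 - i(0)} + \frac{\beta_{0}}{\lambda}
```

Because the right-hand side stays finite whenever $λ > 0$, the infected fraction converges to a value strictly below 1, unlike the classical SI model with constant $β$, in which everyone is eventually infected. A larger decay rate lowers the ratio $β_{0}/λ$ and therefore the final coverage.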

While Equation (4) is derived from the classical SI model assuming a fully mixed population, in our implementation, the infection process is constrained by the network structure. Specifically, an infected user can only transmit content to their immediate neighbors, reflecting localized person-to-person interaction. This adapts the global formulation to a network-based setting consistent with realistic user interactions.
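A discrete-time version of this network-constrained process can be sketched as follows. The function name, the synchronous update order, and the ring graph used in the usage example are assumptions made for illustration; the paper does not specify these details.

```python
import math
import random


def simulate_si(adj, seeds, beta0, lam, steps, seed=None):
    """Network SI simulation with beta(t) = beta0 * exp(-lam * t).

    adj maps each user to an iterable of neighbors. At every step each
    infected user tries once to infect each susceptible neighbor.
    Returns the infected fraction after each step (index 0 = initial state).
    """
    rng = random.Random(seed)
    infected = set(seeds)
    history = [len(infected) / len(adj)]
    for t in range(steps):
        beta_t = beta0 * math.exp(-lam * t)
        newly = set()
        for u in infected:
            for v in adj[u]:
                if v not in infected and rng.random() < beta_t:
                    newly.add(v)
        infected |= newly  # synchronous update: new infections act next step
        history.append(len(infected) / len(adj))
    return history


# Hypothetical toy network: a ring of 10 users.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}

if __name__ == "__main__":
    print(simulate_si(ring, [0], beta0=0.6, lam=0.1, steps=12, seed=1))
```

Infection is permanent, so the returned curve never decreases, and because $β (t)$ shrinks at every step the curve flattens even when susceptible neighbors remain.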

5.4. Item Representation in Epidemic Framework

Our approach for embedding recommendation items into epidemic modeling involves visualizing each item as a standalone infectious entity that spreads through the user network. The process of suggesting an item to a user functions similarly to exposing that user to a possible infection. A user becomes infected when they engage with (e.g., click) the recommended item.

As shown in Figure 3, we define $u_{i}$ as a user node and $v_{j}$ as an item within this formal framework. The exposure function $R (u_{i}, v_{j})$ specifies whether $v_{j}$ gets recommended to user $u_{i}$. The interaction probability of $u_{i}$ with $v_{j}$ after a recommendation is represented by $β_{i j}$, which serves as a parallel to the transmission rate in the SI model. The item $v_{j}$ acts as the propagation medium, while the diffusion of $v_{j}$ through the network is measured by monitoring the total number of users who become engaged over time.

In this formulation, multiple items can propagate simultaneously, each governed by its own temporal transmission dynamics. Moreover, to reflect the natural decay in user interest, we define the effective transmission rate of item $v_{j}$ at time t as:

$β_{i j} (t) = β_{i j} (0) \cdot e^{- λ_{j} t}$

(6)

where $λ_{j}$ is the decay constant specific to item $v_{j}$, modeling the decline in novelty or visibility over time.
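One way to read the decay constant in Equation (6) is through the half-life of an item's transmission rate. The short calculation below is a worked illustration; the symbol $t_{1/2}$ is introduced here, and the two numerical values of the decay constant are simply the settings used later in the experiments.

```latex
% Half-life: the time at which the rate has dropped to half its initial value
\beta_{i j}(t_{1/2}) = \tfrac{1}{2} \beta_{i j}(0)
\;\Longrightarrow\; e^{-\lambda_{j} t_{1/2}} = \tfrac{1}{2}
\;\Longrightarrow\; t_{1/2} = \frac{\ln 2}{\lambda_{j}}

% Numerical examples
\lambda_{j} = 0.1 : \quad t_{1/2} = \frac{0.693}{0.1} \approx 6.93 \text{ time steps}
\lambda_{j} = 0.2 : \quad t_{1/2} = \frac{0.693}{0.2} \approx 3.47 \text{ time steps}
```

Doubling the decay constant therefore halves the window in which an item can still spread effectively.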

6. Experiments

This paper designs and implements a series of simulation experiments to test the proposed recommendation propagation framework based on epidemic modeling in real-world environments. Real-world user click data serves as the foundation for our experiments while we systematically assess propagation efficiency and dynamic evolution of three classical recommendation algorithms using a scale-free user network that reflects real-world network structure characteristics. The next section details the dataset and network construction method while explaining the recommendation strategy settings and propagation simulation process used in experiments before presenting and analyzing key experimental results.

6.1. Dataset

The dataset used in this study is sourced from KKBox, a major Asia-based music streaming platform offering over 30 million tracks. Specifically, we utilize the publicly available KKBox dataset released for the Kaggle Music Recommendation Challenge. The task is to predict whether a user will repeatedly listen to a song within one month after the first recorded listening event. If repeated listening occurs within this window, the label is 1; otherwise, it is 0. The dataset includes user–song pairs with their first observable interaction, along with associated metadata.

6.2. Data-Driven Simulation Perspective

In this paper, based on the real-world click-through behavior data, we design and implement a simulation framework of recommendation propagation from the perspective of epidemiology to empirically measure the diffusion efficiency of various recommendation strategies in the user network. To begin with, we generated a scale-free network based on the Barabási–Albert (BA) model with a total of N nodes. The model incrementally adds new nodes, each connecting to existing nodes with a probability proportional to their degree, thereby reflecting the heterogeneity in connectivity and the emergence of highly connected “hub” nodes—features commonly observed in real-world social networks. Each node in the network corresponds to a user and is annotated with the anonymized user ID. Each user–content interaction (e.g., a song) is expressed as a binary click to reflect whether the content is effectively received or adopted. To simulate the propagation process of recommended content among users, we then choose three mainstream recommendation algorithms as the benchmark for the experiments, including popularity-based recommendation, collaborative filtering recommendation, and content-based recommendation. Each algorithm produces an initial recommendation list for each user, which corresponds to the “exposure set” in the simulation. Since each recommended item can be seen as an “infectious agent” in the epidemiological analogy, the users who are recommended the item form the initial set of “infected”, so the propagation process of the recommender system can naturally be mapped to the modeling logic of infectious disease propagation.

The recommendation propagation process is modeled using a dynamic Susceptible-Infected (SI) framework. In this model, each infected user, i.e., a user who has interacted with the recommended item, has a chance to transmit the item to their neighboring users at each discrete time step. To account for the natural decay of user interest over time, we define a time-dependent transmission probability function as Formula (5). The simulation iteratively updates the infection status of users at each time step until the propagation process either converges or reaches a predefined temporal limit.

To provide a detailed comparison on the propagation properties of the various recommendation strategies, we design experiments in several dimensions. We first look at the propagation speed (i.e., the number of infected users growing in unit time) and the terminal coverage (i.e., the overall number of infected users) of items in the network and measure the propagation efficiency of the three recommendation algorithms by the difference. The simulation network is constructed using real user IDs from the KKBox dataset, where each node corresponds to a user and carries metadata such as language and cities. This metadata is used to initialize node attributes, enabling the simulation to reflect real-world audience segments. In addition, we introduce multiple time dimension settings (e.g., time of day, time of week, time of year) for the multi-scenario testing in order to simulate the various information dissemination paths on real platforms, and explore the time sensitivity of content propagation. We also use the item category information (e.g., language) provided in the dataset to classify items, simulate the propagation trajectories of different categories of items and their dynamic evolutionary process, and test the adaptability of the algorithms in different contexts and preference structures.
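The two headline measures, propagation speed and terminal coverage, can be read off a simulated infection curve as in the sketch below. The function name, the dictionary keys, the extra time-to-target measure, and the example curve are illustrative assumptions and are not results from the paper.

```python
def propagation_metrics(history, target=0.5):
    """Summarize an infection curve.

    history[t] is the infected fraction after step t (index 0 = initial state).
    Speed is the per-step increase in infected fraction; terminal coverage is
    the final value. time_to_target is an extra, hypothetical measure.
    """
    speeds = [b - a for a, b in zip(history, history[1:])]
    peak_speed = max(speeds) if speeds else 0.0
    time_to_target = next(
        (t for t, frac in enumerate(history) if frac >= target), None
    )
    return {
        "terminal_coverage": history[-1],
        "peak_speed": peak_speed,
        "peak_step": speeds.index(peak_speed) + 1 if speeds else None,
        "time_to_target": time_to_target,
    }


if __name__ == "__main__":
    # Made-up curve for illustration only.
    print(propagation_metrics([0.0, 0.25, 0.5, 0.5]))
```

Comparing these summaries across strategies, attribute groups, and time scales gives a compact view of which curves rise fastest and which reach the widest audience.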

6.3. Daily Diffusion Simulation

The first experiment simulates the process of content propagation over a 5-day period, setting the propagation decay rate to $λ = 0.1$. As shown in Figure 4, most content types spread rapidly in the early stage of propagation (the first 5 time units), and then the curves flatten out, suggesting that user engagement experiences an early peak followed by rapid saturation. The percentage of infected users eventually stabilizes at around 0.58, meaning that about 58% of users clicked or interacted after the initial exposure. Among the different content types, the Language 59 and Language 3 categories spread slightly faster in the early stages, but also saturated earlier. This phenomenon suggests that under Language 59, the novelty of recommended content fades quickly in a short period of time, and the spreading range is limited by the rapid decay of user interest. This result is consistent with the shorter life cycle of user attention in real digital media.

6.4. Weekly Diffusion Simulation

The second experiment simulates the content propagation process over a 12-week period, again setting the decay rate to $λ = 0.1$, but with the time granularity adjusted to weekly. As shown in Figure 5, compared to the daily simulation, the overall speed of this propagation process is slower, the curve grows more smoothly, and the final percentage of infected users is slightly higher, stabilizing at about 0.585. This suggests that a larger temporal granularity can help to smooth out fluctuations in early user behavior, making the propagation process more persistent and more continuous. It is worth noting that the difference in propagation curves between different content types is more pronounced here; in particular, Language 45 and Language 17 show some divergence in growth rates. This phenomenon suggests that time granularity has a significant impact on the assessment of content dissemination efficiency, and a longer time window helps to reveal the subtle differences in dissemination performance across content types.

6.5. Yearly Diffusion Simulation

The third experiment extends the propagation time to 5 years and increases the propagation decay rate to λ = 0.2, reflecting a longer-term decline in user engagement. As shown in Figure 6, although the proportion of infected users continues to grow slowly throughout the process, the overall spreading range remains limited, eventually converging between 0.56 and 0.58. Compared to the previous two experiments, the spread is significantly slower and saturation takes longer. A higher decay coefficient rapidly weakens user influence over time, simulating content that gradually loses its appeal as its novelty fades in a long-term recommendation environment. In addition, the differences between the curves of different content types are significantly reduced, suggesting that variability in spreading ability is further compressed under strong attenuation. This experiment emphasizes the critical roles of time scale and decay mechanism in recommender system evaluation: without a content reactivation mechanism or continuous push, recommendation propagation will struggle to keep expanding its influence over the long term.
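One way to see why a decaying transmission rate caps the final coverage is a well-mixed (mean-field) approximation. The derivation below is a sketch under two assumptions that are not taken from the experiments: the network structure is ignored, and the rate decays exponentially, β(t) = β₀e^{−λt}. Here i(t) denotes the infected fraction and i₀ its initial value.

```latex
\begin{aligned}
\frac{di}{dt} &= \beta(t)\, i\,(1 - i), \qquad \beta(t) = \beta_0 e^{-\lambda t}, \\
\frac{d}{dt}\,\ln\frac{i}{1 - i} &= \beta(t)
\;\;\Longrightarrow\;\;
\ln\frac{i(t)}{1 - i(t)} = \ln\frac{i_0}{1 - i_0}
  + \frac{\beta_0}{\lambda}\left(1 - e^{-\lambda t}\right), \\
i_\infty = \lim_{t \to \infty} i(t) &=
\frac{1}{1 + \dfrac{1 - i_0}{i_0}\, e^{-\beta_0 / \lambda}} \;<\; 1 .
\end{aligned}
```

Under these assumptions the total transmission "budget" is finite, since the integral of β(t) over all time equals β₀/λ, so the infected fraction plateaus below full coverage. Doubling λ from 0.1 to 0.2 halves that budget and lowers the plateau, which is qualitatively consistent with the limited spreading range observed in the yearly simulation.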

6.6. Comparative Analysis of Propagation Performance Across Recommendation Algorithms

Figure 7 shows the propagation performance of three recommendation algorithms (content-based filtering, collaborative filtering, and a popularity-based algorithm) along three user-attribute dimensions (language, city, and content type). All three algorithms are run under the same simulation settings, and the nine subfigures correspond to the combinations of algorithm and user attribute.

In all of the subfigures, the content adoption rate increases rapidly in the first two time steps, after which growth slows and levels off, showing clearly diminishing marginal returns. This is typical of information diffusion in a scale-free network: propagation is fast at the beginning, but propagation capacity drops significantly as the network approaches saturation.

Among the three algorithms, collaborative filtering (subfigures d, e and f) performs best in both propagation speed and coverage. The algorithm exploits the similarity of interests between users: by taking advantage of the network proximity structure, it delivers content faster to the users who are most likely to be interested, and thus effectively triggers the network effect.

The overall spreading performance of the content-based recommendation algorithm (subfigures a, b and c) is in the middle. Its spreading rate is relatively slow, mainly because the algorithm depends on the feature information of items rather than on the behavioral relationships between users, and thus has limited propagation ability in the network structure. Its performance also differs slightly across user attribute dimensions; the spreading effect along the “content type” dimension is relatively stable.

The popularity-based recommendation model (subfigures g, h and i) introduces a time-decaying mechanism for the propagation probability and exhibits more diverse diffusion characteristics. Although the method shows a strong spreading ability in the initial stage, its curve stabilizes earlier and fluctuates more among different user groups. This phenomenon is particularly pronounced in language or genre subgroups with smaller user bases, suggesting that the decay mechanism has a greater impact on long-term diffusion in smaller populations.

It is interesting to note that the spread gap between different user groups (e.g., language, city, type) is most significant in the popularity-based model. This suggests that a single strategy relying on popularity may inadvertently exacerbate the imbalance in content exposure, allowing groups that are already dominant to receive more referral resources.
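The spread gap between user groups can be quantified in many ways. The sketch below shows one simple option, assuming each user carries a single group label (for example a language code) and that the set of infected users at the end of a simulation is known. The helper names `group_adoption` and `spread_gap` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_adoption(infected, group_of):
    """Fraction of infected users within each group.

    infected: set of user ids that adopted the content
    group_of: dict mapping user id -> group label
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for user, group in group_of.items():
        totals[group] += 1
        if user in infected:
            hits[group] += 1
    return {g: hits[g] / totals[g] for g in totals}


def spread_gap(rates):
    """Difference between the best- and worst-served groups."""
    return max(rates.values()) - min(rates.values())


# Toy example: four users in group "A", two users in group "B".
group_of = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
infected = {0, 1, 2, 4}
rates = group_adoption(infected, group_of)
print(rates, spread_gap(rates))
```

A larger gap means exposure is more unevenly distributed across groups. Comparing this value across the three algorithms would be one way to make the imbalance claim measurable.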

Combining collaborative filtering algorithms with epidemic propagation modeling methods provides new perspectives for understanding user behavior and network interaction mechanisms. On the one hand, the method highlights the important influence of initial recommendation strength on the overall propagation path; on the other hand, by introducing the time-decaying propagation rate, it more realistically simulates the behavioral characteristics of users’ decreasing interest over time, such as information fatigue or interest transfer.

In addition, this study also emphasizes the influence of network structure and user attribute differences on recommendation propagation efficiency. In a network structure with high connectivity and “hub” users, it is easier for recommendation content to achieve rapid diffusion, while in a network with sparse structure or more edge users, recommendation diffusion may be limited, and more targeted strategies need to be designed to improve the effect.

7. Discussion

Although the epidemiology-based method for modeling recommendation propagation dynamics in this paper provides some new ideas, it also has several shortcomings. Here we summarize the most evident limitations of this work, hoping to provide some direction for follow-up research.

First, this paper employs the classical SI (susceptible-infected) model, in which a user who has seen and clicked on the recommended content is considered “infected” and never recovers. In other words, infected individuals are assumed to keep their interest in the item indefinitely. Real user behavior is generally more complicated and time-dependent. Time can affect users in two ways during real propagation. In the short term, users may quickly forget the recommended content and lose interest in it. In addition, external interventions (such as information overload or content removal by the platform) may temporarily or permanently interrupt the propagation. An SI model cannot represent either process.

In order to depict the change of user behavior more realistically, future research could introduce more complex propagation models. For instance, in the SIR model, users transition to the “recovered” state after a certain period of “infection” [31], which suits situations where the content’s appeal is declining or users are gradually cooling down. In the SEIR model, users pass through an “exposed” state before becoming infected [32], which can model a latent period between receiving the recommendation and acting on it. These extended models are expected to enrich the description of how recommended content diffuses and how user responses evolve over time.
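To illustrate how the SIR extension differs from SI, the following is a minimal sketch of one synchronous SIR update on a network. It is not part of the paper's experiments. The function name `sir_step`, the per-step recovery probability `gamma`, and the toy path graph are assumptions made for illustration.

```python
import random


def sir_step(adj, state, beta, gamma, rng):
    """One synchronous SIR update.

    adj:   dict mapping node -> iterable of neighbor nodes
    state: dict mapping node -> "S", "I", or "R"
    beta:  per-contact infection probability for this step
    gamma: per-step probability that an infected user loses interest
    """
    new_state = dict(state)
    for u, s in state.items():
        if s != "I":
            continue
        # Infected users try to pass the content to susceptible neighbors.
        for v in adj[u]:
            if state[v] == "S" and rng.random() < beta:
                new_state[v] = "I"
        # Unlike SI, an infected user may recover and stop spreading.
        if rng.random() < gamma:
            new_state[u] = "R"
    return new_state


# Toy path graph 0 - 1 - 2 with user 0 initially infected.
adj = {0: [1], 1: [0, 2], 2: [1]}
state = {0: "I", 1: "S", 2: "S"}
rng = random.Random(0)
step1 = sir_step(adj, state, beta=1.0, gamma=1.0, rng=rng)
step2 = sir_step(adj, step1, beta=1.0, gamma=1.0, rng=rng)
print(step1, step2)
```

Setting `gamma` to 0 recovers the SI behavior used in this paper. An SEIR variant would add an intermediate "E" state between "S" and "I" with its own transition probability.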

In addition, the scale-free network used for the propagation simulation in this paper was generated by the Barabási–Albert model. It reflects the “central node” and “long-tailed distribution” characteristics of real social networks, but it remains an idealization. Real networks have many complex attributes, such as community structure, homogeneous connection [33], and asymmetric influence. Future research can therefore construct a more representative network structure from real social platform or content dissemination data, so as to improve the external validity and predictive power of the model.

From the perspective of propagation performance, collaborative filtering shows a more balanced and efficient propagation ability across different user groups, leveraging both user preference similarity and network structure. In comparison, popularity-based recommendation loses performance more rapidly over time after a strong initial propagation stage, and its performance also varies more across user groups. This gap indicates that a popularity-driven recommendation mechanism might further amplify the imbalance of content exposure, giving mainstream groups even more exposure and further marginalizing already marginalized groups.

These results offer guidance for developing practical recommender systems. As platforms pay more attention to user stickiness, fairness, and content distribution, improving recommendation efficiency while distributing content more systematically has become a pressing problem. Our experiments indicate that a recommendation algorithm not only shapes the individual recommendation experience but also affects the overall content diffusion pattern at the network level. Therefore, when designing more socially responsible recommendation strategies, platforms need to consider not only the dissemination mechanism itself but also the characteristics of the user structure. Methodologically, connecting collaborative filtering algorithms with epidemic propagation models offers a new approach for studying user behavior mechanisms and network interaction. On the one hand, the method emphasizes the impact of the initial recommendation strength on the entire propagation path; on the other hand, it introduces a time-varying propagation rate to reflect more realistically users’ gradually waning interest, such as fatigue or a change of interest.

8. Conclusions

By integrating recommender systems with epidemiological modeling, this study provides a new perspective for modeling and analyzing the spread of recommended content in user networks. We construct a scale-free network based on actual clickstream data, model the recommendation behavior as an “infection”-like spreading process, and use a dynamic SI model to evaluate the diffusion efficiency of three mainstream recommendation algorithms in different scenarios. The results show that different recommendation strategies have different spreading speed, scope and persistence, and the spreading process is also significantly affected by the time granularity modeling and the user interest decay.

Specifically, collaborative filtering algorithms show more balanced and efficient spreading ability among most user groups, while popularity-based methods, despite their higher spreading intensity in the early stage, have a relative disadvantage in long-term sustainability and fairness. Moreover, our simulations further reveal that the patterns of recommendation propagation also show significant differences in days, weeks and years, further highlighting the importance of using a time-sensitive evaluation mechanism as a complement to the traditional static performance metrics.

This work not only broadens the modeling of recommender systems, but also supplements the theoretical support for how algorithms shape user behavior in complex social networks. The introduction of the epidemiological communication logic into recommender system evaluation provides a new way for system designers to optimize recommendation strategies and user engagement management. The current model continues to operate under certain restrictive assumptions. For instance, the SI model does not consider the recovery mechanism of user interest and the impact of external events on recommendation propagation. Therefore, future work can further extend the current framework in the following aspects.

On the one hand, it can be beneficial to incorporate more complicated propagation mechanisms, such as SIR or SEIR, to better reproduce the attenuation and resurgence process of user interest in reality; on the other hand, we should also attach importance to the heterogeneity of users themselves, such as the differences of user behaviors, the fluctuations of user preferences and users’ sensitivity to the propagation of the recommended content to improve the personalization of the model and its conformity to reality. Additionally, in the future, we can also look at the mechanism of parallel propagation of multiple recommendation programs in the network [34], and see how they compete or complement each other along the propagation path.

Furthermore, integrating the platform’s real-time exposure data and contextual information will bring additional gains in interpretability and predictability. Finally, generalizing the current single static network structure to a dynamic network [35] or a multi-layer social network [36] will help to reveal more details about the effect of users’ cross-platform behavior on the propagation path [37]. In conclusion, this work contributes new theoretical tools and ideas to the modeling of propagation in recommender systems, with practical value for the design of future recommender systems that are more complex, dynamic, and network-aware.

Author Contributions

Conceptualization, P.H., X.Q.; methodology, P.H. and X.Q.; software, P.H.; validation, P.H.; formal analysis, P.H.; investigation, P.H., L.S., X.G., Y.Z. and X.Q.; resources, P.H., Y.Z. and X.Q.; data curation, P.H. and X.Q.; writing—original draft, P.H.; writing—review and editing, L.S., X.G., Y.Z. and X.Q.; visualization, P.H., L.S., X.G., Y.Z. and X.Q.; supervision, X.Q.; project administration, P.H., Y.Z. and X.Q. All authors have read and agreed to the published version of the manuscript.

Funding

Xiao Qin’s work is supported by the U.S. NSF (Grant DUE-2424934), NASA (Grant 80NSSC20M0044), NHTSA (Grant 451861-19158), Alabama Research and Development Enhancement Fund (Grant 1ARDEF25 02), and Wright Media, LLC (Grants 240250 and 240311).

Data Availability Statement

The original contributions presented in this study are included in the article material. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Liang, T.P.; Lai, H.J.; Ku, Y.C. Personalized content recommendation and user satisfaction: Theoretical synthesis and empirical findings. J. Manag. Inf. Syst. 2006, 23, 45–70.
2. Gedikli, F.; Jannach, D. Improving recommendation accuracy based on item-specific tag preferences. ACM Trans. Intell. Syst. Technol. 2013, 4, 1–19.
3. Vasto-Terrientes, L.D.; Valls, A.; Zielniewicz, P.; Borras, J. A hierarchical multi-criteria sorting approach for recommender systems. J. Intell. Inf. Syst. 2016, 46, 313–346.
4. Mao, M.; Lu, J.; Zhang, G.; Zhang, J. A fuzzy content matching-based e-commerce recommendation approach. In Proceedings of the 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Istanbul, Turkey, 2–5 August 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–8.
5. Wan, S.; Niu, Z. A hybrid e-learning recommendation approach based on learners’ influence propagation. IEEE Trans. Knowl. Data Eng. 2019, 32, 827–840.
6. Sooknanan, J.; Comissiong, D. Trending on social media: Integrating social media into infectious disease dynamics. Bull. Math. Biol. 2020, 82, 86.
7. Cañamares, R.; Castells, P. Should I follow the crowd? A probabilistic analysis of the effectiveness of popularity in recommender systems. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA, 8–12 July 2018; pp. 415–424.
8. Goldberg, D.; Nichols, D.; Oki, B.M.; Terry, D. Using collaborative filtering to weave an information tapestry. Commun. ACM 1992, 35, 61–70.
9. Di Noia, T.; Mirizzi, R.; Ostuni, V.C.; Romito, D.; Zanker, M. Linked open data to support content-based recommender systems. In Proceedings of the 8th International Conference on Semantic Systems, Graz, Austria, 5–7 September 2012; pp. 1–8.
10. Van Mieghem, P. The N-intertwined SIS epidemic network model. Computing 2011, 93, 147–169.
11. Barabási, A.L. Scale-free networks: A decade and beyond. Science 2009, 325, 412–413.
12. Wei, Y.; Wang, X.; Li, Q.; Nie, L.; Li, Y.; Li, X.; Chua, T.S. Contrastive learning for cold-start recommendation. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 20–24 October 2021; pp. 5382–5390.
13. Ko, H.; Lee, S.; Park, Y.; Choi, A. A survey of recommendation systems: Recommendation models, techniques, and application fields. Electronics 2022, 11, 141.
14. Mnih, A.; Salakhutdinov, R.R. Probabilistic matrix factorization. Adv. Neural Inf. Process. Syst. 2007, 20, 1257–1264.
15. Basilico, J.; Hofmann, T. Unifying collaborative and content-based filtering. In Proceedings of the Twenty-First International Conference on Machine Learning, Banff, AB, Canada, 4–8 July 2004; p. 9.
16. Keeling, M.J.; Eames, K.T. Networks and epidemic models. J. R. Soc. Interface 2005, 2, 295–307.
17. Jacquez, J.A.; Simon, C.P. The stochastic SI model with recruitment and deaths I. Comparison with the closed SIS model. Math. Biosci. 1993, 117, 77–125.
18. Volz, E.; Meyers, L.A. Susceptible–infected–recovered epidemics in dynamic contact networks. Proc. R. Soc. B Biol. Sci. 2007, 274, 2925–2934.
19. Viguerie, A.; Lorenzo, G.; Auricchio, F.; Baroli, D.; Hughes, T.J.; Patton, A.; Reali, A.; Yankeelov, T.E.; Veneziani, A. Simulating the spread of COVID-19 via a spatially-resolved susceptible–exposed–infected–recovered–deceased (SEIRD) model with heterogeneous diffusion. Appl. Math. Lett. 2021, 111, 106617.
20. Luarn, P.; Yang, J.C.; Chiu, Y.P. The network effect on information dissemination on social network sites. Comput. Hum. Behav. 2014, 37, 1–8.
21. Wang, C.; Chen, W.; Wang, Y. Scalable influence maximization for independent cascade model in large-scale social networks. Data Min. Knowl. Discov. 2012, 25, 545–576.
22. Chen, W.; Yuan, Y.; Zhang, L. Scalable influence maximization in social networks under the linear threshold model. In Proceedings of the 2010 IEEE International Conference on Data Mining, Sydney, NSW, Australia, 13–17 December 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 88–97.
23. Li, M.; Wang, X.; Gao, K.; Zhang, S. A survey on information diffusion in online social networks: Models and methods. Information 2017, 8, 118.
24. Park, J.; Barabási, A.L. Distribution of node characteristics in complex networks. Proc. Natl. Acad. Sci. USA 2007, 104, 17916–17920.
25. Bujlow, T.; Carela-Español, V.; Sole-Pareta, J.; Barlet-Ros, P. A survey on web tracking: Mechanisms, implications, and defenses. Proc. IEEE 2017, 105, 1476–1510.
26. Strogatz, S.H. Exploring complex networks. Nature 2001, 410, 268–276.
27. Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81.
28. Albert, R.; Barabási, A.L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47.
29. Koren, Y.; Rendle, S.; Bell, R. Advances in collaborative filtering. In Recommender Systems Handbook; Springer: Boston, MA, USA, 2021; pp. 91–142.
30. Pazzani, M.J.; Billsus, D. Content-based recommendation systems. In The Adaptive Web: Methods and Strategies of Web Personalization; Springer: Berlin/Heidelberg, Germany, 2007; pp. 325–341.
31. Cooper, I.; Mondal, A.; Antonopoulos, C.G. A SIR model assumption for the spread of COVID-19 in different communities. Chaos Solitons Fractals 2020, 139, 110057.
32. He, S.; Peng, Y.; Sun, K. SEIR modeling of the COVID-19 and its dynamics. Nonlinear Dyn. 2020, 101, 1667–1680.
33. Macpherson, D. A survey of homogeneous structures. Discret. Math. 2011, 311, 1599–1634.
34. Ye, Z.; Zhang, L.; Xiao, K.; Zhou, W.; Ge, Y.; Deng, Y. Multi-user mobile sequential recommendation: An efficient parallel computing paradigm. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 2624–2633.
35. Zhang, M.; Wu, S.; Yu, X.; Liu, Q.; Wang, L. Dynamic graph neural networks for sequential recommendation. IEEE Trans. Knowl. Data Eng. 2022, 35, 4741–4753.
36. Farseev, A.; Samborskii, I.; Filchenkov, A.; Chua, T.S. Cross-domain recommendation via clustering on multi-layer graphs. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Tokyo, Japan, 7–11 August 2017; pp. 195–204.
37. Yin, J.; Jia, H.; Zhou, B.; Tang, T.; Ying, L.; Ye, S.; Peng, T.Q.; Wu, Y. Blowing Seeds across Gardens: Visualizing Implicit Propagation of Cross-Platform Social Media Posts. IEEE Trans. Vis. Comput. Graph. 2024, 31, 185–195.

Figure 1. An overview of the simulation across four phases of SI model diffusion. Green nodes represent susceptible individuals, red nodes represent infected individuals.

Figure 2. An overview of the whole simulation.

Figure 3. Item representation.

Figure 4. Model with dynamically decreasing transmission rate (daily, λ = 0.1).

Figure 5. Model with dynamically decreasing transmission rate (weekly, λ = 0.1).

Figure 6. Model with dynamically decreasing transmission rate (yearly, λ = 0.2).

Figure 7. Diffusion patterns of recommender systems across language, city, and genre under three algorithms: Content-Based, Collaborative Filtering, and Popularity-Based. (a) Language—Content-Based, (b) City—Content-Based, (c) Genre—Content-Based, (d) Language—Collaborative Filtering, (e) City—Collaborative Filtering, (f) Genre—Collaborative Filtering, (g) Language—Popularity-Based, (h) City—Popularity-Based, (i) Genre—Popularity-Based.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
