1. Introduction
Epidemiological models are an indispensable tool for understanding the spread of viral infections in human populations. However, they have also repeatedly been adopted and studied in rather different disciplines. Because these disciplines are otherwise largely disconnected, scientists working in one of them run the risk of studying models that have already been studied in another context. Therefore, it is worthwhile to look at epidemiological models across disciplines, which is the objective of the current study.
The study of mathematical models describing the spreading dynamics of a virus in a human population has a long tradition [
1,
2]. As such, epidemiological models come in different types. Some frequently used epidemiological modeling approaches are listed in
Figure 1: network models, ODE models, time-discrete models, stochastic models, time-delayed models, and models with nonlinear transition mechanisms. These characterizations do not necessarily describe mutually exclusive approaches. Rather, the approaches shown in
Figure 1 are related to each other and may be combined. For example, ODE models can be derived from network models using mean field approximations [
3,
4,
5,
6]. ODE models are frequently simulated with the help of time-discrete models (e.g., [
7,
8,
9]). Adding a noise term to an ODE model yields a model in terms of a stochastic differential equation [
10,
11]. Network models are typically stochastic models [
3,
4,
5]. However, not every stochastic model is a network model (e.g., the aforementioned stochastic differential equation models are not network models). Time delays and nonlinear transition mechanisms can be incorporated into ODE models and stochastic models (e.g., [
12,
13,
14], and see
Section 3.9 and
Section 5). In doing so, time-delayed models, nonlinear transition mechanism models, and ODE models or stochastic models may be combined with each other.
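As a minimal sketch of how an ODE model gives rise to a time-discrete model, the following Python snippet applies a forward-Euler step to a standard SIR model without demographics. The assumed ODE form (stated in the docstring), the function name `euler_sir`, and all parameter values are illustrative assumptions and are not taken from the equations of this study.

```python
def euler_sir(beta, gamma, s0, i0, rec0, dt, steps):
    """Iterate the forward-Euler (time-discrete) map of an assumed SIR model.

    Assumed ODE form (not taken from the source equations):
        dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    """
    n = s0 + i0 + rec0                      # total population, conserved by the map
    s, i, r = float(s0), float(i0), float(rec0)
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical parameter values chosen only for illustration.
traj = euler_sir(beta=0.4, gamma=0.1, s0=990.0, i0=10.0, rec0=0.0, dt=0.1, steps=1000)
peak_i = max(state[1] for state in traj)
print(f"peak number of infected: {peak_i:.1f}")
```

The step size `dt` controls how closely the time-discrete map follows the underlying ODE; a smaller step improves the agreement at the cost of more iterations.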
Addressing all approaches is a task beyond the scope of the current study. The focus will be on ODE models, and within that research field, two specific ODE models will be addressed in detail (see below). All of the individual modeling approaches depicted in
Figure 1 are important in their own right. However, as indicated in
Figure 1 and anticipated in part above, ODE models have a central position within that collection of theoretical advances. Accordingly, ODE models can be regarded as mean field approximations (i.e., simplifications) of network models. ODE models give rise to time-discrete models when discretizing time. ODE models can be generalized to take on the form of stochastic models and time-delayed models, and they can account for nonlinear transmission mechanisms (i.e., be combined with nonlinear transition mechanism models).
While over time epidemiological models have been improved and have increased in complexity, every now and then, a novel application of epidemiological models in a discipline different from epidemiology has been discovered. Currently, applications range from describing rumor spreading to the description of drug addiction and the explanation of sales curves of new market products. Reviews have typically taken either relatively narrow or relatively broad perspectives on this issue. For example, insightful (mini-)reviews focusing specifically on epidemiological models for infectious diseases, rumor spreading, drug addiction, and sales dynamics can be found in the literature [
15,
16,
17,
18,
19,
20]. That is, these reviews address a specific research field and review various different models or modeling approaches used in that field, as illustrated in panel (a) of
Figure 2.
In contrast, interdisciplinary reviews typically take a broad perspective and address the modeling of complex network dynamics in general and in applications in epidemiology and other fields in particular [
3,
4]. That is, they focus on a general theoretical concept and work that concept out with the help of a variety of models and modeling approaches used in an array of research fields, as illustrated schematically in panel (b) of
Figure 2. Both discipline-specific reviews on epidemiological models and interdisciplinary reviews on complex system dynamics are valuable sources of information. However, reviews of the first type, by definition, do not allow researchers to look across disciplines. Reviews of the second type tend to abound with a plenitude of diverse modeling approaches, which again makes it difficult for researchers to compare these approaches across disciplines. In particular, as indicated in panel (b), due to the relatively broad perspective of these reviews, there are naturally various empty cells (as indicated by dashes) in the matrix that plots the discussed models versus the addressed research fields. This indicates that the goal of such a review is typically not to demonstrate the general applicability of certain specific models or modeling approaches. These reviews are about the applicability of the general concept. At issue is the presentation of an interdisciplinary overview that focuses on some of the most fundamental epidemiological models, which is the objective of the current study. More explicitly, the goal of the present study is to present two fundamental epidemiological models, the SIR and SEIR models, and their applications in a variety of disciplines in order to work out explicit commonalities across disciplines and differences between disciplines. As mentioned above, to this end, we will focus on the ODE modeling approach (see
Figure 1). That is, the modeling approach will be fixed and applications in various fields will be reviewed, as illustrated in panel (c) of
Figure 2. However, in order to work out the general applicability of such ODE models in a more explicit, concrete way, the current review will narrow down the ODE modeling approach to the two aforementioned benchmark models: SIR and SEIR. In doing so, the current study will assume the structure shown in panel (d). In line with the schematic shown in panel (d), the first objective is to demonstrate that in a variety of disconnected disciplines, the exact same mathematical baseline models have indeed been studied and used, and will most likely be used in future studies. Making researchers aware of this issue allows them to take advantage of the findings obtained in unrelated disciplines. Roughly speaking, researchers do not have to reinvent the wheel again and again. The second objective is to discuss peculiarities and differences across disciplines. Looking across disciplines allows one to question an established approach in one discipline in view of an established approach in another discipline. For example, a particular mechanism may be well studied in discipline A but not in discipline B. There might be good reasons for that. If not, it might be worthwhile to address that mechanism in the context of discipline B. To be clear, the current study is not a review as such. As mentioned above, excellent discipline-specific reviews (i.e., type (a) reviews, see
Figure 2) can be found in the literature. No attempt will be made to condense them into a single unified review. The current study is about two benchmark models (the SIR and SEIR models) that have been studied and applied in their own right and are the basic building blocks of most more complex epidemiological models. The current study will address these models in the context of the following eight selected disciplines (see again panel (d) of
Figure 2): epidemiology, virus dynamics, computer viruses, drug addiction, voter dynamics, rumor spreading, sales dynamics, and viral marketing/viral videos. In this way, this study attempts to deal with the trade-off of being specific and concrete on the one hand, and of illustrating generality on the other hand. This mixture of concreteness and generality should help and motivate researchers to look at models and disciplines not addressed in the current study in a similar way. Finally, it should be noted that this study is not about the solutions of models; it is about the (sub-)types of SIR and SEIR models used in epidemiology and other disciplines. It is about the structure of these models.
Before proceeding with the main findings, let us make a few comments on the methodological approach of the current study. First, in order to keep focused on an explicit set of mathematical models, in the current study, only SIR and SEIR models in the form of coupled ordinary differential equations will be considered. Second, in the current study, textbooks and (mini-)reviews have been evaluated in order to present the aforementioned SIR and SEIR models within coupled ordinary differential equation frameworks for the eight disciplines mentioned above. To this end, the following references have been used: Frank [
1] and Rock et al. [
2] for classical epidemiological models and virus dynamics models, Nwokoye and Madhusudanan [
21] for modeling malware spreading, van den Ende et al. [
18] and Wang et al. [
19] for epidemiological modeling of drug addiction, Turenne [
17] for rumor spreading, Guidolin and Manfredi [
20] for sales dynamics and its generalized research field, which is innovation diffusion, and Li et al. [
16] for modeling viral marketing and the emergence and spread of viral videos. In addition, searches using Google Scholar and Scopus have been conducted to identify studies on SIR and SEIR models in all disciplines to supplement the aforementioned references if necessary. Note that in view of the study objectives listed above, the particular selection of references is not a crucial part of the current study. In particular, the first objective is to exemplify the utilization of the benchmark SIR and SEIR ordinary differential equation models in a variety of research fields. Any appropriate selection of references would fulfill that purpose. Likewise, note that the selection of the aforementioned eight disciplines or research fields is again not a crucial step of the current study. As will be shown in the results section below, the objective of the current study is to be as explicit as possible and to provide a back-to-back comparison of the epidemiological models across the selected disciplines. Such an enterprise requires limiting the number of disciplines. Any limited set of disciplines would serve the purpose of the current study. The selected disciplines should be considered as exemplary research fields in which the utilization of epidemiological models has attracted interest.
The remainder of this study is organized as follows.
Section 2 discusses the use of SIR and SEIR (and their related SEI) models in the eight aforementioned disciplines. The focus is on mathematical equivalence or at least commonalities of the sub-types of SIR and SEIR models used in these fields. In
Section 3, differences across disciplines are pointed out. That is, the focus is on discipline-specific peculiarities. For the sake of brevity, only the SIR model will be reviewed. In this context, a generalized SIR model will be introduced in
Section 3.1.
Section 4 continues the discussion about discipline-specific peculiarities. More precisely, in
Section 4, different interpretations of the basic reproduction number, which is a key concept in epidemiology, will be briefly addressed. A brief discussion section including our conclusions will be presented in
Section 5.
4. The Basic Reproduction Number and Discipline-Specific Peculiarities
The basic reproduction number $R_0$ is a key concept in classical epidemiology [
1,
2,
90]. It is defined as the average number of newly infected individuals produced by a single typically infected individual when assuming a completely susceptible population [
1,
2,
90]. In the literature, there are several alternative ways to refer to $R_0$. Accordingly, the term “reproduction” may be replaced by “reproductive”. Likewise, the term “number” may be replaced by “ratio” [
1]. Irrespective of the precise terminology, roughly speaking, $R_0$ quantifies how many individuals are infected by an infected person on average when a new virus invades a population. In addition to this quantitative aspect, $R_0$ can be used as a bifurcation parameter: if an infected person produces on average more than one new case, then there will be a disease outbreak and the disease-free fixed point is unstable. In contrast, if an infected person produces on average fewer than one case, then the initial subpopulation of infected individuals will decay monotonically over time and the disease-free state is stable. For the baseline SIR model (
1), $R_0$ reads [
90]
For the baseline SEIR model (
12), $R_0$ reads [
2]
The question arises whether $R_0$ has been utilized and adopted in other disciplines such as those discussed in the current study. Let us dwell on this issue.
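The role of the basic reproduction number as a threshold can be illustrated numerically. The sketch below assumes the standard SIR model without demographics in normalized form (s + i + r = 1), for which the textbook expression is $R_0 = \beta/\gamma$; this assumed form may differ from the expressions for models (1) and (12) referred to above. The function names and parameter values are hypothetical.

```python
def infected_course(beta, gamma, i0=1e-3, dt=0.01, steps=20000):
    """Forward-Euler course of the infected fraction i(t).

    Assumed normalized SIR model (an illustration, not the source's Equation (1)):
        ds/dt = -beta*s*i,  di/dt = beta*s*i - gamma*i
    """
    s, i = 1.0 - i0, i0
    course = [i]
    for _ in range(steps):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s += ds * dt
        i += di * dt
        course.append(i)
    return course


def outbreak_occurs(beta, gamma):
    """Return True if the infected fraction ever rises above its initial value."""
    course = infected_course(beta, gamma)
    return max(course) > course[0] * (1.0 + 1e-9)


# One hypothetical parameter set below and one above the threshold R0 = 1.
for beta, gamma in [(0.05, 0.1), (0.3, 0.1)]:
    r_zero = beta / gamma
    print(f"R0 = {r_zero:.1f}: outbreak = {outbreak_occurs(beta, gamma)}")
```

In this sketch, the infected fraction decays monotonically for the parameter set with $R_0 < 1$ and passes through a maximum for the set with $R_0 > 1$, which is the qualitative change that motivates treating $R_0$ as a bifurcation parameter.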
In the field of virus dynamics, the basic reproduction number $R_0$ has indeed been used in analogy to $R_0$ as defined above for classical epidemiology. Accordingly, $R_0$ corresponds to the average number of newly infected cells due to the presence of a single infected cell when assuming that the cell population under consideration consists entirely of non-infected target cells [
105,
106,
107,
108]. The quantitative aspect of $R_0$ may be exploited by estimating $R_0$ from virus load data. In doing so, a quantitative measure for the within-host infectiousness of a given virus can be obtained (e.g., see [
108,
109,
110]). Just as in classical epidemiology, $R_0$ can be used as a bifurcation parameter. For $R_0 > 1$, the virus will spread in the human body. For $R_0 < 1$, the initial infection will decay, that is, the virus will be cleared faster than it can reproduce. Finally, note that in the field of virus dynamics, $R_0$ is a within-host reproduction number and describes the spread of a disease within an individual. In contrast, in classical epidemiology, $R_0$ denotes a reproduction number on the population level and describes the spread of an infectious disease across individuals.
Studies examining the spread of computer viruses have also adopted the concept of the basic reproduction number (see, for example, [
23,
26,
27,
29,
30,
63,
64,
65]). In this context, researchers frequently just point to the utilization of $R_0$ in classical epidemiology without presenting an explicit interpretation of $R_0$. Once $R_0$ is introduced in this heuristic way, it is used as a bifurcation parameter to examine the stability of fixed points of interest. Having said that, in the context of computer viruses, the basic reproduction number $R_0$ may be defined as the average number of newly infected computers produced by a single infected computer in a network of non-infected computers. In this definition, computers may be replaced by nodes if it is more appropriate to talk about the spread of malware across network nodes.
The basic reproduction number has been used in various studies on drug addiction [
11,
31,
32,
33,
35,
36,
67,
68,
69,
70]. Explicit definitions of $R_0$ have been given by some authors in the context of their respective studies. For example, White and Comiskey defined the basic reproduction number as the average “total number of people that each single drug user will initiate to drug use during the drug-using career” (see Section 3.1.1 in [
31]). A similar definition can be found in Tang et al. [
67]. In the context of studies focusing on tobacco addiction, it has been suggested to refer to $R_0$ as the smoker’s generation number [
35,
36]. In line with ref. [
36], when studying tobacco addiction at university campuses and how tobacco-addicted students lure fellow students into addiction, the smoker’s generation number may be defined as the average number of secondary cases of addicted smokers produced by a single addicted student smoker in a university population composed of non-addicted students. Note that in ref. [
36], a slightly different definition is actually presented because the authors consider the so-called effective reproduction number, which applies to situations where an epidemic is ongoing [
1]. Irrespective of the interpretation of $R_0$, studies concerned with drug epidemics frequently use $R_0$ as a bifurcation parameter. If $R_0 > 1$ holds, then there is an epidemic outbreak of a drug (e.g., a new synthetic drug [
68]). In contrast, if $R_0 < 1$ holds, then the initial invasion of a population by a particular drug will fail to trigger a drug epidemic outbreak.
The basic reproduction number $R_0$ has been used as a bifurcation parameter in a few studies on voter dynamics that take advantage of epidemiological ODE modeling [
37,
39]. In addition, Romero et al. [
41] defined specific reproduction numbers to capture the influence of specific groups (such as decided voters and party members) on undecided voters. Inspired by the terminology used by Romero et al. [
41], in the context of the baseline SIR model (
1) used in the study by Yong and Samat [
37] to describe voter dynamics, the basic reproduction number $R_0$ may be explicitly defined as the average number of undecided voters that are influenced to vote for a particular candidate by a single voter (of that candidate) who is placed in an entire population of undecided voters.
Studies on rumor spreading that have used epidemiological ODE models have often adopted $R_0$ as a bifurcation parameter to discuss the stability of fixed points (see, e.g., [
47,
73,
74,
75,
76,
77]). Accordingly, if $R_0 > 1$ holds for a particular rumor, then the rumor distributed by a single spreader generates more than one new spreader on average and the rumor spreads in the population. For $R_0 < 1$, a single spreader convinces on average fewer than one person to become a spreader and, consequently, the rumor will fade away monotonically. As such, $R_0$ may be explicitly defined as the average number of secondary cases of rumor spreaders caused by a single spreader in a population consisting entirely of ignorant individuals [
74,
76].
In studies on sales dynamics and innovation diffusion, the basic reproduction number has rarely been used. For example, in the recent review by Svoboda et al. [
53] on the utilization of the SIR model in the field of innovation diffusion, it is acknowledged that the basic reproduction number is a cornerstone concept of epidemiological modeling and that innovation diffusion and epidemiology exhibit strong similarities. However, the explicit utilization of $R_0$ in the field of innovation diffusion itself is not mentioned. In the review by Guidolin and Manfredi on innovation diffusion and the Bass sales model [
20], the basic reproduction number is discussed in the context of the SIR model (
1) for sales dynamics, as reviewed in
Section 2.7. Applications of $R_0$ to more complex models such as the Bass–SIR model (
29) or alternative models reviewed in ref. [
20] are not given. As an exception to this apparent absence of $R_0$ in the sales dynamics and innovation diffusion literature, Sharma [
79] used $R_0$ as a bifurcation parameter to analyze the buyers-and-reviews model (
22) described in
Section 2.15. In the context of the model (
22), $R_0$ may be defined as the average number of new shoppers at the online shop X that are generated by a single shopper at X in an initial population composed of online non-shoppers. Recall that according to the model (
22), these new shoppers emerge due to the positive reviews written by the shoppers at X. These reviews are not explicitly mentioned in the definition of $R_0$, just as the virus is not explicitly mentioned in the definitions of $R_0$ given above for classical epidemiological systems and the spread of viruses in human hosts.
In closing this section, let us briefly address viral marketing modeling and the modeling of the emergence of viral videos. In the context of the unaware–broadcaster–inert model (
11), the basic reproduction number $R_0$ has been defined as the average number of broadcasters that a single broadcaster produces when ignoring the inert class [
57]. The requirement of ignoring the inert class can be replaced by requiring (in analogy to the various previous definitions of $R_0$ listed above) that a scenario is considered that involves a completely susceptible population [
58]. That is, the population consists entirely of unaware individuals. Irrespective of the definition of $R_0$, in this research field, $R_0$ has typically been used as a bifurcation parameter for the purpose of stability analysis [
56,
57,
58,
82].
5. Discussion
It has been demonstrated that epidemiological models have found applications in a variety of disciplines outside of classical epidemiology. Importantly, it has been shown that the exact same mathematical ODE models utilized in epidemiology have been used in these alternative research fields. This has been exemplified for two benchmark epidemiological models: the SIR and SEIR models. The implication of this demonstration is far-reaching for researchers in all kinds of disciplines in which epidemiological models are currently applied: researchers studying a particular mathematical, epidemiological model in the context of a certain discipline should look beyond their specific discipline in order to find previous work relevant for their work. The reason for this is that theorems and solutions obtained in one discipline carry over to other disciplines as long as the mathematical models are identical across the disciplines. For example, in the wake of the COVID-19 pandemic, extensive work has been carried out to develop a perspective alternative to the state space perspective of epidemiological ODE models [
1]. This alternative perspective uses amplitude equations that have certain benefits as compared to the original state space equations (for details, see ref. [
1]). The amplitude equations for the baseline SIR models (
1) and (
4) with and without demographics are derived in chapter 4 in ref. [
1]. Likewise, the amplitude equations for the baseline
and
SEIR models (
17) and (
24) in the absence of demographic terms are derived in chapter 5 in ref. [
1]. The amplitude equation perspective obtained in this work in the context of classical virus spreading in human populations is ready for use in alternative disciplines. For example, it can be used to describe rumor spreading [
48] or drug addiction dynamics [
66]. In other words, once a problem associated with a particular epidemiological, mathematical model has been solved in a certain discipline, the same problem related to the same mathematical model does not need to be solved again in the context of another discipline. The demonstration provided in
Section 2 of mathematically equivalent models used across disciplines comes with a plea to researchers not to reinvent the wheel again and again. More importantly, the demonstration in
Section 2 comes with the invitation to researchers working in a particular research field to learn from and take advantage of results obtained in otherwise disconnected disciplines that just happen to use the same kind of mathematical, epidemiological model.
In particular, as mentioned earlier, analytical solutions of epidemiological models that have been developed in a particular field may be utilized by researchers working in alternative research fields despite characteristic differences across these fields. For example, characteristic time scales typically vary across the scientific disciplines that take advantage of epidemiological modeling. Computer viruses typically spread out in computer networks within hours or days [
24,
111]. Likewise, rumors may act on relatively short time scales. A rumor may become popular within a day. Subsequently, its popularity may decay dramatically [
48,
99]. Just like rumors, viral videos tend to reach their maximal popularity within a few days [
60]. Viral infections within humans such as influenza and COVID-19 come with virus load dynamics that build up within a week, reach a peak level, and subsequently decay on a similarly fast time scale [
1,
109]. In contrast, virus infections on the population level typically take place on longer time scales of months [
1,
9,
112], and the development of such epidemics may be studied over generations, that is, on even longer time scales [
2]. Likewise, the sale of new products may reach a maximum only after several years [
54]. Exact analytical solutions obtained for a particular model (e.g., the baseline SIR model), by definition, are not approximations and, consequently, hold for any parameter set of the model in question and for any time scale being considered. For this reason, exact solutions can conveniently be transferred across disciplines. An illustrative example in this regard is the equation for the maximum value of an infection wave described by the baseline SIR model without vital dynamics, as defined by Equation (
4). Let $I_0$ and $S_0$ denote the initial values of $I$ and $S$ at the initial time $t_0$. Then, the maximal value of infected individuals, $I_{\max}$, can be computed from the following [
1,
90]:
If influenza waves or COVID-19 waves are modeled by a SIR model (
4) as in refs. [
1,
112,
113], then Equation (
33) can be applied. If the virus dynamics within humans infected by the influenza virus are modeled using the TV model as in ref. [
109], then Equation (
33) may be applied as well due to the equivalence of the TV model to the SIR model (see
Section 2.2). In this context, it does not matter that in the former case, the infection waves evolve over several months, while in the latter case, the virus load trajectories evolve over a much shorter period of days. Having said that, the situation may be more complex when considering analytical approximative solutions of epidemiological ODE models. For example, in the field of virus dynamics, two-phase approximative solutions describing the increase in and decay of virus load have been proposed [
1,
114,
115]. For illustration purposes, let us present here a slightly improved version of the two-phase approximative solution derived in ref. [
1]. It reads
In Equation (
34),
describes the first phase, namely, the increase in the viral load towards the maximal (peak) value
, and
describes the second phase, namely, the decay of the virus load towards zero after the peak value has been reached. The approximations (
34) can be derived from the TIV model (
15) for
[
1]. Accordingly,
denotes the largest eigenvalue of the TIV model for which an analytical expression exists [
1,
115], and
b can be computed from the so-called unstable eigenvector of the model [
1].
is the time point at which the virus load becomes maximal (such that
), and
is the decay parameter occurring in Equation (
15).
denotes a time shift that was neglected in ref. [
1] and can be used to improve the approximation. More precisely,
can be used to shift the reference time point
away from
into the second phase of exponential decay (see the example below). Finally,
is the virus load at the reference time point
:
. For
, Equation (
34) is reduced to the original two-phase approximation presented in ref. [
1].
Figure 3 illustrates an application of Equation (
34) for one of the COVID-19 patients discussed in ref. [
1]. The virus load is shown as a function of time in a log-lin scale. The virus load data of the patient (blue circles) are fitted to the TIV model (
15) [
1]. The best-fit solution is shown as a black line. From the best-fit model, parameters
and
b are calculated. The dotted red line shows the first-phase approximation
thus obtained. According to the TIV model, the virus load peaks at 5.36 days. A shift
of 1 day moves the reference point into the second phase of exponential decay (which shows up as a linear decay in the log-lin plot). The dotted green line shows
with
d,
d, and
, as obtained from the best-fit TIV model.
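The idea behind the first-phase approximation, exponential growth at a rate given by the largest eigenvalue, can be checked with a short numerical sketch. The snippet assumes the standard target-cell-limited TIV form dT/dt = -βTV, dI/dt = βTV - δI, dV/dt = pI - cV, which may differ from Equation (15) of this study. All parameter values are hypothetical (they are not the fitted patient parameters), and the function names are illustrative.

```python
import math


def tiv_growth_rate(beta, delta, p, c, t0_cells):
    """Largest eigenvalue of the (I, V) linearization at the infection-free state."""
    trace = -(delta + c)
    disc = (delta - c) ** 2 + 4.0 * beta * p * t0_cells
    return 0.5 * (trace + math.sqrt(disc))


def simulate_virus_load(beta, delta, p, c, t0_cells, v0, t_end, steps):
    """RK4 integration of the assumed TIV model; returns V on the time grid."""
    def rhs(state):
        tc, inf, v = state  # target cells, infected cells, virus load
        return (-beta * tc * v, beta * tc * v - delta * inf, p * inf - c * v)

    dt = t_end / steps
    state = (t0_cells, 0.0, v0)
    loads = [v0]
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)))
        k3 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)))
        k4 = rhs(tuple(x + dt * k for x, k in zip(state, k3)))
        state = tuple(x + dt * (a + 2 * b + 2 * g + d) / 6.0
                      for x, a, b, g, d in zip(state, k1, k2, k3, k4))
        loads.append(state[2])
    return loads


# Hypothetical, normalized parameter values (time in arbitrary units).
params = dict(beta=1.0, delta=1.0, p=30.0, c=5.0, t0_cells=1.0)
lam = tiv_growth_rate(**params)
loads = simulate_virus_load(**params, v0=1e-9, t_end=3.0, steps=3000)
# Slope of ln V over one time unit (t = 2 to t = 3), well before target cells deplete.
slope = math.log(loads[3000] / loads[2000])
print(f"eigenvalue: {lam:.4f}, simulated log-slope: {slope:.4f}")
```

On a log-lin plot, this early phase appears as a straight line whose slope is the eigenvalue, which is why a single exponential can describe the rise of the virus load as long as target-cell depletion is negligible.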
Due to the equivalence of the TIV model with the 3D SEIR model (
13), as seen in
Section 2.10, the two-phase approximation (
34) can be used in different disciplines using the SEIR model (
13). However, the accuracy of the approximation depends in general on the model parameters. Since the model parameters vary across disciplines just as the characteristic time scales mentioned above, the usefulness of the approximation must be checked for each discipline separately (and, in general, within disciplines for each application). Moreover, clinically relevant changes in virus load can be seen on log-scales. On the log-lin graph shown in
Figure 3, the two-phase approximation based on the two exponential functions provides a reasonably good fit to the exact TIV model solution. In other disciplines, state variables are typically shown on linear scales. Consequently, the usefulness of the approximation (
34) for state variables shown on linear scales would require additional testing. Overall, the two examples related to Equations (
33) and (
34) illustrate that transferring knowledge across different scientific disciplines that use the same kind of epidemiological models is possible and promising but should be done with caution.
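That an exact peak formula is insensitive to the time scale can be illustrated with a short sketch. It assumes the standard normalized SIR model without vital dynamics, ds/dt = -βsi and di/dt = βsi - γi, for which the textbook relation is I_max = I_0 + S_0 - ρ(1 + ln(S_0/ρ)) with ρ = γ/β. This assumed form may differ in notation from Equation (33), and all function names and parameter values are hypothetical.

```python
import math


def peak_infected_exact(beta, gamma, s0, i0):
    """Closed-form peak of i(t) for the assumed normalized SIR model."""
    rho = gamma / beta
    if s0 <= rho:
        return i0  # no growth phase: the initial value is already the maximum
    return i0 + s0 - rho * (1.0 + math.log(s0 / rho))


def peak_infected_numeric(beta, gamma, s0, i0, t_end, steps=20000):
    """Peak of i(t) obtained from an RK4 integration of the same assumed model."""
    def rhs(s, i):
        return -beta * s * i, beta * s * i - gamma * i

    dt = t_end / steps
    s, i, peak = s0, i0, i0
    for _ in range(steps):
        k1s, k1i = rhs(s, i)
        k2s, k2i = rhs(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = rhs(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = rhs(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
        peak = max(peak, i)
    return peak


s0, i0 = 0.999, 0.001
exact = peak_infected_exact(0.3, 0.1, s0, i0)
# Slow process: hypothetical rates per day, observed over 200 days.
slow = peak_infected_numeric(0.3, 0.1, s0, i0, t_end=200.0)
# Fast process: the same rates scaled 24-fold, observed over a 24-fold shorter window.
fast = peak_infected_numeric(0.3 * 24, 0.1 * 24, s0, i0, t_end=200.0 / 24)
print(f"exact: {exact:.6f}, slow: {slow:.6f}, fast: {fast:.6f}")
```

Because the closed form depends on the rates only through the ratio γ/β, rescaling both rates by a common factor changes when the peak occurs but not its height, which is the sense in which such an exact result carries over between fast and slow disciplines.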
In Section 3, it was also exemplified with the help of the eight research fields considered in the current study that there are differences across disciplines when using the same type of epidemiological ODE models. Naturally, the interpretation of the state variables and model coefficients differs across disciplines, as seen in Tables 2–9. However, when comparing Tables 2–9, it also becomes clear that in some disciplines, transition coefficients are relevant that are irrelevant (or have frequently been neglected) in other disciplines. To make this point more explicit, Table 10 provides a condensed overview of Tables 2–9. In Table 10, for each discipline, the non-vanishing transition coefficients of the SIR model with general linear transition mechanisms (26) are indicated. In doing so, Table 10 allows the convenient identification of structural similarities and differences in the SIR modeling approaches across the disciplines considered in the current study.
Note that in
Table 10, the two research fields about sales dynamics/innovation diffusion and viral marketing/viral videos are listed separately, just like the remaining six research fields. However, since these research fields are closely related to each other, we also condensed them into a single discipline called sales dynamics/viral marketing. Consequently, the following discussion will be centered around seven disciplines rather than eight. Note also that in the case of the SIR model for epidemiology, some coefficients are presented with parentheses “()” and stars “***”. The meanings of these markers were discussed in
Section 3.2. For the discussion about structural differences across disciplines, we just need to note that the relevance of these coefficients is open to debate. That is, SIR models in epidemiology could in fact exhibit more than just one irrelevant coefficient. For the sake of simplicity, let us count the transition coefficients flagged by the aforementioned markers “()” and “***” as relevant ones.
As can be seen in
Table 10, there are three single-entry-missing disciplines. That is, there are three disciplines featuring SIR models for which all transition coefficients except for one are relevant. These disciplines are epidemiology, computer viruses, and the discipline centered around sales dynamics and viral marketing. The single-entry-missing SIR models in these three disciplines (epidemiology, computer viruses, and sales dynamics/viral marketing) feature different missing entries. For SIR models in epidemiology, the coefficient
, reflecting spontaneous
transitions, is typically ignored. In contrast, for SIR models describing computer virus spreading, it is the coefficient
, associated with spontaneous
transitions, that is typically ignored and set as equal to zero. Finally, for SIR models in the field of sales dynamics and viral marketing, the missing transition coefficient is the coefficient
, describing spontaneous
transitions. This implies that the maximally complex models (i.e., the models that exhibit all but one coefficient) in these three disciplines exhibit different mathematical structures. More precisely, they exhibit transition matrices with zero entries at different positions. Disciplines exhibiting (according to the discussion provided in
Section 3) two missing entries are virus dynamics, drug addiction, and voter dynamics. Interestingly, virus dynamics SIR models and drug addiction SIR models exhibit the same vanishing coefficients:
and
. Having said that, one should note that the virus dynamics SIR model does not exhibit an
R compartment. Consequently, only the dynamics of the compartments
S and
I can be compared across the two disciplines when focusing on SIR modeling approaches. The virus dynamics model and the drug addiction model are mathematically equivalent (in the
subspace) when the dynamics of
R for the drug addiction SIR model do not affect the dynamics in the
subspace. That is, we need to require that the coefficients
and
vanish in the drug addiction model. In summary, any solution or theorem that is about the
subspace dynamics and holds either for the virus dynamics SIR model characterized in
Table 10 or for the drug addiction SIR model with a coefficient listed in
Table 10 where
can be carried over to the respective complementary discipline (i.e., from virus dynamics to drug addiction or vice versa from drug addiction to virus dynamics). As mentioned earlier, the third discipline featuring a maximally complex model with two vanishing transition coefficients is the research field for voter dynamics. The missing coefficients are
and
. Consequently, the maximally complex SIR model in this research field differs from the maximally complex SIR models in the fields of virus dynamics and drug addiction. Finally, according to the review presented in
Section 3, the maximally complex SIR model for rumor dynamics is characterized by a triple-entry-missing transition matrix. Overall, certain disciplines (those with single-entry-missing maximally complex models) exhibit a richer structure than other disciplines (those with double-entry- or triple-entry-missing maximally complex models). Moreover, the structures of the maximally complex models differ across all seven disciplines. That is, each discipline exhibits a maximally complex model with a unique structure.
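The subspace argument made above for the virus dynamics and drug addiction models can be checked numerically. The following is a minimal sketch, not the formulation of Equation (26): it assumes a bilinear infection term with rate `beta` and stores the linear transition coefficients in a dictionary keyed by hypothetical `(target, source)` pairs. All names and numerical values are illustrative assumptions. If no coefficient has R as its source, the S and I trajectories do not depend on R at all.

```python
def simulate(a, beta, s0, i0, r0, dt=0.01, steps=2000):
    """Euler-integrate an SIR-type model and return the (S, I) trajectory.

    a maps (target, source) -> rate of the linear flow source -> target
    (an assumed convention, chosen only for this illustration).
    """
    s, i, r = s0, i0, r0
    path = []
    for _ in range(steps):
        x = {"S": s, "I": i, "R": r}
        infection = beta * s * i  # bilinear infection term
        flow = {"S": 0.0, "I": 0.0, "R": 0.0}
        for (target, source), rate in a.items():
            flow[target] += rate * x[source]  # inflow into the target
            flow[source] -= rate * x[source]  # matching outflow from the source
        s += dt * (-infection + flow["S"])
        i += dt * (infection + flow["I"])
        r += dt * flow["R"]
        path.append((s, i))
    return path


# No transition starts in R: the (S, I) subspace is closed.
decoupled = {("R", "I"): 0.1, ("S", "I"): 0.05}
# Adding an R -> S transition couples the subspace to R.
coupled = dict(decoupled)
coupled[("S", "R")] = 0.2

same = simulate(decoupled, 0.5, 0.9, 0.1, 0.0) == simulate(decoupled, 0.5, 0.9, 0.1, 0.3)
differ = simulate(coupled, 0.5, 0.9, 0.1, 0.0) != simulate(coupled, 0.5, 0.9, 0.1, 0.3)
print(same, differ)
```

In this sketch, changing the initial value of R leaves the (S, I) trajectory untouched only in the decoupled case, which is the condition under which results can be carried over between the two disciplines.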
As discussed in the context of
Table 10, spontaneous
transitions (as described by
) are typically not considered in the disciplines of classical epidemiology, voter dynamics, and rumor dynamics. However, the argument made in the field of drug addiction modeling that spontaneous
transitions may occur could also be made for voter and rumor dynamics. As far as classical epidemiology is concerned, the class
R typically describes recovered individuals that have cleared the virus under consideration out of their bodies. Consequently, in this context, spontaneous
transitions cannot occur.
When taking a data-driven approach, the maximally complex model of the respective discipline may be fitted to the data at hand. A purely data-driven approach avoids a detailed discussion about which transition coefficients should be included and which should not. In data-driven approaches, the data decides how the model at hand is parameterized without taking any a priori knowledge into account [
117]. Only in a subsequent step would the estimated matrix coefficient
be interpreted in terms of the mechanisms listed in
Table 2. In this step, the hypothesis about the different mechanisms underlying the coefficients can be checked. For example,
Table 2 suggests that
should hold such that the coefficients can be interpreted as estimates for the decay parameter
like
.
Note that in the context of such a data-driven approach, the coefficients are assumed to be independent of each other. In this context, the question arises whether the data set at hand is sufficient to fit the parameters of a given model. According to the 1-to-10 rule of statistics, there should be at least 10 data points for each to-be-fitted parameter. The baseline SIR model (
1) without vital dynamics comes with two parameters:
and
. Including vital dynamics, the model exhibits four parameters:
, and
. The maximally complex models as listed in
Table 10 can have up to eight non-vanishing matrix coefficients (which are considered to be independent when considering the data-driven approach). Consequently, when ignoring vital dynamics, a data set should have at least 20 data points to fit the two-parameter baseline SIR model and should contain at least 90 data points to fit a model featuring
and eight independent matrix coefficients
. A similar consideration can be made for the case when the vital dynamics are taken into account.
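The arithmetic behind these data requirements can be written out as a short sketch. The function name `min_data_points` is a hypothetical helper, and the factor of 10 is simply the rule of thumb quoted above, not a strict statistical requirement.

```python
POINTS_PER_PARAMETER = 10  # the 1-to-10 rule of thumb quoted in the text


def min_data_points(n_parameters, points_per_parameter=POINTS_PER_PARAMETER):
    """Smallest data set size suggested by the rule of thumb."""
    return n_parameters * points_per_parameter


baseline = min_data_points(2)        # two-parameter baseline model
with_vital = min_data_points(4)      # baseline model including vital dynamics
maximal = min_data_points(1 + 8)     # one further parameter plus eight matrix coefficients
print(baseline, with_vital, maximal)
```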
If a data set does not contain sufficient data points, a feasible solution is to fit mechanisms (i.e., factors) rather than coefficients. In this approach, the matrix coefficients are not independent and thus the number of independent parameters becomes relatively small. Moreover, one may fit just one additional mechanism (factor) at a time. In doing so, as is common practice in statistics, one may test models with different predictors against each other (see, for example, [
118,
119]). For example, let us assume the goal is to fit an SIR model to infectious disease data of a relatively short infection wave for which vital dynamics can be ignored. In this case, the two-parameter baseline model (
26) with
and
,
, and all other parameters set equal to zero, is fitted in a first step to the data. Subsequently, three-parameter models that exhibit one additional parameter (that will be denoted below by
) are considered, capturing one of the possible mechanisms listed in
Table 2. For the four mechanisms listed in
Table 2, we obtain the four models labeled 1–4 in
Table 11. The goodness of fit of models 1–4 may then be compared. In doing so, the mechanism (factor) that improves the model fit the most can be identified. In addition, the performance of the three-parameter models 1–4 may be compared with that of the two-parameter baseline model using the Akaike information criterion, which is a standard procedure when comparing models that differ in the number of their parameters (see, for example, [
119]).
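The comparison step can be sketched as follows. The sketch assumes that each model has been fitted by least squares with Gaussian errors, in which case the Akaike information criterion can be computed, up to an additive constant, as n ln(RSS/n) + 2k. The residual sums of squares below are placeholder values chosen for illustration, not results from any data set, and the model names are hypothetical.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC (up to an additive constant) for a least-squares fit."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def rank_models(candidates, n_points):
    """Sort models by AIC, lowest (preferred) first.

    candidates maps a model name to (residual sum of squares, parameter count).
    """
    scores = {
        name: aic_least_squares(rss, n_points, k)
        for name, (rss, k) in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder fit results for 40 data points (illustrative only).
candidates = {
    "baseline": (4.0, 2),  # two-parameter baseline model
    "model_1": (3.9, 3),   # small improvement in fit, one extra parameter
    "model_2": (2.0, 3),   # large improvement in fit, one extra parameter
}
ranking = rank_models(candidates, n_points=40)
for name, score in ranking:
    print(name, round(score, 2))
```

With these placeholder numbers, the second three-parameter model ranks first, whereas the first one ranks below the baseline because its small gain in fit does not offset the penalty for the additional parameter.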
Having said that, in practice, a priori knowledge is used when it comes to model fitting (see, for example, [
1] and the references therein). That is, usually not all parameter values are estimated; some are taken from the literature. For example, birth rate and death rate parameters can often be inferred on the basis of available demographic data. In doing so, the parameter-fitting step can be dramatically simplified. In closing these considerations, let us return to
Figure 1. As has been pointed out several times throughout this study, the SIR ODE model with linear transition mechanisms can be generalized in various ways. For example, adding the compartment
E to the SIR model such that it turns into an SEIR model increases the number of potentially relevant elements in the transition matrix
A from 9 to 16. Adding another compartment inflates the number of possibly relevant transition matrix elements to 25. Likewise, adding nonlinear transmission mechanisms or delays to an ODE model with linear transition mechanisms typically increases the number of to-be-fitted parameters. Consequently, overfitting a model that exhibits too many parameters relative to the length of a given data set can become a severe problem when dealing with complex, high-dimensional epidemiological models.
The current study focused on epidemiological models formulated in terms of ODEs. As discussed in the Introduction and illustrated in
Figure 1, there are different types of epidemiological mathematical models that were not addressed in the current study and are beyond its scope. For example, delay differential equations rather than ordinary differential equations have been used in the literature. Epidemiological delay differential equation models can be found in a variety of different disciplines such as classical epidemiology [
12,
13], virus dynamics [
120], computer viruses [
26,
29,
121], drug addiction dynamics [
87,
122], and rumor spreading [
123,
124]. In principle, the approach presented in the current study for epidemiological ODE models can be applied to any type of epidemiological mathematical model. That is, there is room to generalize the present study in future work. In this context, note that in
Section 3, several examples of nonlinear transition mechanism models were presented, as seen in Equations (
27), (
28), and (
30). We are inclined to believe that discussing nonlinear transition mechanism models using the approach of the present study would be a cumbersome and almost impossible enterprise due to the almost endless mathematical possibilities of formulating nonlinear transition terms. A feasible approach would be to focus on models exhibiting a particular type of nonlinearity. For example, the models (
27), (
28), and (
30) all exhibit quadratic nonlinearities of the form
or
, where
and
are state variables.
Nonlinear functions may not only be implemented to generalize transition mechanisms as captured by the transition matrix
A defined in Equation (
26); they can also be used to generalize the standard infection term
(see, for example, Equations (
1) and (
12)), describing the rate at which susceptible individuals become infected. Accordingly,
is replaced by
, where
are state variables of the model under consideration and
h is a function of
. Saturation effects [
29,
64,
125] or other effects slowing down the infection rate [
6,
8,
126,
127] are frequently accounted for by introducing this kind of nonlinearity.
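The effect of such a generalized infection term can be illustrated with a small simulation. The sketch below treats the infection term as a pluggable function of the state variables and compares the bilinear form with one common saturating form, beta*S*I/(1 + alpha*I). The function names, the parameter `alpha`, and all numerical values are assumptions made for illustration and are not taken from the cited models.

```python
def bilinear(beta):
    """Standard infection term beta * S * I."""
    return lambda s, i: beta * s * i


def saturated(beta, alpha):
    """Saturating infection term; reduces to the bilinear form for alpha = 0."""
    return lambda s, i: beta * s * i / (1.0 + alpha * i)


def peak_infected(incidence, gamma, s0, i0, dt=0.1, steps=3000):
    """Euler-integrate an SIR model without vital dynamics; return the peak of I."""
    s, i = s0, i0
    peak = i
    for _ in range(steps):
        new_infections = incidence(s, i)
        s, i = s - dt * new_infections, i + dt * (new_infections - gamma * i)
        peak = max(peak, i)
    return peak


peak_bilinear = peak_infected(bilinear(0.5), gamma=0.1, s0=0.99, i0=0.01)
peak_saturated = peak_infected(saturated(0.5, 5.0), gamma=0.1, s0=0.99, i0=0.01)
print(peak_bilinear, peak_saturated)
```

Because the saturating term never exceeds the bilinear term for the same state, the infection wave in this sketch peaks at a lower level when saturation is switched on.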
Note that the bilinear term
describes the infection of a susceptible individual due to their contact with an infected individual. As argued in the previous paragraph, under certain circumstances, infection processes should be described by expressions that go beyond this bilinear form. Another interesting mechanism that leads to additional nonlinearities
in the infection term
is to take multiple or higher-order interactions into account. For example, when describing the spread of infectious diseases, the infection of an individual due to two contacts with infectious individuals that happen over a relatively short period of time is better described by the term
, where the state variable
I occurs in quadratic form [
128]. Likewise, in the field of drug addiction, the impact of social pressure that lures non-addicted individuals into addiction may be better described by a quadratic function of the addicted individuals
A like
[
98,
122]. Again, the rationale here is that social pressure is exerted not by a single individual but by a group of individuals affecting a susceptible person. In the context of rumor spreading and information diffusion, it seems natural to expect that such higher-order interactions take place. For example, susceptible individuals may become aware of a new piece of information via the joint impact of two spreaders, that is, two individuals who tend to pass on the information under consideration [
129].
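The difference between pairwise and higher-order infection terms can be made explicit with a short worked comparison. The symbols used here (the rate constants \beta and \beta_2, and A for the addicted compartment) are assumed for illustration and need not match the notation of the cited works.

```latex
\begin{align}
  \text{single contact:} \quad
    & \left.\frac{dS}{dt}\right|_{\text{infection}} = -\beta\, S I, \\
  \text{two contacts in short succession:} \quad
    & \left.\frac{dS}{dt}\right|_{\text{infection}} = -\beta_2\, S I^{2}, \\
  \text{group pressure in addiction models:} \quad
    & \left.\frac{dS}{dt}\right|_{\text{infection}} = -\beta_2\, S A^{2}.
\end{align}
```

Doubling the number of infectious individuals doubles the first rate but quadruples the second, so higher-order terms contribute little at low prevalence and gain weight as prevalence grows.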
Figure 1 not only points out the importance of nonlinear mechanisms but also indicates that network models outperform ODE models in terms of generality. While network models can capture effects related to the heterogeneity of networks that are neglected in ODE models, ODE models are a powerful tool to study a plenitude of mechanisms that hold irrespective of the explicit structure of networks. For example, ODE models provide a useful framework for studying the effect of vaccination on the spread of an infectious disease in a population [
1,
2]. Likewise, when considering, for example, the spread of computer viruses, various effects, as discussed in
Section 3.4, can be adequately addressed with the help of ODE models, such as the roll-out of anti-virus software [
29,
96,
97], the waning of such anti-virus software’s protection [
24,
25,
29], the reactivation of cleaned computers [
96,
97], or the premature reactivation of computers that are still infected [
96,
97]. Such effects and many others can be studied without a specific network structure in mind. Of course, network structures may have an impact on the magnitude of an effect. Therefore, the results obtained from the ODE models establish baseline scenarios that may be re-investigated in follow-up studies involving network models.
In this study, eight research disciplines were considered in which epidemiological modeling makes an essential contribution. Other research disciplines could be considered and/or some of the disciplines mentioned in this study could be broadened to include a wider spectrum of applications. For example, not only could smoking and alcohol addiction be passed on from addicted individuals to non-addicted individuals, as discussed in
Section 2 and
Section 3, but also certain types of behaviors. In this context, Usaini et al. [
130] pointed out that, in some student populations, negative attitudes and behavior toward instructors are known to spread during exam weeks. The authors used an epidemiological ODE model to describe this kind of transfer of attitude-guided behavior across students and the emerging behavioral epidemic. Likewise, related to the topic of innovation diffusion discussed in
Section 2 and
Section 3, one may consider the spread of knowledge in academia as a form of epidemic in which knowledge is passed on from scholar to scholar. The key idea in this context is to measure the amount of knowledge in terms of the number of appropriately selected publications [
100,
131,
132]. Epidemiological ODE models can then be proposed and fitted to data, as in refs. [
131,
132]. In the wake of the COVID-19 pandemic, it became obvious how vulnerable supply chains are to disturbances and how easily they can break down. This observation fueled interest in applying epidemiological models to examine the spread of disturbances in supply chains [
15]. In short, research fields and applications that have not been addressed in the current study may be evaluated in a similar way as in the present study either to advance the research in those fields or to learn from insights or peculiarities that have been obtained and addressed in those fields.
In certain cases, it may be useful to combine epidemiological models from two different research fields into a single comprehensive model. For example, the spreading of knowledge concerning an infectious disease and the spreading of the disease itself have been modeled by a single complex model that takes into account various interactions between these two layers [
129]. Likewise, information diffusion as an epidemiological phenomenon and the aforementioned spread of supply chain disturbances have been studied jointly by merging two epidemiological models [
133]. Such approaches may be seen in analogy to well-established models of infectious zoonotic diseases, where the virus spreads both within animal and human populations and between human and animal populations [
2,
9]. While in zoonotic disease modeling two epidemiological models from the same kind of discipline (i.e., infectious disease epidemiology) are combined, the previous examples are about merging models from different disciplines. Being aware of this analogy can help researchers in the field of zoonotic diseases to take advantage of research on interdisciplinary two-layered epidemiological approaches and vice versa.