Structural and Practical Identifiability of Phenomenological Growth Models for Epidemic Forecasting

Liyanage, Yuganthi R.; Chowell, Gerardo; Pogudin, Gleb; Tuncer, Necibe

doi:10.3390/v17040496


¹ Department of Mathematics and Statistics, Florida Atlantic University, Boca Raton, FL 33431, USA

² School of Public Health, Georgia State University, Atlanta, GA 30303, USA

³ Department of Applied Mathematics, Kyung Hee University, Yongin 17104, Republic of Korea

⁴ Laboratoire d’Informatique, CNRS, Ecole Polytechnique, IP Paris, 91120 Palaiseau, France

* Authors to whom correspondence should be addressed.

Viruses 2025, 17(4), 496; https://doi.org/10.3390/v17040496

Submission received: 26 February 2025 / Revised: 21 March 2025 / Accepted: 27 March 2025 / Published: 29 March 2025

(This article belongs to the Special Issue Computational Biology of Viruses: From Molecules to Epidemics, Volume 2)


Abstract

Phenomenological models are highly effective tools for forecasting disease dynamics using real-world data, particularly in scenarios where detailed knowledge of disease mechanisms is limited. However, their reliability depends on the model parameters’ structural and practical identifiability. In this study, we systematically analyze the identifiability of six commonly used growth models in epidemiology: the generalized growth model (GGM), the generalized logistic model (GLM), the Richards model, the generalized Richards model (GRM), the Gompertz model, and a modified SEIR model with inhomogeneous mixing. To address challenges posed by non-integer power exponents in these models, we reformulate them by introducing additional state variables. This enables rigorous structural identifiability analysis using the StructuralIdentifiability.jl package in JULIA. We validated the structural identifiability results by performing parameter estimation and forecasting using the GrowthPredict MATLAB Toolbox. This toolbox is designed to fit and forecast time series trajectories based on phenomenological growth models. We applied it to three epidemiological datasets: weekly incidence data for monkeypox, COVID-19, and Ebola. Additionally, we assessed practical identifiability through Monte Carlo simulations to evaluate parameter estimation robustness under varying levels of observational noise. Our results confirm that all six models are structurally identifiable under the proposed reformulation. Furthermore, practical identifiability analyses demonstrate that parameter estimates remain robust across different noise levels, though sensitivity varies by model and dataset. These findings provide critical insights into the strengths and limitations of phenomenological models to characterize epidemic trajectories, emphasizing their adaptability to real-world challenges and their role in informing public health interventions.

Keywords:

phenomenological growth models; epidemic forecasting; structural identifiability; practical identifiability

1. Introduction

Phenomenological growth models are widely used to describe infectious disease dynamics, offering critical insights into the parameters governing their spread and control. A critical aspect for ensuring the reliability of epidemiological modeling is the identifiability of the model parameters [1,2,3,4,5], as it determines whether the values of the parameters can be accurately and uniquely estimated from available data. Structural identifiability analysis addresses whether model parameters can be uniquely determined from perfect, error-free data [2], which forms the basis for robust modeling. Without this assurance, reliable parameter estimation becomes impossible, undermining the model’s utility for forecasting and informing public health interventions.

Structural identifiability is particularly challenging for phenomenological models, which often involve nonlinear dynamics and transformations with non-integer exponents that complicate traditional analytical approaches. In this study, we focus on analyzing the structural identifiability of six widely adopted growth models in epidemiology: the generalized growth model (GGM), generalized logistic model (GLM), Richards model, generalized Richards model (GRM), Gompertz model, and a modified SEIR model incorporating inhomogeneous mixing [6]. These models capture a range of growth patterns, from sub-exponential to logistic growth, enabling application to diverse epidemiological settings.

We use the differential algebra approach implemented in the StructuralIdentifiability.jl package (version 0.5.14) in JULIA (version v1.10.2) [7] to assess structural identifiability. This approach eliminates unobserved state variables and derives differential algebraic polynomials that relate observed variables and model parameters. We reformulate models with non-integer exponents by introducing additional state variables, ensuring compatibility with existing analysis methods. This adaptation enables identifiability analysis for models that would otherwise be analytically intractable [8,9].

However, real-world data often contain noise and measurement errors, making some parameters difficult to estimate in practice. This is where practical identifiability becomes important: it evaluates whether structurally identifiable parameters can still be estimated with reasonable accuracy from noisy observations [10,11,12]. By simulating datasets with varying noise levels, we examine each model’s robustness and the practical feasibility of accurately estimating its parameters. To evaluate these models’ performance and practical identifiability under real-world conditions, we used the GrowthPredict Toolbox [13] in MATLAB (version R2024b), which facilitates parameter estimation and forecasting using a suite of phenomenological growth models.

The contributions of this work are twofold: (1) we rigorously establish the structural identifiability of commonly used phenomenological models, providing theoretical assurance for parameter estimation; (2) we investigate the practical identifiability of these models under noisy data conditions, offering insights into their reliability and constraints when applied to real-world epidemic datasets. These findings build upon and complement prior works that have established identifiability for other epidemiological models, including compartmental and mechanistic frameworks, thereby situating our work within the broader literature on model identifiability [14,15,16,17,18,19,20,21,22]. Together, these contributions enhance the utility of phenomenological models for forecasting infectious disease dynamics and guiding public health decision-making. The input codes and output results from JULIA and Matlab used in this paper are publicly accessible at our GitHub repository: https://github.com/YuganthiLiyanage/Phenomenological-Growth-Models (accessed on 26 March 2025).

2. Structural Identifiability of Phenomenological Models

Structural identifiability analysis is a theoretical framework used to determine whether the parameters of a model can be uniquely inferred from unlimited, error-free observations. This analysis is essential for ensuring the mathematical integrity of a model before applying it to real-world data, as it confirms that parameter estimates are both unique and reliable. Various techniques, such as differential algebra methods, the Taylor series expansion method, and input-output approaches, have been developed to evaluate structural identifiability [10,12].

In this study, we employ the differential algebra method, a powerful approach that systematically eliminates unobserved state variables and derives differential algebraic polynomials involving the observed variables and model parameters [12,22,23]. To achieve this, we utilize the StructuralIdentifiability.jl package in JULIA, which enables efficient derivation of these polynomials and facilitates rigorous identifiability analysis [7].

We reformulate the model structure for models with non-integer power exponents by introducing additional state variables. This reformulation not only preserves the fundamental structure of the model but also extends the applicability of identifiability analysis to previously intractable cases. By applying this approach, we ensure that parameters and state variables remain identifiable across models (3), (6), (9), (11), (13), (15), broadening the utility of these models in practical epidemic forecasting.

Structural identifiability theory: To analyze the structural identifiability of models, we rewrite them in the following compact form:

\begin{matrix} x^{'} (t) & = f (x, p), \\ y (t) & = g (x, p), \end{matrix}

(1)

where $x(t)$ denotes the vector of state variables, $y(t)$ represents the observations, and $p$ denotes the vector of parameters. A parameter in model (1) is called structurally identifiable if its value can be uniquely determined from the observable $y(t)$ (assuming noise-free measurements). A similar property for states is typically referred to as observability, but in the context of the present paper, it will be convenient for us to use the term identifiability in both cases. In fact, we will define identifiability for an arbitrary function of parameters and states. We will give a simplified version of the definition to avoid unnecessary technicalities; a formalized version can be found in [24] (Definition 2.5).

Definition 1.

A function $h(x, p)$ in the states and parameters of model (1) is said to be structurally globally identifiable if, for generic $(x, p)$ and any $(\hat{x}, \hat{p})$, the following implication holds:

g (x, p) = g (\hat{x}, \hat{p}) implies h (x, p) = h (\hat{x}, \hat{p}) .

If all the parameters are identifiable, then we will say that the model is identifiable.

Definition 2.

A function $h(x, p)$ in the states and parameters of model (1) is said to be structurally locally identifiable if, for generic $(x, p)$, there exists a neighborhood of $(x, p)$ such that for any $(\hat{x}, \hat{p})$ in this neighborhood, the following implication holds:

g (x, p) = g (\hat{x}, \hat{p}) implies h (x, p) = h (\hat{x}, \hat{p}) .

While the definitions above reflect the intuitive notion of identifiability, they are not very convenient for checking the identifiability of specific models. One standard approach to assessing structural identifiability is via input-output equations (also referred to as the differential algebra approach). We will not describe input-output equations in full generality, only for single-output models, since all the models considered in this paper belong to this class. For a single-output model, the input-output equation is the irreducible equation of minimal order satisfied by the output. We will normalize the equation by dividing by the leading coefficient (considered a polynomial in $y$ and its derivatives) to obtain a monic polynomial. Let $c_{1}(p), \dots, c_{ℓ}(p)$ be the coefficients of this equation. Then, under a certain assumption on the Wronskian of the input-output equations (see, e.g., [25] (Lemma 4.6)), the identifiable functions of the parameters are precisely the ones expressible in terms of $c_{1}, \dots, c_{ℓ}$. In this study, we used the software StructuralIdentifiability.jl (version 0.5.14, running on JULIA v1.10.2) to assess identifiability; the software automatically checks whether the aforementioned condition on Wronskians is fulfilled.

Structural identifiability results for the generalized growth model

We derive input-output equations by eliminating unobserved state variables using the differential algebra method. This process generates algebraic relationships involving the observed variables and model parameters, enabling identifiability analysis. For example, the generalized growth model is defined by the following differential equation:

\begin{matrix} GGM : & \frac{d C}{d t} = r C^{α} (t) \end{matrix}

(2)

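where $C'(t)$ denotes the incidence at time $t$, $C(t)$ denotes the cumulative number of cases at time $t$, $r$ is the growth rate, and the exponent $α$, with $0 \leq α \leq 1$, modulates the growth pattern from constant incidence ($α = 0$) through sub-exponential growth ($0 < α < 1$) to exponential growth ($α = 1$).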

Since epidemic data are typically collected as incident case counts over discrete time intervals, we assume that the incidence, $C'(t)$, corresponds to the observed data. This assumption is made for methodological consistency and computational feasibility, as it simplifies the identifiability analysis by ensuring that our model structure directly aligns with real-world epidemic datasets. It also allows us to use a standardized observation function across different growth models, facilitating direct comparisons in both structural and practical identifiability assessments.

To facilitate this process, we reformulate models with non-integer power exponents into a structure that can be analyzed using the differential algebra approach. For the GGM, we introduce an additional state variable $x(t) = r C^{α}(t)$ to eliminate the non-integer power exponent; the GGM then becomes

\begin{matrix} \frac{d C}{d t} & = x (t) \\ \frac{d x}{d t} & = α \frac{x^{2} (t)}{C (t)} . \end{matrix}

(3)

In this study, we examine the scenario in which the observations are equal to the incidences; that is, $y(t) = x(t)$. The input-output equation obtained from the StructuralIdentifiability.jl package in JULIA, with the observation $y(t)$, is given by

0 = - α y^{″} y + (2 α - 1) y^{' 2}, or, in monic form, 0 = y^{″} y - (2 - \frac{1}{α}) y^{' 2} .

In the context of structural identifiability, the input-output equation indicates that $2 - 1/α$ can be identified from the observation $y(t)$, so $α$ is identifiable as well. Since $x(t) = y(t)$, the second equation of (3) allows us to express

C (t) = α \frac{x^{2} (t)}{x^{'} (t)}

in terms of identifiable parameters and state variables. As a result, $C(t)$ becomes identifiable. Because $r = x(t) / C^{α}(t)$, this implies that the parameter $r$ is identifiable. Therefore, model (2) is structurally identifiable with the observation $y(t)$.

We state the following proposition:

Proposition 1.

The generalized growth model GGM (2) is structurally identifiable from the observation of incidences $y(t) = x(t)$.
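The argument behind Proposition 1 can be illustrated numerically. The sketch below is not the authors' code: it assumes $0 < α < 1$, so that the GGM has the closed-form solution $C(t) = (C_0^{1-α} + (1-α) r t)^{1/(1-α)}$, approximates $y'$ and $y''$ by central finite differences, and uses hypothetical function names and parameter values. It recovers $α$ from the input-output equation, then $C(t)$ and $r$ from the relations $C = α x^{2}/x'$ and $r = x / C^{α}$, using the incidence alone.

```python
def ggm_incidence(t, r, alpha, c0):
    """Closed-form GGM incidence y(t) = C'(t) = r * C(t)**alpha, valid for 0 < alpha < 1."""
    base = c0 ** (1.0 - alpha) + (1.0 - alpha) * r * t
    cumulative = base ** (1.0 / (1.0 - alpha))
    return r * cumulative ** alpha


def recover_ggm(y, t, h=1e-3):
    """Recover (alpha, r, C(t)) from the incidence function y alone."""
    y0 = y(t)
    d1 = (y(t + h) - y(t - h)) / (2.0 * h)          # central estimate of y'
    d2 = (y(t + h) - 2.0 * y0 + y(t - h)) / h ** 2  # central estimate of y''
    # Input-output equation: -alpha*y''*y + (2*alpha - 1)*y'^2 = 0, solved for alpha.
    alpha = d1 ** 2 / (2.0 * d1 ** 2 - y0 * d2)
    cumulative = alpha * y0 ** 2 / d1               # C = alpha * x^2 / x' with x = y
    r = y0 / cumulative ** alpha                    # r = x / C^alpha
    return alpha, r, cumulative


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    true_r, true_alpha, true_c0 = 0.5, 0.8, 10.0
    incidence = lambda t: ggm_incidence(t, true_r, true_alpha, true_c0)
    print(recover_ggm(incidence, t=5.0))
```

With noise-free incidence the recovered values agree with the true ones up to finite-difference error, which is the content of structural identifiability; with noisy data the second derivative would be poorly estimated, which is why practical identifiability is treated separately.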

We would like to stress that the way the lifting to the extended model (3) is performed is important to obtain correct identifiability results for the original model. Suppose that, instead of $x(t) = r C^{α}(t)$, we had introduced $\tilde{x}(t) = C^{α}(t)$. Then, we would obtain a different extended system:

\frac{d C}{d t} = r \tilde{x} (t) and \frac{d \tilde{x}}{d t} = α r \frac{{\tilde{x}}^{2} (t)}{C (t)}

(4)

In this model, neither $r$ nor $\tilde{x}$ is identifiable because there exists an output-preserving transformation $r \to λ r$, $\tilde{x} \to \tilde{x} / λ$ for any nonzero number $λ$. The discrepancy between the identifiability results for different extended models can be explained as follows. The trajectories of the original model (2) and of the extended model (3) are parametrized by three numbers: $α, r, C(0)$ for the former, and $α, C(0), x(0)$ for the latter. On the other hand, the space of trajectories of (4) is four-dimensional, parametrized by $α, r, C(0), \tilde{x}(0)$. The images of the trajectories of (2) among the trajectories of (4) are constrained to the manifold $\tilde{x}(t) - C^{α}(t) = 0$. Thus, it is the additional trajectories of (4), those lying off this manifold, that render $r$ nonidentifiable.
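The output-preserving transformation of system (4) can be checked numerically. The following sketch, written under assumptions of our own (a fixed-step fourth-order Runge-Kutta integrator, hypothetical parameter values, and the incidence $r \tilde{x}(t)$ taken as the output), integrates (4) twice: once with $(r, \tilde{x}(0))$ and once with $(λ r, \tilde{x}(0)/λ)$. The two incidence curves coincide up to rounding, so $r$ cannot be distinguished from $λ r$ in this lifting.

```python
def simulate_model4(r, alpha, c0, x0, t_end=5.0, steps=500):
    """Integrate system (4) with classical RK4 and return the incidence r * x_tilde."""
    def rhs(c, x):
        # dC/dt = r * x_tilde,  dx_tilde/dt = alpha * r * x_tilde^2 / C
        return r * x, alpha * r * x * x / c

    h = t_end / steps
    c, x = c0, x0
    incidence = [r * x]
    for _ in range(steps):
        k1c, k1x = rhs(c, x)
        k2c, k2x = rhs(c + 0.5 * h * k1c, x + 0.5 * h * k1x)
        k3c, k3x = rhs(c + 0.5 * h * k2c, x + 0.5 * h * k2x)
        k4c, k4x = rhs(c + h * k3c, x + h * k3x)
        c += h * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        incidence.append(r * x)
    return incidence


def max_relative_gap(curve_a, curve_b):
    """Largest relative difference between two output curves."""
    return max(abs(a - b) / abs(a) for a, b in zip(curve_a, curve_b))


if __name__ == "__main__":
    # Hypothetical values; lam is the scaling factor of the symmetry.
    r, alpha, c0, lam = 0.5, 0.8, 10.0, 3.0
    x0 = c0 ** alpha
    base = simulate_model4(r, alpha, c0, x0)
    scaled = simulate_model4(lam * r, alpha, c0, x0 / lam)
    print(max_relative_gap(base, scaled))
```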

Structural identifiability results for the generalized logistic growth model

The generalized logistic growth model is given by the following equation:

\begin{matrix} GLM : & \frac{d C}{d t} = r C^{α} (t) (1 - \frac{C (t)}{k}) \end{matrix}

(5)

where $r$ is the generalized growth rate, $k$ is the final epidemic size, $C'(t)$ denotes the incidence at time $t$, $C(t)$ denotes the cumulative number of cases at time $t$, and the parameter $α \in [0, 1]$ distinguishes the different growth scenarios: constant incidence corresponds to $α = 0$, sub-exponential growth to $0 < α < 1$, and exponential growth to $α = 1$. To obtain the extended version of the model without a non-integer exponent, we substitute $x(t) = r C^{α}(t)$, so that the GLM becomes

\begin{matrix} \frac{d C}{d t} & = x (t) (1 - \frac{C (t)}{k}) \\ \frac{d x}{d t} & = α \frac{x^{2} (t)}{C (t)} (1 - \frac{C (t)}{k}) . \end{matrix}

(6)

Here, we consider the case where the observations $y(t) = x(t) (1 - \frac{C(t)}{k})$ correspond to the incidences. The normalized input-output equation obtained from JULIA, with the observation $y(t)$, is given by

\begin{matrix} 0 & = y^{″ 2} y^{2} + y^{″} y^{' 2} y \frac{- 3 α + 1}{α} + 2 y^{″} y^{'} y^{3} \frac{- α^{2} + 1}{k α} + y^{″} y^{5} \frac{α^{3} + 3 α^{2} + 3 α + 1}{α k^{2}} + y^{' 4} \frac{2 α - 1}{α} + \\ 2 y^{' 3} y^{2} \frac{2 α^{2} - α - 1}{k α} + y^{' 2} y^{4} \frac{- 2 α^{3} - 5 α^{2} - 4 α - 1}{α k^{2}} \end{matrix}

(7)

We now check which parameters can be expressed in terms of the coefficients of this equation. First, $α$ can be expressed from $\frac{2 α - 1}{α}$, so it is identifiable. Next, $k$ can be expressed from $α$ and the coefficient $\frac{- α^{2} + 1}{α k}$. Hence, the coefficients of the input-output equation establish the identifiability of $k$ and $α$ only; the parameter $r$ does not appear in them.

By using the StructuralIdentifiability.jl package in JULIA, we found that the state variables $x(t)$ and $C(t)$ are identifiable (observable) from the given observation. Then, $r = \frac{x(t)}{C^{α}(t)}$ can be written as a combination of identifiable parameters and state variables. Therefore, the parameter $r$ is also identifiable. We assert the following proposition:

Proposition 2.

The generalized logistic growth model (5) is structurally identifiable from the observation of incidences, $y(t) = x(t) (1 - \frac{C(t)}{k})$.

Structural identifiability results for the Richards model

The Richards model is given by the following equation:

\begin{matrix} Richards : & \frac{d C}{d t} = r C (t) (1 - {(\frac{C (t)}{k})}^{a}) \end{matrix}

(8)

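where $r$ is the growth rate, $k$ is the final epidemic size, and the exponent $a$ measures the deviation from the symmetric s-shaped dynamics of the simple logistic curve. Letting $x(t) = {(\frac{C(t)}{k})}^{a}$, the Richards model becomes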

\begin{matrix} \frac{d C}{d t} & = r C (t) (1 - x (t)) \\ \frac{d x}{d t} & = a r x (t) (1 - x (t)) . \end{matrix}

(9)

Here, we consider the case where the observations, $y(t) = r C(t) (1 - x(t))$, correspond to the incidences. The normalized input-output equation of model (9) is given by

\begin{matrix} 0 & = - y^{″} y + y^{' 2} \frac{2 a + 1}{a + 1} + y^{'} y \frac{a r (a - 1)}{a + 1} - y^{2} \frac{a^{2} r^{2}}{a + 1} \end{matrix}

The parameter $a$ can be expressed from the coefficient $\frac{2 a + 1}{a + 1} = 2 - \frac{1}{a + 1}$. Next, parameter $r$ can be expressed from $a$ and the coefficient $\frac{a r (a - 1)}{a + 1}$. Thus, both $a$ and $r$ are identifiable from the observation $y(t)$.

Now, we check the identifiability of the state variables; both $x(t)$ and $C(t)$ are identifiable from the observation. Therefore, the remaining parameter $k = C(t) / x(t)^{1 / a}$ can be written as a combination of identifiable parameters and state variables. Thus, $k$ is identifiable. We state the following proposition:

Proposition 3.

The Richards model (8) is structurally identifiable from the observation of incidences, $y(t) = r C(t) (1 - x(t))$.
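The step from coefficients to parameters in the Richards analysis can be made concrete. The sketch below is a minimal illustration with hypothetical function names, not part of the authors' workflow: it evaluates the three coefficients of the normalized input-output equation for given $(a, r)$ and then inverts the first two to recover the parameters. It assumes $a > 0$ and $a \neq 1$, because the coefficient $\frac{a r (a - 1)}{a + 1}$ vanishes at $a = 1$; in that case $r$ would have to be taken from the $y^{2}$ coefficient together with $r > 0$.

```python
def richards_io_coefficients(a, r):
    """Coefficients of y'^2, y'y and (up to sign) y^2 in the normalized input-output equation."""
    c1 = (2.0 * a + 1.0) / (a + 1.0)
    c2 = a * r * (a - 1.0) / (a + 1.0)
    c3 = a * a * r * r / (a + 1.0)   # enters the equation with a minus sign
    return c1, c2, c3


def invert_richards_coefficients(c1, c2):
    """Recover (a, r) from the first two coefficients, assuming a > 0 and a != 1."""
    a = 1.0 / (2.0 - c1) - 1.0       # from c1 = 2 - 1/(a + 1)
    if abs(a - 1.0) < 1e-12:
        raise ValueError("c2 vanishes at a = 1; use the y^2 coefficient instead")
    r = c2 * (a + 1.0) / (a * (a - 1.0))
    return a, r


if __name__ == "__main__":
    c1, c2, _ = richards_io_coefficients(a=0.7, r=0.4)   # hypothetical values
    print(invert_richards_coefficients(c1, c2))
```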

Structural identifiability results for the generalized Richards model

The model is represented by the following equation:

\begin{matrix} GRM : & \frac{d C}{d t} = r C^{α} (t) (1 - {(\frac{C (t)}{k})}^{a}) \end{matrix}

(10)

where $r$ is the generalized growth rate, $k$ is the final epidemic size, $C'(t)$ denotes the incidence at time $t$, $C(t)$ denotes the cumulative number of cases at time $t$, and the parameter $α \in [0, 1]$ distinguishes the different growth scenarios: constant incidence corresponds to $α = 0$, sub-exponential growth to $0 < α < 1$, and exponential growth to $α = 1$. The exponent $a$ denotes the deviation from the symmetric s-shaped dynamics of the simple logistic curve.

To obtain the extended version without any non-integer power exponent, we let $x(t) = r C^{α}(t)$ and $z(t) = {(\frac{C(t)}{k})}^{a}$, so that the GRM becomes

\begin{matrix} \frac{d C}{d t} & = x (t) (1 - z (t)) \\ \frac{d x}{d t} & = α \frac{x^{2} (t)}{C (t)} (1 - z (t)) \\ \frac{d z}{d t} & = a \frac{z (t) x (t)}{C (t)} (1 - z (t)), \end{matrix}

(11)

By using StructuralIdentifiability.jl in JULIA, we determined that the parameters $a$ and $α$ are locally identifiable, and the state variables $x(t)$ and $z(t)$ are also locally identifiable. Furthermore, the square $a^{2}$ and the sum $a + 2 α$ are globally identifiable. Since $a$ is positive, it follows that $a$ is globally identifiable, and hence so is $α$; this ensures that $x(t)$ and $z(t)$ are globally identifiable as well. To obtain the identifiability of parameters $r$ and $k$, we express

r = \frac{x (t)}{C^{α} (t)} and k = \sqrt[a]{\frac{C^{a} (t)}{z (t)}}

in terms of identifiable states and parameters. Therefore, the parameters $r$ and $k$ are identifiable. We conclude the following proposition:

Proposition 4.

The generalized Richards model (10) is structurally identifiable from the observation of incidences $y(t) = x(t) (1 - z(t))$ under the assumption of the positivity of $a$ and $k$.

Structural identifiability results for the Gompertz model

The Gompertz model is given by the following equation:

\begin{matrix} GOM : & \frac{d C}{d t} = r C (t) e^{- a t} \end{matrix}

(12)

Letting $x(t) = r e^{- a t}$, the GOM becomes

\begin{matrix} \frac{d C}{d t} & = C (t) x (t) \\ \frac{d x}{d t} & = - a r e^{- a t} = - a x (t) . \end{matrix}

(13)

Here, we consider the case where the observations, $y(t)$, correspond to the incidences, $y(t) = C(t) x(t)$. The input-output equation of model (13) is given by

\begin{matrix} 0 & = - y^{″} y + y^{' 2} - y^{'} y a - y^{2} a^{2} \end{matrix}

Since $a$ appears among the coefficients, it is identifiable. The StructuralIdentifiability.jl package in JULIA shows that both $C(t)$ and $x(t)$ are identifiable from the observation $y(t)$. Thus, the parameter $r = x(t) e^{a t}$ can be written as a product of identifiable parameters and state variables, and $r$ is identifiable.

Proposition 5.

The Gompertz model (12) is structurally identifiable from the observation of incidences, $y(t) = C(t) x(t)$.

Structural identifiability results for the SEIR model with inhomogeneous mixing

The SEIR model with inhomogeneous mixing is given by the following system of equations:

\begin{matrix} SEIR : & \frac{d S}{d t} & = - \frac{β S (t) I^{α} (t)}{N} \\ & \frac{d E}{d t} & = \frac{β S (t) I^{α} (t)}{N} - κ E (t) \\ & \frac{d I}{d t} & = κ E (t) - γ I (t) \\ & \frac{d R}{d t} & = γ I (t) . \end{matrix}

(14)

Letting $x(t) = β I^{α}(t)$, the SEIR model becomes

\begin{matrix} \frac{d S}{d t} & = \frac{- S (t) x (t)}{N} \\ \frac{d E}{d t} & = \frac{S (t) x (t)}{N} - κ E (t) \\ \frac{d x}{d t} & = \frac{α x (t)}{I (t)} (κ E (t) - γ I (t)) \\ \frac{d I}{d t} & = κ E (t) - γ I (t) \\ \frac{d R}{d t} & = γ I (t) . \end{matrix}

(15)
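A quick consistency check of this lifting can be sketched in code. The example below is an illustration under our own assumptions (a fixed-step fourth-order Runge-Kutta integrator, hypothetical parameter values and function names): it integrates the extended system (15) starting from $x(0) = β I^{α}(0)$ and verifies at the final time that the added state still satisfies $x(t) = β I^{α}(t)$ and that the total population $S + E + I + R$ is conserved. Note that $β$ enters the extended system only through the initial condition of $x$.

```python
def extended_seir_rhs(state, alpha, kappa, gamma, n):
    """Right-hand side of system (15); state = (S, E, x, I, R)."""
    s, e, x, i, rec = state
    new_exposed = s * x / n                 # x = beta * I**alpha carries transmission
    di = kappa * e - gamma * i
    return (-new_exposed, new_exposed - kappa * e, alpha * x / i * di, di, gamma * i)


def rk4_step(f, state, h):
    """One classical Runge-Kutta step for a tuple-valued state."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_extended_seir(beta, alpha, kappa, gamma, n, s0, e0, i0, r0,
                           t_end=20.0, steps=2000):
    """Integrate (15) from x(0) = beta * I(0)**alpha and return the final state."""
    state = (s0, e0, beta * i0 ** alpha, i0, r0)
    f = lambda st: extended_seir_rhs(st, alpha, kappa, gamma, n)
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(f, state, h)
    return state


if __name__ == "__main__":
    # Hypothetical parameter values and initial conditions.
    beta, alpha = 0.8, 0.9
    s, e, x, i, rec = simulate_extended_seir(beta, alpha, 0.5, 0.3, 1000.0,
                                             998.0, 1.0, 1.0, 0.0)
    print(abs(x - beta * i ** alpha) / x, s + e + i + rec)
```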

Here, we consider the case where the observations are $y(t) = κ E(t)$. By using StructuralIdentifiability.jl in JULIA, we determined that the parameters $α$, $γ$, and $κ$ are globally identifiable, whereas $N$ is not identifiable. However, since $N$ represents the total population size of a closed population, its value is known, with $N = S_{0} + E_{0} + I_{0} + R_{0}$. With $N$ being a known quantity, we treat it as an additional observation in the model. We then perform the identifiability analysis again and find that the state variable $x(t)$ is identifiable from the observations $y(t)$ and $N$. Thus, we can express $β = \frac{x(t)}{I^{α}(t)}$ as a combination of identifiable parameters and state variables, and model (14) is structurally identifiable. The summary of the structural identifiability results for models (3), (6), (9), (11), (13), and (15), obtained from StructuralIdentifiability.jl in JULIA, is given in Table 1.

Cases with time-varying $N(t)$ introduce additional degrees of freedom into the model, potentially leading to the unidentifiability of key parameters unless additional constraints or external data sources (e.g., demographic data) are available. Future work could explore methods such as including auxiliary equations for $N(t)$ or assuming known functional forms to restore identifiability.

Another critical challenge in real-world applications is the uncertainty in initial conditions. Since initial values for $S(0)$, $E(0)$, $I(0)$, and $R(0)$ are often unknown or estimated from limited data, they can significantly impact parameter identifiability and estimation accuracy. Given these considerations, we consider analyses with and without knowing the initial conditions of the systems.

Proposition 6.

The model (14) is structurally identifiable from the observation of incidences, $y(t) = κ E(t)$, with known initial conditions.

3. Fitting Models to Real Epidemic Data

3.1. Data

We performed parameter estimation using the GrowthPredict MATLAB Toolbox, a robust platform developed explicitly for fitting and forecasting time series trajectories based on phenomenological growth models using ordinary differential equations [13]. The models outlined in Equations (5) to (14) were fitted to three sets of data: the weekly incidence curve of monkeypox data, COVID-19 data, and Ebola data, as described in the Data section. The computation of prediction intervals assumes that observed values are generated from the deterministic model with added normal error (mean zero and estimated variance). This assumption allows us to quantify uncertainty while maintaining consistency with the Monte Carlo framework used for practical identifiability analysis.

3.2. Parameter Estimation Method

We used the least squares (LSQ) method to solve the parameter estimation problem by minimizing the following objective function:

\begin{matrix} {\hat{p}}_{i} = \underset{p_{i}}{argmin} {(\sum_{j = 1}^{n} {(f (t_{j}, p_{i}) - y_{t_{j}})}^{2})}^{1 / 2} . \end{matrix}

(16)

Here, $f(t_{j}, p_{i})$ is the predicted epidemic trajectory given by model $i$, where $i \in \{2, 3, 4, 5, 6\}$, and $y_{t_{j}}$ is the time series data, where $t_{j}$, $j = 1, 2, 3, \dots, n$, are the discrete time points of the time series. This method was chosen for its computational efficiency and ability to handle nonlinear parameter spaces effectively.

To evaluate model performance, we used the GrowthPredict MATLAB Toolbox, which provides several goodness-of-fit metrics, including the Akaike Information Criterion (AIC), mean absolute error (MAE), mean squared error (MSE), and weighted interval score (WIS) [13]. The corresponding formulations are as follows.

The Akaike information criterion (AIC) is computed as

AIC = n \log (SSE) + 2 m + \frac{2 m (m + 1)}{n - m - 1},

where $m$ is the number of model parameters, and $n$ is the number of data points (calibration period). The sum of squared errors $SSE$ is defined as

SSE = \sum_{j = 1}^{n} {(f (t_{j}, p_{i}) - y_{t_{j}})}^{2} .

The mean absolute error (MAE) and mean squared error (MSE) are given by

MAE = \frac{1}{n} \sum_{j = 1}^{n} | f (t_{j}, p_{i}) - y_{t_{j}} |,

MSE = \frac{1}{n} \sum_{j = 1}^{n} {(f (t_{j}, p_{i}) - y_{t_{j}})}^{2} .
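For readers who want to reproduce these point-accuracy metrics outside the toolbox, a minimal sketch follows. It is not the GrowthPredict implementation: the function names are hypothetical, the logarithm is assumed to be the natural logarithm, and the AIC follows the formula given above (which requires $SSE > 0$ and $n > m + 1$).

```python
import math


def sse(predicted, observed):
    """Sum of squared errors between model predictions and data."""
    return sum((f - y) ** 2 for f, y in zip(predicted, observed))


def mae(predicted, observed):
    """Mean absolute error."""
    return sum(abs(f - y) for f, y in zip(predicted, observed)) / len(observed)


def mse(predicted, observed):
    """Mean squared error."""
    return sse(predicted, observed) / len(observed)


def aic(predicted, observed, n_params):
    """AIC as defined in the text: n*log(SSE) + 2m + 2m(m+1)/(n-m-1)."""
    n, m = len(observed), n_params
    if n - m - 1 <= 0:
        raise ValueError("need more data points than parameters plus one")
    # Assumption: natural logarithm; SSE must be strictly positive.
    return n * math.log(sse(predicted, observed)) + 2 * m + 2 * m * (m + 1) / (n - m - 1)


if __name__ == "__main__":
    fit = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical model predictions
    data = [1.0, 2.0, 4.0, 4.0, 7.0]     # hypothetical observations
    print(sse(fit, data), mae(fit, data), mse(fit, data), aic(fit, data, n_params=2))
```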

The weighted interval score (WIS) is computed as

WIS_{α_{0 : K}} (F, y) = \frac{1}{K + \frac{1}{2}} (w_{0} | y - \tilde{y} | + \sum_{k = 1}^{K} w_{k} IS_{α_{k}} (F, y)),

where $K$ is the number of prediction intervals (PIs), $\tilde{y}$ is the predictive median, and $w_{k} = \frac{α_{k}}{2}$ for $k = 1, 2, \dots, K$, with $w_{0} = 1 / 2$. The interval scoring function is defined as

IS_{α} (F, y) = (u - l) + \frac{2}{α} (l - y) 1 (y < l) + \frac{2}{α} (y - u) 1 (y > u),

where $1 (y < l)$ is the indicator function that equals 1 if $y < l$, and 0 otherwise. The terms $l$ and $u$ represent the $\frac{α}{2}$ and $1 - \frac{α}{2}$ quantiles of the forecast $F$, respectively.
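The two scoring rules can be written compactly in code. The sketch below is an illustration with hypothetical function names, not the toolbox routine: the interval score is the interval width $u - l$ plus penalties scaled by $2/α$ when the observation falls outside the interval, and the WIS combines the absolute error of the median with the weighted interval scores using $w_{0} = 1/2$ and $w_{k} = α_{k}/2$. Each prediction interval is assumed to be passed as a tuple $(α_{k}, l_{k}, u_{k})$.

```python
def interval_score(alpha, lower, upper, y):
    """Interval score of a central (1 - alpha) prediction interval [lower, upper]."""
    score = upper - lower                       # width (sharpness) term
    if y < lower:
        score += 2.0 / alpha * (lower - y)      # penalty for falling below the interval
    if y > upper:
        score += 2.0 / alpha * (y - upper)      # penalty for falling above the interval
    return score


def weighted_interval_score(y, median, intervals):
    """WIS for one observation; intervals is a list of (alpha_k, lower_k, upper_k)."""
    total = 0.5 * abs(y - median)               # w_0 = 1/2 times the median error
    for alpha, lower, upper in intervals:
        total += 0.5 * alpha * interval_score(alpha, lower, upper, y)  # w_k = alpha_k/2
    return total / (len(intervals) + 0.5)


if __name__ == "__main__":
    # Hypothetical forecast: median 2, one 50% interval [1, 3], observation 0.
    print(interval_score(0.5, 1.0, 3.0, 0.0))
    print(weighted_interval_score(0.0, 2.0, [(0.5, 1.0, 3.0)]))
```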

The coverage of the 95 % prediction interval is given by

95 % PI coverage = \frac{1}{n} \sum_{i = 1}^{n} 1 \{ y_{t_{i}} > L_{i} and y_{t_{i}} < U_{i} \} \times 100 %,

where $L_{i}$ and $U_{i}$ are the lower and upper bounds of the 95 % PIs, respectively, $y_{t_{i}}$ represents the data, $n$ is the number of data points, and $1$ is an indicator variable that equals 1 if $y_{t_{i}}$ is in the specified interval and is 0 otherwise.
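The coverage computation amounts to counting the observations that fall strictly inside their prediction intervals. A minimal sketch, with a hypothetical function name and made-up numbers, is:

```python
def pi_coverage(observed, lower, upper):
    """Percentage of observations lying strictly inside their prediction intervals."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo < y < hi)
    return 100.0 * inside / len(observed)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]          # hypothetical observations
    lower_bounds = [0.0, 0.0, 4.0, 3.0]  # hypothetical lower PI bounds
    upper_bounds = [2.0, 3.0, 5.0, 5.0]  # hypothetical upper PI bounds
    print(pi_coverage(data, lower_bounds, upper_bounds))
```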

The estimated parameter values from fitting three sets of time series data (monkeypox, COVID-19, and Ebola) to models (5)–(14) are given in Table 2, Table 3, and Table 4, respectively.

Figures 1–6 illustrate the fit of each model using the dataset that yielded the best fit based on AIC values.

4. Practical Identifiability

Structural identifiability examines the feasibility of estimating the parameters of a model from unlimited observations without measurement errors. However, real-world data are often collected discretely and may include significant measurement errors. Thus, the structural identifiability of a model does not imply that its parameters are practically identifiable, and it becomes crucial to assess whether structurally identifiable parameters can be accurately estimated from data containing noise. Various methods are available to examine practical identifiability [1,10,12,23,26,27,28]. In this study, we employed the Monte Carlo simulation (MCS) method, described below [1,10,12,23].

1. We solved the model Equations (5), (8), (10), (12), and (14) numerically using the true parameter values $\hat{p}$ obtained from fitting the models to the experimental data in the data fitting section. Then, we obtained the observations (output) at the experimental time points.
2. We generated $M = 1000$ datasets at each of the experimental time points, with four different levels of increasing measurement error, $σ = 1 %, 5 %, 10 %, 20 %$, by adding measurement errors at each experimental data point. Each measurement error is assumed to follow a normal distribution with zero mean and standard deviation $σ$.
3. We fit the model to each of the 1000 datasets to estimate the parameter set ${\hat{p}}_{i}$ using the objective function in Equation (16).
4. We calculated the average relative estimation error ($ARE$) for each parameter in the model according to

$ARE ({\hat{p}}^{(k)}) = 100 % \times \frac{1}{M} \sum_{i = 1}^{M} \frac{| {\hat{p}}^{(k)} - {\hat{p}}_{i}^{(k)} |}{| {\hat{p}}^{(k)} |}$

where ${\hat{p}}^{(k)}$ is the kth element of the true parameter set $\hat{p}$, and ${\hat{p}}_{i}^{(k)}$ is the kth element of ${\hat{p}}_{i}$.
5. We used the $ARE$ values to determine whether each of the model’s parameters is practically identifiable, using Definition 3 given below.
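The Monte Carlo procedure above can be sketched in a few lines. The example is deliberately simplified and is not the authors' MATLAB workflow: a toy one-parameter model $f(t, p) = p t^{2}$ stands in for the growth models so that the least-squares fit has a closed form, the noise standard deviation is assumed to be $σ$ times the noise-free model value (the text does not spell out the scaling), and all names and values are hypothetical.

```python
import random


def model(t, p):
    """Toy one-parameter model f(t, p) = p * t^2 (hypothetical stand-in for a growth model)."""
    return p * t * t


def fit_least_squares(times, data):
    """Closed-form least-squares estimate of p for a model that is linear in p."""
    numerator = sum(t * t * y for t, y in zip(times, data))
    denominator = sum((t * t) ** 2 for t in times)
    return numerator / denominator


def average_relative_error(p_true, times, sigma, n_datasets=1000, seed=0):
    """Monte Carlo ARE (in percent) of the estimate of p at relative noise level sigma."""
    rng = random.Random(seed)
    clean = [model(t, p_true) for t in times]           # step 1: noise-free output
    total = 0.0
    for _ in range(n_datasets):                         # step 2: M noisy datasets
        # Assumption: normal noise with standard deviation sigma * |clean value|.
        noisy = [y + rng.gauss(0.0, sigma * abs(y)) for y in clean]
        p_hat = fit_least_squares(times, noisy)         # step 3: refit the model
        total += abs(p_true - p_hat) / abs(p_true)      # step 4: relative error
    return 100.0 * total / n_datasets


if __name__ == "__main__":
    time_points = list(range(1, 11))                    # hypothetical weekly time points
    for level in (0.01, 0.05, 0.10, 0.20):
        print(level, average_relative_error(2.0, time_points, level))
```

For nonlinear growth models the closed-form fit would be replaced by a numerical optimizer, but the loop structure stays the same.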

Definition 3.

Let $ARE$ be the average relative estimation error of the parameter $p^{(k)}$, and let $σ$ be the measurement error.

If $0 < ARE ({\hat{p}}^{(k)}) \leq σ$, then parameter $p^{(k)}$ is strongly practically identifiable.
If $σ < ARE ({\hat{p}}^{(k)}) \leq 10 σ$, then parameter $p^{(k)}$ is weakly practically identifiable.
If $10 σ < ARE ({\hat{p}}^{(k)})$, then parameter $p^{(k)}$ is not practically identifiable.

A model is said to be practically identifiable if the parameters $p^{(k)}$ are practically identifiable for all $k$.
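Definition 3 translates directly into a small classification rule. In the sketch below, the function names are hypothetical, both the $ARE$ and $σ$ are expressed in percent, and an $ARE$ of exactly zero (a case the definition leaves open) is treated as strongly identifiable by assumption.

```python
def classify_parameter(are_percent, sigma_percent):
    """Classify one parameter according to Definition 3 (both arguments in percent)."""
    if are_percent < 0 or sigma_percent <= 0:
        raise ValueError("ARE must be non-negative and sigma must be positive")
    if are_percent <= sigma_percent:
        return "strongly practically identifiable"   # includes ARE = 0 by assumption
    if are_percent <= 10 * sigma_percent:
        return "weakly practically identifiable"
    return "not practically identifiable"


def model_is_practically_identifiable(are_values, sigma_percent):
    """A model is practically identifiable if every parameter is (strongly or weakly)."""
    return all(
        classify_parameter(are, sigma_percent) != "not practically identifiable"
        for are in are_values
    )


if __name__ == "__main__":
    print(classify_parameter(0.4, 1.0))                        # hypothetical ARE values
    print(model_is_practically_identifiable([0.4, 7.5], 1.0))
```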

In the data fitting section, we fitted the model equations (Equations (5) through (14)) to three different datasets: monkeypox data, COVID-19 data, and Ebola data. Here, we focus on assessing the practical identifiability of each model using the dataset that produced the best fit based on the lowest $AIC$ values. Thus, our investigation focuses on determining whether the Richards model, GLM, and GRM demonstrate practical identifiability when applied to the monkeypox, Ebola, and COVID-19 datasets, respectively. Additionally, we evaluated the GRM, GOM, and SEIR models on the datasets where they demonstrated the best relative performance to further assess their practical identifiability.

We found that all models, Richards, GRM, GOM, GLM, and SEIR, are practically identifiable from the given datasets. Each model’s parameters were reliably estimated, with well-constrained confidence intervals, indicating that the data provided sufficient information to determine the parameters uniquely. Moreover, these results are consistent with the structural identifiability findings, further validating the models’ capacity to capture the dynamics of the epidemics studied. The results of our analysis are presented in Table 5 for the Richards model, Table 6 and Table 7 for GRM, Table 8 for GLM, Table 9 for GOM, and Table 10 for SEIR.

5. Discussion

This study evaluates the structural and practical identifiability of six commonly used phenomenological growth models in epidemiology. It demonstrates their robustness in parameter estimation and applicability across diverse datasets. To assess structural identifiability, we reformulated the models to address challenges posed by non-integer power exponents; we created an extended structure with fewer parameters and additional equations while preserving the original degrees of freedom. Our findings confirm that all reformulated models are structurally identifiable, even with unknown initial conditions. The original model formats were applied for data fitting and practical identifiability analyses, revealing that all models remained practically identifiable under varying noise levels, with performance differing across datasets.

The first step in validating these models involved assessing whether all unknown parameters were structurally identifiable, ensuring they could theoretically be uniquely determined from perfect, unlimited data. This step establishes a necessary foundation for reliable parameter estimation. We used the differential algebra method for this evaluation [12,22,23]. Given the lack of dedicated software tools for analyzing systems of ordinary differential equations with non-integer exponents, we reformulated the models by introducing additional state variables. This reformulation enabled the use of the StructuralIdentifiability.jl package in Julia [7]. Our findings demonstrated that all parameters in the reformulated models were structurally identifiable, even with unknown initial conditions. By assessing the observability of state variables, we inferred the structural identifiability of the original models, bridging the gap between the reformulated and original structures [8,9].
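To illustrate the kind of reformulation described above, consider the GGM written as $dC/dt = r C^{α}$, with growth rate $r$ and exponent $α$. The following is a minimal sketch, assuming the auxiliary state is chosen as $x = r C^{α}$; this choice is made here for illustration because it yields the states $C(t), x(t)$ and the single parameter $α$ listed for the GGM in Table 1:

$\frac{dC}{dt} = x, \qquad \frac{dx}{dt} = α r C^{α - 1} \frac{dC}{dt} = \frac{α x^{2}}{C}, \qquad x(0) = r C(0)^{α}$

The right-hand sides are now rational functions of the states, so no non-integer exponent appears explicitly, and $r$ enters only through the initial condition of the added state.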

Model validation was conducted using the GrowthPredict MATLAB Toolbox, a specialized tool for fitting and forecasting time series trajectories based on phenomenological growth models. We applied the toolbox to three epidemiological datasets: weekly incidence data for monkeypox, COVID-19, and Ebola [13]. To compare performance across models and datasets, we calculated AIC, MAE, MSE, WIS, and coverage for each model. Model selection based on AIC values revealed that certain models are better suited to specific epidemic contexts: the Richards model provided the best fit for the monkeypox data, the GRM for the COVID-19 data, and the GLM for the Ebola data.
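Three of the metrics named above have simple definitions that can be sketched directly. The helper below is an illustration under stated assumptions, not the GrowthPredict code: the function name `fit_metrics` is hypothetical, coverage is taken to be the percentage of observations falling inside the supplied prediction interval, and AIC and WIS are omitted because their exact form depends on the error structure and quantile set used.

```python
import numpy as np


def fit_metrics(observed, predicted, lower, upper):
    """Return MAE, MSE, and prediction-interval coverage (in percent).

    observed, predicted: data and model fit at the same time points.
    lower, upper: bounds of the prediction interval at each time point.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    errors = observed - predicted
    mae = np.mean(np.abs(errors))   # mean absolute error
    mse = np.mean(errors ** 2)      # mean squared error

    # Share of observations that fall inside the interval, in percent
    inside = (observed >= lower) & (observed <= upper)
    coverage = 100.0 * np.mean(inside)

    return {"MAE": mae, "MSE": mse, "coverage": coverage}
```

Under this definition, a coverage of 93.75 on a series of 32 weekly observations would correspond to 30 of the 32 points lying inside the interval.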

In the next phase, we assessed practical identifiability by applying the models to datasets that produced the best fits, as indicated by the AIC values. We focused on the Richards, SEIR, and Gompertz models with the monkeypox dataset, the GLM with the Ebola dataset, and the GRM with the COVID-19 dataset. Practical identifiability was evaluated using Monte Carlo simulations, highlighting the robustness of parameter estimation under real-world conditions, where data are often noisy and limited. Other approaches, such as the Fisher Information Matrix (FIM), Profile Likelihood Method, and Bayesian methods, could complement this analysis [1,10,12,23]. Our results showed that all the models were practically identifiable for their respective datasets. However, parameter estimation accuracy varied across models and datasets, emphasizing the sensitivity of these models to data quality. These findings underscore the importance of including practical identifiability analysis in epidemiological studies to ensure robust model validation.

By addressing the challenge of non-integer power exponents and validating the models under practical conditions, this study broadens the applicability of phenomenological models. It strengthens confidence in their use for epidemic forecasting. However, several limitations must be acknowledged. First, while introducing additional state variables facilitates structural identifiability analysis, further research is needed to assess its impact on computational efficiency and model interpretability. Second, the dependency of parameter estimation accuracy on data quality remains a challenge, emphasizing the need for high-quality, well-calibrated datasets. Third, alternative approaches such as Bayesian methods and Fisher Information Matrix analysis should be explored to complement the Monte Carlo simulations performed here. Finally, future research should examine the role of time-varying parameters and unknown initial conditions in shaping identifiability outcomes.

Our analysis highlights the structural and practical identifiability of six phenomenological models across various epidemiological datasets. The results underscore the utility of these models in forecasting disease dynamics and their adaptability to diverse epidemic contexts. However, the accuracy of parameter estimates is highly dependent on data quality, emphasizing the critical need for careful data collection and the integration of practical identifiability evaluations in the model selection process.

Author Contributions

Conceptualization, Y.R.L., G.C., G.P. and N.T.; methodology, Y.R.L., G.C., G.P. and N.T.; software, Y.R.L., G.C., G.P. and N.T.; validation, Y.R.L., G.C., G.P. and N.T.; formal analysis, Y.R.L.; investigation, Y.R.L.; writing—original draft preparation, Y.R.L., G.C., G.P. and N.T.; writing—review and editing, Y.R.L., G.C., G.P. and N.T.; visualization, Y.R.L.; supervision, Y.R.L., G.C., G.P. and N.T.; methodology, Y.R.L., G.C., G.P. and N.T. All authors have read and agreed to the published version of the manuscript.

Funding

G.C. is partially supported by NSF grants 2125246 and 2026797. G.P. is supported by the French ANR-22-CE48-0008 OCCAM project. N.T. and Y.R.L. are supported by NIH NIGMS grant no. 1R01GM152743-01.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data for the research is available at https://github.com/YuganthiLiyanage/Phenomenological-Growth-Models (accessed on 26 March 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Miao, H.; Dykes, C.; Demeter, L.M.; Cavenaugh, J.; Park, S.Y.; Perelson, A.S.; Wu, H. Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments. Bull. Math. Biol. 2008, 70, 1749–1771.
2. Tuncer, N.; Gulbudak, H.; Cannataro, V.L.; Martcheva, M. Structural and practical identifiability issues of immuno-epidemiological vector–host models with application to rift valley fever. Bull. Math. Biol. 2016, 78, 1796–1827.
3. Eisenberg, M.C.; Robertson, S.L.; Tien, J.H. Identifiability and estimation of multiple transmission pathways in Cholera and waterborne disease. J. Theor. Biol. 2013, 324, 84–102.
4. Simpson, M.J.; Browning, A.P.; Warne, D.J.; Maclaren, O.J.; Baker, R.E. Parameter identifiability and model selection for sigmoid population growth models. J. Theor. Biol. 2022, 535, 110998.
5. Chowell, G.; Skums, P. Investigating and forecasting infectious disease dynamics using epidemiological and molecular surveillance data. Phys. Life Rev. 2024, 51, 294–327.
6. Viboud, C.; Simonsen, L.; Chowell, G. A generalized-growth model to characterize the early ascending phase of infectious disease outbreaks. Epidemics 2016, 15, 27–37.
7. Dong, R.; Goodbrake, C.; Harrington, H.; Pogudin, G. Differential Elimination for Dynamical Models via Projections with Applications to Structural Identifiability. SIAM J. Appl. Algebra Geom. 2023, 7, 194–235.
8. Margaria, G.; Riccomagno, E.; White, L.J. Structural identifiability analysis of some highly structured families of statespace models using differential algebra. J. Math. Biol. 2004, 49, 433–454.
9. Margaria, G.; Riccomagno, E.; Chappell, M.J.; Wynn, H.P. Differential algebra methods for the study of the structural identifiability of rational function state-space models in the biosciences. Math. Biosci. 2001, 174, 1–26.
10. Miao, H.; Xia, X.; Perelson, A.; Wu, H. On identifiability of nonlinear ODE models and applications in viral dynamics. SIAM Rev. 2011, 53, 3–39.
11. Liyanage, Y.R.; Kohan, L.M.; Martcheva, M.; Tuncer, N. Identifiability and Parameter Estimation of Within-Host Model of HIV with Immune Response. Mathematics 2024, 12, 2837.
12. Tuncer, N.; Le, T.T. Structural and practical identifiability analysis of outbreak models. Math. Biosci. 2018, 299, 1–18.
13. Chowell, G.; Bleichrodt, A.; Dahal, S.; Tariq, A.; Roosa, K.; Hyman, J.M.; Luo, R. GrowthPredict: A toolbox and tutorial-based primer for fitting and forecasting growth trajectories using phenomenological growth models. Sci. Rep. 2024, 14, 1630.
14. Bellman, R.; Åström, K. On structural identifiability. Math. Biosci. 1970, 7, 329–339.
15. Roosa, K.; Chowell, G. Assessing parameter identifiability in compartmental dynamic models using a computational approach: Application to infectious disease transmission models. Theor. Biol. Med. Model. 2019, 16, 1.
16. Sauer, T.; Berry, T.; Ebeigbe, D.; Norton, M.M.; Whalen, A.J.; Schiff, S.J. Identifiability of Infection Model Parameters Early in an Epidemic. SIAM J. Control Optim. 2022, 60, S27–S48.
17. Massonis, G.; Banga, J.R.; Villaverde, A.F. Structural identifiability and observability of compartmental models of the COVID-19 pandemic. Annu. Rev. Control 2021, 51, 441–459.
18. Tuncer, N.; Timsina, A.; Nuno, M.; Chowell, G.; Martcheva, M. Parameter identifiability and optimal control of an SARS-CoV-2 model early in the pandemic. J. Biol. Dyn. 2022, 16, 412–438.
19. Gallo, L.; Frasca, M.; Latora, V.; Russo, G. Lack of practical identifiability may hamper reliable predictions in COVID-19 epidemic models. Sci. Adv. 2022, 8, eabg5234.
20. Renardy, M.; Kirschner, D.; Eisenberg, M. Structural identifiability analysis of age-structured PDE epidemic models. J. Math. Biol. 2022, 84, 9.
21. Evans, N.D.; White, L.J.; Chapman, M.J.; Godfrey, K.R.; Chappell, M.J. The structural identifiability of the susceptible infected recovered model with seasonal forcing. Math. Biosci. 2005, 194, 175–197.
22. Chowell, G.; Dahal, S.; Liyanage, Y.R.; Tariq, A.; Tuncer, N. Structural identifiability analysis of epidemic models based on differential equations: A tutorial-based primer. J. Math. Biol. 2023, 87, 79.
23. Tuncer, N.; Martcheva, M.; LaBarre, B.; Payoute, S. Structural and Practical Identifiability Analysis of Zika Epidemiological Models. Bull. Math. Biol. 2018, 80, 2209–2241.
24. Hong, H.; Ovchinnikov, A.; Pogudin, G.; Yap, C. Global Identifiability of Differential Models. Commun. Pure Appl. Math. 2020, 73, 1831–1879.
25. Ovchinnikov, A.; Pogudin, G.; Thompson, P. Parameter identifiability and input–output equations. Appl. Algebra Eng. Commun. Comput. 2023, 34, 165–182.
26. Wu, H.; Zhu, H.; Miao, H.; Perelson, A.S. Parameter identifiability and estimation of HIV/AIDS dynamic models. Bull. Math. Biol. 2008, 70, 785–799.
27. Wieland, F.G.; Hauber, A.L.; Rosenblatt, M.; Tönsing, C.; Timmer, J. On structural and practical identifiability. Curr. Opin. Syst. Biol. 2021, 25, 60–69.
28. Raue, A.; Karlsson, J.; Saccomani, M.P.; Jirstrand, M.; Timmer, J. Comparison of approaches for parameter identifiability analysis of biological systems. Bioinformatics 2014, 30, 1440–1448.

Figure 1. Fit of the Richards model to weekly incidence data for the monkeypox epidemic. The model demonstrates close alignment with the observed data, capturing the nonlinear growth dynamics characteristic of the outbreak. Key features such as the early exponential growth phase and subsequent plateau are well-represented, reflecting the model’s robustness in describing sub-logistic epidemic patterns. In the bottom panel, the solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations, and the shaded region indicates the 95% prediction intervals (PIs), highlighting the model’s ability to quantify uncertainty in its projections.

Figure 2. Model fit of the Gompertz model to the weekly incidence data for monkeypox. The figure illustrates the observed weekly incidence data (dots) and the corresponding Gompertz model fit (solid line). The Gompertz model effectively captures the overall trend and deceleration in the growth of cases over time. In the bottom panel, the solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations, and the shaded region indicates the 95% prediction intervals (PIs), highlighting the model’s ability to quantify uncertainty in its projections.

Figure 3. Model fit of the generalized logistic model (GLM) to the weekly incidence data for Ebola. The figure illustrates the alignment of the GLM with the observed data, showing its ability to capture the epidemic dynamics effectively. Key features of the GLM, such as its flexibility to model sub-exponential growth, are evident in the close agreement between the predicted trajectory and the observed data points. The model parameters were estimated using the least squares method, and the fit was evaluated based on the Akaike Information Criterion (AIC), mean absolute error (MAE), and mean squared error (MSE). In the bottom panel, the solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations, and the shaded region indicates the 95% prediction intervals (PIs), highlighting the model’s ability to quantify uncertainty in its projections.

Figure 4. Fit of the generalized Richards model (GRM) to the weekly incidence curve of monkeypox data. The figure illustrates the model’s ability to capture the observed dynamics of the epidemic, highlighting the flexibility of the GRM in accommodating varying growth patterns. The best-fit parameter estimates were derived using the least squares method, and the model’s performance is validated through metrics such as AIC, MAE, and MSE. In the bottom panel, the solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations, and the shaded region indicates the 95% prediction intervals (PIs), highlighting the model’s ability to quantify uncertainty in its projections.

Figure 5. The fit of the generalized Richards model (GRM) to the COVID-19 dataset. This figure illustrates the model’s ability to capture the dynamics of the epidemic, highlighting its flexibility in accounting for deviations from symmetric S-shaped growth curves. The GRM demonstrated the best fit for this dataset, as indicated by its low AIC score and high coverage of the 95% prediction intervals. Observed data points are marked, and the model’s predicted trajectory and uncertainty bounds are overlaid for comparison. The solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations, and the shaded region indicates the 95% prediction intervals (PIs).

Figure 6. Fit of the SEIR model with inhomogeneous mixing to the monkeypox dataset. The figure illustrates the model’s ability to capture the weekly incidence trends, with parameter estimates derived from fitting the SEIR framework to observed data. Shaded regions indicate the 95% prediction intervals, reflecting uncertainty in the model’s forecasts. The model strongly aligns with the data, emphasizing its suitability for analyzing diseases with similar transmission dynamics. Key features of the fit, such as the peak incidence and decline phase, are accurately represented. The solid red line is the median model fit and the dashed lines correspond to the 95% prediction intervals. The blue dots indicate the observed data points. The gray lines correspond to the mean of the model fits obtained from the parametric bootstrapping with 1000 bootstrap realizations.

Table 1. Summary of structural identifiability results. This table summarizes the structural identifiability results for the model parameters and the observability of state variables across six models: the generalized growth model (GGM), the generalized logistic model (GLM), the Richards model, the generalized Richards model (GRM), the Gompertz model (GOM), and the SEIR model with inhomogeneous mixing. The results are presented for cases with and without known initial conditions. The parameters marked as “globally identifiable” can be uniquely determined from perfect data, whereas “locally identifiable” parameters can be determined only up to a finite number of alternative values. The observability of state variables, which indicates whether they can be inferred from the data, is also included. These findings were obtained using the StructuralIdentifiability.jl package in Julia.

Model	Case	Globally Identifiable	Locally Identifiable	Not Identifiable
GGM (3)	Without IC	$α, C (t), x (t)$	-	-
	With IC	$α, C (t), x (t)$	-	-
GLM (6)	Without IC	$k, α, C (t), x (t)$	-	-
	With IC	$k, α, C (t), x (t)$	-	-
Richards (9)	Without IC	$a, r, C (t), x (t)$	-	-
	With IC	$a, r, C (t), x (t)$	-	-
GRM (11)	Without IC	$C (t)$	$a, α, x (t), z (t)$	-
	With IC	$a, α, C (t), x (t), z (t)$	-	-
GOM (13)	Without IC	$a, C (t), x (t)$	-	-
	With IC	$a, C (t), x (t)$	-	-
SEIR (15)	Without IC	$κ, γ, α, S (t), E (t), I (t)$	-	$x (t), R (t), N$
	With IC	$N, κ, γ, α, S (t), E (t), x (t), I (t), R (t)$	-	-

Table 2. Parameter estimates and model performance metrics obtained by fitting models (5)–(14) using the GrowthPredict toolbox to the weekly incidence curve of monkeypox data.

Parameter	r	$α$	k	a	$β$	$κ$	$γ$	AIC	MAE	MSE	WIS	Coverage
GLM	$1.9$	$0.84$	$2.9 \times 10^{4}$	-	-	-	-	$432.76$	$110.197$	17,245.91	$63.16$	100
RIC	$0.92$	-	$2.9 \times 10^{4}$	$0.36$	-	-	-	$405.67$	$69.9$	$7691.72$	$43.16$	100
GRM	$2.3$	$0.82$	$3.1 \times 10^{4}$	$0.9$	-	-	-	$442.59$	$100.63$	22,044.17	$69.72$	93.75
GOM	$1.5$	-	-	$0.2$	-	-	-	$488.88$	$261.37$	108,193.6	$160.41$	93.75
SEIR	-	$0.96$	-	-	$7.3$	$4.7$	$4.8$	$407.19$	$60.8825$	$7296.921$	$40.749$	$96.875$

Table 3. Parameter estimates and model performance metrics obtained by fitting models (5)–(14) using the GrowthPredict toolbox to the incidence curve of COVID-19 data.

Parameter	r	$α$	k	a	$β$	$κ$	$γ$	AIC	MAE	MSE	WIS	Coverage
GLM	$3.3$	$0.7$	$2 \times 10^{5}$	-	-	-	-	$1755.59$	$512.55$	387,968.7	$303.66$	97
RIC	5	-	$1.9 \times 10^{5}$	$0.089$	-	-	-	$1761.20$	$485.79$	410,155.9	$302.44$	93
GRM	$4.8$	$0.67$	$2.1 \times 10^{5}$	$1.2$	-	-	-	$1646.33$	$764.37$	885,696.74	$455.35$	$96.67$
GOM	$0.93$	-	-	$0.077$	-	-	-	$1753.01$	$474.94$	385,462.9	$294.35$	93
SEIR	-	$0.8$	-	-	$3.6$	$0.22$	$0.01$	$1910.94$	$1098.83$	1,903,801	$615.758$	98

Table 4. Parameter estimates and model performance metrics obtained by fitting models (5)–(14) using the GrowthPredict toolbox to the incidence curve of Ebola data.

Parameter	r	$α$	k	a	$β$	$κ$	$γ$	AIC	MAE	MSE	WIS	Coverage
GLM	$0.78$	$0.85$	$1.1 \times 10^{4}$	-	-	-	-	$783.6$	$3.472$	$1932.064$	$20.46$	96.97
RIC	$0.46$	-	$1.1 \times 10^{4}$	$0.38$	-	-	-	$796.3$	$35.6$	$2318.422$	$22.25$	96.97
GRM	$1.3$	$0.96$	$1.3 \times 10^{4}$	$0.11$	-	-	-	$857.57$	$60.2$	$5665.86$	$36.90$	96.97
GOM	$0.84$	-	-	$0.1$	-	-	-	$860.2$	$63.97$	$6321.14$	$38.64$	95.45
SEIR	-	$0.99$	-	-	$5.7$	5	5	$816.345$	$43.813$	$3211.164$	$27.617$	$93.94$

Table 5. Results of Monte Carlo simulations assessing the practical identifiability of the Richards model (Equation (8)) using virtual datasets generated for discrete monkeypox experimental data points. The table presents the average relative estimation errors (AREs) for each model parameter (r, a, and k) under varying noise levels ($σ = 1\%$, $5\%$, $10\%$, and $20\%$) alongside the corresponding confidence intervals (CIs). These results demonstrate the model’s robustness in parameter estimation across different levels of observational noise.

Noise level	Quantity	r	a	k
$σ = 1\%$	$ARE$	$2.8867 \times 10^{-5}$	$3.4759 \times 10^{-5}$	$4.1286 \times 10^{-7}$
	CI	$[0.9200, 0.9200]$	$[0.3600, 0.3600]$	[29,000.0000, 29,000.0000]
$σ = 5\%$	$ARE$	$8.5510 \times 10^{-4}$	$0.0015$	$2.7180 \times 10^{-4}$
	CI	$[0.9200, 0.9200]$	$[0.3600, 0.3600]$	[28,999.6373, 29,000.3469]
$σ = 10\%$	$ARE$	$0.0020$	$0.0036$	$9.1805 \times 10^{-4}$
	CI	$[0.9200, 0.9200]$	$[0.3600, 0.3600]$	[28,999.2199, 29,000.7637]
$σ = 20\%$	$ARE$	$0.0042$	$0.0074$	$0.0021$
	CI	$[0.9199, 0.9201]$	$[0.3599, 0.3601]$	[28,998.3931, 29,001.5577]

Table 6. Monte Carlo simulation results for the generalized Richards model (GRM, Equation (10)) using virtual datasets generated for discrete COVID-19 experimental data points. The table presents the average relative estimation errors ($ARE$s) for each model parameter (r, $α$, k, and a) under varying noise levels ($σ = 1\%$, $5\%$, $10\%$, and $20\%$). Confidence intervals (CIs) for the estimated parameters are also included, illustrating the impact of observational noise on parameter identifiability and estimation accuracy.

Noise level	Quantity	r	$α$	k	a
$σ = 1\%$	$ARE$	0	0	0	0
	CI	$[4.8, 4.8]$	$[0.67, 0.67]$	[21,000.00, 21,000.00]	$[1.20, 1.20]$
$σ = 5\%$	$ARE$	$4.9570 \times 10^{-9}$	$1.5778 \times 10^{-9}$	0	$4.0848 \times 10^{-8}$
	CI	$[4.8, 4.8]$	$[0.67, 0.67]$	[21,000.00, 21,000.00]	$[1.20, 1.20]$
$σ = 10\%$	$ARE$	$2.8156 \times 10^{-8}$	$2.2234 \times 10^{-9}$	0	$8.3800 \times 10^{-8}$
	CI	$[4.8, 4.8]$	$[0.67, 0.67]$	[21,000.00, 21,000.00]	$[1.20, 1.20]$
$σ = 20\%$	$ARE$	$7.6621 \times 10^{-8}$	$3.5967 \times 10^{-9}$	$1.3859 \times 10^{-17}$	$4.0848 \times 10^{-8}$
	CI	$[4.8, 4.8]$	$[0.67, 0.67]$	[21,000.00, 21,000.00]	$[1.20, 1.20]$

Table 7. Results of Monte Carlo simulations assessing the practical identifiability of the generalized Richards model (GRM, Equation (10)) using virtual datasets generated for discrete monkeypox experimental data points. The table presents the average relative estimation errors ($ARE$s) for each model parameter (r, $α$, k, and a) across varying noise levels ($σ = 1\%$, $5\%$, $10\%$, and $20\%$). The corresponding confidence intervals (CIs) are also provided, demonstrating the sensitivity of parameter estimates to different levels of observational noise.

Noise level	Statistic	r	$α$	k	a
$σ = 1\%$	ARE	$4.1031 \times 10^{-6}$	$9.0123 \times 10^{-7}$	$1.2002 \times 10^{-9}$	$8.1173 \times 10^{-6}$
$σ = 1\%$	CI	$[2.3000, 2.3000]$	$[0.8200, 0.8200]$	[31,000.0000, 31,000.0000]	$[0.9000, 0.9000]$
$σ = 5\%$	ARE	$8.8096 \times 10^{-4}$	$2.0969 \times 10^{-4}$	$4.2781 \times 10^{-5}$	$0.0011$
$σ = 5\%$	CI	$[2.3000, 2.3002]$	$[0.8200, 0.8200]$	[30,999.9583, 31,000.0272]	$[0.9000, 0.9001]$
$σ = 10\%$	ARE	$0.0039$	$8.4508 \times 10^{-4}$	$3.3973 \times 10^{-4}$	$0.0043$
$σ = 10\%$	CI	$[2.2996, 2.3007]$	$[0.8200, 0.8200]$	[30,999.2908, 31,000.7419]	$[0.8999, 0.9002]$
$σ = 20\%$	ARE	$0.0153$	$0.0032$	$0.0014$	$0.0168$
$σ = 20\%$	CI	$[2.2984, 2.3018]$	$[0.8199, 0.8201]$	[30,997.8189, 31,001.7471]	$[0.8993, 0.9008]$
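
The two summary statistics reported in these Monte Carlo tables can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes that the ARE of a parameter is 100% times the mean of |p_true - p_j| / |p_true| over the refitted estimates p_j, and that each CI is a 95% percentile interval of those estimates. The function names and the sample estimates are hypothetical.

```python
import statistics


def average_relative_error(true_value, estimates):
    """ARE in percent: 100 * mean(|true - est| / |true|) (assumed definition)."""
    if true_value == 0:
        raise ValueError("relative error is undefined for a zero true value")
    return 100.0 * statistics.fmean(
        [abs(true_value - est) / abs(true_value) for est in estimates]
    )


def percentile_interval(estimates, level=0.95):
    """Empirical two-sided interval, interpolating linearly between order statistics."""
    xs = sorted(estimates)

    def quantile(q):
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1.0 - level) / 2.0
    return quantile(tail), quantile(1.0 - tail)


# Hypothetical refitted values of r around a true value of 2.3 (illustration only).
r_true = 2.3
r_estimates = [2.2990, 2.3004, 2.3011, 2.2997, 2.3002]
are_r = average_relative_error(r_true, r_estimates)
ci_low, ci_high = percentile_interval(r_estimates)
print(f"ARE(r) = {are_r:.4f}%  CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

Under this assumed definition the ARE is a percentage, so it can be compared directly with the noise level $σ$ used to generate the virtual datasets.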

Table 8. Results of Monte Carlo simulations evaluating the practical identifiability of the generalized logistic model (GLM, Equation (5)) using virtual datasets generated for discrete Ebola experimental data points. The table presents the average relative estimation errors (AREs) for each model parameter (r, a, and k) under varying noise levels ($σ = 1\%$, $5\%$, $10\%$, and $20\%$). These results quantify the model’s sensitivity to observational noise and highlight the robustness of parameter estimation.

Noise level	Statistic	r	a	k
$σ = 1\%$	ARE	$4.5294 \times 10^{-4}$	$8.6649 \times 10^{-5}$	$5.4069 \times 10^{-5}$
$σ = 1\%$	CI	$[0.7800, 0.7800]$	$[0.8500, 0.8500]$	[10,999.9599, 11,000.0587]
$σ = 5\%$	ARE	$0.0058$	$0.0011$	$0.0019$
$σ = 5\%$	CI	$[0.7799, 0.7801]$	$[0.8500, 0.8500]$	[10,999.4004, 11,000.6236]
$σ = 10\%$	ARE	$0.0126$	$0.0024$	$0.0043$
$σ = 10\%$	CI	$[0.7798, 0.7803]$	$[0.8499, 0.8501]$	[10,998.7789, 11,001.3100]
$σ = 20\%$	ARE	$0.0249$	$0.0047$	$0.0082$
$σ = 20\%$	CI	$[0.7795, 0.7805]$	$[0.8499, 0.8501]$	[10,997.4842, 11,002.4169]

Table 9. Monte Carlo simulation results for the Gompertz (GOM) model (Equation (12)) based on virtual datasets generated at discrete monkeypox experimental data points. The table presents the average relative estimation errors (AREs) for each model parameter across varying levels of observational noise ($σ = 1\%$, $5\%$, $10\%$, and $20\%$). These results highlight the sensitivity of the parameter estimates to noise and provide confidence intervals (CIs) for each parameter to evaluate the robustness of the model under real-world data conditions.

Noise level	Statistic	r	a
$σ = 1\%$	ARE	$0.0531$	$0.0533$
$σ = 1\%$	CI	$[1.4980, 1.5019]$	$[0.1997, 0.2003]$
$σ = 5\%$	ARE	$0.2639$	$0.2742$
$σ = 5\%$	CI	$[1.4896, 1.5105]$	$[0.1987, 0.2014]$
$σ = 10\%$	ARE	$0.5309$	$0.5485$
$σ = 10\%$	CI	$[1.4803, 1.5187]$	$[0.1974, 0.2027]$
$σ = 20\%$	ARE	$1.0741$	$1.1507$
$σ = 20\%$	CI	$[1.4596, 1.5384]$	$[0.1939, 0.2053]$
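
A Monte Carlo experiment of the kind summarized in Table 9 can be sketched in a few lines for a two-parameter model. The sketch below rests on several assumptions that are not taken from the paper: it uses the common Gompertz form dC/dt = r C exp(-a t), which may be parameterized differently from Equation (12); it fits the cumulative curve on a log scale with a known initial value rather than using the GrowthPredict toolbox; and it perturbs each noise-free point with Gaussian noise whose standard deviation is $σ$ times that point. The true values r = 1.5 and a = 0.2 are read off the centers of the CIs in Table 9, while `c0`, the observation times, and all function names are hypothetical.

```python
import math
import random


def gompertz(t, r, a, c0):
    """Closed-form solution of dC/dt = r*C*exp(-a*t) with C(0) = c0 (assumed form)."""
    return c0 * math.exp((r / a) * (1.0 - math.exp(-a * t)))


def fit_gompertz(times, values, c0, a_lo=0.01, a_hi=2.0, iters=50):
    """Least squares on log(C/c0): for each a the best r has a closed form,
    and a itself is located by golden-section search on [a_lo, a_hi]."""
    z = [math.log(v / c0) for v in values]

    def best_r_and_rss(a):
        u = [(1.0 - math.exp(-a * t)) / a for t in times]
        r = sum(zi * ui for zi, ui in zip(z, u)) / sum(ui * ui for ui in u)
        rss = sum((zi - r * ui) ** 2 for zi, ui in zip(z, u))
        return r, rss

    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = a_lo, a_hi
    for _ in range(iters):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if best_r_and_rss(x1)[1] < best_r_and_rss(x2)[1]:
            hi = x2  # minimum lies in [lo, x2]
        else:
            lo = x1  # minimum lies in [x1, hi]
    a = 0.5 * (lo + hi)
    return best_r_and_rss(a)[0], a


def monte_carlo_are(r_true, a_true, c0, times, sigma, n_sets=100, seed=0):
    """Return (ARE of r, ARE of a) in percent over n_sets noisy virtual datasets."""
    rng = random.Random(seed)
    clean = [gompertz(t, r_true, a_true, c0) for t in times]
    r_err = a_err = 0.0
    for _ in range(n_sets):
        # Relative Gaussian noise (assumed noise model); clamp to keep the log defined.
        noisy = [max(c * (1.0 + sigma * rng.gauss(0.0, 1.0)), 1e-12) for c in clean]
        r_hat, a_hat = fit_gompertz(times, noisy, c0)
        r_err += abs(r_true - r_hat) / abs(r_true)
        a_err += abs(a_true - a_hat) / abs(a_true)
    return 100.0 * r_err / n_sets, 100.0 * a_err / n_sets


times = list(range(1, 25))  # hypothetical weekly observation times
for sigma in (0.01, 0.05, 0.10, 0.20):
    are_r, are_a = monte_carlo_are(1.5, 0.2, c0=10.0, times=times, sigma=sigma)
    print(f"sigma = {sigma:.0%}: ARE(r) = {are_r:.3f}%, ARE(a) = {are_a:.3f}%")
```

Because the estimator, noise model, and data differ from those used in the paper, the printed values are not expected to reproduce Table 9. The sketch is only meant to show the loop structure: simulate, perturb, refit, and average the relative errors.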

Table 10. Results of Monte Carlo simulations assessing the practical identifiability of the SEIR model with inhomogeneous mixing (Equation (14)) using virtual datasets generated for discrete monkeypox experimental data points. The table presents the average relative estimation errors (AREs) for each model parameter ($β$, $κ$, $γ$, N, and $α$) under varying noise levels ($σ = 1\%$, $5\%$, $10\%$, and $20\%$). These results highlight the robustness of parameter estimation.

Noise level	Statistic	$β$	$κ$	$γ$	N	$α$
$σ = 1\%$	ARE	$5.3855 \times 10^{-7}$	$6.3303 \times 10^{-8}$	$3.0005 \times 10^{-8}$	$2.3138 \times 10^{-15}$	$1.0316 \times 10^{-7}$
$σ = 1\%$	CI	$[7.3000, 7.3000]$	$[4.7000, 4.7000]$	$[4.8000, 4.8000]$	[100,000.0000, 100,000.0000]	$[0.9600, 0.9600]$
$σ = 5\%$	ARE	$3.0360 \times 10^{-4}$	$3.0366 \times 10^{-4}$	$3.0385 \times 10^{-4}$	$3.3689 \times 10^{-9}$	$6.3055 \times 10^{-5}$
$σ = 5\%$	CI	$[7.3000, 7.3000]$	$[4.7000, 4.7000]$	$[4.8000, 4.8000]$	[100,000.0000, 100,000.0000]	$[0.9600, 0.9600]$
$σ = 10\%$	ARE	$0.0042$	$0.0043$	$0.0044$	$4.1763 \times 10^{-8}$	$4.9886 \times 10^{-5}$
$σ = 10\%$	CI	$[7.2940, 7.3000]$	$[4.6999, 4.7041]$	$[4.7960, 4.8000]$	[100,000.0000, 100,000.0000]	$[0.9600, 0.9600]$
$σ = 20\%$	ARE	$0.0363$	$0.0371$	$0.0379$	$9.5587 \times 10^{-7}$	$2.6628 \times 10^{-4}$
$σ = 20\%$	CI	$[7.2869, 7.3001]$	$[4.6997, 4.7089]$	$[4.7909, 4.8000]$	[99,999.9964, 100,000.0022]	$[0.9600, 0.9600]$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liyanage, Y.R.; Chowell, G.; Pogudin, G.; Tuncer, N. Structural and Practical Identifiability of Phenomenological Growth Models for Epidemic Forecasting. Viruses 2025, 17, 496. https://doi.org/10.3390/v17040496
