Weibull-Type Incubation Period and Time of Exposure Using γ-Divergence

Daisuke Yoneoka; Takayuki Kawashima; Yuta Tanoue; Shuhei Nomura; Akifumi Eguchi

doi:10.3390/e27030321

¹ Center for Surveillance, Immunization, and Epidemiologic Research, National Institute of Infectious Diseases, Tokyo 162-8640, Japan

² Department of Mathematical and Computing Science, School of Computing, Institute of Science Tokyo, Tokyo 152-8552, Japan

³ Faculty of Marine Technology, Tokyo University of Marine Science and Technology, Tokyo 135-8533, Japan

⁴ Department of Health Policy and Management, School of Medicine, Keio University, Tokyo 160-8582, Japan

Entropy 2025, 27(3), 321; https://doi.org/10.3390/e27030321

This article belongs to the Special Issue Entropy in Biomedical Engineering, 3rd Edition


Abstract

Accurately determining the exposure time to an infectious pathogen, together with the corresponding incubation period, is vital for identifying infection sources and implementing targeted public health interventions. However, real-world outbreak data often include outliers, namely tertiary or subsequent infection cases not directly linked to the initial source, that complicate the estimation of exposure time. To address this challenge, we introduce a robust estimation framework based on a three-parameter Weibull distribution in which the location parameter naturally corresponds to the unknown exposure time. Our method employs a γ-divergence criterion (a robust generalization of the standard cross-entropy criterion), optimized via a tailored majorization–minimization (MM) algorithm designed to guarantee a monotonic decrease in the objective function despite the non-convexity typically present in robust formulations. Extensive Monte Carlo simulations demonstrate that our approach outperforms conventional estimation methods in terms of bias and mean squared error as well as in estimating the incubation period. Moreover, applications to real-world surveillance data on COVID-19 illustrate the practical advantages of the proposed method. These findings highlight the method’s robustness and efficiency in scenarios where data contamination from secondary or tertiary infections is common, showing its potential value for early outbreak detection and rapid epidemiological response.

Keywords:

three-parameter Weibull distribution; γ-divergence; exposure time; infectious disease; robust estimation

1. Introduction

Identifying the precise time of exposure to a newly emerging infectious pathogen using symptom onset data is a fundamental step for locating infection sources and implementing effective public health measures such as contact tracing, particularly in diseases transmissible via human-to-human contact [1,2]. Once the exposure time is well estimated, one can also determine the incubation period, which is defined as the time from the exposure event to symptom onset [3]. Historically, a variety of parametric distributions including exponential, lognormal, and Weibull distributions have been employed to model this period [4,5]. Nishiura (2007) provides a comprehensive account of their application in infectious disease epidemiology [6].

However, real-world outbreak surveillance systems rarely supply “clean” data in which all individuals share the same single-exposure event when estimating the incubation period. In practice, many datasets are inherently mixed: they contain not only secondary infections (Case 1), who were exposed at the original infection source, but also tertiary or later infections (Case 2), whose exposure stems from secondary transmissions. Data from Case 2 do not conform to the single-exposure assumption and thus act as outliers in the estimation process. Although excluding these outliers a priori would be ideal, the detailed investigations to confidently remove them are often expensive and time-consuming tasks, especially in urgent pandemic surveillance contexts. As a motivating example, Figure 1, which will be revisited in Section 4 of this paper, shows an epidemic curve of COVID-19 in Tianjin, China (21 January–12 February 2020), where four tertiary cases deviate considerably from the main cluster yet are not straightforward to exclude.

Figure 1. Epicurve of COVID-19 in China [7].

Our previous work tackled a similar situation using a three-parameter lognormal framework, proposing a robust estimation approach based on γ-divergence, a robust divergence measure that generalizes the standard cross-entropy [8], to mitigate the impact of outliers on incubation period estimates [9]. While that approach proved valuable for distributions with long right tails, other scenarios may call for alternative parametric assumptions. In particular, the three-parameter Weibull distribution may be more suitable if the data lack extremely heavy tails or if epidemiological insight suggests that the hazard rate of onset changes monotonically over time rather than following the shape implied by the lognormal form. Nonetheless, just as with the lognormal model, conventional maximum likelihood methods applied to the Weibull distribution remain susceptible to contamination from tertiary infections. In addition, a general γ-divergence-based estimation framework, which can in principle be applied to various parametric models, has been proposed by Okuno (2023) [10]. Although this approach can be extended to outbreak datasets, it does not specifically target the optimization challenges posed by a three-parameter Weibull model under severe data contamination (e.g., tertiary infections). In this study, we focus on a dedicated method tailored to the three-parameter Weibull distribution and its estimation procedure. We develop a specialized majorization–minimization (MM) algorithm that guarantees a monotonic decrease in our γ-based objective function and also derive the covariance matrix of the resulting estimator. This specialization allows us to handle mixed or outlier-contaminated outbreak data more stably in practice while still benefiting from the robust properties of γ-divergence.

To address this issue, the current study adapts a γ-divergence-based robust methodology to the three-parameter Weibull setting, interpreting its location parameter as the unknown exposure time. We also develop a tailored MM algorithm to optimize a γ-cross-entropy criterion, enabling a stable estimation of the shape, scale, and location parameters, even under mixed or contaminated data conditions. Simulation experiments demonstrate that our approach substantially reduces bias and variance compared to traditional estimators, while an application to real-world surveillance data for COVID-19 further highlights its practicality.

This article is organized as follows: In Section 2, we review γ-divergence and introduce the associated objective function tailored to the three-parameter Weibull distribution. We then present an optimization method based on the MM algorithm. Next, a sandwich-type estimator for the covariance matrix is proposed using the theory of M-estimation. Simulation experiments and a real-world data analysis employing epidemiological surveillance data for COVID-19 are described in Section 3 and Section 4, respectively. The article ends with a discussion in Section 5.

2. Method

2.1. Three-Parameter Weibull Distribution for Estimating the Exposure Time to Infectious Source and Incubation Period

Let $y_{i}$ be the disease onset time of the ith individual ($i = 1, \dots, N$), assumed to have the probability density function (PDF) given by

\begin{matrix} f (y_{i} | θ) & = α β {(y_{i} - η)}^{β - 1} exp \{- α {(y_{i} - η)}^{β}\} \\ (- \infty < η < y_{i} < \infty, α > 0, β > 0) . \end{matrix}

(1)

The cumulative distribution function (CDF) and hazard function are, respectively,

\begin{matrix} F (y) = 1 - exp \{- α {(y - η)}^{β}\} \end{matrix}

(2)

and

\begin{matrix} h (y) = α β {(y - η)}^{β - 1} . \end{matrix}

(3)

The three-parameter Weibull distribution reduces to the two-parameter (i.e., conventional) Weibull distribution when $η = 0$. Figure 2 illustrates the PDFs in Equation (1) when $η = - 1$.

Figure 2. Probability density functions of three-parameter Weibull distribution when $η = - 1$.

Here, we assume a single, simultaneous exposure to the infectious source: i.e., every individual is assumed to have been exposed to one infectious source at the same time. Under this assumption of single-point exposure, the notable advantage of the three-parameter Weibull distribution is that the support of Y (the disease onset timing) is $[η, \infty)$; therefore, η can be interpreted as “the timing of exposure to the infectious source”. Once the parameters are estimated, the average and q-quantile of the incubation period can be calculated, respectively, as

\begin{matrix} {(\frac{1}{α})}^{1 / β} Γ (1 + \frac{1}{β}) + η \quad \text{and} \quad {\{\frac{- log (1 - q)}{α}\}}^{1 / β} + η, \end{matrix}

(4)

where $q \in (0, 1)$ is the quantile level, so that the second expression gives the 100q% percentile of the three-parameter Weibull distribution.
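The quantities in Equations (1)–(4) are straightforward to evaluate numerically. The following is a minimal sketch using only the Python standard library; the function names and the parameter values are hypothetical and chosen purely for illustration. The mean and quantile follow Equation (4) as written, i.e., they are expressed on the onset-time scale and include the shift η.

```python
import math

def weibull3_pdf(y, alpha, beta, eta):
    """Density in Equation (1); zero outside the support y > eta."""
    if y <= eta:
        return 0.0
    z = y - eta
    return alpha * beta * z ** (beta - 1) * math.exp(-alpha * z ** beta)

def weibull3_cdf(y, alpha, beta, eta):
    """Cumulative distribution function in Equation (2)."""
    if y <= eta:
        return 0.0
    return 1.0 - math.exp(-alpha * (y - eta) ** beta)

def weibull3_hazard(y, alpha, beta, eta):
    """Hazard function in Equation (3), intended for y > eta."""
    return alpha * beta * (y - eta) ** (beta - 1)

def weibull3_mean(alpha, beta, eta):
    """First expression in Equation (4)."""
    return (1.0 / alpha) ** (1.0 / beta) * math.gamma(1.0 + 1.0 / beta) + eta

def weibull3_quantile(q, alpha, beta, eta):
    """Second expression in Equation (4), for a quantile level 0 < q < 1."""
    return (-math.log(1.0 - q) / alpha) ** (1.0 / beta) + eta

# Hypothetical parameter values, used only to exercise the functions.
alpha, beta, eta = 0.5, 2.0, -1.0
q95 = weibull3_quantile(0.95, alpha, beta, eta)
mean_onset = weibull3_mean(alpha, beta, eta)
```

Evaluating `weibull3_cdf` at `q95` returns 0.95 up to floating-point error, which is a convenient sanity check that the quantile expression inverts Equation (2).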

2.2. Brief Introduction of γ-Divergence

The γ-divergence was defined for two PDFs by Fujisawa and Eguchi (2008) [11]. Let $g (x)$ and $f (x | θ)$ be the PDFs of the data-generating and parametric model distributions of x, respectively. The γ-divergence is defined by

\begin{matrix} D_{γ} (g, f_{θ}) & = \frac{1}{γ (1 + γ)} log \int g {(x)}^{1 + γ} d x - \frac{1}{γ} log \int g (x) f {(x | θ)}^{γ} d x \\ + \frac{1}{1 + γ} log \int f {(x | θ)}^{1 + γ} d x, \end{matrix}

where $f_{θ}$ is the parametric PDF of interest. Note that ${lim}_{γ \to 0} D_{γ} (g, f_{θ}) = \int g (x) log \frac{g (x)}{f (x | θ)} d x$, which is the Kullback–Leibler (KL) divergence. This divergence satisfies the following two properties: (i) $D_{γ} (g, f_{θ}) \geq D_{γ} (g, g)$ and (ii) $D_{γ} (g, f_{θ}) = 0 \Leftrightarrow g (x) = c f (x | θ)$, where c is a positive constant. Values of γ close to zero bring the criterion close to traditional likelihood-based (KL divergence) estimation, which is efficient but less robust to outliers, while larger values prioritize robustness at the cost of efficiency.
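As a brief worked step for property (i), setting $f_{θ} = g$ in the definition makes all three integrals coincide, and the coefficients cancel. This is a sketch of the arithmetic only, written in the notation of the definition above:

```latex
\begin{matrix} D_{γ} (g, g) = \{\frac{1}{γ (1 + γ)} - \frac{1}{γ} + \frac{1}{1 + γ}\} log \int g {(x)}^{1 + γ} d x = \frac{1 - (1 + γ) + γ}{γ (1 + γ)} log \int g {(x)}^{1 + γ} d x = 0 . \end{matrix}
```

Property (i) therefore amounts to non-negativity of the divergence.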

Based on the above γ-divergence, the empirical version of the γ-cross entropy between $g (x)$ and $f (x | θ)$ is defined as

\begin{matrix} d_{γ} (\bar{g}, f_{θ}) = - \frac{1}{γ} log (\frac{1}{N} \sum_{i = 1}^{N} f {(x_{i} | θ)}^{γ}) + \frac{1}{1 + γ} log \int_{X} f {(x | θ)}^{1 + γ} d x, \end{matrix}

where $\bar{g}$ is the empirical PDF. The robust estimator (γ-estimator) is then defined as

\begin{matrix} {\hat{θ}}_{γ} = \underset{θ}{argmin} d_{γ} (\bar{g}, f_{θ}) . \end{matrix}

(5)

Now, we consider the case where the data-generating distribution is contaminated with outliers (i.e., Case 2) and given by

\begin{matrix} g (x) = (1 - ε) f_{θ} (x) + ε δ (x), \end{matrix}

(6)

which is a mixture of the target distribution $f_{θ} (x)$ and a certain contamination distribution $δ (x)$, where ε denotes the proportion of outliers. The most important assumption here is

\begin{matrix} ν_{f_{θ}} = {\{\int δ (x) f_{θ} {(x)}^{γ} d x\}}^{1 / γ} \approx 0 \quad \text{for an appropriately large } γ, \end{matrix}

(7)

which corresponds to the practical situation where the outliers mostly lie in the tail of the target distribution. Kanamori and Fujisawa (2015) show the robust properties from a viewpoint of latent bias [12].

2.3. γ-Entropy and MM Algorithm for Optimization

Using Equation (1), the γ-cross entropy function can be written as

\begin{matrix} l_{γ} (θ) & = - \frac{1}{γ} log (\frac{1}{N} \sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}) + \frac{1}{1 + γ} log (\int_{η}^{\infty} f {(y | θ)}^{1 + γ} d y) \\ = - \frac{1}{γ} log (\frac{1}{N} \sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}) + \frac{1}{1 + γ} log [α^{\frac{γ}{β}} β^{γ} {(1 + γ)}^{- (1 + γ - \frac{γ}{β})} Γ (1 + γ - \frac{γ}{β})] \end{matrix}

(8)
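The closed form of the integral term in Equation (8) can be checked numerically. The sketch below, using only the Python standard library, evaluates the objective and compares the closed-form integral with a midpoint-rule approximation. The function names, the integration range, and the parameter values are assumptions made for illustration; the average over observations is taken with the factor 1/N.

```python
import math

def weibull3_pdf(y, alpha, beta, eta):
    """Density in Equation (1); zero outside the support y > eta."""
    if y <= eta:
        return 0.0
    z = y - eta
    return alpha * beta * z ** (beta - 1) * math.exp(-alpha * z ** beta)

def log_power_integral(alpha, beta, gamma):
    """Closed form of log of the integral of f^(1 + gamma), as in Equation (8)."""
    a = 1.0 + gamma - gamma / beta  # argument of the gamma function; must be positive
    return ((gamma / beta) * math.log(alpha) + gamma * math.log(beta)
            - a * math.log(1.0 + gamma) + math.lgamma(a))

def gamma_cross_entropy(ys, alpha, beta, eta, gamma):
    """Objective l_gamma(theta) in Equation (8); requires every y in ys to exceed eta."""
    mean_power = sum(weibull3_pdf(y, alpha, beta, eta) ** gamma for y in ys) / len(ys)
    return -math.log(mean_power) / gamma + log_power_integral(alpha, beta, gamma) / (1.0 + gamma)

def numeric_power_integral(alpha, beta, eta, gamma, upper=20.0, steps=200_000):
    """Midpoint rule for the integral of f^(1 + gamma) over (eta, eta + upper)."""
    h = upper / steps
    return h * sum(weibull3_pdf(eta + (k + 0.5) * h, alpha, beta, eta) ** (1.0 + gamma)
                   for k in range(steps))

# Hypothetical values, for illustration only.
closed = math.exp(log_power_integral(0.5, 2.0, 0.5))
numeric = numeric_power_integral(0.5, 2.0, -1.0, 0.5)
value = gamma_cross_entropy([0.2, 0.9, 1.5], 0.5, 2.0, -1.0, 0.5)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma function when its argument is large.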

Notably, the second term in Equation (8) has a simple closed form for the Weibull model, which is not the case for many other parametric models. To obtain the minimizer, we propose an iterative procedure based on the majorization–minimization (MM) algorithm as follows.

Let us prepare the majorization function h for the cross-entropy satisfying

\begin{matrix} h (θ^{(t)} | θ^{(t)}) = l_{γ} (θ^{(t)}) \\ h (θ | θ^{(t)}) \geq l_{γ} (θ), \end{matrix}

where $θ^{(t)}$ is the parameter value at the t-th iteration step for $t = 0, 1, 2, \dots$. The MM algorithm applies the iterative update

\begin{matrix} θ^{(t + 1)} = \underset{θ}{argmin} h (θ | θ^{(t)}) . \end{matrix}

The objective function $l_{γ} (θ)$ decreases monotonically at each step because

\begin{matrix} l_{γ} (θ^{(t)}) = h (θ^{(t)} | θ^{(t)}) \geq h (θ^{(t + 1)} | θ^{(t)}) \geq l_{γ} (θ^{(t + 1)}) . \end{matrix}

Here, we propose the majorization function for Equation (8) using Jensen’s inequality as follows:

\begin{matrix} l_{γ} (θ) & \leq - \frac{1}{γ} \sum_{i = 1}^{N} w_{i} log (f {(y_{i} | θ)}^{γ} \frac{\sum_{j = 1}^{N} f {(y_{j} | θ^{(t)})}^{γ}}{N f {(y_{i} | θ^{(t)})}^{γ}}) + \\ \frac{1}{1 + γ} log [α^{\frac{γ}{β}} β^{γ} {(1 + γ)}^{- (1 + γ - \frac{γ}{β})} Γ (1 + γ - \frac{γ}{β})] \\ = - \sum_{i = 1}^{N} w_{i} log f (y_{i} | θ) + \frac{1}{1 + γ} log [α^{\frac{γ}{β}} β^{γ} {(1 + γ)}^{- (1 + γ - \frac{γ}{β})} Γ (1 + γ - \frac{γ}{β})] + c (θ^{(t)}) \\ = h (θ | θ^{(t)}) + c (θ^{(t)}), \end{matrix}

where $w_{i} = \frac{f {(y_{i} | θ^{(t)})}^{γ}}{\sum_{j = 1}^{N} f {(y_{j} | θ^{(t)})}^{γ}}$ and $c (θ^{(t)})$ is a term that does not depend on the parameter θ. The first term of the target function $l_{γ} (θ)$ is the logarithm of a sum of powered densities, which is not easy to optimize in general, while the first term in $h (θ | θ^{(t)})$ is a weighted log-likelihood and is easy to optimize by using the derivatives in Appendix A.

Using the t-th iteration values $β^{(t)}$ and $η^{(t)}$, the (t + 1)-th iteration value of α can be obtained in closed form as follows:

\begin{matrix} α^{(t + 1)} = \frac{1 - \frac{γ}{β^{(t)} (1 + γ)}}{\sum_{i = 1}^{N} w_{i} {(y_{i} - η^{(t)})}^{β^{(t)}}} \end{matrix}

(9)

Additionally, at the (t + 1)-th iteration, the values of $β^{(t + 1)}$ and $η^{(t + 1)}$ can be obtained by finding the roots of the following equations:

\begin{matrix} - \sum_{i = 1}^{N} w_{i} \{\frac{1}{β^{(t + 1)}} + log (y_{i} - η^{(t)}) - α^{(t)} {(y_{i} - η^{(t)})}^{β^{(t + 1)}} log (y_{i} - η^{(t)})\} + \\ \frac{1}{1 + γ} \{- \frac{γ}{{(β^{(t + 1)})}^{2}} log α^{(t)} + \frac{γ}{β^{(t + 1)}} - \frac{γ}{{(β^{(t + 1)})}^{2}} log (1 + γ) + \\ \frac{γ}{{(β^{(t + 1)})}^{2}} ψ (1 + \frac{γ (β^{(t + 1)} - 1)}{β^{(t + 1)}})\} = 0 \end{matrix}

(10)

\begin{matrix} \sum_{i = 1}^{N} w_{i} \{\frac{β^{(t)} - 1}{y_{i} - η^{(t + 1)}} - α^{(t)} β^{(t)} {(y_{i} - η^{(t + 1)})}^{β^{(t)} - 1}\} = 0 . \end{matrix}

(11)
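To make the weighting mechanism concrete, the sketch below computes the weights $w_{i}$ and applies the closed-form α update of Equation (9) once, holding β and η fixed. It is a partial illustration under stated assumptions, not the full algorithm: the root-finding steps for β and η in Equations (10) and (11) are omitted, and the data and starting values are hypothetical. Because the α update minimizes the majorizer in α when $1 - γ / \{β (1 + γ)\} > 0$, the objective should not increase after this step.

```python
import math

def weibull3_pdf(y, alpha, beta, eta):
    """Density in Equation (1); zero outside the support y > eta."""
    if y <= eta:
        return 0.0
    z = y - eta
    return alpha * beta * z ** (beta - 1) * math.exp(-alpha * z ** beta)

def gamma_cross_entropy(ys, alpha, beta, eta, gamma):
    """Objective in Equation (8), averaging the powered densities over N."""
    a = 1.0 + gamma - gamma / beta
    log_integral = ((gamma / beta) * math.log(alpha) + gamma * math.log(beta)
                    - a * math.log(1.0 + gamma) + math.lgamma(a))
    mean_power = sum(weibull3_pdf(y, alpha, beta, eta) ** gamma for y in ys) / len(ys)
    return -math.log(mean_power) / gamma + log_integral / (1.0 + gamma)

def mm_weights(ys, alpha, beta, eta, gamma):
    """Weights w_i: powered densities at the current iterate, normalized to sum to one."""
    powered = [weibull3_pdf(y, alpha, beta, eta) ** gamma for y in ys]
    total = sum(powered)
    return [p / total for p in powered]

def update_alpha(ys, weights, beta, eta, gamma):
    """Closed-form alpha update in Equation (9), with beta and eta held fixed."""
    s = sum(w * (y - eta) ** beta for w, y in zip(weights, ys))
    return (1.0 - gamma / (beta * (1.0 + gamma))) / s

# Hypothetical onset days; the last value mimics a late (tertiary) case.
ys = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 12.0]
alpha, beta, eta, gamma = 0.05, 2.0, 0.0, 0.5

before = gamma_cross_entropy(ys, alpha, beta, eta, gamma)
w = mm_weights(ys, alpha, beta, eta, gamma)
alpha_new = update_alpha(ys, w, beta, eta, gamma)
after = gamma_cross_entropy(ys, alpha_new, beta, eta, gamma)
```

In this toy dataset the late observation receives the smallest weight, which is how observations in the tail of the current fit are downweighted in the weighted log-likelihood.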

Remark 1

(Identification of outliers). By plugging the estimated ${\hat{θ}}_{γ}$ into Equation (1) (or the associated likelihood function), we can visually inspect the presence or absence of outliers. If a sample is an outlier, it will be plotted at the tail of the distribution. Given an estimate of ε, we can identify as outliers the 100ε% of cases with the smallest estimated values of $f (y_{i} | {\hat{θ}}_{γ})$.
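A minimal sketch of this flagging rule follows, assuming ε is the outlier proportion of Equation (6) and is supplied by the analyst. The function name `flag_outliers`, the rounding rule (ceiling), the data, and the fitted parameter values are all hypothetical.

```python
import math

def weibull3_pdf(y, alpha, beta, eta):
    """Density in Equation (1); zero outside the support y > eta."""
    if y <= eta:
        return 0.0
    z = y - eta
    return alpha * beta * z ** (beta - 1) * math.exp(-alpha * z ** beta)

def flag_outliers(ys, alpha, beta, eta, eps):
    """Indices of the ceil(eps * N) observations with the smallest fitted density."""
    n_out = math.ceil(eps * len(ys))  # assumption: round the outlier count upward
    order = sorted(range(len(ys)), key=lambda i: weibull3_pdf(ys[i], alpha, beta, eta))
    return sorted(order[:n_out])

# Hypothetical onset days and fitted parameters, for illustration only.
ys = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 12.0]
flagged = flag_outliers(ys, alpha=0.05, beta=2.0, eta=0.0, eps=0.125)
```

With these inputs only the observation at day 12 is flagged, since it has by far the smallest fitted density.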

Remark 2

(Selection criterion for tuning parameter γ). Following Sugasawa and Yonekura (2021) [13], we use the following selection criterion:

\begin{matrix} H_{N} (γ) & = \sum_{i = 1}^{N} [f^{'} {(y_{i} | {\hat{θ}}_{γ})}^{2} f {(y_{i} | {\hat{θ}}_{γ})}^{γ - 2} \{\frac{2 (γ - 1)}{C_{γ} ({\hat{θ}}_{γ})} + \frac{f {(y_{i} | {\hat{θ}}_{γ})}^{γ}}{C_{γ} ({\hat{θ}}_{γ})}\} \\ + \frac{2 f {(y_{i} | {\hat{θ}}_{γ})}^{γ - 1} f^{″} (y_{i} | {\hat{θ}}_{γ})}{C_{γ} ({\hat{θ}}_{γ})}], \end{matrix}

(12)

where

\begin{matrix} C_{γ} ({\hat{θ}}_{γ}) = {\{\int_{X} f {(x | {\hat{θ}}_{γ})}^{1 + γ} d x\}}^{\frac{γ}{1 + γ}} = {\{{\hat{α}}^{\frac{γ}{\hat{β}}} {\hat{β}}^{γ} {(1 + γ)}^{- (1 + γ - \frac{γ}{\hat{β}})} Γ (1 + γ - \frac{γ}{\hat{β}})\}}^{\frac{γ}{1 + γ}}, \\ f^{'} (y_{i} | {\hat{θ}}_{γ}) = (\frac{\hat{β}}{y_{i} - \hat{η}} - \hat{α} \hat{β}) f (y_{i} | {\hat{θ}}_{γ}), \quad \text{and} \quad f^{″} (y_{i} | {\hat{θ}}_{γ}) = {(\frac{\hat{β}}{y_{i} - \hat{η}} - \hat{α} \hat{β})}^{2} f (y_{i} | {\hat{θ}}_{γ}) . \end{matrix}

Then, we estimate the optimal γ as $γ_{opt} = {argmin}_{γ} H_{N} (γ)$. In practice, as in the simulation section, several γ values are prepared a priori, and the one with the smallest $H_{N} (γ)$ is selected. Other simple decision rules might be based on previous studies using γ-divergence [14] and density power divergence [15,16], in which the value of 0.5 was selected as a reasonable middle ground for practical applications, offering sufficient robustness without a significant loss of efficiency. In addition, other practical approaches for tuning γ, such as cross-validation or utilizing external validation datasets, are also viable [14].

Note that $H_{N} (γ)$ is known as the Hyvärinen score, which is a model selection criterion defined via the score function (from a Bayesian perspective). Note also that while γ-divergence is known to exhibit relative affine invariance [11], the selection criterion $H_{N} (γ)$ itself does not necessarily share this property; thus, rescaling data (e.g., doubling the observed time) may alter the optimal value of γ.

2.4. Initial Value of MM Algorithm

The proposed MM algorithm ensures the monotonic decreasing property of the objective function. However, when Equation (8) has several local minima, the algorithm may converge to a local minimum rather than the global minimum. Hence, the selection of the initial value is essential. A simple approach is to start from the estimates obtained by the maximum likelihood method or the method of moments, which is the approach applied in the simulation section. Another, more elaborate approach is to run the MM algorithm from various initial values and select the run with the smallest value of $l_{γ}$. The procedure for creating the initial values is as follows. First, a subsample is created by randomly selecting q samples from the N observations. Next, the median of the subsample is used as the initial value for $α^{(0)}$, the median absolute deviation as the initial value for $β^{(0)}$, and $min (y_{i}) - κ$ ($κ > 0$) as the initial value for $η^{(0)}$, and the MM algorithm is run from these values to obtain the attained minimum of Equation (8). The value of κ should be determined beforehand based on expert opinion or other criteria. This process is repeated M times, and the initial values that yield the smallest value of Equation (8) are selected.
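The multi-start procedure can be sketched as follows, using only the Python standard library. The names `candidate_start` and `multi_start` are hypothetical, and the fitting routine is deliberately left as a user-supplied callable `run_mm` that is assumed to return a pair (estimate, attained objective value). Taking the minimum over the full sample, rather than over the subsample, is an assumption made here so that every observation lies above the initial η.

```python
import random
import statistics

def candidate_start(ys, q, kappa, rng):
    """One candidate (alpha0, beta0, eta0) built from a random subsample of size q."""
    sub = rng.sample(ys, q)
    med = statistics.median(sub)
    mad = statistics.median([abs(y - med) for y in sub])
    # Assumption: use the full-sample minimum so that every y_i exceeds eta0.
    return (med, mad, min(ys) - kappa)

def multi_start(ys, q, kappa, m, run_mm, seed=0):
    """Run the supplied fitter from m random starts and keep the best run.

    `run_mm(ys, start)` is a placeholder for an MM fitting routine and must
    return a pair (theta, objective_value).
    """
    rng = random.Random(seed)
    fits = [run_mm(ys, candidate_start(ys, q, kappa, rng)) for _ in range(m)]
    return min(fits, key=lambda fit: fit[1])

# Hypothetical onset days, for illustration only.
ys = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 12.0]
first = candidate_start(ys, q=5, kappa=0.5, rng=random.Random(0))
```

Both the median and the median absolute deviation must be positive to serve as starting values for α and β; with heavily tied data the median absolute deviation can be zero, in which case the candidate should be redrawn.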

2.5. Asymptotic Properties of ${\hat{θ}}_{γ}$

We consider the estimation of the covariance matrix of ${\hat{θ}}_{γ}$. Let us assume some regularity conditions, which are common for M-estimators (more precisely, in the theory of normalized estimating equations [17]). The asymptotic normality of ${\hat{θ}}_{γ}$ is given as $\sqrt{N} ({\hat{θ}}_{γ} - θ_{γ}^{*}) \overset{d}{\to} N (0, Σ_{γ})$, where $Σ_{γ} = G_{γ}^{- 1} U_{γ} G_{γ}^{- 1}$, $G_{γ} = E [\frac{\partial^{2} l_{γ} (θ)}{\partial θ \partial θ^{T}}]$ and $U_{γ} = E [\frac{\partial l_{γ} (θ)}{\partial θ} \frac{\partial l_{γ} (θ)}{\partial θ^{T}}]$. The asymptotic covariance matrix, $Σ_{γ}$, can then be estimated using the sandwich-type estimator:

\begin{matrix} {\hat{Σ}}_{γ} = {{\hat{G}}_{γ} ({\hat{θ}}_{γ})}^{- 1} {\hat{U}}_{γ} ({\hat{θ}}_{γ}) {{\hat{G}}_{γ} ({\hat{θ}}_{γ})}^{- 1}, \end{matrix}

(13)

where ${\hat{U}}_{γ} ({\hat{θ}}_{γ})$ and ${\hat{G}}_{γ} ({\hat{θ}}_{γ})$ can be estimated empirically. The detailed derivations are given in Appendix B.

3. Monte Carlo Simulation Experiments

3.1. Simulation Setup

To assess the performance of our approach, we conducted Monte Carlo simulation experiments, varying three key parameters: the proportion of outliers, $ε \in \{0.05, 0.1, 0.3\}$; the value of η in the distribution that generates the outliers (denoted as $η_{out}$); and the number of non-outlier samples, $N \in \{20, 50, 200\}$. In particular, the main body of the data, denoted by $f_{θ} (x)$ in Equation (6), is assumed to follow a three-parameter Weibull distribution with parameters $α = 0.111$, $σ = 3$, and $η = - 1$. Conversely, the outliers are drawn from the same three-parameter Weibull distribution except that $η_{out}$ takes values from the set $\{0, 1, 3\}$. Note that in principle, the distribution of outliers could be any form as long as it appears in the tail of the main distribution; thus, the Weibull assumption for the outliers is adopted here primarily for simplicity rather than necessity. Moreover, since the same pathogen is implicated in both secondary (Case 1) and tertiary or subsequent (Case 2) infections, applying the same distribution is justified. Overall, considering all combinations of N, ε, and $η_{out}$ yields a total of 27 scenarios (see Table 1). For each scenario, 1000 Monte Carlo simulations were performed.

Table 1. Settings in 27 Monte Carlo simulation scenarios.

For each scenario, the procedure to generate an individual dataset for the kth scenario, $y^{(k)} = (y_{1}^{(k)}, \dots, y_{N}^{(k)})$, is as follows. First, we randomly generate N samples from a three-parameter Weibull distribution with parameters $(α = 0.111, σ = 3, η = - 1)$. Next, we randomly generate $N ε$ samples from a three-parameter Weibull distribution with parameters $(α = 0.111, σ = 3, η^{(k)} \in \{0, 1, 3\})$ corresponding to the kth scenario.
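The data-generating step can be sketched with inverse-CDF sampling, which follows directly from Equation (2). The sketch below uses only the Python standard library. Several points are assumptions and not taken from the text: the function names, the rounding of $N ε$ to an integer, the random seed, and in particular the shape value passed as `beta`, since the text gives the second parameter as σ = 3 without stating how it maps onto β in Equation (1).

```python
import math
import random

def rweibull3(n, alpha, beta, eta, rng):
    """Draw n values by inverting Equation (2): y = eta + (-log(1 - u) / alpha) ** (1 / beta)."""
    return [eta + (-math.log(1.0 - rng.random()) / alpha) ** (1.0 / beta)
            for _ in range(n)]

def simulate_dataset(n, eps, eta_out, alpha, beta, eta, rng):
    """n draws from the target distribution followed by round(n * eps) shifted outliers."""
    main = rweibull3(n, alpha, beta, eta, rng)
    n_out = round(n * eps)  # assumption: the text does not state how N * eps is rounded
    outliers = rweibull3(n_out, alpha, beta, eta_out, rng)
    return main + outliers

# beta = 3.0 is an illustrative stand-in for the shape; see the note above.
rng = random.Random(1)
data = simulate_dataset(n=20, eps=0.1, eta_out=3.0, alpha=0.111, beta=3.0, eta=-1.0, rng=rng)
```

The outliers differ from the main sample only through their location, so every outlier lies above `eta_out` while the main sample starts at `eta`.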

We evaluated the performance in terms of the bias and mean squared error (MSE) of the estimated mean and 95% percentile values of the true distribution, which were calculated from Equation (4), and of $\hat{η}$. The comparison methods included estimates based on (1) ml: maximum likelihood method [18,19], (2) mm: method of moments [20], and (3) mps: the method of maximum product spacing [21,22]. These conventional methods are easily implemented in the R program [23] using the fitWeibull() function in the ForestFit package [24].

3.2. Simulation Results

The Monte Carlo simulation results show that the proposed method consistently outperforms the conventional approaches across most scenarios, providing less biased and more efficient estimates for both mean and 95% percentile of the true distribution (defined in Equation (6)) and $\hat{η}$. As shown in Table 2 and Table 3, our method achieves the smallest bias and MSE on average. Specifically, the bias in the mean under our approach achieved a 65% reduction compared with that of the conventional approaches; the bias for our approach ranges from 0.002 to 0.363 (mean 0.106) compared to ranges of 0.036 to 0.937 (mean 0.288) for ml, 0.036 to 0.926 (mean 0.285) for moment, and 0.047 to 1.182 (mean 0.361) for mps. The difference in performance between our approach and conventional approaches is even more pronounced in the estimation of the 95% percentile, achieving a 68% reduction compared with that of the conventional approaches; the bias for our approach ranges from 0.010 to 0.960 (mean 0.268) compared to ranges of 0.077 to 3.346 (mean 0.979) for ml, −0.340 to 1.903 (mean 0.316) for moment, and 0.137 to 4.365 (mean 1.238) for mps. Regarding the bias of $\hat{η}$, our approach also outperforms the conventional approaches, achieving nearly a 90% reduction (excluding the results of mps, which show extremely large biases): the bias for our approach ranges from −0.080 to 0.321 (mean 0.086) compared to ranges of −0.445 to 0.586 (mean 0.164) for ml, 1.458 to 1.734 (mean 1.550) for moment, and −4509.500 to 1.017 (mean −2532.864) for mps.
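The percentage reductions quoted above can be approximately reproduced from the reported mean biases. The short sketch below assumes that each reduction is measured against the simple average of the conventional methods' mean biases; this baseline is an assumption, since the text does not state how the percentages were computed, and `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(ours, others):
    """One minus the ratio of our mean bias to the average of the other methods' mean biases."""
    baseline = sum(others) / len(others)
    return 1.0 - ours / baseline

# Mean biases as reported in the text (ml, moment, mps).
mean_bias = relative_reduction(0.106, [0.288, 0.285, 0.361])
p95_bias = relative_reduction(0.268, [0.979, 0.316, 1.238])
# mps is excluded for the location parameter, as in the text.
eta_bias = relative_reduction(0.086, [0.164, 1.550])
```

Under this assumed baseline the three values come out close to the quoted figures of 65%, 68%, and nearly 90%.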

Table 2. Results of 27 Monte Carlo scenarios: bias.

Table 3. Results of 27 Monte Carlo scenarios: MSE.

In terms of MSE, our method also provides markedly smaller values in the estimation of the mean (a 70% reduction overall), ranging from 0.002 to 0.407 (mean 0.054) compared to ranges of 0.004 to 0.884 (mean 0.162) for ml, 0.004 to 0.874 (mean 0.159) for moment, and 0.005 to 1.422 (mean 0.230) for mps. Again, the performance gap widens further for the 95% percentile, achieving a 76% reduction in MSE compared with the conventional approaches; the MSE of our approach ranges from 0.013 to 3.175 (mean 0.418) compared to ranges of 0.019 to 11.315 (mean 1.882) for ml, 0.004 to 3.664 (mean 0.560) for moment, and 0.027 to 19.408 (mean 2.806) for mps. Regarding the MSE of $\hat{η}$, our approach also outperforms the conventional approaches, achieving an 88% reduction: the MSE for our approach ranges from 0.035 to 0.496 (mean 0.194), compared to ranges of 0.032 to 9.238 (mean 0.843) for ml, 2.134 to 3.008 (mean 2.421) for moment, and 0.401 to $10^{12}$ (mean $10^{13}$) for mps.

Overall, conventional methods tend to suffer in performance under small sample sizes, when the proportion of outliers is high, or when outliers are not concentrated in the tail of the target distribution (i.e., small $η_{out}$). In contrast, our method remains robust and yields stable estimates even under these challenging conditions. In addition, although all methods show improved performance (i.e., reduced bias and MSE) as N increases, our method continues to provide similar or superior performance compared to the conventional methods even at $N = 200$ and smaller outlier proportions, $ε = 0.05$ or 0.1 (Scenarios 3 and 6).

4. Application for Real-World Data: Epidemiological Surveys for COVID-19

This section applies both the proposed method and the comparison methods to contact tracing surveillance for COVID-19, aiming to identify infection sources and estimate the incubation period. Contact tracing surveillance refers to investigations primarily conducted by local public health authorities to prevent disease spread in communities where infections have occurred. The data and corresponding R code (https://www.r-project.org/, accessed on 17 February 2025) are available on the corresponding author’s GitHub page (https://github.com/, accessed on 17 February 2025).

We focus on a COVID-19 outbreak in Tianjin, China, from 21 January to 12 February 2020, as detailed by Wang and Teunis (2020) [7]. The dataset, consisting of 112 confirmed cases, highlights a mixture of secondary, tertiary, and subsequent infection cases with the transmission network inferred from symptom onset dates. Figure 1 displays the epidemic curve: the first exposure is designated as day 0, and the reported onset dates span day 1 to day 24. In the figure, 31 secondary infection cases are shown in gray, whereas 39 tertiary or subsequent infection cases are shown in black.

To estimate the exact exposure time, the tertiary and subsequent cases would ideally be excluded, but doing so requires extensive investigation, which is typically time consuming and expensive in the context of emerging infectious diseases. Consequently, using a “mixed” dataset without removing these cases is common in practice as is treating the tertiary and subsequent cases as outliers in the analysis.

Table 4 briefly summarizes a comparison of the estimated time of exposure to the infectious sources, $\hat{η}$, across our approach and the conventional methods. With the selected γ of 1, our method produces ${\hat{η}}^{our} = - 0.19$ (95% CI: −3.14, 2.76). We note that the estimated ${\hat{η}}^{our}$ is quite close to the actual time of exposure (day 0). In contrast, the conventional approaches produce estimates of η ranging from −0.49 to 2.20. Our method thus returns an estimate of η closer to the realistic exposure time. In terms of the distribution of the incubation period, our method provides a mean and 95% percentile of 3.96 and 9.77, respectively, whereas the conventional methods provide estimates ranging from 3.89 to 4.28 for the mean and from 6.96 to 10.63 for the 95% percentile.

Table 4. Results of real-world data analysis of COVID-19.

Our robust estimation method has practical implications for understanding the biological evolution of COVID-19, particularly regarding changes in incubation period distributions associated with different SARS-CoV-2 variants. Accurately estimating exposure times and incubation periods in contaminated datasets enables epidemiologists to better capture subtle shifts in viral characteristics, such as transmissibility and generation intervals, across successive infection generations. Such insights can be crucial when assessing how viral mutations or emerging variants alter epidemiological parameters, influencing the trajectory of outbreaks and informing timely public health responses.

5. Discussion

We have introduced a novel robust approach for estimating both the exposure time to infectious sources and the incubation period based on the γ-divergence approach for the Weibull distribution. This approach maintains robustness even under substantial contamination, which often arises when unexpected secondary or tertiary cases are captured in rapid epidemiological surveillance. A frequent challenge in robust estimation lies in developing an efficient algorithm, especially given the non-convex and non-differentiable nature of many robust objective functions. In this study, we devised a practical estimation method that guarantees a monotonic decrease in the objective function by leveraging the MM algorithm.

Although our analysis assumed a contaminated density of the form $g (x) = (1 - ε) f_{θ} (x) + ε δ (x)$, we note that this setup can be generalized to a more intricate mixture, $g (x) = (1 - \sum_{j = 1}^{k} ε_{j}) f_{θ} (x) + \sum_{j = 1}^{k} ε_{j} δ_{j} (x)$, using essentially the same framework. Numerical simulations and applications to real-world data consistently indicate that our method surpasses conventional estimators in terms of the bias and MSE of the estimated mean and 95% percentile of the true distribution.

A key remaining question in the realm of infectious disease surveillance involves the practical selection of the incubation period distribution. Currently, we assume that the incubation period follows the Weibull distribution. However, the accuracy of the estimation strongly depends on the validity of this assumed distribution. If the true distribution deviates from the assumed one, the estimates of exposure time may be biased or imprecise, resulting in an inaccurate estimation of the incubation period. Our robust approach can be extended to other types of distributions. Although the idea is straightforward, further efforts are required to derive a new MM algorithm and covariance matrices. This will be the subject of our future research.

Author Contributions

Conceptualization, D.Y., T.K. and Y.T.; methodology, D.Y., T.K. and Y.T.; validation, D.Y., T.K. and Y.T.; formal analysis, D.Y.; investigation, D.Y.; resources, D.Y. and S.N.; data curation, D.Y., S.N. and A.E.; writing—original draft preparation, D.Y.; writing—review and editing, D.Y., S.N. and A.E.; visualization, D.Y.; supervision, D.Y.; project administration, D.Y.; funding acquisition, D.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Japan Science Technology Agency PRESTO Grant (JPMJPR21RC) and KAKENHI Grant-in-Aid for Young Scientists (22K17859).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data and corresponding R code are available on the corresponding author’s GitHub page (https://github.com/kingqwert/R/tree/master/Robust_3ParWeibull/, accessed on 17 February 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Derivation of the Updating Rule in MM Algorithm: Equations (9)–(11)

The three stationarity equations of the majorization function, which are solved iteratively, are derived as follows. Differentiating $h (θ | θ^{(t)})$ with respect to each parameter, $α$, $β$ and $η$, and setting the derivatives to zero, we obtain

\begin{matrix}
\frac{\partial h (θ | θ^{(t)})}{\partial α} = - \sum_{i = 1}^{N} w_{i} \{\frac{1}{α} - {(y_{i} - η)}^{β}\} + \frac{γ}{α β (1 + γ)} = 0 \\
\frac{\partial h (θ | θ^{(t)})}{\partial β} = - \sum_{i = 1}^{N} w_{i} \{\frac{1}{β} + \log (y_{i} - η) - α {(y_{i} - η)}^{β} \log (y_{i} - η)\} \\
+ \frac{1}{1 + γ} \{- \frac{γ}{β^{2}} \log α + \frac{γ}{β} - \frac{γ}{β^{2}} \log (1 + γ) + \frac{γ}{β^{2}} ψ (1 + \frac{γ (β - 1)}{β})\} = 0 \\
\frac{\partial h (θ | θ^{(t)})}{\partial η} = \sum_{i = 1}^{N} w_{i} \{\frac{β - 1}{y_{i} - η} - α β {(y_{i} - η)}^{β - 1}\} = 0
\end{matrix}

Direct calculations give Equations (9)–(11).
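The derivatives above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the density $f (x | θ) = α β {(x - η)}^{β - 1} \exp \{- α {(x - η)}^{β}\}$ implied by the derivatives in Appendix B, assumes that $l_{γ} (θ)$ has the usual γ-cross-entropy form, and assumes weights $w_{i}$ proportional to $f {(y_{i} | θ^{(t)})}^{γ}$. The function names (`log_f`, `l_gamma`, `grad_h`, `numeric_grad`) and the parameter values are hypothetical. Because a majorizer touches the objective at the current iterate, the derivatives of $h$ evaluated at $θ = θ^{(t)}$ should agree with a finite-difference gradient of $l_{γ}$.

```python
import numpy as np
from scipy.special import digamma, gammaln


def log_f(y, alpha, beta, eta):
    """Log-density of the assumed three-parameter Weibull model."""
    t = y - eta
    return np.log(alpha) + np.log(beta) + (beta - 1) * np.log(t) - alpha * t**beta


def l_gamma(theta, y, gam):
    """Assumed objective: -1/gam * log(mean f^gam) + 1/(1+gam) * log(int f^(1+gam))."""
    alpha, beta, eta = theta
    c = 1.0 + gam - gam / beta  # argument of the gamma function in the integral
    log_int = (gam / beta) * np.log(alpha) + gam * np.log(beta) \
        - c * np.log1p(gam) + gammaln(c)
    log_mean = np.log(np.mean(np.exp(gam * log_f(y, alpha, beta, eta))))
    return -log_mean / gam + log_int / (1.0 + gam)


def grad_h(theta, y, gam):
    """Derivatives of the majorizer with respect to (alpha, beta, eta) at theta = theta^(t)."""
    alpha, beta, eta = theta
    t = y - eta
    w = np.exp(gam * log_f(y, alpha, beta, eta))
    w /= w.sum()  # assumed weights w_i, evaluated at the current iterate
    c = 1.0 + gam - gam / beta
    d_alpha = -np.sum(w * (1 / alpha - t**beta)) + gam / (alpha * beta * (1 + gam))
    d_beta = -np.sum(w * (1 / beta + np.log(t) - alpha * t**beta * np.log(t))) \
        + gam / (1 + gam) * (-np.log(alpha) / beta**2 + 1 / beta
                             - np.log1p(gam) / beta**2 + digamma(c) / beta**2)
    d_eta = np.sum(w * ((beta - 1) / t - alpha * beta * t**(beta - 1)))
    return np.array([d_alpha, d_beta, d_eta])


def numeric_grad(fun, theta, eps=1e-6):
    """Central finite-difference gradient."""
    theta = np.asarray(theta, dtype=float)
    g = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        g[k] = (fun(theta + step) - fun(theta - step)) / (2 * eps)
    return g


# Hypothetical data: 50 draws from the model with (alpha, beta, eta) = (0.5, 2, -1).
rng = np.random.default_rng(0)
theta_t = np.array([0.5, 2.0, -1.0])
y = theta_t[2] + rng.weibull(theta_t[1], size=50) / theta_t[0] ** (1 / theta_t[1])

analytic = grad_h(theta_t, y, gam=0.5)
numeric = numeric_grad(lambda th: l_gamma(th, y, 0.5), theta_t)
print(analytic, numeric)
```

A check of this kind is a cheap way to catch sign errors before the stationarity equations are turned into updating rules.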

Appendix B. Detailed Derivation of ${\hat{G}}_{γ} ({\hat{θ}}_{γ})$ and ${\hat{U}}_{γ} ({\hat{θ}}_{γ})$ in Equation (13)

To differentiate $l_{γ} (θ)$ with respect to each parameter, $α$, $β$ and $η$, we first compute the first- and second-order derivatives of $f {(x | θ)}^{γ}$:

\begin{matrix}
\frac{\partial f {(x | θ)}^{γ}}{\partial α} = - \frac{γ \{α {(x - η)}^{β} - 1\}}{α} f {(x | θ)}^{γ} \\
\frac{\partial f {(x | θ)}^{γ}}{\partial β} = - \frac{γ [β \log (x - η) \{α {(x - η)}^{β} - 1\} - 1]}{β} f {(x | θ)}^{γ} \\
\frac{\partial f {(x | θ)}^{γ}}{\partial η} = \frac{γ [β \{α {(x - η)}^{β} - 1\} + 1]}{x - η} f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial α^{2}} = \frac{γ [γ {\{α {(x - η)}^{β} - 1\}}^{2} - 1]}{α^{2}} f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial β^{2}} = \frac{γ}{β^{2}} [β^{2} \log^{2} (x - η) \{γ {(α {(x - η)}^{β} - 1)}^{2} - α {(x - η)}^{β}\} \\
- 2 γ β \log (x - η) \{α {(x - η)}^{β} - 1\} + γ - 1] f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial η^{2}} = \frac{γ [γ {\{β (α {(x - η)}^{β} - 1) + 1\}}^{2} - (β - 1) \{α β {(x - η)}^{β} + 1\}]}{{(x - η)}^{2}} f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial α \partial β} = \frac{γ}{α β} [- γ α {(x - η)}^{β} + β \log (x - η) \{γ {(α {(x - η)}^{β} - 1)}^{2} - α {(x - η)}^{β}\} \\
+ γ] f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial α \partial η} = - \frac{γ}{α (x - η)} [γ \{α {(x - η)}^{β} - 1\} \{β (α {(x - η)}^{β} - 1) + 1\} - α β {(x - η)}^{β}] f {(x | θ)}^{γ} \\
\frac{\partial^{2} f {(x | θ)}^{γ}}{\partial β \partial η} = - \frac{γ}{β (x - η)} [β - α β {(x - η)}^{β} + γ \{β - α β {(x - η)}^{β} - 1\} \\
+ β \log (x - η) \{γ (α {(x - η)}^{β} - 1) (- β + α β {(x - η)}^{β} + 1) - α β {(x - η)}^{β}\}] f {(x | θ)}^{γ}
\end{matrix}

Then, we obtain

\begin{matrix}
\frac{\partial l_{γ} (θ)}{\partial α} = \frac{1}{α} \frac{\sum_{i = 1}^{N} \{α {(y_{i} - η)}^{β} - 1\} f {(y_{i} | θ)}^{γ}}{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}} + \frac{γ}{α β (1 + γ)} \\
\frac{\partial l_{γ} (θ)}{\partial β} = \frac{1}{β} \frac{\sum_{i = 1}^{N} [β \log (y_{i} - η) \{α {(y_{i} - η)}^{β} - 1\} - 1] f {(y_{i} | θ)}^{γ}}{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}} \\
+ \frac{γ}{1 + γ} [- \frac{1}{β^{2}} \log α + \frac{1}{β} - \frac{1}{β^{2}} \log (1 + γ) + \frac{1}{β^{2}} ψ (1 + γ - \frac{γ}{β})] \\
\frac{\partial l_{γ} (θ)}{\partial η} = - \frac{1}{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}} \sum_{i = 1}^{N} \frac{1}{y_{i} - η} [β \{α {(y_{i} - η)}^{β} - 1\} + 1] f {(y_{i} | θ)}^{γ} \\
\frac{\partial^{2} l_{γ} (θ)}{\partial α^{2}} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial α^{2}} - {\{\sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial α}\}}^{2}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}} - \frac{γ}{α^{2} β (1 + γ)} \\
\frac{\partial^{2} l_{γ} (θ)}{\partial β^{2}} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial β^{2}} - {\{\sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial β}\}}^{2}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}} \\
+ \frac{γ}{β^{4} (1 + γ)} [β \{2 \log α - β + 2 \log (1 + γ)\} - 2 β ψ (γ + 1 - \frac{γ}{β}) + γ ψ^{(1)} (γ + 1 - \frac{γ}{β})] \\
\frac{\partial^{2} l_{γ} (θ)}{\partial η^{2}} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial η^{2}} - {\{\sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial η}\}}^{2}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}} \\
\frac{\partial^{2} l_{γ} (θ)}{\partial α \partial β} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial α \partial β} - \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial α} \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial β}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}} - \frac{γ}{α β^{2} (1 + γ)} \\
\frac{\partial^{2} l_{γ} (θ)}{\partial α \partial η} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial α \partial η} - \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial α} \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial η}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}} \\
\frac{\partial^{2} l_{γ} (θ)}{\partial β \partial η} = - \frac{1}{γ} \frac{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ} \sum_{i = 1}^{N} \frac{\partial^{2} f {(y_{i} | θ)}^{γ}}{\partial β \partial η} - \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial β} \sum_{i = 1}^{N} \frac{\partial f {(y_{i} | θ)}^{γ}}{\partial η}}{{\{\sum_{i = 1}^{N} f {(y_{i} | θ)}^{γ}\}}^{2}}
\end{matrix}

These derivatives give ${\hat{G}}_{γ} ({\hat{θ}}_{γ})$ and ${\hat{U}}_{γ} ({\hat{θ}}_{γ})$ in Equation (13).
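The terms in the derivatives of $l_{γ} (θ)$ that do not involve the data come from the normalizing integral of the model. As a brief worked derivation, assume the density $f (x | θ) = α β {(x - η)}^{β - 1} \exp \{- α {(x - η)}^{β}\}$ for $x > η$ (the form implied by the derivatives of $f {(x | θ)}^{γ}$ above), and assume that $l_{γ} (θ)$ contains the term $\frac{1}{1 + γ} \log \int f {(x | θ)}^{1 + γ} dx$. The substitution $u = α (1 + γ) {(x - η)}^{β}$ then reduces the integral to a gamma function, provided that $1 + γ - γ / β > 0$:

```latex
\begin{matrix}
\int_{η}^{\infty} f {(x | θ)}^{1 + γ} dx = {(α β)}^{1 + γ} \int_{η}^{\infty} {(x - η)}^{(β - 1) (1 + γ)} \exp \{- α (1 + γ) {(x - η)}^{β}\} dx \\
= α^{γ / β} β^{γ} {(1 + γ)}^{- (1 + γ - γ / β)} Γ (1 + γ - \frac{γ}{β}) \\
\log \int_{η}^{\infty} f {(x | θ)}^{1 + γ} dx = \frac{γ}{β} \log α + γ \log β - (1 + γ - \frac{γ}{β}) \log (1 + γ) + \log Γ (1 + γ - \frac{γ}{β})
\end{matrix}
```

Multiplying the last line by $\frac{1}{1 + γ}$ and differentiating with respect to $α$ gives $\frac{γ}{α β (1 + γ)}$. Differentiating with respect to $β$ gives the bracketed term containing the digamma function $ψ$, and a second differentiation introduces the trigamma function $ψ^{(1)}$. The integral does not depend on $η$, so the derivatives with respect to $η$ contain no such term.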

References

1. Lauer, S.A.; Grantz, K.H.; Bi, Q.; Jones, F.K.; Zheng, Q.; Meredith, H.R.; Azman, A.S.; Reich, N.G.; Lessler, J. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Ann. Intern. Med. 2020, 172, 577–582.
2. Reich, N.G.; Lessler, J.; Cummings, D.A.; Brookmeyer, R. Estimating incubation period distributions with coarse data. Stat. Med. 2009, 28, 2769–2784.
3. Armitage, P.; Colton, T. Encyclopedia of Biostatistics; John Wiley & Sons: Chichester, UK, 1999.
4. Virlogeux, V.; Li, M.; Tsang, T.K.; Feng, L.; Fang, V.J.; Jiang, H.; Wu, P.; Zheng, J.; Lau, E.H.; Cao, Y.; et al. Estimating the distribution of the incubation periods of human avian influenza A (H7N9) virus infections. Am. J. Epidemiol. 2015, 182, 723–729.
5. Xin, H.; Wong, J.Y.; Murphy, C.; Yeung, A.; Taslim Ali, S.; Wu, P.; Cowling, B.J. The incubation period distribution of coronavirus disease 2019: A systematic review and meta-analysis. Clin. Infect. Dis. 2021, 73, 2344–2352.
6. Nishiura, H. Early efforts in modeling the incubation period of infectious diseases with an acute course of illness. Emerg. Themes Epidemiol. 2007, 4, 2.
7. Wang, Y.; Teunis, P. Strongly heterogeneous transmission of COVID-19 in mainland China: Local and regional variation. Front. Med. 2020, 7, 329.
8. Cichocki, A.; Amari, S.I. Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities. Entropy 2010, 12, 1532–1568.
9. Yoneoka, D.; Kawashima, T.; Tanoue, Y.; Nomura, S.; Eguchi, A. Robust estimation of the incubation period and the time of exposure using γ-divergence. J. Appl. Stat. 2024, 1–19.
10. Okuno, A. Minimizing robust density power-based divergences for general parametric density models. Ann. Inst. Stat. Math. 2024, 76, 851–875.
11. Fujisawa, H.; Eguchi, S. Robust parameter estimation with a small bias against heavy contamination. J. Multivar. Anal. 2008, 99, 2053–2081.
12. Kanamori, T.; Fujisawa, H. Robust estimation under heavy contamination using unnormalized models. Biometrika 2015, 102, 559–572.
13. Sugasawa, S.; Yonekura, S. On Selection Criteria for the Tuning Parameter in Robust Divergence. Entropy 2021, 23, 1147.
14. Kawashima, T.; Fujisawa, H. Robust and sparse regression via γ-divergence. Entropy 2017, 19, 608.
15. Ghosh, A.; Basu, A. Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electron. J. Stat. 2013, 7, 2420–2456.
16. Ghosh, A.; Basu, A. Robust Bayes estimation using the density power divergence. Ann. Inst. Stat. Math. 2016, 68, 413–417.
17. Fujisawa, H. Normalized estimating equation for robust parameter estimation. Electron. J. Stat. 2013, 7, 1587–1606.
18. Mahdi Teimouri, S.M.H.; Nadarajah, S. Comparison of estimation methods for the Weibull distribution. Statistics 2013, 47, 93–109.
19. Cohen, C.; Whitten, B.J. Modified maximum likelihood and modified moment estimators for the three-parameter Weibull distribution. Commun. Stat.-Theory Methods 1982, 11, 2631–2656.
20. Cran, G. Moment estimators for the 3-parameter Weibull distribution. IEEE Trans. Reliab. 1988, 37, 360–363.
21. Almetwally, E.M.; Almongy, H.M.; Rastogi, M.K.; Ibrahim, M. Maximum Product Spacing Estimation of Weibull Distribution Under Adaptive Type-II Progressive Censoring Schemes. Ann. Data Sci. 2020, 7, 257–279.
22. Almetwally, E.M.; Almongy, H.M. Maximum Product Spacing and Bayesian Method for Parameter Estimation for Generalized Power Weibull Distribution Under Censoring Scheme. J. Data Sci. 2022, 17, 407–444.
23. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.
24. Teimouri, M.; Doser, J.W.; Finley, A.O. ForestFit: An R package for modeling plant size distributions. Environ. Model. Softw. 2020, 131, 104668.

Figure 1. Epicurve of COVID-19 in China [7].

Figure 2. Probability density functions of the three-parameter Weibull distribution when $η = - 1$.

Table 1. Settings in 27 Monte Carlo simulation scenarios.

Scenario	Sample size N	$ε$	$η^{(k)}$
1	20	0.05	0
2	50	0.05	0
3	200	0.05	0
4	20	0.1	0
5	50	0.1	0
6	200	0.1	0
7	20	0.3	0
8	50	0.3	0
9	200	0.3	0
10	20	0.05	1
11	50	0.05	1
12	200	0.05	1
13	20	0.1	1
14	50	0.1	1
15	200	0.1	1
16	20	0.3	1
17	50	0.3	1
18	200	0.3	1
19	20	0.05	3
20	50	0.05	3
21	200	0.05	3
22	20	0.1	3
23	50	0.1	3
24	200	0.1	3
25	20	0.3	3
26	50	0.3	3
27	200	0.3	3
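The 27 rows of Table 1 form a full 3 × 3 × 3 grid over the sample size, $ε$ and $η^{(k)}$. As a small sketch (the variable names are hypothetical and the code is not taken from the authors' repository), the grid can be enumerated in the same order as the table, with the sample size varying fastest and $η^{(k)}$ slowest:

```python
from itertools import product

sample_sizes = [20, 50, 200]       # column N in Table 1
epsilons = [0.05, 0.1, 0.3]        # column epsilon in Table 1
eta_k_values = [0, 1, 3]           # column eta^(k) in Table 1

# itertools.product varies its last argument fastest, which matches the table order.
scenarios = [
    {"scenario": i, "N": n, "epsilon": eps, "eta_k": ek}
    for i, (ek, eps, n) in enumerate(
        product(eta_k_values, epsilons, sample_sizes), start=1
    )
]

for row in scenarios[:3]:
    print(row)
```

Indexing `scenarios[k - 1]` then returns the settings of scenario `k`, which is convenient when looping a simulation over all scenarios.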

Table 2. Results of 27 Monte Carlo scenarios: bias.

Scenario	Mean (True = 0.858)				95% Percentile (True = 2.000)				$η$ (True = −1)
Scenario	Our	ml	Moment	mps	Our	ml	Moment	mps	Our	ml	Moment	mps
1	0.039	0.046	0.047	0.122	0.064	0.089	−0.327	0.346	−0.080	−0.445	1.467	−11,091.270
2	0.035	0.036	0.036	0.074	0.080	0.077	−0.340	0.190	−0.013	−0.070	1.458	−0.119
3	0.042	0.045	0.046	0.047	0.095	0.107	−0.320	0.137	0.025	0.008	1.464	0.009
4	0.074	0.084	0.085	0.160	0.144	0.181	−0.259	0.443	0.000	−0.158	1.491	−44,509.500
5	0.086	0.091	0.090	0.128	0.187	0.196	−0.248	0.314	0.050	0.017	1.495	0.217
6	0.088	0.094	0.094	0.096	0.187	0.207	−0.241	0.239	0.056	0.029	1.498	0.051
7	0.223	0.231	0.231	0.297	0.421	0.441	−0.042	0.675	0.011	−0.196	1.597	−5137.573
8	0.223	0.227	0.227	0.261	0.451	0.453	−0.037	0.566	0.073	0.053	1.590	0.260
9	0.226	0.231	0.231	0.234	0.450	0.458	−0.033	0.490	0.080	0.042	1.595	0.056
10	0.042	0.086	0.085	0.200	0.129	0.331	−0.167	0.698	0.090	0.087	1.470	−2361.782
11	0.046	0.080	0.081	0.145	0.146	0.301	−0.187	0.475	0.137	0.120	1.475	−4377.724
12	0.047	0.094	0.094	0.099	0.156	0.366	−0.142	0.420	0.132	0.104	1.479	0.198
13	0.098	0.181	0.180	0.314	0.287	0.623	0.045	1.053	0.161	0.252	1.511	−919.006
14	0.105	0.188	0.187	0.266	0.325	0.638	0.059	0.851	0.220	0.227	1.521	0.664
15	0.092	0.183	0.182	0.188	0.281	0.628	0.053	0.683	0.191	0.146	1.519	0.229
16	0.363	0.467	0.465	0.587	0.960	1.291	0.537	1.722	0.297	0.355	1.673	0.745
17	0.334	0.465	0.462	0.534	0.926	1.282	0.539	1.500	0.321	0.282	1.672	0.643
18	0.243	0.464	0.461	0.469	0.672	1.268	0.539	1.315	0.264	0.185	1.670	0.222
19	0.022	0.193	0.189	0.387	0.051	0.976	0.346	1.595	0.027	0.448	1.482	0.974
20	0.008	0.161	0.157	0.266	0.014	0.863	0.252	1.124	0.006	0.314	1.491	0.809
21	0.002	0.196	0.190	0.206	0.011	1.012	0.361	1.088	0.005	0.214	1.497	0.311
22	0.042	0.374	0.367	0.605	0.129	1.637	0.832	2.453	0.015	0.528	1.510	1.017
23	0.011	0.377	0.367	0.498	0.020	1.635	0.830	1.988	−0.009	0.373	1.521	0.836
24	0.003	0.376	0.363	0.386	0.016	1.624	0.824	1.704	0.023	0.250	1.524	0.320
25	0.329	0.931	0.926	1.182	0.894	3.346	1.903	4.365	0.164	0.586	1.716	1.002
26	0.047	0.929	0.920	1.052	0.118	3.236	1.879	3.724	0.054	0.414	1.727	0.779
27	0.003	0.937	0.923	0.946	0.010	3.177	1.880	3.258	0.018	0.270	1.734	0.304

Table 3. Results of 27 Monte Carlo scenarios: MSE.

Scenario	Mean (True = 0.858)				95% Percentile (True = 2.000)				$η$ (True = −1)
Scenario	Our	ml	Moment	mps	Our	ml	Moment	mps	Our	ml	Moment	mps
1	0.03	0.02	0.02	0.04	0.13	0.08	0.15	0.25	0.496	9.238	2.175	6,626,303,000.000
2	0.01	0.01	0.01	0.02	0.06	0.04	0.13	0.08	0.248	0.385	2.134	55.854
3	0.00	0.00	0.00	0.00	0.02	0.02	0.11	0.03	0.037	0.038	2.145	0.049
4	0.03	0.03	0.03	0.05	0.15	0.11	0.11	0.34	0.439	2.822	2.247	1,051,845,000,000.000
5	0.02	0.02	0.02	0.03	0.08	0.07	0.08	0.14	0.191	0.281	2.246	0.858
6	0.01	0.01	0.01	0.01	0.04	0.05	0.06	0.07	0.038	0.038	2.245	0.050
7	0.07	0.07	0.07	0.11	0.30	0.27	0.04	0.57	0.461	4.174	2.569	4,799,133,000.000
8	0.06	0.06	0.06	0.08	0.26	0.23	0.02	0.36	0.155	0.183	2.536	0.428
9	0.05	0.06	0.06	0.06	0.21	0.22	0.00	0.25	0.035	0.032	2.545	0.041
10	0.03	0.03	0.03	0.06	0.18	0.20	0.07	0.67	0.369	2.160	2.185	3,927,284,000.000
11	0.01	0.02	0.02	0.03	0.08	0.12	0.05	0.27	0.126	0.172	2.186	19,169,410,000.000
12	0.01	0.01	0.01	0.01	0.04	0.14	0.02	0.19	0.042	0.039	2.188	0.072
13	0.05	0.05	0.05	0.12	0.32	0.48	0.05	1.29	0.276	0.464	2.306	845,950,400.000
14	0.03	0.04	0.04	0.08	0.20	0.44	0.02	0.78	0.128	0.148	2.321	0.589
15	0.01	0.04	0.04	0.04	0.11	0.40	0.01	0.48	0.055	0.045	2.310	0.083
16	0.22	0.24	0.24	0.37	1.65	1.76	0.33	3.14	0.253	0.354	2.822	0.803
17	0.16	0.22	0.22	0.30	1.21	1.68	0.31	2.30	0.171	0.159	2.803	0.551
18	0.08	0.22	0.21	0.22	0.65	1.62	0.29	1.74	0.086	0.054	2.792	0.079
19	0.03	0.06	0.06	0.17	0.18	1.03	0.17	2.73	0.335	0.341	2.219	1.004
20	0.01	0.03	0.03	0.08	0.07	0.77	0.08	1.31	0.169	0.162	2.232	0.725
21	0.00	0.04	0.04	0.05	0.01	1.03	0.14	1.19	0.052	0.065	2.243	0.124
22	0.04	0.16	0.16	0.39	1.58	2.77	0.74	6.26	0.336	0.389	2.302	1.079
23	0.01	0.15	0.14	0.26	0.07	2.71	0.71	4.01	0.178	0.192	2.321	0.762
24	0.00	0.14	0.13	0.15	0.01	2.65	0.68	2.91	0.047	0.079	2.324	0.132
25	0.41	0.88	0.87	1.42	3.18	11.32	3.66	19.41	0.282	0.437	2.967	1.056
26	0.05	0.87	0.85	1.15	0.48	10.51	3.55	14.40	0.170	0.217	2.989	0.690
27	0.00	0.88	0.85	0.90	0.02	10.10	3.54	10.63	0.056	0.088	3.008	0.118
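Tables 2 and 3 summarize the Monte Carlo replicates by bias and MSE. The sketch below assumes the standard definitions (bias as the average of estimate minus true value, MSE as the average squared error). The function name `bias_and_mse` and the replicate values are hypothetical and serve only as an illustration. It also shows how a single divergent replicate can dominate both summaries, which is one possible reading of the very large mps entries for $η$ in some small-sample scenarios, although the tables themselves do not state the cause.

```python
import numpy as np


def bias_and_mse(estimates, true_value):
    """Return (bias, MSE) of Monte Carlo estimates around a known true value."""
    err = np.asarray(estimates, dtype=float) - true_value
    return err.mean(), np.mean(err**2)


# Hypothetical replicates of an estimate of eta when the true value is -1.
est = np.array([-1.1, -0.9, -1.0, -1.2, -0.8])
bias, mse = bias_and_mse(est, -1.0)

# The same replicates plus one divergent estimate.
est_bad = np.append(est, -1000.0)
bias_bad, mse_bad = bias_and_mse(est_bad, -1.0)

print(bias, mse)
print(bias_bad, mse_bad)
```

Because both summaries are averages, they are not robust to a few failed fits, so the median error or the number of non-converged replicates can be a useful companion summary.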

Table 4. Results of real-world data analysis of COVID-19.

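Quantity	Category or method	COVID-19
Susceptive exposure day		0
Sample size	N	70
Sample size	Secondary infection	31
Sample size	Tertiary and beyond infection	39
$η$	Our	−0.19
$η$	ml	−0.49
$η$	moment	2.20
$η$	mps	0.70
Estimated incubation period: mean	Our	3.96
Estimated incubation period: mean	ml	3.90
Estimated incubation period: mean	moment	3.89
Estimated incubation period: mean	mps	4.28
Estimated incubation period: 95% percentile	Our	9.77
Estimated incubation period: 95% percentile	ml	8.95
Estimated incubation period: 95% percentile	moment	6.96
Estimated incubation period: 95% percentile	mps	10.63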
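The mean and 95% percentile reported above are functions of the fitted Weibull parameters. As a minimal sketch, assuming the parametrization $f (x | θ) = α β {(x - η)}^{β - 1} \exp \{- α {(x - η)}^{β}\}$ implied by Appendix B, both have closed forms. The function names are hypothetical. Passing `eta=0.0` gives the quantities for the time elapsed since exposure, and passing the fitted $η$ gives them on the calendar scale of the observations. Which of the two the tables report is not asserted here.

```python
import math


def weibull3_mean(alpha, beta, eta):
    """Mean: eta + alpha^(-1/beta) * Gamma(1 + 1/beta)."""
    return eta + alpha ** (-1.0 / beta) * math.gamma(1.0 + 1.0 / beta)


def weibull3_quantile(p, alpha, beta, eta):
    """p-th quantile: eta + (-log(1 - p) / alpha)^(1/beta), for 0 < p < 1."""
    return eta + (-math.log1p(-p) / alpha) ** (1.0 / beta)


def weibull3_cdf(x, alpha, beta, eta):
    """Distribution function, used here only to check the quantile formula."""
    if x <= eta:
        return 0.0
    return 1.0 - math.exp(-alpha * (x - eta) ** beta)


# Hypothetical parameter values, for illustration only.
alpha, beta, eta = 0.5, 2.0, -1.0
q95 = weibull3_quantile(0.95, alpha, beta, eta)
print(weibull3_mean(alpha, beta, eta), q95, weibull3_cdf(q95, alpha, beta, eta))
```

With $α = β = 1$ and $η = 0$ the model reduces to the unit exponential distribution, for which the mean is 1 and the 95% percentile is $\log 20$, which gives a quick sanity check of both formulas.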

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
