Since 8 December 2019, clusters of pneumonia cases of unknown etiology have emerged in Wuhan City, Hubei Province, China [1
]. Virological investigation suggests that the causative agent of this pneumonia is a novel coronavirus (COVID-19) [3
]. As of 27 January 2020, a total of 4515 cases including 106 deaths were confirmed [4
]. Forty-one cases of COVID-19 infections were also reported outside China, in other Asian countries, the United States, France, Australia, and Canada.
A local market selling seafood and wildlife in Wuhan was visited by many cases in the initial cluster, indicating that a common-source zoonotic exposure may have been the main mode of transmission [5
]. However, even after shutting down the market, the number of cases continued to grow across China and several instances of household transmission were reported [6
]. It is now speculated that sustained human-to-human transmission aided in the establishment of the epidemic [7
] and that reported case counts greatly underestimated the actual number of infections in China [8].
Early assessment of the severity of infection and transmissibility can help quantify the pandemic potential of COVID-19 and anticipate the likely number of deaths by the end of the epidemic. One important epidemiological measure of severity is case fatality risk (CFR), which can be measured using three different approaches, by estimating (i) the proportion of the cumulative number of deaths out of the cumulative number of cases at a point in time, (ii) the ratio of the cumulative number of deaths to the cumulative number of infected individuals whose clinical outcome is known (i.e., the deceased or recovered), and (iii) the risk of death among confirmed cases, explicitly accounting for the time from illness onset to death [9
]. Estimating the CFR as the ratio of deaths to confirmed cases (cCFR), adjusted for the time delay from illness onset to death (i.e., method (iii)), can provide insight into the severity of the disease, because the naïve CFR based on method (i) tends to underestimate the risk while the epidemic is still growing: many recently confirmed cases have not yet had time to die. For example, during the early stage of an epidemic, failing to account for right-censoring of cases with respect to the time delay from illness onset to death may lead to underestimation of the CFR, because death due to infection may yet occur following case identification [9].
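The three approaches can be contrasted with a minimal sketch. The case and death counts are those reported above for 27 January 2020; the number of recovered cases and the value of `u` (the fraction of cases whose outcome would already be observable) are hypothetical placeholders, and method (iii) is shown here only as a simple ratio, not as the likelihood-based estimator used in the study.

```python
def naive_cfr(cum_deaths, cum_cases):
    """Method (i): cumulative deaths over cumulative cases."""
    return cum_deaths / cum_cases


def outcome_based_cfr(cum_deaths, cum_recovered):
    """Method (ii): deaths over cases whose outcome is known."""
    return cum_deaths / (cum_deaths + cum_recovered)


def delay_adjusted_cfr(cum_deaths, cum_cases, u):
    """Method (iii), simplified: scale the denominator by u, the assumed
    fraction of cases for which the onset-to-death delay has elapsed."""
    return cum_deaths / (u * cum_cases)


deaths, cases = 106, 4515   # reported counts as of 27 January 2020
recovered = 60              # hypothetical, for illustration only
u = 0.4                     # hypothetical, for illustration only

print(f"naive:          {naive_cfr(deaths, cases):.3f}")
print(f"outcome-based:  {outcome_based_cfr(deaths, recovered):.3f}")
print(f"delay-adjusted: {delay_adjusted_cfr(deaths, cases, u):.3f}")
```

Because `u` is below 1 while the epidemic is growing, the delay-adjusted value always exceeds the naive one, which is the direction of bias described above.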
When the CFR denominator only includes confirmed cases it is referred to as the cCFR [10
] and may overestimate the actual CFR among all infected individuals due to under-ascertainment of infections in the population. Nonetheless, the cCFR is still a valuable measure of the upper bound of the CFR among all symptomatic cases (sCFR), particularly in circumstances of high uncertainty, such as the emergence of a new human pathogen (i.e., COVID-19).
The basic reproduction number (R0), the average number of secondary cases generated by a single primary case in a fully susceptible population, represents an epidemiological measurement of the transmissibility, helping us to quantify the pandemic potential of COVID-19. Here, we define a pandemic as the worldwide spread of a newly emerged disease, in which the number of simultaneously infected individuals exceeds the capacity for treatment [12
]. Using the growth rate of the estimated cumulative incidence from exportation cases and accounting for the time delay from illness onset to death, the present study aims to estimate the cCFR of COVID-19 in real-time.
A,B show the mean and SD of the time from illness onset to reporting and death, respectively. Employing the gamma distribution, the mean time from illness onset to reporting was estimated at 7.1 days (95% confidence interval [CI]: 5.9, 8.4). The mean time from illness onset to death—adopted from a previous study [14
]—was estimated at 20.2 days (95% CI: 15.1, 29.5), which led to a lognormal distribution with a location parameter of 2.84 and a scale parameter of 0.52.
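As a quick check on the lognormal parameters, the following sketch converts the location and scale values quoted above into a mean, median, and SD using the standard lognormal identities. With the rounded parameters the implied mean is about 19.6 days, close to but not exactly the reported 20.2 days, presumably because the published parameters are rounded.

```python
import math

mu, sigma = 2.84, 0.52  # location and scale quoted in the text

# Standard identities for a lognormal distribution
mean = math.exp(mu + sigma ** 2 / 2)
median = math.exp(mu)
sd = mean * math.sqrt(math.exp(sigma ** 2) - 1)

print(f"mean   = {mean:.1f} days")
print(f"median = {median:.1f} days")
print(f"SD     = {sd:.1f} days")
```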
Subsequently, the cumulative incidence was estimated from exported case data by fitting an exponentially growing incidence curve for both Scenarios 1 and 2 (Figure 2
). As of 24 January 2020, 20 exported cases were reported, and the cumulative incidence in China was estimated at 6924 cases (95% CI: 4885, 9211) in Scenario 1 and 19,289 cases (95% CI: 10,901, 30,158) in Scenario 2. Table 1
shows the real-time update of the estimated cumulative incidence. The exponential growth rates (r), derived from the growth rate of cumulative incidence, were estimated at 0.15 per day (95% CI: 0.14, 0.15) and 0.29 per day (95% CI: 0.22, 0.36) in Scenarios 1 and 2, respectively.
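To give these growth rates a more intuitive scale, they can be converted to doubling times with the identity ln(2)/r. This conversion is not part of the original analysis; it is a small illustrative sketch using the point estimates quoted above.

```python
import math


def doubling_time(r):
    """Doubling time (days) for exponential growth at rate r per day."""
    return math.log(2) / r


for label, r in (("Scenario 1", 0.15), ("Scenario 2", 0.29)):
    print(f"{label}: r = {r:.2f}/day, doubling time = {doubling_time(r):.1f} days")
```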
C,D show the estimated cCFR value accounting for the time delay from illness onset to death under Scenarios 1 and 2, respectively. With a total of 41 confirmed deaths reported as of 24 January 2020, the cCFR was estimated at 5.3% (95% CI: 3.5%, 7.5%) for Scenario 1 and 8.4% (95% CI: 5.3%, 12.3%) for Scenario 2.
We estimated the basic reproduction number for COVID-19 using the estimated exponential growth rate (r) and accounting for possible variations in the mean serial interval (Figure 3
). Assuming that the mean serial interval was 7.5 days [2
], the basic reproduction number was estimated at 2.10 (95% CI: 2.04, 2.16) and 3.19 (95% CI: 2.66, 3.69) for Scenarios 1 and 2, respectively. However, as the mean serial interval varies, the estimates can range from 1.6 to 2.6 and 2.2 to 4.2 for Scenarios 1 and 2, respectively.
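The dependence of the basic reproduction number on the serial interval can be sketched as follows. This section does not state the formula used, so the relation R0 = 1 + r × (mean serial interval) is an assumption; it approximately reproduces the reported point estimates when the rounded growth rates are plugged in. The lower and upper serial intervals of 4 and 11 days are likewise hypothetical values chosen for illustration.

```python
def r0_from_growth(r, serial_interval):
    """Assumed relation: R0 = 1 + r * T, with T the mean serial interval."""
    return 1.0 + r * serial_interval


for label, r in (("Scenario 1", 0.15), ("Scenario 2", 0.29)):
    base = r0_from_growth(r, 7.5)    # serial interval assumed in the text
    low = r0_from_growth(r, 4.0)     # hypothetical shorter serial interval
    high = r0_from_growth(r, 11.0)   # hypothetical longer serial interval
    print(f"{label}: R0 = {base:.2f} (range {low:.2f} to {high:.2f})")
```

Small differences from the published values are expected because the growth rates used here are rounded to two decimals.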
To address the uncertainty in the unobserved date of illness onset of the index case in Scenario 1, cCFR was estimated by varying the starting date of the exponential growth in the incidence by placing the single index case between 1 and 10 December 2019 (Figure S1 and Table S1
). When we assumed that the date of illness onset of the index case was 1 December 2019, the cumulative incidence in China and the cCFR on 24 January 2020 were estimated at 4718 (95% CI: 3328, 6278) and 5.3% (95% CI: 3.5, 7.6), respectively. We also conducted sensitivity analyses with cutoff dates varied between 15 and 24 January 2020. Across cutoff dates, the estimates of the cumulative incidence and cCFR were similar in Scenario 1 (Figure S2
) but slightly decreased in Scenario 2, when the cutoff date was earlier (Figure S3
). In addition, considering the uncertainty in two fixed parameters (i.e., the detection time window and the catchment population of Wuhan airport), sensitivity was assessed by varying those parameters. As the catchment population increases and the detection time window decreases, the estimated cCFR on 24 January 2020 becomes smaller because the estimated cumulative incidence in China increases (Tables S2 and S3).
The present study estimated the risk of death among confirmed cases while addressing ascertainment bias by using data from cases diagnosed outside mainland China and a right-censored likelihood for modeling the count of deceased cases. The estimated cCFR value was 5.3% (95% CI: 3.5, 7.5) when the date of illness onset for the index case was fixed a priori
at 8 December 2019 (Scenario 1), and 8.4% (95% CI: 5.3%, 12.3%) when the timing of the exponential growth of the epidemic was fitted to data alongside other model parameters (Scenario 2). The estimated value of u in Scenario 2, which adjusts for the time delay from illness onset to death, was smaller than that in Scenario 1 due to the larger growth rate of incidence, while the estimated cCFR value as of 24 January 2020 was larger in Scenario 2 than in Scenario 1. Depending on the available data, the estimate of the cCFR (i.e., the delay-adjusted risk of death among confirmed cases) may vary in time and show some fluctuations. In addition, we estimated the basic reproduction number R0 to be in the range of 1.6–2.6 for Scenario 1 and 2.2–4.2 for Scenario 2. From either estimate, we conclude that COVID-19 has substantial potential to spread via human-to-human transmission. However, R0 > 1 does not guarantee that a single exported (and untraced) case would immediately lead to a major epidemic in the destination country, as government responses such as border control, isolation of suspected cases, and intensive surveillance should serve to reduce opportunities for transmission [17].
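The statement that a larger growth rate yields a smaller delay-adjustment factor u can be illustrated numerically. The sketch below assumes a daily incidence proportional to exp(r t), a 47-day horizon (8 December 2019 to 24 January 2020) for both scenarios, and the lognormal onset-to-death delay with location 2.84 and scale 0.52 quoted earlier. It is a simplification: in the study the start of growth in Scenario 2 was fitted rather than fixed, and the function name `outcome_fraction` is introduced here only for illustration.

```python
import math


def lognormal_cdf(x, mu=2.84, sigma=0.52):
    """CDF of the onset-to-death delay (lognormal), evaluated at x days."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def outcome_fraction(r, days):
    """Fraction of cumulative cases by `days` whose onset-to-death delay
    has already elapsed, assuming daily incidence proportional to exp(r*t)."""
    incidence = [math.exp(r * t) for t in range(days + 1)]
    resolved = sum(c * lognormal_cdf(days - t) for t, c in enumerate(incidence))
    return resolved / sum(incidence)


u1 = outcome_fraction(0.15, 47)  # growth rate of Scenario 1
u2 = outcome_fraction(0.29, 47)  # growth rate of Scenario 2
print(f"u (r = 0.15) = {u1:.3f}")
print(f"u (r = 0.29) = {u2:.3f}")
```

With faster growth, a larger share of all cases is recent, so fewer of them have had time to reach a fatal outcome and the factor shrinks.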
Our cCFR estimates of 5.3% and 8.4% indicate that the severity of COVID-19 is not as high as that of other diseases caused by coronaviruses, including severe acute respiratory syndrome (SARS), which had an estimated CFR of 17% in Hong Kong [9
], and Middle East respiratory syndrome, which had an estimated CFR of 20% in South Korea [21
]. Nonetheless, considering the overall magnitude of the ongoing epidemic, a 5%–8% risk of death is by no means insignificant. In addition to quantifying the overall risk of death, future research must identify groups at risk of death (e.g., the elderly and people with underlying comorbidities) [22
]. Moreover, considering that about 9% of all infected individuals are ascertained and reported [24
], the infection fatality risk (IFR), i.e., the risk of death among all infected individuals, would be on the order of 0.5% to 0.8%.
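The order-of-magnitude IFR quoted above follows from multiplying each cCFR estimate by the assumed ascertainment fraction of about 9%. A minimal sketch of that arithmetic, using only the point estimates from the text:

```python
ascertainment = 0.09  # about 9% of infections ascertained and reported

for label, ccfr in (("Scenario 1", 0.053), ("Scenario 2", 0.084)):
    ifr = ascertainment * ccfr
    print(f"{label}: cCFR = {ccfr:.1%}, implied IFR = {ifr:.2%}")
```

This treats all deaths as occurring among ascertained cases, which is the implicit assumption behind the simple product.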
The R0 range of 1.6–4.2 for COVID-19 is consistent with other preliminary estimates posted on public domains [25], and is comparable to the R0 of SARS, which was in the range of 2–5 during the 2003 outbreak in Singapore [15]. Of our two estimates, Scenario 2 yielded the greater value, and ascertainment improved steadily in early January 2020. The virus was identified and sequenced on 7 January 2020 and the primer was subsequently widely distributed, allowing for rapid laboratory identification of cases and contributing to a time-dependent increase in the number of confirmed cases outside China. Consequently, Scenario 2, which was fully dependent on the growth rate of exported cases, could have overestimated the intrinsic growth rate of cases. Considering the estimated value of R0, the possibility of presymptomatic transmission in the ongoing epidemic is a critical question, as it would have a substantial impact on the public health response to the epidemic (e.g., whether contact tracing should be prioritized) as well as the overall predictability of the epidemic during the containment stage [29].
From the technical side, it should be emphasized that our proposed approach can be especially useful during the early stage of an epidemic when local surveillance is affected by substantial ascertainment bias and export and death data are available and better ascertained. Nonetheless, caution must be used when implementing similar estimations for the COVID-19 epidemic, as all flights from Wuhan airport were grounded as of 23 January 2020 [13
] and this intervention abruptly changed the human migration network. Despite the decrease in the outbound flow of travelers from Wuhan, there is a substantial risk that the next epidemic wave will originate from other cities.
There are five main limitations in the present study. First, our results present an estimate of the cCFR, which only addresses fatality among confirmed cases. More precise IFR estimates that include infected individuals other than confirmed cases can only be obtained using additional data (e.g., seroepidemiological data or outpatient clinic visits). It should be noted that not only the denominator but also the numerator could be better estimated (e.g., through excess mortality estimates). Second, our study relied on limited empirical data extracted from publicly available data sources; future studies with greater sample size and precision are therefore needed. Nonetheless, we believe that this study will improve the situational assessment of the ongoing epidemic. Third, our assumed date of illness onset for the index case in Scenario 1 is based on initial reports of the earliest onset date for a case, and the continued exponential growth with rate r is the authors' extrapolation. However, we conducted a sensitivity analysis and confirmed that the resulting statistical estimates would not vary greatly from our main results. Fourth, there is uncertainty in the detection time window T. Since epidemiological investigations are being actively implemented outside China, we believe that the sum of the incubation period and the infectious period is a plausible estimate of the detection time window. Fifth, heterogeneity in the risk of death (e.g., by age and risk group) needs to be addressed in future studies.
In conclusion, the present study has estimated cCFR to be on the order of 5%–8% and R0 to be 1.6–4.2, endorsing the notion that COVID-19 infection in the ongoing epidemic possesses the potential to become a pandemic. The proposed approach can also help direct risk assessment in other settings with the use of publicly available datasets.