COVID-19 Transmission: Bangladesh Perspective

: The sudden emergence of the COVID-19 pandemic has tested the strength of the public health system of the most developed nations and created a “new normal”. Many nations are struggling to curb the epidemic in spite of expanding testing facilities. In this study, we consider the case of Bangladesh, and ﬁt a simple compartmental model holding a feature to distinguish between identiﬁed infected and infectious with time series data using least square ﬁtting as well as the likelihood approach; prior to which, dynamics of the model were analyzed mathematically and the identiﬁability of the parameters has also been conﬁrmed. The performance of the likelihood approach was found to be more promising and was used for further analysis. We performed ﬁtting for different lengths of time intervals starting from the beginning of the outbreak, and examined the evolution of the key parameters from Bangladesh’s perspective. In addition, we deduced proﬁle likelihood and 95% conﬁdence interval for each of the estimated parameters. Our study demonstrates that the parameters deﬁning the infectious and quarantine rates change with time as a consequence of the change in lock-down strategies and expansion of testing facilities. As a result, the value of the basic reproduction number R 0 was shown to be between 1.5 and 12. The analysis reveals that the projected time and amplitude of the peak vary following the change in infectious and quarantine rates obtained through different lock-down strategies and expansion of testing facilities. The identiﬁcation rate determines whether the observed peak shows the true prevalence. We ﬁnd that by restricting the spread through quick identiﬁcation and quarantine, or by implementing lock-down to reduce overall contact rate, the peak could be delayed, and the amplitude of the peak could be reduced. Another novelty of this study is that the model presented here can infer the unidentiﬁed COVID cases besides estimating the ofﬁcially conﬁrmed COVID cases.


Introduction
The disease COVID-19 is caused by SARS-CoV-2, which is a strain of the virus that causes severe acute respiratory syndrome (SARS) [1]. The other prominent members of this family are severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). Although SARS-CoV-2 seems to be associated with milder infections and low fatality rate compared to SARS and MERS [2], it has caused far greater morbidity and mortality due to its can keep an account of the unidentified infected who are mostly responsible for the transmission. All individuals are divided into five non-intersecting subclasses: Susceptible(S(t)), Exposed(E(t)), Infectious(I(t)-not officially confirmed), Isolated(Q(t)-officially confirmed and quarantined), and removed (R(t)-recovered or dead). We use the model to understand the COVID-19 outbreak scenario of Bangladesh, which may also be used for countries with similar socioeconomic condition.
The rest of this paper is organized as follows: In Section 2, the mathematical model is presented and the dynamics of the model is analyzed. The background of choosing baseline parameter values for the model and the model fitting techniques as well as the identifiability issues are discussed in Section 3. The results are discussed and summarized in Section 4. Finally, the work is concluded in Section 5. The shortcomings of the work along with the future possible research directions are also discussed in this section.

Model
We assume that all individuals that are at risk of acquiring the infection belong to the subclass S. When a susceptible individual comes in contact with an infectious individual, s/he may become exposed at the rate β. We consider density dependent transmission here, hence susceptible individuals transfer to exposed class is modeled as βSI. The exposed individuals are considered to have latent infections and cannot infect others. Here, we assume that the isolated individuals are held strictly quarantined and do not transmit the disease. They start transmitting the disease after a few days and move to the infectious subclass I at the rate σ. Due to not expressing symptoms, inadequate testing facilities and widespread outbreak, it is considered that all the infected patients are not identified. We assume that the infectious individuals may be identified and isolated at an expected rate q and move to isolated subclass Q, or maybe recovered and move to the subclass R at the rate γ 1 . The isolated individuals recover at an average rate of γ 2 and are assumed not to transmit the disease. The parameter π represents the rate of immigration due to birth. The parameters µ and δ represent the intrinsic death rate and disease related death rate, respectively. It is to be noted that, we assume all the severe cases are identified, as they seek for medical treatment due to the severity. Further, disease related deaths are mostly associated with severe cases [31,32]. So, we assume no disease related death rate in compartment I. The above mentioned transmission mechanism is summarized in the following system of differential equations. The flow of individuals between compartments are illustrated graphically in Figure 1.

Basic Reproduction Number
The basic reproduction number for the system is R 0 = πβσ k E k I µ , where K E = σ + µ and K I = γ 1 + q + µ. The expression for R 0 has been obtained by using the next generation matrix [33] approach (see the Appendix A for details). The expected infectious lifetime of an infected individual is 1 γ 1 , they are identified and quarantined at rate q, and the normal death rate is µ. Therefore, the average infectious lifetime reduces to . Therefore, the expression for R 0 is well justified.

Existence and Uniqueness of Equilibria
The first three equations of system (1) are independent of the fourth and fifth equations, and therefore can be decoupled without loss of generality. The system (1) then can be rewritten as We, therefore, study the system (2) in the following feasible region: It can be easily seen that Ω is positively invariant with respect to (2). Let Ω be the interior of Ω. The following theorem summarizes the existence and uniqueness of the possible equilibria. Theorem 1. System (2) has two possible equilibria in Ω: the disease-free equilibrium E 0 = ( π µ , 0, 0), which is independent of disease-related parameter values, and an endemic equilibrium E * = (S * , E * , I * ) ∈ Ω when R 0 > 1.
Proof. Steady states of the system (2) can be obtained by solving the following nullclines If I = 0, then E = 0 and S = π µ , therefore the system (2) possesses a disease-free equilibrium E 0 = ( π µ , 0, 0) which is independent of disease-related parameter values. Furthermore, if I = 0, then we have E * = (S * , E * , I * ) = π µR 0 , . It can be seen that the equilibrium E * is biologically acceptable and different from the disease-free equilibrium if and only if R 0 > 1. The proof is thus completed.

Stability of the Disease-Free Equilibrium
We have the following theorem on the global asymptotic stability of the disease-free equilibrium E 0 .
Proof. (a) We construct a Lyapunov function as Differentiating Equation (4) with respect to time, we obtain Therefore, R 0 ≤ 1 ensures that L ≤ 0 for all t > 0, which implies that the system (2) is globally asymptotically stable for R 0 ≤ 1. The proof of claim (a) is thus completed. (b) The Jacobian of the system (2) at E 0 is given by The Jacobian J E0 has an eigenvalue λ = −µ, besides the following characteristic equation where a 1 = (K E + K I ) and a 0 = K E K I (1 − R 0 ). Since a 1 > 0, it is easy to show that Equation (5) has a real positive root when R 0 > 1. Hence E 0 is unstable when R 0 > 1, thus completing the proof of claim (b).

Local Stability of the Endemic Equilibrium
Theorem 3. If R 0 > 1, then the endemic equilibrium E * is locally asymptotically stable.
Proof. The Jacobian of the system (2) at E * is given by The characteristic equation is where a 3 = 1, a 2 = µR 0 + K E + K I , a 1 = µR 0 (K E + K I ), and a 0 = βπσ R 0 (R 0 − 1). Here, a 3 is always positive, a 0 is positive when R 0 > 1. In addition, Therefore, according to the Hurwitz's criterion, E * is locally asymptotically stable for R 0 > 1, thus completing the proof.

Global Stability of the Endemic Equilibrium by Geometrical Approach
In this section, we study the global stability of the endemic equilibrium E * by using geometrical approach based on the second additive compound matrix, developed by Li and Muldowney [34]. We prove the following result first. Proof. Based on the Theorem 2, the disease-free equilibrium E 0 = ( π µ , 0, 0) is unstable when R 0 > 1. Since the disease-free equilibrium is on the boundary of Ω, this implies that the system is uniformly persistent if and only if R 0 > 1.
Therefore, there exists a constant c > 0 such that every solution (S, E, I) of system (2), with (S(0), E(0), I(0)) in Ω satisfies Here c is independent of the initial data in Ω. Consequently, the uniform persistence of system (2) in the bounded set Ω implies the existence of a compact subset K of Ω that is absorbing for system (2) [35,36]. We state our main result in the following theorem. Theorem 4. Assume R 0 > 1. Then the unique endemic equilibrium E * is globally asymptotically stable in Ω.
Proof. It is shown that there exists a compact subset K of Ω that is absorbing for system (2), and there exists a unique endemic equilibrium in Ω. The remaining part of the proof of this theorem involves checking the following Bendixson criterion [37] q := lim sup where denotes the vector field of (2), i.e., dx dt = F(x). The Jacobian matrix J = ∂F ∂x with a general solution and its second additive compound matrix J [2] is where the matrix A F is obtained by replacing the entry a ij of A(x) by its derivatives in the direction of [2] A −1 can be written in the following block form We select a norm in R 3 as where (u, v, w) denote vectors in R 3 and µ denotes the Lozinskiȋ measure with respect to this norm. Following [39], we estimate µ as follows where |B 12 |, |B 21 | are matrix norms with respect to l 1 vector norm and µ 1 denotes the Lozinskiȋ measure with respect to the l 1 norm. We have where µ 1 (B 11 ) is calculated by adding the absolute value of the off-diagonal elements to the diagonal one in each column of B 22 , and then the maximum of two sums is taken; see ( [38], p. 41). Then By using the second and third of equations of system (2), the above relations can be written as which implies q < 0 from Equation (7), thus proving Theorem 4.
The global stability of the endemic equilibrium E * has also been checked by constructing a Lyapunov function using the graph-theoretic method developed by Shaui and Driessche [40]. Please refer to the Theorem A1 in Appendix B for details.

Fitting Early Epidemic Data of Bangladesh
COVID-19 was officially reported in Bangladesh for the first time on 8 March 2020. Here, we used data from March 11, 2020, as there were anomalies in the data for the first few days. We fit the identified Active Cases, which is calculated in terms of several useful data as follows Active Cases, The identified cases are considered to be isolated at their residence or at hospital, which corresponds to Q subclass of our model.

Parameterization
We assumed that at the beginning of the epidemic, the population was at an equilibrium of 1.6 × 10 8 and the average life expectancy is 72 years. So, we chose µ = 3.8052 × 10 −5 and π = Hµ = 6088.3, where H = 1.6 × 10 8 represents the total population at the beginning. As there were no available studies or data regarding the exposed period and total duration of infectious life time, we chose the expected latency period to be 5 days [41,42], which gives σ = 0.2. The average infectious life time was chosen as 22 days [43], which gives γ 1 = 0.04545. Regarding the infected life time of identified patients, we used the data of 130 patients treated in the Kuwait-Bangladesh Friendship Government Hospital from Director General Health, Dhaka, Bangladesh, as given in Table 1. The value of γ 2 + δ, as per data in this table, amounted to 0.066. We can separate γ 2 and δ by using the ratio of total deaths to total recovered by 8 June 2020, as follows: δ = 0.066×total deaths/(total deaths + total recovered) = 0.0039 and γ 2 = 0.066×total recovered/(total deaths + total recovered) = 0.062. The parameter β is the product of the per-capita contact rate and transmission probability on each contact, both of which are unknown. The parameter q is the product of identification rate (reciprocal of the duration required to get identified once become infectious) and probability of getting identified, both of which are unknown as well. So, we leave the parameters β and q to be estimated through data fitting.
The parameter values are summarized in Table 2. To find the two key parameters β and q, we fit our model with the early epidemic data of Bangladesh.

Identifiability and Fitting
Let, X = (S, E, I, Q, R) ∈ R 5 + denotes the state variables, f is the vector containing the right side of the system (1) and Y is the active identified infectious cases. Then the system (1) can be written as If there exists a unique combination of β and q for which (8) holds, then the model is uniquely structurally identifiable. To check the identifiability, we used the web application COMBOS [44]. This application is an implementation of the differential algebra-based technique introduced in [45]. Under the above setting, β and q are found uniquely identifiable (details are provided in the Appendix C).

Fitting Methods
For the purpose of comparison, we checked the efficiency of two different fitting methods: the Least Square (LS) Method and Maximum Likelihood (ML) Method (Poisson Distribution). In the LS method, the error to be minimized is In the ML method, we assume all y(t i ) are independent and follow Poisson distribution with parameter Y (t i , β, q). Therefore, the likelihood function becomes Our objective is to find Y(t i , β, q), which maximizes the probability of observing y(t i ) i.e., to maximize the above likelihood function. For computational convenience, we rather minimize the negative log likelihood function and write it as a function of β and q as follows.
To compare the performance of LS against ML we use Absolute Errors (AE) and Total Relative Absolute Errors (TRAE) which are defined as follows From Figure 2, it can be seen that the absolute error produced by LS method is significantly higher than that of ML method in each of the first 30 days. Further, TRAE is 1.494 083 × 10 3 for LS, which is higher than TRAE, 7.426 629 × 10 2 for ML. In the second and third figure, corresponding estimates of β andq are shown with 95% confidence intervals. It can be seen from this figure that the confidence intervals are smaller for ML. Both the methods give different estimate. The value of q estimated by LS is nearly 2.5, which is practically feasible though, but in the case of Bangladesh, it is obsolete, as there is deficiency in testing facilities. We, therefore, choose ML method for rest of the analysis of the COVID-19 data of Bangladesh.

Profile Likelihood
Profile likelihood resembles the dependency of the likelihood function on each of the parameters and helps extract confidence intervals for the parameters. The profile likelihoods PL β (β) and PL q (q) of β and q respectively can mathematically be expressed as PL β (β) = min q NLL(β, q) and PL q (q) = min β NLL(β, q).

Results
We used a SEIQR model to understand COVID-19 outbreak in Bangladesh. Mathematical analysis confirms that the epidemic can take off iff R 0 > 1; and will die out iff R 0 < 1, irrespective of the initial condition.
As discussed in the earlier section, Bangladesh had minimal testing facilities at the initial stage. After about a month since the detection of the first COVID-19 patient, the minimum number of COVID-19 tests performed daily in the country significantly improved over the next couple of months. Besides, the country had nationwide lockdown in different forms for about two months. The parameters in the model, such as quarantine rate and transmission rate, are therefore expected to be time-varying. The ML fitting does not provide a time-dependent estimation of parameters. We thus fit the model with data sets of different time intervals starting from the beginning in order to understand the evolution of parameter values throughout the epidemic. It is worth mentioning that we tried to fit the data over non-intersecting window of intervals to understand the change of the parameters values with time, which lead to under fitting the model and hence avoided. We use the MATLAB function "fminsearch" for the optimization purpose and the code is available on request. Figure 3 shows the model fitting with the reported data on the time intervals starting from t i = 0 to t f = 30, 50, 70 and 90 days, respectively. The blue curves in the first and second column shows the profile likelihood of β and q respectively, for each of which one of the parameters are set to the value re-tuned by the minimization algorithm and the other is varied over the interval shown on the horizontal axis. The pattern of the profile likelihood affirms the identifiability of the parameters. The red dots in the first two columns show the best fit for the parameters β and q, whereas the third column shows the agreement between the model and data corresponding to these β and q.
We intend to inspect the evolution of parameter values over time and its consequence on R 0 and disease outbreak from Bangladesh perspective. The estimated values of β, q and corresponding R 0 is shown in Figure 4. From this figure, we see that in the first 30 days average R 0 was about 6.769 and for the first 40 days it was 12.05. This implies that β was increased from 30 to 40 days, which is related to the fact that the lockdown was not maintained properly at the beginning. The country then went for strict lockdown, law-enforcing agencies, e.g., Police and Army, were deployed to implement the lockdown properly. As a result, the value of β was decreased and consequently R 0 dropped to 6.904. The value of β started to increase again after 60 days. The lockdown in the country was relaxed around that time; shopping malls and markets were open before the religious festival (called Eid), some offices were open as well, which allowed β to increase. However, q is found to increase throughout the epidemic due to continuous effort in increasing testing facilities by the government. Though, the value of R 0 is observed to increase following an increasing β at the beginning, the value then dropped following a decrease in β and an increase in q. β is high at the end of 80 days along with a high value of q (high rate of identification through increasing testing facilities) resulted in decreased transmission and consequently lowered R 0 . Following a descending pattern, R 0 reached 1.515 at the end.    To understand the relationship better, we plot R 0 as a function of β and q in Figure 5, which clearly shows that R 0 increases with an increasing β and a decreasing q. The black dotted line shows the evolution of R 0 in β − q space, a high R 0 is observed corresponding to a high β at the beginning. Although β was high afterwards, increasing q is observed to act predominantly to decrease the value of R 0 . The solid line in the top left corner shows the threshold R 0 = 1, above which the combination β and q gives R 0 < 1, resulting in no outbreak, which agree with the mathematical analysis presented in Section 2 more specifically Theorem 2 (a). The value of β and q can be controlled to reduce R 0 below 1, which will curve the epidemic. With reference to the parameter estimates presented in Figure 4 (for different length of time intervals t f ), we present the simulation of our model over a period of 365 days in Figure 6. It illustrates the evolution in prevalence of the disease due to change in human behavior and lockdown scenario. From here, the association of the peak with R 0 can be explained, as well. The figure demonstrates that the total number of infected people, the peak of the epidemic, and the time when the peak will appear are strongly associated with R 0 . The higher the value of R 0 , the higher the peak of the epidemic and sooner the peak occurs. As there were minimal testing facilities at the beginning of the epidemic, the value of q was significantly low, and β was relatively high due to the low quarantine rate. The value of R 0 was, therefore, relatively large. As a result, the peak of the epidemic was observed to be higher and appeared sooner, and the corresponding number of total infected is high. Due to the remarkable improvement of testing facilities over the next few months, the value of q continued to grow, resulting in lowering R 0 . The peak of the epidemic, therefore, observed to become lower and appeared lately with relatively lower number of total infected people. During this time, the maximum number of officially confirmed cases is observed to increase, as expected due to improved testing facilities. However, the opposite scenario is observed from t f = 70 to t f = 90, where both I and Q are found to decrease even though q was still increasing. In a nutshell, the combined effect of β and q on the confirmed cases is perplexing. Also, there are unidentified cases, as well. Therefore, we investigated the answer to the question: "Do the identified patients represents the actual scenario of the epidemic?". We define Q max := max t {Q(t)} and I max := max t {I(t)} and plot them in q − β plane, as shown in Figure 7. In the triangular region, where the combination of q and β results in R 0 < 1, both Q max and I max are zero. Outside the triangular region, I max increases as q decreases, in other words, as R 0 increases. However, Q max evolves differently outside the triangular region. The solid blue line, which is the intersecting line of the Q max and I max surface, represents a threshold for q. Q max decreases as q moves away from this line. Here, q has two fold role on the difference between Q max and I max . As q decreases, average duration in compartment I increases and influx to compartment Q decreases simultaneously. Two different scenarios of the epidemic thus can be seen on the two sides of solid blue line. On the left side, where the quarantine rate is considerably high, Q max is higher than I max , which indicates the fact that most of the infected people are held quarantine and the actual scenario of the epidemic is mostly known. In contrast, on the right side of this line, where quarantine rate is significantly low, Q max is lower than I max , which indicates that the actual scenario of the epidemic is poorly known. In such a case, fitting data with SEIR model might yield misleading predictions.

Discussion and Conclusions
The pathogenesis of COVID-19 and the responsible virus strain SARS-CoV-2 are still poorly understood as it is a new strain. Scientists all over the world are trying to understand the epidemiology in the context of different countries, as the context of every country is unique from the perspective of international traffic, trade, local resources, awareness, government planning, etc. In most of the study, the available data of cumulative identified, cumulative recovered, and cumulative dead have been used to fit models, where the input in the I compartment has been accumulated to fit with the cumulative identified. However, the irony is, the I compartment refers to the infectious compartment responsible for spreading the infection, where the identified individuals hardly transmit the disease in hospital care facilities. Even if the identified patients transmit the disease, that will be, on average, very negligible. In contrast, the unidentified patients, who are not counted at all, are mainly responsible for spreading the infection. To overcome this limitation, we propose a variant of the SEIR type model, which, besides the identified infected, can also estimate the unidentified fraction of infected. We confirmed that the basic reproduction number is the key threshold for disease outbreak.
Many different techniques are used to fit model with data, for example, non-linear least-square fitting [6,13,14,16], genetic algorithm [15], etc., where the objective function is the Euclidean distance of the data and model estimates. In this study, we examined the Least Square Technique and Maximum Likelihood fitting. The latter, which minimizes an objective function other than the Euclidean norm, outperformed the former. We have analyzed the early epidemic scenario of COVID-19 outbreak in Bangladesh and intended to predict the plausible future scenarios. In addition, we intended to identify the required measures to reduce the outbreak.
We analyzed available data from 11 March 2020 to 8 June 2020. The reproduction number was found to vary between about 1.5 and 12, which is due to the fact that the country had lockdown in different forms, and the testing facilities were significantly improved during this time. We found that the peak of the epidemic and the time when the peak will appear are strongly associated with the reproduction number R 0 . The peak of the epidemic found to become higher and appear sooner as the reproduction number increases, which was also reported in [19]. The simulations of the model indicated that the peak of the epidemic is expected to appear around 2 October 2020. However, the fluctuations in R 0 value seen over the different time intervals suggests that the arrival and amplitude of the peak might change due to partial lockdown in red zones (areas with highest case load) and improved testing facilities. As the epidemic advances, with real-time data, our model can be used to estimate the amplitude and time of the peak as well as the number of unidentified infected people with reasonable accuracy.
The amplitude and arrival of the epidemic peak depend on the reproduction number R 0 , which suggests that the peak can be delayed, and the amplitude of the peak can be reduced by reducing the reproduction number. This can be achieved, for instance, by identifying potential superspreaders and reduce their contacts, or by implementing lockdown to reduce overall contact rate, or by increasing public awareness and mass education. Another factor that can reduce the reproduction number below one is increasing the identification rate by increasing the testing facility and identifying the persons that came in contact with the infected person during the course of active infection. It is needless to say that if the reproduction number can be reduced and maintained below one, the outbreak will be under control. However, it is worth mentioning that the herd immunity varies with the basic reproduction number and consequently with transmission rate and isolation rate. Therefore, when the epidemic curve starts to decrease after the peak or reaches the end of the epidemic, uncontrolled/unplanned social interaction or relaxed lockdown and decreased testing rate may further initiate a second wave of the epidemic.
There are several shortcomings of the study. For example, the model only predicts the average scenario and neglects the impact of plausible randomness. The sample size of hospitalized patients used to estimate the recovery rate and disease related death rate of identified patients is small. Also, hospital mediated transmissions have been neglected. The latency period and the course of active infection may vary with mutation of the virus and immunity of the host, which are not known for the circulating strain in Bangladesh. In addition, there is no precise information about the mechanisms of environmental transmission, also it varies greatly within and across the countries [13,47]. We, therefore, have excluded the environmental transmission from our modeling. Moreover, the method used in this study gives constant value of the estimating parameters for the whole period, though the key parameters like transmission rate and identification rate are likely to vary over time. Due to these shortcomings, the accuracy of the results and prediction in this paper could vary to some extent. Our future study aims to investigate these shortcomings. Then by Theorems 3.3 and 3.5 in [40], there exists a Lyapunov function for (2) such that V(t) = c 1 D 1 + c 2 D 2 + c 3 D 3 with c 1 = c 2 and c 3 = βS * I * σE * c 1 = σ + µ σ c 1 . Hence, we obtain a Lyapunov candidate function for (2) as The derivative of V(t) along the system (2) is Thus V (t) is non-positive, which implies that the endemic equilibrium E * is globally asymptotically stable when exists. This completes the proof of the Theorem (A1).