Assessment of Social Distancing for Controlling COVID-19 in Korea: An Age-Structured Modeling Approach

The outbreak of the novel coronavirus disease 2019 (COVID-19) occurred all over the world between 2019 and 2020. The first case of COVID-19 was reported in December 2019 in Wuhan, China. Since then, there have been more than 21 million incidences and 761 thousand casualties worldwide as of 16 August 2020. One of the epidemiological characteristics of COVID-19 is that its symptoms and fatality rates vary with the ages of the infected individuals. This study aims at assessing the impact of social distancing on the reduction of COVID-19 infected cases by constructing a mathematical model and using epidemiological data of incidences in Korea. We developed an age-structured mathematical model for describing the age-dependent dynamics of the spread of COVID-19 in Korea. We estimated the model parameters and computed the reproduction number using the actual epidemiological data reported from 1 February to 15 June 2020. We then divided the data into seven distinct periods depending on the intensity of social distancing implemented by the Korean government. By using a contact matrix to describe the contact patterns between ages, we investigated the potential effect of social distancing under various scenarios. We discovered that when the intensity of social distancing is reduced, the number of COVID-19 cases increases; the number of incidences among the age groups of people 60 and above increases significantly more than that of the age groups below the age of 60. This significant increase among the elderly groups poses a severe threat to public health because the incidence of severe cases and fatality rates of the elderly group are much higher than those of the younger groups. Therefore, it is necessary to maintain strict social distancing rules to reduce infected cases.


Introduction
Coronavirus Disease 2019 (COVID-19) is a novel viral disease that is currently threatening public health worldwide. The virus responsible for the disease was initially called Novel Coronavirus (2019-nCoV) due to its novelty. Analyses of the phylogeny and taxonomy of 2019-nCoV have shown that the virus belongs to the subgenus Sarbecovirus, to which SARS-CoV also belongs [1], but is more closely related to bat SARS-CoV [2,3]. Thus, 2019-nCoV was named "severe acute respiratory syndrome coronavirus 2" or "SARS-CoV-2" [4]. Cases of SARS-CoV-2 display symptoms such as fever, dry cough, dyspnea, and diarrhea, which are similar to the symptoms noted in MERS-CoV and SARS-CoV. However, the distribution of each symptom differs [5].

Epidemiological Data
In this study, we use the outbreak data of COVID-19 in Seoul and Gyeonggi province between 1 February and 15 June 2020 [18,19]. Figure 1 shows the epidemic curve of confirmed cases of COVID-19 by date of illness onset. A total of 1577 COVID-19 cases were reported. COVID-19 incidences were divided into different age groups to capture the age-dependent transmission dynamics. Figure 1a shows that imported cases accounted for about 39.6% of infected cases before May but drastically diminished to about 4.0% afterward. Figure 1b shows that about 55.2% of infected cases were among ages 20-49 throughout the whole outbreak, and from 1 May through 14 May, about 76.8% of infected cases were among ages 20-39. Table 1 shows the incidence data by age group and the sources of infection in the target area during the period. The sample dataset used in this study is shown in Table S1 in Supplementary Section A.

Timeline of Control Interventions
The transmission dynamics of COVID-19 are greatly affected by governmental control policies such as social distancing, school closures, and lockdowns. The Korean government has attempted to implement appropriate control policies in response to changes in the number of infected people. In Korea, on 23 February, the increasing level of COVID-19 cases raised the alert to its highest level of "Red", thus strengthening the overall response system to possible epidemics [20]. As a result of this increase in the number of infected people, different levels of social distancing were implemented by the Korean government [21]. A brief description of the four levels of social distancing in Korea is shown in Table 2, and further details about the social distancing policies are given in Table S3 in Supplementary Section B.

Contact Matrix
As Seoul and Gyeonggi provinces are densely populated with diverse people, to enhance the realism of our model, it is beneficial to consider the heterogeneity in contact networks. Two of the most important heterogeneous aspects of a contact network are location and age since different locations are often visited by certain age groups, which leads to consistent contact with specific age groups. For instance, people tend to have contact with people of a similar age outside their households (i.e., schools and workplaces). Since our age-structured model allows us to adjust the transmission rates among different age groups and since the location is closely linked to an individual's contact pattern with certain age groups, we applied these location-based contact patterns to the transmission rates.
We divided the contact locations into four categories: school, workplace, household, and other locations. For each location category, we used the specific contact matrix of Korea from [21] to build our model. Each contact was defined by either physical or nonphysical contact; physical contact includes skin-to-skin contact like kissing, handshaking, etc., whereas nonphysical contact includes, e.g., a two-way conversation with three or more words in the physical presence of another person but no skin-to-skin contact [24].
Each location-specific contact matrix is a 16 × 16 square matrix, which represents the mean number of contacts between individuals of five-year age groups: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, and 75 and above. Each element is the contact rate of an individual in one of the 16 age groups with people in each of the 16 age groups at the specific location. More precisely, the location-specific contact matrix $M$ is written as [25]

$$M = (m_{ij})_{1 \le i,j \le 16} = \begin{pmatrix} m_{1,1} & \cdots & m_{1,16} \\ \vdots & \ddots & \vdots \\ m_{16,1} & \cdots & m_{16,16} \end{pmatrix},$$

where each element $m_{ij}$ denotes the mean number of contacts that an individual in age group $i$ makes with individuals in age group $j$ per day. Note that the contact matrix $M$ is not necessarily symmetric, which is a general feature that is also found in [26][27][28].
Since the focus areas are Seoul and Gyeonggi province, and the location-specific matrices of the whole region of Korea are only available in [25], we estimated the location-specific matrices of the focus area by using the proportion of the population of the area compared to that of Korea. We assumed the total population to be constant since the period of interest covers less than a year. We used the census data of Korea from January 2020 throughout the simulations. A summary of the data can be found in Figure S2 and Table S2 in Supplementary Section B, which describe how to calculate the contact matrix of the focus area. The calculated location-specific matrices for Seoul and Gyeonggi province are shown in Figure S4 in Supplementary Section B.
A full contact matrix $M$ is composed of a linear combination of the location-specific contact matrices [25]:

$$M = c_W m_W + c_S m_S + c_H m_H + c_O m_O,$$

where $m_W$ is the workplace contact matrix, $m_S$ is the school contact matrix, $m_H$ is the household contact matrix, and $m_O$ is the contact matrix for all other locations, except for the workplace, school, and household; $c_W$, $c_S$, and $c_O$ are constants, and $c_H$ is a 16 × 16 diagonal matrix, each of which multiplies its respective matrix.

Based on the real policies of school closure and social distancing levels in Korea, we composed five different contact matrices by adjusting $c_W$, $c_S$, $c_H$, and $c_O$: $M_O$, $M_C$, $M_{Cw}$, $M_{Cm}$, and $M_{Cs}$, which denote the contact matrices for school openings with no social distancing, school closures with no social distancing, school closures with weak social distancing, school closures with medium social distancing, and school closures with strong social distancing, respectively. When schools are closed, $c_S = 0$ since no contacts are made at school. In addition, when schools are closed, $c_H = \mathrm{diag}(1.5, 1.5, 1.5, 1.5, 1.1, 1.1, \ldots, 1.1)_{16}$, where $\mathrm{diag}(\,)_n$ denotes the diagonal matrix with $n$ diagonal entries, such that household contact rates increase by 50.0% for age groups below the age of 20 and by 10.0% for age groups 20 and above [29]. For no, weak, medium, and strong social distancing, we assumed $c_O = 1, 0.7, 0.5, 0.3$, respectively, such that $c_O$ decreases under stronger social distancing. Note that different sets of $c_O$ levels, each decreasing with stronger social distancing, were tested, as shown in Figures S12 and S13 in Supplementary Section D, but we present only one case because the fitting and simulation results did not differ significantly.
An example of a scenario/policy-specific contact matrix of Seoul and Gyeonggi province (school closure with no social distancing, $M_C$) is shown in Figure 3; a comparison with the equivalent version for Korea is provided in Figure S3 in Supplementary Section B. Table 3 shows a summary of the contact matrices for different policies. The contact matrices for each scenario/policy are shown in Figure S5.
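The composition of the five policy-specific matrices described in this section can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code: the location matrices are random placeholders rather than the Korean contact data, the function names and dictionary labels are hypothetical, and the values $c_W = 1$, $c_S = 1$ for open schools, and $c_H$ equal to the identity for open schools are assumptions that the text does not state.

```python
import numpy as np

N_GROUPS = 16  # five-year age groups: 0-4, 5-9, ..., 70-74, 75+


def household_multiplier(school_closed):
    """Return the 16 x 16 diagonal matrix c_H.

    Closed schools: 1.5 for the four groups under 20, 1.1 for the rest.
    Open schools: the identity matrix (an assumption of this sketch).
    """
    if not school_closed:
        return np.eye(N_GROUPS)
    return np.diag([1.5] * 4 + [1.1] * 12)


def full_contact_matrix(m_w, m_s, m_h, m_o, school_closed, c_o, c_w=1.0):
    """Combine location matrices: c_W m_W + c_S m_S + c_H m_H + c_O m_O."""
    c_s = 0.0 if school_closed else 1.0  # c_S = 1 for open schools is assumed
    c_h = household_multiplier(school_closed)
    # Left-multiplying by the diagonal c_H rescales row i, i.e. the household
    # contacts made by individuals in age group i (assumed multiplication order).
    return c_w * m_w + c_s * m_s + c_h @ m_h + c_o * m_o


# Placeholder location matrices (random, NOT the Korean contact data).
rng = np.random.default_rng(0)
m_w, m_s, m_h, m_o = (
    rng.uniform(0.0, 2.0, (N_GROUPS, N_GROUPS)) for _ in range(4)
)

# (school_closed, c_O) for each policy; the labels are hypothetical.
policies = {
    "M_O": (False, 1.0),   # schools open, no social distancing
    "M_C": (True, 1.0),    # schools closed, no social distancing
    "M_Cw": (True, 0.7),   # schools closed, weak social distancing
    "M_Cm": (True, 0.5),   # schools closed, medium social distancing
    "M_Cs": (True, 0.3),   # schools closed, strong social distancing
}
matrices = {
    name: full_contact_matrix(m_w, m_s, m_h, m_o, closed, c_o)
    for name, (closed, c_o) in policies.items()
}
```

Because only $c_O$ changes between the three school-closure policies with social distancing, the corresponding matrices differ only in the weight given to contacts at other locations.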

Mathematical Modeling
We developed a mathematical model to describe the transmission dynamics of COVID-19 by employing an S-E-I-H-R compartment model with 16 age groups. In this model, $S_i$, $E_i$, $I_i$, $H_i$, and $R_i$ denote the susceptible, exposed, infectious, hospitalized, and recovered/removed population of age group $i$, respectively. The diagram for the model is shown in Figure 4.
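The model equations are not reproduced in this text, so the following Python sketch only illustrates what an age-structured S-E-I-H-R system of this kind might look like. Everything about its form is an assumption: the force of infection $\lambda_i = b_i \sum_j m_{ij} I_j / N_j$, the progression rate `kappa`, the recovery rate `gamma`, and all numerical values are hypothetical; only $b_i$ (infection probability per contact) and $1/q$ (time from symptom onset to diagnosis) are named in the paper.

```python
import numpy as np


def seihr_rhs(state, b, M, N, kappa, q, gamma):
    """Right-hand side of an assumed age-structured S-E-I-H-R system.

    state has shape (5, n_groups), with rows S, E, I, H, R.
    """
    S, E, I, H, R = state
    force = b * (M @ (I / N))  # assumed: lambda_i = b_i * sum_j m_ij I_j / N_j
    dS = -force * S
    dE = force * S - kappa * E
    dI = kappa * E - q * I     # infectious individuals are diagnosed at rate q
    dH = q * I - gamma * H
    dR = gamma * H
    return np.array([dS, dE, dI, dH, dR])


def simulate(state0, days, dt, **params):
    """Integrate the system with the classical fourth-order Runge-Kutta method."""
    state = np.array(state0, dtype=float)
    for _ in range(int(round(days / dt))):
        k1 = seihr_rhs(state, **params)
        k2 = seihr_rhs(state + 0.5 * dt * k1, **params)
        k3 = seihr_rhs(state + 0.5 * dt * k2, **params)
        k4 = seihr_rhs(state + dt * k3, **params)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state


# Hypothetical setup: 16 equal-sized groups and a uniform contact matrix.
n = 16
N = np.full(n, 1.0e5)
M = np.full((n, n), 0.5)
b = np.full(n, 0.05)
I0 = np.zeros(n)
I0[5] = 1.0  # one infectious individual in the 25-29 group
zeros = np.zeros(n)
state0 = np.array([N - I0, zeros, I0, zeros, zeros])

final = simulate(state0, days=30, dt=0.1, b=b, M=M, N=N,
                 kappa=0.25, q=0.25, gamma=1.0 / 14.0)
```

Since every outflow of one compartment is an inflow of another, the total population stays constant, which matches the constant-population assumption stated for the focus area.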

The parameters of the model (3) are described in Table 4; for instance, $b_i$ is the infection probability of a person in age group $i$ per contact.

In this model, the asymptomatic infectious population is excluded. Although recent studies around the world suggest the presence and significance of an asymptomatic infectious population [31,32], we considered it appropriate to apply the settings of our area of interest and the time period under observation. Hence, we refer to a recent antibody test for COVID-19 on randomly selected subjects in Korea [33], which includes 1833 subjects from Seoul and 278 subjects from Gyeonggi province, where only 1 subject was found positive. (A total of 3555 subjects were tested through a screening inspection and plaque reduction neutralization test; 1555 serum samples were collected from 21 April through 19 June from 192 regions in Korea, and 1500 hospitalized patients from Seoul were tested from 25 May through 28 May.) Thus, the ratio of asymptomatic infected people to infected people was estimated to be very small in Korea. For this reason, together with the difficulty in determining the proportion of asymptomatic infections accurately, we did not consider a compartment for asymptomatic infections in the mathematical model.
The parameter $1/q$, the time from symptom onset to diagnosis, is the median value computed from the data for each period, and its values are given in Table 5. We estimate the transmission rate $\beta_{ij}$ by the least squares method using lsqcurvefit, a MATLAB function. To measure the potential of disease transmission in each period, we use the effective reproduction number $R_t$, which is the average number of secondary cases infected by an index case in a population of both susceptible and nonsusceptible hosts. $R_t$ is computed as $R_t = \rho(G)$, where $\rho$ is the spectral radius of the next generation matrix $G$ [34]. The derivation of the value of $R_t$ is described in Supplementary Section C.
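The relation $R_t = \rho(G)$ can be illustrated with a short Python sketch. The paper's own derivation of $G$ is in Supplementary Section C and is not reproduced here, so the form used below is an assumption: it takes a force of infection $b_i \sum_j m_{ij} I_j / N_j$ and a mean infectious period of $1/q$, which gives $G_{ij} = b_i m_{ij} S_i / (N_j q)$. The function name and all numerical values are hypothetical.

```python
import numpy as np


def effective_reproduction_number(b, M, S, N, q):
    """Spectral radius of an assumed next generation matrix.

    G[i, j] = b[i] * M[i, j] * S[i] / (N[j] * q): expected number of new
    infections in group i caused by one infectious person in group j.
    """
    b, S, N = (np.asarray(x, dtype=float) for x in (b, S, N))
    G = (b * S)[:, None] * np.asarray(M, dtype=float) / N[None, :] / q
    # The spectral radius is the largest eigenvalue modulus.
    return float(np.max(np.abs(np.linalg.eigvals(G))))


# Hypothetical homogeneous example: every entry of G equals 0.1.
n = 16
N = np.full(n, 1.0e5)
M = np.full((n, n), 0.5)
b = np.full(n, 0.05)

r_full = effective_reproduction_number(b, M, S=N, N=N, q=0.25)
r_half = effective_reproduction_number(0.5 * b, M, S=N, N=N, q=0.25)
```

In this sketch, scaling every infection probability by a common factor scales the reproduction number by the same factor, and shortening the time to diagnosis $1/q$ lowers it.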

Ethical Considerations
This study used the data available in [18,19]. The datasets were already fully anonymized and did not include any identity information. Thus, ethical approval was not required for this analysis.

Data Sharing Policy
The COVID-19 data for Gyeonggi province are accessible in [18], and the data for Seoul city are available upon request [19].

Estimation of Transmission Rate
We estimated the transmission rate using the epidemiological data described in Section 2. For each period, we estimated the transmission rates corresponding to each age group by applying the least squares method to the age-specific incidence data. We observed that the incidence data for each 5-year age group were not sufficient to estimate the transmission rate between age groups due to the absence of reported cases in some periods. Thus, to clarify the different properties of transmission rates between age groups, we estimated the transmission rate for 10-year age groups (0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, and 70 and above) by combining two 5-year age groups into one 10-year age group. Figure 5 compares the observed and estimated COVID-19 cases for (a) incidence and (b) cumulative incidence among all ages. The results of the data-fitting for each age group are shown in Figures S6 and S7 in Supplementary Section D.

Table 5 shows the values of the estimated infection probability $\hat{b}$ and the effective reproduction number $R_t$ depending on the age group and period. Here, $\hat{b}$ is a vector consisting of the infection probabilities for eight age groups instead of sixteen age groups (i.e., $\hat{b} = (\hat{b}_k)$ for $k \in \{\text{0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+}\}$), and each consecutive pair of $b_i$ equals the corresponding $\hat{b}_k$ (i.e., $\hat{b}_{\text{0-9}} = b_{\text{0-4}} = b_{\text{5-9}}$, $\hat{b}_{\text{10-19}} = b_{\text{10-14}} = b_{\text{15-19}}$, $\cdots$, $\hat{b}_{\text{70+}} = b_{\text{70-74}} = b_{\text{75+}}$).

The value of $R_t$ was greater than 2 in period 1, but after governmental control policies began in period 2, it decreased below 2. In particular, in periods 3 and 4, when medium and strong levels of social distancing were implemented, respectively, the value of $R_t$ became much less than 1. However, significant local infections have occurred since 24 April, when infected cases linked to club attendance among the young age groups were reported [23]. In the period between 24 April and 6 May, the $R_t$ value was estimated to be 2.4846, and the governmental control policies against local infections were implemented in period 7, which decreased the value of $R_t$ to 0.8047. In periods 1 and 2, the time taken to be diagnosed from symptom onset, $1/q$, was estimated at 8 and 5 days, respectively, but decreased to 3-4 days after 29 February, when social distancing began.

In the transition from period 2 to period 3, medium social distancing was implemented, and the infection probability $\hat{b}$ for age groups 0-9, 10-19, 20-29, 30-39, and 40-49 decreased by 37.8%, 86.1%, 21.3%, 40.6%, and 17.8%, respectively, while that of the age groups 50-59, 60-69, and 70+ increased by more than 400.0%, which resulted in a decrease of $R_t$ of 0.6776. Once strong social distancing started in period 4, $\hat{b}$ either decreased or remained at a similar level for almost all age groups, resulting in $R_t$ decreasing by 0.1145. In period 5-2, $\hat{b}$ for age groups 20-29 and 30-39 increased rapidly, resulting in an increase of $R_t$ to 2.4846.
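The mapping between the sixteen 5-year groups and the eight 10-year groups described in this section can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the numerical values are made up for the example and are not taken from Table 5.

```python
import numpy as np

TEN_YEAR_GROUPS = ["0-9", "10-19", "20-29", "30-39",
                   "40-49", "50-59", "60-69", "70+"]


def expand_to_five_year(b_hat):
    """Map the 8 estimated probabilities onto 16 five-year groups.

    Each ten-year value is shared by its two five-year groups,
    e.g. b_hat[0] is used for both 0-4 and 5-9.
    """
    b_hat = np.asarray(b_hat, dtype=float)
    if b_hat.shape != (len(TEN_YEAR_GROUPS),):
        raise ValueError("expected one value per ten-year age group")
    return np.repeat(b_hat, 2)


def aggregate_to_ten_year(cases_five_year):
    """Sum incidence counts of consecutive five-year groups pairwise."""
    cases = np.asarray(cases_five_year)
    if cases.shape != (2 * len(TEN_YEAR_GROUPS),):
        raise ValueError("expected one value per five-year age group")
    return cases.reshape(len(TEN_YEAR_GROUPS), 2).sum(axis=1)


# Made-up values for illustration only.
b_hat_example = np.array([0.01, 0.02, 0.05, 0.04, 0.03, 0.03, 0.02, 0.02])
b_sixteen = expand_to_five_year(b_hat_example)
cases_ten_year = aggregate_to_ten_year(np.arange(16))
```

Estimating eight values instead of sixteen halves the number of free parameters per period, which is the motivation given above for merging age groups with sparse incidence data.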

Effect of the Control Strategies
We investigated the potential effect of social distancing under various scenarios. Table 6 lists seven scenarios, in addition to the baseline, that we created for the period between 24 April and 31 August based on the social distancing strengths described in Section 2.2. These scenarios were designed to test the effects from the strongest case (scenario 1) to the weakest case (scenario 7). Figure 6 shows (a) a comparison of the time-dependent cumulative incidences for the scenarios and (b) the age-specific cumulative incidences up to 31 August 2020. Figure 6a illustrates the effects of different social distancing combinations under each scenario, and in Figure 6b, we can observe how each scenario affects different age groups. The incidence plot corresponding to Figure 6a is shown in Figure S8a. The simulation results of the incidence and cumulative incidence for each age group are shown in Figures S9 and S10 in Supplementary Section D, respectively. Table 7 shows the cumulative incidence of each age group up to 31 August 2020.

Figure 6. Cumulative incidence for scenarios of social distancing: (a) the time-dependent cumulative incidence for the total age group and (b) the age-specific cumulative incidence from 1 February to 31 August 2020. Strong, medium, and weak social distancing are denoted by s, m, and w, respectively, and w+ denotes weak social distancing+ defined in Table 2.

Figure 6 and Table 7 show that if a strong level of social distancing had been maintained for all three periods, the number of infected people would have decreased by about 44.6%. On the other hand, if a weak level of social distancing had been implemented for the three periods, the number of incidences would have increased by about 29.2%. In all the scenarios, when the intensity of social distancing is reduced, the number of incidences increases in proportion to the degree of the reduction. In particular, people between the ages of 0 and 19 present the smallest number of infected cases in all the scenarios of social distancing. In other words, the number of infected cases among those between the ages of 0 and 19 was the least affected by the strength of social distancing. For those aged 50 and above, the number of infected cases increased drastically (28.4, 42.4, and 42.0 percent increase for age groups 50-59, 60-69, and 70+, respectively) in scenario 7 compared to the baseline, showing that without sufficiently strong social distancing, the age groups of 50 and above became noticeably vulnerable compared to the younger age groups. Scenarios 6 and 7 used the same weak social distancing strength from 24 April through 29 May. The only difference is in the social distancing strength during the longest period of 29 May-31 August, where scenario 6 uses strong social distancing, and scenario 7 uses weak social distancing. Despite the difference in strength over a period of approximately three months, the effects on ages 0-19 appear to be minimal compared to those for people aged 20 and above.
Moreover, the number of infected cases among those aged 40 and above was effectively reduced even though social distancing was weak starting from 29 May. Corresponding to Figure 6b, the comparison of age for each scenario is shown in Figure S11a in Supplementary Section D. Figure 7 shows (a) the monthly incidence of cases for the total age group under all scenarios and (b) a comparison of the monthly incidence among the two age groups of 20-49 and 50 and above for the baseline and scenarios 1 and 7. The monthly incidence of the other scenarios for these two age groups is shown in Figure S11b,c in Supplementary Section D. Table 8 shows the monthly incidence of the total age group, the age groups of 20-49, and those of 50 years and above for all scenarios. In the four months of May through August, compared to the baseline, the total incidence increased by 43.9% under scenario 7 but decreased by 66.6% under scenario 1.

Table 8. Monthly incidence of the total age group, the age groups of 20-49, and those 50 years and older (50+) for all scenarios. The percentage below each incidence represents the percentage increase or decrease from the baseline.

Figure 7. Cumulative incidence based on scenarios: (a) the monthly incidence for the total age group and (b) the monthly incidence of the two age groups of 20-49 and 50 and older (50+) for the baseline and scenarios 1 and 7.

Discussion
In this study, we analyzed the epidemiological data of COVID-19 cases in Seoul and Gyeonggi province between 1 February and 15 June 2020. The symptoms, transmission rates, and fatality rates of this disease differ by age, and the risks of severe symptoms and death increase with age [13].
To take these aspects into account, we developed an age-structured model that describes the age-dependent dynamics of COVID-19. In the age-structured model developed in this study, we estimated the transmission rate by applying the contact matrix obtained from [25] to the actual incidence and population data for Seoul and Gyeonggi province. Since the control policies implemented by the governmental authorities affect the dynamics of infectious diseases [13,35], we divided the whole period between 1 February and 15 June into seven distinct periods following important changes in governmental control policies. We observed that the simulated incidence curve with the fitted transmission rate matches well with the actual incidence data of each age group over the whole period. Using the developed age-structured model, we investigated the effect of social distancing under various scenarios in the focus area.
For each of the seven distinct periods, we estimated the infection probability $\hat{b}$ for each age group and the effective reproduction number $R_t$, which led to three interesting results. First, as the social distancing strength increased, $R_t$ decreased from 2.1971 to 0.0001 until 24 April. Until the serious infections linked to clubs began to emerge, social distancing was effective in preventing local transmission. In period 5-2, the behavioral changes among those aged 20-39 [23] were suspected to be the primary cause of the escalating outbreaks after 24 April, during which $R_t$ increased to 2.4846. Secondly, $\hat{b}$ differed greatly depending on the age group. Despite an increase in social distancing strength, the age groups 50 and above experienced an increase in $\hat{b}$ during period 3, while the transmission rates for the age groups younger than 50 decreased. This suggests that social distancing affects different age groups with different magnitudes, with control being more effective among the younger age groups. Thirdly, in period 6, during the weak social distancing, age groups 50 and above showed a greater change in $\hat{b}$ than age groups 20-49. This resulted in critical situations featuring an elevated number of deaths since the fatality rate is generally greater for people aged 50 and above [17].
The baseline scenario reflected the actual social distancing policies implemented by the Korean governmental authorities between 1 February and 15 June 2020. In the other scenarios, it was assumed that various levels of social distancing, different from the baseline scenario, were implemented in periods 5, 6, and 7. The simulation results in Table 7 showed that if a strong level of social distancing had been implemented for all three periods, the number of infected people would have decreased by about 44.6%. On the other hand, if a weak level of social distancing had been maintained for the three periods, the number of infected people would have increased by about 29.2%. For all the scenarios, the results showed that a reduction in the intensity of social distancing produced an increase in the number of infected persons. Notably, the number of incidences in the age groups 60 years and above increased significantly compared to that of other age groups, which represents a very dangerous situation, as the fatality rate of the elderly groups is much higher than that of the younger groups [17]. Therefore, it is necessary to maintain a high intensity of social distancing to lower the fatality rate and reduce medical expenses. However, the social and economic costs that may emerge from strengthening social distancing should also be considered.
To investigate the effects of social distancing, we assumed that all schools were closed during the whole period of this study. We also reviewed some previous studies on the effects of school closures during different disease outbreaks [27,36]. In practice, during the COVID-19 pandemic, many of our sampled schools have been open since mid-May, except when there were recorded incidences of infected people in a school or its nearby area. Schools under such conditions were closed for a certain period, and quarantine policies were implemented differently for each school. However, it was difficult to accurately reflect the effects of schools opening/closing in this study since reports on group infections are not available for all schools over the whole period. Therefore, we propose that our model is more suitable for analyzing the impact of fixed and clear-cut control policies like social distancing, rather than the impact of schools opening/closing on the transmission of COVID-19.
Despite the limitations of our study, we successfully developed an age-structured model using the epidemiological data in Seoul and Gyeonggi province by implementing an age- and location-based contact matrix, an approach that is not yet well established for COVID-19. Through this study, we analyzed the effects of different social distancing policies and further extended those effects to simulate different scenarios. As the social distancing strength was weakened, people aged 50 and above were directly affected, showing a more significant increase in transmission rate than that among people aged 20-49. Strong social distancing can be very effective in reducing the number of infected cases, as shown in scenario 1, where the cumulative incidence was reduced by 44.6% compared to the baseline.

Conclusions
In this paper, we developed an age-structured mathematical model for assessing the age-dependent transmission of COVID-19 in Korea. The target area was Seoul and Gyeonggi province, the most populated area in Korea. We divided the total human population in the target area into different age groups. We estimated the transmission rate for each age group in seven distinct periods using the COVID-19 data and contact matrix for each age group and investigated the effect of social distancing on the control of the disease in the age-structured model under various scenarios. In the best-case scenario (Scenario 1), the 44.6% reduction in cumulative incidence relative to the baseline showed that social distancing strength can have a critical impact on the mitigation of transmission dynamics.
Our modeling approach for COVID-19 has novelty in that we estimated the transmission rates of different age groups in seven distinct periods following government control policies. The modeling approach presented in this work can be applied to other target areas worldwide if sufficient epidemiological data and contact matrices for the various age groups are available.
Supplementary Materials: The following are available online at http://www.mdpi.com/1660-4601/17/20/7474/s1, Section A: Data Analysis (Table S1: Dataset Sample; Figure S1: Cumulative incidence of Seoul/Gyeonggi Province by (a) age group, (b) source of infection, and (c) region); Section B: Contact Matrix and Control Policy ( Figure S2. Population of South Korea and Seoul/Gyeonggi Province in January 2020 by age groups; Table S2