A Comparative Study of the Robustness and Resilience of Retail Areas in Seoul, Korea before and after the COVID-19 Outbreak, Using Big Data

Abstract: This study aimed to assess the robustness and resilience of retail areas in Seoul, based on the changes in sales before and after the COVID-19 outbreak. The spatial range and temporal scope of the study were set as district- and community-level retail areas in Seoul, from January 2019 to August 2020, to consider the effect of the COVID-19 outbreak. The data used in this study comprised sales information from the retail sector, namely Shinhan Card sales data for domestic and foreign customers by business type in Seoul, provided by Seoul Big Data Campus. We classified the retail areas based on the change in sales before and after the COVID-19 outbreak, using time series clustering. The results showed that time series clustering based on the change in sales can be used to classify retail areas. The similarities and differences were confirmed by comparing the functional and structural characteristics of the district- and community-level retail areas by cluster and by retail area type. Furthermore, we derived knowledge on the decline and recovery of retail areas before and after a national crisis such as the emergence of a COVID-19 wave, which can provide significant information for sustainable retail area management and regional economic development.


Introduction
A retail area is the spatial range over which commercial functions exert their influence, formed by the spatial agglomeration of commercial facilities. Spatial agglomeration of commercial facilities promotes the formation of consumption space and attractiveness to customers, thereby creating an environment in which high business results and sales can be achieved [1][2][3]. The formation and characteristics of retail areas are greatly influenced by consumers' decision-making, along with numerous factors in the urban space [4].
In essence, a retail area creates a physical and nonphysical environment for the stability and sustainability of commercial facility operations. It also plays a role in improving urban vitality, forming a community, increasing local attractiveness and investment value, and preventing crime [5][6][7][8]. Along with these roles, as commerce is a major industrial sector that underpins the national economy [9], managing the stability and sustainability of commercial facilities becomes increasingly important from the perspective of the retail area.
Meanwhile, COVID-19, which has been spreading around the world since December 2019, is a respiratory infection that is highly transmissible between humans. The World Health Organization declared a global pandemic in March 2020, and many countries have implemented city blockades (lockdowns) and are experiencing rapid changes in social behavior [10]. This has had a direct impact on the economy [11,12]; in particular, the loss of retail area function and the collapse of the regional economy within the city are becoming a reality due to the rapid contraction of consumption activities. Retail areas and commercial facilities are major determinants of the dynamics and sustainability of urban spaces or systems, and therefore require special management and attention [9,13,14].
Korea is no exception to this global crisis. According to the 2020 report of the Seoul Institute, Seoul's economic sentiment index showed a trend of rising from 88.8 in August 2019 to 95.7 in January 2020, but from January 2020, due to the COVID-19 outbreak, it plunged to 87.2 in February 2020 and 63.7 in March 2020 [15]. These changes occurred with the implementation of "social distancing" to prevent the spread of infectious disease. Since the COVID-19 outbreak, changes in lifestyle have been different for each administrative district [16][17][18], and the gap in the decrease in sales in retail areas has been determined by region and business type. In order to overcome this crisis, the government provided emergency disaster relief funds [19][20][21] and financial stabilization for small businesses [22,23] to recover the local economy and promote consumption activities, and the effect of recovering retail area was confirmed [24,25].
The fact that the changes in sales in retail areas due to the COVID-19 outbreak and emergency disaster relief funds differ by region means that the pattern may differ depending on the characteristics of the retail area. In other words, it means that the management and support policies implemented by relevant ministries should be differentiated according to the characteristics. This phenomenon is related to the structural characteristics, such as robustness and resilience of the retail area, but studies on the contraction and recovery of the retail area have not dealt with these characteristics.
To fill in these knowledge gaps, this study aims to analyze how the fluctuations in retail areas appear differently according to the structural characteristics, based on the experience of the changes following the COVID-19 outbreak. The study was conducted in Seoul, the capital of South Korea. In Korea, the COVID-19 pandemic began in February and March 2020, and emergency disaster relief funds and social distancing were implemented from May 2020. In order to reflect these temporal characteristics, the time range was set from January 2019 to August 2020.
The remainder of the study is organized as follows: in the next section, we briefly review the literature on behavior change due to an outbreak of infection, the robustness and resilience of urban space, and agglomeration externalities from the perspective of the structural characteristics of businesses. Then, we perform a time series cluster analysis for classification based on the changes in the retail area over time. Based on the classification, the structural characteristics of each retail area are measured, and the changes in sales and structural characteristics of the retail area are analyzed. The results of this study will be able to provide basic data for systematic policy application, such as setting the priorities, timing, and period of policy implementation by the characteristics of the retail area.

Changes in Behavior Due to Infectious Diseases
Infectious diseases with high infectivity, such as severe acute respiratory syndrome (SARS), novel swine-origin influenza A (H1N1), and Middle East respiratory syndrome (MERS), have a direct impact on lifestyle. This leads to various changes, such as altered movement patterns in urban spaces, altered consumption patterns, economic losses, and community collapse [26,27]. In particular, in the case of COVID-19, which approached pandemic levels in December 2019, the movement and activity in urban spaces decreased sharply due to distinctive characteristics such as high infectivity, the emergence of variants, a relatively long incubation period, and reinfection [28,29].
In this context, various studies have investigated the impact of the COVID-19 pandemic on the performance and sales of tourism and service industries [30][31][32], the changes in the prices of goods and services [33,34], the changes in urban behavior and consumption patterns [35,36], and the characteristics of shrinking consumption activity after lockdown [37]. Although the intensity and duration of the impacts on movement patterns may vary by social sector depending on the type of national disaster [38], these types of disasters generally have a direct impact on the social structure, leading to changes in consumption and regional economic collapse by changing the social behavior in urban spaces [39,40]. In particular, when a global pandemic is declared or a lockdown is implemented, consumption sharply declines [41].
These studies are useful for studying changes in urban residents' behavior in urban spaces and regional economic changes caused by infectious diseases triggering global pandemics. However, even though these changes may differ depending on the region and the degree of impact may vary, research on this topic is in the early stages.

Robustness and Resilience of Urban Space
Robustness refers to the ability of an existing system to retain its characteristics after an external shock, and resilience refers to the characteristics of the system changed by external shock being recovered through continuous development and change adjustment. Therefore, an understanding of the relationship between robustness and resilience is the basis for successful strategy formulation now and in the future [42].
In this regard, research on urban and regional robustness and resilience in the event of natural disasters [43][44][45], as well as urban spatial robustness studies on urban air quality, fine particulate matter, and carbon emissions based on price prediction and traffic prediction, were conducted [46,47]. In addition, studies on urban infrastructure and environment have been conducted, such as a study on predicting the robustness and resilience of the urban drainage system through the application of disaster scenarios [48].
As such, studies on robustness and resilience have mainly focused on disasters and the environment, but in recent years, studies on the robustness and resilience of urban spatial characteristics have been attempted. A considerable body of empirical studies has investigated the robustness and resilience of public transport system networks through the application of scenarios [49,50]. In addition, studies on the robustness of a place-based community [51], the characteristics of gatherings in neighborhood units [52], and the robustness of the population structure [53] were attempted. Such robustness and resilience differ according to the spatial, social, and economic characteristics of the region [54].
In particular, a retail area is a major factor in the robustness and resilience of the regional economy. Accordingly, studies on the continuity of commercial facilities in the retail area [2,55], and confirming the relationship between regional economic resilience and the characteristics of the retail area, have been conducted [14,56]. These studies emphasized the importance of maintaining the sustainability of retail areas and commercial facilities within urban spaces, as they play a large role in the growth and management of urban and regional economies [9,13]. However, only studies on the sustainability of commercial facilities and the relationship between the retail area and the local economy have been conducted, while discussions on the robustness and resilience of the retail area have been insufficient.

Commercial Facilities' Agglomeration and Externality
Commercial facilities account for a large proportion of national businesses and are representative businesses that directly affect the sustainability and resilience of urban spaces [13,14]. They agglomerate in a specific space to form a place called a retail area, which increases alternative consumption activities, forms a consumer pool, and creates an environment where both consumers and sellers can benefit [1,2,57].
The agglomeration of commercial facilities in the retail area creates an externality from a regional economic perspective. This means that an agglomeration economy is based on the physical proximity between the facilities and is an element that can encompass the regional boundaries and the potential of the economy [58]. The agglomeration economy is divided into static and dynamic externalities. In particular, dynamic externalities are factors that directly or indirectly affect a business's performance. These are based on three theories: the Marshall-Arrow-Romer (MAR) externality, in which knowledge and information spread through regional specialization; the Jacobs externality, in which higher diversity produces more clusters that can exchange knowledge and information; and the Porter externality, in which local competition has a positive effect on performance [59,60].
Among the dynamic externalities, diversity induces fundamental innovation, unlike regional specialization, which induces incremental innovation in the region [61,62]. In order to understand the structure of regions and regional economies according to the concentration of businesses, research is being conducted that divides diversity into "related diversification" and "unrelated diversification." Related diversification is diversity within a sector. The higher this characteristic is, the more consumers, markets, and resources are shared, and the more the utility of individual businesses is maximized. On the other hand, unrelated diversification is the diversity between sectors. The higher this characteristic is, the more robust the region is to economic crises or external shocks, but it also creates an environment where synergy effects are difficult to obtain due to the inconsistency of utility between consumers and suppliers [58][63][64][65][66][67]. Both "related diversification" and "unrelated diversification" are major factors that increase the continuity of industries and businesses [68][69][70]. Diversification can also be used as an index to grasp the structural characteristics of local industries and businesses [71][72][73][74] and has a very close relationship with regional economic performance [62]. Based on these theories, a number of studies have been conducted to confirm the influence of dynamic externalities, focusing on the performance and survival of industries and businesses. However, the application of this approach to commercial facilities and retail areas is still in its infancy.

Application of Time Series Clustering Approaches
Clustering is a technique that supports the identification of a structure in unlabeled data [75]. This approach has a very strong characteristic of reducing the dimensions of the dataset and being able to search for representativeness based on input data. In this process, different approaches are required based on the characteristics of the dataset to classify the individuals into distinct clusters [76].
Information in urban spaces, in the real world, is accumulated over time and is longitudinal and large in volume [77]. Such data are meaningful in themselves, but they yield greater insight when their patterns are recognized and classified. For this purpose, time series clustering approaches, which can be applied to dynamic data, are more suitable than the conventional clustering approaches that have been applied to static data [78].
Time series clustering specializes in dynamic data, whose values change over time, and allows traditional clustering analysis to be performed on time series data. This method categorizes the input data based on observations at each time point: the cluster type is identified by comparing the observations at each time point and checking their dissimilarity [79][80][81].
Recently, related studies applying time series clustering have been reported in various fields. In the field of finance, studies have compared and analyzed changes in financial assets by country [82] and daily euro exchange rate fluctuations [83] before and after the global financial crisis. In other fields, studies have examined the pattern of changes in energy consumption over time [75,84], the prediction and clustering of temporal changes in building energy consumption according to climate change using machine learning [85], the pattern of lane changes to determine the reasons for crashes [86], and the classification of districts and identification of regional centers based on changes in daily population [87].
The common aim of these studies is to find out the emerging singularity as time passes and classify the individuals into distinct clusters based on the similar patterns and characteristics of singularity. In the data used in these studies, there is a limit to the application of the ordinary clustering approach, which targets only static information. In other words, because the conventional clustering approaches for static data are not suitable for dynamic data accumulated in large quantities in the real world, their practicality is decreasing. The time series clustering approach is being adopted as a solution to this limitation. By using this approach, it is possible to compare changes in a specific phenomenon caused by external shocks or events before and after, and it is possible to reduce the dimension of information through categorization based on this singularity analysis.

Summary
It was confirmed that the COVID-19 pandemic has drastically changed the living behavior of urban residents and has imposed a direct impact on the regional economy. In particular, commercial facilities, which are a major factor in the substructure of the local economy, were directly hit by the change in consumption patterns following the beginning of the COVID-19 pandemic. These changes may show different patterns for each retail area depending on the robustness and resilience, but up until now, the retail area has been considered as a main factor in urban robustness and resilience, and research on the retail area itself is still in its infancy. Furthermore, the impact of the COVID-19 pandemic may differ depending on the structural characteristics that it affects, such as by causing diversification based on the aggregation of commercial facilities. However, research on this area has not been completed yet. From a regional economic perspective, externality based on the agglomeration of businesses is also manifested in the field of commercial facilities, which has become easier with the recent emergence of big data and the development of analysis technology. In particular, it was confirmed through a literature review that the externality of agglomeration can measure the structural characteristics of regional industries and businesses, and through this, it is possible to indirectly analyze response and recovery due to external shocks. In order to do this easily, the comparison of the situation before and after the external shock based on the time series data is most suitable.
As such, studies comparing the situation before and after an external shock have mainly been conducted to identify individuals as a distinct cluster with similar patterns of change. These studies observe changes in individuals, derive singularities, and cluster individuals with similar characteristics. After this, the characteristics of the cluster are analyzed to derive characteristics that respond to external shocks and implications for future preparation. In this process, the time series clustering approach was mainly applied, and it is being applied in various fields for analyzing changes over time due to external shocks, such as the global financial crisis and COVID-19. This provides a basis for empirically analyzing changes in characteristics of specific phenomena caused by external shocks based on real data.
This study analyzes the changes in sales in Seoul's retail areas due to the COVID-19 outbreak, categorizes them, and analyzes the structural characteristics by cluster. It is distinguished from previous work in that it analyzes the robustness and resilience of the retail area itself from the perspective of regional economics, whereas the retail area has so far been considered only as a main factor of urban robustness and resilience.

Study Areas and Data
As noted, the spatial scope is Seoul, the capital of South Korea. In general, research in the commercial sector has limitations in terms of the boundaries and spatial definition of the retail area, as well as the limitations on obtaining data on commercial facilities, which causes high locational variability [88].
However, there are data that can overcome these limitations. There are two types of retail areas because of the geographically different characteristics: district- and community-level (neighborhood) retail areas. The district-level retail areas are formed in the center of the city or district, while the community-level retail areas are formed in residential areas with local roads, pedestrian pathways, or community roads, and near district-level areas (Figure 1). These geographical characteristics generate differences in terms of internal characteristics, agglomeration externalities, and location strategies [89]. In addition, information on individual commercial facilities has been disclosed to the public due to the revision of information-related laws and systems, and this includes information such as the location, type, and scale of commercial facilities. Along with these data, this study used the sales of each retail area to understand the changes that occurred before and after the initial COVID-19 outbreak. For the sales information, the Shinhan Card sales data for domestic and foreign customers by business type in Seoul, provided by Seoul Big Data Campus, was used. This information is provided on a monthly basis and contains roughly 20 million records per month. It is provided based on a block unit smaller than the retail areas and includes daily sales information in the block (Figure 2). Shinhan Card is a subsidiary of Shinhan Financial Group and is one of seven card companies in Korea. In the third quarter of 2020, the Shinhan Card usage record (individuals and corporations) was approximately USD 34 billion, ranking first in the industry, with a market share of around 22% [90]. As a representative card company in Korea, its data were used because they could represent the fluctuations in sales. In this research, the data were used for 20 months, which is the time range of the study.
In total, roughly 360 million records of daily sales, provided in block units by business type, were aggregated and refined by retail area and month. The block unit is larger than the parcel, but smaller than the retail area, and is the smallest unit among the available data related to sales. The total number of blocks in Seoul is 68,216, and 20,707 blocks are spatially included in the boundaries of the retail areas. Since one retail area contains multiple blocks, the blocks belonging to each retail area were identified, and the total sales of those blocks were divided by the floor area of the retail area to obtain the daily sales per unit area (m²) for each retail area. For the analysis, these values were then converted into the average monthly sales per unit area for each retail area. The process of refining the sales data, which are big data, is shown in Figure 3. Along with these data, this study used 253 district-level retail areas and 1010 community-level retail areas as a spatial range. In order to measure the agglomeration externalities, which are structural characteristics of each retail area, approximately 2.7 million records of quarterly commercial facilities information from the fourth quarter of 2018 to the second quarter of 2020 were used.
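The aggregation described above can be sketched in a few lines. This is only a minimal illustration, not the authors' pipeline: the record layout `(block_id, date, sales)`, the ISO date strings, and the names `block_to_area` and `floor_area_m2` are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def monthly_sales_per_area(records, block_to_area, floor_area_m2):
    """Aggregate block-level daily sales to monthly sales per m2 by retail area.

    records: iterable of (block_id, "YYYY-MM-DD", sales) tuples (assumed layout).
    block_to_area: {block_id: retail_area_id}; blocks outside any area are skipped.
    floor_area_m2: {retail_area_id: floor area in square metres}.
    Returns {(retail_area_id, "YYYY-MM"): mean daily sales per m2 in that month}.
    """
    # Step 1: sum the sales of all blocks in a retail area for each day.
    daily_total = defaultdict(float)
    for block_id, date, sales in records:
        area = block_to_area.get(block_id)
        if area is None:
            continue  # block lies outside every retail-area boundary
        daily_total[(area, date)] += sales

    # Step 2: divide by floor area, then average the daily values per month.
    per_month = defaultdict(list)
    for (area, date), total in daily_total.items():
        per_month[(area, date[:7])].append(total / floor_area_m2[area])
    return {key: mean(values) for key, values in per_month.items()}
```

Skipping blocks that have no retail area mirrors the statement that only 20,707 of the 68,216 blocks fall inside a retail-area boundary.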

Measurement of the Structural Characteristics of Retail Areas
In this research, two diversification indices, related and unrelated diversification, were used to measure the structural characteristics of the retail area. The related diversification index (intradiversification) measures diversification within a business type, that is, the diversity within each major classification of commercial facilities. The unrelated diversification index (interdiversification) measures diversification between business types, that is, the diversity across the major classifications of commercial facilities.
In the commercial facilities information data, there are 8 major classifications, 95 middle classifications, and 837 subclasses. In this study, the diversification indices within and between business types were calculated using the major and middle classifications. To measure the diversification in retail areas, the related and unrelated diversification indices (Equations (1) and (2)) were computed, where $RV_j$ is the related diversification and $UV_j$ is the unrelated diversification of retail area $j$; $a$ indexes the middle classifications of business types that fall exclusively under major classification $i$ of the commerce sector, where $i = 1, 2, \ldots, n$; and $p_{aj}$ is the proportion of middle classification $a$ in retail area $j$. These indices were computed separately by retail area type, that is, for district- and community-level retail areas. Therefore, the number of retail areas is 253 for the district level and 1010 for the community level.
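The sketch below illustrates how such indices are commonly computed. It assumes the standard entropy decomposition, in which unrelated diversification is the entropy of the major-classification shares and related diversification is the share-weighted entropy within each major classification; the exact functional form, the base-2 logarithm, and the function and variable names are assumptions, not taken from the study's Equations (1) and (2).

```python
import math
from collections import defaultdict


def diversification_indices(counts, major_of):
    """Return (related, unrelated) diversification for one retail area.

    counts: {middle_classification: number of facilities} in the retail area.
    major_of: {middle_classification: major_classification}.
    Assumed entropy form (log base 2):
      UV = sum_i P_i * log2(1 / P_i)
      RV = sum_i P_i * sum_{a in i} (p_a / P_i) * log2(P_i / p_a)
    where p_a is the share of middle classification a and P_i is the
    summed share of major classification i.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("retail area has no facilities")
    shares = {a: n / total for a, n in counts.items() if n > 0}

    major_share = defaultdict(float)
    for a, p in shares.items():
        major_share[major_of[a]] += p

    unrelated = sum(P * math.log2(1.0 / P) for P in major_share.values())
    # P_i * (p_a / P_i) simplifies to p_a, so the weighted sum is taken directly.
    related = sum(p * math.log2(major_share[major_of[a]] / p)
                  for a, p in shares.items())
    return related, unrelated
```

Under this assumed form the two indices sum to the total entropy over the middle classifications, which offers a convenient sanity check on any implementation.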

Time Series Clustering
The main analytical methodology of this research was time series clustering, which is based on a time series dataset. Because the goal of the classification is to discover a new set of clusters, the new groups are meaningful in themselves [78,79].
In this research, using the time series clustering method, retail areas in Seoul were classified according to the average monthly sales per unit area of each retail area. For this, a partitional clustering algorithm and the global alignment kernel distance were adopted. In partitional clustering, each data point is assigned to exactly one of k clusters. First, k data points are randomly selected as initial centroids, and each data point is assigned to the cluster of its nearest centroid. This process is then repeated until every data point is assigned to one of the designated clusters and the assignments no longer change, which yields the final clusters. The method is stochastic because the initial centroids are chosen at random. In addition, because it refines the partition iteratively, it converges to a good (though not necessarily globally optimal) classification while reducing the complexity of the data, so it is mainly applied to very large datasets [80].
The global alignment kernel (GAK) distance was applied to measure the distance between time series. This method considers the cost of all possible alignments by computing their exponentiated soft minimum using a local similarity function. Compared with other methods, it has the advantage of quantifying similarity in a relatively consistent way [91,92]. Then, in order to identify the centroids of the time series clusters, a step known as time series prototyping is performed. This step efficiently summarizes the time series dataset and helps in understanding the characteristics of each cluster. In this research, the shape extraction method was applied to identify the centroids. This has the advantage of setting the centroids according to the pattern of each cluster over time [80].
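The assign-and-repeat loop described above can be illustrated with a small sketch. Two simplifications are assumptions made only for the example: the centroid of a cluster is updated to its medoid (the member with the smallest total distance to the other members) rather than by shape extraction, and plain Euclidean distance stands in for the time series distance. All function names are hypothetical.

```python
import random


def euclidean(x, y):
    """Placeholder distance between two equal-length series."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def partitional_clustering(series, k, distance=euclidean, max_iter=100, seed=0):
    """Assign each series to exactly one of k clusters.

    Returns (labels, centroid_indices). Centroids are medoids here, which is
    a simplification for illustration, not the shape-extraction prototype.
    """
    rng = random.Random(seed)
    # Random initial centroids make the procedure stochastic.
    centroids = rng.sample(range(len(series)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each series goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: distance(s, series[centroids[c]]))
            for s in series
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the medoid of its cluster.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = min(
                    members,
                    key=lambda i: sum(distance(series[i], series[j])
                                      for j in members),
                )
    return labels, centroids
```

Because the `distance` argument is pluggable, a time series distance can be passed in place of the Euclidean placeholder without changing the loop.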
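To make the idea of summing over all alignments concrete, the following sketch computes a global alignment kernel by dynamic programming in log space and turns it into a normalized distance. It follows the commonly published formulation with a Gaussian-based local kernel; the bandwidth `sigma`, the normalization, and the function names are assumptions for illustration and are not necessarily the settings used in this study.

```python
import math


def _log_local_kernel(a, b, sigma):
    """Log of the local similarity g / (2 - g), with g a Gaussian kernel."""
    d2 = (a - b) ** 2
    g = math.exp(-d2 / (2.0 * sigma ** 2))
    return -d2 / (2.0 * sigma ** 2) - math.log(2.0 - g)


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gak(x, y, sigma=1.0):
    """Log of the kernel: a soft sum over every alignment of x and y."""
    n, m = len(x), len(y)
    table = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0  # log(1): the empty alignment
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            previous = _logsumexp(
                [table[i - 1][j], table[i][j - 1], table[i - 1][j - 1]]
            )
            table[i][j] = _log_local_kernel(x[i - 1], y[j - 1], sigma) + previous
    return table[n][m]


def gak_distance(x, y, sigma=1.0):
    """Normalized distance: 0 for identical series, approaching 1 otherwise."""
    log_kxy = log_gak(x, y, sigma)
    log_norm = 0.5 * (log_gak(x, x, sigma) + log_gak(y, y, sigma))
    return 1.0 - math.exp(log_kxy - log_norm)
```

Working in log space avoids underflow when many small local similarities are multiplied along long alignments. The choice of `sigma` strongly affects the result and would normally be estimated from the data.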
In this study, the above algorithm is applied to analyze sales fluctuations due to the outbreak and spread of COVID-19 in Seoul's retail areas, and to analyze the characteristics of each cluster by clustering individuals with similar singularities.

Analysis of Variance (ANOVA)
After categorizing the retail areas, a one-way analysis of variance (ANOVA) was used to test the statistical significance of the classification. ANOVA is a well-known method for testing the significance of differences between group means: it statistically confirms the difference in the mean of each cluster by comparing the variance between the clusters with the variance within the clusters. In this research, to confirm statistically significant differences between retail area clusters, the F-value was used, which is estimated as follows (Equations (5) and (6)):

$$F = \frac{MS_B}{MS_W} \qquad (5)$$

$$MS_B = \frac{BSS}{df_B}, \qquad MS_W = \frac{WSS}{df_W} \qquad (6)$$

where $F$ is the F-value, that is, the ratio of the variance between the clusters to the variance within the clusters, which follows the F distribution; $MS_B$ is the mean variance between the retail area clusters; $BSS$ is the sum of squares between the retail area clusters; $df_B$ is the degree of freedom between the retail area clusters; $MS_W$ is the mean variance within the retail area clusters; $WSS$ is the sum of squares within the retail area clusters; and $df_W$ is the degree of freedom within the retail area clusters.
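As a minimal illustration of this F-value, the sketch below computes the between- and within-cluster sums of squares directly from grouped values. The function name and the toy input are hypothetical, and the degrees of freedom follow the usual one-way convention (number of groups minus one, and number of observations minus number of groups).

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F-value for a list of groups of numbers."""
    k = len(groups)                      # number of clusters
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-cluster sum of squares (BSS) and within-cluster sum of squares (WSS).
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    wss = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = k - 1
    df_within = n - k
    ms_between = bss / df_between
    ms_within = wss / df_within
    return ms_between / ms_within
```

For the toy groups `[1, 2, 3]`, `[2, 3, 4]`, and `[5, 6, 7]`, BSS is 26 with 2 degrees of freedom and WSS is 6 with 6 degrees of freedom, so the F-value is 13.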
In this study, one-way variance analysis was performed using group information derived through time series cluster analysis. Through this, it was checked whether individual groups' change patterns and singularity characteristics showed statistical differences.

COVID-19 Outbreak and Spread in Korea
After the first confirmed case of COVID-19 in Korea, patient zero, occurred in January 2020, the number of new confirmed cases peaked at 909 on 29 February 2020. This spread occurred through small and large gatherings across the country. Accordingly, the Korean government implemented "social distancing" to control the spread, which drastically changed people's lifestyles. In order to promote economic activity, an emergency COVID-19 relief fund was provided. The outbreak and spread of COVID-19 in Korea, together with the characteristics of each major time point, are shown in Figure 4.

Categorization and Comparison of Retail Areas Based on Sales
In this research, to determine the change in sales in retail areas before and after COVID-19, the average monthly sales per unit area were used. To do this, we set the number of classifications in retail areas to four in order to consider both the high and low relative sales before and after the COVID-19 outbreak. The classification applies separately based on the type of retail area (i.e., a district-level or community-level retail area).
This study aims to analyze the robustness and resilience of retail areas through the changes in sales before and after the COVID-19 outbreak. As described above, robustness can be seen as the ability to withstand external shocks, and resilience can be seen as the ability to recover to the previous situation after external shocks. The change in sales due to COVID-19, an external shock, is an event that can indirectly confirm this. In other words, retail areas with a low decrease in sales after the outbreak and spread of COVID-19 can be seen as areas with relatively high robustness, while those with high sales increase after the provision of the emergency relief fund to recover from the COVID-19 outbreak can be seen as areas with relatively high resilience.
Therefore, in this study, the outbreak and spread of COVID-19 were considered as external shocks, and an emergency relief fund was considered as continuous development and changes involving an adjustment to recover from those shocks. Then, through comparing and analyzing changes in sales in retail areas over time by type and cluster, the robustness and resilience of retail areas were indirectly confirmed.
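The two indirect indicators can be expressed as simple differences of a standardized monthly sales series. The sketch below is an illustration only: the month keys, the default reference months (December 2019, March 2020, and May 2020, taken from the time points discussed in this study), and the function name are assumptions, and the sample values in the usage note are hypothetical.

```python
def shock_and_recovery(series, before="2019-12", trough="2020-03", after="2020-05"):
    """Summarize a standardized sales series around an external shock.

    series: {"YYYY-MM": standardized sales} for one retail area or cluster centroid.
    Returns (drop, gain, net):
      drop: change from `before` to `trough`; closer to zero suggests higher robustness.
      gain: change from `trough` to `after`; larger suggests higher resilience.
      net:  change from `before` to `after`, equal to drop + gain.
    """
    drop = series[trough] - series[before]
    gain = series[after] - series[trough]
    net = series[after] - series[before]
    return drop, gain, net
```

For a hypothetical series with values 2.0, -1.5, and 0.5 at the three reference months, the function returns a drop of -3.5, a gain of 2.0, and a net change of -1.5.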

Categorization of District-Level Retail Areas and Comparison by Cluster
The change in monthly sales in the district-level retail areas from January 2019 to August 2020 is shown in Figure 5 below, and the characteristics by time point are shown in Table 1. Standardized values were used in consideration of the relative qualities of individual retail areas by time point and the export policy for the original data.
Note. (1): COVID-19 outbreak; (2): Emergency COVID-19 relief fund.
As shown in Table 1, the average and maximum sales in December 2019, before the COVID-19 outbreak, showed the highest values within the study range. In March 2020, when the spread of COVID-19 was severe, the lowest value was seen. Based on these characteristics, the categorizing of retail areas into clusters is shown in Figure 6 and Table 2. In Figure 6, the gray solid lines represent the entire district-level retail area, the colored solid lines represent the district-level retail area corresponding to each distinct cluster, and the dashed line represents the centroid of the cluster.
From December 2019 until March 2020, when the COVID-19 outbreak and spread was severe, the decrease in sales was largest in cluster 1, followed by cluster 3, cluster 4, and cluster 2. In the case of cluster 1, the decrease in sales was −4.855, which was roughly twice that of cluster 2 (−2.210), the lowest decrease. Cluster 1 can thus be seen as the hardest hit, owing to the low robustness of its retail areas. From then until May 2020, when an emergency COVID-19 relief fund was paid, the increase in sales was largest in cluster 1, followed by cluster 2, cluster 4, and cluster 3. In the case of cluster 1, the increase in sales was 3.210, which is around twice that of cluster 3 (1.620), which had the lowest increase. This suggests that the resilience of the retail areas in cluster 1 was high, and it can be seen that the recovery was the fastest for this cluster.
Taken together, ranking the clusters by the overall change in sales before and after COVID-19 gives the order cluster 2, cluster 4, cluster 1, and cluster 3. Comparing December 2019, before the COVID-19 outbreak, with May 2020, after the outbreak and spread, only cluster 2 shows a positive change, combining the smallest decrease in sales (−2.210) with the second-highest increase in sales (2.335). Cluster 2 has the highest robustness and therefore the smallest decrease in sales, whereas the other clusters have not recovered to their level of sales before COVID-19.
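The comparison above can be checked with simple arithmetic. The sketch below uses only the standardized values quoted in the text for clusters 1 and 2, and it assumes that the increase is measured from the March 2020 level, so that the decrease and the increase add up to the net change between December 2019 and May 2020. The variable and function names are illustrative.

```python
# Standardized sales changes reported in the text (district-level clusters).
# decrease: December 2019 to March 2020; increase: up to May 2020 (assumed
# to be measured from the March 2020 level).
reported = {
    "cluster 1": {"decrease": -4.855, "increase": 3.210},
    "cluster 2": {"decrease": -2.210, "increase": 2.335},
}

def net_change(change):
    """Net change from December 2019 to May 2020 (decrease plus increase)."""
    return round(change["decrease"] + change["increase"], 3)

net = {name: net_change(c) for name, c in reported.items()}

# Ratio of the largest drop to the smallest drop ("roughly twice").
drop_ratio = reported["cluster 1"]["decrease"] / reported["cluster 2"]["decrease"]

for name, value in net.items():
    print(f"{name}: net change {value:+.3f}")
print(f"drop ratio (cluster 1 / cluster 2): {drop_ratio:.2f}")
```

Under these assumptions cluster 2 ends slightly above its December 2019 level (+0.125), while cluster 1 remains below it (−1.645) despite having the largest rebound.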
To confirm the functional differences in sales change among the clusters of district-level retail areas, the characteristics of buildings in the retail areas were analyzed (Table 3). District-level retail areas are mainly located in commercial and business zones, station areas, etc. Therefore, in cluster 2, which has a high ratio of business usage buildings, and cluster 4, which has a high ratio of commercial usage buildings, the decrease in sales is relatively low because these areas continue to attract consumers. On the other hand, clusters 1 and 3 have a low proportion of residential usage buildings, so they are at a relative disadvantage compared with other clusters in forming a base of potential consumers from the resident population, and their decrease in sales therefore seems to be high. However, there was no statistically significant difference in the functional characteristics by sales pattern type in district-level retail areas.
After confirming the functional characteristics, related and unrelated diversification were analyzed by cluster of district-level retail areas in order to understand the structural characteristics formed through the agglomeration of commercial facilities in the retail area; the results are shown in Table 4. Cluster 2, which has the lowest decrease in sales among district-level retail areas, has the lowest related diversification and the highest unrelated diversification. In other words, district-level retail areas with lower related diversification (intradiversification) and higher unrelated diversification (interdiversification), where each commercial function (major classification) is concentrated in a small number of middle classifications and the diversity of commercial functions is high, are relatively robust and show a smaller decrease in sales.
This means that individual commercial functions become concentrated in a small number of middle classifications, which minimizes the alternatives for each commercial function in retail areas. Retail areas with these structural characteristics can be considered relatively robust because they provide an environment that supports consumption activities that remain essential even in the event of national disasters such as COVID-19. On the other hand, clusters 1 and 3, with above-average related diversification, have a high decrease in sales. Higher related diversification, that is, greater diversity within a major classification, means that individual commercial functions are distributed across multiple middle classifications. This can create an innovative environment in the retail area, but the distribution and consumption of individual commercial activities are not concentrated because of the increase in alternatives to commercial functions. As a result, in the event of a national disaster in which only essential consumption activities occur, sales decrease relatively sharply.
After the COVID-19 spread, clusters 3 and 4, which had low sales increases among the district-level retail areas, had low unrelated diversification. This means that the lower the unrelated diversification, that is, the lower the diversity of commercial functions (major classifications) in a retail area, the lower the resilience and the increase in sales. In cluster 3, a high increase in sales was expected because of its high related diversification, but the increase was low; in cluster 2, the opposite was true. On the other hand, cluster 1, where both related and unrelated diversification were above average, had the highest increase in sales. This means that diversity between commercial functions (unrelated diversification) is a structural characteristic that matters more for the resilience of a retail area than diversity within individual commercial functions (related diversification).
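To make the two structural measures concrete, the following sketch computes them with the commonly used entropy decomposition: unrelated diversification as the entropy across major classifications, and related diversification as the share-weighted entropy within each major classification. This is an assumption for illustration; the measure actually used in the study is defined in its methods section and may differ. The function name, category names, and facility counts are hypothetical.

```python
from math import log

def diversification(counts):
    """Entropy-based (unrelated, related) diversification.

    counts maps each major classification to a dict of
    middle classification -> number of commercial facilities.
    """
    total = sum(sum(middle.values()) for middle in counts.values())
    unrelated = 0.0
    related = 0.0
    for middle in counts.values():
        group_total = sum(middle.values())
        if group_total == 0:
            continue  # an empty major classification contributes nothing
        share = group_total / total
        # Entropy across major classifications.
        unrelated -= share * log(share)
        # Entropy within this major classification.
        within = -sum((n / group_total) * log(n / group_total)
                      for n in middle.values() if n > 0)
        related += share * within
    return unrelated, related

# Three functions, each concentrated in one middle classification.
concentrated = {
    "food": {"restaurant": 50},
    "retail": {"grocery": 50},
    "service": {"laundry": 50},
}
# One function spread over three middle classifications.
spread = {"food": {"restaurant": 50, "cafe": 50, "bakery": 50}}

u_conc, r_conc = diversification(concentrated)
u_spread, r_spread = diversification(spread)
```

In this toy example the `concentrated` area has high unrelated and zero related diversification, the profile the text associates with cluster 2, while the `spread` area has the reverse profile.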

Categorization of Community-Level Retail Areas and Comparison by Cluster
The changes in monthly sales of the community-level retail areas from January 2019 to August 2020 are shown in Figure 7 below, and the characteristics by time point are shown in Table 5. As shown in Table 5, the pattern of sales change in the community-level retail areas before and after COVID-19 is similar to that of district-level retail areas. Based on these characteristics, the results of categorizing retail areas into clusters are shown in Figure 8 and Table 6. In Figure 8, the gray solid lines represent all community-level retail areas, the colored solid lines represent the community-level retail areas belonging to each cluster, and the dashed line represents the centroid of the cluster.
From December 2019 until March 2020, when the COVID-19 outbreak and spread were severe, the decrease in sales was largest in cluster 4, followed by cluster 3, cluster 2, and cluster 1. The decrease in cluster 4 was −4.643, roughly five times that of cluster 1 (−0.898), which had the smallest decrease. Because small commercial facilities are concentrated in community-level retail areas, the gap between clusters is more severe than in district-level retail areas. Until May 2020, when the emergency COVID-19 relief fund was paid, the increase in sales was largest in cluster 2, followed by cluster 4, cluster 1, and cluster 3. The increase in cluster 2 was 5.029, more than twice that of cluster 3 (2.151), which had the smallest increase. This indicates that the resilience of the retail areas in cluster 2 is high and that its recovery was the strongest.
Taken together, ranking the clusters by the overall change in sales before and after the first COVID-19 wave gives the order cluster 2, cluster 1, cluster 3, and cluster 4. Comparing December 2019, before the COVID-19 outbreak, with May 2020, after the initial outbreak and spread, only clusters 2 and 1 showed positive changes in sales.
Note. (1): COVID-19 outbreak; (2): emergency COVID-19 relief fund.
To confirm the functional differences in sales change among the clusters of community-level retail areas, the characteristics of buildings in the retail areas were analyzed (Table 7). Community-level retail areas are formed mainly in residential zones and around the district-level retail areas. Therefore, the ratio of residential usage buildings was higher than in district-level retail areas. Cluster 2 is the type with the highest increase in sales, and its high housing ratio can be seen as forming a hinterland with stable potential consumers. Cluster 4 had the highest decrease in sales and the highest ratio of commercial usage buildings, while cluster 1 had the lowest decrease in sales and the lowest ratio of commercial usage buildings. Cluster 3 had the lowest increase in sales and the highest ratio of business usage buildings. In other words, unlike district-level retail areas, which induce external inflows based on strong commercial and business functions, community-level retail areas have the resident population as their major consumers, so commercial and business functions do not act as a positive factor for a decrease or increase in sales.
After confirming the functional characteristics, related and unrelated diversification were analyzed by cluster of community-level retail areas in order to understand the structural characteristics formed through the accumulation of commercial facilities in a retail area; the results are shown in Table 8.
Cluster 1, which had the lowest decrease in sales among the community-level retail areas, has the lowest related diversification and relatively high unrelated diversification. Cluster 1 had the lowest ratio of commercial usage buildings, and individual commercial functions were concentrated in a small number of middle classifications. As in the district-level retail areas, the high diversity of commercial functions provided a commercial environment that supported consumption activities that remain essential even in the event of a national disaster, resulting in high robustness. Comparing the smallest decrease in sales in community-level retail areas (−0.898) with that in district-level retail areas (−2.210) shows that community-level retail areas better maintained their sales after a national disaster; that is, their robustness was relatively high. This is judged to result from the stable, fixed base of potential consumers (resident population) together with the functional characteristics of the community-level retail area, which has strong residential functions.
After the COVID-19 spread, clusters 2 and 4 experienced high sales increases among the community-level retail areas, and there were statistically significant differences in the structural characteristics. However, the relationship between resilience and structural characteristics was unclear. As in district-level retail areas, cluster 2 had low related diversification but the highest unrelated diversification, and its resilience was relatively high. However, cluster 4 showed relatively high resilience despite its low unrelated diversification. For cluster 4, the structural difference from other clusters in the community-level retail areas was unclear, but the ratio of commercial usage buildings was relatively high.
On the other hand, in clusters 1 and 3, which had a low sales increase and thus low resilience, the difference in structural characteristics was unclear, but the ratio of commercial usage buildings was lower than average. This shows that differences in the resilience of community-level retail areas arise from functional characteristics (commercial functions) rather than structural characteristics.

Summary
This study confirmed that big data on sales can be used to analyze the sales patterns of retail areas and to categorize the areas on that basis. Seoul's district- and community-level retail areas were classified into four clusters based on the change in sales, and the functional and structural characteristics of each cluster were statistically confirmed.
The functional characteristics based on building usage in district- and community-level retail areas differed by cluster in terms of robustness and resilience. Regarding the ratio of residential usage buildings, which relates to the resident population in the retail area, the decrease in sales was relatively high in district-level retail areas, which are mainly formed in commercial and business zones with a low housing ratio. In community-level retail areas, which are mainly formed in residential zones, the increase in sales was high when the ratio of residential usage buildings was high. In district-level retail areas, the higher the ratio of commercial usage buildings, which relates to external consumers, the lower the decrease in sales.
Structural characteristics in Seoul's district- and community-level retail areas differed by cluster in terms of robustness and resilience. When each commercial function (major classification) was concentrated in a small number of middle classifications, so that related diversification was low, and when unrelated diversification was high because of the high diversity of commercial functions, the decrease in sales was low in both district- and community-level retail areas. On the other hand, in district-level retail areas, the increase in sales was highest when both the diversity between commercial functions (unrelated diversification) and the diversity within a commercial function (related diversification) were above average.
However, in the case of community-level retail areas, the increase in sales differed according to functional characteristics (commercial functions) rather than structural characteristics.

Conclusions
The COVID-19 pandemic has changed the living patterns of urban residents, shrinking the domestic regional economy and directly and indirectly impacting the commercial sector. COVID-19, which is highly contagious, represents a national disaster that has directly changed the lifestyles of urban residents. In the commercial sector, the sustainability of urban and regional economies has suffered various blows, such as decreased sales and business closures. In the event of a national disaster, the contraction and recovery of a local economy or retail area may vary by its characteristics, but related research is insufficient, which motivated this study. This study analyzed the contraction and recovery of retail areas according to their structural characteristics, focusing on the periods just before and after the COVID-19 outbreak. The robustness and resilience of the retail areas were analyzed indirectly through the change in sales, and in this process, diversification, which is a structural characteristic, was used based on a regional economic perspective.
The main findings are as follows. First, by using big data from before and after a national disaster such as COVID-19, it is possible to categorize retail areas according to changes in their sales. In this study, sales data of around 360 million units were used to categorize the sales patterns of retail areas before and after the COVID-19 outbreak. This approach can provide a methodology and basic data for the differentiated management of local economic development or retail areas in relation to public policies during the occurrence of, and recovery from, a national disaster.
Second, the structural characteristics of retail areas, which differ by cluster, can be measured using big data. Dynamic externalities arising from agglomeration have been used as a variable to measure structural characteristics from a regional economic perspective. In this study, this concept was used to measure the agglomeration of commercial facilities and the structural characteristics, and the differences by cluster were identified. Regarding public policies for retail area recovery after a national disaster, these results imply the necessity of establishing customized policies, as they confirm that sales patterns change according to differences in structural characteristics by retail area type and cluster.
Third, the district-level retail area has higher resilience as the related diversification is lower. This means that individual commercial functions are concentrated in a small number of middle classifications. Considering that only essential consumption activities occur after a national disaster, the resilience is relatively high in retail areas where commercial functions are concentrated. In addition, district- and community-level retail areas have higher robustness as the unrelated diversification is higher. This means that the diversity of commercial functions is high, and the more varied the commercial functions are, the less damage is caused by external shocks such as national disasters. Furthermore, this means that the economy of the retail area will not collapse even if some commercial functions are paralyzed. These results are meaningful in confirming the characteristics of retail areas that public policy for retail area management requires. They also suggest the necessity of analyzing these characteristics for sustainable retail area management.
Follow-up research is needed to address the following limitations. First, this study only targeted retail areas located in Seoul, and there was insufficient consideration of other cities that may have different patterns from Seoul. Second, although there may be differences by subclass of commercial facility, sufficient consideration was not given to this in the process of aggregating and analyzing large-scale big data. Third, there are limitations in the analysis due to the data export policy of the Seoul Big Data Campus, the source of the original data. Only a comparative analysis was performed on the functional and structural characteristics of each district- and community-level retail area cluster; an empirical analysis of the influence of structural characteristics on the decrease and increase in sales was not performed.
Nevertheless, this study is meaningful in that it analyzed the robustness and resilience of retail areas, an area of research still in its infancy, through the sales fluctuations before and after the COVID-19 outbreak. In addition, the regional economic perspective of industry- and business-related research fields was applied to the commercial sector. Moreover, robustness and resilience were compared across clusters categorized by the similarity and singularity of sales fluctuations. This study is expected to contribute to the expansion of research on the robustness and resilience of retail areas and local economies within cities and regions in the event of a national disaster.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Shinhan Card sales data are provided only to visitors who comply with the export policy of the Seoul Big Data Campus, and information on the retail areas and commercial facilities is available at the public data portal (https://www.data.go.kr/, accessed on 16 March 2021) provided by the National Information Society Agency in Korea.