Article

Exploring the Factors Influencing the Spread of COVID-19 Within Residential Communities Using a Big Data Approach: A Case Study of Beijing

1 Beijing Tsinghua Tongheng Urban Planning & Design Institute Co., Ltd., Beijing 100085, China
2 School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
3 School of Architecture, Tsinghua University, Beijing 100084, China
* Authors to whom correspondence should be addressed.
Buildings 2025, 15(13), 2186; https://doi.org/10.3390/buildings15132186
Submission received: 30 April 2025 / Revised: 9 June 2025 / Accepted: 14 June 2025 / Published: 23 June 2025
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

Abstract

The COVID-19 pandemic has profoundly influenced urban planning and disease management in residential areas. Focusing on Beijing as a case study (3898 communities), this research develops a big data analytics framework integrating anonymized mobile phone signals (China Mobile), location-based services (AMAP.com), and municipal health records to quantify COVID-19 transmission dynamics. Using logistic regression, we analyzed 15 indicators across four dimensions: mobility behavior, host demographics, spatial characteristics, and facility accessibility. Our analysis reveals three key determinants: (1) Population aged 65 and above (OR = 62.8, p < 0.001) and (2) housing density (OR = 9.96, p = 0.026) significantly increase transmission risk, while (3) population density exhibits a paradoxical negative effect (β = −3.98, p < 0.001) attributable to targeted interventions in high-density zones. We further construct a validated risk prediction model (AUC = 0.7; 95.97% accuracy) enabling high-resolution spatial targeting of non-pharmaceutical interventions (NPIs). The framework provides urban planners with actionable strategies—including senior activity scheduling and ventilation retrofits—while advancing scalable methodologies for infectious disease management in global urban contexts.

1. Introduction

The COVID-19 pandemic has underscored the importance of routine public health measures in urban governance, particularly in China, where community-level management has been pivotal in controlling the spread of the virus. Residential communities are not only essential for the health and well-being of urban residents, but also pose potential risks for virus exposure and infection, and therefore play a critical role in the overall mechanism of urban epidemic control. The municipality of Beijing encompasses a vast network of over 12,000 residential communities, within which resides a population exceeding 23 million. With numerous communities and limited resources for epidemic prevention, identifying factors that influence the spread of COVID-19 is essential for developing more effective non-pharmaceutical interventions (NPIs). Additionally, recognizing these factors and their interactions can assist in predicting community-level epidemic risks, enabling the timely and precise allocation of epidemic prevention resources. This is particularly important in the context of urban planning and governance activities, where understanding the dynamics of community transmission can inform the implementation of effective measures to mitigate the spread of the virus [1]. Big data, represented by mobile signal and Location-Based Services (LBS) data, can support more accurate information and a larger sample size, facilitating a more precise understanding of various potential factors influencing community transmission of COVID-19 and the extent of their impact, which may lead to more reliable research conclusions and help identify patterns and correlations that might be missed through traditional epidemiological methods [2,3]. Despite the potential of big data, there is a relative scarcity of research utilizing these methods to explore the factors influencing the transmission of COVID-19 between and within communities. 
This gap in the literature suggests a need for more studies that leverage the power of big data to inform urban planning and public health strategies, and to provide valuable insights into how to better manage and mitigate the spread of infectious diseases in urban settings.
In recent years, an increasing body of research has established a connection between urban spatial governance and urban health and disease prevention, with big data providing new and effective methods for exploring the relationship between urban spatial characteristics and epidemic transmission. Accordingly, big data technology can be applied in four general aspects: (1) to describe the features of epidemic transmission; (2) to explain the mechanism of epidemic transmission; (3) to evaluate the risk of transmission; and (4) to support planning and control [4]. In the realm of feature analysis, Ding et al. (2021) leveraged incidence data from 2014 to 2015 to chart the spatial distribution of 14 natural zoonotic diseases across China, revealing distinct patterns [5]. Similarly, Cao et al. (2022) deconstructed the spatiotemporal distribution of neural tube defects in Yuanping City, Shanxi Province, uncovering underlying regularities. Mechanistic explanations have been advanced through studies such as that of Cao, who employed regression modelling to examine the incidence of Hand, Foot, and Mouth Disease (HFMD) in Shenzhen City in relation to geographical environmental factors, including road density and population density [6]. Liu conducted a retrospective cohort study among COVID-19 patients in Shanghai, pinpointing several indoor transmission risks during home isolation [7]. In terms of risk evaluation, Xie and Ye have highlighted the use of big data for rapid assessment of disease spread and risk [8,9,10]. The development of smart visualization platforms for epidemic prevention and control, as suggested by Wu, has been instrumental in this domain [11]. A smart visualization platform built on big data analysis, geographic information technology, and related technologies can quickly capture the spatial distribution of epidemics and provide spatial early warning and precise positioning of specific populations.
In terms of spatial planning and management, Zhou and Yang have discussed the role of big data in urban planning and management [12,13]. Bajardi has advocated the use of big data for urban early warning systems, emphasizing the importance of real-time monitoring and decision-making [14]. Bayrsaikhan used online surveys to study changes in people’s choices of leisure places in Seoul, drawing several implications for urban leisure place planners and service providers [15]. The integration of big data and artificial intelligence techniques in urban planning can enable planners to initiate urban early warning systems from a holistic perspective and make planning decisions in accordance with the epidemic impact. Despite these advancements, the literature acknowledges a shortfall in practical applicability. He and Sharifi identified a limitation: the factors influencing epidemic transmission examined in existing studies are difficult to act on through planning and governance, which reduces the applicability and effectiveness of the proposed measures [16,17]. Furthermore, Lee and Xie noted that current research methodologies may not be granular or frequent enough to meet the needs of planning and governance for fine-grained risk prediction and management at the residential level, challenging the field’s ability to respond to the rapidly evolving and intricate nature of epidemics [18,19]. Moreover, today’s rapid flows of data, population, and resources pose new challenges for urban spatial management and urban health and disease prevention, requiring more advanced and effective big data techniques and decision-making tools [17,20].
To address these gaps, it is necessary to identify the specific spatial planning and governance elements that are susceptible to influence or can influence epidemic transmission and refine big data methodologies and applications for planning and governance practices concerning epidemics, focusing on precise and high-frequency risk prediction and guidance [19,21].
This study employs a big-data-driven methodology to investigate the determinants of COVID-19 transmission within residential communities. The objective is to enhance the understanding of the dynamics underlying the spread of COVID-19, thereby facilitating the formulation of precise, high-resolution, and high-frequency risk assessments. This research also aims to contribute to the development of targeted non-pharmaceutical interventions (NPIs) within the sphere of urban planning and governance, particularly in the context of routine public health measures regarding residential communities.
The study is completed based on the following steps: Initially, a comprehensive literature review is conducted to identify key urban planning and governance factors that could contribute to the increased spread of COVID-19 in residential areas. This step is essential for understanding the potential impacts of urban structural elements on disease transmission. Following this, a data-driven methodology is developed, leveraging big data analytics to quantify COVID-19 transmission and its associated factors. The methodology primarily relies on mobile phone signal data and location-based services (LBS) data, providing a detailed and expansive dataset for analysis. The empirical research is conducted using residential communities in Beijing as case studies, offering a practical application of the developed methodology. A logistic regression analysis is employed to determine which factors are significantly linked to the transmission of COVID-19, and to examine how these factors interact. Finally, a risk prediction model is constructed to further analyze how these factors influence the occurrence of COVID-19 cases within communities. This model aims to provide a deeper understanding of the relationships between urban planning, governance, and the spread of the disease, ultimately informing more effective public health strategies.

Literature Review—Factors Influencing the Spread of COVID-19 in Residential Areas

According to the current scientific evidence, COVID-19 transmission can occur through four major modes: direct contact, indirect contact (fomite), droplet, and aerosol [22]. The relative contribution of each mode to the transmission of SARS-CoV-2 in different settings is still uncertain and may vary depending on various virus, host, and environmental factors [22,23]. Some of these factors include:
Virus characteristics, such as viral load, infectivity, stability, and genetic diversity of SARS-CoV-2 [22,23].
Host characteristics, such as age, sex, health status, immunity, behavior, and compliance with preventive measures of infected and susceptible individuals [7,22,23].
Environmental characteristics, such as temperature, humidity, ventilation, sunlight, air quality, and surface properties of residential spaces [22,23,24].
A framework can be drawn to theorize the COVID-19 transmission dynamics. Virus characteristics (encompassing viral load, infectivity, and stability), host characteristics (spanning age, sex, health status, immunity, behavior, and adherence to preventive protocols), and environmental characteristics (including temperature, humidity, and ventilation parameters) collectively shape the transmission process. Pharmaceutical Interventions (PIs, e.g., vaccines and antiviral agents) and Non-pharmaceutical Interventions (NPIs, such as physical distancing, mask usage, hygiene practices, isolation, quarantine, travel restrictions, and public health communication) exert regulatory effects on this process. Ultimately, the confluence of these elements dictates the transmission outcomes, which are operationalized via metrics like the number of infections and the effective reproduction number.
Based on a general literature review, potential factors that influence COVID-19 transmission in residential areas may involve the aspects of mobility and travel behavior, host characteristics of residents, spatial characteristics, and facilities and services. Management entities, e.g., local government and property management companies, are instrumental in implementing related preventive measures [25].
Mobility and travel behavior refer to the movement patterns and activities of residents within and outside their residential areas. These factors significantly modulate both individual exposure risk and population-level contact frequency with potential infection sources. Empirical evidence underscores this relationship: Liu et al.’s spatiotemporal analysis of COVID-19 clusters in Zhuhai, China, revealed that transmission dynamics were primarily driven by case importation followed by intra-household transmission, establishing population mobility as a critical determinant of epidemic propagation [26]. Similarly, Chen et al. demonstrated positive associations between COVID-19 incidence and specific mobility behaviors, including public transportation usage, visitation of crowded venues, and travel to high-risk regions [27]. Conversely, mobility-restricting interventions—such as lockdowns—effectively curtail transmission potential by constraining human movement and goods circulation [7].
Host characteristics of residents—encompassing demographic, socioeconomic, behavioral, and health-related factors—significantly influence individual susceptibility to SARS-CoV-2 infection, viral transmission potential, and COVID-19 disease severity. Advanced age, pre-existing comorbidities, lack of health insurance, low socioeconomic status, and minority status are established risk factors for increased COVID-19 morbidity and mortality [28,29]. This vulnerability is particularly pronounced among elderly populations, as identified by Wang et al.; empirical data from Wuhan demonstrated disproportionately high pandemic impact in areas with the highest proportions of elderly residents [30,31], attributable to age-related immune senescence and higher comorbidity prevalence. Furthermore, specific behavioral factors modulate risk: Yang et al. demonstrated that residential population gatherings elevate SARS-CoV-2 transmission risk by increasing both the frequency and duration of interpersonal contacts [14]. Finally, residents’ knowledge, attitudes, beliefs, and practices (KABP) regarding COVID-19 prevention directly influence adherence to non-pharmaceutical interventions and public health guidelines [32].
Spatial characteristics refer to the physical and environmental features of residential areas that can affect the transmission dynamics and outcomes of COVID-19. Key factors include population density, urbanization level, housing typology, green space availability, and air quality. Empirical evidence consistently links high population density and urbanization to elevated outbreak risk, primarily through intensified human contact and environmental contamination pathways [26]. This relationship is exemplified by Gurram et al.’s analysis of Singaporean sub-zones, where western regions exhibited significant positive correlations between population density and case incidence [29]. Conversely, built environment features that mitigate transmission include adequate ventilation and improved air quality, which reduce viral persistence in aerosols and on surfaces [30,31,32]. Model simulations further indicate that increased living space may reduce infection probability by approximately 10% [33], suggesting that quantitative metrics of the built environment—particularly gross land area and building gross floor area (GFA)—represent critical modulators of transmission patterns. Notably, while humidity correlations remain unestablished [13], the collective evidence underscores the importance of spatial determinants in pandemic control.
Facilities and services, meaning the availability and accessibility of infrastructure supporting COVID-19 prevention and control in residential areas, significantly modulate outbreak dynamics through three primary pathways: case management efficacy, population mobility patterns, and built-environment safety. These include medical resources (testing sites, clinics, hospitals, vaccination centers), community service facilities (pharmacies, supermarkets), and living service facilities (schools, workplaces, transport hubs, recreational spaces) [34,35]. Empirical studies confirm that service density directly influences epidemiological outcomes: clinic and general hospital availability correlates positively with case detection rates [32,34], while areas with concentrated commercial facilities (e.g., large supermarkets, convenience stores) exhibit elevated COVID-19 incidence due to amplified human mobility and contact frequency [36,37,38]. Crucially, facility design and operational protocols mediate this risk. Optimized spatial configuration, ventilation standards, and sanitation protocols demonstrably reduce transmission vectors by enforcing physical distancing, improving hygiene compliance, and limiting fomite persistence [7,39,40].
The notion of the four main aspects helps to select proper indicators for measuring the spread of the epidemic in the empirical research.

2. Methods—Exploring the Factors of COVID-19 Transmission in Residential Areas of Beijing

2.1. Study Area

Beijing, the capital of China, covers an area of 16,410 km², with a permanent population of over 21.85 million. Its migrant population is as high as 8.240 million, accounting for 37.7% of the total permanent population [41]. At the national level, Beijing is one of the megacities with the largest volume of population mobility [42]. Within the city, as the urban population has grown and urban space has expanded, home-work separation and unreasonable urban zoning have become increasingly prominent, leading to a substantial upsurge in residents’ travel demand [35]. In this study, the region within the sixth ring road, which contains a substantial portion of Beijing’s population and daily travel, was selected as the study area (see Figure 1).

2.2. Data and Methods

This research adopts a big-data-driven approach to explore the transmission of COVID-19 and a series of potential factors in 3898 residential communities in Beijing. The factors are categorized into four dimensions: mobility and travelling behavior, demographic characteristics, spatial characteristics, and medical resources and service facilities. Fifteen indicators are selected for these four dimensions (Table 1) and quantified using mobile phone signalling data and location-based services (LBS) data. The data are acquired from three major sources: the Beijing Municipal Health and Wellness Commission, the mobile phone communication service provider (China Mobile), and the location-based internet service provider (AMAP.com) (see Table 1). Based on these data, logistic regression and modelling are performed to analyze the quantitative relationship between the indicators and case occurrence, and to identify the key influencing factors for the spread of COVID-19 (see Figure 2).
The selection of indicators quantifying factors influencing COVID-19 transmission was rigorously guided by the conceptual framework derived from the literature review, specifically the four key aspects: mobility and travel behavior, host characteristics of residents, spatial characteristics, and facilities and services (see Table 1). This framework ensured the chosen indicators captured the core mechanisms theorized to drive transmission dynamics in residential settings. Crucially, to meet the imperative for precise, high-resolution, and high-frequency risk prediction of urban spaces, two major operational considerations governed the indicator selection process: First, the indicators must support high-frequency and high-resolution analysis to effectively track rapidly evolving transmission risks. Second, the indicators needed to be computationally feasible using readily available data sources, enabling the derivation of results that could directly inform practical, feasible, and applicable planning and governance interventions. To achieve these dual objectives, mobile phone signaling data served as the foundational dataset. This data source was strategically exploited and integrated with other essential, available data (e.g., administrative records, LBS data) to calculate the selected indicators (See Table 2 and Figure 3). This approach minimized the number of distinct data sources required, significantly reducing data acquisition complexity and enhancing the practicality and scalability of the analysis for real-time or near-real-time urban epidemic management.

2.3. Quantifying COVID-19 Transmission and Related Factors

The empirical study is designed to uncover the correlation between two principal variables within the statistical model: (a) the prevalence of COVID-19 within a residential community over a defined timeframe; and (b) the metrics of the diverse factors affecting the spread of COVID-19, as detailed in Table 1. To measure these variables, the paper leverages three primary data sources: mobile signal data from China Mobile, a leading telecommunications provider in China; points of interest (POIs) and geographic data from AMAP.COM, a prominent location-based services platform in China; and confirmed case reports from the Beijing Municipal Health and Wellness Commission. The incidence of COVID-19 is quantified by the count of confirmed cases within residential areas, as reported by the Commission from 23 August to 19 November 2022, a period encompassing a full epidemic cycle, as illustrated in Figure 2. This timeframe marks the initial surge in Beijing in which daily confirmed cases exceeded 100. Moreover, Beijing’s implementation of standardized, tiered epidemic prevention and control strategies since June 2022 makes this period more representative for the study of epidemic propagation (see Figure 4). The metrics of the influencing factors are derived by combining the collected datasets.

2.4. Selection of Research Sample

This paper selected 3898 residential communities in Beijing as the analysis sample based on data availability. These communities basically cover the main built-up areas of Beijing. According to the latest data from the Beijing Municipal Bureau of Statistics, there were 10,465 residential communities in Beijing by the end of 2020. Therefore, the analysis sample accounts for approximately 37% of the total number of residential communities in Beijing.

2.5. Data Preprocessing Summary

Prior to analysis, comprehensive preprocessing was applied to all datasets: (1) mobile signalling records (251 million entries) were temporally aligned to daily intervals, then mapped to residential communities via GIS overlay; (2) COVID-19 case reports underwent validation against municipal bulletins with date corrections; (3) key indicators were derived through spatiotemporal calculus and road-network analysis; (4) anomalies were manually mitigated; (5) all variables underwent Box-Cox transformation and [0, 1] scaling to ensure comparability. These steps ensured temporal synchronization, spatial consistency, and dimensional homogeneity across the 3898-community dataset.
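Step (5) can be illustrated with a minimal sketch. It assumes a strictly positive indicator and selects the Box-Cox parameter by a coarse grid search over the profile log-likelihood; the paper does not state how the parameter was chosen, and the function names and sample values below are hypothetical.

```python
import math

def boxcox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def boxcox_loglik(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    y = boxcox(values, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # maximum-likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)

def best_lambda(values, grid=None):
    """Coarse grid search over [-2, 2]; an assumption, not the authors' procedure."""
    grid = grid or [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: boxcox_loglik(values, lam))

def min_max_scale(values):
    """Rescale values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical right-skewed indicator (e.g., residents per community)
raw = [120.0, 340.0, 560.0, 800.0, 1500.0, 4200.0, 9800.0]
lam = best_lambda(raw)
scaled = min_max_scale(boxcox(raw, lam))
```

Because the Box-Cox transform is monotone, the ordering of communities on each indicator is preserved; only the spacing between values changes before scaling.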

3. Results of Empirical Analysis

3.1. Logistic Regression Analysis Results

The logistic regression analysis identifies three significant predictors of COVID-19 occurrence in Beijing’s residential communities: (a) number of people aged 65 and above, (b) housing density, and (c) population density. While both elderly population size and housing density exhibit positive associations with case probability, population density demonstrates a counterintuitive negative relationship. These empirically validated factors constitute essential inputs for risk prediction models to elucidate epidemic transmission patterns within urban communities (see Table 3 and Table 4).
(a) Number of People Aged 65 and Above (OR = 62.8, p < 0.001)
The number of people aged 65 and above in the community has a significant positive effect on the probability of case occurrence. The logistic regression model indicates that for every 20% increase in the standardized value of this variable, the average probability of case occurrence in the community increases by 10 percentage points.
(b) Density of Housing (OR = 9.96, p = 0.026)
Housing density also demonstrates a significant positive effect on the probability of case occurrence in residential areas. According to the logistic regression model, for every 20% increase in the standardized value of residential building density, the average probability of case occurrence increases by 5 percentage points.
(c) Density of Population (β = −3.98, p < 0.001)
Population density exhibits a highly significant negative effect on the probability of case occurrence. The logistic regression model shows that a one-unit increase in this variable is associated with a substantial reduction in outcome probability (OR approaching zero).
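As a reading aid, the reported odds ratios and coefficient can be put on a common scale using the standard identity OR = e^β for a logistic model. The symbols p_i, β_0, β_j, and x_ij are notation introduced here rather than taken from the paper. Since the predictors were scaled to [0, 1], a one-unit increase presumably corresponds to the full observed range of the transformed indicator.

```latex
% Logistic model for community i with 15 scaled indicators x_{ij}
\ln\frac{p_i}{1-p_i} = \beta_0 + \sum_{j=1}^{15} \beta_j x_{ij},
\qquad \mathrm{OR}_j = e^{\beta_j}

% Converting the reported values
\beta_{\text{aged 65+}} = \ln 62.8 \approx 4.14, \qquad
\beta_{\text{housing density}} = \ln 9.96 \approx 2.30, \qquad
\mathrm{OR}_{\text{population density}} = e^{-3.98} \approx 0.019
```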

3.2. Collinearity Analysis

Collinearity was assessed using Variance Inflation Factors (VIFs). A VIF measures how much the variance of a coefficient estimate is inflated by multicollinearity among the predictors (the standard error is inflated by the square root of the VIF). As presented in Table 5, all independent variables exhibited VIF values below the threshold of 10, confirming the absence of significant multicollinearity.
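A minimal sketch of how such VIFs can be computed, using the identity that the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The helper names and the two toy columns are hypothetical, and nothing here is taken from the authors' software.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long, non-constant columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Two hypothetical indicators with correlation 0.8
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
values = vifs([x1, x2])  # each equals 1 / (1 - 0.8**2)
```

With only two predictors the result reduces to 1 / (1 − r²), which makes the toy case easy to check by hand.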

3.3. Confusion Matrix Validation

Model performance was further evaluated using a confusion matrix. The results demonstrate high predictive accuracy, with an overall correct classification rate of 95.97%. Sensitivity (the proportion of actual positives correctly identified) reached 100%.
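The quantities named above follow directly from the four cells of a confusion matrix. The sketch below uses hypothetical labels, not the study's data, to show how accuracy, sensitivity, and specificity are derived.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def summarize(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical example: 2 communities with cases, 8 without
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = summarize(y_true, y_pred)
```

In this toy case every community with cases is detected (sensitivity 1.0) while one false alarm lowers accuracy to 0.9, which shows that the two figures measure different things and are best read together with specificity.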

3.4. k-Fold Cross-Validation

Ten-fold cross-validation was performed to assess model generalizability. As summarized in Table 6 and Table 7, the model achieved low predictive error rates, indicating robust predictive capability.
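The procedure can be sketched generically as follows. The fold assignment, the `fit` and `predict` callables, and the one-feature threshold classifier used for the demonstration are all illustrative assumptions; the study itself fitted a logistic regression on 15 indicators.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(X, y, fit, predict, k=10, seed=0):
    """Misclassification rate on each held-out fold."""
    errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        wrong = sum(1 for i in held_out if predict(model, X[i]) != y[i])
        errors.append(wrong / len(held_out))
    return errors

# Stand-in model: threshold halfway between the two class means
def fit_threshold(X_train, y_train):
    neg = [x for x, t in zip(X_train, y_train) if t == 0]
    pos = [x for x, t in zip(X_train, y_train) if t == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2.0

def predict_threshold(threshold, x):
    return 1 if x > threshold else 0

# Hypothetical scaled indicator and case labels for 20 communities
X = [i / 19.0 for i in range(20)]
y = [0] * 10 + [1] * 10
fold_errors = cross_val_error(X, y, fit_threshold, predict_threshold, k=10)
mean_error = sum(fold_errors) / len(fold_errors)
```

Each community is held out exactly once, so the mean of the fold errors estimates how the model would perform on communities it has not seen.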

3.5. Prediction Model for COVID-19 Transmission

The prediction of the case occurrence, or the development of a risk prediction model, can help governments or communities to take precautions based on scientific evidence. A logistic regression model with binary outcomes was used to estimate the probability of future epidemics in the community, using 15 selected indicators as independent variables and the occurrence of cases in each community as a dependent variable.
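To make the model form concrete, the sketch below fits a binary logistic regression by plain gradient descent and returns a predicted probability per community. It is illustrative only: it uses a single hypothetical indicator instead of the 15 in the study, the fitting routine and its learning rate are assumptions, and the authors' estimation software is not specified in the text.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """Predicted probability of case occurrence; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Minimize the mean log-loss by batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Hypothetical data: one scaled indicator, binary case occurrence
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1]
w = fit_logistic(X, y)
probs = [predict_proba(w, x) for x in X]

# Flag communities above the 10% cut-off this section uses for high risk
high_risk = [i for i, p in enumerate(probs) if p > 0.10]
```

The fitted slope `w[1]` is the coefficient β for the indicator, and `math.exp(w[1])` is its odds ratio, which is how coefficients and odds ratios of the kind reported in Section 3.1 relate to the fitted model.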
The performance of the logistic regression model was assessed with an ROC curve, which plots sensitivity (the true positive rate) against 1 − specificity (the false positive rate). The AUC of the ROC curve measures how well the model discriminates between positive and negative outcomes. The AUC of the prediction model in this research was approximately 0.7, indicating a good predictive effect: for a randomly chosen pair of one community with cases and one without, the model assigns the higher predicted probability to the community with cases about 70% of the time (see Figure 5 and Figure 6).
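A short sketch of how an ROC curve and its AUC can be computed from predicted probabilities. The pairwise (rank-based) formula is one standard way to obtain the AUC; the labels and scores are hypothetical and the function names are not from the paper.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def roc_points(y_true, scores):
    """(false positive rate, true positive rate) pairs, one per distinct threshold."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical labels and predicted probabilities
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
curve = roc_points(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value near 0.7 sits between the two.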
The results show that the average probability of case occurrence in sample communities is 4%, while approximately 3% of the communities have a probability of over 10%. The highest risk of cases is found in two communities near the North Fourth Ring Road, with probabilities of 74% and 78%, respectively (See Table 8).
The spatial distribution of the residential communities with a high probability of epidemics (greater than 10%) is mainly between the second and fifth ring roads. These communities have different characteristics depending on their location. The high-risk communities on both sides of the central axes and in the southeast of Beijing generally have a large number of people aged 65 and above and a higher housing density, while most of the communities along the extension of West Chang’an Street are at higher risk due to the larger number of people aged 65 and above. In addition, there are a few high-risk communities within the second ring road and outside the fifth ring road (Figure 7). These communities should implement interventions to enhance the epidemic prevention and control.

3.6. Developing NPI Based on Modelling Results

The quantitative methods and findings of this study can support the development of NPIs of urban planning and governance and improve epidemic prevention in communities.
First, science-based and targeted actions should be adopted in the high-risk areas and for the key groups identified by the “risk map” of the epidemic and the distribution map of key groups (see Figure 8 and Figure 9). These maps are drawn based on an accurate analysis of various factors that influence epidemic transmission, such as age, housing density, outdoor activity coverage, and proximity to crowded places. By allocating more resources to high-risk areas and key groups and restricting mobility there, the probability of outbreaks can be effectively reduced.
Furthermore, the establishment of comprehensive technical networks for spatial governance is essential. These networks engage a broader spectrum of stakeholders—including public agencies, private tech firms, and planning institutions—to investigate diverse non-pharmaceutical interventions (NPIs), as exemplified by the Shenzhen-based Urban Epidemic Sites Map (integrating 1938 epidemic locations across 36 cities into the CDC command center for resource allocation) and the Beijing Community Epidemic Resilience Map (establishing a multi-dimensional quantitative assessment system for 6727 communities with public interactive visualization, Figure 10) [42]. Equipped with such real-time, high-resolution monitoring capabilities, these networks leverage dynamic data flows to pinpoint critical transmission factors, enhance epidemic risk assessments, optimize spatial planning for medical facilities and community services, and formulate agile intervention strategies responsive to evolving public health crises (See Table 9).

4. Discussion—Towards Better Epidemic Control Through Urban Planning and Governance

4.1. Major Findings and Governance Implications

This research aims to develop a big data-driven approach using mobile phone signaling data and LBS data to quantify COVID-19 transmission dynamics and related factors, thereby providing actionable insights for epidemic planning and governance. Through fine-resolution analysis of 3898 residential communities in Beijing, we identify key factors affecting COVID-19 spread in residential areas.
(a) Critical role of elderly populations
While people aged 65 and above face higher probabilities of severe outcomes [26], our findings reveal their critical role in accelerating community transmission. This mechanism may operate through three interconnected pathways. Firstly, heightened biological vulnerability (e.g., weakened immune response and comorbidities) increases hospitalization risk and intra-household amplification [43]. Secondly, senior-centric aggregation activities act as diffusion accelerators, with empirical evidence from Gansu Province showing 68% of localized outbreaks originating from venues like community squares and chess parlors, where collective exercises and social habits facilitated rapid viral diffusion among attendees [43]. Thirdly, cognitive-digital prevention barriers create a risk multiplier: a survey of 203 community-dwelling elders revealed a “knowledge-behavior disconnect”. While 99% acknowledged COVID-19 risks and 93.6% recognized symptoms, only 23.2% mastered correct handwashing techniques and 10.3% maintained adequate hydration (critical for mucosal immunity). This is compounded by heavy reliance on TV/radio (84.2%) for information, low digital tool adoption (e.g., 30.1% telemedicine usage), and limited intergenerational support, with 48.6% of seniors reporting impatient guidance from younger relatives; together these factors hinder behavioral adaptation [44].
(b) Housing density as a transmission amplifier
The findings indicate that building density has a positive effect on COVID-19 case numbers within residential areas, consistent with most of the existing literature. Multiple studies confirm a positive correlation between floor area ratio or building density and COVID-19 transmission. For instance, empirical studies in Chicago, USA, and Hong Kong, China, both show that higher floor area ratios and building density are often associated with increased COVID-19 transmission risk [45,46,47]. Larger floor area ratios and higher building density imply denser populations and heightened exposure risk. Crowds gather more frequently in places with higher building density, where shared spaces such as elevators and corridors become significant environments for epidemic spread. Furthermore, high density impedes natural ventilation, increasing the probability of virus transmission via aerosols.
(c) Policy-mediated inverse density effect
Notably, this study identified a significant negative correlation between population density and case numbers (β = −3.98, p < 0.001), contradicting the prevailing epidemiological consensus that associates higher density with increased transmission risk [17,27,33,45,47]. This paradox is likely attributable to Beijing’s context-specific governance during the 2022 resurgence phase. Drawing on pandemic control experience accumulated since 2020, municipal authorities systematically prioritized high-density communities for enhanced interventions [48]. Early district-level protocols illustrate this approach. In Laoshan Subdistrict (2020), for example, three tiers of measures were implemented: (1) targeted home confinement guidance for vulnerable groups, delivered by community volunteers; (2) enhanced surveillance of high-risk gathering spots (e.g., plazas and retail clusters) with dedicated patrols; and (3) strategic closure of high-exposure venues such as outdoor basketball courts, where respiratory exertion amplifies transmission risk [49]. These intensified measures, including mobility controls, precision health communication, and vaccination drives, collectively reversed the conventional density-risk relationship in regulated urban environments.
Collectively, these findings indicate that during the routine epidemic prevention phase, residential communities with either elevated elderly populations or higher building density should be prioritized in intervention planning. Crucially, while governing bodies typically find population size data readily accessible, effective epidemic control requires systematically incorporating spatial characteristics of residential areas into governance frameworks—a critical but often overlooked dimension. This integrated approach constitutes an essential pathway toward enhancing epidemic resilience through evidence-based urban planning and governance.

4.2. Research Highlights

First, the research reveals several factors that may significantly influence the spread of COVID-19 in residential communities, which can support the development of specific NPIs in the field of urban planning and governance. By analyzing the factors in the aspects of mobility and travel behavior, host characteristics of residents, spatial characteristics, and facilities and services, two specific factors were found that may significantly intensify the spread of COVID-19 in residential communities, i.e., the number of people aged 65 and above within the residential community and the density of housing. These evidence-based findings directly inform the development of targeted non-pharmaceutical interventions (NPIs) for urban epidemic governance, particularly during the regular epidemic prevention phase.
Second, this study establishes an efficient big data-driven risk prediction framework that addresses the critical need for precise, high-resolution, and high-frequency spatial risk assessment. By leveraging integrated mobile phone signaling and LBS POI data—characterized by extensive coverage, superior spatiotemporal resolution, and real-time currency—we achieve accurate quantification of COVID-19 transmission dynamics across residential communities. Unlike conventional multi-source approaches, our streamlined methodology exploits the inherent multidimensionality of signaling data, significantly improving data acquisition efficiency while enhancing model generalizability. This computationally optimized pipeline enables cost-effective analysis at unprecedented scales (3898 Beijing communities), ensuring robust statistical reliability. The resulting predictive model delivers scientifically validated decision support for urban epidemic governance, particularly in optimizing non-pharmaceutical interventions (NPIs) through spatially targeted deployment. This methodological advance provides novel analytical infrastructure for infectious disease control in built environments.

4.3. Limitations

This study acknowledges inherent constraints in data quality and generalizability. First, mobile phone signaling data (250 × 250 m resolution) exhibits positional inaccuracies inversely proportional to cellular base station density, though residual errors remain acceptable within Beijing’s urban context. Concurrently, LBS POI datasets systematically underrepresent venues frequented by digitally excluded populations (e.g., non-smartphone users), introducing demographic sampling bias. Second, findings derived from Beijing’s unique socio-spatial and governance environment may not generalize to dissimilar regions where epidemic response mechanisms, mobility patterns, and urban morphology differentially modulate transmission factors. Future work should integrate multi-source validation (e.g., census/survey augmentation), develop exclusion-robust indicators, and advance spatial interpolation techniques to enhance methodological robustness.

5. Conclusions

This study establishes a replicable big-data framework using mobile signaling and LBS data to quantify COVID-19 transmission drivers across 3898 Beijing communities. We identify three critical determinants: (1) elderly populations accelerate spread through a biological-behavioral nexus, in which aggregation venues act as diffusion accelerators (evidence from Gansu Province attributes 68% of localized outbreaks to such venues) and gaps in prevention literacy compound vulnerability; (2) housing density amplifies risk via aerosol retention and congregation effects, in line with global evidence; and (3) a policy-mediated inverse density effect emerged, whereby targeted interventions in high-density areas reversed epidemiological norms during Beijing’s 2022 resurgence. Methodologically, our model achieved 95.97% prediction accuracy, enabling precise risk thresholds. These findings call for spatial prioritization in governance: communities with elderly concentrations require activity scheduling interventions, while dense housing necessitates ventilation retrofits. Limitations in the digital representation of marginalized groups highlight the need for multi-source validation, a critical direction for scaling this evidence-based approach to enhance epidemic resilience in global urban settings.

Author Contributions

Conceptualization, Y.L. (Yang Li); Resources, L.D.; Data curation, H.Z.; Writing—original draft, Y.L. (Yang Li); Writing—review & editing, Y.L. (Yinong Li) and L.D.; Visualization, H.C.; Supervision, W.L.; Project administration, X.S. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Beijing Natural Science Foundation (No. 8232008).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

Authors Yang Li, Xiaoming Sun, Huiyan Chen, Hong Zhang and Wenqi Lin were employed by the company Beijing Tsinghua Tongheng Urban Planning & Design Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Qiu, Y.; Chen, X.; Shi, W. Impacts of social and economic factors on the transmission of coronavirus disease 2019 (COVID-19) in China. J. Popul. Econ. 2020, 33, 1127–1172.
  2. Li, D.; Shao, Z.; Yu, W.; Zhu, X.; Zhou, S. Public Epidemic Prevention and Control Services Based on Big Data of Spatiotemporal Location Make Cities More Smart. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 475–487, 556.
  3. Zhang, L.-S.; Chen, Y.; Fan, R.; Wen, Y.-H.; Yang, H.; Li, L.; Liu, Y.-H.; Zheng, H.-Z.; Jiang, J.-J.; Qian, H.; et al. Risk Factors Associated with Indoor Transmission During Home Quarantine of COVID-19 Patients. Front. Public Health 2023, 11, 1170085.
  4. Fang, Y.H.; Gu, K.K. Exploration on geospatial risk assessment in China based on multiple data: A case study of COVID-19 data from January 1 to April 11, 2020. J. Geo-Inf. Sci. 2021, 23, 284–296.
  5. Ding, X.T.; Yu, Z.Y.; Song, H.H.; Xie, Y.; Lv, K. Research on the distribution of natural focus diseases based on information entropy. J. Geo-Inf. Sci. 2019, 21, 1877–1887.
  6. Cao, C.; Li, G.; Zheng, S.; Cheng, J.; Lei, G.; Tian, K. Research on the environmental impact factors of Hand-Foot-Mouth disease in Shenzhen, China using RS and GIS technologies. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012; pp. 7240–7243.
  7. Liu, Y.; Chai, Y.-H.; Wu, Y.-F.; Zhang, Y.-W.; Wang, L.; Yang, L.; Shi, Y.-H.; Wang, L.-L.; Zhang, L.-S.; Chen, Y.; et al. Risk factors associated with indoor transmission during home quarantine of COVID-19 patients. Front. Public Health 2023, 11, 1170085.
  8. Xie, S.; Wang, Y.; Liu, Y.; Zhang, Y. Spatial big data analysis and visualization for epidemic prevention and control: A case study of COVID-19 in China. J. Clean. Prod. 2022, 279, 123537.
  9. Xie, Y.; Batty, M.; Zhao, K.; Chai, Y.; Heppenstall, A.; Longley, P. Big data for urban studies: Opportunities and challenges in the era of COVID-19 and beyond. Comput. Environ. Urban Syst. 2022, 88, 101689.
  10. Ye, Y.; Fan, W.; Wang, H.F. Clustering of 2019 Novel Coronavirus Disease Epidemic in Henan Province. Chin. J. Public Health 2020, 36, 465–468.
  11. Fang, H.; Wang, L.; Yang, Y. Human mobility restrictions and the spread of the Novel Coronavirus (2019-nCoV) in China. J. Public Econ. 2020, 191, 104272.
  12. Zhou, Q.F.; Long, R.N.; Huang, Z. Discussion on the Needs of Epidemic Transmission Models for Fine Urban Spatial Governance. Beijing Plan. Constr. 2020, 4, 28–30.
  13. Yang, J.Y.; Shi, B.X.; Shi, Y.; Li, Y. Construction of a Multi-Scale Spatial Epidemic Prevention System in High-Density Cities. City Plan. Rev. 2020, 44, 17–24.
  14. Bajardi, P.; Poletto, C.; Ramasco, J.J.; Tizzoni, M.; Colizza, V.; Vespignani, A. Human mobility networks, travel restrictions, and the global spread of 2009 H1N1 pandemic. PLoS ONE 2011, 6, e16591.
  15. Bayrsaikhan, T.; Lee, J.; Kim, M.H.; Gim, T.-H.T. A seemingly unrelated regression model of the impact of COVID-19 risk perception on urban leisure place choices. Int. Rev. Spat. Plan. Sustain. Dev. 2021, 9, 30–40.
  16. He, J.; Zhang, Y. Urban epidemic governance: An event system analysis of the outbreak and control of COVID-19 in Wuhan, China. Urban Stud. 2023, 60, 1707–1729.
  17. Sharifi, A.; Khavarian-Garmsir, A.R.; Kamali, F.; Yamagata, Y. The COVID-19 pandemic: Impacts on cities and major lessons for urban planning, design, and management. Sci. Total Environ. 2020, 749, 142391.
  18. Lee, V.J.; Ho, M.; Chen, W.K.; Aguilera, X.; Heymann, D.; Wilder-Smith, A. Epidemic preparedness in urban settings: New challenges and opportunities. Lancet Infect. Dis. 2020, 20, 527–529.
  19. Xie, S.X.; Liu, C.H.; Zhou, X.M. The Construction of Urban Resilience and the Evolution of Planning Theory from the Perspective of Epidemic Prevention. Huazhong Archit. 2020, 10, 137–141.
  20. Song, X.; Cao, M.; Zhai, K.; Gao, X.; Wu, M.; Yang, T. The effects of spatial planning, well-being, and behavioural changes during and after the COVID-19 pandemic. Front. Sustain. Cities 2021, 3, 686706.
  21. Honey-Roses, J.; Anguelovski, I.; Bohigas, J.; Chireh, V.; Daher, C.; Konijnendijk, C.; Litt, J.; Mawani, V.; McCall, M.K.; Orellana, A.; et al. The impact of COVID-19 on public space: A review of the emerging questions. Cities Health 2020, 1–16.
  22. Weaver, A.K.; Head, J.R.; Gould, C.F.; Carlton, E.J.; Remais, J.V. Environmental Factors Influencing COVID-19 Incidence and Severity. Annu. Rev. Public Health 2022, 43, 271–291.
  23. Boehm, A.B.; Wigginton, K.R. Environmental Engineers and Scientists Have Important Roles to Play in Stemming Outbreaks and Pandemics Caused by Enveloped Viruses. Environ. Sci. Technol. 2020, 54, 3736–3739.
  24. Canada.ca. COVID-19: Guidance on Indoor Ventilation During the Pandemic. 2021. Available online: https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection/guidance-documents/guide-indoor-ventilation-covid-19-pandemic.html (accessed on 1 January 2022).
  25. China Property Management Association, China Economic Information Service. Impact of the COVID-19 Pandemic on the Property Management Industry: A Survey Report. 2020. Available online: https://kns.cnki.net/dm8/manage/export.html?filename=Pe2nFq1PBOM11SpCErZ-LwM1UHjV0uMR_icN4IXwgidyz3lGWSThkCcypFHQXJ5yec0HoBMAn_O3Wp_5JSKve3-s4n4jlL7XQKLOuexh_op3KAUBlVIsgcb0E8u_YkCK&displaymode=NEW&uniplatform=NZKPT (accessed on 1 January 2023).
  26. Liu, Z.; Ye, Y.; Zhang, H.; Guo, H.; Yang, J.; Wang, C. Spatio-Temporal Characteristics and Transmission Path of COVID-19 Cluster Cases in Zhuhai. Trop. Geogr. 2020, 40, 422–431.
  27. Klompmaker, J.O.; Hart, J.E.; Holland, I.; Sabath, M.B.; Wu, X.; Laden, F.; Dominici, F.; James, P. County-level exposures to greenness and associations with COVID-19 incidence and mortality in the United States. Environ. Res. 2021, 199, 111331.
  28. CNBC. Factors Affecting COVID-19 Spread among Lower-Income Communities. 2021. Available online: https://www.cnbc.com/2021/01/12/factors-affecting-covid-19-spread-among-lower-income-communities.html (accessed on 12 January 2021).
  29. Gurram, M.K.; Wang, M.X.; Wang, Y.C.; Pang, J. Impact of urbanisation and environmental factors on spatial distribution of COVID-19 cases during the early phase of epidemic in Singapore. Sci. Rep. 2022, 12, 9758.
  30. CDC. Factors That Affect Your Risk of Getting Very Sick from COVID-19. 2023. Available online: https://archive.cdc.gov/#/details?q=https://www.cdc.gov/coronavirus/2019-ncov/your-health/risks-getting-very-sick.html&start=0&rows=10&url=https://www.cdc.gov/coronavirus/2019-ncov/your-health/risks-getting-very-sick.html (accessed on 11 May 2023).
  31. Shi, X. Urban morphology, urban ventilation and aerosol transmission of novel coronavirus: A written discussion on coping with the emergency of novel coronavirus pneumonia in 2020. Urban Plan. 2020, 44, 10.
  32. Li, X.; Zhou, L.; Jia, T.; Peng, R.; Fu, X.; Zou, Y. Associating COVID-19 Severity with Urban Factors: A Case Study of Wuhan. Int. J. Environ. Res. Public Health 2020, 17, 6712.
  33. Kan, Z.; Kwan, M.P.; Wong, M.S.; Huang, J.; Liu, D. Identifying the space-time patterns of COVID-19 risk and their associations with different built environment features in Hong Kong. Sci. Total Environ. 2021, 772, 145379.
  34. Shen, J.; Hu, F.; Huang, J.X.; Wu, S. Review on Risk Factors Influencing the Spread of COVID-19. Huazhong Archit. 2022, 40, 33–39.
  35. Nguyen, Q.C.; Huang, Y.; Kumar, A.; Duan, H.; Keralis, J.M.; Dwivedi, P.; Meng, H.-W.; Brunisholz, K.D.; Jay, J.; Javanmardi, M.; et al. Using 164 Million Google Street View Images to Derive Built Environment Predictors of COVID-19 Cases. Int. J. Environ. Res. Public Health 2020, 17, 6359.
  36. You, H.; Wu, X.; Guo, X. Distribution of COVID-19 Morbidity Rate in Association with Social and Economic Factors in Wuhan, China: Implications for Urban Development. Int. J. Environ. Res. Public Health 2020, 17, 3417.
  37. Zheng, T.M.; Liu, H.L. Exploration of the Built-Environmental Elements that Influence the Spread of COVID-19 Pandemic on Community Scale: A Case Study of Wuhan, China. Mod. Urban Res. 2020, 10, 20–29.
  38. News Medical. Factors Affecting SARS-CoV-2 Transmission and Outbreak Control in Densely Populated Areas. 2021. Available online: https://www.news-medical.net/news/20210803/Factors-affecting-SARS-CoV-2-transmission-and-outbreak-control-in-densely-populated-areas.aspx (accessed on 3 August 2021).
  39. CDC. How Coronavirus Spreads. 2021. Available online: https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/how-covid-spreads.html (accessed on 13 June 2024).
  40. CDC. Science Brief: SARS-CoV-2 and Surface (Fomite) Transmission for Indoor Community Environments. 2021. Available online: https://archive.cdc.gov/#/details?url=https://www.cdc.gov/coronavirus/2019-ncov/more/science-and-research/surface-transmission.html (accessed on 31 May 2024).
  41. Beijing Municipal Bureau of Statistics & National Bureau of Statistics Survey Office in Beijing. Statistical Communiqué on the National Economy and Social Development of Beijing in 2023; Beijing Municipal Bureau of Statistics & National Bureau of Statistics Survey Office in Beijing: Beijing, China, 2024.
  42. Li, Y.; Chu, Q.; Zhang, M.L.; Xie, X.Z. Urban Public Service Delivery. In Proceedings of the 2020 China Urban Planning Annual Conference (11 Urban-Rural Governance and Policy Research): Spatial Governance for High-Quality Development, Chengdu, China, 25 September 2021; Urban Planning Society of China: Beijing, China, 2021; pp. 180–193.
  43. Sporadic Cases and Interconnectedness of Community Transmission During the Pandemic: An Interview with Du Zhaohui, Expert of the Gansu Working Group Under the Joint Prevention and Control Mechanism of the State Council. Updated 15 October 2021. Available online: https://www.gov.cn/xinwen/2021-10/30/content_5647744.htm (accessed on 2 May 2025).
  44. Investigation on the Status Quo of Cognition, Prevention and Control of Novel Coronavirus Infection Pneumonia Among the Elderly at Home in Community and Analysis of Strategies. Updated 8 August 2020. Available online: https://www.qikanchina.com/thesis/view/4731218 (accessed on 30 May 2025).
  45. V, A.A.R.; R, V.; Haghighat, F. The contribution of dry indoor built environment on the spread of Coronavirus: Data from various Indian states. Sustain. Cities Soc. 2020, 62, 102371.
  46. Bryan, M.S.; Sun, J.; Jagai, J.; Horton, D.E.; Montgomery, A.; Sargis, R.; Argos, M. Coronavirus disease 2019 (COVID-19) mortality and neighborhood characteristics in Chicago. Ann. Epidemiol. 2021, 56, 47–54.
  47. Li, W.; Zhao, S.C.; Ji, X.C.; Ma, J.W. The impact of traffic exposure and land use patterns on the risk of COVID-19 transmission in communities. China J. Highw. Transp. 2020, 33, 43–54.
  48. Notice from the Beijing Municipal Commission of Housing and Urban Rural Development on Solidly Doing a Good Job in Epidemic Prevention and Control in Residential Communities. Updated 15 October 2021. Available online: https://www.beijing.gov.cn/ywdt/gzdt/202108/t20210804_2456608.html (accessed on 31 May 2025).
  49. Strictly Control Personnel Density, Shijingshan Laoshan Street Launches a Combination of Epidemic Prevention and Education Punches. Updated 18 February 2020. Available online: https://baijiahao.baidu.com/s?id=1658863895420690838&wfr=spider&for=pc (accessed on 18 February 2020).
Figure 1. Research area with its location in Beijing (source: author).
Figure 2. Diagram of the mechanism of COVID-19 transmission (source: author).
Figure 3. Diagram for the data-driven approach applied in this research (source: author).
Figure 4. The condition of COVID-19 spread in Beijing from 23 August to 19 November 2022 (source: author).
Figure 5. Evaluation of the probability of case occurrence in the community affected by key factors (source: author).
Figure 6. ROC curve of logistic regression model (source: author).
Figure 7. Probability of case occurrence in sample communities (source: author).
Figure 8. “Risk map”: distribution of people aged 65 and above in residential communities (source: author).
Figure 9. “Risk map”: building density of residential communities in Beijing (source: author).
Figure 10. Online map for COVID-19 prevention of Beijing communities (source: author).
Table 1. Explanation of 15 indicators for the factors influencing the spread of COVID-19 (Source: author).
| Category | No. | Indicator | Explanation |
| --- | --- | --- | --- |
| Mobility and travelling behavior | 1 | Number of floating population | The number of people who live in the residential community for less than 6 months per year, such as migrant workers |
| | 2 | Coverage of people’s travelling activities | The total number of communities reached by the trips of persons from a given residential community |
| | 3 | Distance to densely populated places | The spatial distance between a residential community and the nearest densely populated place, such as an airport, railway station, or supermarket |
| | 4 | Number of visitors | The total number of visitors to a residential community |
| Host characteristics of residents | 5 | Number of residents | The total population within a residential community |
| | 6 | Density of population | The average number of people per hectare within a residential community |
| | 7 | Number of people aged 65 and above | The total number of persons aged 65 and above within a residential community |
| | 8 | Proportion of people aged 65 and above | The ratio of persons aged 65 and above to the total population within a residential community |
| Spatial characteristics | 9 | Gross land area | The total land area of a residential community |
| | 10 | Gross floor area of buildings | The total gross floor area (GFA) of buildings within a residential community |
| | 11 | Density of housing | The average gross floor area of buildings per hectare within a residential community |
| Facilities and services | 12 | Accessibility to medical resources | The distance between a residential community and the nearest medical facility, such as a hospital or a clinic |
| | 13 | Accessibility to isolation facilities | The distance between a residential community and the nearest isolation facility for COVID-19, e.g., hotels |
| | 14 | Accessibility to community service facilities | The distance between a residential community and the nearest community service center |
| | 15 | Availability of living service facilities | The number and variety of living service facilities, e.g., markets and open spaces |
Table 2. Information of data used (source: author).
| Datasets | Details | Contents | Size of records | Period | Data source |
| --- | --- | --- | --- | --- | --- |
| Municipal data | Medical resources: hospital | Name; level; address | 2186 | 2022 | Beijing Municipal Health and Wellness Commission |
| | Confirmed cases data: count of confirmed cases | Residential communities; count of confirmed cases | 83 | 2022 | Beijing Municipal Health and Wellness Commission |
| | Community service facilities | Name; address; location | 203 | 2022 | data.beijing.gov.cn |
| Geodata | Space units: districts | Name; area | 16 | 2022 | Location-based internet service provider (AMAP.com) |
| | Space units: sub districts | Name | 333 | | |
| | Space units: residential community | Name; density of housing | 6652 | | |
| | Points of interest (POI): POIs | Category; type; address; location | ca. 0.3 million | 2022 | Location-based internet service provider (AMAP.com) |
| | Road network | Name; level | ca. 0.51 million | | |
| Mobile signaling data | Commuting data | Population in residential areas; population in workplace; ID of grids | ca. 251 million | September to November 2022 | Mobile phone communication service provider (China Mobile) |
| | Population | Gender; age; number of permanent inhabitants; number of inhabitants; ID of grids | | | |
| | Grids data | ID of grids | | | |
Table 3. Coefficients of various indicators and case occurrence (source: author).
| Coefficients | Estimate | Odds Ratio | Std. Error | z Value | Pr(>\|z\|) |
| --- | --- | --- | --- | --- | --- |
| Intercept | −2.93 | 0.053 | 4.66 × 10^−1 | −6.295 | 3.08 × 10^−10 *** |
| Number of Floating Population | −4.34 | 0.013 | 4.71 | −0.922 | 0.356757 |
| Coverage of People’s Travelling Activities | 8.39 × 10^−1 | 2.314 | 9.32 × 10^−1 | 0.9 | 0.368045 |
| Distance to Densely Populated Places | −1.57 | 0.209 | 1.09 | −1.441 | 0.149551 |
| Density of Population | −3.98 | 0.000 | 1.05 × 10^4 | −3.794 | 0.000148 *** |
| Number of People aged 65 and above | 4.14 | 62.801 | 1.04 | 3.99 | 6.62 × 10^−5 *** |
| Proportion of People aged 65 and above | −2.70 × 10^−1 | 0.763 | 1.40 | −0.193 | 0.846719 |
| Gross Land Area of Community | 2.99 | 19.795 | 2.62 | 1.141 | 0.25376 |
| Density of Housing | 2.30 | 9.956 | 1.03 | 2.229 | 0.025816 * |

Sig. codes: *** p < 0.001; * 0.01 < p < 0.05.
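The columns of Table 3 are linked by standard logistic-regression identities: the odds ratio is the exponential of the estimate, and the Wald z value is the estimate divided by its standard error. The following is a minimal sketch of those identities, assuming the rounded values printed in Table 3; the function names are illustrative and the normal-approximation p-value is not taken from the authors' code. Small differences from the printed odds ratios are expected because the estimates are rounded to two or three digits.

```python
import math

# (estimate, standard error) pairs transcribed from Table 3 (rounded values)
coefficients = {
    "Number of People aged 65 and above": (4.14, 1.04),
    "Density of Housing": (2.30, 1.03),
    "Distance to Densely Populated Places": (-1.57, 1.09),
}

def odds_ratio(estimate):
    """Odds ratio implied by a logistic-regression coefficient."""
    return math.exp(estimate)

def wald_z(estimate, std_error):
    """Wald z statistic: the estimate divided by its standard error."""
    return estimate / std_error

def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

for name, (b, se) in coefficients.items():
    z = wald_z(b, se)
    print(f"{name}: OR={odds_ratio(b):.2f}, z={z:.2f}, p={two_sided_p(z):.4f}")
```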
Table 4. Evaluation of the probability of case occurrence in the community affected by key factors (source: author).
| Std. Value | Probability of case occurrence affected by number of people aged 65 and above | Probability of case occurrence affected by density of housing |
| --- | --- | --- |
| 0.2 | 0.049 | 0.037 |
| 0.4 | 0.106 | 0.058 |
| 0.6 | 0.213 | 0.089 |
| 0.8 | 0.383 | 0.133 |
| 1 | 0.587 | 0.196 |
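In a logistic model the log-odds of case occurrence are linear in each standardized indicator, so the probabilities in Table 4 should imply a constant slope on the logit scale. The sketch below is an illustrative consistency check, not the authors' procedure: it recovers that slope from the first and last rows of Table 4 and compares it with the estimates reported in Table 3 (4.14 and 2.30). The helper names are hypothetical.

```python
import math

# Values transcribed from Table 4
std_values = [0.2, 0.4, 0.6, 0.8, 1.0]
p_elderly = [0.049, 0.106, 0.213, 0.383, 0.587]
p_housing = [0.037, 0.058, 0.089, 0.133, 0.196]

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def implied_slope(xs, ps):
    """Slope on the logit scale between the first and last table rows."""
    return (logit(ps[-1]) - logit(ps[0])) / (xs[-1] - xs[0])

print(f"elderly slope: {implied_slope(std_values, p_elderly):.2f}")
print(f"housing slope: {implied_slope(std_values, p_housing):.2f}")
```

Both recovered slopes land close to the Table 3 estimates, which suggests the table was generated by varying one indicator while holding the others fixed.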
Table 5. Variance inflation factors (VIF) of independent variables.
| Independent Variable | Variance Inflation Factor (VIF) |
| --- | --- |
| Number of Floating Population | 3.8776 |
| Coverage of Residential Activities | 2.5780 |
| Distance to Densely Populated Places | 1.0543 |
| Population Density | 1.3262 |
| Number of population aged 65 and above | 3.3629 |
| Proportion of population aged 65 and above | 1.6025 |
| Gross Land Area | 1.9257 |
| Housing Density | 1.0955 |
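A VIF is defined as 1 / (1 − R²), where R² comes from regressing one predictor on all the others, so each value in Table 5 can be converted back into the share of that predictor's variance explained by the remaining predictors. The sketch below applies this identity to the Table 5 values; the threshold of 5 is a common rule of thumb assumed here for illustration, not a cutoff stated in the article.

```python
# VIF values transcribed from Table 5
vif = {
    "Number of Floating Population": 3.8776,
    "Coverage of Residential Activities": 2.5780,
    "Distance to Densely Populated Places": 1.0543,
    "Population Density": 1.3262,
    "Number of population aged 65 and above": 3.3629,
    "Proportion of population aged 65 and above": 1.6025,
    "Gross Land Area": 1.9257,
    "Housing Density": 1.0955,
}

def implied_r_squared(vif_value):
    """Invert VIF = 1 / (1 - R^2) to get the auxiliary-regression R^2."""
    return 1.0 - 1.0 / vif_value

ASSUMED_THRESHOLD = 5.0  # rule-of-thumb cutoff, an assumption for this sketch

for name, value in vif.items():
    flag = "ok" if value < ASSUMED_THRESHOLD else "check"
    print(f"{name}: VIF={value:.4f}, implied R^2={implied_r_squared(value):.3f} ({flag})")
```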
Table 6. Classification performance metrics with class imbalance.
Confusion Matrix and Statistics

| | Reference: 0 | Reference: 1 |
| --- | --- | --- |
| Prediction: 0 | 3739 | 157 |
| Prediction: 1 | 0 | 2 |

| Statistic | Value |
| --- | --- |
| Accuracy | 0.9597 |
| 95% Confidence Interval | (0.9531, 0.9657) |
| No Information Rate (NIR) | 0.9592 |
| p-Value [Acc > NIR] | 0.4565 (p > 0.05) |
| Cohen’s Kappa | 0.0239 |
| McNemar’s Test p-Value | <2.0 × 10^−16 |
| Specificity | 0.01258 |
| Pos Pred Value (PPV) | 0.95970 |
| Neg Pred Value (NPV) | 1.00000 |
| Prevalence | 0.95921 |
| Detection Rate | 0.95921 |
| Detection Prevalence | 0.99949 |
| Balanced Accuracy | 0.50629 |
| ‘Positive’ Class | 0 |
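The statistics in Table 6 all follow from the four cells of the confusion matrix. The sketch below recomputes them from the printed counts, treating class 0 (no case) as the positive class as the table states; the function and key names are illustrative. It makes explicit why the accuracy sits so close to the No Information Rate: the model predicts class 0 for all but two of the 3898 communities.

```python
def classification_metrics(tp, fp, fn, tn):
    """Summary statistics for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Chance agreement from the row and column marginals (Cohen's kappa)
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "accuracy": accuracy,
        "no_information_rate": max(tp + fn, fp + tn) / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": (accuracy - expected) / (1 - expected),
    }

# Counts from Table 6, with class 0 as the positive class:
# predicted 0 / reference 0 = 3739, predicted 0 / reference 1 = 157,
# predicted 1 / reference 0 = 0,    predicted 1 / reference 1 = 2
metrics = classification_metrics(tp=3739, fp=157, fn=0, tn=2)
for key, value in metrics.items():
    print(f"{key}: {value:.5f}")
```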
Table 7. 10-Fold cross-validation error estimates.
| Cross-Validation Error Metric | Value |
| --- | --- |
| Raw CV Error Estimate (Mean across folds) | 0.03821481 |
| Bias-Adjusted CV Error Estimate | 0.03820028 |
Table 8. Probability distribution of case occurrence in sample communities (source: author).
| Probability of Case Occurrence | Frequency | Frequency Proportion |
| --- | --- | --- |
| 0–0.01 | 284 | 7.12% |
| 0.01–0.02 | 385 | 9.65% |
| 0.02–0.03 | 711 | 17.82% |
| 0.03–0.04 | 932 | 23.36% |
| 0.04–0.05 | 739 | 18.53% |
| 0.05–0.06 | 347 | 8.70% |
| 0.06–0.07 | 198 | 4.96% |
| 0.07–0.08 | 85 | 2.13% |
| 0.08–0.09 | 58 | 1.45% |
| 0.09–0.1 | 37 | 0.93% |
| 0.1–1 | 122 | 3.06% |
Table 9. NPIs corresponding to risk factors (source: author).
Risk factor: High proportion of elderly residents
Focus of NPIs: Protect the elderly from exposure and infection, and support their physical and mental well-being during the pandemic.
Corresponding NPIs:
  • Enhancing testing, tracing, isolation, and quarantine measures for suspected or confirmed cases within or outside the community.
  • Providing priority access to vaccination for older adults and their caregivers.
  • Restricting visitors or non-residents from entering the community unless necessary.
  • Providing home-based or online services such as health care, food delivery, social support, and entertainment.
  • Encouraging physical activity and outdoor exposure within safe distance and time limits.

Risk factor: High housing density
Focus of NPIs: Reduce the transmission risk within and between households, and improve environmental quality and ventilation.
Corresponding NPIs:
  • Implementing zoning or staggered strategies to limit the number of people accessing common areas or facilities such as elevators, corridors, stairs, and laundry rooms.
  • Providing alternative accommodation or isolation facilities for confirmed or suspected cases who cannot self-isolate at home.
  • Enhancing outdoor airflow and filtration in shared ventilation systems or installing portable air purifiers in individual units.
  • Promoting mask wearing, hand hygiene, ventilation, and disinfection among residents and staff.
  • Encouraging social cohesion and mutual support among residents and staff.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, Y.; Sun, X.; Chen, H.; Zhang, H.; Li, Y.; Lin, W.; Ding, L. Exploring the Factors Influencing the Spread of COVID-19 Within Residential Communities Using a Big Data Approach: A Case Study of Beijing. Buildings 2025, 15, 2186. https://doi.org/10.3390/buildings15132186
