Article

Analyzing the Impact of Land-Use Characteristics and Demographic Factors on Spatial Variations in Public Bus Usage: A Comparison of Pre- and During COVID-19 Periods

Department of Geography Education, Dongguk University, 30, Pildong-ro 1-gil, Jung-gu, Seoul 04620, Republic of Korea
*
Author to whom correspondence should be addressed.
Land 2025, 14(5), 1102; https://doi.org/10.3390/land14051102
Submission received: 26 March 2025 / Revised: 11 May 2025 / Accepted: 14 May 2025 / Published: 19 May 2025

Abstract

The spread of the coronavirus pandemic led to significant changes in bus-usage patterns in urban areas worldwide. Researchers have frequently employed linear and nonlinear models in bus-usage studies. However, existing linear models assume that each variable affects a uniform range, limiting their ability to capture localized pattern changes. This study applies a multiscale geographically weighted regression model reflecting the characteristics of the variables to address these limitations. Linear models are constrained by their inability to account adequately for the complex dynamics of real-world bus usage. This research introduces nonlinear methods to overcome these constraints. The geographical random forest method, an advanced variant of the random forest model, integrates spatial concepts to explain local patterns more effectively than traditional machine learning techniques. The linear models revealed significant changes in four variables (i.e., population size, over-65 population ratio, number of students, and land-use complexity). In contrast, nonlinear models demonstrated diverse movement patterns influenced by several factors, indicating a shift toward new public transportation patterns.

1. Introduction

Transit-oriented development is an urban planning strategy that addresses the challenges of densely populated cities by enhancing accessibility to public transportation. This strategy aims to foster a sustainable transportation system by reducing reliance on private vehicles and encouraging walking, cycling, and public transit [1]. However, the spread of coronavirus disease 2019 (COVID-19) posed significant risks to these systems. In response, governments worldwide implemented movement restrictions and social distancing measures, which led to a significant decrease in mobility [2,3,4]. Notably, during the COVID-19 pandemic, the decline in public transportation use was more pronounced than the decline in personal vehicle use [5,6,7].
Numerous studies have investigated how COVID-19 changed the relationship between public transportation usage patterns and the built environment [6,8,9]. However, these studies often lack a spatial analysis or the ability to visualize spatial discrepancies in usage. Moreover, because they focused on overall trends, they did not adequately capture how the relationships between key factors, such as demographic variables and land use, shifted regionally in the post-pandemic context. This research differs by employing advanced spatial models, specifically multiscale geographically weighted regression (MGWR) and geographical random forest (GRF), both to capture the changing relationships between public transport usage and influencing factors and to account for spatial heterogeneity in these relationships. Unlike traditional approaches, this study integrates spatial bandwidth information from geographically weighted regression (GWR) into GRF, offering deeper insights into local dependencies and allowing region-specific transit demand drivers to be identified. This method provides a more nuanced understanding of transit patterns, especially for vulnerable groups, which supports targeted transit planning in the post-pandemic era. In this way, the study contributes a more comprehensive, spatially aware framework for addressing public transit challenges in a post-COVID-19 world.
To achieve this, the study sets out three specific objectives: (1) to spatially identify changes in bus ridership patterns between the pre-pandemic and pandemic periods, (2) to compare and evaluate the explanatory power of GWR and MGWR in capturing regional variations, and (3) to determine the most influential factors using random forest (RF) and GRF models, incorporating spatial dependency through the GRF approach. These steps aim to support evidence-based and regionally adaptive public transportation strategies in the wake of COVID-19.

2. Literature Review

2.1. Factors Related to Public Transportation Usage

Transit-oriented development is recognized for its effective integration with public transportation systems, enhancing public transit usage by concentrating its influence on surrounding areas [10,11]. Prior research has identified population density, the built environment, and land use as critical indicators of transit-oriented development [12]. Specifically, demographic factors, such as the population density, number of employees, and number of students, are positively correlated with public transportation use [1,13,14,15,16]. Land-use characteristics, including the ratios of residential and commercial areas, land prices, and land-use complexity, have also been significant in explaining public transportation patterns [1,15,16,17,18,19,20]. Additionally, variables related to the connectivity between buses and subways enhance public transportation usage [1,11,14,15,16,18,19,20,21,22]. Various factors reflecting the surrounding built environment, such as weather conditions, road density, subway station configurations, bicycle stands, parking lots, and parking spaces, have also been considered [6,11,14,15,22]. Studies have employed these variables to examine how changes in the built environment affected public transportation demand during the COVID-19 pandemic [6,7,8,9,23,24]. However, a local analysis approach has rarely been applied in such studies.
Recent studies have also explored how demographic and built environment factors influenced travel behavior during and after the COVID-19 pandemic. For instance, metro commuting in Wuhan was analyzed, highlighting the shifting role of the built environment in shaping ridership patterns [25]. Spatial resilience in London was examined, showing how public transport demand recovery varied across urban areas [26], while the severity of traffic changes under pandemic uncertainty was investigated, underscoring the need to consider socio-spatial disparities in transportation planning [27].

2.2. Methodological Approaches in Public Transportation Research

Various methodological approaches have been employed to analyze public transportation usage, ranging from linear spatial analysis to more complex nonlinear and machine learning techniques. Among linear methods, the ordinary least squares (OLS) model is a representative regression technique in transportation research. Nevertheless, OLS does not account for spatial relationships between adjacent stations and has been criticized for failing to reflect spatial dependencies [15,28,29].
Conversely, GWR addresses spatial heterogeneity by allowing the relationships between dependent and explanatory variables to vary locally [30,31]. GWR has been widely applied in transportation research due to its ability to account for spatial nonstationarity [11,15,29,32]. However, GWR faces limitations, such as a constrained number of data points and potentially biased results due to its uniform bandwidth [31,33,34]. To address these problems, the MGWR method was developed, which applies variable-specific bandwidths and thereby represents the relationships between variables more accurately [31,35]. In addition, MGWR offers advantages in terms of computational efficiency and data handling. Although it has been applied in diverse fields, its use in examining public transportation demand remains limited [31,36,37].
The GWR and MGWR methods are based on linear assumptions, which may not fully capture the complexity of real-world interactions [17,21,38,39]. Nonlinear methods, such as support vector machines, neural networks, and RF, improve accuracy by modeling empirical trends without assuming linear distributions [40,41].
Random forest (RF) was proposed to assess the importance of explanatory variables relative to the dependent variable, offering insights beyond the typical positive or negative relationships of linear models [42]. The RF method has demonstrated superior performance in transportation studies [21,37,41,43]. Despite its advantages, RF does not account for spatial effects; the GRF method addresses this gap by incorporating spatial concepts into RF.
Geographical random forest (GRF), which integrates the bandwidth concept of geographically weighted regression (GWR) to highlight local patterns and spatial heterogeneity, was developed later [44]. It is therefore a promising tool for analyzing public transportation demand, especially in the context of COVID-19 [45]. Nonetheless, GRF remains underused across fields.
In summary, the existing research on public transportation demand has largely overlooked localized analyses in the post-COVID-19 context. Traditional GWR models fail to account for variable-specific distribution characteristics, whereas nonparametric machine learning techniques often ignore spatial effects. Consequently, precise analytical approaches are needed to identify the most influential factors of bus usage and their regional variations. This study therefore builds on previous research linking demographic structure, urban form, and public transport demand. Unlike earlier studies, however, it applies a fine-grained grid-based spatial analysis and incorporates both linear and nonlinear spatial models to capture local heterogeneity and complex interactions.

3. Data and Methodology

3.1. Study Area and Research Flow

The research area on the map in Figure 1a is Seoul, the capital of the Republic of Korea, with 9.39 million residents as of 2023.
At the center of the Korean Peninsula in East Asia, Seoul is the largest city in the Republic of Korea. Due to its complex urban structure and substantial population, Seoul exhibits considerable intra-city variation, with distinct demographic, land-use, and mobility patterns across different districts. In 2021, Seoul had the highest percentage of confirmed COVID-19 cases in the Republic of Korea, accounting for approximately 53.5% of the total population. This led to the implementation of strict social distancing measures, which resulted in a significant reduction in movement, particularly in public transportation. The map in Figure 1b depicts a notable decrease in bus usage across most areas of Seoul. The left panel in Figure 1a presents the distribution of bus stops in the city, and the following diagram outlines the overall research framework. Given its urban complexity, high dependency on public transit, and diverse spatial characteristics, Seoul provides a suitable and representative case for studying the spatial impacts of COVID-19 on transit usage.
First, in the initial stage (Figure 2a), geographical information system (GIS) shapefiles based on a grid (500 m × 500 m) were employed to examine the spatial changes in public transportation volume in Seoul. A map was created to display the bus stops, with the daily average number of boarding and alighting passengers at each bus stop serving as the dependent variable. Relevant studies were reviewed to identify factors affecting changes in public transportation volume using a linear analysis.
In the second stage (Figure 2b), variables directly affecting bus usage were identified. An analysis of the connectivity between public transportation modes, such as buses and subways, was conducted. The collected data on these factors were standardized to mitigate scale disparities caused by heterogeneous data types, ensuring that specific factors did not disproportionately influence the results. Subsequently, a geodatabase for the research analysis was established.
In the third stage (Figure 2c), an influence analysis and importance assessment were performed to observe the spatial changes in public transportation volume. First, an analysis was conducted to understand how bus usage spatially depends on and varies with the selected factors. Next, GWR and MGWR models were applied to analyze which variables had a more significant spatial influence, and the two methods were compared to determine the most suitable approach for the research objectives. Finally, an RF machine learning technique and a GRF model incorporating the bandwidth concept were used for importance assessment to identify significant factors by region.

3.2. Data Collection

The dependent variable, the daily average number of bus rides, was derived from smart card transportation data provided by Seoul’s public bus operators.
The data are from 2019 (before the COVID-19 pandemic) and 2021 (when social distancing policies were most strictly enforced throughout the year). Data from 2022 and 2023 were omitted as these years represented a transitional phase characterized by the gradual relaxation of restrictions and shifting travel behaviors, which could introduce confounding effects into the assessment of the pandemic’s impact.
Explanatory variables affecting bus usage were chosen based on previous studies exploring the relationship between public transportation and diverse factors [6,13,15,17,21,22]. These variables are categorized into demographics, land-use characteristics, and connectivity.
Socioeconomic data were sourced from grid-cell-based datasets published by Korea’s National Geographic Information Institute. Grid-based data were used for two reasons: they capture continuous spatial phenomena in finer detail than conventional traffic analysis zones (TAZs), and an appropriately sized spatial unit reduces the bias of spatial analysis models in both urban and non-urban areas. Although TAZs offer population-balanced spatial units, the grid-based approach was preferred for its compatibility with high-resolution spatial datasets.
Additionally, the smart card data, which are originally recorded at the level of individual bus stops, were spatially aggregated to match the grid cells using the geographic coordinates of each stop. This process ensured compatibility with other grid-based datasets used in the study. However, one notable limitation of the smart card dataset is the incomplete recording of alighting information, particularly for bus users, which may affect the spatial precision of the dependent variable. In such cases, approximations or partial records may have been used, potentially introducing spatial uncertainty.
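The stop-to-grid aggregation described above can be sketched as follows. This is a minimal illustration, assuming stop coordinates are already projected to metres; the function names and the three sample stops are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict

CELL_SIZE_M = 500  # grid resolution stated in the text (500 m x 500 m)


def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing a projected point."""
    return (int(x_m // cell_size), int(y_m // cell_size))


def aggregate_to_grid(stops):
    """Sum the daily average ridership of all stops that fall in each cell."""
    totals = defaultdict(float)
    for x_m, y_m, riders in stops:
        totals[cell_index(x_m, y_m)] += riders
    return dict(totals)


# Hypothetical stops: (easting in m, northing in m, daily average rides)
stops = [(120.0, 480.0, 300.0), (499.0, 10.0, 150.0), (510.0, 20.0, 90.0)]
grid = aggregate_to_grid(stops)
print(grid)
```

Two stops only 11 m apart (at 499 m and 510 m easting) land in different cells here, which illustrates the kind of boundary effect that grid-based aggregation can introduce.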
Despite offering fine spatial granularity, the grid-based approach may also result in inhomogeneous population or land-use distributions within cells, especially in dense metropolitan areas like Seoul. For instance, grids in central business districts may contain significantly higher population density and functional diversity compared to suburban grids. During the COVID-19 pandemic, strict social distancing measures led to reduced usage of commercial facilities and a shift in population activity patterns, with more movements concentrated around residential areas. These behavioral changes may have indirectly influenced several variables, including population density, land-use complexity, and public transit demand. These limitations and contextual factors should be taken into account when interpreting the findings.
Among the various variables, land-use complexity represents the number of different building use types within each grid cell. This metric was calculated by counting the types of buildings present based on 28 building use categories defined in the Building Act and its Enforcement Decree. These include residential, commercial, religious, cultural, medical, educational, industrial, and recreational uses, among others. A higher number of distinct building uses in a grid reflects a higher level of land-use complexity. Unlike more nuanced indices such as the Shannon Diversity Index, this approach provides a straightforward measure of use diversity, though it does not reflect the balance or evenness of land-use distribution. Spatial adjustments were applied to align bus usage data and subway station proximity data with these grid cells. The distance to subway stations represents the mean Euclidean distance from each grid centroid to the closest subway station, capturing multimodal accessibility. Classical graph-theoretic measures were considered but not applied due to data limitations and the focus on direct physical proximity. Bus usage was aggregated to match the grid boundaries.
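The difference between the count-based complexity measure and an evenness-aware index can be illustrated with a short sketch. The building lists below are invented for illustration, and the Shannon index is shown only as the comparison mentioned in the text, not as a measure used in the study.

```python
import math
from collections import Counter


def land_use_complexity(building_uses):
    """Count-based measure: number of distinct building use types in a cell."""
    return len(set(building_uses))


def shannon_index(building_uses):
    """Shannon diversity H = -sum(p * ln p) over the use-type shares."""
    counts = Counter(building_uses)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Two hypothetical cells with the same four use types but different balance
balanced = ["residential", "commercial", "medical", "educational"]
skewed = ["residential"] * 97 + ["commercial", "medical", "educational"]

print(land_use_complexity(balanced), land_use_complexity(skewed))
print(shannon_index(balanced), shannon_index(skewed))
```

Both cells receive the same complexity score, whereas the Shannon index is much lower for the cell dominated by a single use, which is the evenness information the count-based measure leaves out.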
Table 1 lists the definitions of the 12 independent variables across these categories, and Table 2 provides descriptive statistics for the determinants in this study. For factors without 2019 values, the most recent available data were used, because data for both 2019 and 2021 were not always available. All explanatory variables were standardized using Z-score normalization before the analysis to ensure comparability and prevent scale bias.
In Table 2, the changes experienced by the variables between 2019 and 2021 are indicative of the influence of the COVID-19 pandemic and related social distancing policies. For example, bus usage decreased from an average of 944 rides in 2019 to 717 in 2021, reflecting reduced public transportation demand due to pandemic restrictions. Demographic variables such as the aging population ratio also saw changes, with an increase in the proportion of elderly individuals in 2021, possibly due to restrictions affecting younger populations more significantly. Moreover, land-use characteristics show a clear impact of pandemic policies: commercial area usage declined as businesses faced restrictions, while residential areas saw an increase, possibly reflecting people spending more time at home during lockdowns. These shifts in land-use complexity and other variables reflect broader shifts in behavior and land-use patterns due to the pandemic.

3.3. Methods

The GWR and MGWR methods were employed to analyze how the local relationship between bus usage and influencing factors changes. Brunsdon et al. introduced GWR, a local regression model that estimates relationships based on the values of the dependent and explanatory variables at each location [30]. Unlike traditional global regression methods that assume a constant relationship between variables across space, GWR allows the parameters to vary over space [31]. The formulation of GWR is as follows:
$$Y_i = \sum_{j=0}^{m} \beta_j(\mu_i, v_i)\, x_{ij} + \varepsilon_i. \tag{1}$$
In Equation (1), $Y_i$ represents the dependent variable for the $i$th grid, $\beta_j(\mu_i, v_i)$ denotes the coefficient of the $j$th explanatory variable at the location $(\mu_i, v_i)$ of the $i$th grid, $x_{ij}$ denotes the $j$th explanatory variable for the $i$th grid, and $\varepsilon_i$ indicates the error term [14].
The GWR estimation process involves setting a bandwidth around a location i and calculating the weights for each observation based on the distance from the centroid. The model coefficients are estimated using the weighted least squares method [30]. In this study, the GWR model employs the bi-square distance weighting method, where weights decrease with distance and become zero beyond a certain threshold. Despite using grid units with a constant spatial distribution, this study applies an adaptive kernel method to exclude grids without bus stops or population segments.
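A minimal sketch of bi-square weighting with an adaptive bandwidth is given below. It assumes the common convention that the adaptive bandwidth at a location equals the distance to its k-th nearest observation; the function names, the value of `k`, and the sample coordinates are illustrative and are not taken from the study.

```python
import math


def bisquare_weights(distances, bandwidth):
    """Bi-square kernel: (1 - (d/b)^2)^2 inside the bandwidth, 0 beyond it."""
    return [
        (1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
        for d in distances
    ]


def adaptive_bandwidth(distances, k):
    """Distance to the k-th nearest observation (the focal point counts)."""
    return sorted(distances)[k - 1]


def local_weights(focal, points, k):
    """Weights of all observations for the local regression at `focal`."""
    distances = [math.dist(focal, p) for p in points]
    return bisquare_weights(distances, adaptive_bandwidth(distances, k))


# Hypothetical grid centroids in metres along one row of cells
points = [(0, 0), (500, 0), (1000, 0), (1500, 0), (3000, 0)]
weights = local_weights((0, 0), points, k=4)
print(weights)
```

Because the bandwidth follows the local density of observations, sparse areas (for example, where cells without bus stops were excluded) still draw on the same number of neighbours as dense areas.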
To standardize the data before applying the GWR and MGWR models, the explanatory variables were normalized using Z-scores. This ensures that all variables are on the same scale and comparable, mitigating any potential bias introduced by differences in units or magnitude.
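The Z-score step can be sketched as follows. The use of the population standard deviation is an assumption (the text does not state which variant was applied), and the sample values are invented.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD, not sample SD
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mean) / sd for v in values]


# Hypothetical values of one explanatory variable across four grid cells
standardized = z_scores([10.0, 20.0, 30.0, 40.0])
print(standardized)
```

After this transformation, coefficients for variables measured in different units (for example, land price and population counts) can be compared on a common scale.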
Bandwidths were selected using a golden-section search, which applies the golden ratio to narrow the search interval toward the optimal value. The optimal bandwidth was the one with the smallest corrected Akaike information criterion (AICc), that is, Akaike’s information criterion with a small-sample correction [34,36]. AICc was chosen because it balances model fit and complexity, penalizing models with too many parameters, which helps prevent overfitting on the spatial data.
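The bandwidth search can be illustrated with a generic golden-section minimizer. In this sketch, `toy_aicc` is a hypothetical stand-in for the real criterion: in an actual analysis, each evaluation would fit a GWR model at the candidate bandwidth and return its AICc. The search bounds are likewise illustrative.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio, about 0.618


def golden_section_min(score, lower, upper, tol=1e-6):
    """Locate the minimizer of a unimodal function on [lower, upper]."""
    a, b = lower, upper
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = score(c), score(d)
    while abs(b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]: shrink from the right, reuse c as new d
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = score(c)
        else:
            # Minimum lies in [c, b]: shrink from the left, reuse d as new c
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = score(d)
    return (a + b) / 2


def toy_aicc(bandwidth):
    """Hypothetical stand-in for AICc as a function of bandwidth."""
    return (bandwidth - 120.0) ** 2 + 1000.0


best_bandwidth = golden_section_min(toy_aicc, 30.0, 500.0)
print(round(best_bandwidth, 3))
```

Each iteration reuses one of the two previous evaluations, so only one new model fit is needed per step, which matters when every evaluation is a full local regression.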
The limitations of GWR noted in previous research are addressed by employing MGWR, allowing for varying spatial scales in the relationship between the dependent and explanatory variables. Fotheringham et al. proposed MGWR, which estimates local parameters by adjusting bandwidths according to the spatial characteristics of explanatory variables [31,46]. The formulation of MGWR is as follows:
$$Y_i = \sum_{j=0}^{m} \beta_{bwj}(\mu_i, v_i)\, x_{ij} + \varepsilon_i. \tag{2}$$
In Equation (2), $bwj$ in $\beta_{bwj}$ represents the bandwidth used for calibrating the $j$th conditional relationship [14]. This study employs MGWR to capture the local influences of factors on bus usage more effectively than GWR. The GWR and MGWR analyses were performed using MGWR 2.0, a Python-based software package developed by researchers at Arizona State University.
Next, RF and GRF were applied to assess the local importance of each factor affecting bus usage. Breiman introduced RF, an ensemble method comprising multiple decision trees built from random inputs, whose results are aggregated by majority voting for classification or by averaging for regression [42,47]. The RF method determines factor importance by modeling nonlinear relationships between dependent and explanatory variables. It constructs numerous decision trees independently using bootstrap aggregating (bagging), which reduces variance, enhances model stability, and mitigates the overfitting and noise sensitivity of individual trees [21]. Each decision tree is trained on a bootstrap sample containing roughly two-thirds of the distinct observations, whereas the remaining third, known as the out-of-bag set, is used to estimate model performance. The out-of-bag estimate reduces the need for a separate cross-validation or test set. Final predictions were averaged across all decision trees to produce the overall RF output [45,48]. To select the optimal parameters for the RF model, a random grid search was run with the H2O package in R 4.2.0, which automates the search for the best hyperparameters by testing different combinations.
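The split into in-bag and out-of-bag observations follows from sampling with replacement: the chance that a given row is never drawn in n draws is (1 − 1/n)^n, which approaches e^(−1) ≈ 0.368 for large n. The simulation below is a sketch with an arbitrary sample size and seed; it does not reproduce the H2O implementation.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw n_rows indices with replacement; unseen rows form the OOB set."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_rows = 10_000          # arbitrary illustrative sample size
in_bag, out_of_bag = bootstrap_split(n_rows, rng)

oob_share = len(out_of_bag) / n_rows
expected_share = (1 - 1 / n_rows) ** n_rows  # close to exp(-1)
print(round(oob_share, 3), round(expected_share, 3))
```

Because each tree has its own out-of-bag rows, every observation is predicted only by trees that never saw it, which is what makes the out-of-bag error a usable performance estimate.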
Noted for its simplicity, RF is versatile in addressing various predictive problems and can handle large datasets [49]. Moreover, RF mitigates the risk of overfitting by training multiple decision trees on varied subsets of the data [20]. This characteristic makes RF a robust method that is effective when working with large datasets and appropriately tuned hyperparameters [50]. The random grid search used here for RF parameter optimization was run with the H2O package in R [20,51].
Georganos et al. proposed GRF, which extends the RF model by incorporating spatial heterogeneity that RF does not explicitly account for [19]. GRF integrates the bandwidth concept from GWR to generate localized submodels rather than relying on a single global model, which allows it to evaluate the local significance of each factor [20]. The bandwidth information obtained from GWR sets the spatial scale of the local models, reflecting the varying spatial dependencies across locations, so GRF can capture local variations more effectively than traditional RF models. The spatial bandwidth in GRF was optimized using functions in the SpatialML package in R. The formulation of GRF is as follows:
$$Y_i = a(u_i, v_i)\, x_i + e. \tag{3}$$
In Equation (3), $a(u_i, v_i)\, x_i$ represents the RF prediction at position $i$, and $(u_i, v_i)$ indicates the center coordinates of position $i$. A separate submodel is created at each location $i$ and reflects only the observations adjacent to it [20]. As in GWR, the kernel type can be chosen in GRF, and the adaptive kernel used for GWR is applied in this work. The same parameters are used as in the RF model, and the optimal bandwidth was selected using functions in the SpatialML package for R, ensuring a tailored model that reflects local spatial dependencies.
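The local-submodel structure can be sketched as follows, assuming an adaptive bandwidth expressed as a number of nearest neighbours `k`. To keep the sketch dependency-free, a placeholder learner (`fit_mean_model`, hypothetical) stands in for the random forest; all names and data are illustrative and do not reproduce the R implementation used in the study.

```python
import math


def nearest_neighbours(focal, coords, k):
    """Indices of the k observations closest to the focal location."""
    order = sorted(range(len(coords)), key=lambda i: math.dist(focal, coords[i]))
    return order[:k]


def fit_mean_model(rows):
    """Placeholder learner: predicts the mean response of its training rows.

    In a real GRF, a random forest would be fitted to `rows` instead.
    """
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def local_predictions(coords, features, responses, k, fit=fit_mean_model):
    """Fit one submodel per location on its k nearest observations."""
    predictions = []
    for i, focal in enumerate(coords):
        idx = nearest_neighbours(focal, coords, k)
        model = fit([(features[j], responses[j]) for j in idx])
        predictions.append(model(features[i]))
    return predictions


# Hypothetical data: two spatial clusters with different ridership levels
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
features = [[1.0]] * 6
responses = [10.0, 12.0, 14.0, 100.0, 110.0, 120.0]
preds = local_predictions(coords, features, responses, k=3)
print(preds)
```

Each location is predicted from its own neighbourhood only, so the two clusters yield different local models, whereas a single global model would pool them.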

4. Results

4.1. Spatially Evaluating Determinants Influencing Public Bus Usage

In this section, GWR was employed to analyze the spatial heterogeneity by locally estimating the effects of explanatory variables on bus usage. To address the limitations of GWR, MGWR was then employed to analyze spatial heterogeneity based on the distinctive characteristics of each factor. Table 3 presents the results of the GWR model for 2019 and 2021.
The comparative analysis of the results from 2019 and 2021 reveals shifts in the coefficients of several variables. The coefficient for the total population increased from −39.030 to 0.898, and the coefficient for the over-65 population ratio rose from −11.185 to 58.193, both transitioning from negative to positive effects on bus usage. Conversely, the coefficient for the number of students shifted from 3.013 to −26.251, changing from a positive to a negative effect. These variations imply alterations in bus-usage patterns associated with COVID-19. However, the GWR results may be subject to over- or underestimation because a uniform bandwidth is applied to all variables [31]. The MGWR method was employed to address this limitation, allowing for variable-specific bandwidths based on the distribution characteristics. Table 4 presents the MGWR model results for 2019 and 2021.
The comparison of the results from 2019 and 2021 reveals that the over-65 population ratio had a positive effect on bus usage in both periods; however, its coefficient declined from 31.30 in 2019 to 0.07 in 2021. Similarly, the number of students transitioned from having a positive effect (32.57) in 2019 to a negative effect (−19.20) in 2021, aligning with the findings of the GWR analysis. Furthermore, the positive influence of land-use complexity increased substantially, rising from 116.6 in 2019 to 503.9 in 2021. These findings diverge from those of the GWR analysis. Figure 3 and Figure 4 illustrate the local changes in these variables observed in the MGWR model. Four variables demonstrated notable changes: total population, over-65 population ratio, number of students, and land-use complexity.
As shown in Figure 3, the total population variable exerted a negative influence on bus usage across most areas of Seoul in 2019. This may reflect a pre-pandemic urban environment where higher population areas, particularly in dense commercial or transit-rich zones, had more modal choices or greater reliance on subways or private transport. However, by 2021, this negative effect had diminished, particularly in central districts, and a positive relationship emerged in the northern parts of the city. This spatial shift suggests that population-driven demand for bus services increased in peripheral or residential areas during the pandemic recovery phase, potentially reflecting shifts in essential travel patterns, reduced subway usage due to infection concerns, or an increased reliance on local bus services in areas with limited alternative transit modes.
Conversely, the proportion of the over-65 population, which had a consistently positive impact on bus usage throughout Seoul in 2019, transitioned to a negative influence in 2021, especially in Gangseo-gu and the northern districts. This reversal may be associated with mobility restrictions among elderly populations due to health concerns, increased vulnerability to COVID-19, or changes in travel behavior such as reduced discretionary and non-essential trips. Additionally, service disruptions or perceived risks of crowded public spaces may have discouraged older residents from using buses.
The number of students exhibited the most pronounced change among the variables, positively affecting bus usage in 2019 across most regions, excluding parts of Guro-gu and Yangcheon-gu. However, by 2021, the number of students became a negative determinant in all regions. This change is attributed to remote education in middle and high schools due to COVID-19. Land-use complexity also underwent a notable change: in 2019, it exhibited positive and negative effects depending on the region, but by 2021, it had a uniformly weak positive effect across all areas. This likely reflects the simplification of people’s travel purposes as a result of social distancing. Figure 5 visualizes the residuals and R-squared values from the MGWR model.
The two maps in Figure 5 illustrate the spatial distribution of residuals, representing the difference between the actual and predicted values in the MGWR model. The map analysis indicates that the residuals from the MGWR model do not display any discernible spatial patterns. Moran’s I values for the residuals in the 2019 and 2021 models were −0.016 and −0.018, respectively, suggesting a lack of significant spatial dependence in both models. At the bottom of Figure 5, the two maps depict the spatial distribution of the local coefficients of determination for the MGWR model. The coefficient of determination exceeded 0.5 in Jung-gu and Jongno-gu, whereas it fell below 0.3 in western regions. These findings indicate that the effect of the determinants on bus usage varies across regions.
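The residual check can be illustrated with a direct implementation of global Moran's I, I = (n / S0) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where S0 is the sum of all weights. The four-cell contiguity matrix and residual values below are invented for illustration; the weighting scheme used in the study is not stated in this section.

```python
def morans_i(values, weights):
    """Global Moran's I for values with an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical row of four cells; neighbours share an edge
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours always differ
clustered = [1.0, 1.0, -1.0, -1.0]    # similar values sit together

print(morans_i(alternating, contiguity))
print(morans_i(clustered, contiguity))
```

Under no spatial autocorrelation the expected value is −1/(n − 1), which is close to zero for a large number of cells, so values such as those reported above are consistent with spatially unstructured residuals.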

4.2. Ranking the Importance of Determinants

The preceding analyses employed spatial methods based on linear relationships. This section applies nonparametric machine learning techniques, specifically RF and GRF, for further understanding. The RF analysis provided insight into the relative importance of each variable concerning bus usage. The GRF analysis offered a localized variable importance assessment, accounting for spatial correlations with neighboring regions. Table 5 presents the results of the RF model for 2019 and 2021.
When comparing the results from the two periods, the importance ranking of certain variables shifted; however, no variable experienced a change in relative importance greater than 10. This finding indicates that the RF analysis, which does not account for spatial correlation between adjacent regions, insufficiently detected changes in bus-usage patterns attributable to COVID-19. A GRF analysis was conducted to address these limitations, incorporating the bandwidth concept from GWR. Table 6 presents the results of the GRF models for 2019 and 2021.
Several notable changes in variable importance were observed when comparing the results from the two periods. Similarly to the RF model, the importance rankings of variables, such as the official land price and number of employees, shifted. In 2021, the relative importance of variables, such as the commercial area ratio, distance from subway stations, and productive population ratio, significantly increased, with changes exceeding 10. The over-65 population ratio (ranked 11th in 2019) advanced to sixth place in 2021, indicating the most substantial change. Figure 6 presents a variable importance map from the GRF model, highlighting the highest and lowest importance values across the grids. In addition, Figure 7 depicts the area ratios of variables with the maximum and minimum importance values.
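The comparison of importance rankings between two periods can be sketched as below. The importance values are made up purely for illustration and are not the figures from Table 6; the variable names are shortened labels.

```python
def rank_by_importance(importance):
    """Map each variable to its rank (1 = most important)."""
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_shifts(before, after):
    """Positive values mean the variable moved up the ranking."""
    rank_before = rank_by_importance(before)
    rank_after = rank_by_importance(after)
    return {name: rank_before[name] - rank_after[name] for name in rank_before}


# Made-up relative importance values for three variables
importance_2019 = {"land_price": 100.0, "employees": 80.0, "over65_ratio": 5.0}
importance_2021 = {"land_price": 60.0, "employees": 100.0, "over65_ratio": 70.0}

shifts = rank_shifts(importance_2019, importance_2021)
print(shifts)
```

Reporting rank shifts alongside the change in relative importance helps separate variables that merely swapped adjacent positions from those whose influence changed substantially.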
In the 2019 GRF model, the official land price emerged as the most significant variable influencing bus usage over the largest area. This variable was influential in the regions surrounding the Hangang River and Geumcheon-gu in eastern Seoul, encompassing approximately 38% of the city. Conversely, the number of employees was the dominant factor in northern and southern Seoul, covering about 30% of the area. The distance from the subway station was most significant in approximately 15% of Seoul, including Eunpyeong-gu, Guro-gu, and Yeongdeungpo-gu.
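Area shares of this kind can be derived from per-grid local importance values. The following Python sketch shows one plausible way to do so, assuming each grid cell carries a mapping from variable name to local importance; the function name, the variable keys, and the four toy cells are hypothetical and are not taken from the study’s data.

```python
from collections import Counter


def dominant_variable_shares(local_importance, dominant=True):
    """Share of grid cells in which each variable has the highest (or lowest) importance.

    local_importance: list of dicts, one per grid cell, mapping variable -> importance.
    Ties are resolved in favour of the first variable listed in the cell.
    """
    choose = max if dominant else min
    winners = Counter(choose(cell, key=cell.get) for cell in local_importance)
    n_cells = len(local_importance)
    return {name: count / n_cells for name, count in winners.items()}


# Hypothetical local importance values for four grid cells.
toy_grid = [
    {"land_price": 0.9, "employees": 0.4, "subway_distance": 0.2},
    {"land_price": 0.8, "employees": 0.5, "subway_distance": 0.1},
    {"land_price": 0.3, "employees": 0.7, "subway_distance": 0.2},
    {"land_price": 0.2, "employees": 0.3, "subway_distance": 0.6},
]

top_shares = dominant_variable_shares(toy_grid)                     # most important
bottom_shares = dominant_variable_shares(toy_grid, dominant=False)  # least important
```

Because every grid cell is assigned to exactly one variable, the shares in each result sum to 1, which matches the way the percentages are reported as fractions of the city’s area.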
In the 2021 GRF model, the number of employees became the most influential variable, affecting northern Seoul in particular. This reflects the post-pandemic recovery of work-related trips, especially in areas with high employment density. Compared with the 2019 model, the area where the official land price was the dominant determinant shrank the most, falling from 38% of Seoul in 2019 to 15% in 2021. This decline likely reflects a weakening correlation between property value and travel behavior during the pandemic, as commercial areas became less active under social distancing policies.
The significance of the commercial area ratio, productive population ratio, and total population number increased markedly in 2021, indicating a shift in the relative importance of factors influencing bus usage regionally. These variables gained prominence as public transport demand became more aligned with population-driven and locally essential activity patterns during the pandemic.
In contrast, land-use complexity was not a major determinant in the 2019 GRF model, exerting less influence than the youth and over-65 population ratios. Specifically, it was the least important variable across 58% of Seoul. The youth population ratio was significant in only 11% of the area, predominantly around Gangseo-gu and Jongno-gu, whereas the over-65 population ratio was notable in just 9% of the area, primarily in central Seoul, from Gangbuk-gu to Seocho-gu.
For the 2021 GRF model, land-use complexity remained the least significant variable, followed by the number of students, residential area ratio, and over-65 population ratio. In 2021, the proportion of regions where land-use complexity was a minor factor decreased by half compared with 2019. Furthermore, a substantial increase occurred in areas where the number of students, total population, and distance from the subway station were insignificant. This trend reflects how the spatial determinants of bus ridership became more fragmented and context-dependent during the pandemic, underlining the differentiated impact of COVID-19 policies across neighborhoods. Overall, the changing influence of each variable between 2019 and 2021 demonstrates how the pandemic reshaped spatial travel demand patterns. Employment-related factors gained relevance, while traditional urban form indicators like land value and land-use complexity lost explanatory power, emphasizing the need for spatially adaptive transit planning.
This diversification in the importance of factors by region underscores the limitations of linear assumptions and the necessity of incorporating spatial considerations in public transportation demand studies.
Figure 7 illustrates the proportion of each variable’s importance using pie charts. This figure presents the average significance of factors in each period, providing insight into which factors were more influential in 2019 and in 2021. Finally, the residuals were analyzed to evaluate the suitability of the GRF model, and Figure 8 depicts their spatial distribution.
Figure 8 presents the residual map for both 2019 and 2021. The map indicates that the residuals are largely randomly distributed across the study area, with no clear spatial patterns emerging in either period. Moran’s I values for the GRF model residuals were 0.069 for 2019 and −0.005 for 2021. These values suggest a lack of significant spatial dependence, implying that the model adequately captured the major spatial patterns in bus ridership and that there were no residual spatial effects that could be explained by other factors.

5. Discussion

The COVID-19 pandemic presented new challenges for public transportation systems, particularly for vulnerable populations who depend heavily on public transit. In Seoul, bus networks serve as a crucial link for reaching destinations that are not directly accessible by subway, offering flexible mobility options for various demographic groups. This study provides spatial insights into how public bus usage changed during the pandemic and reveals the localized effects of key influencing factors.
Our findings are consistent with prior studies that observed pandemic-induced changes in transit demand patterns [6,9,21], but they extend existing knowledge by incorporating spatial heterogeneity using multiscale geographically weighted regression (MGWR). Previous research has established that demographic and built-environment variables significantly impact public transportation use [1,15,21,22]. However, our results highlight that these impacts vary considerably across neighborhoods, especially in the context of post-COVID-19 recovery. Unlike conventional models that rely on spatially uniform assumptions, MGWR reveals the nuanced and region-specific nature of these relationships.
From a methodological standpoint, this study compares the performance of GWR and MGWR and demonstrates that MGWR more effectively captures spatial variation by applying variable-specific bandwidths. The analysis shows that, for instance, the influence of population size increased in certain northern districts of Seoul after the pandemic, while the influence of the over-65 population declined, likely due to behavioral and demographic shifts. Similarly, the reduced demand from students was reflected in areas where remote learning became widespread. These results suggest that pandemic-related behavioral changes have altered the spatial dynamics of transit use. Our findings indicate that COVID-19 policies, such as curfews and telecommuting recommendations, led to short-term changes in structural variables like land-use complexity and public transportation accessibility. These policies influenced behavioral patterns, such as reduced commuting and increased remote work, which in turn affected the demand for public transportation. The interplay between these behavioral changes and structural variables highlights the complex dynamics of urban mobility during the pandemic. Understanding these interactions is crucial for designing adaptive transportation strategies in future urban planning.
Figure 9 illustrates the spatial patterns observed in the study, highlighting the areas where the influence of demographic and land-use factors on public bus demand shifted over time.
Complementing the MGWR analysis, the machine learning-based geographical random forest (GRF) model provided additional depth by capturing both nonlinear relationships and spatial dependencies. In contrast to the standard random forest (RF) model, GRF incorporated spatial bandwidths, uncovering regionally distinct determinants of bus usage in both pre- and post-pandemic periods. For instance, while land price, employment density, and subway accessibility were highly influential in 2019, their importance declined in 2021, with other variables gaining prominence. This shift illustrates the evolving drivers of transit demand and underscores the value of combining statistical and machine learning approaches in spatial transport research.
From a policy and planning perspective, the findings offer several practical implications. First, the spatially disaggregated results support the design of locally adaptive transit strategies. Transit planners can tailor bus routes and service frequencies based on regional differences in demographic and land-use profiles. For example, areas where demand from older adults has increased could receive enhanced service coverage, while service in areas with declining student ridership might be reduced to optimize resource allocation. Such dynamic planning promotes both efficiency and equity—critical goals in resilient public transport systems.
Furthermore, integrating MGWR and GRF into transport analysis enables policymakers to move beyond one-size-fits-all approaches and implement context-sensitive interventions. These methods provide planners with tools to identify high-priority regions for intervention and tailor responses according to the spatial variability of influencing factors. In future pandemics or disruptions, such tools could be instrumental in designing scalable and responsive transit strategies that maintain service quality while minimizing public health risks.
While this study focuses on Seoul, its methodological framework is transferable to other metropolitan areas with similar transit infrastructures. The combined use of MGWR and GRF allows for localized insight into urban mobility systems and could be adopted in comparative studies across different cities. However, contextual differences in governance, urban form, and travel behavior must be considered when applying this approach elsewhere.
Regarding the methodological contributions, this study compared the performance of multiple models, including GWR, MGWR, RF, and GRF, to determine which model is most suitable under varying conditions. Each of these models offers distinct advantages depending on the context of the study. For example, GWR is effective when spatial variation is moderate and can capture local dynamics, but it may not handle complex spatial dependencies as well as MGWR. The MGWR approach, which applies variable-specific bandwidths, performs better when there are substantial regional differences in influencing factors, as observed in the post-pandemic period. Meanwhile, RF excels at uncovering nonlinear relationships between variables but does not explicitly account for spatial dependencies, which is a limitation for studies focused on densely populated urban areas. On the other hand, GRF, by incorporating both spatial and nonlinear dependencies, provides the most comprehensive insights for understanding urban transit patterns in complex metropolitan settings.
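To make the role of bandwidth more concrete, the sketch below computes kernel weights around a single focal location. It assumes an adaptive bisquare kernel, in which the bandwidth is the distance to the k-th nearest observation; this is a common choice in GWR software but is an assumption here, not a detail confirmed by the text. The function name and the toy coordinates are hypothetical.

```python
import math


def adaptive_bisquare_weights(focal, points, k):
    """Bisquare kernel weights, using the distance to the k-th nearest point as bandwidth.

    Points at or beyond the bandwidth distance receive a weight of zero.
    """
    dists = [math.dist(focal, p) for p in points]
    bandwidth = sorted(dists)[k - 1]
    if bandwidth == 0:
        raise ValueError("k is too small: the bandwidth distance is zero")
    return [(1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0 for d in dists]


# Ten hypothetical grid centroids on a line, one unit apart.
points = [(float(x), 0.0) for x in range(10)]
focal = points[0]

local_weights = adaptive_bisquare_weights(focal, points, k=4)    # narrow bandwidth
broad_weights = adaptive_bisquare_weights(focal, points, k=10)   # wide bandwidth

n_local = sum(w > 0 for w in local_weights)
n_broad = sum(w > 0 for w in broad_weights)
```

In a GWR-style model, a single k of this kind is shared by all variables, whereas MGWR selects a separate bandwidth for each variable, so that some relationships can be estimated very locally while others are close to global.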
This comparative analysis of the models contributes to the broader field of transport research by illustrating the strengths and weaknesses of each approach in handling spatial-temporal data. By understanding which model is most appropriate for different circumstances, transit planners and researchers can make more informed decisions when selecting the methodology for future studies or planning efforts.
This study has several limitations. It primarily analyzes bus usage as a proxy for public transport demand and does not account for multimodal interactions or temporal fluctuations (e.g., hourly or daily patterns). Future research could incorporate subway data or real-time mobility datasets to enhance understanding of multimodal dynamics. Moreover, applying this methodology across multiple cities could validate its generalizability and inform broader urban transport policy. It is also unclear how COVID-19 policies impacted structural variables like land-use complexity and public transportation accessibility in the short term. While the observed changes were assumed to be due to pandemic policies, it is uncertain whether these changes were directly linked to COVID-19 or to other factors such as population shifts or trends in transportation modes. Moreover, whether bus usage had fully recovered from the effects of COVID-19 by 2021 remains uncertain, especially in regions affected by secondary waves. Data from 2022 could have provided more clarity, particularly regarding shifts in transportation modes, which might have been influenced by both pandemic policies and pre-existing trends. Finally, the limited data available for this study may have constrained the conclusions. Expanding data sources in future studies could lead to more robust and generalizable findings.
In summary, this study contributes to the literature by revealing localized shifts in transit demand patterns through advanced spatial modeling. By identifying the most influential factors and their regional variations, the findings support more targeted and equitable transportation planning in the post-pandemic era.

6. Conclusions

This study identifies spatial variations in public transportation usage and explores which factors were most influential in each region and how their spatial patterns changed before and after the pandemic. By systematically identifying the spatial heterogeneity of Seoul city bus demand before and after the pandemic through multiscale geographically weighted regression (MGWR) and geographical random forest (GRF), this study contributes to transportation demand theory, particularly to the understanding of spatial dynamics within urban transit systems. The combination of MGWR and GRF enables a more nuanced view of the complex, region-specific patterns of transit demand, revealing how the influence of key factors shifts over time and across locations.
From a methodological perspective, this study advances the use of spatial modeling by demonstrating the advantages of integrating both traditional regression-based approaches (GWR) and machine learning methods (GRF). The GRF model, by highlighting changes in the importance of specific variables, provides deeper insights into the evolving factors influencing bus usage, while GWR clarifies the regional variations in these relationships. This hybrid approach enhances both theoretical and methodological frameworks for future spatial transport research, especially in the context of post-pandemic mobility analysis.
In practical terms, our findings provide actionable guidance for urban planners and policymakers. The identification of region-specific shifts in transit demand due to the pandemic can aid in tailoring public transport policies to accommodate evolving needs. For example, understanding the dynamics of bus usage in specific districts can lead to more responsive and adaptive public transport strategies, ensuring efficient resource allocation and service provision. Moreover, by focusing on the spatial variability of demand, this research offers a blueprint for designing future transport systems that can better withstand disruptions like pandemics.
For future research, several directions are suggested. First, as part of the metropolitan area, Seoul has extensive public transportation exchanges with surrounding regions (Gyeonggi and Incheon). Future studies should include these areas to provide a more comprehensive view of regional interactions and dynamics. The lack of bus data from Gyeonggi and Incheon limited the analysis. Expanding the study to include the entire metropolitan area could offer more robust insights for public transportation policies during pandemics. Additionally, analyzing bus usage by time of day is essential, as travel purposes vary significantly by time (e.g., morning/evening commutes vs. nighttime travel). Understanding temporal changes in bus usage patterns could further reveal the impact of COVID-19 on daily mobility behavior and help predict changes in transport demand under non-pandemic situations. This research approach could also be instrumental in facilitating more rapid and effective responses to future public transport challenges, both pandemic-related and otherwise.

Author Contributions

Conceptualization, S.H. and B.Y.; Methodology, S.H. and B.Y.; Software, S.H.; Validation, S.H. and B.Y.; Formal analysis, S.H.; Investigation, B.Y.; Writing—original draft, S.H.; Writing—review & editing, B.Y.; Visualization, S.H.; Supervision, B.Y.; Project administration, B.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cervero, R. Alternative approaches to modeling the travel-demand impacts of smart growth. J. Am. Plan. Assoc. 2006, 72, 285–295.
  2. De Vos, J. The effect of COVID-19 and subsequent social distancing on travel behavior. Transp. Res. Interdiscip. Perspect. 2020, 5, 100121.
  3. Kwon, S.; Kim, E. Sustainable health financing for COVID-19 preparedness and response in Asia and the Pacific. Asian Econ. Policy Rev. 2022, 17, 140–156.
  4. Nikiforiadis, A.; Mitropoulos, L.; Kopelias, P.; Basbas, S.; Stamatiadis, N.; Kroustali, S. Exploring mobility pattern changes between before, during and after COVID-19 lockdown periods for young adults. Cities 2022, 125, 103662.
  5. Barbieri, D.M.; Lou, B.; Passavanti, M.; Hui, C.; Hoff, I.; Lessa, D.A.; Sikka, G.; Chang, K.; Gupta, A.; Fang, K.; et al. Impact of COVID-19 pandemic on mobility in ten countries and associated perceived risk for all transport modes. PLoS ONE 2021, 16, e0245886.
  6. Kim, S.; Lee, S.; Ko, E.; Jang, K.; Yeo, J. Changes in car and bus usage amid the COVID-19 pandemic: Relationship with land use and land price. J. Transp. Geogr. 2021, 96, 103168.
  7. Shortall, R.; Mouter, N.; Van Wee, B. COVID-19 passenger transport measures and their impacts. Transp. Rev. 2022, 42, 441–466.
  8. Jenelius, E.; Cebecauer, M. Impacts of COVID-19 on public transport ridership in Sweden: Analysis of ticket validations, sales and passenger counts. Transp. Res. Interdiscip. Perspect. 2020, 8, 100242.
  9. Orro, A.; Novales, M.; Monteagudo, Á.; Pérez-López, J.B.; Bugarín, M.R. Impact on city bus transit services of the COVID–19 lockdown and return to the new normal: The case of A Coruña (Spain). Sustainability 2020, 12, 7206.
  10. Guerra, E.; Cervero, R. Cost of a ride: The effects of densities on fixed-guideway transit ridership and costs. J. Am. Plan. Assoc. 2011, 77, 267–290.
  11. Cardozo, O.D.; García-Palomares, J.C.; Gutiérrez, J. Application of geographically weighted regression to the direct forecasting of transit ridership at station-level. Appl. Geogr. 2012, 34, 548–558.
  12. Shin, H.; Nicolau, J.L.; Kang, J.; Sharma, A.; Lee, H. Travel decision determinants during and after COVID-19: The role of tourist trust, travel constraints, and attitudinal factors. Tour. Manag. 2022, 88, 104428.
  13. Chakraborty, A.; Mishra, S. Land use and transit ridership connections: Implications for state-level planning agencies. Land Use Policy 2013, 30, 458–469.
  14. Chen, E.; Ye, Z.; Wang, C.; Zhang, W. Discovering the spatio-temporal impacts of built environment on metro ridership using smart card data. Cities 2019, 95, 102359.
  15. Jun, M.J.; Choi, K.; Jeong, J.E.; Kwon, K.H.; Kim, H.J. Land use characteristics of subway catchment areas and their influence on subway ridership in Seoul. J. Transp. Geogr. 2015, 48, 30–40.
  16. Sohn, K.; Shim, H. Factors generating boardings at Metro stations in the Seoul metropolitan area. Cities 2010, 27, 358–368.
  17. Ding, C.; Cao, X.; Liu, C. How does the station-area built environment influence Metrorail ridership? Using gradient boosting decision trees to identify non-linear thresholds. J. Transp. Geogr. 2019, 77, 70–78.
  18. Lin, P.; Weng, J.; Brands, D.K.; Qian, H.; Yin, B. Analysing the relationship between weather, built environment, and public transport ridership. IET Intell. Transp. Syst. 2020, 14, 1946–1954.
  19. Pan, H.; Li, J.; Shen, Q.; Shi, C. What determines rail transit passenger volume? Implications for transit oriented development planning. Transp. Res. Part D Transp. Environ. 2017, 57, 52–63.
  20. Vergel-Tovar, C.E.; Rodriguez, D.A. The ridership performance of the built environment for BRT systems: Evidence from Latin America. J. Transp. Geogr. 2018, 73, 172–184.
  21. Chen, E.; Ye, Z.; Wu, H. Nonlinear effects of built environment on intermodal transit trips considering spatial heterogeneity. Transp. Res. Part D Transp. Environ. 2021, 90, 102677.
  22. Gutiérrez, J.; Cardozo, O.D.; García-Palomares, J.C. Transit ridership forecasting at station level: An approach based on distance-decay weighted regression. J. Transp. Geogr. 2011, 19, 1081–1092.
  23. Brough, R.; Freedman, M.; Phillips, D.C. Understanding socioeconomic disparities in travel behavior during the COVID-19 pandemic. J. Reg. Sci. 2021, 61, 753–774.
  24. Hu, S.; Chen, P. Who left riding transit? Examining socioeconomic disparities in the impact of COVID-19 on ridership. Transp. Res. Part D Transp. Environ. 2021, 90, 102654.
  25. Yang, H.; Lu, Y.; Wang, J.; Zheng, Y.; Ruan, Z.; Peng, J. Understanding post-pandemic metro commuting ridership by considering the built environment: A quasi-natural experiment in Wuhan, China. Sustain. Cities Soc. 2023, 96, 104626.
  26. Sharma, D.; Zhong, C.; Wong, H. Lockdown lifted: Measuring spatial resilience from London’s public transport demand recovery. Geo-Spat. Inf. Sci. 2023, 26, 685–702.
  27. Peng, Q.; Bakkar, Y.; Wu, L.; Liu, W.; Kou, R.; Liu, K. Transportation resilience under Covid-19 Uncertainty: A traffic severity analysis. Transp. Res. Part A Policy Pract. 2024, 179, 103947.
  28. Ma, X.; Zhang, J.; Ding, C.; Wang, Y. A geographically and temporally weighted regression model to explore the spatiotemporal influence of built environment on transit ridership. Comput. Environ. Urban Syst. 2018, 70, 113–124.
  29. Qian, X.; Ukkusuri, S.V. Spatial variation of the urban taxi ridership using GPS data. Appl. Geogr. 2015, 59, 31–42.
  30. Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically weighted regression: A method for exploring spatial nonstationarity. Geogr. Anal. 1996, 28, 281–298.
  31. Fotheringham, A.S.; Yang, W.; Kang, W. Multiscale geographically weighted regression (MGWR). Ann. Am. Assoc. Geogr. 2017, 107, 1247–1265.
  32. Andersson, J. Using Geographically Weighted Regression (GWR) to Explore Spatial Variations in the Relationship Between Public Transport Accessibility and Car Use: A Case Study in Lund and Malmö, Sweden. Master’s Thesis, Lund University, Lund, Sweden, 2017; p. 45.
  33. Harris, R.; Singleton, A.; Grose, D.; Brunsdon, C.; Longley, P. Grid-enabling geographically weighted regression: A case study of participation in higher education in England. Trans. GIS 2010, 14, 43–61.
  34. Li, Z.; Fotheringham, A.S.; Li, W.; Oshan, T. Fast Geographically Weighted Regression (FastGWR): A scalable algorithm to investigate spatial process heterogeneity in millions of observations. Int. J. Geogr. Inf. Sci. 2019, 33, 155–175.
  35. Iyanda, A.E.; Osayomi, T. Is there a relationship between economic indicators and road fatalities in Texas? A multiscale geographically weighted regression analysis. GeoJournal 2021, 86, 2787–2807.
  36. Oshan, T.M.; Li, Z.; Kang, W.; Wolf, L.J.; Fotheringham, A.S. mgwr: A Python implementation of multiscale geographically weighted regression for investigating process spatial heterogeneity and scale. ISPRS Int. J. Geo-Inf. 2019, 8, 269.
  37. Li, Z.; Fotheringham, A.S. Computational improvements to multi-scale geographically weighted regression. Int. J. Geogr. Inf. Sci. 2020, 34, 1378–1397.
  38. Tao, T.; Wang, J.; Cao, X. Exploring the non-linear associations between spatial attributes and walking distance to transit. J. Transp. Geogr. 2020, 82, 102560.
  39. Zhang, W.; Zhao, Y.; Cao, X.J.; Lu, D.; Chai, Y. Nonlinear effect of accessibility on car ownership in Beijing: Pedestrian-scale neighborhood planning. Transp. Res. Part D Transp. Environ. 2020, 86, 102445.
  40. Galasso, J.; Cao, D.M.; Hochberg, R. A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data. Chaos Solitons Fractals 2022, 156, 111779.
  41. Zhao, X.; Yan, X.; Yu, A.; Van Hentenryck, P. Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models. Travel Behav. Soc. 2020, 20, 22–35.
  42. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  43. Hagenauer, J.; Helbich, M. A comparative study of machine learning classifiers for modeling travel mode choice. Expert Syst. Appl. 2017, 78, 273–282.
  44. Georganos, S.; Grippa, T.; Niang Gadiaga, A.; Linard, C.; Lennert, M.; Vanhuysse, S.; Mboga, N.; Wolff, E.; Kalogirou, S. Geographical random forests: A spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling. Geocarto Int. 2021, 36, 121–136.
  45. Grekousis, G.; Feng, Z.; Marakakis, I.; Lu, Y.; Wang, R. Ranking the importance of demographic, socioeconomic, and underlying health factors on US COVID-19 deaths: A geographical random forest approach. Health Place 2022, 74, 102744.
  46. Oshan, T.M.; Smith, J.P.; Fotheringham, A.S. Targeting the spatial context of obesity determinants via multiscale geographically weighted regression. Int. J. Health Geogr. 2020, 19, 11.
  47. Liu, Y.; Wang, Y.; Zhang, J. New machine learning algorithm: Random forest. In Information Computing and Applications: Third International Conference, ICICA 2012, Chengde, China, September 14–16, 2012; Proceedings 3; Springer: Berlin/Heidelberg, Germany, 2012; pp. 246–252.
  48. Yesilkanat, C.M. Spatio-temporal estimation of the daily cases of COVID-19 in worldwide using random forest machine learning algorithm. Chaos Solitons Fractals 2020, 140, 110210.
  49. Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227.
  50. Janitza, S.; Hornung, R. On the overestimation of random forest’s out-of-bag error. PLoS ONE 2018, 13, e0201904.
  51. Zhao, Y.; Chen, F.; Zhai, R.; Lin, X.; Wang, Z.; Su, L.; Christiani, D.C. Correction for population stratification in random forest analysis. Int. J. Epidemiol. 2012, 41, 1798–1806.
Figure 1. (a) Bus stop locations and (b) percentage decline in daily bus usage between 2019 and 2021. The red color in (a) indicates Seoul, Republic of Korea.
Figure 2. Schematic diagram: (a) study preparation, (b) factor determination, (c) method application, and (d) results.
Figure 3. Multiscale geographically weighted regression coefficient maps: population number and over-65 population ratio.
Figure 4. Multiscale geographically weighted regression coefficient maps: number of students and land-use complexity (the box in the upper-left map marks the Guro-gu–Yangcheon-gu boundary).
Figure 5. Residual and local R2 map of the multiscale geographically weighted regression (MGWR) model.
Figure 6. Geographical random forest (GRF) importance maps.
Figure 7. Importance ratio of the research areas in 2019 and 2021. Note: Variables representing less than 3% were omitted from the chart.
Figure 8. Residual map of the geographical random forest (GRF) model.
Figure 9. Results of the grid-based regional analysis.
Table 1. Definition of the variables.
| Variable | Description |
| --- | --- |
| Demographic variables | |
| Population number | Population number in each grid |
| Youth population ratio | Percentage of the population under 14 years old in each grid |
| Production population ratio | Percentage of population over 15 and under 64 years old in each grid |
| Over-65 population ratio | Percentage of population over 65 years old in each grid |
| Sex ratio | Number of men per 100 women in each grid |
| Number of students | Number of middle and high school students in each grid |
| Number of employees | Number of employees in each grid |
| Characteristics of land-use variables | |
| Residential area ratio | Percentage of the residential area of each grid |
| Commercial area ratio | Percentage of the commercial area of each grid |
| Official land price | Average of the official land price in each grid |
| Land-use complexity | Building use in each grid |
| Connectivity variable | |
| Distance from the subway station | Average distance from the bus stop to the nearest subway station in each grid |
Table 2. Descriptive statistics of the variables.
Year20192021
VariableMeanSdMaxMeanSdMax
Bus usage94494811,9377176917578
Demographic variables
Number of populations5676345118,9475566332318,560
Youth population ratio9.044.0031.668.413.9932.54
Production population ratio73.315.9110072.406.01100
Over-65 population ratio17.305.455018.976.0066.67
Sex ratio97.5821.3250096.4819.12457
Number of students279224.91609264218.21718
Number of employees---2998453841,980
Characteristics of land-use variables
Residential area ratio---39.0922.8894.57
Commercial area ratio---13.3214.9691.36
Official land price3886362345,4794584415048,256
Land-use complexity---8.7625.2318
Connectivity variable
Distance from the subway station---6725375239
Table 3. Geographically weighted regression results.
| Variable | 2019 Coefficient | 2021 Coefficient |
| --- | --- | --- |
| Intercept | 993.572 | 730.570 |
| Demographics | | |
| Population number | −39.030 | 0.898 |
| Youth population ratio | −146.099 | −51.810 |
| Production population ratio | 40.812 | 98.302 |
| Over-65 population ratio | −11.185 | 58.193 |
| Sex ratio | −76.325 | −43.922 |
| Number of students | 3.013 | −26.251 |
| Number of employees | 194.803 | 123.723 |
| Characteristics of land use | | |
| Residential area ratio | −18.234 | −3.205 |
| Commercial area ratio | 49.348 | 48.820 |
| Official land price | 388.348 | 229.477 |
| Land-use complexity | 454.075 | 372.717 |
| Connectivity | | |
| Distance from the subway station | −84.220 | −49.024 |
| Bandwidth | 281 | 381 |
| Root mean squared error | 743 | 562 |
| R-squared | 0.385 | 0.340 |
| Akaike information criterion | 26,339 | 25,437 |
Table 4. Multiscale geographically weighted regression results.
| Variable | 2019 Coefficient | 2019 Bandwidth | 2021 Coefficient | 2021 Bandwidth |
|---|---|---|---|---|
| Intercept | 993.6 | 47 | 730.6 | 48 |
| Demographics | | | | |
| Number of populations | −75.68 | 1618 | −32.62 | 731 |
| Youth population ratio | −89.57 | 1618 | −59.74 | 1624 |
| Production population ratio | 78.81 | 1618 | 44.65 | 1624 |
| Over-65 population ratio | 31.30 | 1618 | 0.07 | 1624 |
| Sex ratio | −40.44 | 1557 | −16.50 | 1624 |
| Number of students | 32.57 | 1561 | −19.20 | 1624 |
| Number of employees | 209.8 | 231 | 100.3 | 28 |
| Characteristics of land use | | | | |
| Residential area ratio | −44.43 | 1618 | −29.27 | 1620 |
| Commercial area ratio | 93.45 | 326 | 55.61 | 304 |
| Official land price | 446.2 | 280 | 319.9 | 1624 |
| Land-use complexity | 116.58 | 1319 | 503.9 | 452 |
| Connectivity | | | | |
| Distance from the subway station | −139.7 | 1614 | −89.33 | 1624 |
| Bandwidth | | 47 | | 48 |
| Root mean squared error | 728 | | 533 | |
| R-squared | 0.409 | | 0.404 | |
| Akaike information criterion | 26,209 | | 25,292 | |
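The fit statistics at the foot of Tables 3 and 4 can be compared side by side. The following sketch only does arithmetic on the reported values; the dictionary layout and the helper `fit_gain` are illustrative assumptions, not part of the study's workflow. A lower Akaike information criterion and a lower root mean squared error indicate a better fit, as does a higher R-squared.

```python
# Fit statistics copied from Table 3 (GWR) and Table 4 (MGWR).
gwr = {
    2019: {"aic": 26339, "rmse": 743, "r2": 0.385},
    2021: {"aic": 25437, "rmse": 562, "r2": 0.340},
}
mgwr = {
    2019: {"aic": 26209, "rmse": 728, "r2": 0.409},
    2021: {"aic": 25292, "rmse": 533, "r2": 0.404},
}


def fit_gain(base: dict, alt: dict) -> dict:
    """Improvement of `alt` over `base`; positive values favor `alt`."""
    return {
        "aic_drop": base["aic"] - alt["aic"],
        "rmse_drop": base["rmse"] - alt["rmse"],
        "r2_gain": round(alt["r2"] - base["r2"], 3),
    }


gains = {year: fit_gain(gwr[year], mgwr[year]) for year in (2019, 2021)}

for year, gain in gains.items():
    print(year, gain)
```

In both years every difference is positive, so the multiscale model fits better than the single-bandwidth model on all three reported measures, with the larger gain in 2021.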
Table 5. Results of the random forest model.
| 2019 Variable | 2019 Importance | 2021 Variable | 2021 Importance |
|---|---|---|---|
| Official land price | 100 | Number of employees | 100 |
| Number of employees | 95.1 | Official land price | 91.4 |
| Commercial area ratio | 75.5 | Commercial area ratio | 76.5 |
| Distance from the subway station | 71.2 | Distance from the subway station | 66.6 |
| Youth population ratio | 53.1 | Youth population ratio | 54.7 |
| Production population ratio | 50.1 | Production population ratio | 51.1 |
| Sex ratio | 47.0 | Number of students | 44.7 |
| Number of students | 46.7 | Sex ratio | 44.2 |
| Population number | 44.1 | Residential area ratio | 44.0 |
| Residential area ratio | 43.7 | Population number | 43.0 |
| Over-65 population ratio | 40.2 | Over-65 population ratio | 38.9 |
| Land-use complexity | 30.6 | Land-use complexity | 29.8 |
| Number of trees | 926 | | 860 |
| Root mean squared error | 375 | | 270 |
| Akaike information criterion | 19,219 | | 18,217 |
| R-squared | 0.843 | | 0.848 |
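Table 5 reports importance on a scale where the top-ranked variable equals 100. One common way to obtain such a scale is to divide each raw importance score by the largest one. The sketch below illustrates that convention; it is an assumption about how the scores were normalized, not a confirmed description of the authors' procedure, and the raw input values are invented purely for demonstration.

```python
def scale_importance(raw: dict[str, float]) -> dict[str, float]:
    """Rescale raw importance scores so the largest equals 100.

    Returns the variables ordered from most to least important.
    """
    if not raw:
        raise ValueError("raw must not be empty")
    top = max(raw.values())
    if top <= 0:
        raise ValueError("largest importance must be positive")
    ordered = sorted(raw.items(), key=lambda item: item[1], reverse=True)
    return {name: round(100.0 * value / top, 1) for name, value in ordered}


# Hypothetical raw scores, not taken from the study.
raw_scores = {
    "Land-use complexity": 0.06,
    "Official land price": 0.20,
    "Number of employees": 0.19,
}

scaled = scale_importance(raw_scores)
```

Because the scale is relative to each year's top variable, the 2019 and 2021 columns are best compared by rank and by spacing between variables rather than by absolute value.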
Table 6. Results of the geographical random forest (GRF) model.
| 2019 Variable | 2019 Importance | 2021 Variable | 2021 Importance |
|---|---|---|---|
| Official land price | 100 | Number of employees | 100 |
| Number of employees | 88.8 | Official land price | 90.4 |
| Commercial area ratio | 69.6 | Commercial area ratio | 81.1 |
| Distance from the subway station | 61.5 | Distance from the subway station | 71.5 |
| Production population ratio | 56.8 | Production population ratio | 68.8 |
| Youth population ratio | 46.8 | Over-65 population ratio | 60.3 |
| Population number | 45.8 | Youth population ratio | 54.3 |
| Number of students | 45.5 | Sex ratio | 54.0 |
| Sex ratio | 44.7 | Residential area ratio | 51.3 |
| Residential area ratio | 43.5 | Number of students | 49.8 |
| Over-65 population ratio | 41.6 | Population number | 49.1 |
| Land-use complexity | 32.6 | Land-use complexity | 41.5 |
| Number of trees | 926 | | 860 |
| Bandwidth | 210 | | 60 |
| Root mean squared error | 91 | | 65 |
| Akaike information criterion | 14,622 | | 13,575 |
| R-squared | 0.991 | | 0.991 |
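The change in variable ordering between the two years of Table 6 can be summarized as a rank shift per variable. The sketch below uses the orderings exactly as listed in Table 6; the helper `rank_shift` is introduced here for illustration only.

```python
# Variable orderings copied from Table 6 (most to least important).
grf_2019 = [
    "Official land price", "Number of employees", "Commercial area ratio",
    "Distance from the subway station", "Production population ratio",
    "Youth population ratio", "Population number", "Number of students",
    "Sex ratio", "Residential area ratio", "Over-65 population ratio",
    "Land-use complexity",
]
grf_2021 = [
    "Number of employees", "Official land price", "Commercial area ratio",
    "Distance from the subway station", "Production population ratio",
    "Over-65 population ratio", "Youth population ratio", "Sex ratio",
    "Residential area ratio", "Number of students", "Population number",
    "Land-use complexity",
]


def rank_shift(before: list[str], after: list[str]) -> dict[str, int]:
    """Rank change per variable; positive means it moved up the ranking."""
    rank_after = {name: rank for rank, name in enumerate(after, start=1)}
    return {
        name: rank - rank_after[name]
        for rank, name in enumerate(before, start=1)
    }


shifts = rank_shift(grf_2019, grf_2021)
largest_rise = max(shifts, key=shifts.get)
largest_fall = min(shifts, key=shifts.get)
```

On the Table 6 orderings, the over-65 population ratio rises the most (from 11th to 6th) and population number falls the most (from 7th to 11th), while the top five variables change only in the swap between official land price and number of employees.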
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hong, S.; Yang, B. Analyzing the Impact of Land-Use Characteristics and Demographic Factors on Spatial Variations in Public Bus Usage: A Comparison of Pre- and During COVID-19 Periods. Land 2025, 14, 1102. https://doi.org/10.3390/land14051102

