Research on the Analysis of Influential Factors of Short-Period Passenger Flow of Urban Rail Transit Based on Spatio-Temporal Heterogeneity

Qian, Minlei; Cheng, Lin; Sun, Jianan

doi:10.3390/systems13110985

by Minlei Qian ^1,2,*, Lin Cheng ^1,2 and Jianan Sun ^3

^1 School of Transportation, Southeast University, Nanjing 211189, China

^2 Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing 211189, China

^3 Faculty of Maritime and Transportation, Ningbo University, Fenghua Road 818#, Ningbo 315211, China

^* Author to whom correspondence should be addressed.

Systems 2025, 13(11), 985; https://doi.org/10.3390/systems13110985

Submission received: 15 September 2025 / Revised: 16 October 2025 / Accepted: 30 October 2025 / Published: 4 November 2025

(This article belongs to the Special Issue AI-Driven Transportation Systems: Innovations, Challenges, and Future Mobility)

Abstract

Urban rail transit (URT), an important part of the modern urban transportation system, carries a large share of daily commuter travel. The analysis of the factors that influence URT passenger flow has therefore become an important issue in transportation management and optimization. This paper selects 13 POI (Point of Interest) types and the surrounding population data as independent variables, and constructs a multi-scale geographically and temporally weighted regression (MGTWR) model with the daily morning peak inbound passenger flow of each URT station as the dependent variable. The results show that the business and residential variables have the most significant positive effect on URT morning peak inbound passenger flow, meaning that an increase in these variables promotes morning peak inbound flow; in contrast, the scenic spot variables have a negative effect, indicating that an increase in these variables inhibits morning peak inbound flow. In addition, the corporate variables have a negative effect on morning peak inbound passenger flow, and their influence is non-stationary, i.e., its degree fluctuates greatly across spatial and temporal scales. To further understand these influence mechanisms, this paper conducts an in-depth analysis of the spatio-temporal characteristics of the above three types of variables, revealing how their spatio-temporal heterogeneity affects URT passenger flow.

Keywords: urban rail transit; POI; MGTWR; analysis of spatio-temporal heterogeneity

1. Introduction

With the rapid growth of the urban population leading to an increase in urban transportation demand, the urban rail transit (URT) system has developed into a key solution to improve urban mobility and accessibility due to its large capacity, high reliability and efficiency.

With the widespread adoption of the Automatic Fare Collection System (AFC) [1], easy access to historical travel data has become possible. Meanwhile, the rapid development of big data and deep learning technologies has provided theoretical frameworks that effectively support the analysis of URT influencing factors.

In recent years, the URT industry has increasingly focused on analyzing the spatiotemporal heterogeneity of passenger flow in built environments, considering factors such as population distribution and land use characteristics. However, existing studies lack detailed categorization of influencing factors and employ regression models (e.g., Geographically and Temporally Weighted Regression, GTWR) inadequately to examine their spatiotemporal impacts on URT passenger flow. This highlights the urgent need to refine factor classifications and select appropriate models to quantify these elements’ effects on spatiotemporal heterogeneity in URT passenger flow.

In view of the existing problems, this paper uses the POI data and population distribution data of rail transit, and analyzes the influencing factors of three subway lines in Hangzhou with GTWR and MGTWR as research models, respectively. The fitting accuracy and other indicators of the two models are compared, and the interpretation analysis is carried out in time and space dimensions according to the regression coefficients.

The main contributions of this paper are as follows:

This study proposes a Multi-Scale Geographically and Temporally Weighted Regression (MGTWR) model to analyze the spatio-temporal heterogeneity of influencing factors on Urban Rail Transit (URT) morning peak passenger flow. By integrating 13 types of Point of Interest (POI) data and population distribution around stations, the model effectively captures the non-stationary and multi-scale characteristics of passenger flow influences, providing a more refined and accurate analytical framework compared to traditional GTWR models.
Through rigorous multicollinearity and spatial autocorrelation tests, this study identifies and addresses the strong collinearity among four key variables (catering services, life services, commercial housing, and government agencies) using ridge regression. This preprocessing step enhances the interpretability and stability of the MGTWR model, ensuring more reliable estimation of variable impacts across different spatio-temporal scales.
The study reveals the complex dual effects and spatio-temporal heterogeneity of various built environment factors on URT passenger flow. Specifically, business and residential variables show the most significant positive impact, while scenic spots exhibit a strong inhibitory effect. Corporate variables demonstrate non-stationary behavior, positively influencing passenger flow on weekdays but negatively influencing passenger flow on weekends. These findings provide valuable insights for urban planning and URT operation optimization, especially during peak hours.

The subsequent parts of the paper are organized as follows: Section 2 provides a comprehensive literature review on factors influencing URT passenger flow and the evolution of regression models from GWR to GTWR and MGTWR. Section 3 outlines the data sources and variable selection process, including POI categorization and spatial correlation analysis. Section 4 details the methodology, including the GTWR and MGTWR models, variable testing, and evaluation metrics. Section 5 presents a case study using Hangzhou URT data, including variable tests, model fitting, and spatio-temporal analysis of influencing factors. Finally, Section 6 summarizes the research findings, discusses implications for URT management, and suggests directions for future research.

2. Literature Review

China’s urban rail transit system is currently in a critical phase of rapid development, making research on passenger flow characteristics and influencing factors particularly crucial. In recent years, advancements in computer technology and information systems have driven extensive studies by scholars worldwide on factors affecting passenger flow in URT (Urban Rail Transit) systems. These influencing factors can generally be categorized into internal and external elements. Internal factors relate to train operation scheduling, including departure intervals and fare structures. External factors primarily involve the characteristics of URT station areas. Wang et al. [2] developed an evaluation method that analyzes how fixed-fare policies transition to distance-based pricing through smart card data analysis. This approach enables examination of fare adjustments’ impacts on demand, travel distance, and price elasticity across different time periods and distance intervals. Chen et al. [3] conducted comprehensive analyses using a control method for three Chinese cities—Hangzhou, Ningbo, and Xiamen—that implemented free public transportation policies, evaluating the effects of three distinct free transit strategies on daily subway passenger flows.

Regarding external influencing factors, most studies focus on the impact of the built environment on urban rail passenger flow. Li Guoqiang et al. [4] analyzed passenger flow influencing factors at subway stations by mining AFC and urban POI data. Based on analyzing station passenger flow characteristics, they comprehensively considered the quantity and proportion of various types of POIs within service areas, categorizing all stations into four typical types and establishing a station influencing factor set. Pan et al. [5] utilized IC card data from subway passenger volume and cellular signaling data on human activity spatial distribution in Shanghai. Their explanatory variables included station area employment and population distribution, commuting distance, subway network accessibility, transfer station status, and integration with commercial activity centers. Through regression analysis using station passenger flow as the dependent variable, they identified variations in passenger flow between rail transit stations. Zhu et al. [6] developed a Bayesian negative binomial regression model to identify key influencing factors related to daily peak and off-peak passenger flows. Based on this model, they established a geographically weighted model to analyze the spatial dependencies of these influences across different periods. Using Beijing station passenger flow data, they explored the spatiotemporal relationship between station passenger flow and the built environment.

With the introduction of the Geographical Weighted Regression (GWR) model, Calvo et al. [7] adopted this framework to analyze spatial heterogeneity relationships between daily subway trips and socioeconomic variables, including population, land use, accessibility, and transportation systems. Cardozo et al. [8] conducted spatial analyses of Madrid’s metro passenger flow using both GWR and ordinary least squares regression models, demonstrating that GWR models exhibit superior analytical performance. Chen et al. [9] utilized Shenzhen’s Metro AFC data, incorporating the Minkowski Distance (MD) metric to measure geographical distance and refine weighted matrix calibration. They employed exponential distance decay weighting functions to quantify spatial correlations among independent variables, thereby constructing a GWR model to address spatial autocorrelation and non-stationary issues in station-level metro passenger flow. Gao et al. [10] proposed a Network Distance-Based Geographical Weighted Regression (ND-GWR) model, calculating spatial impacts of demographic, land use, network, transfer, and station characteristics on metro passenger flow. The study compared these effects across weekdays, weekends, and holidays.

With continuous advancements in spatiotemporal data analysis methods, research models have gradually transitioned from traditional geographically weighted regression (GWR) to geographically and temporally weighted regression (GTWR). Unlike GWR models, which consider only spatial heterogeneity, GTWR introduces a temporal dimension. By simultaneously accounting for both temporal and spatial weighting effects, GTWR models can more accurately capture the dynamic characteristics of traffic flow and other socioeconomic variables over time. Wang et al. [11] developed a GTWR model to explore how changes in land use, transfer connection facilities, and station attributes affect the spatiotemporal heterogeneity of passenger flow at subway stations. Their results indicate that GTWR outperforms ordinary least squares (OLS) and GWR models. Ma Junze et al. [12] focused on the urban built environment around Shenzhen Metro station entrances and exits, employing the Thiessen polygon method to construct a transit travel recognition model. They analyzed variations in station entry and exit passenger flow during weekday morning and evening rush hours and applied a GTWR model to investigate the spatial impact of built environment characteristics on transit demand. Shao et al. [13] developed a gradient boosting decision tree model to calculate the relative importance of land use variables and their thresholds and moderating effects on passenger flow. Xu Xinyue et al. [14] investigated the impacts of four built environment characteristics (demographic-economic, station, external transportation, and land use) on passenger flow at transportation hubs. By integrating GTWR with Random Forest (RF), they developed a combined model called GTWR RF, which aims to reveal the spatiotemporal heterogeneity and nonlinear effects of built environment features on passenger flow patterns around transportation hubs.

In studies analyzing influencing factors around URT stations, GTWR effectively reveals the temporal and spatial variability of environmental factors across different stations by incorporating spatial and temporal weights. The GTWR model has become a crucial tool for analyzing surrounding factors because of its capability to handle spatial heterogeneity and temporal dynamics. However, despite its notable achievements in various applications, the model still shows limitations when addressing complex spatiotemporal interactions, particularly under multi-scale and multi-level analysis requirements. In contrast, the Multi-Scale Geographically and Temporally Weighted Regression (MGTWR) model introduces greater flexibility by simultaneously processing spatial heterogeneity and temporal dynamics at multiple scales, significantly enhancing its capacity to capture intricate spatiotemporal relationships. By optimizing regression coefficients across multiple scales, MGTWR provides more detailed spatiotemporal patterns and regional variations. This makes it particularly effective in reflecting the unique characteristics of different stations when analyzing multi-layered and multi-dimensional influencing factors. Furthermore, the infrastructure design of railway lines and stations constitutes another crucial dimension. The configuration of tracks (single, double, or multiple tracks) and station types (e.g., junctions, terminals) fundamentally determines capacity and operational performance, forming the physical backbone upon which passenger flow characteristics emerge [15]. For URT passenger flow prediction, recent studies have adopted deep learning approaches. Sun et al. [16] proposed a Bi-graph Graph Convolutional Spatio-Temporal Feature Fusion Network (BGCSTFFN) that integrates POI data and station topology through graph convolutional networks and a Transformer, demonstrating superior performance in multi-step prediction tasks.

This study focuses on analyzing the factors influencing urban rail transit passenger flow. Using 13 types of Points of Interest (POIs) and the population distribution around stations as independent variables, we construct a Multi-Scale Geographically and Temporally Weighted Regression (MGTWR) model with morning rush hour station entry passenger flow as the dependent variable. The model is used to investigate how station characteristics and surrounding environments affect the spatiotemporal heterogeneity of passenger flow across different spatial scales and temporal dimensions. To evaluate the model’s performance, this paper systematically compares MGTWR with the traditional Geographically and Temporally Weighted Regression (GTWR) model.

3. Problem Statement

3.1. Data Source of URT Station Card Swipe

The data used in this study originates from the Tianchi Big Data Competition, specifically including card-swiping records from three subway lines and 81 URT stations in Hangzhou, spanning from 1 January to 25 January 2019 [17]. Each record contains passenger swipe details, documenting boarding and alighting events that provide detailed insights into rail transit system mobility. These card-swiping datasets encompass critical information such as boarding/alighting times, station locations, and swipe methods. Selected AFC data can be referenced in Table 1. Before data analysis, we cleaned and processed the raw AFC data to ensure data quality:

Data Cleaning: Identified and removed invalid values and inconsistent records, including handling duplicate entries, outliers, and missing values.
Data Transformation: This study primarily focuses on passenger flow during the URT morning peak period (6:30–9:30). The morning peak passenger flow sequences were therefore extracted from 25 CSV files containing historical AFC records from 1 January to 25 January 2019 and aggregated at 5 min and 15 min intervals, yielding two datasets. $Y_{i}^{t}$ denotes the passenger volume entering station $i$ during the $t$-th time period, and $Y^{t} \in R^{N \times 1}$ denotes the vector of inbound passenger volumes of all URT stations during the $t$-th time period, where $N$ is the total number of stations on the URT lines. Finally, the URT passenger flow data are normalized before being input into the model.
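The transformation step can be illustrated with a minimal sketch. It assumes that each AFC record is a tuple of (timestamp string, station ID, status), with status 1 marking an entry; this record layout, the function names, and the choice of min-max normalization are assumptions made for illustration, since the text does not specify them.

```python
from collections import Counter
from datetime import datetime, time

PEAK_START, PEAK_END = time(6, 30), time(9, 30)  # morning peak window from the text


def peak_inbound_counts(records, interval_min=15):
    """Count entries per (station, date, interval index) inside the morning peak.

    `records` is an iterable of (timestamp_str, station_id, status) tuples,
    where status == 1 is assumed to mean "entry".
    """
    counts = Counter()
    start_minute = PEAK_START.hour * 60 + PEAK_START.minute
    for ts, station, status in records:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        if status != 1 or not (PEAK_START <= t.time() < PEAK_END):
            continue  # skip exits and swipes outside the peak window
        slot = (t.hour * 60 + t.minute - start_minute) // interval_min
        counts[(station, t.date(), slot)] += 1
    return counts


def min_max(values):
    """Scale a list of values to [0, 1] (one possible normalization)."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]
```

Calling `peak_inbound_counts(records, interval_min=5)` would produce the 5 min dataset in the same way.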

3.2. Variable Selection

Based on the daily passenger flow correlation rankings of station POIs in Table 2 and the proximity rankings of selected stations to Station 7’s POI counts in Table 3, we selected 13 POI categories with non-zero values and similar daily passenger flow correlation rankings. Drawing from prior research, we added the surrounding population size as the 14th variable. These 14 variables are: Food & Beverage Services, Shopping Services, Daily Living Services, Sports & Leisure Services, Healthcare Services, Lodging Services, Scenic Spots, Commercial Residential Areas, Government Agencies & Social Organizations, Science, Education & Cultural Services, Transportation Facilities Services, Financial & Insurance Services, Corporate Enterprises, and Station Surrounding Population. These variables were selected by synthesizing existing research findings to ensure comprehensive coverage of station-related factors.

POI relativity refers to the overall similarity in the distribution of all POI types surrounding different stations. Daily passenger flow correlation refers to the degree of consistency in daily passenger flow fluctuation trends between stations. Table 2 presents the results for a group of stations (3, 5, 6, 15, 43, 68) that exhibit high POI correlation with Station 7. Building on this, Table 3 further selects the top three ranked stations (3, 5, 6).

3.3. Variable Testing

3.3.1. Multicollinearity Test of Variables

Multicollinearity testing is the foundation for ensuring the validity and accuracy of model estimates in spatiotemporal regression analysis. It aims to identify whether the explanatory variables are highly correlated with one another. If strong collinearity exists among these variables, the interpretation of the regression coefficients may become distorted, meaning that their actual impact cannot be accurately reflected [18]. The estimated coefficients may also deviate substantially from their true values and become unstable, which compromises the model’s reliability and weakens its interpretative power and predictive precision. Therefore, diagnostic tools such as the variance inflation factor (VIF) are used to detect collinearity among the independent variables [19].

The variance inflation factor (VIF) is calculated as follows:

$$VIF_{i} = \frac{1}{1 - R_{i}^{2}} \tag{1}$$

where $R_{i}^{2}$ is the coefficient of determination obtained by regressing the explanatory variable $X_{i}$, treated as the dependent variable, on all other explanatory variables. It measures the proportion of the variance of $X_{i}$ that is explained by the other independent variables.
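As an illustration of Equation (1), the following minimal sketch computes the VIF of each column by regressing it on the remaining columns with an intercept. The function name `vif` and the synthetic data are assumptions for demonstration only and are not the POI data used in this study.

```python
import numpy as np


def vif(X):
    """Return the VIF of every column of X (shape: n_samples x n_variables)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        target = X[:, i]
        # Regress column i on all other columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)
    return out


# Synthetic demo data: x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.01 * rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
```

In this synthetic example the nearly duplicated columns receive very large VIF values, while the independent column stays close to 1, which mirrors the common rule of thumb that a VIF above 10 signals strong multicollinearity.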

3.3.2. Autocorrelation Test of Variables

Global Moran’s Index (GMI), a key metric for assessing the spatial clustering or dispersion of variables across a study area, serves as a quantitative tool to evaluate spatial autocorrelation. By calculating this index for specific variables, researchers can determine whether significant spatial correlations exist within the region [20]. The GMI calculation process is as follows:

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_{i} - \bar{x})(x_{j} - \bar{x})}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \sum_{i=1}^{n} (x_{i} - \bar{x})^{2}} \tag{2}$$

where $n$ is the number of spatial units; $w_{ij}$ is the weight between location $i$ and location $j$; $x_{i}$ and $x_{j}$ denote the attribute values at locations $i$ and $j$, respectively; and $\bar{x}$ is the mean of all observed values.

In the spatial autocorrelation analysis, the spatial weight $w_{ij}$ measures the spatial relationship between the attribute $x_{i}$ at location $i$ and the attribute $x_{j}$ at location $j$. When the distance between the two locations is relatively short, $w_{ij}$ is set to 1, indicating a significant spatial correlation between them; when the distance is relatively long, $w_{ij}$ is set to 0, meaning that the spatial relationship is weak or does not exist.

In addition to the global Moran’s index (Moran’s I), spatial autocorrelation can be tested with the standardized statistic $Z$, calculated as shown in Equation (3):

$$Z(I) = \frac{I - E(I)}{\sqrt{Var(I)}} \tag{3}$$

where $E(I)$ and $Var(I)$ are the expected value and variance of Moran’s I, respectively.

The $Z$ statistic standardizes Moran’s I by its expected value and standard deviation, which provides a formal statistical test of the degree of spatial autocorrelation.
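Equations (2) and (3) can be sketched as follows, assuming a binary distance-threshold weight matrix as described above. The threshold value, the function names, and the use of the variance of Moran’s I under the normality assumption are choices made for this illustration; the text does not state which variance formula was used.

```python
import numpy as np


def binary_weights(coords, threshold):
    """w_ij = 1 if the distance between i and j is within the threshold, else 0."""
    coords = np.asarray(coords, dtype=float)
    dist = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # a location is not its own neighbour
    return w


def morans_i(x, w):
    """Global Moran's I, Equation (2)."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return x.size * (dev @ w @ dev) / (w.sum() * (dev @ dev))


def morans_z(x, w):
    """Standardized statistic Z(I), Equation (3), under the normality assumption."""
    n = len(x)
    e_i = -1.0 / (n - 1)
    s0 = w.sum()
    s1 = 0.5 * ((w + w.T) ** 2).sum()
    s2 = ((w.sum(axis=0) + w.sum(axis=1)) ** 2).sum()
    var_i = (n**2 * s1 - n * s2 + 3 * s0**2) / ((n**2 - 1) * s0**2) - e_i**2
    return (morans_i(x, w) - e_i) / np.sqrt(var_i)
```

For six locations on a line with unit spacing and a threshold of 1, clustered values such as `[1, 1, 1, 5, 5, 5]` give a positive index, while alternating values such as `[1, 5, 1, 5, 1, 5]` give a negative one.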

4. Methodology

The multi-scale geographically and temporally weighted regression (MGTWR) model employed in this paper traces its theoretical evolution back to geographically weighted regression (GWR). As the starting point of this family of models, GWR addresses the “spatial non-stationarity” of relationships between variables. By incorporating a temporal dimension, GWR was then extended to geographically and temporally weighted regression (GTWR), which captures spatial and temporal heterogeneity simultaneously.

4.1. GTWR Model

Before introducing the MGTWR model, it is useful to explain its theoretical foundation, the GTWR model. Building upon the GWR model’s treatment of local spatial heterogeneity, the GTWR model incorporates a temporal dimension [21]. By using sample points that are adjacent in both time and space to calibrate the regression coefficients at each local observation point, it captures spatiotemporal non-stationarity. The model is given in Equation (4):

$$Y_{i} = \beta_{0}(u_{i}, v_{i}, t_{i}) + \sum_{k=1}^{m} \beta_{k}(u_{i}, v_{i}, t_{i}) x_{ik} + \varepsilon_{i} \tag{4}$$

where $(u_{i}, v_{i}, t_{i})$ are the spatial and temporal coordinates of sample point $i$; $m$ is the number of explanatory variables; $x_{ik}$ is the value of the $k$-th explanatory variable at sample point $i$; $\varepsilon_{i}$ is the random error term; and $\beta_{k}(u_{i}, v_{i}, t_{i})$ is the local regression coefficient of the $k$-th variable at sample point $i$, with $\beta_{0}(u_{i}, v_{i}, t_{i})$ as the local intercept.

To simplify the notation, the spatio-temporal coordinates $(u_{i}, v_{i}, t_{i})$ are usually omitted. Writing $\beta_{ik}$ for $\beta_{k}(u_{i}, v_{i}, t_{i})$ gives the more concise expression in Equation (5):

$$Y_{i} = \beta_{i0} + \sum_{k=1}^{m} \beta_{ik} x_{ik} + \varepsilon_{i} \tag{5}$$

Expanding over all $n$ sample points gives Equation (6):

$$\begin{cases} y_{1} = \beta_{10} + \beta_{11} x_{11} + \beta_{12} x_{12} + \dots + \beta_{1m} x_{1m} + \varepsilon_{1} \\ y_{2} = \beta_{20} + \beta_{21} x_{21} + \beta_{22} x_{22} + \dots + \beta_{2m} x_{2m} + \varepsilon_{2} \\ \vdots \\ y_{n} = \beta_{n0} + \beta_{n1} x_{n1} + \beta_{n2} x_{n2} + \dots + \beta_{nm} x_{nm} + \varepsilon_{n} \end{cases} \tag{6}$$

The matrix expression is

$$Y = (X \otimes \beta) I + \varepsilon$$

$$Y = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_{11} & x_{12} & \dots & x_{1m} \\ 1 & x_{21} & x_{22} & \dots & x_{2m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \dots & x_{nm} \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_{1} \\ \varepsilon_{2} \\ \vdots \\ \varepsilon_{n} \end{bmatrix} \tag{7}$$

where $\beta$ is the $n \times (m+1)$ matrix of local coefficients $\beta_{ik}$, $\otimes$ denotes the element-wise product, and $I$ is an $(m+1) \times 1$ vector of ones.

The coefficient vector at the $i$-th sample point is estimated by weighted least squares, as shown in Equation (8):

$$\hat{\beta}_{i} = (X^{T} W_{i} X)^{-1} X^{T} W_{i} Y \tag{8}$$

where $X$ is the $n \times (m+1)$ design matrix whose first column consists of ones, representing the intercept term; $W_{i}$ is an $n \times n$ diagonal weight matrix, $W_{i} = diag(W_{i1}, W_{i2}, \dots, W_{ij}, \dots, W_{in})$, whose element $W_{ij}$ is determined by the spatio-temporal distance between sample point $i$ and observation $j$.
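The local weighted least squares estimate in Equation (8) can be sketched as below. The Gaussian kernel with separate spatial and temporal bandwidths (`h_s`, `h_t`) is an assumed weighting scheme chosen for illustration, since the kernel is not specified in this section; the function name and the synthetic data are likewise hypothetical.

```python
import numpy as np


def gtwr_fit(X, y, coords, times, h_s, h_t):
    """Local coefficients from Equation (8); X must already contain the intercept column."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    betas = np.empty_like(X)
    for i in range(X.shape[0]):
        d2 = ((coords - coords[i]) ** 2).sum(axis=1)   # squared spatial distance to point i
        t2 = (times - times[i]) ** 2                   # squared temporal distance to point i
        w = np.exp(-d2 / h_s**2 - t2 / h_t**2)         # diagonal of W_i (assumed Gaussian kernel)
        XtW = X.T * w                                  # X^T W_i without forming the n x n matrix
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)   # (X^T W_i X)^{-1} X^T W_i Y
    return betas


# Synthetic check: with spatially constant coefficients, every local fit recovers them.
rng = np.random.default_rng(1)
n = 30
coords = rng.uniform(size=(n, 2))
times = np.arange(n, dtype=float)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x
betas = gtwr_fit(X, y, coords, times, h_s=0.5, h_t=5.0)
```

Each row of `betas` holds the intercept and slope estimated at one sample point, so spatio-temporal variation in the coefficients shows up as variation across rows.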

4.2. MGTWR Model

The Multi-Scale Geographically and Temporally Weighted Regression (MGTWR) model extends the GTWR model and addresses some of its limitations. The traditional GTWR model assumes that all explanatory variables share the same spatiotemporal bandwidth, so the spatiotemporal effect of every influencing factor on the dependent variable is confined to the same scale. This assumption overlooks the possibility that variables may differ in the intensity and range of their influence across spatial and temporal scales, which can lead to overestimation or underestimation of variable effects. The MGTWR model therefore introduces variable-specific spatiotemporal bandwidths, allowing each explanatory variable to operate at its own spatiotemporal scale and capturing the impact of each factor on the dependent variable more precisely [22]. The model is given in Equation (9):

$$\begin{aligned} Y_{i} &= \beta_{best(0)}(u_{i}, v_{i}, t_{i}) + \sum_{j=1}^{k} \beta_{best(j)}(u_{i}, v_{i}, t_{i}) x_{ij} + \varepsilon_{i} \\ &= \sum_{j=0}^{k} f_{j} + \varepsilon_{i} \end{aligned} \tag{9}$$

where $k$ is the number of explanatory variables; $best(j)$ is the specific spatio-temporal bandwidth of explanatory variable $j$; $\beta_{best(j)}$ is the regression coefficient of explanatory variable $j$ under that bandwidth; and $f_{j} = \beta_{best(j)}(u_{i}, v_{i}, t_{i}) x_{ij}$ is the additive term of explanatory variable $j$, with $x_{i0} = 1$ for the intercept.

The MGTWR model can be regarded as a generalized additive model. During the modeling process, to determine the optimal spatiotemporal bandwidths, the GTWR model is first used to initialize the vector of additive terms. The residual term is defined as

$$\hat{\varepsilon} = y - \sum_{j=0}^{k} f_{j}$$

Unlike the GTWR model, the MGTWR model employs a backfitting algorithm that iteratively updates the bandwidth and additive term of each variable until a predefined convergence criterion is met, at which point the parameter updates stop.

The backfitting steps are as follows [23,24]:

Step 1: Treat the MGTWR model as a generalized additive model:

$$Y = \sum_{j=0}^{k} f_{j} + \varepsilon, \quad f_{j} = \beta_{best(j)} x_{j} \tag{10}$$

Step 2: Initialize $\beta_{best(0)}, \beta_{best(1)} \otimes x_{1}, \beta_{best(2)} \otimes x_{2}, \dots, \beta_{best(k)} \otimes x_{k}$, that is, $f_{0}^{(0)}, f_{1}^{(0)}, f_{2}^{(0)}, \dots, f_{k}^{(0)}$, and calculate the initial residual term $\varepsilon^{(0)} = Y - f_{0}^{(0)} - f_{1}^{(0)} - f_{2}^{(0)} - \dots - f_{k}^{(0)}$.

Step 3: Use the GTWR model to regress the partial residual $\varepsilon^{(0)} + f_{0}^{(0)}$ on the intercept term $x_{0}$, calibrate the optimal spatial and temporal bandwidths, and update to obtain $f_{0}^{(1)} = \beta_{0}^{(1)} x_{0}$ and $\varepsilon_{0}^{(1)} = Y - f_{0}^{(1)} - f_{1}^{(0)} - \dots - f_{k}^{(0)}$.

Step 4: Use the GTWR model to regress the partial residual $\varepsilon_{0}^{(1)} + f_{1}^{(0)}$ on the explanatory variable $x_{1}$, calibrate the optimal spatial and temporal bandwidths, and update to obtain $f_{1}^{(1)} = \beta_{1}^{(1)} x_{1}$ and $\varepsilon_{1}^{(1)} = Y - f_{0}^{(1)} - f_{1}^{(1)} - f_{2}^{(0)} - \dots - f_{k}^{(0)}$.

Step 5: Perform the same regression on each of the remaining explanatory variables in turn to obtain the first round of parameter estimates $\beta_{0}^{(1)}, \beta_{1}^{(1)}, \dots, \beta_{k}^{(1)}$, and repeat Steps 3–5 until the convergence criterion is met.

The convergence criterion adopts the classical score of change (SOC) criterion [25]: iteration terminates when the SOC value falls below a threshold $\delta$, as shown below:

$$SOC_{f}^{(p+1)} = \sqrt{\frac{\sum_{j=1}^{k} \frac{1}{n} \sum_{i=1}^{n} \left(\hat{f}_{ij}^{(p+1)} - \hat{f}_{ij}^{(p)}\right)^{2}}{\sum_{i=1}^{n} \left(\sum_{j=1}^{k} \hat{f}_{ij}^{(p+1)}\right)^{2}}} < \delta \tag{11}$$

Here, $\delta$ is usually taken as a negative power of 10, and $\hat{f}_{ij}^{(p)}$ is the estimate of the additive term of variable $j$ at sample point $i$ after the $p$-th iteration. The residual sum of squares (RSS) is defined in Equation (12).
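The backfitting loop and the SOC stopping rule of Equation (11) can be sketched as follows. This is a simplified illustration under several assumptions rather than the implementation used in this study: the additive terms are initialized with a global least squares fit instead of GTWR, the kernel is an assumed Gaussian form, each variable picks its bandwidth pair from a small candidate list by a single-term AICc, the SOC sums include the intercept term, and the tolerance `tol=1e-5` is a placeholder value.

```python
import numpy as np


def st_kernel(coords, times, h_s, h_t):
    """Gaussian spatio-temporal weights between all pairs of points (assumed kernel form)."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    t2 = (times[:, None] - times[None, :]) ** 2
    return np.exp(-d2 / h_s**2 - t2 / h_t**2)


def local_fit(xj, target, W):
    """Locally weighted regression of `target` on the single column xj."""
    den = W @ (xj * xj)
    beta = (W @ (xj * target)) / den
    trace = np.sum(np.diag(W) * xj * xj / den)  # trace of this term's smoother matrix
    return beta, trace


def aicc(resid, trace):
    """AICc of one smoothing step; returns inf when the correction is undefined."""
    n = resid.size
    if n - 2 - trace <= 0:
        return np.inf
    sigma2 = max(resid @ resid / (n - trace), np.finfo(float).tiny)
    return n * np.log(sigma2) + n * np.log(2 * np.pi) + n * (n + trace) / (n - 2 - trace)


def mgtwr_backfit(X, y, coords, times, candidates, tol=1e-5, max_iter=100):
    """Backfitting with a variable-specific bandwidth pair chosen from `candidates`."""
    n, k = X.shape
    kernels = [st_kernel(coords, times, hs, ht) for hs, ht in candidates]
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)   # simplified initialization
    B = np.tile(beta0, (n, 1))                      # local coefficients, one row per point
    F = X * B                                       # additive terms f_j, one column each
    chosen = [None] * k
    soc = np.inf
    for _ in range(max_iter):
        F_old = F.copy()
        for j in range(k):
            target = y - F.sum(axis=1) + F[:, j]    # partial residual for term j
            best = None
            for cand, W in zip(candidates, kernels):
                beta, trace = local_fit(X[:, j], target, W)
                score = aicc(target - X[:, j] * beta, trace)
                if best is None or score < best[0]:
                    best = (score, beta, cand)
            _, B[:, j], chosen[j] = best
            F[:, j] = X[:, j] * B[:, j]
        # Score of change between consecutive iterations, as in Equation (11).
        soc = np.sqrt(((F - F_old) ** 2).sum() / n / (F.sum(axis=1) ** 2).sum())
        if soc < tol:
            break
    return B, chosen, soc


# Synthetic check with constant coefficients (intercept 2, slope 3).
rng = np.random.default_rng(2)
n = 40
coords = rng.uniform(size=(n, 2))
times = np.arange(n, dtype=float)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x
B, chosen, soc = mgtwr_backfit(X, y, coords, times, [(0.5, 5.0), (1.0, 10.0), (5.0, 50.0)])
```

The list `chosen` holds one (spatial, temporal) bandwidth pair per column of `X`, which is the feature that distinguishes MGTWR from GTWR with its single shared bandwidth.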

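4.3. Evaluation Indicators

(1): Residual sum of squares (RSS)

$$RSS = \sum_{i=1}^{n} \left(y_{i} - \sum_{j=0}^{k} \hat{f}_{ij}^{(p+1)}\right)^{2} \tag{12}$$

where $y_{i}$ is the observed value at sample point $i$ and $\sum_{j=0}^{k} \hat{f}_{ij}^{(p+1)}$ is the corresponding fitted value, consistent with the residual term $\hat{\varepsilon} = y - \sum_{j=0}^{k} f_{j}$ defined in Section 4.2.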

(2): Information criterion (AICc)

The corrected Akaike Information Criterion (AICc) is a statistical measure used to evaluate model fit and to select parameters. It also plays a crucial role in bandwidth selection, because it accounts not only for fitting accuracy but also penalizes model complexity, such as the number of parameters.

$$AICc = 2n \ln(\hat{\delta}) + n \ln(2\pi) + n \left[\frac{n + tr(S)}{n - 2 - tr(S)}\right] \tag{13}$$

where $\hat{\delta}$ is the estimated standard deviation of the error term, $\hat{\delta} = \sqrt{RSS / (n - tr(S))}$; $tr(S)$ is the trace of $S$; and $S$ is the hat matrix of GTWR. The $i$-th row of $S$ can be expressed as follows:

$$S_{i} = X_{i} (X^{T} W_{i} X)^{-1} X^{T} W_{i} \tag{14}$$

where $X_{i}$ is the $i$-th row of $X$. Finally, the bandwidth combination with the smallest AICc value is selected as the optimal bandwidth combination.

(3): Goodness of fit ($R^{2}$)

$$R^{2} = 1 - \frac{\sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{N} (y_{i} - \bar{y})^{2}} \tag{15}$$

where $y_{i}$ and $\hat{y}_{i}$ are the observed and predicted passenger flow of URT station $i$, $\bar{y}$ is the mean observed passenger flow, and $N$ is the total number of URT stations.
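The three indicators can be computed with the short sketch below. The function names are hypothetical, the AICc is written in its commonly used form with a constant term $n \ln(2\pi)$ (the constant does not affect the ranking of bandwidths), and `weights[i]` is assumed to hold the diagonal of $W_{i}$ for sample point $i$.

```python
import numpy as np


def rss(y, y_hat):
    """Residual sum of squares between observed and fitted values."""
    r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(r @ r)


def hat_trace(X, weights):
    """Trace of the GTWR hat matrix S, built row by row from Equation (14)."""
    X = np.asarray(X, dtype=float)
    tr = 0.0
    for i in range(X.shape[0]):
        XtW = X.T * weights[i]                        # X^T W_i
        s_i = X[i] @ np.linalg.solve(XtW @ X, XtW)    # i-th row of S
        tr += s_i[i]
    return tr


def aicc(rss_value, n, trace):
    """Corrected AIC; requires n - 2 - trace > 0."""
    sigma = np.sqrt(rss_value / (n - trace))
    return 2 * n * np.log(sigma) + n * np.log(2 * np.pi) + n * (n + trace) / (n - 2 - trace)


def r_squared(y, y_hat):
    """Goodness of fit, Equation (15)."""
    y = np.asarray(y, dtype=float)
    return 1.0 - rss(y, y_hat) / float(np.sum((y - y.mean()) ** 2))
```

With uniform weights the hat matrix reduces to that of ordinary least squares, so its trace equals the number of columns of `X`; for a fixed RSS, a larger trace (a more flexible model) yields a larger AICc.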

5. Case Study

In the experimental design, 14 independent variables were initially selected, with the dependent variable being the daily morning rush hour passenger flow from 1 January to 25 January 2019. First, multicollinearity tests and spatial autocorrelation analyses were conducted on the independent variables to evaluate their linear relationships and spatial interdependencies. Based on these findings, a Multi-Scale Geographically and Temporally Weighted Regression (MGTWR) model was then developed to reveal the spatiotemporal heterogeneity of each variable’s effect on the dependent variable.

5.1. Variable Collinearity Test Results

In this regression analysis, we employed Variance Inflation Factors (VIF) to assess multicollinearity among independent variables. As shown in Table 4, the VIF values of four variables—catering services, life services, commercial housing, and government agencies/social organizations—all exceeded 10, indicating strong multicollinearity. However, most variables maintained VIF values well below 10. To mitigate multicollinearity in these four variables, ridge regression was first applied to the independent variables before their inclusion in the MGTWR model, thereby enhancing the interpretability of the model’s estimated parameters.

Ridge Regression works by adding a regularization term (or penalty term) to the objective function that minimizes the sum of squared residuals. This term multiplies the sum of squared regression coefficients by a regularization parameter, thereby reducing the absolute values of the coefficients and preventing overfitting or instability caused by multicollinearity.

The loss function expression of ridge regression is shown in Equation (16):

L (β) = \sum_{i = 1}^{n} {(y_{i} - x_{i}^{T} β)}^{2} + λ \sum_{j = 1}^{p} β_{j}^{2}

(16)

where n is the number of samples; p is the number of features; y_{i} is the target value of the i-th sample; x_{i} is the feature vector of the i-th sample; β_{j} is the regression coefficient of the j-th feature; and λ is the regularization parameter, which controls the intensity of the penalty term.
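The ridge loss has a closed-form minimizer, β = (XᵀX + λI)⁻¹ Xᵀy, which the sketch below evaluates with NumPy. It assumes centered data with no separate intercept, and the two collinear features and the value λ = 10 are illustrative choices, not the settings used in this study.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimise sum_i (y_i - x_i^T beta)^2 + lam * sum_j beta_j^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)  # strongly collinear with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(size=100)

beta_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=10.0)  # penalised estimate with smaller coefficients
```

Increasing λ shrinks the coefficient vector toward zero, which is what stabilizes the estimates when features are nearly collinear.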

5.2. Variable Correlation Test Results

Spatial autocorrelation analysis is the final check on the independent variables before regression analysis; it identifies spatial clustering patterns in each variable. To test for spatial autocorrelation, this study calculated Moran’s I for each variable with a Python program (Python 3.9, PyCharm editor; the model is built with the PyTorch deep learning framework). The results are presented in Table 5.

Table 5 shows that Moran’s I and the z-value are positive for every variable, and all Moran’s I values are below 1, indicating positive spatial autocorrelation in all independent variables. The significance levels (p-values) are all below 0.05, which confirms that the spatial autocorrelation of all 14 independent variables is statistically significant. The positive z-values further indicate that the variables are spatially clustered rather than dispersed. Consequently, all 14 independent variables were retained as the basis for the study, and the data meet the requirements for constructing the MGTWR model.
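A minimal sketch of the global Moran’s I statistic and its z-value is given below, assuming the usual definitions I = (n / S₀) Σᵢⱼ wᵢⱼ (xᵢ - x̄)(xⱼ - x̄) / Σᵢ (xᵢ - x̄)² and E(I) = -1 / (n - 1). The six-station line and its binary neighbour weights are hypothetical; the spatial weight matrix actually used in the study is not specified here.

```python
import math

def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def z_score(i_obs, n, var_i):
    """Standardise I with E(I) = -1 / (n - 1) and a given variance."""
    expected = -1.0 / (n - 1)
    return (i_obs - expected) / math.sqrt(var_i)

# Toy example: six stations on a line, adjacent stations are neighbours
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(6)] for i in range(6)]
toy_i = morans_i(values, weights)

# Food service row of Table 5, using n = 81 stations
z_food = z_score(0.303251889, 81, 6.37e-6)
```

With 81 stations, E(I) = -1/80 = -0.0125, which matches the E(I) column of Table 5. The recomputed z-value for food service is about 125.1, close to the tabulated 125.134; the small gap is presumably due to rounding of Var(I) in the table.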

5.3. Results of the MGTWR Model at URT Site

Based on the selected 14 independent variables and one dependent variable, we constructed the MGTWR model. The exponential spatiotemporal weight function was selected, with the parameter δ in the convergence criterion SOC set to 10^{-4}. The influence coefficients of each variable are summarized in Table 6, which reports the minimum, lower quartile, median, upper quartile, and maximum of the estimates.
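The five summary statistics reported for each variable can be produced from a list of local coefficient estimates as sketched below. The input values are made up, and the inclusive quantile method is an assumption, since the quartile convention used for Table 6 is not stated.

```python
import statistics

def five_number_summary(coeffs):
    """Minimum, quartiles, and maximum of a list of local coefficients."""
    q1, q2, q3 = statistics.quantiles(coeffs, n=4, method="inclusive")
    return {
        "min": min(coeffs),
        "lower_quartile": q1,
        "median": q2,
        "upper_quartile": q3,
        "max": max(coeffs),
    }

# Made-up local estimates for one variable (illustrative only)
local_coeffs = [-2.0, -1.0, 0.0, 1.0, 2.0]
summary = five_number_summary(local_coeffs)

# A variable whose estimates span zero has both positive and negative effects
mixed_sign = summary["min"] < 0 < summary["max"]
```

A summary of this kind makes sign changes easy to spot: a variable whose minimum is negative and whose maximum is positive acts in both directions somewhere in space or time.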

The regression results show that seven independent variables (catering services, living services, accommodation services, healthcare services, business residences, transportation facilities, and population distribution) have positive parameter estimates throughout. Business residences (commercial housing) have the highest minimum, quartile, and median values among them, indicating a strong positive association with morning rush hour passenger flow. This suggests that a greater supply of these facilities raises the appeal and functionality of the station area, thereby increasing passenger traffic during the morning peak.

Contrary to expectations, six independent variables (shopping services, sports and leisure services, scenic attractions, government agencies and social organizations, science, education and cultural services, and financial and insurance services) have negative parameter estimates throughout. Scenic attractions have the lowest minimum, lower quartile, and median values, so this variable shows the strongest suppressive effect on morning rush hour passenger flow. The negative signs suggest that, under specific spatiotemporal conditions, a greater supply of these facilities does not raise morning peak entries. Differences in regional distribution or in passenger demand patterns may also contribute to the lower passenger volumes.

Furthermore, the parameter estimates of the corporate enterprise variable take both positive and negative values (from −2.4580 to 1.0551 in Table 6), indicating a dual effect on morning rush hour passenger flow. This bidirectional influence may stem from marked temporal and spatial variation in the variable’s impact, with its role in passenger flow shifting across contexts. It illustrates the spatiotemporal heterogeneity captured by the model.

In summary, these regression results not only reveal the complex relationship between the independent variables and the dependent variable, but also show that the influence of a variable may be non-stationary or heterogeneous across time and space.

To examine the spatiotemporal heterogeneity of the built environment’s effect on morning rush hour passenger flow at URT stations, this study analyzes three representative factors. First, commercial-residential properties have a significant positive effect on station entry flow. Second, scenic attractions have a strong inhibitory effect. Third, corporate entities show complex spatiotemporal variation because their effect is non-stationary across time periods. By analyzing the coefficients of these variables in time and space, we aim to reveal the spatiotemporal differences and underlying mechanisms by which built environment elements affect URT passenger flow during the morning peak.

(1): Time characteristic analysis of the influence coefficient of specific variables

To explore how the influence coefficients of the corporate enterprise, commercial residential, and scenic area variables change over time, the coefficient values were averaged across stations for each day from 7 January (Monday) to 13 January (Sunday). The results are shown in Figure 1:

As shown in Figure 1a, the corporate presence variable has positive coefficients from Monday to Thursday, with its impact peaking on Tuesday. The weekday coefficients are markedly higher than the weekend ones, showing that corporate establishments around stations substantially boost morning rush hour passenger flow on working days. From Friday to Sunday, however, the coefficient turns negative and declines continuously, indicating that corporate presence suppresses morning peak flow toward and during the weekend. Most companies are closed at weekends, which reduces commuting demand and causes a sharp drop in URT morning peak passenger numbers. Weekend travel consists mainly of non-work activities such as leisure, socializing, and shopping; this demand is dispersed rather than concentrated as on weekdays and therefore does not drive morning peak traffic.

As shown in Figure 1b, the commercial residential variable has a significant positive effect on morning rush hour station passenger flow on both weekdays and weekends. Its coefficient rises gradually through the week and peaks on Sunday, which reflects the sustained, time-varying effect of commercial residential areas on URT demand. On weekdays, residents of these areas commute regularly, so morning rush hour passenger flow is closely tied to the surrounding transportation system. On weekends, although most workplaces are closed, residents still travel for leisure, entertainment, and social activities, which sustains passenger flow. The Sunday peak may be related to more concentrated weekend activity and more frequent outings, highlighting the role of commercial residential areas in URT passenger flow on non-working days. Overall, the commercial residential variable not only affects weekday commuter flows but also raises URT demand through diversified resident activities at weekends.

As shown in Figure 1c, the scenic area variable has a negative effect on passenger flow on both weekdays and weekends. Its coefficient reaches its minimum on Tuesday, when the variable suppresses URT morning peak passenger flow most strongly. The weekday coefficients are markedly lower than the weekend ones, so the suppressive effect is more pronounced on weekdays, and the strength of the negative effect varies notably from day to day.

This pattern can be explained as follows. The scenic area variable mainly reflects leisure and tourism activity. On weekdays, most people follow regular work routines and tourism demand is limited, so the negative effect of the scenic area variable on URT morning peak passenger flow is at its strongest. On weekends, when most residents are off work, travel demand becomes more concentrated and tourism demand for scenic areas increases. This additional demand raises weekend URT morning peak passenger flow and thereby weakens the negative effect of the scenic area variable.

It is worth noting that on Tuesday, early in the working week, passenger flow is relatively stable, tourism demand is sparse, and visitor numbers at scenic spots are low. This could explain why the scenic spot coefficient is most negative on this day. The variation in the negative coefficient across the week reflects shifts in residents’ travel habits, social activity patterns, and transportation demand.
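The per-day averaging that underlies Figure 1 can be sketched as follows. The record layout of (date, station id, coefficient) tuples and the numeric values are hypothetical; only the dates 7 January 2019 (a Monday) and 13 January 2019 (a Sunday) come from the text.

```python
from collections import defaultdict
from datetime import date

def mean_coefficient_by_day(records):
    """Average local coefficients over stations for each calendar day."""
    buckets = defaultdict(list)
    for day, _station, coef in records:
        buckets[day].append(coef)
    return {day: sum(v) / len(v) for day, v in sorted(buckets.items())}

# Made-up local coefficients for two stations on two days (illustrative only)
records = [
    (date(2019, 1, 7), 3, 0.8), (date(2019, 1, 7), 5, 0.4),
    (date(2019, 1, 13), 3, -1.2), (date(2019, 1, 13), 5, -0.6),
]
daily = mean_coefficient_by_day(records)

# weekday() returns 0 for Monday and 6 for Sunday
is_weekend = {day: day.weekday() >= 5 for day in daily}
```

Grouping by calendar day first and labelling weekdays afterwards keeps the weekday and weekend comparison separate from the averaging step.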

(2): Spatial characteristic analysis of influence coefficients of specific variables

ArcGIS 10.2 was used to visualize the spatial distribution of the influence coefficients on morning rush hour station passenger flow for working days and non-working days. As shown in Figure 2a,b, the corporate variable significantly boosts URT ridership at most stations on working days. On non-working days, it suppresses morning rush hour passenger flow at all stations, with the suppression weakening gradually from the center to the periphery and reaching its minimum at stations in the northeastern region. The impact of the corporate variable therefore shows a clear spatial gradient. Stations in the urban center and commercial hubs show the most pronounced effects, particularly on working days. On non-working days, passenger flow at these stations falls substantially, and the inhibitory effect diminishes with distance from the city center, which suggests that travel in peripheral areas is influenced more by other factors such as leisure activities.

Spatially, passenger travel patterns differ significantly between working and non-working days. As shown in Figure 3a,b, the promoting effect of the commercial residential variable on URT morning rush hour trips varies markedly across time periods and regions.

On working days, the business residential variable has a weaker boosting effect on morning rush hour passenger flow in the urban core and a stronger positive effect at stations farther from the central districts, particularly at line termini; the northernmost stations show the most pronounced effect. This likely stems from weekday commuting demand: office workers living in peripheral areas travel mainly during the morning rush hour, so the business residential variable around these stations stimulates URT demand more strongly.

On non-working days, the spatial pattern of the business residential variable’s impact on URT morning rush hour passenger flow is reversed. The positive coefficient is relatively small at stations in the eastern region and reaches its peak at stations in the central region, so the promoting effect on morning rush hour passenger flow is stronger at central stations than elsewhere. This is because travel on non-working days is driven more by leisure, shopping, and other social activities, which are more dispersed and less concentrated than commuting.

The impact of scenic spots on URT morning rush hour station passenger flow also varies significantly in space and time, with distinct differences between working and non-working days. As shown in Figure 4a,b, scenic areas suppress passenger flow at all stations on both working and non-working days, but the spatial distribution of this inhibitory effect differs between the two.

On working days, the effect of the scenic area variable on morning rush hour station passenger flow is strongly heterogeneous in space. Because weekday travel is dominated by commuting, the inhibitory effect is most pronounced in the urban center, particularly around tourist attractions. Morning peak passenger flow on working days tends to be more dispersed and consists primarily of commuters rather than tourists. Stations farther from the city center, especially those at line termini, show weaker inhibitory effects, and at the northernmost sections of the lines the impact of the scenic area variable is minimal. This pattern reflects differences in commuting patterns on working days.

On non-working days, the impact of the scenic area variable follows a different pattern. The negative coefficient is largest in magnitude at stations in central tourist areas such as West Lake, and it rises (becomes less negative) toward the periphery, reaching its maximum in the northeastern part of the network. Thus, on non-working days the scenic area variable suppresses morning rush hour passenger flow at all stations, with the intensity weakening from the center outward and the impact smallest at northeastern stations. This is consistent with the characteristics of non-working day travel demand: passenger flow depends mainly on the appeal of the central scenic areas, while peripheral areas are less attractive, which limits passenger growth at terminal stations.

(3): Model comparison

To verify the performance advantage of the MGTWR model, this study compared it with the GTWR model on three metrics: R², AICc, and RSS. As shown in Table 7, the MGTWR model raised R² by 2.94 percentage points over the GTWR model (from 62.16% to 65.10%), indicating a better fit to the data. Its AICc was 311 lower (35,615 versus 35,926), indicating a better balance between model complexity and fit. Its RSS was 141,704,440 lower, a reduction in the sum of squared residuals that further supports the improved accuracy. Taken together, these metrics show that the MGTWR model improves on the GTWR model in goodness of fit, in the trade-off between complexity and fit, and in residual error, and that it is better suited to spatiotemporal data.
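The differences discussed above can be recomputed directly from the values in Table 7. The short sketch below does so; the dictionary layout and key names are illustrative only.

```python
# Values copied from Table 7
table7 = {
    "GTWR": {"r2_pct": 62.16, "aicc": 35926, "rss": 4134135268},
    "MGTWR": {"r2_pct": 65.10, "aicc": 35615, "rss": 3992430828},
}
g, m = table7["GTWR"], table7["MGTWR"]

r2_gain_pts = round(m["r2_pct"] - g["r2_pct"], 2)   # gain in percentage points
aicc_drop = g["aicc"] - m["aicc"]                    # lower AICc is better
rss_drop = g["rss"] - m["rss"]                       # absolute RSS reduction
rss_drop_pct = round(100 * rss_drop / g["rss"], 2)   # relative RSS reduction
```

Expressing the RSS reduction relative to the GTWR value (roughly 3.4%) puts the absolute difference of 141,704,440 into proportion with the R² gain.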

6. Conclusions

To balance passenger flow throughout the week, differentiated planning of station surroundings can draw on the spatiotemporal heterogeneity of the commercial residence, scenic attraction, and corporate enterprise variables. This study first conducted multicollinearity and spatial autocorrelation tests on the selected variables and identified multicollinearity in four of them. Ridge regression was then applied to the data before they were entered into the multi-scale spatio-temporal geographically weighted regression (MGTWR) model. With this approach, we analyzed in depth how the built environment around stations influences URT morning rush hour passenger flow. The findings indicate the following:

(1): In Table 6, seven independent variables have a positive impact on morning rush hour station passenger flow, among which the business residential variable has the largest impact; six independent variables have a negative impact, among which the scenic spot variable has the largest impact; and the corporate enterprise variable shows a dual effect, reflecting non-stationary spatiotemporal heterogeneity.
(2): Business residential properties demonstrate the most significant positive impact on morning rush hour passenger flow, particularly during weekends. Scenic attractions exhibit negative effects during weekdays, though this adverse effect diminishes due to concentrated tourism demand on weekends. Corporate entities show marked non-stationary patterns in passenger flow dynamics: they boost traffic during workdays while exerting a restraining effect on weekends.
(3): Corporate variables boost URT travel in urban centers and commercial hubs during weekdays, while showing inhibitory effects on non-working days, with diminished impact on peripheral areas. Business residential variables and scenic attraction variables exhibit significant spatiotemporal differences in their influence on URT travel: the former stimulates commuting demand in peripheral regions, whereas the latter intensifies leisure demand in central scenic area clusters during off-workdays.

Author Contributions

Conceptualization, M.Q.; data curation, M.Q.; formal analysis, J.S.; funding acquisition, L.C.; investigation, J.S. and M.Q.; writing—original draft, J.S. and M.Q.; writing—review and editing, L.C. and M.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Zhejiang Province, China (No. MS25E080023), the Natural Science Foundation of Ningbo City, China (No.2024J130), the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. SJLY2023009), the National “111” Center on Safety and Intelligent Operation of Sea Bridge (D21013), and National Natural Science Foundation of China (Nos. 71971059, 52262047, 52302388, 52272334, and 61963011).

Data Availability Statement

The 2019 Hangzhou Metro AFC data for eighty-one stations on three lines are sourced from https://tianchi.aliyun.com/competition/entrance/231708/information (accessed on 7 November 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

References

Qiu, W.; Tan, X.; Tan, F. Urban Rail Transit Station Equipment; China Railway Publishing House: Beijing, China, 2012.
Wang, Z.-J.; Li, X.-H.; Chen, F. Impact evaluation of a mass transit fare change on demand and revenue utilizing smart card data. Transp. Res. Part A Policy Pract. 2015, 77, 213–224.
Dai, J.; Liu, Z.; Li, R. Improving the subway attraction for the post-COVID-19 era: The role of fare-free public transport policy. Transp. Policy 2021, 103, 21–30.
Li, G.; Yang, M.; Wang, S. Mining the influencing factors of rail transit station passenger flow based on AFC and POI data. Urban Transp. 2019, 17, 120.
Pan, H.; Li, J.; Shen, Q.; Shi, C. What determines rail transit passenger volume? Implications for transit oriented development planning. Transp. Res. Part D-Transp. Environ. 2017, 57, 52–63.
Zhu, Y.; Chen, F.; Wang, Z.; Deng, J. Spatio-temporal analysis of rail station ridership determinants in the built environment. Transportation 2018, 46, 2269–2289.
Calvo, F.; Eboli, L.; Forciniti, C.; Mazzulla, G. Factors influencing trip generation on metro system in Madrid (Spain). Transp. Res. Part D Transp. Environ. 2019, 67, 156–172.
Daniel Cardozo, O.; Carlos Garcia-Palomares, J.; Gutierrez, J. Application of geographically weighted regression to the direct forecasting of transit ridership at station-level. Appl. Geogr. 2012, 34, 548–558.
Chen, E.; Ye, Z.; Wang, C.; Zhang, W. Discovering the spatio-temporal impacts of built environment on metro ridership using smart card data. Cities 2019, 95, 102359.
Gao, F.; Yang, L.; Han, C.; Tang, J.; Li, Z. A network-distance-based geographically weighted regression model to examine spatiotemporal effects of station-level built environments on metro ridership. J. Transp. Geogr. 2022, 105, 103472.
Wang, J.; Zhang, N.; Peng, H.; Huang, Y.; Zhang, Y. Spatiotemporal Heterogeneity Analysis of Influence Factor on Urban Rail Transit Station Ridership. J. Transp. Eng. Part A Syst. 2022, 148, 04021115.
Ma, J.; Zheng, C.; Wu, F.; Wang, Y.; Lu, Y. Analysis of shared bike docking travel characteristics and influencing factors for subway stations. J. Jilin Univ. (Eng. Ed.) 2024, 55, 2639–2650.
Shao, Q.; Zhang, W.; Cao, X.; Yang, J.; Yin, J. Threshold and moderating effects of land use on metro ridership in Shenzhen: Implications for TOD planning. J. Transp. Geogr. 2020, 89, 102878.
Xu, X.; Kong, Q.; Li, J.; Liu, J.; Sun, Q. Analysis of the Impact of Built Environment on Spatiotemporal Heterogeneity of Rail Transit Passenger Flow. Transp. Syst. Eng. Inf. 2023, 23, 281.
Guerrieri, M. Railway Lines and Stations. In Fundamentals of Railway Design; Springer Nature: Cham, Switzerland, 2023; pp. 137–158.
Sun, J.; Ye, X.; Yan, X.; Wang, T.; Chen, J. Multi-Step Peak Passenger Flow Prediction of Urban Rail Transit Based on Multi-Station Spatio-Temporal Feature Fusion Model. Systems 2025, 13, 96.
Global City Computing AI Challenge; Algorithm Competition, Tianchi Competition-Ali Cloud Tianchi Competition System (aliyun.com); 31 March 2019. Available online: https://tianchi.aliyun.com/competition/entrance/231708/information (accessed on 7 November 2023).
Wang, M. Research on the Influencing Factors of Housing Price in Beijing Based on Spatiotemporal Geographical Weighted Regression Model; Shandong Agricultural University: Shandong, China, 2018.
Yang, H.; Lu, X.; Cherry, C.; Liu, X.; Li, Y. Spatial variations in active mode trip volume at intersections: A local analysis utilizing geographically weighted regression. J. Transp. Geogr. 2017, 64, 184–194.
Ye, X.; Sui, X.; Wang, T.; Yan, X.; Chen, J. Research on parking choice behavior of shared autonomous vehicle services by measuring users’ intention of usage. Transp. Res. Part F Traffic Psychol. Behav. 2022, 88, 81–98.
Huang, B.; Wu, B.; Barry, M. Geographically and Temporally Weighted Regression for Modeling Spatio-temporal Variation in House Prices. Int. J. Geogr. Inf. Sci. 2010, 24, 383–401.
Li, Z.; Xu, Z.; Chen, Y.; Gu, S.; Li, C. Impacts of landscape patterns on habitat quality in coal resource-exhausted cities: Spatial-temporal dynamics and non-stationary scale effects. Environ. Monit. Assess. 2025, 197, 297.
Xu, J.; Jisi, S. Research on Spatial Distribution Changes and Influencing Factors of Urban Housing Prices Based on MGTW; Chongqing Jiaotong University: Chongqing, China, 2023.
Guo, P.; Hong, Z. Analysis of spatial-temporal heterogeneity of PM 2.5 concentration influencing factors in three northeastern provinces based on MGTWR model. J. Inn. Mong. Univ. Technol. (Nat. Sci. Ed.) 2024, 44, 280–288.
Fotheringham, A.S.; Yang, W.B.; Kang, W. Multiscale geographically weighted regression (MGWR). Ann. Am. Assoc. Geogr. 2017, 107, 1247–1265.

Figure 1. Temporal distribution of the influence coefficients of the independent variables in the MGTWR model.

Figure 2. Spatial distribution of the influence coefficients of the enterprise variable in the MGTWR model.

Figure 3. Spatial distribution of the influence coefficients of the commercial housing variable in the MGTWR model.

Figure 4. Spatial distribution of the influence coefficients of the tourist attraction variable in the MGTWR model.

Table 1. Examples of some AFC swipe data.

Time	Line Id	Station Id	Device Id	Status	User Id	Pay Type
5 January 2019 8:15	A	73	3332	0	C35ff……25246	2
5 January 2019 8:15	B	24	1235	0	D2821……41526	3
5 January 2019 8:15	B	4	158	0	A91aec……a9e93	0
5 January 2019 8:15	C	41	1919	1	Bbb3a……d6357	1
5 January 2019 8:15	B	24	1221	0	B1426……61853	1
5 January 2019 8:15	B	15	782	1	B1576……db5e0	1

Table 2. Results of POI correlation analysis between Station 7 and selected stations.

Station	3	5	6	7	15	43	68
POI Correlation	0.9394	0.912	0.9320	1	0.9059	0.8747	0.8876
Correlation Ranking	1	3	2	-	4	6	5
Daily Passenger Flow Correlation	0.9267	0.9153	0.9247	1	0.8072	0.9151	0.8884
Correlation Ranking	1	3	2	-	6	4	5

Table 3. Ranking of the number of POIs in each category at selected sites and their proximity to the number of POIs at Station 7.

Site	3	Ranking	5	Ranking	6	Ranking	7
Automobile Service	35	4	37	5	28	1	23
Car Sales	14	3	0	2	0	2	2
Vehicle Maintenance and Repair	8	1	8	1	9	2	8
Motorcycle Service	0	2	0	2	0	2	2
Food Service	366	2	295	3	211	4	345
Shopping Services	378	1	343	2	296	3	439
Service For Life	405	2	436	1	284	4	470
Sports and Leisure Services	83	5	80	4	44	2	56
Medical and Health Services	105	1	75	2	64	3	135
Accommodation Services	62	2	27	5	38	4	252
Famous Scenery	3	2	4	1	2	3	13
Commercial Housing	31	4	47	1	45	2	94
Government Agencies and Social Organizations	63	3	151	2	118	1	124
Science, Education and Cultural Services	112	3	132	6	57	1	77
Transportation Facilities Services	152	1	210	3	170	2	153
Financial and Insurance Services	20	4	84	2	50	1	60
Incorporated Business	115	4	153	2	153	2	199
Road Ancillary Facilities	0	2	0	2	1	1	1
Location Address Information	55	1	47	2	39	3	60
Communal Facilities	22	2	57	3	24	1	36
Event Activities	0	1	0	1	0	1	0
Transportation Facilities	0	1	0	1	0	1	0
Site	15	Ranking	43	Ranking	68	Ranking
Automobile Service	49	6	13	3	17	2
Car Sales	4	2	1	1	1	1
Vehicle Maintenance and Repair	2	4	2	4	3	3
Motorcycle Service	0	2	0	2	1	1
Food Service	355	1	158	5	145	6
Shopping Services	243	4	125	6	215	5
Service For Life	329	3	146	6	150	5
Sports and Leisure Services	51	1	43	3	26	6
Medical and Health Services	51	4	32	6	37	5
Accommodation Services	232	1	40	3	26	6
Famous Scenery	3	2	2	3	3	2
Commercial Housing	35	3	30	5	26	6
Government Agencies and Social Organizations	46	4	28	6	43	5
Science, Education and Cultural Services	44	2	30	4	29	5
Transportation Facilities Services	260	6	61	4	59	5
Financial and Insurance Services	22	3	10	6	19	5
Incorporated Business	140	3	163	1	72	5
Road Ancillary Facilities	0	2	0	2	0	2
Location Address Information	33	5	34	4	11	6
Communal Facilities	81	6	1	5	4	4
Event Activities	0	1	0	1	0	1
Transportation Facilities	0	1	0	1	0	1

Table 4. VIF values for each variable.

Variable	VIF
Food Service	10.34668186
Shopping Services	7.032731489
Service For Life	23.82167605
Sports and Leisure Services	6.554564789
Medical and Health Services	4.078520572
Accommodation Services	4.399362948
Famous Scenery	4.462307008
Commercial Housing	11.88041137
Government Agencies and Social Organizations	13.31520142
Science, Education and Cultural Services	4.877289223
Transportation Facilities Services	3.670111891
Financial and Insurance Services	3.733694487
Incorporated Business	1.828769161
Population Distribution	2.975139216

Table 5. Results of the spatial correlation test for each variable.

Variable	Moran’s I	E(I)	Var(I)	z-Value
Food service	0.303251889	−0.0125	6.37 × 10⁻⁶	125.134
Shopping services	0.497486103	−0.0125	1.04 × 10⁻⁵	157.7971
Service for life	0.468983164	−0.0125	9.85 × 10⁻⁶	153.4383
Sports and leisure services	0.246757545	−0.0125	5.18 × 10⁻⁶	113.901
Medical and health services	0.506598145	−0.0125	1.06 × 10⁻⁵	159.1655
Accommodation services	0.429375709	−0.0125	9.02 × 10⁻⁶	147.1677
Famous scenery	0.576883604	−0.0125	1.21 × 10⁻⁵	169.35
Commercial housing	0.955530334	−0.0125	2.01 × 10⁻⁵	216.1215
Government agencies and social organizations	0.93230677	−0.0125	1.96 × 10⁻⁵	213.5476
Science, education and cultural services	0.544384151	−0.0125	1.14 × 10⁻⁵	164.7188
Transportation facilities services	0.467177063	−0.0125	9.81 × 10⁻⁶	153.1579
Financial and insurance services	0.614376574	−0.0125	1.29 × 10⁻⁵	174.5404
Incorporated business	0.206291487	−0.0125	4.33 × 10⁻⁶	105.1286
Population distribution	0.993669173	−0.0125	2.09 × 10⁻⁵	220.2831

Table 6. Fitting parameters for the MGTWR model of morning peak inbound passenger flow.

Variable	Minimum	Lower Quartile	Median	Upper Quartile	Maximum
The Intercept Term	553.7797	821.6143	1214.1688	1261.4603	1347.7401
Food Service	6.7665	10.1378	13.5146	14.3634	17.8539
Shopping Services	−1.6585	−1.1661	−0.7152	−0.6078	−0.3245
Service For Life	3.2319	4.3188	5.4752	5.8761	6.1308
Sports and Leisure Services	−33.1797	−30.2744	−29.3012	−28.3353	−26.4028
Medical and Health Services	12.0856	14.0070	17.7617	19.1660	21.5747
Accommodation Services	3.2178	3.9127	4.3618	5.2461	8.9824
Famous Scenery	−54.5313	−48.9200	−45.8257	−23.6494	−11.6348
Commercial Housing	13.9447	17.2068	18.5939	19.6171	20.7198
Government Agencies and Social Organizations	−30.2892	−29.6214	−28.7856	−27.1498	−24.7318
Science, Education and Cultural Services	−4.9256	−4.0762	−3.7593	−3.3391	−2.5871
Transportation Facilities Services	1.4142	3.7231	4.1307	4.7995	6.7697
Financial and Insurance Services	−20.1579	−18.3932	−16.8708	−10.9467	−8.1135
Incorporated Business	−2.4580	−1.2893	−0.2972	0.2815	1.0551
Population Distribution	0.0585	0.0682	0.0891	0.0953	0.1027

Table 7. Comparison of results from different models.

Model	R²	AICc	RSS
GTWR	62.16%	35,926	4,134,135,268
MGTWR	65.10%	35,615	3,992,430,828

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
