Modeling the Factors Influencing the Activity Spaces of Bikeshare around Metro Stations: A Spatial Regression Model

Metro-bikeshare integration is considered a green and efficient travel model. To better understand bikeshare as a feeder mode to the metro, this study explored the factors that influence the activity spaces of bikeshare around metro stations. First, metro-bikeshare transfer trips were recognized by matching bikeshare smartcard data with metro smartcard data. Then, the standard deviation ellipse (SDE) was used to calculate the metro-bikeshare activity spaces. Next, an ordinary least squares (OLS) regression and a spatial error model (SEM) were established to reveal the effects of social-demographic, travel-related, and built environment factors on the activity spaces of bikeshare around metro stations; the SEM outperformed OLS significantly in terms of model fit. Results show that the average metro-bikeshare activity space on weekdays is larger than that on weekends. A higher proportion of local residents enlarges the activity space on weekends, while a high density of roads and metro stations reduces the activity space on weekdays. Additionally, with increased job density, the activity space becomes significantly smaller throughout the week. Also, on both weekdays and weekends, the closer a station is to the central business district (CBD), the smaller the activity space. This study can offer meaningful guidance to policymakers and city planners aiming to make the bikeshare distribution more reasonable.


Introduction
The bikeshare program offers an affordable and equitable public transportation mode [1], since it allows travelers to use bicycles on an "as-needed" basis, free from the costs and responsibilities that come with private bike ownership [1,2]. As a non-motorized transportation tool, bikeshare produces no air pollution and helps to alleviate traffic congestion [3][4][5]. Flexible and convenient, shared bikes are not only sufficient for a complete short trip, but also complement the metro by solving the last-mile problem [2,5], as one mile is often too short to travel efficiently by bus, given the waiting time, yet a long distance to walk. Over the past decades, bikeshare has expanded rapidly all over the world. As of 19 February 2018, docked bike-sharing systems were operating in 1560 cities worldwide, and another 402 such systems were in planning or under construction, with interest growing in an increasing number of cities [6].
Bike-metro integration has grown substantially in the past decade in many countries. Fishman [4] suggested that "improved public transport integration" is one of the most important future directions in the bikeshare domain. In Chinese cities, a marriage between bikeshare and the metro can greatly benefit sustainable transportation development [7]. First, China has been going through a phase of unprecedented expansion in both bikeshare programs and urban rail transit systems. Second, extending the metro systems to suburban and exurban areas often requires a higher cost, while an effective multimodal transfer system would help expand the service to cover more places and save people's travel time [8,9]. Cycling is one answer: it makes it convenient to reach places that lie beyond the reach of the metro and would otherwise require a long walk [10,11]. Furthermore, metro-bikeshare integration is especially important in Chinese cities, where public transit systems are notoriously crowded and bicycles are not allowed to be taken on buses and trains.
Nowadays, the development of automatic fare collection systems has made it possible for users to transfer between bikeshare and the metro freely with the same smartcard. This brings researchers new perspectives on the integration of bicycles and the metro, as the seamless smartcard records the transactions of all public transport modes with the same card ID. As Shaheen [12] suggested, seamless smartcards for metro users could help increase the number of metro-bikeshare users and thus make the public transportation systems more profitable. In this paper, the data come from bikeshare and metro smartcard systems in Nanjing in March 2016.
Despite the rising interest in metro-bikeshare and the number of countries that are developing metro-bikeshare integration to improve the efficiency of public transportation, knowledge on the following topics is scarce: How to measure the activity spaces of metro-bikeshare transfers by isolating valid transfer records that belong to one person using smartcard data, and how different factors influence the activity spaces across social-demographic, travel-related, and environmental dimensions. Specifically, activity spaces are defined as the physical spaces within which people travel in the course of their daily activities [13]. In this paper, the activity spaces of bikeshare around metro stations refer to the spaces defined by the maximum distance a metro user would use bikeshare to reach metro services; this is a reference to Nair [14], who defined the pedestrian catchment area for the metro in the same way.
This paper focuses on these questions, and the remainder is organized as follows. A literature review of metro-bikeshare travel characteristics and influential factors is provided in the next section. Subsequently, the paper introduces the methodology and data. Afterward, the model results are presented. The conclusions and suggestions for future research are summarized in the last part of the paper.

Literature Review
There is extensive literature on the combination of bicycles and the metro, and the number is growing rapidly. Multiple articles have summarized a range of bicycle-metro integration topics, including the travel characteristics of integrated bicycle-metro trips [8,10][15][16][17], the accessibility of the bicycle-metro system [18][19][20][21], bicycle parking issues at railway stations [22,23], the bicycle-metro demand forecast [24][25][26], and the determinants of general bicycle-metro integration [8,16][27][28][29][30]. However, the emphasis of this literature review is on the integration of bikeshare with the metro. Such a review is challenging, as empirical evidence on the means to improve the integration between bikeshare and the metro has been fragmentary. The limited literature on this topic has focused on metro-bikeshare travel characteristics and the influencing factors.

Travel Characteristics of Metro-Bikeshare Integration
Previous studies have explored metro-bikeshare travel characteristics in socio-demographic groups and spatial-temporal dimensions. Bachand-Marleau, et al. [31] found that over one-third of survey respondents reported having used bikeshare. Bikeshare users, especially those with a yearly membership, were most likely to integrate bikeshare and the metro. Chen, et al. [32] conducted a survey and suggested that more than half of the metro users had a preference for bicycle transfer services, and those travelers were making trips for non-time-sensitive purposes, such as shopping and visiting friends and, to a lesser extent, going to work and school. The reason for this result may be that the sample included private bicycle users. Shaheen [5] found that bikeshare systems in smaller, less transit-intensive cities can provide greater public transit connectivity than those in bigger cities. Similarly, bikeshare was more closely connected with the metro in suburban or exurban areas where the population was less dense [33]. The results showed that bikeshare appeared to be improving urban mobility and lowering travelers' dependency on automobile travel. Bikeshare stations colocated with public transit produced the greatest ridership, and the appropriate distance between a metro station and a bike-sharing kiosk that encouraged multimodal cross-flow between bikeshare and the metro was an average of 120 m [14,34]. Moreover, Yi et al. [35] found that over 89% of passengers have fewer than six transfers in 3 weeks. Ma et al. [36] analyzed the distribution of bikeshare travel distance and travel time of metro-bikeshare users and found that the main function of bikeshare integrated with the metro is commuting, rather than entertainment or exercising. 
In addition, in Zhao's [37] study, some interesting findings of metro-bikeshare transfer were derived; for instance, the first-mile trips during the morning peak had the same spatial pattern as the last-mile trips during the evening peak. Also, the distance of the first-mile or last-mile bikeshare trips was shorter than that of the other bikeshare trips. Yang, et al. [38] found that the metro-bikeshare is a more comfortable, simple, and efficient travel mode for many suburban commuters. Many commuters who drive medium to long distances daily strongly favor the metro-bikeshare as an alternative commute mode. In addition, male motorists and commuters who have had unpleasant experiences are more likely to be attracted to the corresponding features of metro-bikeshare integration.

Influential Factors of Bicycle-Metro Integration
Several studies analyzed the effects of different factors on bicycle-metro integration. Chen et al. [39] and Zhao and Li [27] found that travel distance is one of the most important influential factors which determines the rate at which the bicycle is a feeder mode to the metro. The attributes of the bicycle service at metro stations can heavily influence a cyclist's decision to choose the bicycle as a feeder mode, as suggested by Pan et al. [8]. Zhan [40] used a zero-inflated negative binomial regression model to examine the associations between public transit (bus and metro) usage and monthly bikeshare ridership, and they found that public transit usage was significantly positively associated with bike-sharing usage. Zhao [41] adopted Z-score analysis and visual analytic techniques and found that residential areas are the primary fountainhead where bikeshare demands are generated; meanwhile, metro stations were the most attractive hubs for the termination of bikeshare trips. In addition to these observable variables that can affect the use of bicycles to access public transit, unobservable variables have also been identified as having an influence. By using a hybrid model, the relationship between integrated bicycle-metro travel and three latent variables (the perception of connectivity, the attitude towards the station environment, and the perceived quality of bicycle facilities) were analyzed, and it was found that each variable had a positive correlation [42]. Personal demographics also contribute to the likelihood of using bikeshare to access the metro. Ji et al. [43] showed that female, older, and low-income transit commuters are less likely to use bikeshare to access the metro. 
Additionally, researchers have explored the effects of transfer bicycles and bicycle-related infrastructure on the metro, and they found that the presence of bicycle lanes and bikeshare ridership within a certain area (300-800 m) around the rail stations were both positively associated with transit ridership [14,44].
None of the aforementioned studies are specifically focused on the factors influencing the bikeshare activity spaces around metro stations. Also, most of the studies related to the metro-bikeshare system are based on survey data or data from only one kind of smartcard (bikeshare or metro). Therefore, in their studies, there were several problems, such as an inadequate sample size, restricting study generalizability. To the best of our knowledge, this article is the first effort to use a large amount of valid metro-bikeshare transfers isolated from smartcard data to measure the bikeshare activity spaces around metro stations and then reveal influential factors across social-demographic, travel-related, and built environment dimensions.

Methodology
In this section, multicollinearity and spatial autocorrelation are briefly introduced, and then two methods (ordinary least squares (OLS), spatial error model (SEM)) developed for comparisons are briefly illustrated.

Multicollinearity
Multicollinearity means that several independent variables are strongly linearly correlated with each other, which can bias the interpretation of the significance and influence of the other independent variables. To detect this problem, we adopted the variance inflation factor (VIF), an indicator of the severity of multicollinearity. Specifically, the VIF of an independent variable is derived from the goodness-of-fit ($R_i^2$) of an auxiliary regression that takes this variable as the dependent variable while treating the remaining independent variables as predictors. The VIF is calculated as follows:
$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2}$$
Commonly, variables with VIF values greater than 10 are assumed to be multicollinear and should thus be eliminated from the OLS model [45].
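As an illustration of the VIF screening described above, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation; the function name `vif` and the synthetic predictors are assumptions made only for this example.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (n observations x k predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for i in range(k):
        y = X[:, i]                                   # predictor i becomes the response
        others = np.delete(X, i, axis=1)              # remaining predictors
        design = np.column_stack([np.ones(n), others])  # auxiliary regression with intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)
    return out

# Synthetic (hypothetical) predictors: c is almost a copy of a, b is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
vifs = vif(np.column_stack([a, b, c]))
flagged = vifs > 10  # the threshold of 10 used in the text
```

In this toy case the near-duplicate pair (a, c) is flagged while b is not; in practice one would drop or combine one of the flagged variables and recompute.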

Spatial Autocorrelation
The most commonly used spatial variability test is Moran's I test, which shows the spatial autocorrelation of each independent variable and can be expressed as follows [46]:
$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $n$ is the number of spatial units; $w_{ij}$ is the spatial weight between locations $i$ and $j$; $y_i$ and $y_j$ represent the selected attribute value at locations $i$ and $j$, respectively; and $\bar{y}$ is the average of all observations. The range of Moran's I statistic is between −1 and +1. Higher positive values mean that close observations tend to have similar attribute values while distant observations have different attribute values, which indicates spatial aggregation. In contrast, a negative value indicates spatial dispersion, and a value near zero indicates a spatially random distribution. The null hypothesis of Moran's I test is that the independent variables are spatially independent, which means that Moran's I statistic is close enough to zero. A Z-score is usually used as the indicator of the significance of Moran's I statistic to test the null hypothesis, and it can be calculated as follows:
$$Z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}$$
where $E(I)$ and $\mathrm{Var}(I)$ are the expectation and the variance of the Moran's I statistic, respectively. The significance level in this study was p < 0.05.
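The statistic above can be sketched in a few lines of Python. This is an illustrative example only: the z-score here is obtained from random permutations rather than from the analytical E(I) and Var(I), which is a substitution made for brevity, and the line-shaped toy weight matrix is an assumption, not the weighting used in the study.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y and an n x n spatial weight matrix W."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()
    return len(y) / W.sum() * (z @ W @ z) / (z @ z)

def morans_z(y, W, n_perm=999, seed=0):
    """Observed I and a permutation-based z-score (not the analytical one)."""
    rng = np.random.default_rng(seed)
    observed = morans_i(y, W)
    sims = np.array([morans_i(rng.permutation(y), W) for _ in range(n_perm)])
    return observed, (observed - sims.mean()) / sims.std(ddof=1)

# Toy layout (hypothetical): 10 units on a line, adjacent units have weight 1.
n = 10
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

y_clustered = np.arange(n, dtype=float)          # smooth gradient: similar neighbours
y_alternating = np.array([0.0, 1.0] * (n // 2))  # checkerboard: dissimilar neighbours

i_clustered, z_clustered = morans_z(y_clustered, W)
i_alternating = morans_i(y_alternating, W)
```

The gradient produces a clearly positive I (spatial aggregation), while the alternating pattern produces a strongly negative I (spatial dispersion), matching the interpretation given in the text.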

Regression Models
Ordinary least squares (OLS) regression is conducted first; in OLS, the dependent variable is modeled as a linear function of multiple predictors using the least squares approach [47]. However, the applicability of the OLS approach has been criticized for neglecting the spatial variations in metro-bikeshare usage [48].
The spatial error model (SEM) can be viewed as an extension of OLS models, as it assumes that variables that follow a spatial pattern are omitted from the model, which leads to spatial autocorrelation within the error term [49]. SEM is useful when there is spatial autocorrelation among residuals. It contains a spatial error term and examines the impact of the omitted variables on the observations of a dependent variable [50]. SEM is expressed as follows [51]:
$$A = X\beta + \varepsilon, \qquad \varepsilon = \lambda W \varepsilon + u$$
where $A$ is the $(n \times 1)$ vector of activity spaces of bikeshare around metro stations; $X$ is the $(n \times k)$ matrix of exogenous variables influencing the activity spaces of bikeshare around metro stations; $W$ is the spatial weight matrix; $\beta$ is the $(k \times 1)$ vector of regression coefficients; $\varepsilon$ is the assumed disturbance term; $u$ is the random error term, typically assumed to be independent and identically distributed (i.i.d.); and $\lambda$ is the spatial autoregressive coefficient. If $\lambda$ is statistically significant, it indicates that hidden independent variables with spatial autocorrelation produce the appearance of spatial autocorrelation in the residuals [52].
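To make the SEM specification concrete, the sketch below simulates data from the model and recovers λ and β by a simple grid search over the concentrated Gaussian log-likelihood. This is an assumption-laden illustration, not the estimation procedure used in the study: the ring-shaped weight matrix, the parameter values, and the function name `fit_sem` are all hypothetical, and a dedicated spatial econometrics package would normally be used instead.

```python
import numpy as np

def fit_sem(A, X, W, grid=None):
    """Grid-search ML for A = X beta + eps, eps = lam * W eps + u."""
    if grid is None:
        grid = np.linspace(-0.9, 0.9, 91)
    n = len(A)
    eye = np.eye(n)
    best = None
    for lam in grid:
        B = eye - lam * W                      # spatial filter (I - lam W)
        sign, logdet = np.linalg.slogdet(B)
        if sign <= 0:
            continue
        A_f, X_f = B @ A, B @ X                # filtered response and predictors
        beta, *_ = np.linalg.lstsq(X_f, A_f, rcond=None)
        resid = A_f - X_f @ beta
        sigma2 = resid @ resid / n
        # Concentrated log-likelihood: ln|I - lam W| - n/2 * (ln(2 pi sigma^2) + 1)
        loglik = logdet - 0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
        if best is None or loglik > best["loglik"]:
            best = {"loglik": loglik, "lam": lam, "beta": beta}
    return best

# Hypothetical setup: 300 units on a ring, each with two neighbours (row-standardised).
rng = np.random.default_rng(42)
n = 300
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta, true_lam = np.array([1.0, 2.0]), 0.6
u = rng.normal(size=n)
eps = np.linalg.solve(np.eye(n) - true_lam * W, u)  # eps = (I - lam W)^-1 u
A = X @ true_beta + eps

fit = fit_sem(A, X, W)
```

The design choice worth noting is that, for a fixed λ, the model reduces to OLS on the filtered data, so only one parameter needs to be searched.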
We used Moran's I of residuals of the spatial regression model, Akaike information criterion (AIC), and log-likelihood statistics to compare the results of the OLS and SEM models, as Yang [50] used. Specifically, AIC is a commonly used metric to indicate the goodness-of-fit of a spatial regression model, where the final model with the lowest AIC is selected [53,54].
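For reference, the standard definition of AIC, which the text does not spell out, can be written as below. The symbols $k$ (number of estimated parameters) and $\hat{L}$ (maximized likelihood) are notation introduced here for illustration; under this definition the SEM carries one more parameter than OLS, namely λ.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

A lower AIC together with a higher log-likelihood therefore indicates that the improvement in fit outweighs the penalty for the extra parameter.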

Data
This section introduces the dependent variable and three categories of explanatory variables, including social-demographic, travel-related, and built environment dimensions.

Dependent Variable: Activity Space of Bikeshare around the Metro Station
The dependent variable used in the models is the activity spaces of bikeshare around metro stations. In order to describe the dependent variable, we must first illustrate the process of recognizing the metro-bikeshare transfer trips. Transfer trips were identified using metro smartcard data and bikeshare smartcard data from 9 March 2016 to 29 March 2016, which were obtained from the Nanjing Public Bicycle Company and the Nanjing Metro Company. Bikeshare smartcard data and metro smartcard data share the same structure, including two profiles: Trips and Stations. The Trips profile includes the following anonymous information: card ID, trip starting date and time, trip ending date and time, trip starting station ID, trip ending station ID. The Stations profile includes station ID, station name, and the longitude/latitude of the docking station. We applied two matching rules to recognize metro-bikeshare transfer trips: a maximum transfer time of 10 min and a maximum transfer distance of 300 m [37,43]. After applying the two rules of maximum travel time and distance, 12,331 metro-bikeshare transfer trips made by 3836 passengers at 39 transfer pairs were generated.
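The two matching rules can be sketched as a simple predicate. This is a hedged illustration, not the authors' code: the dictionary keys (`card_id`, `end_time`, `end_lonlat`, `start_time`, `start_lonlat`) are hypothetical field names, only the bikeshare-to-metro direction is shown (the metro-to-bikeshare direction would mirror it), and great-circle distance is assumed as the distance measure.

```python
from datetime import datetime, timedelta
from math import radians, sin, cos, asin, sqrt

MAX_GAP = timedelta(minutes=10)  # maximum transfer time from the text
MAX_DIST_M = 300.0               # maximum transfer distance from the text

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))

def is_transfer(bike_trip, metro_trip):
    """True if a bikeshare trip feeds into a metro trip made with the same card."""
    if bike_trip["card_id"] != metro_trip["card_id"]:
        return False
    gap = metro_trip["start_time"] - bike_trip["end_time"]
    if not timedelta(0) <= gap <= MAX_GAP:
        return False
    dist = haversine_m(*bike_trip["end_lonlat"], *metro_trip["start_lonlat"])
    return dist <= MAX_DIST_M

# Hypothetical records with made-up coordinates.
bike = {"card_id": "c1", "end_time": datetime(2016, 3, 9, 8, 0),
        "end_lonlat": (118.780, 32.040)}
metro_ok = {"card_id": "c1", "start_time": datetime(2016, 3, 9, 8, 6),
            "start_lonlat": (118.782, 32.040)}
metro_late = {"card_id": "c1", "start_time": datetime(2016, 3, 9, 8, 15),
              "start_lonlat": (118.782, 32.040)}
metro_far = {"card_id": "c1", "start_time": datetime(2016, 3, 9, 8, 6),
             "start_lonlat": (118.785, 32.040)}
```

Only `metro_ok` satisfies both rules; `metro_late` exceeds the 10 min window and `metro_far` lies beyond 300 m.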
Afterward, we calculated the activity spaces of bikeshare around all metro stations using the standard deviation ellipse (SDE), which is an effective method for measuring the activity spaces of travel behavior. From the record of each recognized trip, we obtained the bikeshare activity space of each metro station according to the trip origins and destinations [55]. Specifically, in this study, the SDE was created at two standard deviations, capturing approximately 95% of the docking stations, following Wong et al. [56][57][58]. The average size of the bikeshare activity spaces around metro stations is 4.78 km² on weekdays and 4.20 km² on weekends, as shown in Figure 1a,b. The activity space in urban areas is smaller than that in suburban areas, which is reasonable because the density of metro and bikeshare stations in urban areas is much higher than in suburban areas, so travel distances in suburban areas are greater. Additionally, the bikeshare activity spaces around most metro stations show no obvious difference between weekdays and weekends. This is because the activity spaces are calculated from the latitude and longitude coordinates of bikeshare stations: although the transfer volume on weekdays is much greater than that on weekends, the set of bikeshare stations used by metro-bikeshare users changes little, owing to the diversity of their travel purposes [43].
Figure 1. (a) Activity spaces on weekdays; (b) activity spaces on weekends.
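The area of a standard deviation ellipse can be sketched as follows. This is a simplified illustration under stated assumptions: the axis lengths are taken from the eigenvalues of the coordinate covariance matrix, the input is assumed to be in projected coordinates (for example kilometres) rather than raw longitude/latitude, and GIS packages may apply additional scaling conventions that are not reproduced here. The function name `sde_area` is hypothetical.

```python
import numpy as np

def sde_area(xy, n_std=2.0):
    """Area of the standard deviation ellipse of projected points xy (n x 2)."""
    xy = np.asarray(xy, dtype=float)
    centered = xy - xy.mean(axis=0)               # shift to the mean centre
    cov = centered.T @ centered / len(xy)         # 2 x 2 covariance matrix
    variances = np.linalg.eigvalsh(cov)           # variances along the ellipse axes
    semi_axes = n_std * np.sqrt(variances)        # semi-axis lengths at n_std deviations
    return float(np.pi * semi_axes[0] * semi_axes[1])

# Hypothetical docking-station coordinates in km, centred on a metro station.
stations = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
area = sde_area(stations)

# Rotating the point pattern should not change the ellipse area.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
area_rotated = sde_area(stations @ rot.T)
```

Because the ellipse is defined by the spread along its own principal axes, the computed area depends only on the point pattern and not on its orientation.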

Explanatory Variable: Social-Demographic, Travel-Related, and Built Environment Factors
This paper aims to explore how the social-demographic, travel-related, and built environment factors affect the activity spaces. Social-demographic and travel-related variables were extracted from the recognized metro-bikeshare transfer data, and the built environment variables were measured by GIS from the 2016 statistical yearbook of Nanjing [37]. The social-demographic and travel-related variables include the proportion of males, the proportion of local residents, the proportions of different age groups, the proportion of travel trips in peak hours, and the average travel distance using the metro. The built environment variables include road density, metro station density, bikeshare station density, bus stop density, resident density, job density, and distance to the central business district (CBD). The summary statistics for each variable are shown in Table 1.
The spatial patterns and autocorrelation of the activity spaces were tested before applying the spatial error model. The Global Moran's I tool in ArcGIS was used to evaluate whether there was spatial autocorrelation in the activity spaces throughout the study area. The value of Moran's I index is 0.2386 (z-score = 3.8388, p-value < 0.0001) on weekdays and 0.3131 (z-score = 4.7272, p-value < 0.0001) on weekends, indicating that significant spatial autocorrelation exists in the activity spaces of bikeshare around the metro stations.
Because of this spatial autocorrelation, a spatial regression model was applied to examine the factors influencing activity spaces. First, the OLS model was estimated to test whether its residuals are spatially random. With Moran's I values for the residuals of 0.8568 (p-value < 0.0001) on weekdays and 0.6685 (p-value < 0.0001) on weekends, the residuals are shown to be spatially correlated, so the OLS estimates are unreliable. Thus, the spatial error model, which takes the spatial autocorrelation into consideration, was applied. The estimation results of the models are presented in Table 2. Compared to OLS, the SEM has a higher R-squared, a smaller AIC, and a higher log-likelihood, which means that the SEM has better explanatory power.
In terms of social-demographic aspects, the POLR (proportion of local residents) significantly promotes increased activity space on weekends. Compared with non-native groups that travel mainly on weekdays, native groups are more likely to travel by bikeshare on weekends [59]. Therefore, the activity space grows with POLR. Five age groups were identified in this analysis. This classification reflects the statutory retirement age in China, which in most cases is 55 for females and 60 for males. PAGE1 (proportion of users under 18 years old) exerts negative impacts throughout the week, probably for reasons of physical strength, so the activity space of this group is smaller than that of other groups. In contrast, PAGE4 (proportion of users between 46 years old and retirement age), a group consisting mainly of commuters, promotes increased activity space on weekdays.
As for the travel-related dimension, ADM (average travel distance in metro) has significant negative impacts on the bikeshare activity spaces around metro stations on weekdays. This may be because commuters with tired bodies are not willing to cycle too far after spending too much time on the metro.
In terms of built environment variables, METROD (metro station density) and ROADD (road density) impede the activity spaces on weekdays. Both variables take higher values in urban areas, which indicates heavier on-road traffic and more bus lines, so commuters prefer to walk or take the bus instead of cycling. With increased JOBD (job density), the activity space decreases significantly throughout the week. A denser job distribution indicates a larger number of work-related points of interest (POIs) nearby, and thus more bikeshare stations are located there to meet the commuting demand. The average distance between bikeshare stations is shorter than in other places, which results in shorter travel distances and smaller activity spaces. The positive coefficient of the distance to the CBD suggests that the farther a district is from the CBD, the larger its activity space throughout the week. This is because the suburban bus network and metro lines cover fewer areas and the density of bikeshare stations is low, so people have to cycle farther to reach their destinations.

Discussion and Conclusions
A marriage between bikeshare and metro brings new opportunities for sustainable transportation, benefiting the society as a whole. For exploring the factors that influence the metro-bikeshare activity space, transfer trips were first identified through metro smartcard data and bikeshare smartcard data of 9-29 March 2016; the data were obtained from the Nanjing Public Bicycle Company and the Nanjing Metro Company. Two recognition rules-a maximum transfer time of 10 min and a maximum transfer distance of 300 m-were used to identify the metro-bikeshare transfer trips. Then, we calculated the bikeshare activity spaces around metro stations with the SDE method. The result of this shows that the average activity space on weekdays is larger than that on weekends. By establishing the OLS and SEM models, this study examines the relationship between the activity spaces of bikeshare around the metro stations and social-demographic, travel-related, and built environment variables. SEM outperforms the traditional OLS model significantly in terms of model fit. The findings provide guidance for urban planning and help to better allocate bikeshare stations.
Results show that the POLR (proportion of local residents) significantly promotes increased activity space on weekends. This may be because, compared with non-natives, native groups are more likely to travel by bikeshare on weekends. Hence, the activity space tracks the increase in POLR. The PAGE1 (below 18 years old) impedes the activity space throughout the week, while PAGE4 (between 46 years old and retirement age) promotes it on weekdays. A high density of road and metro stations impedes the bikeshare activity spaces because of a relatively well-developed public transport network. With increasing JOBD (jobs density), the activity space decreases significantly throughout the week. Although the distribution of bikeshare near the office areas is dense, a demand-supply imbalance of bikeshare is also severe since the proportion of commuters is high. Therefore, strategies should be proposed to establish an intelligent monitoring system to rebalance bikeshare over time. Moreover, a long travel distance on the metro impedes the activity spaces on weekdays, probably because commuters with tired bodies are not willing to cycle too far after spending too much time on the metro. The further a district is from the CBD, the bigger is its activity space throughout the week. This is because the suburban bus network and metro lines cover fewer areas and the density of bikeshare stations is low, so people have to ride longer to reach their destinations. It is suggested that the urban planner establishes more bikeshare stations in suburban areas to improve the density of bikeshare stations.
However, this study has some limitations. First, data from additional sources are needed; for instance, other factors such as points of interest (POIs) should be taken into consideration. Second, since the smartcard data used for this work cover only a 21-day period, it is possible that passengers' transfer behavior changes under different seasonal conditions. Future studies should analyze changes in transfer behavior using data collected over a longer period. Additionally, this work could be extended by obtaining metro and bikeshare smartcard data, land use data, and social-demographic data from other cities to examine whether the transfer patterns in other cities are consistent with the findings in Nanjing. Finally, this study has only looked into programs integrating the traditional docked bikeshare with the metro. It would be worthwhile to compare docked with dockless bikeshare in terms of the metro-bikeshare activity spaces.

Conflicts of Interest:
The authors declare no conflict of interest.