Investigating Impacts of Environmental Factors on the Cycling Behavior of Bicycle-Sharing Users

It is widely accepted that cycling tends to produce health benefits and reduce air pollution. Policymakers encourage people to use bikes by improving cycling facilities as well as developing bicycle-sharing systems (BSS). It is increasingly interesting to investigate how environmental factors influence the cycling behavior of BSS users, as they tend to be different from regular cyclists. Although earlier studies have examined the effects of safety and convenience on the cycling behavior of regular riders, they have rarely explored these effects on the cycling behavior of BSS riders. Therefore, in this study, we aimed to investigate how road safety, convenience, and public safety affect the cycling behavior of BSS riders while controlling for other environmental factors. Specifically, we investigated the impacts of environmental characteristics, including population density, employment density, land use mix, accessibility to points of interest (schools, shops, parks, and gyms), road infrastructure, public transit accessibility, road safety, convenience, and public safety, on the usage of BSS. Additionally, for a more accurate measure of public transit accessibility, road safety, convenience, and public safety, we used spatiotemporally varying measurements instead of the spatially varying measurements that have been widely used in earlier studies. We conducted an empirical investigation in Chicago with cycling data from a BSS called Divvy. We particularly attempted to answer the following questions: (1) how traffic accidents and congestion influence the usage of BSS; (2) how violent crime influences the usage of BSS; and (3) how public transit accessibility influences the usage of BSS. Moreover, we tried to offer implications for policies aiming to increase the usage of BSS or for the site selection of new docking stations.
Empirical results demonstrate that the density of bicycle lanes, public transit accessibility, and public safety influence the usage of BSS, which answers our research questions. Empirical results also suggest the policy implication that improving bicycle facilities and reducing the rate of violent crime tend to increase the usage of BSS. Moreover, some environmental factors could be considered in selecting a site for a new docking station.

Earlier studies reveal that the cycling behavior of regular riders (private bicycle riders) tends to be influenced by environmental factors, including population density, land use mix, green space, cycling facilities, and safety [25][26][27][28][29][30]. It is increasingly interesting to investigate how environmental factors influence the cycling behavior of BSS cyclists, as BSS cyclists tend to be different from regular cyclists. For instance, in the UK, the USA, and Australia, BSS riders on average have disproportionately higher education and income, and are more likely to be male and white [31]. Therefore, in very recent years, some studies have started to investigate the impact of environmental factors on the cycling behavior of BSS users [32][33][34][35][36][37][38][39][40]. For instance, some studies find that proximity to workplace or home tends to increase the usage of BSS [33][34][35], and that high population and high levels of public transit accessibility also tend to increase the usage of BSS [36][37][38]. More importantly, improvements to bicycle facilities (bicycle lanes or bicycle paths) seem to increase bicycle-sharing ridership [36,37,39,40]. Potentially, a better understanding of environmental effects on the cycling behavior of BSS riders could offer suggestions for policymaking and urban planning. For instance, knowing how environmental factors influence the cycling behavior of BSS riders will help in finding optimum places for docking stations across a city [32].
Although earlier studies have examined the effects of safety and convenience on the cycling behavior of regular riders [26][27][28][29], they have rarely explored the effects of safety and convenience on the cycling behavior of BSS riders. Therefore, in this study, we aimed to investigate how road safety, convenience, and public safety affect the cycling behavior of BSS riders while controlling for other environmental factors. Specifically, we used traffic accidents, congestion, and violent crimes to measure road safety, convenience, and public safety, respectively. Additionally, to more accurately measure public transit accessibility, road safety, convenience, and public safety, we used spatiotemporally varying measurements instead of the spatially varying measurements that have been widely used in earlier studies [32,37,41]. We also examined how space- and time-dependent public transit accessibility influences the usage of BSS, since earlier studies employed space-dependent public transit accessibility without any information on transit frequency [36][37][38].
Specifically, in this study, we investigated the impacts of environmental factors on the usage of BSS. The environmental factors investigated include socioeconomic factors, land use and point-of-interest (POI) factors, road infrastructure factors, public transit accessibility factors, and safety and convenience factors. In particular, we attempted to answer the following questions: (1) how traffic accidents and congestion influence the usage of BSS; (2) how violent crime influences the usage of BSS; and (3) how public transit accessibility influences the usage of BSS. Moreover, we tried to offer implications for policies aiming to increase the usage of BSS or for the site selection of new docking stations, paying particular attention to the necessity of reducing violent crime. We conducted an empirical investigation in Chicago with cycling data from a BSS called Divvy. Divvy publishes the cycling trips of BSS users (including both annual members and casual users) with start and end docking stations as well as trip duration. The BSS data were contributed by individual riders; they are thus considered a type of crowdsourced geographic information (CGI) and have high potential for studies of active travel and sustainable transport.
The remainder of this paper is organized as follows. Section 2 introduces the BSS data and environmental data, as well as the measures of environmental characteristics. Section 3 then describes the empirical results, and finally Section 4 presents the conclusion and makes recommendations for future work.

Materials and Methods
In this section, the BSS data and environmental data employed are introduced. Subsequently, the environmental characteristics of docking stations are introduced. Finally, with a linear mixed-effects model, we examine the impacts of environmental factors on annual members' usage of BSS.

BSS Data
In this study, the cycling dataset was offered by Divvy (Motivate, Chicago, IL, USA), which is a BSS in Chicago. It began operating in June 2013. Yearly historical Divvy data are publicly available [42]. The 2015 data include two files: Trips and Stations. There are 3,183,439 cycling trips and 474 docking stations in these two files, respectively. In the Trips file, each trip was anonymized and includes: bike ID, trip start day and time, trip end day and time, trip start station ID, trip end station ID, trip duration, and user type. Divvy has two user types: Subscriber and Customer, representing annual members and casual users, respectively. If a trip was made by a Subscriber, the gender and year of birth of the Subscriber are available. In the Stations file, each station includes: station ID, station name, and station location (longitude/latitude). Figure 1 maps the docking stations of Divvy in Chicago.

Of the 3,183,439 cycling trips in 2015, 2,253,556 were made by annual members (70%) and 929,883 by casual riders (30%). The average duration of all cycling trips was 17.1 min. Specifically, the average duration of annual members' trips was 12.1 min, while the average duration of casual users' trips was 29.2 min; casual users tend to cycle for a longer duration than annual members do. On the other hand, according to an estimate based on field observations [43], the average duration of regular riders' trips in Chicago was roughly 45 min. Compared with this, the average duration of Divvy riders' trips is much shorter. This is mainly because only the first 30 min of a Divvy trip is free and users are encouraged to use Divvy for short trips [41]. Table 1 shows the demographics of cycling trips of Divvy's annual members in 2015. The majority of cycling trips were made by male riders, while female riders tended to cycle for a longer duration. Users aged 26-34 years made almost half of the total cycling trips, while users aged over 64 years tended to cycle for a longer average duration.

We further removed some noisy trips in two steps: (1) We first removed trips with a duration of more than 90 min (twice the average duration of regular riders' trips in Chicago [43]), as trips longer than 90 min are not typical bicycle-sharing rides and could also result from misplacing the bicycle when returning it to the station [41]; (2) We then removed trips originating from or arriving at the docking station with the ID '394', as this docking station is missing from the Stations file and thus its geo-location is unknown.

For the environmental data, data on residents and jobs were downloaded from the US Census Bureau's Longitudinal Employer Household Dynamics (LEHD) synthetic data [44]. Road network data, bicycle lane data, public transit service data, and crime data were downloaded from Chicago's open data portal [45], while land use data were downloaded from the Chicago Metropolitan Agency for Planning [46]. POI data were downloaded from MapQuest via APIs [47]. Data on traffic accidents and congestion were downloaded from Bing Maps REST Services via APIs [48]. We collected one year of geo-referenced and time-stamped records of traffic accidents and congestion, from June 2015 to May 2016. As Bing Maps REST Services do not support downloading historical records, records before June 2015 are unavailable; we therefore used the records from June 2015 to May 2016 to represent traffic accidents and congestion in this study. A three-year dataset of geo-referenced and time-stamped violent crimes from 2013 to 2015 was used to represent violent crimes. Population and employment data were obtained at the census-tract level. The population and employment data for 2014 were used, as they were the most recent available. Also, we used the latest land use data, which were generated from aerial images taken in 2013. The public transit data were obtained from a General Transit Feed Specification (GTFS) dataset that was downloaded from the Chicago Transit Authority (CTA) system, including stops, route patterns, and a full service schedule. The traffic and crime data are resolved to the street level in space and to the second in time.
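The two trip-cleaning steps described earlier in this section (dropping trips longer than 90 min and trips touching station '394') can be sketched as follows. This is only an illustrative sketch: the record layout and the field names `duration_min`, `from_station_id`, and `to_station_id` are assumptions made for the example, not the column names of the Divvy files.

```python
MAX_DURATION_MIN = 90        # cutoff stated in the text
UNKNOWN_STATION_ID = "394"   # station missing from the Stations file


def clean_trips(trips):
    """Keep trips of at most 90 min that neither start nor end at station 394.

    Each trip is a dict with hypothetical keys 'duration_min',
    'from_station_id', and 'to_station_id'.
    """
    kept = []
    for trip in trips:
        # Step 1: drop trips longer than 90 minutes.
        if trip["duration_min"] > MAX_DURATION_MIN:
            continue
        # Step 2: drop trips whose origin or destination has no known location.
        if UNKNOWN_STATION_ID in (trip["from_station_id"], trip["to_station_id"]):
            continue
        kept.append(trip)
    return kept


# Made-up sample records, for illustration only.
sample = [
    {"duration_min": 12.0, "from_station_id": "35", "to_station_id": "91"},
    {"duration_min": 120.0, "from_station_id": "35", "to_station_id": "91"},
    {"duration_min": 15.0, "from_station_id": "394", "to_station_id": "91"},
]
cleaned = clean_trips(sample)
```

Only the first sample record survives both filters in this toy input.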

Cycling Behavior and Investigation Model
In this study, we focused on the cycling behavior of annual members rather than casual riders because: (1) annual members tend to use BSS much more frequently than casual riders; and (2) 70% of the cycling trips were made by annual members. Accordingly, we first partitioned the cycling trips into trips made by annual members and trips made by casual riders, and then counted, separately, the cycling trips of annual members departing from and arriving at each docking station during each one-hour time slot. To characterize the cycling behavior of BSS riders (annual members), we measured the usage of docking stations as an origin or destination by two indices: hourly number of departures and hourly number of arrivals. A docking station's hourly number of departures and hourly number of arrivals represent the total number of cycling trips from and to this station during a one-hour time slot (e.g., 7:00 a.m.-7:59 a.m.) on all workdays in 2015. These two indices are the dependent variables, and we investigated the impact of environmental factors on one of them at a time. Consequently, there are 11,376 records (24 h × 474 stations) of each dependent variable.
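The counting of hourly departures and arrivals described above can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`user_type`, `start_time`, `end_time`, `from_station_id`, `to_station_id`) are hypothetical, and workdays are approximated as Monday to Friday without excluding public holidays.

```python
from collections import Counter
from datetime import datetime


def hourly_usage(trips):
    """Count annual members' workday departures and arrivals per (station, hour)."""
    departures = Counter()
    arrivals = Counter()
    for trip in trips:
        # Only annual members ("Subscriber") are analysed in this study.
        if trip["user_type"] != "Subscriber":
            continue
        start, end = trip["start_time"], trip["end_time"]
        # weekday() returns 0-4 for Monday-Friday (assumed definition of workday).
        if start.weekday() < 5:
            departures[(trip["from_station_id"], start.hour)] += 1
        if end.weekday() < 5:
            arrivals[(trip["to_station_id"], end.hour)] += 1
    return departures, arrivals


# Made-up sample trips, for illustration only.
sample = [
    {"user_type": "Subscriber", "from_station_id": "35", "to_station_id": "91",
     "start_time": datetime(2015, 6, 1, 7, 15), "end_time": datetime(2015, 6, 1, 7, 40)},
    {"user_type": "Subscriber", "from_station_id": "35", "to_station_id": "91",
     "start_time": datetime(2015, 6, 2, 7, 50), "end_time": datetime(2015, 6, 2, 8, 5)},
    {"user_type": "Customer", "from_station_id": "35", "to_station_id": "91",
     "start_time": datetime(2015, 6, 1, 7, 20), "end_time": datetime(2015, 6, 1, 7, 55)},
    {"user_type": "Subscriber", "from_station_id": "35", "to_station_id": "91",
     "start_time": datetime(2015, 6, 6, 7, 10), "end_time": datetime(2015, 6, 6, 7, 30)},
]
departures, arrivals = hourly_usage(sample)
```

Keying the counters by (station, hour) gives one value per station and one-hour slot, which matches the 24 h × 474 stations layout of each dependent variable.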
To quantitatively examine the effects of environmental factors with different data types (numeric and categorical), a linear mixed-effects model (also called a linear mixed model) was employed in this study. Moreover, a multilevel linear mixed model was employed to explicitly recognize the dependencies associated with bicycle flows originating from or arriving at the same station, as a traditional linear regression model is not appropriate for studying data with multiple repeated observations [37]. Additionally, an earlier study [37] compared a linear regression model and a linear mixed model, and its experimental results demonstrated the suitability of the mixed modeling approach for examining the determinants of BSS usage.
The general form of a linear mixed model is:

$$ y = X\beta + Z\gamma + \varepsilon $$

where y is an N × 1 response vector of the outflows or inflows of docking stations; N is the number of observations (24 h × 474 stations); X is an N × p matrix of the p independent variables for the fixed effects; β is a p × 1 fixed-effects vector; Z is an N × q matrix for the q random effects; γ is a q × 1 random-effects vector; and ε is an N × 1 vector of the residuals.
For simplicity, we only considered random intercepts in this study. Accordingly, we assume that, apart from the capacity and the observed environmental factors of the docking stations, some unobserved environmental factors of the docking stations, e.g., building density, steep inclines, or the presence of tourism sites nearby, might influence cyclists' behavior in a way that is not captured in the present data. In this study, the number of groups is equal to the number of docking stations.
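With random intercepts only and one group per docking station, the model for a single observation can be written out as below. This is a sketch in conventional mixed-model notation: the symbols $u_s$, $\sigma_u^2$, and $\sigma^2$, and the normality assumptions, are the usual textbook ones and are introduced here for illustration rather than taken from the text.

$$ y_{st} = \mathbf{x}_{st}^{\top}\boldsymbol{\beta} + u_s + \varepsilon_{st}, \qquad u_s \sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{st} \sim \mathcal{N}(0, \sigma^2) $$

Here $y_{st}$ is the number of departures (or arrivals) at station $s$ in hour slot $t$, $\mathbf{x}_{st}$ collects the independent variables for that station and hour, and $u_s$ is the station-specific intercept that absorbs unobserved station-level influences shared by all 24 hourly observations of the same station.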

Environmental Factors
In this study, we took account of population density, employment density, land use mix, accessibility to POIs (schools, shops, parks, and gyms), road infrastructure, public transit accessibility, road safety and convenience, and public safety. We characterized the environmental factors at the level of the station's surrounding area. Here, the surrounding area of a docking station is defined as a circular buffer around the docking station. An earlier study [41] suggests that a 300-m buffer around each station is an appropriate walking distance, considering the distances between Divvy stations in the city of Chicago [11]. Therefore, we set a radius of 300 m to define the surrounding area for each docking station. Based on the 300-m surrounding area, the environmental variables of a docking station are defined and calculated as follows:

Residential density and employment density are the densities of residents and jobs in the 300-m buffer. As a docking station's buffer might overlap more than one census tract, we combined all overlapping parts of census tracts and the 300-m buffer. Supposing that i is a docking station, we calculated the residential density and employment density of its buffer as:

$$ \text{res\_den}(i) = \frac{\sum_{j \in S(i)} \text{res\_den}_{sub}(i, j) \times \text{Area}_{sub}(i, j)}{\sum_{j \in S(i)} \text{Area}_{sub}(i, j)} $$

$$ \text{emp\_den}(i) = \frac{\sum_{j \in S(i)} \text{emp\_den}_{sub}(i, j) \times \text{Area}_{sub}(i, j)}{\sum_{j \in S(i)} \text{Area}_{sub}(i, j)} $$

where $\text{res\_den}_{sub}(i, j)$ and $\text{emp\_den}_{sub}(i, j)$ represent the residential density and employment density of the overlapping part of census tract j and the buffer, which equal the residential density and employment density of census tract j; $\text{Area}_{sub}(i, j)$ represents the area of the overlapping part of census tract j and the buffer; and S(i) is the set of overlapping parts of census tracts and the buffer.
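Combining the overlapping parts of several census tracts into one buffer-level density can be illustrated with an area-weighted average. The sketch below assumes that the overlap areas have already been computed by a GIS step that is not shown; the function name and the input layout are hypothetical.

```python
def buffer_density(parts):
    """Area-weighted density of a station buffer.

    `parts` is a list of (tract_density, overlap_area) pairs, one per census
    tract overlapping the buffer. Each overlapping part is assumed to have
    the density of the tract it belongs to.
    """
    total_area = sum(area for _, area in parts)
    if total_area <= 0:
        raise ValueError("buffer must overlap at least one tract with positive area")
    weighted_sum = sum(density * area for density, area in parts)
    return weighted_sum / total_area


# Made-up example: two tracts overlap the buffer (densities per km^2, areas in km^2).
example_density = buffer_density([(1000.0, 0.2), (3000.0, 0.1)])
```

The tract covering two-thirds of the overlap area contributes two-thirds of the weight, so the result lies closer to its density than to that of the smaller part.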
Length of roads equals the total length of roads within the 300-m buffer. It is used to measure the density of roads. As the buffers of all docking stations are the same size, the length of roads does not need to be divided by the area of the buffer to represent the level of road density.
Length of bicycle lanes equals the total length of bicycle lanes within the 300-m buffer. It is used to measure the level of cycling facilities.
Land use mix is the mix level of land use in the 300-m buffer. We used an entropy index to describe the level of land use mix [28,49]. The higher the entropy index, the more even the distribution of land use types; in other words, the higher the level of land use mix. Supposing that there are N land use types, the entropy-based land use mix is represented as:

$$ \text{Land use mix} = -\frac{\sum_{t=1}^{N} \frac{LUA(t)}{LUA} \ln \frac{LUA(t)}{LUA}}{\ln N} $$

where LUA(t) represents the area of land use type t in the 300-m buffer, and LUA represents the total area of all of the land use types. In this study, N equals 7. The seven land use types are: commercial, residential, industrial, institutional, other built-in, open space, and others. The entropy-based land use mix is within the range of 0 to 1, with 0 meaning a single land use type (e.g., all residential) and 1 denoting an even distribution of all seven land use types in the 300-m buffer.
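A normalized entropy index of this kind can be computed as in the sketch below. It uses the common Shannon-entropy form divided by ln N, which has the stated properties (0 for a single land use type, 1 for an even split over all N types); the function name is hypothetical, and treating absent land use types as contributing zero is the usual convention rather than something stated in the text.

```python
import math


def land_use_mix(areas, n_types=7):
    """Entropy-based land use mix in [0, 1] for one buffer.

    `areas` holds the area of each land use type in the buffer; types that
    are absent (area 0) contribute nothing to the entropy.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("total land use area must be positive")
    entropy = 0.0
    for area in areas:
        if area > 0:
            share = area / total
            entropy -= share * math.log(share)
    # Dividing by ln(N) rescales the entropy so that an even split gives 1.
    return entropy / math.log(n_types)


single_use = land_use_mix([5.0, 0, 0, 0, 0, 0, 0])   # all residential, say
even_split = land_use_mix([1.0] * 7)                 # seven equal shares
```

A buffer split evenly between only two of the seven types would score ln 2 / ln 7, which is well below 1 because five types are absent.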
Presence of colleges and universities, presence of schools, presence of grocery stores, presence of retail shops, presence of gyms, and presence of parks represent whether there are colleges and universities, schools, grocery stores, retail shops, gyms, or parks within the 300-m buffer. As a large portion of docking stations' 300-m buffers contained no POIs, we used a binary categorical data type instead of the original numeric data type to measure the availability of POIs. Specifically, 'Y' means there is at least one POI of the given type (colleges and universities, schools, grocery stores, retail shops, gyms, or parks) within the 300-m buffer, while 'N' means there are none.
Metro frequency is the total number of metro routes passing all metro stations within the 300-m buffer. As the hourly frequency of each metro route is almost identical and the service time of each metro station is close to 24 h, we only calculated the number of routes to measure metro accessibility at each metro station.
Hourly bus frequency is the hourly number of bus trips passing all bus stops within the 300-m buffer on workdays (Monday to Friday). This was used to measure bus accessibility. Supposing that i is a docking station, its hourly bus frequency is calculated as:

$$ \text{Hourly bus frequency}(i, t) = \sum_{j \in S(i)} \text{ave\_num\_bus\_trip}_{StopHour}(j, t) $$

where $\text{ave\_num\_bus\_trip}_{StopHour}(j, t)$ is the average number of bus trips passing through bus stop j during a one-hour time slot t on workdays, and S(i) is the set of bus stops that are situated within the 300-m buffer of i.
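Summing per-stop averages over the stops inside a buffer can be sketched as follows. The lookup table of average workday bus trips per (stop, hour) is assumed to have been derived from the GTFS schedule in a step not shown here; all names and the sample values are hypothetical.

```python
def hourly_bus_frequency(stops_in_buffer, avg_trips_by_stop_hour, hour):
    """Sum the average workday bus trips at `hour` over all stops in a buffer.

    `avg_trips_by_stop_hour` maps (stop_id, hour) to the average number of
    bus trips passing that stop in that one-hour slot on workdays. Stops with
    no service in the slot are treated as contributing zero.
    """
    return sum(
        avg_trips_by_stop_hour.get((stop_id, hour), 0.0)
        for stop_id in stops_in_buffer
    )


# Made-up values, for illustration only.
avg_trips = {("stop_a", 8): 6.0, ("stop_b", 8): 4.5, ("stop_a", 23): 1.0}
morning = hourly_bus_frequency(["stop_a", "stop_b"], avg_trips, 8)
late = hourly_bus_frequency(["stop_a", "stop_b"], avg_trips, 23)
```

Because the measure varies with the hour slot as well as the station, the same buffer can have a high frequency in the morning and a low one late at night.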
Number of traffic accidents and number of traffic congestions are the hourly number of traffic accidents and congestions within the 300-m buffer on workdays (Monday to Friday), respectively.
Number of on-street violent crimes and number of off-street violent crimes are the hourly number of on-street violent crimes and off-street violent crimes within the 300-m buffer on workdays, respectively.
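The hourly counts of geo-referenced, time-stamped events (traffic accidents, congestion, or violent crimes) within a station's 300-m buffer can be sketched as below. This is an illustration under several assumptions: distances use the haversine formula on a spherical Earth, workdays are taken as Monday to Friday, and counts are simple totals per hour slot; whether the study used totals or averages, and how it computed buffers, is not specified in the text.

```python
import math
from collections import Counter
from datetime import datetime

EARTH_RADIUS_M = 6_371_000.0   # mean Earth radius, spherical approximation
BUFFER_RADIUS_M = 300.0        # buffer radius used in the study


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def hourly_events_in_buffer(station_lat, station_lon, events):
    """Count workday events within 300 m of a station, keyed by hour of day.

    `events` is an iterable of (lat, lon, timestamp) tuples.
    """
    counts = Counter()
    for lat, lon, when in events:
        if when.weekday() >= 5:          # skip Saturday and Sunday
            continue
        if haversine_m(station_lat, station_lon, lat, lon) <= BUFFER_RADIUS_M:
            counts[when.hour] += 1
    return counts


# Made-up events around an arbitrary point in Chicago, for illustration only.
events = [
    (41.8790, -87.6298, datetime(2015, 6, 1, 8, 30)),   # about 100 m away, Monday
    (41.8881, -87.6298, datetime(2015, 6, 1, 8, 45)),   # about 1.1 km away, Monday
    (41.8790, -87.6298, datetime(2015, 6, 6, 8, 10)),   # about 100 m away, Saturday
]
counts = hourly_events_in_buffer(41.8781, -87.6298, events)
```

In this toy input only the first event is both inside the buffer and on a workday, so the 8 a.m. slot receives a single count.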

Results and Discussion
In this study, 11,376 records (24 h × 474 stations) of independent variables (see Table 2) were input into a linear mixed model for annual members' usage of BSS. Table 3 lists the estimation results for the linear mixed-effects model of annual members' usage of BSS. The number of observations is 11,376, which is equal to the number of records, and the number of groups is 474, which is equal to the number of docking stations. In Table 3, the coefficient is the coefficient estimated for each independent variable in the fixed effects; the SE is the standard error for each independent variable; and the p-value indicates the statistical significance for each independent variable. In this study, a p-value below 0.05 means the corresponding independent variable is statistically significantly associated with the dependent variable at the 0.05 level. Moreover, a positive coefficient means the corresponding independent variable is positively associated with the dependent variable, while a negative coefficient means it is negatively associated with the dependent variable. Below, we present and discuss the environmental effects on annual members' usage of BSS.

Note: * means the estimate is not statistically significant at the 0.05 level.

Environmental Effect on Annual Members' Usage of BSS
We first examined the effects of independent variables on annual members' usage of BSS. Unsurprisingly, station capacity is positively and significantly associated with both number of departures and number of arrivals. Regarding time of the day, annual members are more likely to use BSS during PM Peak Hours (3:00 p.m.-5:59 p.m.) and Early Evening Hours (6:00 p.m.-7:59 p.m.), while they are less likely to use BSS during the other times of the day. This coincides with a finding in an earlier study: the BSS in Montreal is more predominantly used during the PM period (3:00 p.m.-7:00 p.m.) relative to other times of the day [37]. Of the socioeconomic factors, both residential density and employment density are positively and significantly associated with both number of departures and number of arrivals. Of the infrastructure factors, length of roads is not significantly associated with number of departures or number of arrivals, while length of bicycle lanes is positively and significantly associated with both number of departures and number of arrivals. None of the land use and POI factors (land use mix, presence of colleges and universities, presence of schools, presence of grocery stores, presence of retail shops, presence of gyms, and presence of parks) is significantly associated with number of departures or number of arrivals. Interestingly, hourly bus frequency is positively and significantly associated with both number of departures and number of arrivals, while metro frequency is negatively and significantly associated with both number of departures and number of arrivals. More interestingly, the two road safety and convenience factors (number of traffic accidents and number of traffic congestions) are not significantly associated with number of departures or number of arrivals, while the two public safety factors (number of on-street violent crimes and number of off-street violent crimes) are negatively and significantly associated with number of arrivals but are not significantly associated with number of departures.

Discussion and Implications for Policies
We further discuss the environmental effects on the usage of BSS, and then try to offer implications for policies. First, of the socioeconomic factors, both residential density and employment density tend to increase the usage of BSS, as they tend to be positively associated with the demand for BSS. Second, of the infrastructure factors, length of roads does not influence the usage of BSS, while length of bicycle lanes tends to increase the usage of BSS. This suggests that an increase in bicycle lanes separated from motor traffic is likely to increase the usage of BSS. Third, the usage of BSS is influenced by none of the land use or POI factors, i.e., land use mix and presence of colleges and universities, schools, grocery stores, retail shops, gyms, and parks. Fourth, of the public transit service factors, hourly bus frequency influences the usage of BSS differently than metro frequency. As mentioned above, the average duration of BSS riders' trips is much shorter than that of regular riders' trips. Most BSS trips tend to be only part of the riders' complete origin-destination trips, considering the short average duration of BSS riders' trips and the lack of docking stations near some workplaces or homes. Therefore, BSS riders might need to use public transport or a car to complete their origin-destination trips. One would thus expect public transit accessibility to be positively associated with the usage of BSS, as some earlier studies found. Surprisingly, this study revealed that metro accessibility influences the usage of BSS in a different way. Specifically, hourly bus frequency is positively associated with the usage of BSS, while metro frequency is negatively associated with the usage of BSS. This indicates that BSS riders are likely to transfer between BSS and bus but are unlikely to transfer between BSS and metro. Fifth, neither number of traffic accidents nor number of traffic congestions influences the usage of BSS. This contradicts our expectation that a large number of traffic accidents or traffic congestions would reduce the usage of BSS. Sixth, both number of on-street violent crimes and number of off-street violent crimes tend to decrease the usage of BSS (specifically, the number of arrivals). This coincides with our expectation that a large number of on-street violent crimes tends to reduce the usage of BSS, and that a large number of off-street violent crimes might reduce the usage of BSS as well.
Finally, we linked the environmental effects with policies to offer some implications for increasing the usage of BSS and for the site selection of new docking stations. Specifically, this study suggests that: (1) investment in cycling facilities tends to increase the usage of BSS; (2) reducing violent crimes is likely to encourage people to use BSS; and (3) residential density, employment density, public transit service, levels of cycling facilities, and the rate of violent crimes should be considered in the site selection of new docking stations.

Conclusions
This study has investigated the impacts of environmental characteristics on the usage of BSS, and tried to offer suggestions for policies aiming to increase the usage of BSS or to choose the sites of new docking stations. Empirical results demonstrated that the density of bicycle lanes, public transit accessibility, and public safety influence the usage of BSS. We obtained answers to the following research questions: (1) neither traffic accidents nor traffic congestion influences the usage of BSS; (2) both on-street violent crimes and off-street violent crimes tend to decrease the usage of BSS; (3) bus accessibility is positively associated with the usage of BSS, while metro accessibility is negatively associated with the usage of BSS. Empirical results also suggested the policy implications that improving bicycle facilities and reducing the rate of violent crimes tend to increase the usage of BSS. Moreover, some environmental factors could be considered in selecting a site for a new docking station.

Limitations
There are some limitations to this study. First, as how often each member uses BSS is unknown, we were not able to assess the annual or monthly average frequency of BSS usage at the individual level. Ideally, individual-level frequency of BSS usage would be known, and we would thus be able to investigate the environmental effects on the cycling behavior of BSS riders at an individual level.
Second, the cycling purpose of BSS usage is unknown. Ideally, the cycling purpose of members' BSS usage would be known, and we could thus examine how BSSs increase commuting and recreational cycling, respectively.

Future Works
In future research, some aspects should be considered for further study. First, route choice modelling could be used to infer the explicit cycling paths based on road networks and cyclists' preferences [22]. Second, to understand how BSS usage may increase with the number of docking stations and operating years, we could explore the monthly variations of BSS usage from 2013 to 2017, as Divvy offers data for several different years. Third, comparing the cycling behavior of BSS riders and private bicycle riders is also interesting. Traffic count data [50] or crowdsourced data from online social networking websites for sports, such as Strava [51,52], could be used as cycling data for private bicycle riders.

Table 2. Independent variables considered in this study.

Table 3. Estimation results for the linear mixed-effects model of annual members' usage of BSS.