This section introduces the BSS data and environmental data employed, and then describes the environmental characteristics of docking stations. Finally, using a linear mixed-effects model, we examine the impacts of environmental factors on annual members’ usage of BSS.
2.2. Cycling Behaviour and Investigation Model
In this study, we focused on the cycling behavior of annual members rather than casual riders because: (1) annual members tend to use BSS much more frequently than casual riders; and (2) 70% of the cycling trips were made by annual members. Accordingly, we first partitioned the cycling trips into trips made by annual members and trips made by casual riders, and then counted the cycling trips of annual members departing from or arriving at each docking station during a one-hour time slot separately. To characterize the cycling behavior of BSS riders (annual members), we measured the usage of docking stations as an origin or destination by two indices: hourly number of departures and hourly number of arrivals. A docking station’s hourly number of departures and hourly number of arrivals represent the total number of cycling trips from and to this station during a one-hour time slot (e.g., 7:00 a.m.–7:59 a.m.) on all workdays in 2015. Specifically, dependent variables are hourly number of departures and hourly number of arrivals. In this study, we investigated the impact of environmental factors on one of these two dependent variables each time. Consequently, there are 11,376 records (24 h × 474 stations) of each dependent variable.
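As an illustration, the counting step described above can be sketched as follows. This is a minimal sketch rather than the processing code used in the study: the record fields (`user_type`, `from_station`, `to_station`, `start_time`, `end_time`) are hypothetical, and workdays are approximated as Monday to Friday without removing public holidays.

```python
from collections import Counter
from datetime import datetime


def hourly_station_counts(trips):
    """Count annual members' workday departures and arrivals per (station, hour).

    `trips` is an iterable of dicts with hypothetical keys:
    user_type, from_station, to_station, start_time, end_time.
    """
    departures = Counter()
    arrivals = Counter()
    for trip in trips:
        # Keep only trips made by annual members.
        if trip["user_type"] != "annual_member":
            continue
        start, end = trip["start_time"], trip["end_time"]
        # weekday() is 0-4 for Monday-Friday (assumption: holidays are not excluded).
        if start.weekday() < 5:
            departures[(trip["from_station"], start.hour)] += 1
        if end.weekday() < 5:
            arrivals[(trip["to_station"], end.hour)] += 1
    return departures, arrivals


# Illustrative records only; these are not data from the study.
example_trips = [
    {"user_type": "annual_member", "from_station": 1, "to_station": 2,
     "start_time": datetime(2015, 3, 2, 7, 10), "end_time": datetime(2015, 3, 2, 7, 25)},
    {"user_type": "annual_member", "from_station": 1, "to_station": 3,
     "start_time": datetime(2015, 3, 3, 7, 50), "end_time": datetime(2015, 3, 3, 8, 5)},
    {"user_type": "casual", "from_station": 1, "to_station": 2,
     "start_time": datetime(2015, 3, 2, 7, 15), "end_time": datetime(2015, 3, 2, 7, 40)},
    {"user_type": "annual_member", "from_station": 1, "to_station": 2,
     "start_time": datetime(2015, 3, 7, 7, 10), "end_time": datetime(2015, 3, 7, 7, 25)},
]
example_departures, example_arrivals = hourly_station_counts(example_trips)
```

Each key is a (station, hour) pair, so a complete table has one record per station and hour for each of the two dependent variables.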
To quantitatively examine the effects of environmental factors of different data types (numeric and categorical), we employed a linear mixed-effects model (also called a linear mixed model). Specifically, a multilevel linear mixed model was used to explicitly recognize the dependencies among bicycle flows originating from or arriving at the same station, as a traditional linear regression model is not appropriate for data with multiple repeated observations [37]. Additionally, an earlier study [37] compared a linear regression model and a linear mixed model, and its experimental results demonstrated the suitability of the mixed modeling approach for examining the determinants of BSS usage.
The general form of a linear mixed model is:

$$ y = X\beta + Z\gamma + \varepsilon $$

where $y$ is an $N \times 1$ response vector of the outflows or inflows of docking stations; $N$ is the number of observations (24 h × 474 stations); $X$ is an $N \times p$ matrix of the $p$ independent variables for the fixed effects; $\beta$ is a $p \times 1$ fixed-effects vector; $Z$ is an $N \times q$ matrix for the $q$ random effects; $\gamma$ is a $q \times 1$ random-effects vector; and $\varepsilon$ is an $N \times 1$ vector of the residuals.
For simplicity, we only considered random intercepts in this study. Accordingly, we assume that, apart from the capacity and the observed environmental factors of the docking stations, other unobserved environmental factors, e.g., building density, steep inclines, or the presence of tourism sites nearby, might influence cyclists’ behavior in a way that is not captured in the present data. In this study, each docking station forms one group, so the number of groups is equal to the number of docking stations.
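With random intercepts only and one group per docking station, the general form can be written out per observation. The following is a sketch in notation introduced here for illustration: the symbols $u_i$, $\sigma_u^2$, and $\sigma^2$, and the normality assumptions, follow the usual textbook convention and are not stated in the text.

$$ y_{ih} = x_{ih}^{\top}\beta + u_i + \varepsilon_{ih}, \qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ih} \sim \mathcal{N}(0, \sigma^2) $$

Here $y_{ih}$ is the hourly number of departures (or arrivals) at docking station $i$ in hour $h$, $x_{ih}$ is the corresponding row of $X$, and $u_i$ is the station-specific random intercept. In this case $q$ equals the number of docking stations (474), and $Z$ is an $N \times 474$ indicator matrix whose row for observation $(i, h)$ has a single 1 in column $i$.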
2.3. Environmental Factors
In this study, we took account of population density, employment density, land use mix, accessibility to POIs (schools, shops, parks and gyms), road infrastructure, public transit accessibility, road safety and convenience, and public safety.
Table 2 lists the independent variables, including station capacity, time of the day, and environmental variables.
Station capacity is the capacity of each docking station.
Time of the day is classified into seven categories: Very Early AM Hours (12:00 a.m.–3:59 a.m.), Early AM Hours (4:00 a.m.–5:59 a.m.), AM Peak Hours (6:00 a.m.–8:59 a.m.), Mid-Day Hours (9:00 a.m.–2:59 p.m.), PM Peak Hours (3:00 p.m.–5:59 p.m.), Early Evening Hours (6:00 p.m.–7:59 p.m.), and Late Evening Hours (8:00 p.m.–11:59 p.m.).
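The mapping from a one-hour time slot to its time-of-day category can be sketched as below. This is an illustrative sketch, not the study’s code; it assumes that the first category starts at midnight (hour 0 on a 24-h clock), and the function and constant names are hypothetical.

```python
# (first_hour, last_hour, label) on a 24-h clock; both bounds are inclusive.
TIME_OF_DAY = [
    (0, 3, "Very Early AM Hours"),
    (4, 5, "Early AM Hours"),
    (6, 8, "AM Peak Hours"),
    (9, 14, "Mid-Day Hours"),
    (15, 17, "PM Peak Hours"),
    (18, 19, "Early Evening Hours"),
    (20, 23, "Late Evening Hours"),
]


def time_of_day(hour: int) -> str:
    """Return the time-of-day category of the slot starting at `hour` (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    for first, last, label in TIME_OF_DAY:
        if first <= hour <= last:
            return label
    raise AssertionError("unreachable: the categories cover all 24 hours")
```

For example, the 7:00 a.m.–7:59 a.m. slot corresponds to `time_of_day(7)`, which falls in the AM Peak Hours category.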
We characterized the environmental factors at the level of each station’s surrounding area, defined as a circular buffer around the docking station. An earlier study [41] found a 300-m buffer around each station to be an appropriate walking distance, considering the distances between Divvy stations in the city of Chicago [11]. Therefore, we set a radius of 300 m to define the surrounding area of each docking station.
Based on the 300-m surrounding area, the environmental variables of each docking station are defined and calculated as follows:
Residential density and employment density are the densities of residents and jobs in the 300-m buffer. As a docking station’s buffer might overlap more than one census tract, we combined all overlapping parts of census tracts and the 300-m buffer. Supposing that $i$ is a docking station, we calculated the residential density and employment density of its buffer as:

$$ RD_i = \frac{\sum_{j \in S_i} RD_j \, A_j}{\sum_{j \in S_i} A_j}, \qquad ED_i = \frac{\sum_{j \in S_i} ED_j \, A_j}{\sum_{j \in S_i} A_j} $$

where $RD_j$ and $ED_j$ represent the residential density and employment density of the overlapping part of census tract $j$ and the buffer, equaling the residential density and employment density of census tract $j$; $A_j$ represents the area of the overlapping part of census tract $j$ and the buffer; and $S_i$ is the set of overlapping parts of census tracts and the buffer.
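One way to combine the overlapping parts is an area-weighted average of the tract densities, sketched below. The weighting scheme is an assumption made for illustration, as are the function name and the input layout (a list of density and overlap-area pairs); the overlap areas themselves would come from a GIS intersection that is not shown here.

```python
def buffer_density(parts):
    """Area-weighted density of a buffer.

    `parts` is a sequence of (tract_density, overlap_area) pairs, one per
    census tract that overlaps the buffer (hypothetical input layout).
    """
    parts = list(parts)
    total_area = sum(area for _, area in parts)
    if total_area <= 0:
        raise ValueError("the total overlapping area must be positive")
    # Each tract contributes its density in proportion to its overlap area.
    weighted = sum(density * area for density, area in parts)
    return weighted / total_area


# Illustrative values only: two tracts overlapping one buffer
# (densities in persons per km^2, areas in km^2).
example_density = buffer_density([(1000.0, 0.2), (3000.0, 0.1)])
```

The same function applies to both residential and employment density, since only the per-tract density values differ.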
Length of roads equals the total length of roads within the 300-m buffer. It is used to measure the density of roads. As the buffers of all docking stations are the same size, the length of roads does not need to be divided by the area of the buffer to represent the level of road density.
Length of bicycle lanes equals the total length of bicycle lanes within the 300-m buffer. It is used to measure the level of cycling facilities.
Land use mix is the mix level of land use in the 300-m buffer. We used an entropy index to describe the level of land use mix [28,49]. The higher the entropy index, the more even the distribution of land use types; in other words, the higher the level of land use mix. Supposing that there are $N$ land use types, the entropy-based land use mix is represented as:

$$ LUM = -\frac{1}{\ln N} \sum_{t=1}^{N} \frac{LUA(t)}{LUA} \ln \frac{LUA(t)}{LUA} $$

where $LUA(t)$ represents the area of land use type $t$ in the 300-m buffer, and $LUA$ represents the total area of all of the land use types. In this study, $N$ equals 7. The seven land use types are: commercial, residential, industrial, institutional, other built-in, open space, and others. The entropy-based land use mix is within the range of 0 to 1, with 0 meaning a single land use type (e.g., all residential) and 1 denoting the even distribution of all seven land use types in the 300-m buffer.
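A minimal sketch of a normalized entropy index with the stated 0-to-1 range is given below. It assumes the common normalization by the logarithm of the number of land use types and treats land use types with zero area as contributing nothing to the sum; the function name is hypothetical.

```python
import math


def land_use_mix(areas):
    """Normalized entropy of land use areas within one buffer.

    `areas` holds one area per land use type (seven values in this study).
    Returns 0 for a single land use type and 1 for an even distribution.
    """
    areas = list(areas)
    n_types = len(areas)
    total = sum(areas)
    if n_types < 2 or total <= 0:
        raise ValueError("need at least two land use types and a positive total area")
    entropy = 0.0
    for area in areas:
        if area > 0:  # types with zero area contribute nothing (0 * ln 0 := 0)
            share = area / total
            entropy -= share * math.log(share)
    return entropy / math.log(n_types)
```

With seven equal areas the index is 1, and with all of the area in one type it is 0, matching the two extremes described above.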
Presence of colleges and universities, presence of schools, presence of grocery stores, presence of retail shops, presence of gyms, and presence of parks represent whether there are colleges and universities, schools, grocery stores, retail shops, gyms, or parks within the 300-m buffer. As a large portion of docking stations’ 300-m buffers contained no POIs, we used a binary categorical data type instead of the original numeric data type to measure the availability of POIs. Specifically, ‘Y’ means there are colleges and universities, schools, grocery stores, retail shops, gyms, or parks within the 300-m buffer, while ‘N’ means there are none.
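The binary coding can be illustrated with a simple distance test. The sketch below assumes that stations and POIs are given as latitude and longitude pairs and that great-circle (haversine) distance is an adequate stand-in for the buffer test; the study does not state how buffer membership was computed, and all names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres
BUFFER_RADIUS_M = 300.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def poi_presence(station, pois, radius_m=BUFFER_RADIUS_M):
    """Return 'Y' if any POI of one type lies within the buffer, else 'N'."""
    lat, lon = station
    inside = any(
        haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m
        for poi_lat, poi_lon in pois
    )
    return "Y" if inside else "N"
```

The function is called once per POI type, so each docking station receives six ‘Y’/‘N’ values.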
Metro frequency is the total number of metro routes passing through all metro stations within the 300-m buffer. As the hourly frequency of each metro route is almost identical and the service time of each metro station is close to 24 h, we only counted the number of routes to measure metro accessibility at each metro station.
Hourly bus frequency is the hourly number of bus trips passing all bus stops within the 300-m buffer on workdays (Monday to Friday). This was used to measure bus accessibility. Supposing that $i$ is a docking station, its hourly bus frequency is calculated as:

$$ HBF_i(t) = \sum_{j \in B_i} f_j(t) $$

where $f_j(t)$ is the average number of bus trips passing through the bus stop $j$ during a one-hour time slot $t$ on workdays, and $B_i$ is the set of bus stops that are situated within the 300-m buffer of $i$.
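The summation over bus stops can be sketched as follows. The input layout (a dictionary keyed by stop and hour) and the names are hypothetical, and stops with no recorded trips in a time slot are assumed to contribute zero.

```python
def hourly_bus_frequency(stops_in_buffer, avg_trips, hour):
    """Sum the average workday bus trips over all stops in a station's buffer.

    `stops_in_buffer`: identifiers of the bus stops within the 300-m buffer.
    `avg_trips`: dict mapping (stop_id, hour) to the average number of bus
    trips passing that stop during that one-hour slot on workdays.
    """
    # Missing (stop, hour) pairs count as zero trips (assumption).
    return sum(avg_trips.get((stop, hour), 0.0) for stop in stops_in_buffer)


# Illustrative values only; these are not data from the study.
example_avg_trips = {("A", 7): 12.0, ("B", 7): 8.5, ("A", 8): 10.0}
example_frequency = hourly_bus_frequency({"A", "B"}, example_avg_trips, 7)
```

Evaluating the function for each of the 24 one-hour slots gives the hourly profile of bus accessibility for one docking station.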
Number of traffic accidents and number of traffic congestions are the hourly number of traffic accidents and congestions within the 300-m buffer on workdays (Monday to Friday), respectively.
Number of on-street violent crimes and number of off-street violent crimes are the hourly number of on-street violent crimes and off-street violent crimes within the 300-m buffer on workdays, respectively.