# Data-Driven Analysis of Bicycle Sharing Systems as Public Transport Systems Based on a Trip Index Classification

## Abstract

## 1. Introduction

- The creation of a quantitative framework to classify BSS trips as transport or leisure.
- The definition of a distance-based index that builds the basis for this classification of trips.
- The mathematical characterization of the shortest path distance in a BSS, considering the set of shortcuts that bikers can use in their routes.
- The application of this framework to classify trips in a real BSS.
- Statistical and operational analysis to confirm the validity of the obtained results.
- The extraction of the underlying BSS public transportation network.

## 2. Data-driven Classification of Trips

#### 2.1. Starting Premise

#### 2.2. Trip Index

#### 2.3. Spaces, Trajectories and Shortcuts in a BSS

- The linear trajectory, with length ${d}_{L}$, i.e., the direct Euclidean distance between origin and destination, which only depends on the real physical space.
- The retrievable shortest path between origin and destination given the underlying graph, which we will refer to as the orthodox trajectory, with length ${d}_{O}$.
- The shortest path between origin and destination using the set of available shortcuts, namely the heterodox trajectory, with length ${d}_{H}$.
- The actual path of the trip traveled by the user, with length ${d}_{p}$.
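
Two of the four lengths above, ${d}_{L}$ and ${d}_{p}$, follow directly from coordinates and can be illustrated with a minimal sketch. It assumes that points are already projected to planar coordinates (for example, metres in a local grid), so that the Euclidean distance applies. The function names `linear_length` and `path_length` are hypothetical and not part of the original framework. The lengths ${d}_{O}$ and ${d}_{H}$ require the underlying graph and the set of shortcuts, so they are not covered here.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # assumed planar (x, y) coordinates, e.g. in metres


def linear_length(origin: Point, destination: Point) -> float:
    """d_L: direct Euclidean distance between origin and destination."""
    return math.dist(origin, destination)


def path_length(track: Sequence[Point]) -> float:
    """d_p: length of the travelled path, summed over consecutive track points."""
    return sum(math.dist(p, q) for p, q in zip(track, track[1:]))


# Hypothetical track with three samples forming a right angle.
track = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
d_L = linear_length(track[0], track[-1])
d_p = path_length(track)
```

By the triangle inequality, ${d}_{L}\le {d}_{p}$ for any track that starts at the origin and ends at the destination.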

#### 2.4. Characterizing the Shortest Path in a BSS Trip

## 3. Application to a Real BSS

#### 3.1. Dataset

- Time stamp: pick-up time with a 1-hour resolution, for privacy and anonymity reasons.
- User’s identifier: unique encrypted identifier of the user, refreshed daily.
- Type of user: annual, occasional, or staff.
- User’s age range: six intervals, [0, 16], [17, 18], [19, 26], [27, 40], [41, 65], and [66, ∞), plus unknown.
- Identifier of the origin docking station.
- Identifier of the destination docking station.
- Travel time: time from pick up to drop off.
- Track: collection of geographical coordinates ordered in time recorded on a 1-minute basis during the trip.
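
The fields listed above can be pictured as a single record per trip. The following is only a sketch: the class name `TripRecord`, the field names, and the chosen types are assumptions for illustration and do not reflect the actual schema of the dataset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class TripRecord:
    """Hypothetical container mirroring the fields of one trip in the dataset."""

    pickup_hour: datetime              # time stamp, reported at a 1-hour resolution
    user_id: str                       # encrypted identifier, refreshed daily
    user_type: str                     # one of the user types listed above
    age_range: str                     # one of the age intervals, or "unknown"
    origin_station: str                # identifier of the origin docking station
    destination_station: str           # identifier of the destination docking station
    travel_time: timedelta             # time from pick-up to drop-off
    track: List[Tuple[float, float]]   # time-ordered coordinates, about one per minute


# Made-up example values, for illustration only.
trip = TripRecord(
    pickup_hour=datetime(2000, 1, 1, 9),
    user_id="a1b2c3",
    user_type="annual",
    age_range="27-40",
    origin_station="S001",
    destination_station="S002",
    travel_time=timedelta(minutes=12),
    track=[(0.0, 0.0), (0.001, 0.001), (0.002, 0.002)],
)
```

Because the user identifier is refreshed daily, records of the same person on different days cannot be linked through `user_id`.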

#### 3.2. Applying the Mathematical Framework to the Dataset

#### 3.2.1. Preprocessing

#### 3.2.2. Calculation of the Trip Index

#### 3.3. Results of the Classification of BSS Trips

#### 3.4. Validation of the Results

#### 3.4.1. Statistical Analysis

#### 3.4.2. Operational Analysis

## 4. Underlying BSS Public Transport Network

## 5. Conclusions and Future Research

**Figure 2.** In blue: trip indexes sorted in ascending order; in red: the tangent line obtained by the elbow method.

**Figure 3.** Statistical characterization of trips: leisure (${\alpha}_{p}<{\alpha}^{*}$) on the left, and transport (${\alpha}_{p}\ge {\alpha}^{*}$) on the right.
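
The split used in this caption is a simple threshold rule on the trip index. As a minimal sketch, the function below applies it to a list of indexes. The name `classify_trip`, the sample values, and the threshold `alpha_star = 0.5` are hypothetical placeholders, not values taken from the study.

```python
from typing import Dict, List


def classify_trip(alpha_p: float, alpha_star: float) -> str:
    """Label a trip as transport when its index reaches the threshold, else leisure."""
    return "transport" if alpha_p >= alpha_star else "leisure"


def split_trips(indexes: List[float], alpha_star: float) -> Dict[str, List[float]]:
    """Group trip indexes by the label assigned by classify_trip."""
    groups: Dict[str, List[float]] = {"leisure": [], "transport": []}
    for alpha_p in indexes:
        groups[classify_trip(alpha_p, alpha_star)].append(alpha_p)
    return groups


# Hypothetical indexes and threshold, for illustration only.
groups = split_trips([0.2, 0.5, 0.9], alpha_star=0.5)
```

Note that a trip whose index equals the threshold exactly falls on the transport side, matching the non-strict inequality ${\alpha}_{p}\ge {\alpha}^{*}$.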

| Trip class | Min. | Max. | Mean | Std. Dev. |
|---|---|---|---|---|
| Leisure (${\alpha}_{p}<{\alpha}^{*}$) | 0.3 | 47.9 | 4.5 | 3.0 |
| Transport (${\alpha}_{p}\ge {\alpha}^{*}$) | 0.1 | 10.0 | 2.4 | 1.2 |

| Trip class | Min. | Max. | Mean | Std. Dev. |
|---|---|---|---|---|
| Leisure (${\alpha}_{p}<{\alpha}^{*}$) | 00:02:22 | 05:57:57 | 00:37:20 | 00:38:22 |
| Transport (${\alpha}_{p}\ge {\alpha}^{*}$) | 00:00:58 | 05:59:26 | 00:11:56 | 00:09:52 |

| Trip class | Min. | Max. | Mean | Std. Dev. |
|---|---|---|---|---|
| Leisure (${\alpha}_{p}<{\alpha}^{*}$) | 0.3 | 28.7 | 9.4 | 3.9 |
| Transport (${\alpha}_{p}\ge {\alpha}^{*}$) | 0.2 | 29.4 | 12.9 | 3.3 |

| Network | Total | Leisure | Transport |
|---|---|---|---|
| Trips | $139\,956$ | $13\,156$ | $126\,800$ |
| Order | 169 | 169 | 169 |
| Size | $22\,750$ | $7\,617$ | $21\,947$ |
| Density | 0.80 | 0.27 | 0.77 |
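
The density values in this table can be checked from the order and size rows. The sketch below assumes a directed graph without self-loops, for which density is the number of edges divided by $n(n-1)$. This convention is an assumption, since it is not stated alongside the table, but it reproduces the reported figures. The function name `directed_density` is hypothetical.

```python
def directed_density(order: int, size: int) -> float:
    """Density of a directed graph without self-loops: size / (order * (order - 1))."""
    return size / (order * (order - 1))


# Order and size taken from the table above.
networks = {"total": 22750, "leisure": 7617, "transport": 21947}
densities = {
    name: round(directed_density(169, size), 2) for name, size in networks.items()
}
```

With 169 stations there are $169\cdot 168=28\,392$ possible directed links, so the transport network alone covers most of them while the leisure network covers roughly a quarter.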

