# Bicycle Speed Modelling Considering Cyclist Characteristics, Vehicle Type and Track Attributes

## Abstract

## 1. Introduction

## 2. Methodology

#### 2.1. Study Logic and Technology Pathway

#### 2.1.1. Study Logic

#### 2.1.2. Technology Pathway

#### 2.2. Clustering Algorithm and Validation

#### 2.2.1. K-Means Clustering

#### 2.2.2. Determining the Optimal Number of Clusters

#### 2.2.3. Clustering Validation

#### 2.3. Distribution Fitting

#### 2.3.1. Probability Distribution Models for Fitting

#### 2.3.2. Model Test and Selection

#### Kolmogorov–Smirnov Test

#### AIC and BIC

## 3. Data Preparation and Description

## 4. Clustering Results

#### 4.1. Optimal Number of Clusters

#### 4.2. Validity of Clustering

## 5. Distribution Fitting Results for Speed Clusters

#### 5.1. Speed Distribution of Clusters

- According to the K-S test results, the first seven distributions were suitable for fitting the data of all three clusters, whereas the uniform, Rayleigh, GP, and exponential distributions were not. The Nakagami, Rician, normal, and logistic distributions remained uncertain because each failed the K-S test for at least one of the three clusters.
- Considering the sum and the variance of the three rankings together, we recommend the GEV, Gamma, and lognormal distributions as the top three candidates for fitting the three clusters of the speed data set.
- Moreover, the Tlocationscale, Gamma, and GEV distributions performed best in fitting the data from Clusters 1, 2, and 3, respectively.
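
The sum-and-variance screening described above can be sketched as follows. This is a minimal illustration, not the authors' code: the per-cluster ranks are copied from the ranking table in this paper, and the use of the population variance is an assumption that happens to reproduce the tabulated values.

```python
from statistics import pvariance

# Per-cluster ranks (Cluster 1, 2, 3) copied from the ranking table in this paper.
ranks = {
    "GEV": (6, 3, 1),
    "Gamma": (8, 1, 2),
    "Lognormal": (3, 4, 6),
    "Birnbaumsaunders": (5, 5, 4),
}


def summarize(rank_triplet):
    """Return (sum of ranks, population variance of ranks rounded to 2 decimals)."""
    return sum(rank_triplet), round(pvariance(rank_triplet), 2)


summary = {name: summarize(r) for name, r in ranks.items()}

# Order by rank sum first, then by variance (lower is better for both).
for name, (total, var) in sorted(summary.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} sum={total:2d} var={var:.2f}")
```

A low sum indicates good overall performance, while a low variance indicates that the distribution performs consistently across clusters.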

#### 5.2. Discussion on Best-Fit Distribution

- Type Ⅰ: distributions whose tails decrease exponentially, when the shape parameter k equals zero (light blue line).
- Type Ⅱ: distributions whose tails decrease as a polynomial, when k is greater than zero (yellow line).
- Type Ⅲ: distributions whose tails are finite, when k is less than zero (red line).
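
The three cases above can be illustrated with a short sketch. It assumes the common (k, σ, θ) parameterization of the GEV density, f(x) = (1/σ) t(x)^(k+1) e^(−t(x)) with t(x) = (1 + k(x − θ)/σ)^(−1/k) for k ≠ 0 and t(x) = e^(−(x − θ)/σ) for k = 0. The function names are hypothetical and chosen only for this example.

```python
import math


def gev_tail_type(k):
    """Classify the GEV tail by the sign of the shape parameter k."""
    if k == 0:
        return "Type I (exponential tail)"
    return "Type II (polynomial tail)" if k > 0 else "Type III (finite tail)"


def gev_upper_bound(k, sigma, theta):
    """Upper end of the support; finite only when k < 0 (Type III)."""
    return theta - sigma / k if k < 0 else math.inf


def gev_pdf(x, k, sigma, theta):
    """GEV density in the assumed (k, sigma, theta) parameterization."""
    z = (x - theta) / sigma
    if k == 0:
        t = math.exp(-z)
    else:
        base = 1 + k * z
        if base <= 0:  # x lies outside the support
            return 0.0
        t = base ** (-1 / k)
    return t ** (k + 1) * math.exp(-t) / sigma


# Example with the GEV parameters reported for the first cluster.
print(gev_tail_type(-0.04))
print(gev_upper_bound(-0.04, 0.78, 3.66))
```

Because all three fitted shape parameters reported in this paper are negative, each fitted distribution falls under Type Ⅲ and has a finite upper speed limit under this parameterization.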

## 6. Conclusions

- The 48 initial bicycle speed sub-clusters, generated by combining bicycle type, lateral position, gender, age, and lane width, were ultimately grouped into three clusters.
- Among the common distributions, GEV, Gamma, and lognormal were the top three models for fitting the three clusters of the speed data set.
- Considering both stability and overall performance, GEV was the best-fit distribution for bicycle speed. With parameters ordered as (k, σ, θ), the speeds of the three clusters followed GEV(−0.04, 0.78, 3.66), GEV(−0.17, 1.03, 4.81), and GEV(−0.18, 1.42, 6.00), respectively.
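
As an illustration of how the fitted parameters might be used, the sketch below inverts the GEV cumulative distribution function F(x) = exp(−t(x)) to obtain speed percentiles. It assumes the parameter order (k, σ, θ) given in Appendix B; the function name `gev_quantile` is hypothetical.

```python
import math


def gev_quantile(p, k, sigma, theta):
    """Speed below which a fraction p of cyclists travel, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if k == 0:
        return theta - sigma * math.log(-math.log(p))
    return theta + sigma * ((-math.log(p)) ** (-k) - 1) / k


# Fitted (k, sigma, theta) for the three clusters, as reported above.
clusters = {
    1: (-0.04, 0.78, 3.66),
    2: (-0.17, 1.03, 4.81),
    3: (-0.18, 1.42, 6.00),
}

for cid, (k, sigma, theta) in clusters.items():
    v50 = gev_quantile(0.50, k, sigma, theta)
    v85 = gev_quantile(0.85, k, sigma, theta)
    print(f"Cluster {cid}: median = {v50:.2f} m/s, 85th percentile = {v85:.2f} m/s")
```

The resulting percentiles are model-based estimates and can be compared with the empirical speed statistics of each cluster.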

## Appendix A

Distribution | Probability Density Function | Parameters |
---|---|---|
Birnbaumsaunders | $\left(\frac{\sqrt{\frac{x-\mu}{\beta}}+\sqrt{\frac{\beta}{x-\mu}}}{2\gamma \left(x-\mu \right)}\right)\varphi \left(\frac{\sqrt{\frac{x-\mu}{\beta}}-\sqrt{\frac{\beta}{x-\mu}}}{\gamma}\right);x>\mu $ | μ: location parameter; β: scale parameter, β > 0; γ: shape parameter, γ > 0; φ: standard normal density |
Ev | $\frac{1}{\sigma}{e}^{\frac{x-\mu}{\sigma}}{e}^{-{e}^{\frac{x-\mu}{\sigma}}}$ | μ: location parameter; σ: scale parameter, σ > 0 |
Exponential | $\frac{1}{\theta}{e}^{-\frac{x}{\theta}};x\ge 0$ | θ: scale parameter (mean), θ > 0 |
Gamma | $\frac{1}{\mathsf{\Gamma}\left(\alpha \right){\beta}^{\alpha}}{x}^{\alpha -1}{e}^{-\frac{x}{\beta}};x>0$ | α: shape parameter, α > 0; β: scale parameter, β > 0 |
Gev | $\frac{1}{\sigma}t{\left(x\right)}^{k+1}{e}^{-t\left(x\right)};t\left(x\right)=\begin{cases}{\left(1+k\frac{x-\theta}{\sigma}\right)}^{-\frac{1}{k}}, & k\ne 0\\ {e}^{-\frac{x-\theta}{\sigma}}, & k=0\end{cases}$ | k: shape parameter; σ: scale parameter, σ > 0; θ: location parameter |
Gp | $\begin{cases}\frac{1}{\sigma}{\left(1+k\frac{x-\theta}{\sigma}\right)}^{-1-\frac{1}{k}}, & k\ne 0\\ \frac{1}{\sigma}{e}^{-\frac{x-\theta}{\sigma}}, & k=0\end{cases}$ | k: shape parameter; σ: scale parameter, σ > 0; θ: location parameter |
Inversegaussian | $\sqrt{\frac{\lambda}{2\pi {x}^{3}}}{e}^{-\frac{\lambda {\left(x-\mu \right)}^{2}}{2{\mu}^{2}x}};x>0$ | μ: scale parameter, μ > 0; λ: shape parameter, λ > 0 |
Logistic | $\frac{1}{\beta}\frac{{e}^{-\frac{x-\mu}{\beta}}}{{\left[1+{e}^{-\frac{x-\mu}{\beta}}\right]}^{2}}$ | μ: mean; β: scale parameter, β > 0 |
Loglogistic | $\frac{1}{\sigma}\frac{1}{x}\frac{{e}^{z}}{{\left(1+{e}^{z}\right)}^{2}};z=\frac{\mathrm{log}x-\mu}{\sigma},x>0$ | μ: mean of logarithmic values, μ > 0; σ: scale parameter of logarithmic values, σ > 0 |
Lognormal | $\frac{1}{\sqrt{2\pi}\sigma}\frac{1}{x}{e}^{-\frac{{\left(\mathrm{log}x-\mu \right)}^{2}}{2{\sigma}^{2}}};x>0$ | μ: mean of logarithmic values; σ: standard deviation of logarithmic values, σ > 0 |
Nakagami | $2{\left(\frac{\mu}{\omega}\right)}^{\mu}\frac{1}{\mathsf{\Gamma}\left(\mu \right)}{x}^{\left(2\mu -1\right)}{e}^{-\frac{\mu}{\omega}{x}^{2}};x>0$ | μ: shape parameter, μ > 0; ω: scale parameter, ω > 0 |
Normal | $\frac{1}{\sqrt{2\pi}\sigma}{e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma}^{2}}}$ | μ: mean; σ: standard deviation, σ > 0 |
Rayleigh | $\frac{x}{{b}^{2}}{e}^{-\frac{{x}^{2}}{2{b}^{2}}};x>0$ | b: scale parameter, b > 0 |
Rician | ${I}_{0}\left(\frac{xs}{{\sigma}^{2}}\right)\frac{x}{{\sigma}^{2}}{e}^{-\frac{{x}^{2}+{s}^{2}}{2{\sigma}^{2}}};x>0$ | s: noncentrality parameter, s ≥ 0; σ: scale parameter, σ > 0; I₀: modified Bessel function of the first kind of order zero |
Tlocationscale | $\frac{\mathsf{\Gamma}\left(\frac{\nu +1}{2}\right)}{\sigma \sqrt{\nu \pi}\mathsf{\Gamma}\left(\frac{\nu}{2}\right)}{\left(\frac{\nu +{\left(\frac{x-\mu}{\sigma}\right)}^{2}}{\nu}\right)}^{-\frac{\nu +1}{2}}$ | μ: location parameter; σ: scale parameter, σ > 0; ν: shape parameter, ν > 0 |
Uniform | $\frac{1}{b-a};a\le x\le b$ | a: lower parameter; b: upper parameter |
Weibull | $\frac{b}{a}{\left(\frac{x}{a}\right)}^{b-1}{e}^{-{\left(\frac{x}{a}\right)}^{b}};x>0$ | a: scale parameter, a > 0; b: shape parameter, b > 0 |

Number | Gender | Age | Bicycle Type | Lane Width | Lateral Position | Cluster |
---|---|---|---|---|---|---|
1 | Female | >40 years | CB | ≤3.5 m | right | 1 |
2 | Female | >40 years | CB | >3.5 m | right | 1 |
3 | Female | ≤40 years | CB | ≤3.5 m | right | 1 |
4 | Female | ≤40 years | CB | >3.5 m | right | 1 |
5 | Male | >40 years | CB | ≤3.5 m | right | 1 |
6 | Male | >40 years | CB | >3.5 m | right | 1 |
7 | Male | ≤40 years | CB | ≤3.5 m | right | 1 |
8 | Male | ≤40 years | CB | >3.5 m | right | 1 |
9 | Female | >40 years | CB | ≤3.5 m | center | 1 |
10 | Female | >40 years | CB | >3.5 m | center | 1 |
11 | Female | ≤40 years | CB | ≤3.5 m | center | 1 |
12 | Female | ≤40 years | CB | >3.5 m | center | 1 |
13 | Male | >40 years | CB | ≤3.5 m | center | 1 |
14 | Male | >40 years | CB | >3.5 m | center | 1 |
15 | Male | ≤40 years | CB | ≤3.5 m | center | 1 |
16 | Male | ≤40 years | CB | >3.5 m | center | 2 |
17 | Female | >40 years | CB | ≤3.5 m | left | 2 |
18 | Female | >40 years | CB | >3.5 m | left | 2 |
19 | Female | ≤40 years | CB | ≤3.5 m | left | 2 |
20 | Female | ≤40 years | CB | >3.5 m | left | 2 |
21 | Male | >40 years | CB | ≤3.5 m | left | 2 |
22 | Male | >40 years | CB | >3.5 m | left | 2 |
23 | Male | ≤40 years | CB | ≤3.5 m | left | 2 |
24 | Male | ≤40 years | CB | >3.5 m | left | 2 |
25 | Female | >40 years | EB | ≤3.5 m | right | 2 |
26 | Female | >40 years | EB | >3.5 m | right | 2 |
27 | Female | ≤40 years | EB | ≤3.5 m | right | 2 |
28 | Female | ≤40 years | EB | >3.5 m | right | 2 |
29 | Male | >40 years | EB | ≤3.5 m | right | 2 |
30 | Male | >40 years | EB | >3.5 m | right | 3 |
31 | Male | ≤40 years | EB | ≤3.5 m | right | 3 |
32 | Male | ≤40 years | EB | >3.5 m | right | 3 |
33 | Female | >40 years | EB | ≤3.5 m | center | 3 |
34 | Female | >40 years | EB | >3.5 m | center | 3 |
35 | Female | ≤40 years | EB | ≤3.5 m | center | 3 |
36 | Female | ≤40 years | EB | >3.5 m | center | 3 |
37 | Male | >40 years | EB | ≤3.5 m | center | 3 |
38 | Male | >40 years | EB | >3.5 m | center | 3 |
39 | Male | ≤40 years | EB | ≤3.5 m | center | 3 |
40 | Male | ≤40 years | EB | >3.5 m | center | 3 |
41 | Female | >40 years | EB | ≤3.5 m | left | 3 |
42 | Female | >40 years | EB | >3.5 m | left | 3 |
43 | Female | ≤40 years | EB | ≤3.5 m | left | 3 |
44 | Female | ≤40 years | EB | >3.5 m | left | 3 |
45 | Male | >40 years | EB | ≤3.5 m | left | 3 |
46 | Male | >40 years | EB | >3.5 m | left | 3 |
47 | Male | ≤40 years | EB | ≤3.5 m | left | 3 |
48 | Male | ≤40 years | EB | >3.5 m | left | 3 |

## Appendix B

Order | Name | Parameter Values | LL | KS | AIC | AICc | BIC |
---|---|---|---|---|---|---|---|
1 | loglogistic | μ: 1.38, σ: 0.12 | −411.6 | Y | 827.3 | 827.3 | 834.9 |
2 | tlocationscale | μ: 3.99, σ: 0.67, ν: 4.17 | −416.8 | Y | 839.7 | 839.8 | 851.1 |
3 | lognormal | μ: 1.38, σ: 0.22 | −419.2 | Y | 842.5 | 842.5 | 850.0 |
4 | inversegaussian | μ: 4.06, λ: 81.49 | −420.2 | Y | 844.4 | 844.5 | 852.0 |
5 | birnbaumsaunders | β: 3.97, γ: 0.22 | −420.3 | Y | 844.6 | 844.6 | 852.2 |
6 | gev | k: −0.04, σ: 0.78, θ: 3.66 | −420.4 | Y | 846.8 | 846.8 | 858.1 |
7 | logistic | μ: 4.00, β: 0.48 | −421.8 | Y | 847.6 | 847.6 | 855.1 |
8 | gamma | α: 20.44, β: 0.20 | −423.8 | Y | 851.5 | 851.6 | 859.1 |
9 | nakagami | μ: 5.05, ω: 17.42 | −433.7 | N | 871.3 | 871.3 | 878.9 |
10 | rician | s: 3.95, σ: 0.96 | −445.3 | N | 894.7 | 894.7 | 902.2 |
11 | normal | μ: 4.06, σ: 0.95 | −446.1 | N | 896.3 | 896.3 | 903.9 |
12 | rayleigh | b: 2.95 | −584.2 | N | 1170.4 | 1170.5 | 1174.2 |
13 | uniform | a: 1.93, b: 9.77 | −673.4 | N | 1350.8 | 1350.8 | 1358.4 |
14 | gp | −0.56, θ: 5.53 | −703.1 | N | 1410.1 | 1410.2 | 1417.7 |
15 | exponential | θ: 4.06 | −785.5 | N | 1573.1 | 1573.1 | 1576.9 |

Order | Name | Parameter Values | LL | KS | AIC | AICc | BIC |
---|---|---|---|---|---|---|---|
1 | gamma | α: 22.74, β: 0.23 | −268.2 | Y | 540.5 | 540.6 | 546.9 |
2 | nakagami | μ: 5.92, ω: 28.66 | −268.3 | Y | 540.7 | 540.7 | 547.0 |
3 | gev | k: −0.17, σ: 1.03, θ: 4.81 | −269.5 | Y | 542.9 | 543.0 | 549.3 |
4 | lognormal | μ: 1.63, σ: 0.21 | −268.5 | Y | 542.9 | 543.1 | 552.5 |
5 | birnbaumsaunders | β: 5.12, γ: 0.21 | −269.5 | Y | 543.0 | 543.0 | 549.3 |
6 | tlocationscale | μ: 5.23, σ: 1.04, ν: 20.78 | −269.5 | Y | 543.1 | 543.1 | 549.4 |
7 | inversegaussian | μ: 5.24, λ: 113.26 | −269.7 | Y | 543.4 | 543.5 | 549.8 |
8 | rician | s: 5.12, σ: 1.11 | −269.8 | Y | 543.6 | 543.7 | 550.0 |
9 | normal | μ: 5.24, σ: 1.09 | −270.2 | Y | 544.4 | 544.4 | 550.7 |
10 | logistic | μ: 5.22, β: 0.62 | −270.3 | Y | 544.6 | 544.7 | 551.0 |
11 | loglogistic | μ: 1.64, σ: 0.12 | −269.5 | Y | 545.0 | 545.2 | 554.6 |
12 | uniform | a: 2.83, b: 8.84 | −321.1 | N | 646.1 | 646.2 | 652.5 |
13 | rayleigh | b: 3.79 | −363.0 | N | 728.1 | 728.1 | 731.2 |
14 | gp | σ: −1.01, θ: 8.93 | −389.9 | N | 783.7 | 783.8 | 790.1 |
15 | exponential | θ: 5.24 | −475.5 | N | 953.0 | 953.1 | 956.2 |

Order | Name | Parameter Values | LL | KS | AIC | AICc | BIC |
---|---|---|---|---|---|---|---|
1 | gev | k: −0.18, σ: 1.42, θ: 6.00 | −1566.6 | Y | 3139.3 | 3139.3 | 3153.5 |
2 | gamma | α: 18.40, β: 0.36 | −1567.9 | Y | 3139.8 | 3139.8 | 3149.3 |
3 | nakagami | μ: 4.83, ω: 46.13 | −1569.2 | Y | 3142.5 | 3142.5 | 3152.0 |
4 | birnbaumsaunders | β: 6.43, γ: 0.24 | −1572.8 | Y | 3149.6 | 3149.6 | 3159.1 |
5 | inversegaussian | μ: 6.62, λ: 114.96 | −1573.0 | Y | 3150.1 | 3150.1 | 3159.6 |
6 | lognormal | μ: 1.86, σ: 0.24 | −1573.1 | Y | 3150.1 | 3150.2 | 3159.7 |
7 | rician | s: 6.42, σ: 1.56 | −1578.1 | Y | 3160.1 | 3160.2 | 3169.6 |
8 | normal | μ: 6.62, σ: 1.53 | −1578.9 | Y | 3161.8 | 3161.9 | 3171.3 |
9 | tlocationscale | μ: 6.62, σ: 1.53, ν: 2594780.26 | −1578.9 | Y | 3163.8 | 3163.9 | 3178.1 |
10 | loglogistic | μ: 1.87, σ: 0.14 | −1584.9 | Y | 3173.8 | 3173.8 | 3183.3 |
11 | logistic | μ: 6.56, β: 0.88 | −1590.1 | Y | 3184.2 | 3184.2 | 3193.7 |
12 | uniform | a: 2.89, b: 11.24 | −1814.4 | N | 3632.8 | 3632.8 | 3642.3 |
13 | rayleigh | b: 4.8 | −1946.1 | N | 3894.2 | 3894.2 | 3898.9 |
14 | gp | −0.98, θ: 11.01 | −2068.1 | N | 4140.2 | 4140.2 | 4149.7 |
15 | exponential | θ: 6.62 | −2470.5 | N | 4943.1 | 4943.1 | 4947.8 |
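
The AIC, AICc, and BIC columns can be related to the log-likelihood (LL) column with the standard definitions, sketched below. This is an illustrative check rather than the authors' code: the sample size of 864 is an assumption taken from the cluster counts reported in this paper, and small differences from the tabulated values are expected because LL is rounded to one decimal.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, AICc, BIC) from a maximized log-likelihood."""
    aic = 2 * n_params - 2 * log_likelihood
    # Small-sample correction; negligible when n_obs is much larger than n_params.
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    return aic, aicc, bic


# GEV row of the table above: LL = -1566.6 with three parameters (k, sigma, theta).
# n_obs = 864 is assumed from the reported size of the largest cluster.
aic, aicc, bic = information_criteria(-1566.6, n_params=3, n_obs=864)
print(f"AIC = {aic:.1f}, AICc = {aicc:.1f}, BIC = {bic:.1f}")
```

Because BIC penalizes each extra parameter by ln(n) rather than 2, three-parameter models such as GEV and Tlocationscale are penalized more heavily under BIC than under AIC for samples of this size.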

**Figure 1.** Histograms of the speed data and the bimodality coefficient (values exceeding 0.555 are taken to indicate bimodality; lower values indicate unimodality).

**Figure 5.** (**a**) Evaluation graph, (**b**) possible fitting lines, (**c**) RMSE, and (**d**) best-fit lines for city-block distance, Euclidean distance, correlation distance, and cosine similarity.

**Figure 6.** (**a**) Silhouette coefficient values and (**b**) visualization of the clustered data when the number of clusters is 3.

**Figure 7.** Fitting results for the models ranked (**a**) 1–4 and (**b**) 5–8 for Cluster 1; (**c**) 1–4 and (**d**) 5–8 for Cluster 2; and (**e**) 1–4 and (**f**) 5–8 for Cluster 3.

Factor | Original Category or Level | Original Counts | Original Ratio | Merged Category or Level | Merged Counts | Merged Ratio |
---|---|---|---|---|---|---|
gender | Male | 850 | 62.0% | Male | 850 | 62.0% |
gender | Female | 520 | 38.0% | Female | 520 | 38.0% |
age | (~, 20) years | 42 | 3.0% | (~, 40) years | 930 | 67.9% |
age | (20, 30) years | 308 | 22.5% | | | |
age | (30, 40) years | 580 | 42.4% | | | |
age | (40, 50) years | 296 | 21.6% | (40, 60) years | 440 | 32.1% |
age | (50, 60) years | 144 | 10.5% | | | |
bicycle type | EB | 1028 | 75.1% | EB | 1028 | 75.1% |
bicycle type | CB | 342 | 24.9% | CB | 342 | 24.9% |
lane width | 2 m | 197 | 14.4% | ≤3.5 m | 367 | 26.8% |
lane width | 3.4 m | 170 | 12.4% | | | |
lane width | 3.85 m | 302 | 22.0% | >3.5 m | 1003 | 73.2% |
lane width | 4 m | 346 | 25.3% | | | |
lane width | 5 m | 355 | 25.9% | | | |
lateral position | left | 211 | 15.4% | left | 211 | 15.4% |
lateral position | center | 745 | 54.4% | center | 745 | 54.4% |
lateral position | right | 421 | 30.8% | right | 421 | 30.8% |
Total | ~ | 1370 | ~ | ~ | 1370 | ~ |

Cluster | Counts | Median (m/s) | Mean (m/s) | Std. Dev. (m/s) | 85th Value (m/s) | Min (m/s) | Max (m/s) | Kurtosis | Skewness | Bimodality Coefficient (BC) |
---|---|---|---|---|---|---|---|---|---|---|
1 | 327 | 4.00 | 4.06 | 0.95 | 4.81 | 1.93 | 9.77 | 10.37 | 1.63 | 0.35 |
2 | 179 | 5.25 | 5.24 | 1.10 | 6.32 | 2.83 | 8.84 | 3.29 | 0.34 | 0.33 |
3 | 864 | 6.54 | 6.65 | 1.58 | 8.22 | 2.16 | 11.87 | 3.03 | 0.40 | 0.38 |
Overall | 1370 | 5.66 | 5.85 | 1.78 | 7.73 | 1.93 | 11.87 | 2.85 | 0.48 | 0.43 |
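
The bimodality coefficient in the last column can be approximated from the skewness and kurtosis columns. The sketch below assumes the large-sample form BC = (skewness² + 1) / kurtosis, with kurtosis in its non-excess form (3 for a normal distribution) and no finite-sample correction. Whether the authors used exactly this form is an assumption; it agrees with the tabulated values to within rounding of the inputs.

```python
BIMODALITY_THRESHOLD = 0.555  # threshold quoted in the Figure 1 caption


def bimodality_coefficient(skewness, kurtosis):
    """Large-sample bimodality coefficient using non-excess kurtosis."""
    return (skewness ** 2 + 1) / kurtosis


# (skewness, kurtosis) pairs copied from the table above.
stats = {
    "Cluster 1": (1.63, 10.37),
    "Cluster 3": (0.40, 3.03),
    "Overall": (0.48, 2.85),
}

for name, (skew, kurt) in stats.items():
    bc = bimodality_coefficient(skew, kurt)
    label = "bimodal" if bc > BIMODALITY_THRESHOLD else "unimodal"
    print(f"{name}: BC = {bc:.2f} ({label})")
```

All values fall below the 0.555 threshold, which is consistent with treating each cluster as unimodal before fitting a single distribution to it.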

Number | Name | Cluster 1 | Cluster 2 | Cluster 3 | Sum | Var. | Suggestions |
---|---|---|---|---|---|---|---|
1 | GEV | 6 | 3 | 1 | 10 | 4.22 | Recommended |
2 | Gamma | 8 | 1 | 2 | 11 | 9.56 | Recommended |
3 | Lognormal | 3 | 4 | 6 | 13 | 1.56 | Recommended |
4 | Birnbaumsaunders | 5 | 5 | 4 | 14 | 0.22 | Suitable |
5 | Inversegaussian | 4 | 7 | 5 | 16 | 1.56 | Suitable |
6 | Loglogistic | 2 | 6 | 9 | 17 | 8.22 | Suitable |
7 | Tlocationscale | 1 | 11 | 10 | 22 | 20.22 | Suitable |
8 | Nakagami | 9 | 2 | 3 | 14 | 9.56 | Uncertain |
9 | Rician | 10 | 8 | 7 | 25 | 1.56 | Uncertain |
10 | Normal | 11 | 9 | 8 | 28 | 1.56 | Uncertain |
11 | Logistic | 7 | 10 | 11 | 28 | 2.89 | Uncertain |
12 | Uniform | 13 | 12 | 12 | 37 | 0.22 | Unsuitable |
13 | Rayleigh | 12 | 13 | 13 | 38 | 0.22 | Unsuitable |
14 | GP | 14 | 14 | 14 | 42 | 0.00 | Unsuitable |
15 | Exponential | 15 | 15 | 15 | 45 | 0.00 | Unsuitable |

