# Forecasting of Bicycle and Pedestrian Traffic Using Flexible and Efficient Hybrid Deep Learning Approach


## Abstract


## 1. Introduction

- In this study, we first introduce an efficient hybrid approach for traffic flow forecasting. The primary elements of the proposed guided-attention hybrid deep learning architecture (termed GAHD-VAE) are the VAE model, the self-attention mechanism, and the LSTM network. To the best of our knowledge, this is the first study to use a hybrid deep learning model to improve the forecasting of pedestrian and bicycle traffic flows. The approach extends the traditional VAE to capture latent temporal dependencies by applying the self-attention unit at multiple levels of the VAE and embedding an LSTM model in the VAE encoder. The self-attention mechanism is adopted in the GAHD-VAE to uncover the most relevant features of the traffic flow data; indeed, self-attention enables attention-driven modeling of long-range dependencies in time series. In parallel, the hybrid LSTM-VAE automatically learns temporal dependence in traffic data without feature engineering. Combining these tools has the potential to enhance short-term forecasting of pedestrian and bicycle traffic flows. The forecasting performance of the GAHD-VAE method is compared to that of the traditional VAE; six deep neural networks, namely LSTM, gated recurrent units (GRUs), bidirectional LSTM (BiLSTM), bidirectional GRU (BiGRU), the convolutional neural network (CNN), and convolutional LSTM (ConvLSTM); and four shallow methods: linear regression (LR), lasso regression, ridge regression (RR), and support vector regression (SVR).
- The second contribution investigates the impact of different configurations of the self-attention module on the forecasting quality of the GAHD-VAE. Specifically, we examine the influence of the activation function used in the attention mechanism, namely the Rectified Linear Unit (ReLU), hyperbolic tangent (tanh), and logistic sigmoid, on the proposed approach’s forecasting quality. We also investigate the influence of the attention type, multiplicative versus additive, on the forecasting accuracy.
- Finally, this study investigates both single- and multi-step-ahead forecasting. Six pedestrian and bicycle traffic flow datasets are used to evaluate the forecasting quality of the considered methods. The results reveal that the proposed GAHD-VAE method delivers satisfactory forecasts for different types of traffic flows and consistently outperforms the other methods.
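The self-attention unit referenced above can be illustrated with a minimal numpy sketch of scaled dot-product self-attention over a traffic-count window (a standard formulation; the matrices and data below are random placeholders, not the trained GAHD-VAE parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a time series.

    X: (T, d) window of traffic features; every time step attends
    to all others, returning a (T, d) context matrix.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (T, T) pairwise relevance
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 8, 4                                   # 8 time steps, 4 features
X = rng.normal(size=(T, d))                   # placeholder traffic window
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
ctx, w = self_attention(X, Wq, Wk, Wv)
assert ctx.shape == (T, d)
assert np.allclose(w.sum(axis=1), 1.0)
```

Each row of the weight matrix sums to one, so every time step forms its context as a convex combination of all steps in the window, which is how long-range temporal dependencies are captured without recurrence.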

## 2. Related Works

## 3. Methods

#### 3.1. Attention Mechanism

#### 3.2. Self-Attention Mechanism

#### 3.3. Variational Autoencoder
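As background for this section, a minimal numpy sketch of the two ingredients a VAE adds to a plain autoencoder, the reparameterization trick and the closed-form Gaussian KL term (standard formulation with placeholder encoder outputs; not the authors' exact implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

# Encoder outputs for a batch of 5 traffic windows: mean and
# log-variance of the approximate posterior q(z|x). Placeholders here.
mu = rng.normal(size=(5, 3))
log_var = rng.normal(size=(5, 3))

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps sampling differentiable w.r.t. mu and log_var.
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# KL(q(z|x) || N(0, I)) in closed form, summed over latent dimensions;
# this regularizer is added to the reconstruction loss in the ELBO.
kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)
assert kl.shape == (5,) and np.all(kl >= 0)
```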

#### 3.4. The Proposed Approach

Algorithm 1: The training procedure of the GAHD-VAE algorithm

## 4. Model Testing and Results Analysis

#### 4.1. Measurements of Effectiveness
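Assuming the four standard criteria used in the result tables (RMSE, MAE, R², and explained variance, EV), they can be computed as follows; the toy series is illustrative only:

```python
import numpy as np

def effectiveness(y_true, y_pred):
    """Return (RMSE, MAE, R^2, EV) for a forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err**2))                       # root mean squared error
    mae = np.mean(np.abs(err))                            # mean absolute error
    r2 = 1 - np.sum(err**2) / np.sum((y_true - y_true.mean())**2)
    ev = 1 - np.var(err) / np.var(y_true)                 # explained variance
    return rmse, mae, r2, ev

# Toy hourly counts vs. forecasts (illustrative values only).
y = [10, 20, 30, 40, 50]
yhat = [12, 18, 33, 41, 48]
rmse, mae, r2, ev = effectiveness(y, yhat)
assert rmse >= mae     # RMSE always upper-bounds MAE
assert ev >= r2        # EV discards systematic bias, so EV >= R^2
```

Note that EV equals R² only when the forecast errors have zero mean; a gap between the two columns in the tables below therefore signals a systematic bias in a model's forecasts.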

#### 4.2. Data Description

#### 4.3. Results Analysis and Comparison

## 5. Conclusions and Future Directions

#### 5.1. Conclusions

#### 5.2. Future Directions

## Author Contributions

## Funding

## Conflicts of Interest


**Figure 5.** Measured and forecasted traffic flow for (**a**) Data Set 1, (**b**) Data Set 2, (**c**) Data Set 3, (**d**) Data Set 4, (**e**) Data Set 5, and (**f**) Data Set 6.

**Figure 6.** Boxplot of forecasting errors obtained by the seven considered methods based on the six datasets: (**a**) Data Set 1, (**b**) Data Set 2, (**c**) Data Set 3, (**d**) Data Set 4, (**e**) Data Set 5, and (**f**) Data Set 6.

Dataset | Location | Contents | Records |
---|---|---|---|
Data 1 | Burke Gilman Trail north of NE 70th St | Bicycle | 57,697 |
Data 2 | Burke Gilman Trail north of NE 70th St | Pedestrian | 57,697 |
Data 3 | MTS Trail west of I-90 Bridge | Pedestrian | 57,289 |
Data 4 | MTS Trail west of I-90 Bridge | Bicycle | 57,289 |
Data 5 | Seattle Spokane St Bridge | Bicycle | 51,121 |
Data 6 | 26th Ave SW Greenway at SW Oregon St | Bicycle | 55,568 |

Dataset | Min | Max | Std | Q-0.25 | Q-0.5 | Q-0.75 | Skewness | Kurtosis |
---|---|---|---|---|---|---|---|---|
Data 1 | 0 | 8191 | 77.963 | 2 | 22 | 63 | 41.250 | 4078.161 |
Data 2 | 0 | 5118 | 148.542 | 0 | 12 | 28.938 | 14.342 | 240.455 |
Data 3 | 0 | 1940 | 41.034 | 0 | 2 | 8 | 26.154 | 840.791 |
Data 4 | 0 | 431 | 34.739 | 1 | 9 | 34 | 2.239 | 9.356 |
Data 5 | 0 | 431 | 43.209 | 4 | 18 | 45 | 2.071 | 7.842 |
Data 6 | 0 | 274 | 14.352 | 0 | 2 | 7 | 5.141 | 46.965 |

Model | Dataset | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|---|
VAE | 1 | 8.053 | 6.063 | 0.851 | 0.904 |
 | 2 | 3.633 | 2.094 | 0.947 | 0.949 |
 | 3 | 1.619 | 1.139 | 0.965 | 0.967 |
 | 4 | 6.691 | 4.922 | 0.847 | 0.902 |
 | 5 | 3.707 | 2.691 | 0.931 | 0.948 |
 | 6 | 3.257 | 2.464 | 0.971 | 0.972 |
GAHD-VAE | 1 | 3.336 | 2.81 | 0.975 | 0.979 |
 | 2 | 2.393 | 1.616 | 0.977 | 0.98 |
 | 3 | 1.586 | 0.97 | 0.968 | 0.968 |
 | 4 | 4.053 | 3.077 | 0.945 | 0.954 |
 | 5 | 3.641 | 2.853 | 0.933 | 0.952 |
 | 6 | 2.824 | 1.867 | 0.978 | 0.978 |
SVR | 1 | 17.334 | 16.595 | 0.314 | 0.927 |
 | 2 | 10.384 | 9.11 | 0.575 | 0.815 |
 | 3 | 2.892 | 2.199 | 0.893 | 0.914 |
 | 4 | 11.484 | 10.739 | 0.559 | 0.877 |
 | 5 | 8.605 | 8.002 | 0.627 | 0.816 |
 | 6 | 3.227 | 2.753 | 0.416 | 0.618 |
LR | 1 | 17.568 | 15.698 | 0.295 | 0.571 |
 | 2 | 10.053 | 7.764 | 0.602 | 0.626 |
 | 3 | 10.332 | 8.17 | −0.364 | 0.019 |
 | 4 | 10.941 | 9.603 | 0.6 | 0.696 |
 | 5 | 9.363 | 7.97 | 0.762 | 0.78 |
 | 6 | 12.853 | 11.596 | 0.655 | 0.753 |
RR | 1 | 6.653 | 6.085 | 0.899 | 0.963 |
 | 2 | 4.093 | 3.216 | 0.934 | 0.941 |
 | 3 | 4.294 | 3.59 | 0.764 | 0.845 |
 | 4 | 5.875 | 5.249 | 0.885 | 0.932 |
 | 5 | 9.933 | 8.736 | 0.794 | 0.851 |
 | 6 | 5.203 | 4.219 | 0.927 | 0.936 |
Lasso regression | 1 | 6.64 | 6.033 | 0.899 | 0.951 |
 | 2 | 3.756 | 2.975 | 0.944 | 0.951 |
 | 3 | 3.33 | 2.882 | 0.858 | 0.913 |
 | 4 | 5.965 | 5.265 | 0.881 | 0.925 |
 | 5 | 10.036 | 8.741 | 0.789 | 0.843 |
 | 6 | 5.05 | 4.103 | 0.931 | 0.94 |

**Table 4.** One-step-ahead forecasting of pedestrian and bicycle traffic flows using the seven deep learning models.

Dataset | Model | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|---|
 | CNN | 5.905 | 3.333 | 0.92 | 0.921 |
 | ConvLSTM | 8.064 | 5.445 | 0.852 | 0.887 |
 | BiGRU | 6.726 | 3.894 | 0.897 | 0.897 |
1 | BiLSTM | 5.846 | 3.515 | 0.922 | 0.923 |
 | GRU | 6.627 | 4.087 | 0.9 | 0.902 |
 | LSTM | 6.29 | 3.878 | 0.91 | 0.912 |
 | GAHD-VAE | 3.336 | 2.81 | 0.975 | 0.979 |
 | CNN | 6.096 | 2.526 | 0.853 | 0.853 |
 | ConvLSTM | 5.058 | 2.655 | 0.899 | 0.9 |
 | GRU | 5.45 | 2.784 | 0.883 | 0.883 |
2 | LSTM | 5.444 | 2.758 | 0.883 | 0.883 |
 | BiGRU | 6.239 | 2.639 | 0.847 | 0.847 |
 | BiLSTM | 5.874 | 2.531 | 0.864 | 0.864 |
 | GAHD-VAE | 2.693 | 1.681 | 0.971 | 0.972 |
 | CNN | 3.837 | 1.357 | 0.805 | 0.806 |
 | ConvLSTM | 4.482 | 3.332 | 0.735 | 0.782 |
 | BiGRU | 2.254 | 1.344 | 0.935 | 0.94 |
3 | BiLSTM | 2.762 | 1.241 | 0.903 | 0.903 |
 | GRU | 3.263 | 2.039 | 0.864 | 0.864 |
 | LSTM | 4.414 | 2.964 | 0.751 | 0.752 |
 | GAHD-VAE | 1.586 | 0.97 | 0.968 | 0.968 |
 | CNN | 5.609 | 3.8 | 0.893 | 0.895 |
 | ConvLSTM | 7.307 | 4.681 | 0.819 | 0.842 |
 | BiGRU | 6.186 | 3.47 | 0.872 | 0.875 |
4 | BiLSTM | 5.867 | 3.175 | 0.885 | 0.885 |
 | GRU | 6.582 | 3.736 | 0.855 | 0.855 |
 | LSTM | 6.566 | 3.819 | 0.856 | 0.857 |
 | GAHD-VAE | 4.053 | 3.077 | 0.945 | 0.954 |
 | CNN | 7.553 | 4.416 | 0.845 | 0.845 |
 | ConvLSTM | 8.872 | 6.59 | 0.756 | 0.801 |
 | BiGRU | 9.998 | 5.721 | 0.791 | 0.797 |
5 | BiLSTM | 9.692 | 5.419 | 0.804 | 0.809 |
 | GRU | 10.763 | 6.197 | 0.758 | 0.76 |
 | LSTM | 10.746 | 6.355 | 0.759 | 0.768 |
 | GAHD-VAE | 5.641 | 4.853 | 0.933 | 0.952 |
 | CNN | 5.31 | 2.794 | 0.924 | 0.927 |
 | ConvLSTM | 5.887 | 3.958 | 0.905 | 0.91 |
 | BiGRU | 5.253 | 3.355 | 0.925 | 0.937 |
6 | BiLSTM | 5.249 | 2.964 | 0.925 | 0.931 |
 | GRU | 5.959 | 3.505 | 0.904 | 0.911 |
 | LSTM | 6.061 | 3.883 | 0.9 | 0.916 |
 | GAHD-VAE | 2.824 | 1.867 | 0.978 | 0.978 |

Model | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|
CNN | 5.72 | 3.04 | 0.87 | 0.87 |
ConvLSTM | 6.61 | 4.44 | 0.83 | 0.85 |
BiGRU | 6.11 | 3.40 | 0.88 | 0.88 |
BiLSTM | 5.88 | 3.14 | 0.88 | 0.89 |
LSTM | 6.59 | 3.94 | 0.84 | 0.85 |
GRU | 6.44 | 3.72 | 0.86 | 0.86 |
GAHD-VAE | 3.36 | 2.54 | 0.96 | 0.97 |

**Table 6.** Evaluation of the forecasting performance of the GAHD-VAE under different configurations of the attention mechanism. ⨁ is the additive attention mode, while ⨂ is the multiplicative attention mode.

Dataset | Type | Attention | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|---|---|
 | | Tanh | 2.663 | 1.639 | 0.972 | 0.973 |
1 | ⨂ | Sigmoid | 1.812 | 1.224 | 0.987 | 0.988 |
 | | ReLU | 2.086 | 1.457 | 0.983 | 0.985 |
 | | None | 5.079 | 3.92 | 0.941 | 0.951 |
 | | Tanh | 3.336 | 2.81 | 0.975 | 0.979 |
1 | ⨁ | Sigmoid | 2.225 | 1.686 | 0.98 | 0.981 |
 | | ReLU | 1.841 | 1.397 | 0.987 | 0.987 |
 | | Tanh | 2.072 | 1.601 | 0.983 | 0.984 |
2 | ⨂ | Sigmoid | 2.281 | 1.931 | 0.979 | 0.984 |
 | | ReLU | 2.373 | 2.094 | 0.978 | 0.99 |
 | | None | 2.403 | 1.692 | 0.977 | 0.979 |
 | | Tanh | 2.393 | 1.616 | 0.977 | 0.98 |
2 | ⨁ | Sigmoid | 1.81 | 1.273 | 0.987 | 0.988 |
 | | ReLU | 2.399 | 1.736 | 0.977 | 0.978 |
 | | Tanh | 1.844 | 1.587 | 0.957 | 0.973 |
3 | ⨂ | Sigmoid | 1.86 | 1.262 | 0.956 | 0.959 |
 | | ReLU | 1.878 | 1.397 | 0.955 | 0.962 |
 | | None | 1.753 | 1.105 | 0.961 | 0.962 |
 | | Tanh | 1.586 | 0.97 | 0.968 | 0.968 |
3 | ⨁ | Sigmoid | 2.299 | 1.955 | 0.932 | 0.958 |
 | | ReLU | 1.705 | 0.797 | 0.963 | 0.964 |
 | | Tanh | 1.18 | 0.761 | 0.986 | 0.986 |
4 | ⨂ | Sigmoid | 2.137 | 1.637 | 0.953 | 0.957 |
 | | ReLU | 2.259 | 1.769 | 0.947 | 0.956 |
 | | None | 2.754 | 2.205 | 0.922 | 0.953 |
 | | Tanh | 4.053 | 3.077 | 0.945 | 0.954 |
4 | ⨁ | Sigmoid | 2.066 | 1.932 | 0.956 | 0.987 |
 | | ReLU | 2.591 | 2.021 | 0.931 | 0.958 |
 | | Tanh | 3.469 | 3.091 | 0.939 | 0.965 |
5 | ⨂ | Sigmoid | 2.743 | 1.969 | 0.962 | 0.968 |
 | | ReLU | 3.979 | 2.63 | 0.92 | 0.93 |
 | | None | 2.784 | 2.417 | 0.961 | 0.978 |
 | | Tanh | 5.641 | 4.853 | 0.933 | 0.952 |
5 | ⨁ | Sigmoid | 3.672 | 2.448 | 0.932 | 0.936 |
 | | ReLU | 3.508 | 2.796 | 0.938 | 0.952 |
 | | Tanh | 3.118 | 2.095 | 0.974 | 0.975 |
6 | ⨂ | Sigmoid | 3.077 | 2.534 | 0.974 | 0.98 |
 | | ReLU | 2.921 | 1.661 | 0.977 | 0.977 |
 | | None | 4.481 | 3.192 | 0.946 | 0.958 |
 | | Tanh | 2.824 | 1.867 | 0.978 | 0.978 |
6 | ⨁ | Sigmoid | 2.852 | 1.973 | 0.978 | 0.978 |
 | | ReLU | 4.087 | 2.744 | 0.955 | 0.962 |

**Table 7.** Averaged validation metrics per configuration of the attention mechanism. ⨁ is the additive attention mode, while ⨂ is the multiplicative attention mode.

Attention | | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|---|
Additive | ⨁ | 2.827 | 2.108 | 0.961 | 0.969 |
Multiplicative | ⨂ | 2.612 | 1.950 | 0.958 | 0.967 |
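The two attention modes compared in Tables 6 and 7 can be sketched as score functions, assuming Luong-style multiplicative (⨂) and Bahdanau-style additive (⨁) scoring with random placeholder weights rather than learned parameters:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def multiplicative_score(q, K):
    # Luong-style (⨂): dot product between the query and each key.
    return K @ q

def additive_score(q, K, W1, W2, v, act=np.tanh):
    # Bahdanau-style (⨁): a small feed-forward net whose activation
    # (tanh here; sigmoid and ReLU are the other options studied) is
    # applied before projecting to a scalar score per time step.
    return act(K @ W1 + q @ W2) @ v

rng = np.random.default_rng(2)
d, T = 4, 6
q = rng.normal(size=d)                        # query state
K = rng.normal(size=(T, d))                   # encoder states, T steps
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v = rng.normal(size=d)

for scores in (multiplicative_score(q, K),
               additive_score(q, K, W1, W2, v)):
    w = softmax(scores)
    assert w.shape == (T,) and np.isclose(w.sum(), 1.0)
```

Both modes yield one score per encoder time step; after the softmax these become attention weights, and the additive form is where the choice of activation function examined in Table 6 enters.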

**Table 8.** Performance evaluation of the seven models for daily forecasting of pedestrian and bicycle traffic flows.

Dataset | Model | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|---|
 | ConvLSTM | 183.619 | 139.092 | 0.818 | 0.818 |
 | CNN | 210.837 | 162.627 | 0.818 | 0.877 |
 | LSTM | 98.575 | 79.5 | 0.796 | 0.902 |
1 | GRU | 91.176 | 73.017 | 0.826 | 0.908 |
 | BiGRU | 86.176 | 68.847 | 0.844 | 0.915 |
 | BiLSTM | 74.93 | 59.556 | 0.882 | 0.938 |
 | GAHD-VAE | 60.945 | 47.493 | 0.922 | 0.925 |
 | ConvLSTM | 22.485 | 15.764 | 0.895 | 0.908 |
 | CNN | 25.639 | 12.826 | 0.899 | 0.914 |
 | LSTM | 34.633 | 18.455 | 0.747 | 0.747 |
2 | GRU | 21.844 | 19.131 | 0.899 | 0.962 |
 | BiLSTM | 23.489 | 16.936 | 0.884 | 0.923 |
 | BiGRU | 22.144 | 18.488 | 0.896 | 0.95 |
 | GAHD-VAE | 11.755 | 7.326 | 0.971 | 0.977 |
 | ConvLSTM | 69.637 | 49.907 | 0.795 | 0.806 |
 | CNN | 90.285 | 56.004 | 0.602 | 0.602 |
 | LSTM | 65.821 | 46.566 | 0.854 | 0.864 |
3 | GRU | 64.254 | 47.184 | 0.861 | 0.872 |
 | BiGRU | 67.314 | 49.322 | 0.848 | 0.86 |
 | BiLSTM | 66.488 | 48.488 | 0.851 | 0.861 |
 | GAHD-VAE | 60.603 | 41.953 | 0.877 | 0.878 |
 | ConvLSTM | 21.83 | 17.913 | 0.807 | 0.815 |
4 | CNN | 19.137 | 15.519 | 0.84 | 0.86 |
 | LSTM | 22.744 | 18.019 | 0.776 | 0.778 |
 | GRU | 22.246 | 16.924 | 0.785 | 0.785 |
 | BiGRU | 20.068 | 16.32 | 0.825 | 0.826 |
 | BiLSTM | 19.115 | 15.441 | 0.842 | 0.843 |
 | GAHD-VAE | 16.98 | 13.613 | 0.875 | 0.878 |
 | ConvLSTM | 23.738 | 24.604 | 0.779 | 0.78 |
 | CNN | 15.112 | 12.716 | 0.856 | 0.902 |
 | LSTM | 11.265 | 7.663 | 0.936 | 0.936 |
5 | GRU | 15.07 | 12.106 | 0.886 | 0.901 |
 | BiGRU | 18.482 | 12.609 | 0.828 | 0.833 |
 | BiLSTM | 10.577 | 7.097 | 0.944 | 0.944 |
 | GAHD-VAE | 10.202 | 7.015 | 0.948 | 0.951 |
 | ConvLSTM | 89.583 | 72.832 | 0.733 | 0.755 |
 | CNN | 68.138 | 56.918 | 0.87 | 0.901 |
 | LSTM | 82.286 | 64.834 | 0.792 | 0.807 |
6 | GRU | 84.914 | 67.21 | 0.779 | 0.787 |
 | BiGRU | 75.26 | 64.157 | 0.826 | 0.854 |
 | BiLSTM | 76.796 | 65.442 | 0.819 | 0.844 |
 | GAHD-VAE | 63.531 | 52.38 | 0.876 | 0.91 |

**Table 9.** Averaged measurements of effectiveness per model for daily forecasting of pedestrian and bicycle traffic flows.

Model | RMSE | MAE | ${\mathit{R}}^{2}$ | EV |
---|---|---|---|---|
ConvLSTM | 68.48 | 53.35 | 0.80 | 0.81 |
CNN | 71.52 | 52.77 | 0.81 | 0.84 |
BiGRU | 48.24 | 38.29 | 0.84 | 0.87 |
BiLSTM | 45.23 | 35.49 | 0.87 | 0.89 |
LSTM | 52.55 | 39.17 | 0.82 | 0.84 |
GRU | 49.92 | 39.26 | 0.84 | 0.87 |
GAHD-VAE | 37.34 | 28.30 | 0.91 | 0.92 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Harrou, F.; Dairi, A.; Zeroual, A.; Sun, Y.
Forecasting of Bicycle and Pedestrian Traffic Using Flexible and Efficient Hybrid Deep Learning Approach. *Appl. Sci.* **2022**, *12*, 4482.
https://doi.org/10.3390/app12094482
