Abstract
The era of big data has arrived in recent years and has ushered in intelligent transportation. One of the main challenges in deploying intelligent transportation is dealing with winter roads in cold-climate countries. Different operations can be used to protect the road from ice and snow, such as spreading chemicals (here, salt) on the road surface. Using salt for de-icing and anti-icing increases road safety. However, excess use of salt must be avoided since it is not cost-efficient and has negative impacts on the environment. Therefore, accurate and timely prediction of salt quantity helps decision support systems to achieve effective and efficient winter road maintenance. This paper performs exploratory data analysis to determine the relationships among variables and to find the best prediction model for this problem. Due to the stochastic nature of the weather and road variables, a deep neural network (deep learning) is selected to predict the amount of salt on the wheel track, using historical data measured by sensors and road weather stations. The results show that the proposed model learns and predicts the amount of salt on the wheel track well, based on different metrics, including the loss function, scatter plot, mean absolute error, and explained variance.
1. Introduction
One of the most crucial operations of Winter Road Maintenance (WRM) in cold regions is the spreading of chemicals (such as salt) on the road surface. Spreading chemicals on the road for both anti-icing and de-icing prevents slippery conditions and provides road users with safe transportation. However, it demands equipped trucks, truck drivers, and materials, which can be extremely expensive. In addition, chemicals damage the environment, vehicles, and road infrastructure [1]. In fact, the quality of WRM depends on three major goals: (i) road safety, (ii) the more efficient use of resources, and (iii) environmental protection [2]. In order to achieve these three goals, it is necessary to use an optimal amount of chemicals. One solution to providing decision-makers with future insights for optimal chemical usage is to utilize a data-driven approach and prediction model.
Many studies have suggested different methods to achieve increased road safety, lower WRM costs, and a reduced environmental impact. For instance, Juga et al. [3] developed a statistical approach to model road surface friction/grip by using observations from an optical sensor during two winter seasons. Riehm [4] suggested different methods to estimate road weather using infrared thermometry and image analysis. Dan et al. [5] conducted laboratory experiments to design a time-dependent model to predict salt solution temperature. Xu et al. [6] established a winter pavement temperature prediction model by improving backpropagation neural networks. Terry et al. [7] published a review paper evaluating the pros and cons of organic deicers. Lorentzen [8] applied multivariate regression and Monte Carlo simulation to analyze WRM costs. Hallmark and Dong [9] used visual analytics to explore the relationship between WRM operations and traffic safety. Linton and Fu [10] evaluated three different machine learning classification algorithms to monitor road surface conditions. Pu et al. [11] proposed a time-series prediction model using a recurrent neural network to predict road surface friction. Ahabchane et al. [12] utilized regression machine learning algorithms to present a methodology to predict salt quantity on street segments every hour. Road salt application also has negative impacts on the environment, vehicles, water, and road infrastructure (such as groundwater, bumpers, bridges, brake linings, and frames). Kelting and Laxson [13] analyzed the infrastructure and environmental costs of applying salt to the road surface and described how winter conditions and road salt application contribute to steel corrosion and damage concrete structures. Zehetner et al. [14] collected soil samples along a highway in Vienna, Austria, and analyzed them for road salt residues and other contaminants.
The results indicated that the pollution of roadside soil was high. Biggs and Mahony [15] showed the importance of soil for road maintenance and construction. Zhang et al. [16] stated that road maintenance and construction need a considerable budget, and cold climatic conditions have negative effects on the durability of road infrastructure due to anti-icing and de-icing [17]. Pieper et al. [18] examined the potential spread of chloride in groundwater and its impact on private wells. The results showed that road salt application damages private and public drinking water infrastructure. Although different techniques in various disciplines have been developed to cover various gaps in WRM and to achieve the three major objectives (safety, cost, and environment), no study has used a data-driven approach to analyze the influencing factors and predict the amount of chemicals on the wheel track. As a further explanation, winter roads need to be protected from snow and ice, especially in cold-climate countries, where winter weather conditions are adverse and the winter period is long. Plowing, anti-icing, and de-icing are the main techniques for winter road maintenance. In addition, vehicles create tracks on the road surface when driving on a winter road. These tracks become more visible as more cars pass over the road, and subsequent cars then follow the same tracks. Hence, it is essential to use chemicals on the wheel track to prevent ice formation and increase friction. Moreover, a special sensor is mounted in the wheel track to monitor road conditions. The photograph in Figure 1 was taken on the E6 road in the north of Norway. The black parts show the wheel tracks.
Figure 1.
Wheel tracks on the E6 road in the north of Norway.
Dynamic atmospheric conditions lead to fluctuations in road surface conditions, which can sometimes be dangerous in the wintertime due to the reduction in friction between tires and the road surface. The presence of different factors and the variations in these factors (such as precipitation and friction coefficient) make it complicated to design a model that can predict the amount of chemicals on the road surface. Because of the nonlinear and stochastic nature of the factors, non-parametric methods can effectively capture the characteristics of these influencing factors. Hence, this study employs an exploratory data analysis and then utilizes a data-driven method, a Deep Neural Network (DNN)/deep learning, which uses historical data collected every 10 min by a road weather station and a road-mounted sensor to predict the amount of chemical (here, salt) on the wheel track. Predicting the amount of chemical (salt) that needs to be applied to the road surface helps decision-makers to make WRM plans in advance, which is beneficial not only with regard to safety (effective WRM) but also to the environment and resources (efficient WRM).
2. Deep Neural Network/Deep Learning
A DNN is a type of artificial neural network containing two or more hidden layers. The continuous improvement in the field of artificial intelligence has led to the broad use of deep learning algorithms in many academic and industrial fields, such as finance, engineering, and medical science [19]. In deep learning, the units in the multiple hidden layers function like neurons in the human brain [20]. Each neuron includes weights and a bias, which are adjusted by the gradient descent algorithm in the backpropagation process to minimize the cost function (loss function). An activation function (such as the sigmoid or the rectified linear unit) helps the network to learn complicated patterns in the dataset. Deep learning has a high predictive capability and is more flexible than traditional machine learning methods [21].
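To make the roles of weights, biases, the activation function, and gradient descent concrete, the following is a minimal NumPy sketch of a network with two hidden layers trained by backpropagation on an MSE loss. It is an illustration only, not the model used in this study: the layer sizes, learning rate, number of steps, and the synthetic data are all assumptions chosen for brevity.

```python
import numpy as np


def relu(z):
    """Rectified linear unit activation."""
    return np.maximum(z, 0.0)


def init_params(sizes, rng):
    """One (weights, bias) pair per layer; He-style scaling is an assumption."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Return the activations of every layer; the output layer is linear."""
    activations = [x]
    for i, (w, b) in enumerate(params):
        z = activations[-1] @ w + b
        is_output = i == len(params) - 1
        activations.append(z if is_output else relu(z))
    return activations


def mse(pred, y):
    return float(np.mean((pred - y) ** 2))


def gradient_step(params, x, y, lr):
    """One full-batch gradient descent update computed by backpropagation."""
    acts = forward(params, x)
    delta = 2.0 * (acts[-1] - y) / y.size  # dLoss/dz at the linear output
    new_params = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Push the error back through the weights and the ReLU derivative.
            delta = (delta @ w.T) * (acts[i] > 0)
        new_params.append((w - lr * grad_w, b - lr * grad_b))
    return new_params[::-1]


# Synthetic data (hypothetical): 64 samples, 3 inputs, 1 continuous output.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
y = np.abs(x[:, :1]) + 1.0

params = init_params([3, 8, 8, 1], rng)  # two hidden layers of 8 neurons
loss_before = mse(forward(params, x)[-1], y)
for _ in range(300):
    params = gradient_step(params, x, y, lr=0.01)
loss_after = mse(forward(params, x)[-1], y)
print(loss_before, loss_after)
```

Each call to `gradient_step` moves every weight and bias a small distance against the gradient of the loss, which is the adjustment described above; libraries such as Keras automate the same procedure.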
3. Materials and Methods
Figure 2 presents the structure of this research study and its different stages, which are described in this section. These stages are performed in Python 3 using different libraries, including pandas [22], NumPy [23], seaborn [24], Matplotlib [25], scikit-learn [26], TensorFlow [27], and Keras [28].
Figure 2.
The structure of this research study.
3.1. Data Collection
There is a direct relationship between weather conditions and road surface conditions. Different factors influence the salting (g/m2) operation used to maintain the road during winter. These factors are surface temperature (°C), air temperature (°C), dew point temperature (°C), level of grip, ice layer (mm), precipitation (mm), snow height (mm), freezing temperature (°C), maximum wind speed (m/s), conductivity (mS/cm), and the concentration (g/L) of salt on the road surface. Historical data on these input variables were collected by road weather stations and sensors at test site E18, measured every 10 min in February 2019. This test site is in Sweden, Northern Europe, and is located between Västerås and Enköping [29]. The air temperature, dew point temperature, precipitation, and maximum wind speed were measured by the road weather station. The surface temperature, snow height, freezing temperature, concentration, and conductivity were measured using a sensor mounted on the wheel track. In addition, the level of grip and the ice layer were measured using an optical sensor.
3.2. Exploratory Data Analysis Using Statistical and Visualization Techniques
Analyzing the dataset helps us to explore interesting patterns among input and output variables and summarize the major characteristics of the historical data. This process is helpful to validate the results and evaluate the applicability of the model. Hence, in this section, we describe the exploratory data analysis that was performed in this research study.
3.2.1. To Check Missing Values
Table 1 shows the number of missing values for each variable. The variable with the most missing values is maximum wind speed, whereas air temperature, dew point temperature, and precipitation have no missing values. Because the number of missing values was low, the affected records were removed from the dataset.
Table 1.
Number of missing values for each variable.
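The missing-value check and removal described above can be sketched with pandas as follows. The column names and values are hypothetical placeholders, not the measured data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the sensor data; names and values are hypothetical.
df = pd.DataFrame({
    "air_temperature": [-3.1, -2.8, -4.0, -5.2, -1.0],
    "max_wind_speed": [4.2, np.nan, 3.9, np.nan, 2.5],
    "amount_of_chemical": [0.1, 0.0, np.nan, 0.3, 0.2],
})

# Number of missing values per variable (the kind of count shown in Table 1).
missing_per_column = df.isna().sum()

# Remove every record that contains at least one missing value.
clean = df.dropna().reset_index(drop=True)

print(missing_per_column)
print(len(clean))
```

Dropping records is reasonable only when few are affected, as in this study; with many gaps, imputation would be an alternative worth considering.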
3.2.2. To Describe Statistical Characteristics of Historical Data
Statistical information on both the input variables and the output variable is presented in Table 2. The number of observations (first column) is 3827 for each variable after removing missing values. The mean value and standard deviation (Std) of the historical data are shown in the second and third columns, respectively. The fourth and fifth columns give the maximum and minimum values of the variables, respectively.
Table 2.
Statistical data description.
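A summary of this kind can be produced with `DataFrame.describe` in pandas. The sketch below uses hypothetical column names and toy values; note that pandas reports the sample standard deviation.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
df = pd.DataFrame({
    "surface_temperature": [-4.0, -2.0, 0.0, 2.0],
    "amount_of_chemical": [0.0, 0.1, 0.2, 0.5],
})

# One row per variable: count, mean, standard deviation, maximum, minimum.
summary = df.describe().T[["count", "mean", "std", "max", "min"]]
print(summary)
```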
3.2.3. To Plot the Distribution of the Output (Amount of Chemical/Salt)
Figure 3 shows the distribution of the amount of chemicals on the road surface. Most observations lie between 0 and approximately 2.5 g, and only a few cases exceed 2.5 g. Since these cases are rare, it may not be useful to train our DNN model on these extreme outliers.
Figure 3.
Distribution of the output variable.
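Alongside a histogram, the share of observations above a threshold gives a quick numerical view of how rare the large values are. The sketch below uses toy values, and the bin edges are an assumption made for illustration.

```python
import pandas as pd

# Toy values for the output variable (hypothetical, in g).
amount = pd.Series(
    [0.0, 0.1, 0.3, 0.2, 1.9, 2.4, 8.5, 0.0, 0.4, 12.0],
    name="amount_of_chemical",
)

threshold = 2.5
share_above = (amount > threshold).mean()  # fraction of observations above 2.5 g

# Coarse binned counts as a text-only stand-in for the histogram.
counts = (
    pd.cut(amount, bins=[0, 2.5, 5, 10, 20], include_lowest=True)
    .value_counts()
    .sort_index()
)
print(share_above)
print(counts)
```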
3.2.4. To Check Correlations between Input Variables and the Output Variable
The calculation of correlations between the input and output variables allowed us to discover which input variables were highly correlated with the output either positively or negatively. As we can see in Table 3, the freezing temperature showed a strong correlation with the amount of chemicals present.
Table 3.
Correlation between input variables and the output variable.
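A correlation ranking of this kind can be computed with `DataFrame.corr`. The following sketch assumes the Pearson coefficient, which the text does not specify, and uses hypothetical column names and toy values.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
df = pd.DataFrame({
    "freezing_temperature": [0.0, -1.0, -2.0, -5.0, -10.0, -20.0],
    "air_temperature": [-3.0, 1.0, -6.0, -2.0, -4.0, -1.0],
    "amount_of_chemical": [0.0, 0.4, 0.9, 2.1, 4.3, 8.5],
})

# Correlation of every input with the output, excluding the output itself.
corr = df.corr(method="pearson")["amount_of_chemical"].drop("amount_of_chemical")

# Rank by absolute value so strong negative correlations are not overlooked.
ranked = corr.reindex(corr.abs().sort_values(ascending=False).index)
print(ranked)
```

Ranking by absolute value matters here because a variable such as the freezing temperature can be strongly but negatively related to the output.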
3.2.5. To Explore the Feature Highly Correlated with the Output through a Scatter Plot
Figure 4 shows a strong relationship between the freezing temperature and the amount of chemical. In this figure, the different colors of the data points display the different levels of grip on the road surface after spreading the chemical (salt). As the freezing temperature dropped, the amount of chemical (salt) used on the wheel track increased, in order to enhance the driving quality on the road surface. Additionally, the level of grip was high in most cases, which indicates the effectiveness of WRM in providing drivers with safe transportation. When the freezing temperature was approximately −20 °C, an appropriate grip could be achieved by using almost 8.5 g of chemicals (salt). However, some points show considerably higher chemical usage (between almost 9 and 17 g) at a freezing temperature of around −20 °C. These data points indicate inefficient WRM, which leads not only to spending the WRM budget on excess material (chemical/salt) but also to damaging the environment, vehicles, and road infrastructure. Therefore, using an excess amount of chemicals is not necessary and must be avoided.
Figure 4.
Scatter plot to understand the relationship between the freezing temperature, amount of chemical, and level of grip.
3.2.6. Distribution of Input Variables Based on the Amount of Chemical through a Box Plot
Excluding freezing temperature, which was highly correlated with the amount of chemical, we plotted charts to explore the distribution of the rest of the input variables based on the amount of chemical. The box plots in Figure 5 and the charts in Figure 6 demonstrate a large amount of variation in the distributions, which indicates the complexity of the problem. (In Figure 5, the X axis is limited to 12.2 to keep the charts readable.)


Figure 5.
Box plots for input variables.

Figure 6.
Distributions of each input variable based on amount of chemical.
3.3. Feature Engineering
Feature engineering is the process of extracting characteristics from the raw data. In order to explore more information from the dataset, we extract the days from Timestamp (converting this string to a DateTime object) to plot the distribution of the amount of chemical per day. As clearly shown in Figure 7, there were some behavioral differences in using chemicals on various days. However, it is not easy to understand significant information from this chart. Therefore, we calculated the mean value of each variable per day and plotted them. We can clearly see the differences in the average value of each variable per day in Figure 8, such as the following: (i) There was not a huge difference between the average amount of chemicals and the level of grip on various days. (ii) The minimum average values of surface temperature, air temperature, and dew point temperature were observed on day 22; however, the average level of grip and amount of chemical were 0.82 and 0.02 g, respectively, which indicates the effectiveness and efficiency of WRM. (iii) The maximum mean value of chemical was used on the 2nd of February since the maximum mean value of the ice layer and the minimum average value of freezing temperature were on the same day.
Figure 7.
Distribution of the output variable on the different days in February 2019.
Figure 8.
Mean value of each variable per day.
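The day extraction and per-day averaging described in this section can be sketched with pandas as follows. Only the `Timestamp` column name comes from the text; the other column names and all values are hypothetical.

```python
import pandas as pd

# Toy records at 10 min resolution; values are hypothetical.
df = pd.DataFrame({
    "Timestamp": [
        "2019-02-02 00:00", "2019-02-02 00:10",
        "2019-02-22 00:00", "2019-02-22 00:10",
    ],
    "level_of_grip": [0.60, 0.70, 0.78, 0.80],
    "amount_of_chemical": [1.0, 3.0, 0.1, 0.3],
})

# Convert the string to a datetime object, then extract the day of the month.
df["Timestamp"] = pd.to_datetime(df["Timestamp"])
df["day"] = df["Timestamp"].dt.day

# Mean value of each variable per day (the quantity plotted in Figure 8).
daily_mean = df.groupby("day")[["level_of_grip", "amount_of_chemical"]].mean()
print(daily_mean)
```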
3.4. Data Preprocessing
Data preprocessing transforms the raw data into a format that machine learning algorithms can process. At this stage, the dataset is divided into a training set (70%) and a test set (30%). Then, scaling is performed and the data range is changed to between 0 and 1 for positive values and −1 and 0 for negative values. We selected the MinMax Scaler as a transformer and fit it only to the training set to prevent data leakage from the test set. After that, both the training set and the test set were transformed.
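The split-then-scale order can be sketched with scikit-learn as follows. The synthetic arrays and the random seed are assumptions. The sketch uses `MinMaxScaler` with its default `feature_range=(0, 1)`, which maps each training feature to the interval [0, 1] whatever its sign; the exact scaler configuration used in this study is not stated, so this is one possible setup.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for inputs and output (hypothetical values).
rng = np.random.default_rng(42)
X = rng.normal(loc=-5.0, scale=4.0, size=(100, 3))
y = rng.uniform(0.0, 2.5, size=100)

# 70% training set, 30% test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the scaler on the training set only, so that no information about
# the test set leaks into the transformation.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.min(axis=0), X_train_scaled.max(axis=0))
```

Because the minimum and maximum are learned from the training set alone, transformed test values can fall slightly outside [0, 1]; this is expected and is the price of avoiding leakage.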
3.5. Creating a DNN Model
We created a sequential model with four hidden layers. We typically base the number of neurons in the layers on the size of the actual feature data. The activation function was the rectified linear unit (ReLU), the final layer had one neuron since there was one output to be predicted, and the 'Adam' optimizer was chosen. Adam is a stochastic optimization method that searches for minima efficiently and often outperforms other adaptive gradient descent algorithms. In addition, since we had a regression problem with a continuous output, the Mean Squared Error (MSE) was selected as the loss function. The number of epochs was 100 and the batch size was 16: a smaller batch size lengthens training but lowers the likelihood of overfitting, because the entire training set is not passed in at once.
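An architecture of this shape can be sketched in Keras as shown below. The number of neurons per hidden layer is not reported above; 11 neurons per layer (one per input variable listed in Section 3.1) is an assumption made for illustration, and `build_model` is a hypothetical helper name.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(n_features: int, units: int) -> keras.Model:
    """Sequential regression network with four ReLU hidden layers."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(units, activation="relu"),
        layers.Dense(units, activation="relu"),
        layers.Dense(units, activation="relu"),
        layers.Dense(units, activation="relu"),
        layers.Dense(1),  # single linear neuron for the continuous output
    ])
    # Adam optimizer and MSE loss, as described in the text.
    model.compile(optimizer="adam", loss="mse")
    return model


# Assumption: 11 input variables and 11 neurons per hidden layer.
model = build_model(n_features=11, units=11)
model.summary()
```

Training with the reported settings would then be a call such as `model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=100, batch_size=16)`, where the array names and the use of the test set as validation data are assumptions.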
4. Results
Model Evaluation and Prediction
Comparing the training loss with the validation loss gives a first check of the model. Figure 9 shows that both the training loss and the validation loss decreased, with no increase in the validation loss at any stage, which indicates that no overfitting occurred. Moreover, Table 4 presents the MSE, the Mean Absolute Error (MAE), and the explained variance (R2). Both the training loss and the validation loss converged to 0.03 (MSE value); however, the MSE value is difficult to interpret. The MAE is easier to interpret because it is the average absolute error across all predictions and can be compared with the mean value of the output in the dataset. The MAE was 0.05, which is less than the mean value of the chemical (0.24). Additionally, R2 provides a further measure for evaluating the model: 97% of the variance could be explained by the model. Furthermore, the scatter plot obtained from the DNN model is shown in Figure 10. The red line represents the perfect prediction line. The outliers (the high amounts of chemical in the dataset) put the model at a disadvantage; however, the DNN model closely predicted the amount of chemical between 0 and almost 7.5 g. Thus, the graphical and numerical evaluation results illustrate the good performance of the model.
Figure 9.
Training loss versus validation loss.
Table 4.
Values of evaluation metrics.
Figure 10.
Scatter plot achieved from the DNN model.
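The numerical metrics reported in Table 4 can be computed with scikit-learn as sketched below. The arrays are toy values, not the model's predictions. Both `explained_variance_score` and `r2_score` are shown because the two measures coincide only when the mean prediction error is zero; which of them underlies the reported value is not stated.

```python
import numpy as np
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)

# Toy observed and predicted amounts of chemical (hypothetical values).
y_true = np.array([0.0, 0.1, 0.3, 1.2, 2.4, 7.5])
y_pred = np.array([0.05, 0.1, 0.25, 1.1, 2.5, 7.3])

metrics = {
    "mse": mean_squared_error(y_true, y_pred),
    "mae": mean_absolute_error(y_true, y_pred),
    "explained_variance": explained_variance_score(y_true, y_pred),
    "r2": r2_score(y_true, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Comparing the MAE with the mean of the observed output, as done above for 0.05 against 0.24, puts the error on the same scale as the quantity being predicted.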
5. Conclusions
In this research article, we utilized exploratory data analysis to discover the relationship between the variables to find the best model to fit our dataset. Due to the large amount of variation in the variables, which makes the model complicated, a deep neural network/deep learning algorithm was employed to build a data-driven model to accurately predict the amount of chemical (here salt) on the wheel track.
The deep learning model was implemented in Python 3 using real observations, measured using an optical sensor, a road-mounted sensor, and the road weather station in February 2019 at test site E18 in Sweden, Northern Europe. Both numerical and graphical evaluation metrics showed that the deep learning model performs well, though there were a few outliers (high amounts of chemical) that could pose a problem in predictions. However, the data analysis section showed that there was no need to use a high amount of chemical to achieve effective WRM. A lower amount of salt can be used to reach both effective and efficient WRM.
Therefore, the findings of this paper can be used as a source of quantitative future insights for winter road maintenance to improve decision support systems, which ultimately leads to maximizing traffic safety while minimizing cost and environmental impacts.
Although the proposed model performs well, the prediction model can be improved by using hybrid models (combining data-driven modeling and physics-based modeling) to better capture potential uncertainties and stochastic parameters.
Author Contributions
Conceptualization, M.H.; methodology, M.H.; software, M.H.; validation, M.H.; formal analysis, M.H.; investigation, M.H.; resources, M.H.; data curation, J.C.; writing—original draft preparation, M.H.; writing—review and editing, M.H.; visualization, M.H.; supervision, G.C.P.P. and J.C.; project administration, G.C.P.P. and J.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Ministry of Education and Research, Norway, grant number 470079.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data were obtained from the Swedish Transport Administration's RWIS station at Test site E18. https://www.trafikverket.se/resa-och-trafik/forskning-och-innovation/aktuell-forskning/transport-pa-vag/testsite-e18--en-vagforskningsstation/ (accessed on 1 January 2020).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Vaitkus, A.; Gražulytė, J.; Skrodenis, E.; Kravcovas, I. Design of frost resistant pavement structure based on road weather stations (RWSs) data. Sustainability 2016, 8, 1328.
- Odelius, J.; Famurewa, S.M.; Forslöf, L.; Casselgren, J.; Konttaniemi, H. Industrial internet applications for efficient road winter maintenance. J. Qual. Maint. Eng. 2017, 23, 355–367.
- Juga, I.; Nurmi, P.; Hippi, M. Statistical modelling of wintertime road surface friction. Meteorol. Appl. 2013, 20, 318–329.
- Riehm, M. Measurements for Winter Road Maintenance. Ph.D. Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 7 December 2012.
- Dan, H.C.; Tan, J.W.; Du, Y.F.; Cai, J.M. Simulation and optimization of road deicing salt usage based on Water-Ice-Salt Model. Cold Reg. Sci. Technol. 2020, 169, 102917.
- Xu, B.; Dan, H.C.; Li, L. Temperature prediction model of asphalt pavement in cold regions based on an improved BP neural network. Appl. Therm. Eng. 2017, 120, 568–580.
- Terry, L.G.; Conaway, K.; Rebar, J.; Graettinger, A.J. Alternative deicers for winter road Maintenance—A Review. Water Air Soil Pollut. 2020, 231, 394.
- Lorentzen, T. Climate change and winter road maintenance. Clim. Chang. 2020, 161, 225–242.
- Hallmark, B.; Dong, J. Examining the effects of winter road maintenance operations on traffic safety through visual analytics. In Proceedings of the IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; pp. 1–6.
- Linton, M.A.; Fu, L. Connected vehicle solution for winter road surface condition monitoring. Transp. Res. Rec. 2016, 2551, 62–72.
- Pu, Z.; Liu, C.; Shi, X.; Cui, Z.; Wang, Y. Road surface friction prediction using long short-term memory neural network based on historical data. J. Intell. Transp. Syst. 2022, 6, 34–45.
- Ahabchane, C.; Trépanier, M.; Langevin, A. Street-segment-based salt and abrasive prediction for winter maintenance using machine learning and GIS. Trans. GIS 2018, 23, 48–69.
- Kelting, D.L.; Laxson, C.L. Review of Effects and Costs of Road De-Icing with Recommendations for Winter Road Management in the Adirondack Park; Adirondack Watershed Institute: Paul Smiths, NY, USA, 2010; pp. 1–84.
- Zehetner, F.; Rosenfellner, U.; Mentler, A.; Gerzabek, M.H. Distribution of Road Salt Residues, Heavy Metals and Polycyclic Aromatic Hydrocarbons across a Highway-Forest Interface. Water Air Soil Pollut. 2009, 198, 125–132.
- Biggs, A.J.W.; Mahony, K.M. Is soil science relevant to road infrastructure? In Proceedings of the 13th International Soil Conservation Organisation Conference (ISCO), Conserving Soil and Water for Society: Sharing Solutions, Brisbane, Australia, 4–8 July 2004; pp. 1–7.
- Zhang, H.; Lepech, M.D.; Keoleian, G.A.; Qian, S.; Li, V.C. Dynamic life-cycle modeling of pavement overlay systems: Capturing the impacts of users, construction, and roadway deterioration. J. Infrastruct. Syst. 2010, 16, 299–309.
- Vignisdottir, H.R.; Ebrahimi, B.; Booto, G.K.; O’Born, R.; Brattebø, H.; Wallbaum, H.; Bohne, R.A. A review of environmental impacts of winter road maintenance. Cold Reg. Sci. Technol. 2019, 158, 143–153.
- Pieper, K.J.; Tang, M.; Jones, C.N.; Weiss, S.; Greene, A.; Mohsin, H.; Parks, J.; Edwards, M.A. Impact of Road Salt on Drinking Water Quality and Infrastructure Corrosion in Private Wells. Environ. Sci. Technol. 2018, 52, 14078–14087.
- Hu, Z.; Zhao, Y.; Khushi, M. A survey of Forex and stock price prediction using deep learning. Appl. Syst. Innov. 2021, 4, 9.
- Zhu, W.; Xie, L.; Han, J.; Guo, X. The application of deep learning in cancer prognosis prediction. Cancers 2020, 12, 603.
- Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24–29.
- McKinney, W. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; pp. 51–56.
- Harris, C.R.; Millman, K.J.; Van Der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
- Waskom, M.; Botvinnik, O.; O’Kane, D.; Hobson, P.; Lukauskas, S.; Gemperline, D.C.; Augspurger, T.; Halchenko, Y.; Cole, J.B.; Warmenhoven, J.; et al. mwaskom/seaborn: v0.8.1. Available online: https://doi.org/10.5281/zenodo.883859 (accessed on 3 September 2017).
- Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
- Keras. GitHub. 2015. Available online: https://github.com/fchollet/keras (accessed on 1 January 2020).
- Trafikverket. Available online: https://www.trafikverket.se/trafikinformation/vag/?TrafficType=personalTraffic&map=3%2F3611591.67%2F6763671.79%2F&Layers=RoadCondition%2BRoadWeather%2B (accessed on 12 December 2019).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).