Open Access Article

Forecasting Road Traffic Deaths in Thailand: Applications of Time-Series, Curve Estimation, Multiple Linear Regression, and Path Analysis Models

1 School of Transportation Engineering, Institute of Engineering, Suranaree University of Technology, 111 University Avenue, Suranaree Sub-District, Muang District, Nakhon Ratchasima 30000, Thailand
2 Department of Logistics Engineering and Transportation Technology, Faculty of Engineering and Industrial Technology, Kalasin University, 62/1 Kaset Sombun Road, Kalasin Subdistrict, Mueang District, Kalasin 46000, Thailand
* Author to whom correspondence should be addressed.
Sustainability 2020, 12(1), 395; https://doi.org/10.3390/su12010395
Received: 27 November 2019 / Revised: 23 December 2019 / Accepted: 2 January 2020 / Published: 3 January 2020
(This article belongs to the Special Issue Traffic Safety within a Sustainable Transportation System)

Abstract

In 2018, 19,931 people were killed in road accidents in Thailand, so a reduction in the number of accidents is urgently required. A master plan for reducing the number of accidents requires forecast data; thus, we aimed to identify an appropriate forecasting method. We considered four methods in this study: time-series analysis, curve estimation, regression analysis, and path analysis. The data used in the analysis included the death rate per 100,000 population, gross domestic product (GDP), the number of registered vehicles (motorcycles, cars, and trucks), and the energy consumption of the transportation sector. The results show that the best three models, based on the mean absolute percentage error (MAPE), are multiple linear regression model 3, time-series analysis with exponential smoothing, and path analysis, with MAPE values of 6.4%, 8.1%, and 8.4%, respectively.
Keywords: accident forecasting; multiple linear regression model; time-series; path analysis

1. Introduction

1.1. Road Safety Situation in Thailand

The Decade of Action for Road Safety 2011–2020 initiative was introduced to increase awareness across various countries of the measures that lead to increased road safety. The United Nations (UN) General Assembly delegated the World Health Organization (WHO) to monitor the progress of the initiative through a series of documents, compiled into the Global Status Report on Road Safety 2015. This report described an estimated 1.25 million fatalities caused by accidents in 2013 and indicated that roads in low- and middle-income countries were less safe than those in high-income countries; in particular, their death rates were twice those of high-income countries. Thailand has been classified as a middle-income country and was second in the global ranking, with a death rate of 36.2 per 100,000 population (Libya ranked first with a rate of 73.4 per 100,000 population). Thailand has been reported to be the most dangerous country in the world for motorcycle users, with a death rate of 27.3 deaths per 100,000 population [1]. In November 2017, James Burton [2] reported that, because Libyan officials had declared that the road fatalities in Libya were caused by fighting on the roads rather than by driving, Thailand was then ranked as having the highest road death rate in the world (Figure 1). In that ranking, Thailand was followed by Malawi and Liberia, with death rates of 35.0 and 33.7 per 100,000 population, respectively. Similarly, the report World Health Statistics 2017: Monitoring Health for the Sustainable Development Goals (SDGs) identified Thailand as having the highest number of road deaths in the WHO Southeast Asia Region (SEAR), whereas the Bolivarian Republic of Venezuela was identified as having the highest road death rate in the WHO Region of the Americas [3].
In Thailand, road accident statistics are collected by many departments and are regularly published as annual reports. Each department records data for its own purposes, so no single source accurately reflects the situation in Thailand. According to the 2018 statistics, 19,931 people were killed in road accidents in Thailand, as reported by the Integrated Information System for road deaths (RTDDI: Road Traffic Death Data Integration) [4], the Bureau of Non-Communicable Disease, the Department of Communicable Disease Control, and the Ministry of Public Health [5]. This system began integrating road accident fatality data from three databases in 2013: the system for all deaths acquired from death notifications, including death certificates and death registration [5]; the Royal Thai Police system (POLIS), consisting of case records [6]; and the E-Claim system. The integration was intended to compose the most comprehensive death data, with the support of the Thai Health Promotion Foundation (ThaiHealth) and the WHO in Thailand [7].

1.2. Importance of Accident Prediction

Given these unsafe road situations, the reduction of accidents to reduce the number of deaths has become an urgent issue that must be addressed by the government. Forecasting the number of accidents on highways is necessary for developing road safety plans in terms of staff, budgets, and policies. In addition, instruments or techniques for effective forecasting must be found [8].
Examples of road safety policies in Thailand include law enforcement (e.g., for speed violations or alcohol consumption), road safety programs in educational institutions, creation of advertising media, an increase in the training hours required to obtain new driver’s licenses or for renewals, engineering solution techniques for road safety audits, and research funding. To set these policies, the forecast data of the number of accidents have been used for determining the operating budgets.

1.3. Previous Study in Road Accident Prediction

Various additional factors may potentially affect accident forecasting. According to international research studies (Table 1), time-series methods have been applied for accident prediction; for example, Quddus [9], Ramstedt [10], Dadashova et al. [11], García-Ferrer et al. [12], Zheng and Liu [13], Sanusi et al. [14], Parvareh et al. [15] and Dadashova et al. [16]. The Autoregressive integrated moving average (ARIMA) model has mostly been applied to time-series analysis.
Similarly, for regression analysis, Michalaki et al. [17] and Lu and Tolliver [18] used 6 and 18 years of historical data, respectively, for accident forecasting. Oh et al. [19], García-Ferrer et al. [12], Zheng and Liu [13], and Ameen and Naji [20] used 4, 28, 13, and 17 years of historical data, respectively, to predict accidents. Other techniques have also been applied, such as the Poisson regression model, the negative binomial model, the Conway–Maxwell–Poisson model, the Bernoulli model, the hurdle Poisson model, the zero-inflated Poisson (ZIP) model, and neural network models.
Accident data include accident frequencies, the number of deaths, severe injuries, and minor injuries (including the number of pedestrian injuries caused by road accidents). Environmental data, such as location, time, weather, road conditions, crossing locations, infrastructure equipment, and historical accident information may be included in the analysis. For example, Lu and Tolliver [18] used the number of kilometers travelled by vehicles; Quddus [9] used vehicle characteristics and other behavioral factors; Michalaki et al. [17] used road pavement data; and the number of lanes were used by Oh et al. [19].
Economic factors have also been used in the analysis by Dadashova et al. [11]. García-Ferrer et al. [12] included energy consumption in their analysis. Alcohol consumption was considered by Ramstedt [10]. Some studies have included law enforcement in the analysis, e.g., Quddus [9]. Vehicle registration was used by García-Ferrer et al. [12] and Ameen and Naji [20]. Only Ameen and Naji [20] included population data in their analysis, along with other data such as average daily traffic (ADT; vehicles per day) and other behavioral factors.
In general, most studies applied only one method, such as time-series analysis with an ARIMA model. If more than one method was used, the additional method was typically a regression model. Therefore, the study of other methods is necessary. The aim of this study was to apply time-series analysis using exponential smoothing and multiple linear regression techniques to 20 years of historical data, from 1997 to 2016. Two other techniques were also applied, curve estimation and path analysis, which have not yet been used in any other study.
In our study, we included economic factors, energy consumption, vehicle registration, and population to develop our model. The effectiveness of the model was tested using the research question “What is an effective model for forecasting road traffic deaths?” Thus, we aimed to identify an appropriate method for forecasting road traffic deaths in Thailand and to compare the effectiveness of the models. Four methods were considered in this study: time-series analysis, curve estimation, regression analysis, and path analysis. The remainder of this paper is structured as follows: the materials and methods are presented in Section 2, followed by the results, the discussion, and then the limitations and future work.

2. Materials and Methods

2.1. Data Collection

In this research, we use Thailand’s statistics for forecasting the death rate from road accidents. Thailand collects data from a wide range of departments and information from the integration of three databases; however, the number of years was not sufficient to develop the model. Other studies used historical data of the past 6 to 55 years for forecasting. Thailand has completely collected historical data for the past 20 years. Therefore, we used statistics of the past 20 years (1997–2016) of road accident fatalities according to the traffic lawsuit data from the Royal Thai Police [6].
The data used in our study comprised three data sets: population data, economic data, and transportation data. Population statistics were obtained from the Department of Provincial Administration [21]; gross domestic product (GDP) from the Bank of Thailand [22]; the number of registered vehicles from the Department of Land Transport [23]; and the energy consumption of the transport sector from the Ministry of Energy [24].

2.2. Methodology

The analysis was divided into four parts: Studying past data trends; conducting the analysis using the developed model to forecast the death rates from road accidents using time-series analysis, curve estimation, multiple regression analysis, and path analysis; comparing forecast accuracy by seeking the optimal model with minimum errors; and using the model to make a prediction by forecasting trends for the next 10 years.
The statistical data trends for all components of our analysis are shown in Figure 2. Figure 2a depicts the death rate per 100,000 population: the road death rate was 22.75 in 1997 and 20.89 in 2002, and it then decreased steadily until 2010, when it reached 10.34. However, the rate subsequently increased, rising from 11.33 in 2013 to 12.69 in 2016, a period of 3 years. The economy (as indicated by GDP) grew continuously from 4.71 billion baht in 1997 to 13.02 billion baht in 2016. The number of registered vehicles grew from 17.67 million vehicles in 1997 to 37.34 million vehicles in 2016 (despite a decrease in 2004), and the energy consumption of the transport sector tended to increase, from 20.25 Ktoe in 1997 to 30.19 Ktoe in 2016. These trends reflected Thailand’s economic growth and improvements in population well-being, as shown in Figure 2b–d, respectively.

2.3. Data Analysis

The data analysis included four techniques: Time-series analysis, curve estimation, multiple regression analysis, and path analysis. Individual techniques were used differently during each component of our analysis.

2.3.1. Time-Series

Model Specification

Numerous techniques have been proposed for time-series analysis, such as the moving average (MA), exponential smoothing (ES), the double exponential smoothing (DES) model, trend analysis by linear regression (LR), and Winters’ method.
Time-series analysis was conducted in our study using the exponential smoothing technique, a technique that values more recent information more highly. Data importance decreases according to the temporal distance of data from the present [25]. The equation is as follows:
$$F_t = a A_{t-1} + (1 - a) F_{t-1},$$
where $F_t$ is the forecast for period $t$, $F_{t-1}$ is the forecast for the preceding period, $A_{t-1}$ is the actual value in the preceding period, and the smoothing coefficient $a$ takes a value between 0 and 1. The closer $a$ is to 0, the lower the weight given to the most recent observation [26].
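The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `exponential_smoothing`, the choice to seed the first forecast with the first observation, and the sample series are all assumptions made for the example.

```python
def exponential_smoothing(actuals, a, initial_forecast=None):
    """One-step-ahead forecasts F_t = a*A_{t-1} + (1 - a)*F_{t-1}.

    Returns len(actuals) + 1 values: entry t is the forecast for period t,
    and the last entry is the forecast for the first unobserved period.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("smoothing coefficient a must lie in [0, 1]")
    if not actuals:
        raise ValueError("need at least one observation")
    # Assumption: with no better information, seed F_0 with the first observation.
    forecasts = [actuals[0] if initial_forecast is None else initial_forecast]
    for actual in actuals:
        forecasts.append(a * actual + (1.0 - a) * forecasts[-1])
    return forecasts


if __name__ == "__main__":
    # Hypothetical series, not data from the study.
    print(exponential_smoothing([10.0, 12.0, 11.0], a=0.5))
```

A value of a close to 1 makes the forecast follow the latest observation closely, whereas a value close to 0 keeps the forecast near its previous value.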

Model Fitting and Validation

We compared the effectiveness of the forecasting models by evaluating the forecast error on a training data set. The lowest mean absolute percentage error (MAPE) was used as the criterion for selecting the most effective forecasting model.

2.3.2. Curve Estimation

SPSS 18.0 software (SPSS Inc., Chicago, IL, USA) was used to conduct curve estimation. The independent variable was time (years), with road death rates per 100,000 population calculated using 10 models consisting of linear, logarithmic, inverse, quadratic, cubic, compound, power, S, growth, and exponential models. The equations of the models are as follows, respectively:
$$E(y)_t = b_0 + b_1 T,$$
$$E(y)_t = b_0 + b_1 \ln(T),$$
$$E(y)_t = b_0 + \frac{b_1}{T},$$
$$E(y)_t = b_0 + b_1 T + b_2 T^2,$$
$$E(y)_t = b_0 + b_1 T + b_2 T^2 + b_3 T^3,$$
$$E(y)_t = b_0 \, b_1^{T},$$
$$E(y)_t = b_0 \, T^{b_1},$$
$$E(y)_t = \exp\left(b_0 + \frac{b_1}{T}\right),$$
$$E(y)_t = \exp\left(b_0 + b_1 T\right),$$
$$E(y)_t = b_0 \exp\left(b_1 T\right),$$
where $E(y)_t$ is the predicted value of the death rate from road accidents, $T$ is time ($T$ = 1, 2, 3, …, 20 years), $b_0$ is a constant (or y-intercept), and $b_1$–$b_3$ are coefficients.
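As an illustration of how such curves can be estimated, the sketch below fits the single-predictor families by ordinary least squares after transforming time, the response, or both. It is not the SPSS procedure used in the study: the helper names `simple_ols` and `fit_curve` are hypothetical, fitting the multiplicative families on the log scale is an assumption, and the quadratic and cubic models are omitted because they need multiple regression on powers of T. The compound and exponential models describe the same curve as the growth model under a different parameterization, so only the growth form is shown.

```python
import math


def simple_ols(x, y):
    """Closed-form least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def fit_curve(family, times, values):
    """Fit one curve family and return a function T -> predicted value."""
    if family == "linear":
        b0, b1 = simple_ols(times, values)
        return lambda t: b0 + b1 * t
    if family == "logarithmic":
        b0, b1 = simple_ols([math.log(t) for t in times], values)
        return lambda t: b0 + b1 * math.log(t)
    if family == "inverse":
        b0, b1 = simple_ols([1.0 / t for t in times], values)
        return lambda t: b0 + b1 / t
    # The remaining families are linear in ln(y); this requires values > 0.
    log_values = [math.log(v) for v in values]
    if family == "power":
        c0, b1 = simple_ols([math.log(t) for t in times], log_values)
        return lambda t: math.exp(c0) * t ** b1
    if family == "s":
        b0, b1 = simple_ols([1.0 / t for t in times], log_values)
        return lambda t: math.exp(b0 + b1 / t)
    if family == "growth":
        b0, b1 = simple_ols(times, log_values)
        return lambda t: math.exp(b0 + b1 * t)
    raise ValueError(f"unsupported family: {family}")


if __name__ == "__main__":
    # Hypothetical series generated from 2 + 3T, not data from the study.
    times = [1, 2, 3, 4, 5]
    values = [2 + 3 * t for t in times]
    print(fit_curve("linear", times, values)(6))
```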

2.3.3. Multiple Regression Analysis

Multiple regression analysis was used to determine the relationships between the dependent variable Y and the independent variables, and to discover which relationships were linear. The following equation shows the relationship between Y and X1, X2, …, Xk [27]:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e,$$
where $Y$ is the dependent variable, $X_1$–$X_k$ are the independent variables, $\beta_0$ is a constant (or y-intercept), $\beta_1$–$\beta_k$ are the coefficients of the respective variables (loadings or partial slopes), which describe the change in $Y$ when the corresponding $X$ changes, and $e$ is the error term [27].
Four model specification techniques were used to select the variables in a regression model: (1) all possible regression, where all the independent variables are used in the equation; (2) forward selection, where one independent variable is added to the model at a time, and the process is repeated until no remaining independent variable can explain the variation of the dependent variable; (3) backward elimination, where the independent variables that only slightly explain the dependent variable are removed one at a time; and (4) stepwise regression, where one independent variable is added at a time, as in forward selection, but a variable already in the model may be removed later, depending on the specified significance level. The equation used was as follows:
$$Y = \beta_0 + \beta_1\,\mathrm{VEH\_MOTORCYCLE} + \beta_2\,\mathrm{VEH\_CAR} + \beta_3\,\mathrm{VEH\_TRUCK} + \beta_4\,\mathrm{GDP} + \beta_5\,\mathrm{EN\_TRANSPORT},$$
where Y is the death rate per 100,000 population, VEH_MOTORCYCLE is the number of registered motorcycles, VEH_CAR is the number of registered cars, VEH_TRUCK is the number of registered trucks, GDP is the gross domestic product (billion Baht), and EN_TRANSPORT is the energy consumption (Ktoe).
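To make the estimation step concrete, the following sketch computes least-squares coefficients for a model of this form by solving the normal equations. It is an illustration only: the study used statistical software with variable selection, whereas this code fits a fixed set of predictors, and the function names and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [beta0, beta1, ..., betak] minimising the squared error."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Evaluate beta0 + beta1*x1 + ... + betak*xk for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


if __name__ == "__main__":
    # Hypothetical predictors and response (y = 1 + 2*x1 - 0.5*x2), not study data.
    rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
    y = [1 + 2 * x1 - 0.5 * x2 for x1, x2 in rows]
    print(fit_ols(rows, y))
```

Solving the normal equations directly is adequate for a handful of predictors; with strongly correlated predictors, such as several vehicle counts that grow together over time, a QR-based solver from a numerical library would be the more robust choice.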

2.3.4. Path Analysis

This technique delineates all the relationships involved in a structural equation model (SEM). Path analysis identifies the details of bivariate relationships between two variables and the weighting of values connected to these points. The researcher determines the number and type of variables and relative paths [28]. The statistics, based on the regression analysis, are used to examine the independent variables that directly and indirectly influence the dependent variable. This describes the relationships between the variables [29]. We used Mplus 7.2 software [30] to conduct the analysis.
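The distinction between direct and indirect influence can be written out for a generic three-variable path model. The symbols below ($X$, $M$, $Y$, and the path coefficients $a$, $b$, $c$) are illustrative assumptions and do not correspond to the fitted model in this study.

```latex
% Generic path model: X affects Y directly and also through a mediator M.
\begin{align}
  M &= a X + e_M, \\
  Y &= c X + b M + e_Y.
\end{align}
% Substituting the first equation into the second gives
\begin{align}
  Y &= (c + a b) X + b\, e_M + e_Y,
\end{align}
% so the direct effect of X on Y is c, the indirect effect through M is ab,
% and the total effect is c + ab.
```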
Model specification is the procedure used to develop a structural equation model; it requires reviewing concepts, theory, and related studies to form an assumption. A path diagram indicates the relationships between variables, and a structural equation model is developed by following the assumptions expressed in the path diagram and fitting a particular model using a variance–covariance matrix. An appropriate model is one that explains the relationships between variables reasonably and consistently with the empirical data.
In model estimation, Mplus 7.2 software [30] provided estimated parameters of the model according to the specified values of the model. Several techniques can be used to estimate parameters, for instance, maximum likelihood (ML), generalized least squares (GLS), and weighted least squares (WLS). In this study, we used ML, which has been widely applied and is suitable for interval and ordinal scale data. ML estimation assumes that the observed variables follow a multivariate normal distribution and that the samples are independent; normality was assessed using skewness (Sk) ≤ 3 and the kurtosis index (Ku) ≤ 10 [31].
Goodness of fit is used to assess the consistency of an SEM with the empirical data by comparing the fit indices of the model against statistical criteria: chi-square value/degrees of freedom (df) < 3 [31], root mean square error of approximation (RMSEA) < 0.07 [32], comparative fit index (CFI) ≥ 0.90 [33], Tucker–Lewis index (TLI) ≥ 0.80 [34,35], and standardized root mean square residual (SRMR) < 0.08 [33].

2.3.5. Evaluating Model Efficiency

The MAPE [29] was used to evaluate model efficiency with the following formula:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{F_i - O_i}{O_i} \right| \times 100,$$
where Fi is the predictive value acquired from each year, Oi is the actual value that occurred each year, and n = 20 years.
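The MAPE definition translates directly into code. The sketch below is a minimal version; the function name `mape` and the sample values are hypothetical, and the guard against zero observations is an added assumption because the ratio is undefined in that case.

```python
def mape(forecasts, observations):
    """Mean absolute percentage error, in percent."""
    if len(forecasts) != len(observations) or not observations:
        raise ValueError("need two non-empty sequences of equal length")
    if any(o == 0 for o in observations):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((f - o) / o) for f, o in zip(forecasts, observations))
    return 100.0 * total / len(observations)


if __name__ == "__main__":
    # Hypothetical values: each forecast is off by 10% of the observed value.
    print(mape([110.0, 90.0], [100.0, 100.0]))
```

Because each error is divided by the observed value, MAPE is scale-free, which allows models fitted with different predictors to be compared on the same footing.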

3. Results

3.1. Statistical Results

3.1.1. Time-Series Data Analysis

Exponential smoothing was used to analyze the data. Holt’s linear trend technique, which uses two smoothing constants, was applied to weight the data, with data importance gradually decreasing with distance from the present. This technique is suited to data with a linear trend and no seasonal influences (Figure 2a). It resulted in a MAPE value of 8.1%.
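Holt's linear trend technique can be sketched as follows. This is a textbook formulation, not the authors' software output: the function name `holt_forecast`, the initialisation (level from the first observation, trend from the first difference), and the sample series are assumptions, and the two smoothing constants are written here as `alpha` (level) and `beta` (trend).

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear trend method; returns forecasts for 1..horizon steps ahead."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie in [0, 1]")
    # Assumed initialisation: first observation and first difference.
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Level: blend the new observation with the previous one-step forecast.
        level = alpha * value + (1.0 - alpha) * (level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


if __name__ == "__main__":
    # Hypothetical, perfectly linear series; the forecasts continue the line.
    print(holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.5, beta=0.3, horizon=3))
```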

3.1.2. Curve Estimation Technique

The results of the curve estimation analysis in Table 2 show that the three most accurate models were the cubic model $E(y)_t = 18.262 + 1.487T - 0.189T^2 + 0.005T^3$, the quadratic model $E(y)_t = 20.772 + 0.207T - 0.040T^2$, and the linear model $E(y)_t = 23.879 - 0.641T$, with adjusted $R^2$ values of 0.813, 0.794, and 0.724, respectively.
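The three fitted curves can be evaluated directly from the coefficients reported above. The sketch below assumes that T = 1 corresponds to 1997, so that T = 20 is 2016 and T = 30 is 2026; because the published coefficients are rounded, the values are approximations rather than a reproduction of the paper's tables.

```python
def cubic(t):
    """Cubic curve with the rounded coefficients reported in the text."""
    return 18.262 + 1.487 * t - 0.189 * t ** 2 + 0.005 * t ** 3


def quadratic(t):
    """Quadratic curve with the rounded coefficients reported in the text."""
    return 20.772 + 0.207 * t - 0.040 * t ** 2


def linear(t):
    """Linear curve with the rounded coefficients reported in the text."""
    return 23.879 - 0.641 * t


if __name__ == "__main__":
    # Assumption: T = 1 is 1997, so T = 20 is 2016 and T = 30 is 2026.
    for t in (20, 30):
        year = 1996 + t
        print(year, round(cubic(t), 3), round(quadratic(t), 3), round(linear(t), 3))
```

With these rounded coefficients, the cubic curve turns upward between T = 20 and T = 30 (about 12.4 and 27.8), the quadratic curve falls below zero by T = 30 (about −9.0), and the linear curve keeps declining (about 11.1 and 4.6). This shows how sensitive long-horizon extrapolation is to the functional form chosen.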

3.1.3. Multiple Regression Analysis

Multiple regression analysis was used to analyze the models, with the death rate from road accidents as the dependent variable and the number of registered vehicles, the GDP, and energy consumption as the independent variables. The results, which are shown in Table 3, are as follows:
Model 1: GDP and the energy consumption of the transport sector affected the death rate from road accidents (p < 0.05); however, the coefficient values were very low, at $-2.393 \times 10^{-6}$ (p < 0.001) and 0.001 (p < 0.05), respectively. The model accuracy value was 0.842 (84.2%), and the F-test value was 51.814.
Model 2: Factors were adjusted for population. We found that the number of registered cars per population and the energy consumption of the transport sector per population affected the death rate from road accidents (p < 0.05). However, the coefficient values remained very low, at −0.127 (p < 0.001) and 0.424 (p < 0.001), respectively. The model accuracy value was 0.853 (85.3%), and the F-test value was 55.915.
Model 3: Factors were adjusted for the number of registered vehicles. We found that the number of registered cars per registered vehicle, the number of registered trucks per registered vehicle, and the energy consumption of the transport sector per registered vehicle affected the death rate from road accidents (p < 0.05). However, the coefficient values were still very low, at 0.081 (p < 0.001), −0.7329 (p < 0.05), and 0.260 (p < 0.001), respectively. The model accuracy value was 0.888 (88.8%), and the F-test value was 51.144.

3.1.4. Path Analysis

The path analysis model had the following goodness-of-fit values: chi-square = 17.706, df = 3, p = 0.0005, RMSEA = 0.495, CFI = 0.884, TLI = 0.536, and standardized root mean square residual (SRMR) = 0.078. These values were compared with the goodness-of-fit criteria, namely chi-square/df < 3 [31], RMSEA < 0.07 [32], CFI ≥ 0.90 [33], TLI ≥ 0.80 [34,35], and SRMR < 0.08 [33]. Only the SRMR met its criterion.
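The comparison of the reported fit indices against the stated thresholds can be checked mechanically. The sketch below uses only the values and criteria quoted in this section; the dictionary keys and the function name `check_fit` are hypothetical.

```python
# Fit statistics as reported for the path analysis model.
REPORTED = {
    "chi_square": 17.706,
    "df": 3,
    "rmsea": 0.495,
    "cfi": 0.884,
    "tli": 0.536,
    "srmr": 0.078,
}


def check_fit(stats):
    """Return a mapping from each criterion to whether it is satisfied."""
    return {
        "chi-square/df < 3": stats["chi_square"] / stats["df"] < 3,
        "RMSEA < 0.07": stats["rmsea"] < 0.07,
        "CFI >= 0.90": stats["cfi"] >= 0.90,
        "TLI >= 0.80": stats["tli"] >= 0.80,
        "SRMR < 0.08": stats["srmr"] < 0.08,
    }


if __name__ == "__main__":
    for criterion, satisfied in check_fit(REPORTED).items():
        print(f"{criterion}: {'met' if satisfied else 'not met'}")
```

The chi-square/df ratio is 17.706/3, or about 5.9, which exceeds the threshold of 3; of the five criteria, only the SRMR threshold is satisfied.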
According to Table 4, the factors that directly affected the death rate from road accidents were GDP, the energy consumption of the transport sector, and the number of registered cars, with unstandardized (standardized) factor loadings of −0.106 (−0.193), 0.834 (0.992), and −0.120 (−1.485) at p-values of <0.05 (<0.05), <0.001 (<0.001), and <0.05 (<0.05), respectively. The independent variables explained 86.6% of the variance in the dependent variable ($R^2$ = 0.866); however, the transport sector energy consumption factor had an influence on the GDP, thus indirectly affecting the number of road deaths, as seen in the path analysis model results in Figure 3.

3.2. Comparison of Model Performance

We constructed eight models to predict the number of fatalities from road accidents using different predictive techniques and identified the most effective model using the MAPE (Equation (15)). Table 5 lists the results of our comparison. Multiple linear regression model 3 had the lowest MAPE value (6.4%), followed by the time-series model with exponential smoothing, path analysis, multiple linear regression model 2, curve estimation (quadratic), curve estimation (cubic), curve estimation (linear), and multiple linear regression model 1, with MAPE values of 8.1%, 8.4%, 9.5%, 10.2%, 11.2%, 12.6%, and 12.8%, respectively.
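The ranking can be reproduced by sorting the reported MAPE values. The sketch below uses the eight values quoted above; the labels are shortened for readability and the variable names are hypothetical.

```python
# MAPE values (percent) as reported for the eight models.
MAPE_BY_MODEL = {
    "multiple linear regression model 1": 12.8,
    "multiple linear regression model 2": 9.5,
    "multiple linear regression model 3": 6.4,
    "time-series (exponential smoothing)": 8.1,
    "path analysis": 8.4,
    "curve estimation (linear)": 12.6,
    "curve estimation (quadratic)": 10.2,
    "curve estimation (cubic)": 11.2,
}

# Lower MAPE means a smaller average percentage error, so sort ascending.
RANKED = sorted(MAPE_BY_MODEL.items(), key=lambda item: item[1])

if __name__ == "__main__":
    for rank, (model, value) in enumerate(RANKED, start=1):
        print(f"{rank}. {model}: {value}%")
```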

3.3. Forecasting the Death Rate from Road Accidents per 100,000 Population

As displayed in Table 6, the predictions from the compared models required additional data (i.e., GDP, the number of registered vehicles (motorcycles, cars, and trucks), and the transport sector energy consumption). These related factors were forecast using the second-best forecasting technique, time-series analysis with exponential smoothing. As shown in Table 7, a 10-year forecast was then generated. The time-series model (exponential smoothing), curve estimation (quadratic), curve estimation (linear), multiple regression models 2 and 3, and path analysis indicated a declining trend, reflecting the fatality statistics from the National Police Bureau (Figure 2). However, some models predicted a death rate close to zero, and the multiple linear regression model 1 and curve estimation (cubic) models predicted increasing trends. Figure 4 presents a comparison of the predictions using the various techniques.

4. Discussion

Before Thailand was ranked first in road accident deaths per 100,000 population in 2017, the WHO had already ranked Thailand’s road accident situation as the world’s second-worst in 2015. This ranking created the unpleasant image that travel in Thailand is unsafe. The organizations and departments involved have recognized the need to prioritize measures to decrease the worsening road safety situation, including law enforcement, education campaigns for schools, advertising media, and increasing the training hours required to obtain a new driver’s license or renew a license, as well as using technical engineering solutions. Funding for research has been provided to investigate and identify solutions to this problem in an atmosphere of economic growth focusing on travel, accident risks, and into solutions for reducing the number of accidents and deaths. Among the measures mentioned above, the National Police Bureau statistics indicated that the death rate has been decreasing; however, in 2016, the rate actually increased (Figure 2a).
The GDP, the number of registered vehicles, and transport sector energy consumption are likely to increase in the future (Figure 2b–d). Therefore, in this study, we analyzed statistical data to forecast the death rate from road accidents by a time-series model using exponential smoothing, curve estimation, multiple linear regression, and path analysis, using official Thai statistical data collected over the past 20 years. The research results are as follows:
(1) The time-series model using exponential smoothing is suitable for predicting the death rate from road accidents. ARIMA-based time-series techniques have also been used to forecast accidents [9,10,11,12,16]. However, the Thai data set was not suitable for the ARIMA technique. The exponential smoothing technique yielded MAE and MAPE values of 1.627 and 8.1%, respectively.
(2) Curve estimation with cubic, quadratic, and linear patterns produced the three models with the highest R2 values.
(3) Multiple linear regression model 1 found that GDP was a good economic indicator, in agreement with a previous report by Dadashova et al. [16], and that the transport sector energy consumption level affected the death rate from road accidents, in agreement with the results reported by García-Ferrer et al. [12]; the number of registered vehicles (motorcycles, cars, and trucks) had no effect.
(4) Using multiple linear regression model 2 (where the factors were adjusted for population), we found that the number of registered cars and transport sector energy consumption affected the death rate from road accidents, whereas the other factors had no effect.
(5) Using multiple linear regression model 3 (where the factors were adjusted for the number of registered vehicles), we found that the number of registered cars, the number of registered trucks, and the amount of energy consumed by the transport sector affected the death rate from road accidents, whereas the other factors had no effect.
(6) The path analysis model showed that GDP, energy consumption, and the number of registered cars were factors that directly influenced the road death rate. The amount of energy consumed by the transport sector was a factor influenced by the GDP, which indirectly affected the number of road deaths.
The three models with the lowest MAPE were multiple linear regression model 3, the time-series (exponential smoothing) model, and the path analysis model, with MAPE values of 6.4%, 8.1%, and 8.4%, respectively.
When the models were used to predict the death rate from road accidents, we found that the time-series (exponential smoothing), curve estimation (quadratic), curve estimation (linear), multiple regression 2 and 3, and path analysis models forecasted decreasing fatal accident trends, which supports the data on the direction of the death rates provided by the Royal Thai Police [6]. We found that the multiple regression linear 1 and curve estimation (cubic) models generated forecasts that contrasted the trends observed in the data provided by the National Police Bureau, whereas, the curve estimation (quadratic), multiple regression linear 1, and path analysis models predicted a zero value, and thus, are not suitable for long-term forecasting. Only the time-series (exponential smoothing), curve estimation (linear), and multiple regression linear 3 models generated predictions that were consistent with the trends present in the original statistical data.
ARIMA models have been applied to accident forecasting [9]. However, Thailand’s data were not appropriate for applying an ARIMA model, but were suitable for exponential smoothing, and hence, could be used in forecasting. The economic growth data that were used in forecasting by applying multiple regression linear and path analysis were GDP, the energy consumption of the transport sector, and the number of registered vehicles (motorcycles, cars, and trucks).
According to our data, none of the models were found to be suitable for predicting the death rate from road accidents 10 years in advance. The road death rates may feasibly decline over a period of more than 10 years as the various measures are implemented. The economic and transportation factors considered, which reflect the economic growth of the country, had both direct and indirect effects on the road death rates. If considered in depth, some data may be useful for informing government policy-making and for designing preventive measures to reduce the causes of accidents, especially the number of registered cars on the roads, which is directly related to the number of accidents. In addition to personal and environmental factors, the appropriate control of the rate of vehicle occupancy should be considered. The legal driving age should be reviewed, along with knowledge of traffic regulations and proven experience in safe driving, when applying for a driver’s license.
Appropriate policies are required to reduce fatal accidents in the public sector. Due to the mixed traffic road conditions in Thailand, trucks and other large vehicles share roadways with other small- or medium-sized vehicles, which may cause dangerous situations and accidents. The public sector should implement policies to rigorously control the driving speed, covering all types of vehicles and providing exclusive lanes for freight vehicles. Principally, these policies may help drive Thailand’s economic growth, consistent with the results of multiple linear regression model 1, in which positive growth of the economic factor GDP, as an overall gross product of the nation, indicated increased economic activity (e.g., import and export, and generation of jobs and income).
The public sector must create policies related to the control of the possession of vehicles, including stricter measures, such as requiring declaring a driver’s license, and consideration of traffic violation history and accident history, to possess a vehicle. This policy would be consistent with the results of multiple linear regression models 2 and 3 and the path analysis model. Energy consumption in the transportation sector was found to be connected to the recent number of registered vehicles in Thailand, which has been increasing.
Other factors affecting these policies could be investigated in terms of budgets for solving the accident problems considered in this study for mitigating and preventing accidents or deaths; for example, by providing knowledge and understanding of accidents through public relations by community leaders or organizations, or providing a driver’s license.

5. Limitations and Future Work

Forecasting models have limitations that can produce misleading trend predictions. When forecasting road accident death rates, other factors must be considered, including the enforcement of laws on speed limits, drunk driving, helmet wearing, seat belt use, phone use while driving, and other distracted driving behaviors, as well as public transport use, transport infrastructure developments (lane markings, lighting, road markers, signage, intersections, warnings), and other economic and social issues, which have yet to be analyzed systematically.
Because Thailand has not fully collected data according to the plan set 20 years ago, many types of data were unavailable for our analysis. Thailand’s accident data are also spread across several databases, which affected the consistency of the data.

Author Contributions

Conceptualization, S.J.; Data curation, S.U.; Formal analysis, S.J. and S.U.; Funding acquisition, S.J.; Methodology, S.J.; Supervision, V.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Suranaree University of Technology Research and Development Fund grant number IRD7-704-61-12-11 and the APC was funded by Suranaree University of Technology.

Acknowledgments

The authors would like to thank the Suranaree University of Technology Research and Development Fund.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. WHO. Global Status Report on Road Safety 2015; World Health Organization: Geneva, Switzerland, 2015; p. 340.
  2. James Burton, W. Top 25 Countries in Car Accidents. Available online: https://www.worldatlas.com/articles/the-countries-with-the-most-car-accidents.html (accessed on 27 April 2018).
  3. WHO. World Health Statistics 2017: Monitoring Health for the SDGs, Sustainable Development Goals; World Health Organization: Geneva, Switzerland, 2017.
  4. Department of Disease Control. Road Traffic Death Data Integration: RTDDI, Retroactive Annual Fatality. Available online: http://rti.ddc.moph.go.th/RTDDI/Modules/About.aspx (accessed on 9 February 2018).
  5. Ministry of Public Health. Public Health Statistics A.D.; Strategy and Planning Division, Ministry of Public Health: Mueang Nonthaburi, Thailand, 2016.
  6. Royal Thai Police. Traffic Accident on National Highways in 2016. Available online: https://www.m-society.go.th/ewt_news.php?nid=19593 (accessed on 30 October 2018).
  7. ThaiRSC. Statistics of Traffic Accident Provincial Level. Available online: http://www.thairsc.com/ (accessed on 9 February 2018).
  8. Brian Bass. Advantages and Disadvantages of Forecasting Methods of Production and Operations Management. Available online: http://smallbusiness.chron.com/advantages-disadvantages-forecasting-methods-production-operations-management-19309.html (accessed on 9 February 2018).
  9. Quddus, M.A. Time series count data models: An empirical application to traffic accidents. Accid. Anal. Prev. 2008, 40, 1732–1741.
  10. Ramstedt, M. Alcohol and fatal accidents in the United States—A time series analysis for 1950–2002. Accid. Anal. Prev. 2008, 40, 1273–1281.
  11. Dadashova, B.; Arenas-Ramírez, B.; Mira-McWilliams, J.; Aparicio-Izquierdo, F. Methodological development for selection of significant predictors explaining fatal road accidents. Accid. Anal. Prev. 2016, 90, 82–94.
  12. García-Ferrer, A.; de Juan, A.; Poncela, P. Forecasting traffic accidents using disaggregated data. Int. J. Forecast. 2006, 22, 203–222.
  13. Zheng, X.; Liu, M. An overview of accident forecasting methodologies. J. Loss Prev. Process Ind. 2009, 22, 484–491.
  14. Sanusi, R.; Adebola, F.B.; Adegoke, N. Cases of Road Traffic accidents: A Time Series Approach. Mediterr. J. Soc. Sci. 2016.
  15. Parvareh, M.; Karimi, A.; Rezaei, S.; Woldemichael, A.; Nili, S.; Nouri, B.; Nasab, N.E. Assessment and prediction of road accident injuries trend using time-series models in Kurdistan. Burn. Trauma 2018, 6, 9.
  16. Dadashova, B.; Ramírez Arenas, B.; McWilliams Mira, J.; Izquierdo Aparicio, F. Explanatory and prediction power of two macro models. An application to van-involved accidents in Spain. Transp. Policy 2014, 32, 203–217.
  17. Michalaki, P.; Quddus, M.A.; Pitfield, D.; Huetson, A. Exploring the factors affecting motorway accident severity in England using the generalised ordered logistic regression model. J. Saf. Res. 2015, 55, 89–97.
  18. Lu, P.; Tolliver, D. Accident prediction model for public highway-rail grade crossings. Accid. Anal. Prev. 2016, 90, 73–81.
  19. Oh, J.; Washington, S.P.; Nam, D. Accident prediction model for railway-highway interfaces. Accid. Anal. Prev. 2006, 38, 346–356.
  20. Ameen, J.R.M.; Naji, J.A. Causal models for road accident fatalities in Yemen. Accid. Anal. Prev. 2001, 33, 547–561.
  21. Department Provincial Administration. Official Statistics Registration Systems. Available online: https://www.dopa.go.th/main/web_index (accessed on 23 August 2017).
  22. Bank of Thailand. Thailand’s Macro Economic Indicators 1; Bank of Thailand: Bangkok, Thailand, 2016.
  23. Department of Land Transport. Transport Statistics. Available online: http://apps.dlt.go.th/statistics_web/vehicle.html (accessed on 9 February 2018).
  24. Ministry of Energy. Department of Alternative Energy Development and Efficiency, Energy Statistics & Information, Energy Consumption for Transport Sector. Available online: http://www.dede.go.th/ewt_news.php?nid=42079 (accessed on 9 February 2018).
  25. Dielman, T.E. Applied Regression Analysis, 4th ed.; Curt Hinrichs: Cincinnati, OH, USA, 2005.
  26. Roberts, S.W. Control Chart Tests Based on Geometric Moving Averages. Technometrics 1959, 1, 239–250.
  27. Ott, R.L.; Longnecker, M.T. An Introduction to Statistical Methods and Data Analysis, 6th ed.; Brooks/Cole: Pacific Grove, CA, USA, 2010; p. 1296.
  28. Hair, J.F.; Black, W.C.; Babin, B.J.; Anderson, R.E. Multivariate Data Analysis: A Global Perspective; Pearson Education: London, UK, 2010; p. 800.
  29. Ratanavaraha, V.; Jomnonkwao, S. Trends in Thailand CO2 emissions in the transportation sector and Policy Mitigation. Transp. Policy 2015, 41, 136–146.
  30. Muthén, L.K.; Muthén, B.O. Mplus User’s Guide, 7th ed.; Muthén & Muthén: Los Angeles, CA, USA, 1998–2012; p. 850.
  31. Kline, R.B. Principles and Practice of Structural Equation Modeling; Guilford Press: New York, NY, USA, 2011.
  32. Steiger, J.H. Understanding the limitations of global fit assessment in structural equation modeling. Personal. Individ. Differ. 2007, 42, 893–898.
  33. Hu, L.T.; Bentler, P.M. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct. Equ. Modeling Multidiscip. J. 1999, 6, 1–55.
  34. Hooper, D.; Coughlan, J.; Mullen, M.R. Structural Equation Modelling: Guidelines for Determining Model Fit. Electron. J. Bus. Res. Methods 2008, 6, 53–61.
  35. Jomnonkwao, S.; Sangphong, O.; Khampirat, B.; Siridhara, S.; Ratanavaraha, V. Public transport promotion policy on campus: Evidence from Suranaree University in Thailand. Public Transp. 2016, 8, 185–203.
Figure 1. Statistics of worldwide death rates from road accidents per 100,000 population in 2013 (James Burton [2]).
Figure 2. The statistical data trends for all components considered in this study: (a) Death rate from road accidents per 100,000 population [6], (b) Gross domestic product (GDP) statistics from the Bank of Thailand [22], (c) Number of registered vehicles statistics (million vehicles) per the Department of Land Transport [23], and (d) Energy consumption for transport sector statistics [24].
Figure 3. Results of path analysis.
Figure 4. Comparison of prediction results using different techniques.
Table 1. Previous research on accident prediction and analytical methodologies.
Columns: Author; Period; Methodology (Time-Series, Regression, Other); Data (Accident Data, Environment Conditions, Economic Factors, Energy Consumption, Alcohol Consumption, Law, Vehicle Registration, Population).

| Author | Period | Other methodology |
|---|---|---|
| Quddus [9] | 1950–2005 (55 years) | |
| Michalaki, et al. [17] | 2005–2011 (6 years) | |
| Lu and Tolliver [18] | 1996–2014 (18 years) | |
| Ramstedt [10] | 1950–2002 (52 years) | |
| Oh, et al. [19] | 1998–2002 (4 years) | |
| Dadashova, et al. [11] | 2000–2011 (11 years) | |
| Dadashova, et al. [16] | 2000–2009 (9 years) | |
| García-Ferrer, et al. [12] | 1975–2003 (28 years) | |
| Zheng and Liu [13] | 1989–2002 (13 years) | |
| Ameen and Naji [20] | 1978–1995 (17 years) | |
| Sanusi, et al. [14] | 1969–2013 (54 years) | |
| Parvareh, et al. [6,15] | 2009–2015 (72 months) | |
| This research | 1997–2016 (20 years) | Curve estimate, Path analysis |
Table 2. Curve estimate model. R2, determination coefficient; SE, standard error.
| Model | R² | Adjusted R² | SE of the Estimate | F | Equation |
|---|---|---|---|---|---|
| Linear | 0.738 | 0.724 | 2.318 | 50.815 | E(y)_t = 23.879 − 0.641T |
| Logarithmic | 0.511 | 0.483 | 3.170 | 18.785 | E(y)_t = 25.361 − 3.878 ln(T) |
| Inverse | 0.249 | 0.208 | 3.926 | 5.981 | E(y)_t = 15.379 + (9.856/T) |
| Quadratic | 0.816 | 0.794 | 2.001 | 37.638 | E(y)_t = 20.772 + 0.207T − 0.040T^2 |
| Cubic | 0.842 | 0.813 | 1.908 | 28.511 | E(y)_t = 18.262 + 1.487T − 0.189T^2 + 0.005T^3 |
| Compound | 0.729 | 0.714 | 0.152 | 48.518 | E(y)_t = 25.494(0.960)^T |
| Power | 0.489 | 0.460 | 0.210 | 17.210 | E(y)_t = 27.813T^(−0.245) |
| S | 0.223 | 0.180 | 0.258 | 5.179 | E(y)_t = exp[2.698 + (0.603/T)] |
| Growth | 0.729 | 0.714 | 0.152 | 48.518 | E(y)_t = exp[3.238 − (0.041T)] |
| Exponential | 0.729 | 0.714 | 0.152 | 48.518 | E(y)_t = 25.494 exp(−0.041T) |
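As a rough check on the equations in Table 2, the sketch below evaluates four of the fitted curves in Python. It assumes the time index is T = year − 1996 (so T = 1 for 1997); this mapping is not stated in the table and is inferred from the fitted values reported in Tables 5 and 7. The names `BASE_YEAR`, `CURVES`, and `predict` are illustrative only.

```python
import math

# Assumption: T = year - 1996, i.e. T = 1 for the first data year (1997).
# This mapping is inferred from the fitted values, not stated in Table 2.
BASE_YEAR = 1996

# Coefficients copied from Table 2.
CURVES = {
    "linear": lambda t: 23.879 - 0.641 * t,
    "quadratic": lambda t: 20.772 + 0.207 * t - 0.040 * t**2,
    "cubic": lambda t: 18.262 + 1.487 * t - 0.189 * t**2 + 0.005 * t**3,
    "exponential": lambda t: 25.494 * math.exp(-0.041 * t),
}


def predict(model: str, year: int) -> float:
    """Return the fitted death rate per 100,000 population for a given year."""
    t = year - BASE_YEAR
    return CURVES[model](t)


if __name__ == "__main__":
    for name in CURVES:
        print(name, round(predict(name, 1997), 2), round(predict(name, 2020), 2))
```

Under this assumption, the linear curve gives about 23.24 for 1997 and about 8.5 for 2020, in line with the corresponding entries in Tables 5 and 7. The cubic curve turns upward after 2020 because its positive T^3 term eventually dominates.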
Table 3. Results of multiple regression analysis.
Model 1 (Y = Death Rate/100,000 Population)

| Variable | B | t-Statistic | p-Value |
|---|---|---|---|
| VEH_MOTORCYCLE | - | - | - |
| VEH_CAR | - | - | - |
| VEH_TRUCK | - | - | - |
| GDP | −2.393 × 10^−6 | −5.551 | <0.001 ** |
| EN_TRANSPORT | 0.001 | 2.839 | <0.05 * |
| Constant | 12.838 | 2.370 | <0.05 * |
| F-test | 51.814 | | |
| Adjusted R² | 0.842 | | |

Model 2 (Y = Death Rate/100,000 Population)

| Variable | B | t-Statistic | p-Value |
|---|---|---|---|
| VEH_MOTORCYCLE/1,000 population | - | - | - |
| VEH_CAR/1,000 population | −0.127 | −6.405 | <0.001 ** |
| VEH_TRUCK/1,000 population | - | - | - |
| GDP/1,000 population | - | - | - |
| EN_TRANSPORT/100,000 population | 0.424 | 2.265 | <0.001 ** |
| Constant | 19.632 | 4.399 | <0.001 ** |
| F-test | 55.915 | | |
| Adjusted R² | 0.853 | | |

Model 3 (Y = Death Rate/100,000 Population)

| Variable | B | t-Statistic | p-Value |
|---|---|---|---|
| VEH_MOTORCYCLE/1,000 VEH_TOTAL | - | - | - |
| VEH_CAR/1,000 VEH_TOTAL | −0.081 | −8.769 | <0.001 ** |
| VEH_TRUCK/1,000 VEH_TOTAL | −0.723 | −2.266 | <0.05 * |
| GDP/1,000 VEH_TOTAL | - | - | - |
| EN_TRANSPORT/1,000 VEH_TOTAL | 0.260 | 4.006 | <0.001 ** |
| Constant | 42.096 | 5.903 | <0.001 ** |
| F-test | 51.144 | | |
| Adjusted R² | 0.888 | | |
Note: VEH_TOTAL, number of vehicles registered; VEH_MOTORCYCLE, number of motorcycles registered; VEH_CAR, number of cars registered; VEH_TRUCK, number of trucks registered; GDP, gross domestic product (billion baht); EN_TRANSPORT, energy consumption (Ktoe).
Table 4. Path analysis model results.
| Relationship | B (β) | SE | t-Statistic | p-Value |
|---|---|---|---|---|
| GDP → FA_RATE | −0.106 (−0.193) | 0.033 (0.352) | −3.245 (−3.390) | <0.05 * (<0.05 *) |
| EN_TRANSPORT → FA_RATE | 0.834 (0.992) | 0.214 (0.274) | 3.896 (3.615) | <0.001 ** (<0.001 **) |
| VEH_MOTORCYCLE → FA_RATE | 0.041 (0.432) | 0.022 (0.233) | 1.881 (1.857) | 0.060 (0.063) |
| VEH_CAR → FA_RATE | −0.120 (−1.485) | 0.060 (0.747) | −2.021 (−1.989) | <0.05 * (<0.05 *) |
| VEH_TRUCK → FA_RATE | 0.690 (0.396) | 1.073 (0.614) | 0.643 (0.645) | 0.520 (0.519) |
| EN_TRANSPORT → GDP | 8.898 (0.938) | 0.734 (0.027) | 12.120 (35.012) | <0.001 ** (<0.001 **) |
| VEH_MOTORCYCLE → EN_TRANSPORT | −0.044 (−0.388) | 0.027 (0.240) | −1.623 (−1.615) | 0.105 (0.106) |
| VEH_CAR → EN_TRANSPORT | 0.137 (1.420) | 0.074 (0.752) | 1.863 (1.888) | 0.062 (0.059) |
| VEH_TRUCK → EN_TRANSPORT | −0.345 (−0.166) | 1.411 (0.680) | −0.244 (−0.244) | 0.807 (0.807) |
| Intercept | | | | |
| FA_RATE | −0.799 (−0.194) | 10.513 (2.554) | −0.076 (−0.076) | 0.939 (0.939) |
| GDP | −187.266 (−4.035) | 27.035 (0.639) | −6.927 (−6.317) | <0.001 ** (0.001 **) |
| EN_TRANSPORT | −0.345 (6.629) | 10.361 (0.680) | 3.131 (2.783) | <0.05 * (<0.05 *) |
| Residual variances | | | | |
| FA_RATE | −2.262 (0.134) | 0.715 (0.055) | 3.162 (2.430) | <0.05 * (<0.05 *) |
| GDP | 258.150 (0.120) | 81.634 (0.050) | 3.162 (2.383) | <0.05 * (<0.05 *) |
| EN_TRANSPORT | 4.040 (0.169) | 10.361 (0.069) | 3.162 (2.452) | <0.05 * (<0.05 *) |
| R² | | | | |
| FA_RATE | 0.866 | 0.055 | 15.739 | <0.001 ** |
| GDP | 0.880 | 0.050 | 17.506 | <0.001 ** |
| EN_TRANSPORT | 0.831 | 0.069 | 12.087 | <0.001 ** |

*, ** denote significance at the 0.05 and 0.001 levels, respectively; FA_RATE, death rate/100,000 population.
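In path analysis, the indirect effect along a chain of paths is conventionally taken as the product of the coefficients on that chain, and the total effect is the sum over all chains. The sketch below applies that convention to a subset of the unstandardized coefficients (B) in Table 4. The derived indirect and total effects are illustrative arithmetic only: they are not values reported in this study, and several of the underlying paths are not statistically significant. The names `PATHS` and `total_effect` are hypothetical.

```python
# Unstandardized path coefficients (B) copied from Table 4.
# Keys are (source, target) pairs; only some paths are included here.
PATHS = {
    ("GDP", "FA_RATE"): -0.106,
    ("EN_TRANSPORT", "FA_RATE"): 0.834,
    ("EN_TRANSPORT", "GDP"): 8.898,
    ("VEH_CAR", "FA_RATE"): -0.120,
    ("VEH_CAR", "EN_TRANSPORT"): 0.137,
}


def total_effect(src: str, dst: str, paths: dict) -> float:
    """Sum, over every directed chain from src to dst, the product of its coefficients.

    Assumes the path diagram is acyclic, as it is for the paths listed above.
    """
    if src == dst:
        return 1.0
    return sum(
        coef * total_effect(mid, dst, paths)
        for (start, mid), coef in paths.items()
        if start == src
    )


if __name__ == "__main__":
    for src in ("EN_TRANSPORT", "VEH_CAR"):
        direct = PATHS.get((src, "FA_RATE"), 0.0)
        total = total_effect(src, "FA_RATE", PATHS)
        print(src, "direct:", direct, "indirect:", total - direct, "total:", total)
```

For example, the chain EN_TRANSPORT → GDP → FA_RATE contributes 8.898 × (−0.106) to the effect of transport energy consumption on the death rate, in addition to the direct path.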
Table 5. Comparison of the mean absolute percentage error (MAPE) of different methods.
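| No. | Model | 1997 | 2001 | 2006 | 2011 | 2016 | MAE * | MAPE * (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | Time-series (exponential smoothing) | 22.53 | 18.68 | 20.05 | 11.78 | 9.09 | 1.62 | 8.1 |
| 2 | Curve estimate (cubic) | 19.57 | 21.60 | 19.23 | 14.92 | 12.40 | 2.23 | 11.2 |
| 3 | Curve estimate (quadratic) | 20.94 | 20.81 | 18.84 | 14.88 | 8.91 | 2.04 | 10.2 |
| 4 | Curve estimate (linear) | 23.24 | 20.67 | 17.47 | 14.26 | 11.06 | 2.52 | 12.6 |
| 5 | Multiple regression linear model 1 | 21.82 | 18.68 | 15.72 | 11.26 | 11.87 | 2.56 | 12.8 |
| 6 | Multiple regression linear model 2 | 23.20 | 19.08 | 18.44 | 14.19 | 7.67 | 1.91 | 9.5 |
| 7 | Multiple regression linear model 3 | 23.65 | 19.26 | 19.13 | 14.97 | 10.08 | 1.28 | 6.4 |
| 8 | Path analysis | 24.40 | 21.13 | 19.17 | 15.16 | 11.13 | 1.68 | 8.4 |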
MAE, mean absolute error; MAPE, mean absolute percentage errors; *, average MAPE 1997–2016.
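For reference, MAE and MAPE are conventionally computed as in the sketch below, which also ranks the eight models by the MAPE values in Table 5. The `mae` and `mape` functions follow the standard definitions, which are assumed (not confirmed by the text) to match those used for Table 5; the function names and the `TABLE5_MAPE` dictionary are illustrative.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# MAPE values (%) copied from Table 5.
TABLE5_MAPE = {
    "Time-series (exponential smoothing)": 8.1,
    "Curve estimate (cubic)": 11.2,
    "Curve estimate (quadratic)": 10.2,
    "Curve estimate (linear)": 12.6,
    "Multiple regression linear model 1": 12.8,
    "Multiple regression linear model 2": 9.5,
    "Multiple regression linear model 3": 6.4,
    "Path analysis": 8.4,
}

# Lower MAPE means a closer fit to the observed series.
RANKED = sorted(TABLE5_MAPE, key=TABLE5_MAPE.get)

if __name__ == "__main__":
    for rank, name in enumerate(RANKED, start=1):
        print(rank, name, TABLE5_MAPE[name])
```

As a hypothetical illustration of the arithmetic (not data from this study), `mape([10.0, 20.0], [9.0, 22.0])` averages two 10% errors. The ranking places multiple regression linear model 3 first, followed by exponential smoothing and path analysis.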
Table 6. Predictive results, including the variables and prediction to be used in the models.
| Parameter | 2020 | 2022 | 2024 | 2026 | 2028 | 2030 |
|---|---|---|---|---|---|---|
| Population (10^6) | 66.84 | 67.30 | 67.77 | 68.24 | 68.71 | 69.17 |
| GDP (10^12 baht/1000 population); 1997 constant price | 253.17 | 269.79 | 286.41 | 303.04 | 319.66 | 336.28 |
| Motorcycle registration/1000 population | 333.62 | 344.74 | 355.85 | 366.97 | 378.08 | 389.20 |
| Car registration/1000 population | 267.55 | 287.37 | 307.19 | 327.01 | 346.84 | 366.66 |
| Truck registration/1000 population | 17.72 | 18.57 | 19.42 | 20.27 | 21.12 | 21.97 |
| Energy consumption of the transportation sector (Ktoe/100,000 population) | 48.91 | 50.47 | 52.03 | 53.59 | 55.15 | 56.72 |
| Motorcycle/1000 Total Vehicles registered | 516.00 | 502.12 | 488.25 | 474.37 | 460.49 | 446.61 |
| Car/1000 Total Vehicles registered | 428.52 | 441.85 | 455.19 | 468.52 | 481.85 | 495.19 |
| Truck/1000 Total Vehicles registered | 26.45 | 25.97 | 25.50 | 25.02 | 24.54 | 24.07 |
| GDP/1000 Total Vehicles registered; 1997 constant price | 425.16 | 443.38 | 461.59 | 479.81 | 498.03 | 516.24 |
| Energy consumption of the transportation sector (Ktoe)/1000 Total Vehicles registered | 76.64 | 74.53 | 72.42 | 70.32 | 68.21 | 66.10 |
Table 7. Death rate from road accidents per 100,000 population as predicted by the models used.
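| No. | Model | 2020 | 2022 | 2024 | 2026 | 2028 | 2030 |
|---|---|---|---|---|---|---|---|
| 1 | Time-series (exponential smoothing) | 9.82 | 8.54 | 7.26 | 5.98 | 4.70 | 3.42 |
| 2 | Curve estimate (cubic) | 14.20 | 17.00 | 21.50 | 27.80 | 36.20 | 46.90 |
| 3 | Curve estimate (quadratic) | 2.70 | - | - | - | - | - |
| 4 | Curve estimate (linear) | 8.50 | 7.21 | 5.93 | 4.65 | 3.37 | 2.09 |
| 5 | Multiple regression linear model 1 | 12.20 | 13.30 | 14.40 | 15.50 | 16.60 | 17.7 |
| 6 | Multiple regression linear model 2 | 6.39 | 4.54 | 2.68 | 0.82 | - | - |
| 7 | Multiple regression linear model 3 | 8.19 | 6.91 | 5.62 | 4.34 | 3.06 | 1.77 |
| 8 | Path analysis | 7.75 | 5.96 | 4.16 | 2.36 | 0.56 | - |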
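The regression forecasts in Table 7 can be reproduced from the coefficients in Table 3 and the projected inputs in Table 6. The following Python sketch does this for multiple linear regression models 2 and 3. The function and variable names are illustrative, and reading the "-" entries of Table 7 as negative (infeasible) predictions is an assumption suggested by the arithmetic rather than something stated in the table.

```python
YEARS = [2020, 2022, 2024, 2026, 2028, 2030]

# Projected inputs copied from Table 6.
CAR_PER_1000_POP = [267.55, 287.37, 307.19, 327.01, 346.84, 366.66]
ENERGY_PER_100K_POP = [48.91, 50.47, 52.03, 53.59, 55.15, 56.72]
CAR_PER_1000_VEH = [428.52, 441.85, 455.19, 468.52, 481.85, 495.19]
TRUCK_PER_1000_VEH = [26.45, 25.97, 25.50, 25.02, 24.54, 24.07]
ENERGY_PER_1000_VEH = [76.64, 74.53, 72.42, 70.32, 68.21, 66.10]


def model2(car_per_1000_pop, energy_per_100k_pop):
    """Model 2 from Table 3: predictors scaled by population."""
    return 19.632 - 0.127 * car_per_1000_pop + 0.424 * energy_per_100k_pop


def model3(car_per_1000_veh, truck_per_1000_veh, energy_per_1000_veh):
    """Model 3 from Table 3: predictors scaled by total registered vehicles."""
    return (42.096
            - 0.081 * car_per_1000_veh
            - 0.723 * truck_per_1000_veh
            + 0.260 * energy_per_1000_veh)


MODEL2_FORECAST = [model2(c, e) for c, e in zip(CAR_PER_1000_POP, ENERGY_PER_100K_POP)]
MODEL3_FORECAST = [
    model3(c, t, e)
    for c, t, e in zip(CAR_PER_1000_VEH, TRUCK_PER_1000_VEH, ENERGY_PER_1000_VEH)
]

if __name__ == "__main__":
    for year, m2, m3 in zip(YEARS, MODEL2_FORECAST, MODEL3_FORECAST):
        # A negative death rate is not meaningful; Table 7 shows "-" in those cells.
        print(year, round(m2, 2), round(m3, 2))
```

Model 3 reproduces the Table 7 row for every forecast year, while model 2 turns negative from 2028 onward, which coincides with the "-" entries in that row.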