Seasonal Patterns and Future Projections of ADAS and ADS Crashes: A Time-Series Forecasting Study

Banik, Joydeep; Miah, Md Emon; Hossain, Arman; Siraj, Md Sifat Bin; Huq, Armana Sabiha; Campisi, Tiziana

doi:10.3390/futuretransp6030105


by Joydeep Banik ¹, Md Emon Miah ², Arman Hossain ¹, Md Sifat Bin Siraj ³, Armana Sabiha Huq ⁴ and Tiziana Campisi ⁵,*

¹ Department of Civil Engineering, Bangladesh University of Engineering and Technology, Dhaka 1000, Bangladesh

² Department of Civil Engineering, Rajshahi University of Engineering and Technology, Rajshahi 6204, Bangladesh

³ School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 610032, China

⁴ Accident Research Institute (ARI), Bangladesh University of Engineering and Technology, Dhaka 1000, Bangladesh

⁵ Department of Engineering and Architecture, University of Enna Kore, 94100 Enna, Italy

* Author to whom correspondence should be addressed.

Future Transp. 2026, 6(3), 105; https://doi.org/10.3390/futuretransp6030105

Submission received: 20 March 2026 / Revised: 8 May 2026 / Accepted: 11 May 2026 / Published: 18 May 2026

(This article belongs to the Special Issue Unfolding Road-Related Aspects of Modern Infrastructure in the Future Era of Road Transport)


Abstract

Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS) are becoming increasingly common in road transport; however, their safety remains a critical concern as crashes continue to occur. To reveal crash trends and temporal variations, this study develops time-series forecasting models to predict future crash counts for vehicles equipped with these systems, using the crash dataset released by the National Highway Traffic Safety Administration (NHTSA). Two univariate forecasting models, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Facebook Prophet model, were fitted separately to the ADAS and ADS crash series. The models were trained on 30 months of data (July 2021 to December 2023) and validated on 6 months of data (January–June 2024). Validation metrics include Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Theil’s U1 statistic. Results showed that Facebook Prophet significantly outperformed SARIMA for both datasets, achieving an RMSE of 2.71 and an MAPE of 6.9% for ADAS, and an RMSE of 2.24 and an MAPE of 8.85% for ADS. For both systems, the model revealed empirically observed cyclical patterns and consistent rising trends. ADAS crashes exhibit a bimodal temporal pattern, with recurring peaks in January and May–June, alongside notable troughs in February–March and August–September. ADS crashes display a trimodal pattern, with recurring peaks in April–May, August and October, alongside notable troughs in December and the early winter months. These patterns represent empirically identified temporal regularities rather than causally attributed seasonality. For the forecast period of July to December 2024, the model projects that ADAS crashes will range between 40 and 80 per month, while ADS crashes will remain between 20 and 40 per month. These findings underscore the need for proactive safety measures and enhanced regulatory oversight during identified high-risk periods to mitigate the growing trend in AV crashes.

Keywords:

SARIMA; Facebook Prophet; time-series forecasting; crash analysis; autonomous vehicles

1. Introduction

Road traffic accidents remain a significant global public health challenge. According to the World Health Organization (WHO), over 1.35 million people lose their lives each year due to road traffic injuries [1]. In the United States, the National Highway Traffic Safety Administration (NHTSA) reports about 32,000 fatalities and more than 2 million injuries annually, resulting in massive economic and social expenses [2]. Human error is widely regarded as a major contributing factor in most fatal crashes, including behaviors such as speeding, distracted driving, and driving while exhausted.

In recent years, Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS) have been developed to minimize human-related errors and improve road safety. These technologies depend on sensor fusion, machine learning, and automated control systems to assist or partially replace human driving responsibilities. While ADAS (SAE Level 2) leaves the driver some control, ADS (SAE Level 3–5) offers greater levels of automation, fundamentally altering the relationship between cars, drivers, and the surrounding traffic environment. Despite their potential safety benefits, the real-world performance of these systems is still under active research [3].

An important breakthrough in this area took place in July 2021, when NHTSA required manufacturers and operators to report crashes involving vehicles equipped with ADAS and ADS [4]. The present analysis uses the resulting standardized and consistent dataset, which includes crashes recorded from July 2021 to June 2024 [5]. Compared with conventional crash databases, which sometimes suffer from underreporting or inconsistent data collection, this dataset offers more comprehensive and reliable information reported directly by manufacturers. As a result, it provides a suitable basis for analyzing the safety performance of automated vehicle technologies using real-world data.

Existing studies suggest that collisions involving autonomous cars do not follow the same trends as those involving conventional vehicles. For example, rear-end collisions tend to occur more frequently in ADAS- and ADS-related cases, which may reflect differences in system responsiveness and interaction with human-driven cars [6,7]. In contrast, other crash types, such as pedestrian or broadside crashes, are recorded less often [8]. In addition, some studies report that autonomous vehicle systems are often not primarily responsible for collisions, underlining the significance of surrounding traffic and human driver conduct [9]. These findings show that collision processes in autonomous driving scenarios are more complicated than for conventional vehicles and may require alternative analytical methodologies.

Although ADAS and ADS technologies are expected to improve safety, collisions continue to occur under a range of traffic and operational conditions. The higher number of reported instances demonstrates that understanding their temporal behavior is becoming increasingly essential. However, most existing research focuses on accident features and severity rather than how these events change over time. Little consideration has been given to determining seasonal trends and short-term fluctuations in crash frequency using real-world statistics.

Time-series analysis provides a suitable framework for evaluating such temporal dynamics, since it enables the discovery of trends, seasonal fluctuations, and repeating patterns in crash data. While these methodologies have often been used on traditional traffic accident datasets, their application to ADAS- and ADS-related incidents remains limited. This study has the following objectives:

To develop and validate univariate time-series models that can forecast the monthly crash frequencies of ADS- and ADAS-equipped vehicles over a short time period.
To characterize the temporal patterns in ADS and ADAS crashes, including overall trends and recurring monthly peaks and troughs.

Together, these objectives are intended to provide guidance for policy and resource allocation decisions. This study presents a comprehensive analysis of time-series forecasting methods on the NHTSA autonomous vehicle crash dataset and offers methodologically rigorous approaches for predicting emerging transportation safety phenomena. The remainder of the paper presents the literature review, methodology, results and analysis, and conclusions.

2. Literature Review

Time-series analysis has become increasingly important in transportation safety research, particularly for understanding how crash frequencies change over time and for developing predictive models that inform policy interventions. Traditional approaches to traffic safety analysis have relied heavily on cross-sectional models that examine correlations between safety outcomes and predictor variables within a single period. However, time-series methods are uniquely suited to transportation data because they account for temporal dependencies, seasonal patterns, and autoregressive relationships inherent in crash data [10].

In contemporary research, transportation safety analysis has increasingly adopted data-driven and AI-enhanced time-series frameworks for ADAS/ADS environments. Recent studies emphasize multi-source data fusion, integrating spatiotemporal traffic data, vehicle kinematics, onboard sensors, and environmental factors like weather and seasonal effects to improve crash-risk prediction performance [11]. Chen et al. [12] demonstrate that combining roadway geometry, traffic conditions, weather, and automation-related variables enhances model robustness in ADAS/ADS safety prediction. ADAS/ADS crash patterns are changing due to electrification and vehicle connectivity. Connected and automated vehicles introduce new vehicle–infrastructure–driver interactions that alter traditional crash mechanisms [13]. Kashkanov et al. [14] stated that electrification also affects vehicle dynamics, including acceleration and braking behavior, influencing crash patterns such as rear-end collisions. Therefore, these effects need to be explored in time-series-based crash prediction, especially using large-scale real-world datasets such as NHTSA ADAS/ADS crash reports [2]. These findings indicate that crash risk in automated driving systems is dynamic and multi-factorial, requiring integrated time-series and machine learning approaches.

The use of time-series models in traffic safety research has expanded substantially over the past two decades, with applications ranging from evaluating policy interventions to forecasting crash trends. SARIMA has been extensively applied in traffic safety research for forecasting accident frequencies. Deretic et al. [15] applied SARIMA to 48 months of traffic accident data from Belgrade. The results illustrate that time-series methods can produce operationally useful forecasts with manageable error rates and revealed pronounced seasonal patterns in accident frequency, with significantly higher accident numbers during autumn and winter months. Lavrenza et al. [10] revealed that while time-series methods offer substantial advantages for analyzing crash data with temporal structure, challenges remain in addressing issues such as data quality, seasonal decomposition, and model validation. The ARIMA family of models continues to demonstrate strong utility in transportation safety research across diverse geographic and operational contexts. Al Sulaie [16] applied ARIMA to over two decades of crash data from Saudi Arabia, successfully forecasting injury consequences per 1000 crashes up to 2032, with model predictions closely aligning with observed trends, confirming ARIMA’s suitability for long-horizon traffic safety forecasting. In a similar vein, Choo et al. [17] developed an ARIMA-based accident prediction model using Malaysian occupational crash data, where the ARIMA (2, 0, 2) (2, 0, 0) (12) configuration yielded the lowest AIC value, reinforcing the idea that seasonal ARIMA variants often outperform simpler forms when monthly periodicity is present. At the methodological level, Ni et al. [18] compared ARIMA and SARIMA against ensemble machine learning models for near-future crash prediction on Chinese freeways, finding that integrating time-series structure with count data models better captures serial correlation in daily crash frequencies. 
More broadly, Tselentis et al. [19] reviewed ARIMA against contemporary machine learning approaches across data-driven network forecasting tasks, noting that while ARIMA remains robust for short-term, linearly structured series, hybrid or seasonal extensions are typically necessary when crash data exhibits non-stationary or periodic behavior. Together, these findings reinforce the relevance of ARIMA-based frameworks as a methodological baseline in time-series crash modeling, while also highlighting their limitations in capturing complex nonlinear dynamics, a gap that motivates the comparative use of Facebook Prophet in the present study. Boye et al. [20] compared Prophet with SARIMA, ARIMA, and grey methods for predicting motor vehicle registrations in Ghana, finding that Prophet achieved the lowest NRMSE (0.2176) and highest R² (0.8229) among all methods tested.

Understanding the safety characteristics and crash patterns of autonomous vehicles is essential for developing appropriate forecasting models. Previous studies of AVs mostly focused on the influence of different features on crash outcomes. Studies based on California DMV and related crash datasets show that autonomous vehicle crashes exhibit distinct patterns compared to human-driven vehicles. Almaskati et al. [21] identified key factors influencing crash severity, including driving mode, collision type, road environment, and weather conditions, and reported that autonomous vehicles generally experience lower crash severity, with rear-end collisions being the most frequent and common among AVs. Cicchino [6] demonstrated that ADAS technologies such as forward collision warning and autonomous emergency braking reduce front-to-rear crash rates by approximately 50%, although effectiveness varies across driving contexts. Combs et al. [22] highlighted both the capabilities and limitations of pedestrian detection systems in automated vehicles, noting potential safety vulnerabilities in complex environments. In addition, responsibility analyses suggest that autonomous vehicles are rarely at fault in reported crashes, with human drivers often blamed even when crashes are difficult to avoid, emphasizing the influence of human–automation interaction in mixed traffic conditions [9,23].

Long-term forecasting studies of road accident casualties consistently report strong seasonal trends in crash data, reinforcing the importance of time-aware modeling approaches [24]. Karacasu et al. [25] examined temporal variations in traffic accidents in Eskişehir, Turkey, documenting significant variations across seasonal, monthly, daily, and hourly time scales. Their analysis revealed that the autumn and winter months experienced higher accident frequencies, while the summer months showed somewhat lower frequencies despite higher traffic volumes. X. Wang et al. [26] stated that seasonal and environmental variations significantly influence road crash risk, as weather, visibility, and traffic exposure change across time periods. Abdulrazaq & Fan [27] further found that crash frequencies exhibit clear seasonal patterns driven by environmental and mobility fluctuations. Wang et al. [28] further highlighted that traffic safety outcomes vary significantly under different weather and seasonal conditions, reinforcing the importance of temporal modeling in crash analysis. In ADAS/ADS contexts, Almaskati et al. [21] reported that crash outcomes are strongly affected by weather and road conditions, yet seasonal modeling remains largely underexplored.

Existing studies based on the NHTSA ADAS/ADS crash dataset have primarily focused on descriptive analyses of automated vehicle crash reports, identifying key operational, environmental, and causation factors in real-world ADS incidents [4,29]. However, these studies provide limited integration with predictive time-series frameworks. In parallel, recent studies on traffic accident forecasting show that combining SARIMA and Facebook Prophet improves prediction performance by capturing both linear temporal structure and nonlinear trend changes, especially in the presence of changepoints [6,8]. Moreover, research on autonomous vehicle safety indicates that automation may reshape crash frequency and severity patterns, making time-series approaches essential for ADAS/ADS analysis [21]. Environmental and weather-related factors have also been widely shown to influence road safety outcomes [30], while empirical studies across different regions consistently confirm strong seasonal and temporal variations in crash occurrences [31,32].

Despite the growing body of research on time-series forecasting of conventional traffic crashes, limited studies have applied such approaches to ADAS/ADS-related crashes using real-world reported datasets. Moreover, comparative evaluation of classical and modern time-series models in this context remains scarce. This gap highlights the need for robust temporal modeling of automated vehicle crash data.

3. Methodology

3.1. Data Description

Beginning in July 2021, the NHTSA required manufacturers and operators to report crashes involving Society of Automotive Engineers (SAE) Level 2 ADAS and SAE Level 3–5 ADS [3,4]. Since then, up to July 2024, a total of 2810 ADAS and 1305 ADS crashes have been reported in the dataset released by the NHTSA. The NHTSA released the data in .csv format, with 122 variables (including subject vehicle, model, mileage, crash details, and details of the crash partner). The original dataset contained multiple versions of the same incidents. Therefore, a rigorous screening process was followed to make the data suitable for analysis. Python (version 3.13) scripts were used to preprocess the data based on three columns: ‘Report ID’, ‘Report Version’, and ‘Same Incident ID’. Only the variables needed for the analysis were retained in the final dataset. After preprocessing, the final crash counts for ADAS and ADS were 1474 and 675, respectively.
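The screening step can be illustrated with a minimal sketch. The exact rule used by the authors is not stated, so the logic here is an assumption: the highest ‘Report Version’ of each ‘Report ID’ is kept, and then one report is retained per ‘Same Incident ID’. The function name and the sample rows are hypothetical.

```python
def deduplicate_reports(rows):
    """Keep the latest version of each report, then one report per incident.

    Assumed rule (not confirmed by the source): highest 'Report Version'
    wins within a 'Report ID'; first report wins within a 'Same Incident ID'.
    """
    # Step 1: for every Report ID keep only the row with the highest version.
    latest = {}
    for row in rows:
        rid = row["Report ID"]
        if rid not in latest or int(row["Report Version"]) > int(latest[rid]["Report Version"]):
            latest[rid] = row
    # Step 2: collapse reports that describe the same incident.
    incidents = {}
    for row in latest.values():
        incidents.setdefault(row["Same Incident ID"], row)
    return list(incidents.values())


# Invented sample rows, for illustration only.
sample = [
    {"Report ID": "A1", "Report Version": "1", "Same Incident ID": "X"},
    {"Report ID": "A1", "Report Version": "2", "Same Incident ID": "X"},
    {"Report ID": "B7", "Report Version": "1", "Same Incident ID": "X"},
    {"Report ID": "C3", "Report Version": "1", "Same Incident ID": "Y"},
]
cleaned = deduplicate_reports(sample)
```

In this toy input, four rows describing two incidents reduce to two records, which mirrors how the reported counts shrink after preprocessing.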

3.2. Data Analysis

In the pre-processed dataset, the final crash counts were aggregated by month. A total of 30 months of data, from July 2021 to December 2023, were used to develop the two time-series models: univariate SARIMA and Facebook Prophet with potential changepoints. Although the NHTSA Standing General Order dataset contains 122 variables describing vehicles, crash details, and environmental conditions, the present study focuses on univariate time series of monthly crash counts for ADAS and ADS. Accordingly, crash records were aggregated by month and automation level (ADAS vs. ADS), and disaggregate factors such as collision type, injury severity, roadway, or weather were excluded as predictors in the forecasting models, since the objective was to model and forecast total monthly crash volumes and their temporal patterns rather than explain crash severity or type. Data from January 2024 to June 2024 were used to validate the models’ predictions. Validation was based on the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Theil’s U1 statistic (τ), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). Then, based on accuracy and capacity to handle complexity, the most suitable model was used to forecast crash counts for the next six months (July 2024 to December 2024). Models were developed separately for the two datasets: one for ADAS crashes and another for ADS crashes. All analyses were performed in Python. The overall methodology followed in this study is shown in Figure 1.
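A minimal sketch of the monthly aggregation and the 30/6 month split described above is given here. It uses only the Python standard library; the function name and the crash dates are hypothetical, and the authors' own scripts may differ.

```python
from collections import Counter
from datetime import date


def monthly_counts(crash_dates, start, end):
    """Count crashes per calendar month from start to end (inclusive).

    Months without any crash are kept with a count of 0 so that the
    series stays regularly spaced.
    """
    counts = Counter((d.year, d.month) for d in crash_dates)
    series = []
    year, month = start
    while (year, month) <= end:
        series.append(((year, month), counts.get((year, month), 0)))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return series


# Invented crash dates, for illustration only.
dates = [date(2021, 7, 3), date(2021, 7, 19), date(2023, 12, 30), date(2024, 1, 5)]
series = monthly_counts(dates, start=(2021, 7), end=(2024, 6))

# Split as described in the text: 30 training months, 6 validation months.
train = [(m, c) for m, c in series if m <= (2023, 12)]
valid = [(m, c) for m, c in series if m > (2023, 12)]
```

Splitting by date rather than at random preserves the temporal order, which is what allows the validation window to mimic a genuine forecast.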

3.2.1. Auto Regressive Moving Average Model (ARMA)

The ARMA model consists of two terms: the autoregressive AR (p) and the moving-average MA (q). This model is best suited to univariate time-series analysis. The mathematical expression is shown below:

χ_{t} = θ_{0} + θ_{1} χ_{t - 1} + θ_{2} χ_{t - 2} + \dots + θ_{p} χ_{t - p} + ε_{t} = θ_{0} + \sum_{i = 1}^{p} θ_{i} χ_{t - i} + ε_{t}

(1)

This is the AR (p) term, where χ_{t} is the current value of the time series, θ_{0} is the constant term, p is the order of the autoregression, θ_{i} (i = 1, 2, …, p) are the coefficients of the autoregressive terms, and ε_{t} is the noise at time t. The MA (q) term is:

χ_{t} = μ + ε_{t} - α_{1} ε_{t - 1} - α_{2} ε_{t - 2} - \dots - α_{q} ε_{t - q} = μ + ε_{t} - \sum_{i = 1}^{q} α_{i} ε_{t - i}

(2)

Here, μ is the mean of the series, ε_{t} is the random error (white noise) at time t, and α_{i} (i = 1, 2, …, q) are the coefficients of the moving average terms. Combining these equations gives the ARMA (p, q) model:

χ_{t} = μ + \sum_{i = 1}^{p} θ_{i} χ_{t - i} + ε_{t} - \sum_{i = 1}^{q} α_{i} ε_{t - i}

(3)

Here, χ_{t} is the value of the time series at time t, now expressed through both the AR and MA terms, with orders p and q, respectively.
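To make the recursion in Equation (3) concrete, the following sketch simulates an ARMA (p, q) series with the same sign convention (AR terms added, MA terms subtracted). It is an illustration only: the function name, the coefficient values, and the choice to treat pre-sample values and shocks as zero are assumptions.

```python
import random


def simulate_arma(mu, theta, alpha, n, seed=0, sigma=1.0):
    """Simulate x_t = mu + sum_i theta_i*x_{t-i} + e_t - sum_j alpha_j*e_{t-j}.

    Values and shocks before the start of the series are treated as zero
    (a simplifying assumption for this sketch).
    """
    rng = random.Random(seed)
    x, e = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, sigma)  # white-noise shock at time t
        ar = sum(th * x[t - i] for i, th in enumerate(theta, start=1) if t - i >= 0)
        ma = sum(al * e[t - j] for j, al in enumerate(alpha, start=1) if t - j >= 0)
        x.append(mu + ar + e_t - ma)
        e.append(e_t)
    return x


# Hypothetical ARMA(1, 1) with 30 observations, matching the training length.
path = simulate_arma(mu=2.0, theta=[0.5], alpha=[0.3], n=30, seed=42)
```

Setting sigma to zero removes the shocks and leaves only the deterministic AR recursion, which is a convenient way to check the implementation by hand.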

3.2.2. Auto Regressive Integrated Moving Average Model (ARIMA)

ARIMA (p, d, q) extends the ARMA (p, q) model to nonstationary series by differencing the series d times, written compactly with the lag operator L (where L χ_{t} = χ_{t - 1}). The equation is given below:

θ (L) {(1 - L)}^{d} χ_{t} = μ + α (L) ε_{t}, that is, (1 - \sum_{i = 1}^{p} θ_{i} L^{i}) {(1 - L)}^{d} χ_{t} = μ + (1 - \sum_{i = 1}^{q} α_{i} L^{i}) ε_{t}

(4)

Here, p, d, and q represent the autoregressive order, the degree of differencing, and the moving-average order, respectively. θ(L) is the AR polynomial in the lag operator, α(L) is the MA polynomial in the lag operator, and {(1 - L)}^{d} is the differencing operator applied d times.
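The effect of the differencing operator can be seen in a short sketch. The helper below is hypothetical and simply applies (1 - L) to a list d times; a quadratic trend becomes constant after two differences.

```python
def difference(series, d=1):
    """Apply the operator (1 - L) to the series d times."""
    out = list(series)
    for _ in range(d):
        # Each pass replaces x_t by x_t - x_{t-1} and shortens the series by one.
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


quadratic = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
first = difference(quadratic, d=1)      # still trending
second = difference(quadratic, d=2)     # constant, hence stationary in mean
```

Each round of differencing costs one observation, which matters for short series such as the 30-month training window used here.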

3.2.3. Seasonal ARIMA Model (SARIMA)

The SARIMA (Seasonal ARIMA) model is a further extension of the ARIMA model within the Box–Jenkins framework [33]. Most real-world data show a seasonal component that repeats after S observations. For monthly observations, S = 12, meaning that the value χ_{t} depends on χ_{t - 12}, χ_{t - 24}, and so on. While ARIMA effectively models non-seasonal time series, SARIMA incorporates seasonal patterns by adding seasonal autoregressive, differencing, and moving-average components. The equation of SARIMA (p, d, q) × (P, D, Q)_S is given below:

ϕ_{p} (B) Φ_{P} (B^{S}) \nabla^{d} \nabla_{S}^{D} χ_{t} = θ_{q} (B) Θ_{Q} (B^{S}) Z_{t}

(5)

Here, B is the backshift (lag) operator and B^{S} is the seasonal backshift operator, which shifts the series by S periods, where S denotes the seasonal lag. ϕ_{p} (B) is a non-seasonal autoregressive (AR) polynomial of order p; Φ_{P} (B^{S}) is a seasonal autoregressive (SAR) polynomial of order P; θ_{q} (B) is a non-seasonal moving average (MA) polynomial of order q; Θ_{Q} (B^{S}) is a seasonal moving average (SMA) polynomial of order Q; \nabla^{d} denotes non-seasonal differencing applied d times; \nabla_{S}^{D} denotes seasonal differencing applied D times with period S; and Z_{t} is the white-noise error term.
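The combined differencing on the left-hand side of Equation (5) can be sketched as follows. The orders d = 1 and D = 1 and the synthetic series are hypothetical choices for illustration, not the configuration selected in this study. The sketch also shows a practical consequence for short series: with 30 monthly observations, seasonal and regular differencing leave only 30 - 12 - 1 = 17 values.

```python
def seasonal_and_regular_diff(y, d, D, S):
    """Apply seasonal differencing (lag S) D times, then regular differencing d times."""
    out = list(y)
    for _ in range(D):
        out = [out[t] - out[t - S] for t in range(S, len(out))]
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Synthetic monthly series: linear trend plus a fixed 12-month pattern (invented).
pattern = [4, -2, 0, 3, -1, 5, -4, 2, 1, -3, 0, -5]
y = [2 * t + pattern[t % 12] for t in range(30)]

seasonal_only = seasonal_and_regular_diff(y, d=0, D=1, S=12)  # removes the pattern
stationary = seasonal_and_regular_diff(y, d=1, D=1, S=12)     # also removes the trend
```

Because so few observations remain after differencing, high seasonal orders are hard to estimate reliably on a 30-month series.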

3.2.4. Facebook Prophet Forecasting Model

Facebook Prophet represents a relatively recent innovation in time-series forecasting, developed and open-sourced by Facebook (now Meta) to address challenges associated with forecasting at an organizational scale [1]. Facebook Prophet is a decomposable time-series forecasting model designed to handle data with strong seasonal effects and historical trend shifts. It decomposes time series into trend, seasonality, and holiday components and is particularly suited for datasets with irregularly spaced observations or missing values. The mathematical expression is given below:

ŷ (t) = g (t) + s (t) + η (t) + ε (t)

(6)

Here, ŷ (t) denotes the forecasted value at time t, g (t) is the trend component, which captures non-periodic changes over time, s (t) is the seasonal component, which captures periodic patterns, η (t) captures irregular events such as holidays, and ε (t) accounts for noise, outliers, and unexplained variation. Prophet uses a Fourier series to incorporate seasonal effects in the model. The expression is shown below:

s (t) = \sum_{k = 1}^{N} ( α_{k} \cos (\frac{2 π k t}{p}) + β_{k} \sin (\frac{2 π k t}{p}) )

(7)

Here, α_{k} and β_{k} are the Fourier coefficients, N is the number of Fourier terms, and p denotes the period of the seasonal pattern. Prophet can model seasonal (and holiday) effects in either an additive form (effects are added to the trend) or a multiplicative form, where seasonal effects scale with the level of the series (implemented via seasonality_mode = ‘multiplicative’). In this study, the Facebook Prophet model was used to forecast monthly counts of ADAS and ADS crashes reported in the United States. Prophet was configured with custom changepoint scales selected through a grid search, allowing the model to identify structural shifts in the time series. U.S. holidays were incorporated to assess their potential influence on crash patterns. Prophet requires two variables in the input series: ds (the date-time stamp) and y (the target). It is highly efficient at managing outliers, missing data, and trend shifts [34,35].
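The Fourier representation of seasonality can be evaluated directly. The sketch below is a plain-Python evaluation of the sum of cosine and sine terms, not Prophet's internal implementation; the function name and the coefficient values are hypothetical.

```python
import math


def fourier_seasonality(t, alpha, beta, period):
    """Evaluate s(t) = sum_k [alpha_k*cos(2*pi*k*t/p) + beta_k*sin(2*pi*k*t/p)]."""
    return sum(
        a * math.cos(2 * math.pi * k * t / period)
        + b * math.sin(2 * math.pi * k * t / period)
        for k, (a, b) in enumerate(zip(alpha, beta), start=1)
    )


# Hypothetical coefficients for N = 2 Fourier terms and a 12-month period.
alpha = [1.0, 0.5]
beta = [2.0, 0.3]
monthly_effect = [fourier_seasonality(t, alpha, beta, period=12) for t in range(24)]
```

The resulting sequence repeats every 12 steps, and a larger N lets the seasonal curve follow sharper within-year peaks at the cost of a higher risk of overfitting a short series.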

3.3. Model Evaluation Metrics

Akaike Information Criterion (AIC): This is a widely used metric for model selection that balances model fit with complexity. AIC rewards goodness-of-fit but penalizes models with more parameters in order to discourage overfitting. It is calculated using the formula shown below:

AIC = 2 k - 2 \ln (L)

(8)

Here, k is the number of estimated parameters and L is the maximized value of the model’s likelihood function.

Bayesian Information Criterion (BIC): This serves a similar purpose but applies a stronger penalty for model complexity, especially as the sample size increases. BIC tends to favor simpler models and is often preferred for inference or when working with large datasets. It is more conservative than AIC in selecting models with fewer parameters. Its formula is given below:

BIC = k \ln (n) - 2 \ln (L)

(9)

Here, n is the number of observations. The MAE, MAPE, MSE, RMSE and Theil’s U1 Statistic are shown below:

MAE = \frac{1}{n} \sum_{t = 1}^{n} | y_{t} - \hat{y}_{t} |

(10)

MAPE = \frac{1}{n} \sum_{t = 1}^{n} \frac{| y_{t} - \hat{y}_{t} |}{y_{t}} \times 100 \%

(11)

MSE = \frac{1}{n} \sum_{t = 1}^{n} {(y_{t} - \hat{y}_{t})}^{2}

(12)

RMSE = \sqrt{\frac{1}{n} \sum_{t = 1}^{n} {(y_{t} - \hat{y}_{t})}^{2}}

(13)

τ = \frac{\sqrt{\frac{1}{n} \sum_{t = 1}^{n} ε_{t}^{2}}}{\sqrt{\frac{1}{n} \sum_{t = 1}^{n} y_{t}^{2}} + \sqrt{\frac{1}{n} \sum_{t = 1}^{n} \hat{y}_{t}^{2}}}

(14)

Here, \hat{y}_{t} are the forecast values, y_{t} are the actual values, and ε_{t} = y_{t} - \hat{y}_{t} are the forecast errors. In this study, Theil’s U1 statistic is denoted as τ, with 0 ≤ τ ≤ 1. A value of τ close to zero indicates highly accurate forecasts, with almost perfect alignment between predicted and actual values, whereas τ = 1 indicates that the model performs no better than a naïve benchmark (usually a random walk or no-change forecast).
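The evaluation metrics can be computed in a few lines. The sketch below uses the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), together with the error metrics defined above; the function names and the sample values are hypothetical, and the log-likelihood is passed in directly rather than estimated.

```python
import math


def mae(y, yhat):
    return sum(abs(a - f) for a, f in zip(y, yhat)) / len(y)


def mape(y, yhat):
    # Actual values must be non-zero; monthly crash counts here are positive.
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(y, yhat)) / len(y)


def mse(y, yhat):
    return sum((a - f) ** 2 for a, f in zip(y, yhat)) / len(y)


def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))


def theil_u1(y, yhat):
    """RMSE divided by the sum of the root mean squares of actuals and forecasts."""
    n = len(y)
    denom = math.sqrt(sum(a * a for a in y) / n) + math.sqrt(sum(f * f for f in yhat) / n)
    return rmse(y, yhat) / denom


def aic(k, log_likelihood):
    return 2 * k - 2 * log_likelihood


def bic(k, n, log_likelihood):
    return k * math.log(n) - 2 * log_likelihood


# Invented validation values, for illustration only.
actual = [40, 50, 60]
forecast = [44, 45, 60]
scores = {
    "MAE": mae(actual, forecast),
    "MAPE": mape(actual, forecast),
    "RMSE": rmse(actual, forecast),
    "U1": theil_u1(actual, forecast),
}
```

Note that AIC and BIC are computed from the fitted likelihood on the training data, whereas the error metrics compare forecasts with held-out observations, so the two groups answer different questions.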

3.4. Forecasting of Time Series

Using the validation dataset, the model evaluation metrics can be calculated. After determining the suitable model based on evaluation metrics, future forecasting is conducted. Forecasting in time series can be approached via the infinite Moving Average (MA) representation (Equation (15)), which expresses future values as a weighted sum of past shocks (random error terms), or more commonly through the ARIMA (Auto Regressive Integrated Moving Average) model (Equation (16)), which uses autoregressive terms (past observations), moving average terms (past forecast errors), and differencing (to achieve stationarity).

\hat{Y}_{t + r} = μ + \sum_{i = 1}^{\infty} ψ_{i} ε_{t + r - i}

(15)

\hat{Y}_{t + r} = \sum_{i = 1}^{p + d} ϕ_{i} y_{t + r - i} + ε_{t + r} - \sum_{i = 1}^{q} θ_{i} ε_{t + r - i}

(16)

Here, \hat{Y}_{t + r} denotes the forecast for time t + r; μ is the mean of the stationary process; ψ_{i} are the weights assigned to past shocks in the MA representation; ε_{t + r - i} are the white noise error terms; ϕ_{i} are the autoregressive (AR) coefficients; θ_{i} are the moving average (MA) coefficients; and p, d, and q represent the AR order, degree of differencing, and MA order, respectively. Once a forecast is obtained for \hat{Y}_{t + 1}, it can be used recursively to forecast \hat{Y}_{t + 2}, \hat{Y}_{t + 3}, and so on, enabling multi-step forecasting for any future time point.
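The recursive scheme can be sketched as follows. The function is hypothetical and follows the ARIMA forecasting form without a constant term; future shocks are replaced by their expected value of zero, and each forecast is fed back as if it were an observation. The coefficients and inputs are invented, and phi is assumed to hold the combined AR coefficients after differencing has been folded in.

```python
def recursive_forecast(history, residuals, phi, theta, steps):
    """Multi-step forecast: y_hat = sum_i phi_i*y_{t+r-i} - sum_i theta_i*e_{t+r-i}.

    Requires len(history) >= len(phi) and len(residuals) >= len(theta).
    """
    y = list(history)
    e = list(residuals)
    forecasts = []
    for _ in range(steps):
        ar = sum(p * y[-i] for i, p in enumerate(phi, start=1))
        ma = sum(q * e[-i] for i, q in enumerate(theta, start=1))
        y_hat = ar - ma
        forecasts.append(y_hat)
        y.append(y_hat)  # the forecast feeds the next step
        e.append(0.0)    # expected value of a future shock
    return forecasts


# Invented inputs: two past observations, one past residual.
six_months = recursive_forecast([10.0, 12.0], [1.0], phi=[0.5], theta=[0.5], steps=6)
```

Because every step reuses earlier forecasts, errors accumulate with the horizon, which is one reason to limit projections to a short window such as six months.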

For both the SARIMA and Prophet models, the input consists of a univariate monthly crash count time series {y₁, y₂, …, y₃₀}, where each observation y_{t} represents the total number of reported ADAS or ADS crashes in month t, aggregated from the preprocessed NHTSA dataset (July 2021–December 2023, n = 30). The output of both models is a point forecast \hat{y}_{t + h} representing the predicted monthly crash count for forecast horizon h = 1, 2, …, 6 (July–December 2024), expressed as a non-negative real number rounded to the nearest integer for interpretation. For the Prophet model, each point forecast is additionally accompanied by a 95% prediction interval [yhat_lower, yhat_upper], reflecting forecast uncertainty. The SARIMA model produces only point forecasts without native prediction intervals at this configuration.

4. Results and Discussion

4.1. Formulation of the SARIMA Framework

To build a reliable forecasting framework, it is first necessary to understand the time-series structure and identify potential anomalies. Therefore, box plots of monthly crash counts were constructed for both ADAS and ADS crashes (Figure 2 and Figure 3) to visually inspect their distributions and identify potential outliers in the training data.

The monthly ADAS crash time series can be visualized from the above box-plot diagram (Figure 2). The lowest number of ADAS crashes occurs in February, where the median count is approximately 29, making it the most stable and least dangerous month across the study period. Looking at the broader seasonal pattern, crashes remain relatively moderate during the first quarter (January–March), though March exhibits notable variability, with a wide interquartile range, suggesting inconsistent crash frequency across years. The spring months of April and May show moderate increases, with May recording a consistently higher median of approximately 48. A slight dip is observed from April through June, with relatively tight distributions indicating stable but moderate crash activity. August emerges as a secondary peak month, with elevated crash activity, while July shows relatively high year-to-year variability. The final quarter shows a clear upward trend, with December emerging as the most dangerous month, recording the highest median crash count of approximately 65 or above and a wide interquartile range. The distribution of monthly crashes exhibits a bimodal seasonal pattern, with two distinct peaks observed in January and May–June, and two troughs in February and August, confirming the presence of strong seasonality in the data. Furthermore, no data points were found beyond the whisker boundaries in any month, indicating the absence of statistical outliers in the training dataset. This suggests that the data is sufficiently clean and no outlier treatment or replacement is necessary prior to model development. The ratio between the maximum median monthly value (December, ~65) and the minimum median monthly value (February, ~29) is approximately 2.24, reflecting a substantial seasonal difference, underscoring a strong seasonal pattern present in the ADAS crash data that must be accounted for in forecasting.

The monthly ADS crash time series can be visualized from the box-plot diagram in Figure 3. The lowest number of ADS crashes occurs in January, where the median count is approximately 10, making it the least active month across the study period. February shows a slight increase, with a tight distribution, indicating consistent but low crash activity. A notable rise is observed in March, with a median of approximately 19 and a narrow interquartile range, suggesting relatively stable and moderate crash frequency. April and May follow a similar moderate trend, while June shows a marginal decline. July and August emerge as the most active months, with August recording the highest median of approximately 21 and the widest interquartile range, reflecting substantial year-to-year variability. September and October show elevated but declining activity, while November and December record the lowest medians in the latter half of the year. There are no visible outliers. The distribution of monthly ADS crashes exhibits a trimodal seasonal pattern, with three discernible peaks observed in March, July, and August, and troughs in January, November, and December. The ratio between the maximum median monthly value (August) and the minimum median monthly value (January) reflects a considerable seasonal difference, underscoring the importance of accounting for seasonal dynamics in any ADS crash forecasting.
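The peak-to-trough comparison of monthly medians used in the box-plot discussion can be reproduced with a small sketch. The helper names and the individual monthly counts are invented; they are chosen only so that the medians match the approximate December (~65) and February (~29) values quoted for ADAS.

```python
from statistics import median


def median_by_calendar_month(series):
    """series: list of ((year, month), count). Returns {month: median count}."""
    by_month = {}
    for (_, month), count in series:
        by_month.setdefault(month, []).append(count)
    return {m: median(v) for m, v in sorted(by_month.items())}


def peak_to_trough_ratio(medians):
    """Ratio of the largest to the smallest monthly median."""
    return max(medians.values()) / min(medians.values())


# Invented counts whose medians equal the approximate values quoted in the text.
toy = [((2022, 2), 28), ((2023, 2), 30), ((2022, 12), 60), ((2023, 12), 70)]
medians = median_by_calendar_month(toy)
ratio = peak_to_trough_ratio(medians)
```

With only two or three observations per calendar month in a 30-month series, such medians are rough indicators, so the ratio is best read as descriptive rather than as a precise seasonal amplitude.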

To further understand the structural dynamics of monthly crash counts, a preliminary time-series decomposition was performed using an additive model. This method assumes that the observed time series is the sum of three distinct components:

$$y_{t} = \tau_{t} + \sigma_{t} + \varepsilon_{t} \qquad (17)$$

Here, $\tau_{t}$ is the trend component (the long-term progression or direction of the series), $\sigma_{t}$ is the seasonal component (the recurring, periodic fluctuations), and $\varepsilon_{t}$ is the residual component, the random noise or irregular variation not explained by trend or seasonality.
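To make the additive model concrete, the sketch below implements a classical moving-average decomposition in plain Python. It is an illustration under stated assumptions, not the routine used in the study: the text does not say which tool produced Figures 4 and 5, the function name `additive_decompose` is hypothetical, and the input series is synthetic. The trend is estimated with a centered 2 × 12 moving average, the seasonal term is the mean detrended value for each calendar position, and the residual is what remains.

```python
def additive_decompose(y, period=12):
    """Classical additive decomposition: y[t] = trend[t] + seasonal[t] + resid[t].

    Assumes an even period and at least two full periods of data.
    """
    n, half = len(y), period // 2
    # Trend: centered 2 x period moving average (undefined at both ends).
    trend = [None] * n
    for t in range(half, n - half):
        w = y[t - half : t + half + 1]
        trend[t] = (0.5 * w[0] + sum(w[1:-1]) + 0.5 * w[-1]) / period
    # Seasonal: mean detrended value per calendar position, centered on zero.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(y[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    seasonal = [raw[t % period] - offset for t in range(n)]
    # Residual: whatever trend and seasonality leave unexplained.
    resid = [None if trend[t] is None else y[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, resid

# Synthetic 36-month series: linear trend plus a zero-sum 12-month pattern.
pattern = [6, -5, -4, 0, 3, 4, 1, -3, -2, 0, -1, 1]
series = [20 + 0.5 * t + pattern[t % 12] for t in range(36)]
trend, seasonal, resid = additive_decompose(series)
```

Because the synthetic series contains no noise, the recovered seasonal term equals `pattern` and the residuals are zero up to floating-point error. On real crash counts the residuals would carry the irregular variation described in the text.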

Figure 4 shows that for ADAS, monthly crash counts fluctuate between 20 and 70, with significant variability and sharp fluctuations throughout the series. For ADS, as shown in Figure 5, the counts fluctuate more smoothly between 10 and 40. The ADAS trend component shows a steady upward trajectory from late 2021 through early 2024, suggesting a gradual increase in ADAS incidents. This could be the result of rising ADAS adoption, improved reporting, or system-specific vulnerabilities. The ADS trend component also shows an increase in incidents, especially from mid-2022 onward. The seasonal patterns for ADAS and ADS are relatively stable, with consistent peaks and troughs across years. This suggests that, for both systems, crashes may be influenced by recurring temporal factors, such as weather, traffic density and user behavior. The residual components in both series are approximately centered around zero, with no evident autocorrelation or systematic bias, which suggests that the additive decomposition has captured the underlying trend and seasonal structure, leaving mainly random noise.

To evaluate the temporal dependency of the ADAS crash series and to inform the preliminary SARIMA model specification, Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots were generated, as shown in Figure 6 and Figure 7.

The ACF plot in Figure 6 shows a gradual decline across lags, with initial values exceeding the confidence interval. A slowly decaying ACF of this kind is typical of autoregressive dependence and may also indicate non-stationarity, supporting the need for differencing; a pure Moving Average process would instead produce an ACF that cuts off after a few lags. In the PACF plot for ADAS in Figure 7, a prominent spike at lag 1, followed by rapid decay within the confidence bounds, suggests a short autoregressive structure, likely AR(1). The absence of significant partial autocorrelations beyond lag 1 indicates limited long-range dependency.

Figure 8 and Figure 9 show the ACF and PACF plots for ADS crashes. The autocorrelation at lag 1 is strong and statistically significant, followed by a gradual decay across subsequent lags. This pattern points to autoregressive dependence and possible non-stationarity, so differencing may be required. In the PACF plot, a sharp spike at lag 1, with all subsequent lags falling within the confidence bounds, points to a short autoregressive structure, likely AR(1). The clean cutoff after lag 1 supports a low-order AR term. These insights guide the initial parameter bounds for SARIMA grid search and model tuning.
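As a rough guide to how such plots are read, the sketch below computes the sample ACF and the usual approximate 95% white-noise bound of ±1.96/√n. It is illustrative only. The series is a synthetic drifting sequence rather than the crash data, and the function names are hypothetical. For a 30-month training series the bound is about ±0.36, so only fairly large autocorrelations register as significant.

```python
import math

def sample_acf(y, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def white_noise_bound(n, z=1.96):
    """Approximate 95% bound on r_k if the series were white noise."""
    return z / math.sqrt(n)

# Synthetic 30-month series with a steady upward drift (NOT the NHTSA data).
series = [20 + t for t in range(30)]
acf = sample_acf(series, max_lag=6)
bound = white_noise_bound(len(series))
significant_lags = [k for k, r in enumerate(acf) if k > 0 and abs(r) > bound]
```

A drifting series like this one produces large, slowly shrinking autocorrelations, which is the signature of non-stationarity described in the text. The PACF needs an additional recursion (for example Durbin–Levinson) and is omitted here.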

To assess the stationarity of the crash count series, both the Augmented Dickey–Fuller (ADF) and Kwiatkowski–Phillips–Schmidt–Shin (KPSS) tests were applied as shown in Table 1. These tests offer complementary perspectives, with ADF testing the null hypothesis of a unit root (non-stationarity), while KPSS tests the null hypothesis of stationarity [36,37].

The ADF statistics for ADAS in Table 1 reject the null hypothesis of a unit root, indicating that the series is stationary, while the KPSS statistics are close to the rejection threshold, suggesting mild non-stationarity. For the ADS series, both tests suggest non-stationarity.
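Because the two tests have opposite null hypotheses, their results are usually read jointly. The helper below sketches one common reading. The function name and the p-values are hypothetical and are not taken from Table 1. In practice the p-values would come from routines such as `adfuller` and `kpss` in `statsmodels.tsa.stattools`.

```python
def joint_stationarity_verdict(adf_p, kpss_p, alpha=0.05):
    """Combine ADF (H0: unit root) and KPSS (H0: stationary) p-values."""
    adf_says_stationary = adf_p < alpha       # ADF rejects the unit root
    kpss_says_stationary = kpss_p >= alpha    # KPSS does not reject stationarity
    if adf_says_stationary and kpss_says_stationary:
        return "stationary"
    if not adf_says_stationary and not kpss_says_stationary:
        return "non-stationary"
    if adf_says_stationary:
        # ADF: stationary, KPSS: not stationary.
        return "difference-stationary (consider differencing)"
    # ADF: unit root, KPSS: stationary.
    return "trend-stationary (consider detrending)"

# Hypothetical p-values illustrating a borderline KPSS result.
borderline_5pct = joint_stationarity_verdict(adf_p=0.01, kpss_p=0.06)
borderline_10pct = joint_stationarity_verdict(adf_p=0.01, kpss_p=0.06, alpha=0.10)
both_reject = joint_stationarity_verdict(adf_p=0.40, kpss_p=0.02)
```

The borderline case shows why a KPSS statistic close to its threshold matters: the verdict changes from stationary to difference-stationary when the significance level moves from 5% to 10%, which is one reason to keep differencing orders in the search space.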

Guided by decomposition, autocorrelation diagnostics, and stationarity testing, a comprehensive SARIMA grid search was implemented for both datasets using nested loops over candidate values for (p, d, q) and (P, D, Q), with the seasonal period set to m = 12. The search covered non-seasonal parameters p ∈ [0, 2], d ∈ [0, 2], q ∈ [0, 2] and seasonal parameters P ∈ [0, 2], D ∈ [0, 2], Q ∈ [0, 2], all restricted to integer values, which resulted in a total of 3^6 = 729 unique SARIMA combinations for each dataset. Each configuration was fitted using Python libraries (i.e., statsmodels.tsa.statespace.SARIMAX). Given the ACF and PACF plots, which showed at most one dominant spike and no indication of higher-order lag structure, and the limited length of the training series (30 months), the values of (p, d, q, P, D, Q) were restricted to ≤ 2 to maintain parsimony and avoid over-fitting. For each fitted model, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were recorded to evaluate model fit during training. Each model was then used to generate predictions for the separate validation set, and RMSE, MSE, MAE, MAPE and Theil’s U1 were calculated to assess predictive accuracy. The best model was selected based on the lowest AIC and BIC, but only after confirming its predictive ability through the validation metrics. This two-stage evaluation was intended to ensure that the final model was both statistically efficient and practically reliable for forecasting.
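The size of the search space can be checked by enumerating it. The sketch below is a minimal illustration, not the authors' script. The function names are hypothetical, and `fit_one` shows only how a single candidate could be passed to statsmodels' `SARIMAX` class with its `order` and `seasonal_order` arguments. In practice some configurations may fail to converge and would need error handling.

```python
from itertools import product

SEASONAL_PERIOD = 12
ORDERS = range(3)  # candidate values 0, 1, 2 for each of p, d, q, P, D, Q

def sarima_grid():
    """Yield (order, seasonal_order) for every candidate configuration."""
    for p, d, q, P, D, Q in product(ORDERS, repeat=6):
        yield (p, d, q), (P, D, Q, SEASONAL_PERIOD)

def fit_one(train, order, seasonal_order):
    """Fit one candidate and return (AIC, BIC); requires statsmodels."""
    # Imported lazily so the enumeration runs without statsmodels installed.
    from statsmodels.tsa.statespace.sarimax import SARIMAX
    result = SARIMAX(train, order=order, seasonal_order=seasonal_order).fit(disp=False)
    return result.aic, result.bic

grid = list(sarima_grid())
print(len(grid))  # 3 ** 6 candidate configurations
```

Using `itertools.product` instead of six nested loops gives the same 729 candidates in a form that is easy to filter, for example to skip configurations whose differencing orders leave too few observations.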

4.2. Formulation of Facebook Prophet Framework

The same pre-processed monthly crash dataset used for the SARIMA model, checked for outliers and converted to a time-indexed format, was used for Prophet. The data were reformatted to the Prophet-required structure, with the date and observation columns labelled ds and y, respectively. Hyperparameters were tuned separately for each monthly series using a grid search over changepoint_prior_scale (5 values), seasonality_prior_scale (4 values), and seasonality_mode (2 options: additive/multiplicative), yielding 40 unique combinations per series. The models were trained on observations up to 2023-12-31 and validated on a holdout period from 2024-01-01 to 2024-06-30. Prophet’s make_future_dataframe and predict functions were used to generate forecasts for the validation period. U.S. holidays were incorporated via Prophet’s built-in add_country_holidays(country_name="US"), and yearly seasonality was enabled (weekly and daily disabled) to match monthly data. The specific values for changepoint_prior_scale and seasonality_prior_scale were chosen to span the typical ranges suggested in the Prophet documentation [7], from strongly regularized to relatively flexible settings. Each configuration was evaluated on the validation set, and the final combination was selected based on validation RMSE, MSE, MAPE and Theil’s U1.
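The tuning setup described above can be sketched as follows. This is an illustration under explicit assumptions rather than the authors' code. The candidate values in the two prior-scale lists are hypothetical (the text reports only how many values were tried), the function names are invented for the example, and `freq="MS"` assumes the monthly observations are stamped on the first day of each month. The Prophet calls themselves (`Prophet`, `add_country_holidays`, `fit`, `make_future_dataframe`, `predict`) are the ones named in the text.

```python
from itertools import product

# Hypothetical candidate values: the text gives only the counts (5, 4 and 2).
CHANGEPOINT_PRIOR_SCALES = [0.001, 0.01, 0.05, 0.1, 0.5]
SEASONALITY_PRIOR_SCALES = [0.01, 0.1, 1.0, 10.0]
SEASONALITY_MODES = ["additive", "multiplicative"]

def prophet_grid():
    """Return every hyperparameter combination as a keyword dictionary."""
    keys = ("changepoint_prior_scale", "seasonality_prior_scale", "seasonality_mode")
    combos = product(CHANGEPOINT_PRIOR_SCALES, SEASONALITY_PRIOR_SCALES,
                     SEASONALITY_MODES)
    return [dict(zip(keys, combo)) for combo in combos]

def forecast_validation(train_df, params, horizon=6):
    """Fit on a frame with columns ds and y, then predict `horizon` more months."""
    # Imported lazily so the grid logic runs without Prophet installed.
    from prophet import Prophet
    model = Prophet(yearly_seasonality=True, weekly_seasonality=False,
                    daily_seasonality=False, **params)
    model.add_country_holidays(country_name="US")
    model.fit(train_df)
    future = model.make_future_dataframe(periods=horizon, freq="MS")
    return model.predict(future).tail(horizon)[["ds", "yhat"]]

grid = prophet_grid()
print(len(grid))  # 5 * 4 * 2 combinations per series
```

Each dictionary in `grid` can be passed to `forecast_validation`, and the resulting `yhat` values compared with the six held-out months to rank the configurations.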

4.3. Model Evaluation

Table 2 shows the top 10 models across all combinations for the ADAS series.

The AIC and validation metrics were used to choose the best model. The best-performing SARIMA (p, d, q)(P, D, Q, S) model for the ADAS series was (0, 0, 2)(2, 0, 2, 12), selected based on the lowest AIC (21.637) and BIC (15.103) and strong validation metrics (RMSE: 5.731, MSE: 32.84, MAE: 4.527, MAPE: 13.687%, and Theil’s U1: 0.103), as shown in Table 2. Although the configuration did not include non-seasonal autoregressive terms (p = 0), it consistently outperformed other candidates across both statistical and predictive criteria, suggesting that the selected structure was sufficient to capture the series’ underlying dynamics.

The best-performing SARIMA (p, d, q)(P, D, Q, S) configuration for the ADS series was (2, 2, 0)(0, 2, 0, 12), selected based on the lowest AIC (−31.872) and BIC (−35.793) and robust predictive metrics (RMSE: 5.292, MSE: 28.007, MAE: 4.721, MAPE: 13.458%, and Theil’s U1: 0.102), as shown in Table 3. Despite containing no Moving Average terms (q = 0, Q = 0), this model consistently outperformed alternatives across both statistical and forecast accuracy criteria. Its structure, which relies on second-order non-seasonal and seasonal differencing together with two autoregressive terms, captured the underlying seasonal and trend dynamics of the series with few estimated parameters.

Table 4 shows Prophet’s accuracy on the validation dataset across four metrics for ADAS, listing the top 10 models. The model was tuned over three parameters: changepoint prior scale, seasonality prior scale and seasonality mode. The best configuration used a changepoint prior scale of 0.1, a seasonality prior scale of 0.1 and multiplicative seasonality, for which the model yielded an MSE of 7.330, RMSE of 2.710, MAPE of 6.90%, and a Theil’s U1 value of 0.089, reflecting the magnitude and relative scale of forecast deviations. By comparison, the best SARIMA configuration for ADAS, (0, 0, 2)(2, 0, 2, 12) in Table 2, yielded higher error values (MSE of 32.84, RMSE of 5.731, MAPE of 13.687%, and Theil’s U1 of 0.103).

Table 5 shows the Prophet model’s accuracy on the same four metrics for the top 10 models for ADS. A changepoint prior scale of 0.05, a seasonality prior scale of 1 and the additive seasonality mode were found to be the optimal values for the ADS data. The model yielded an RMSE of 2.242, MSE of 5.026, MAPE of 8.850%, and Theil’s U1 of 0.095.

The results from Table 2, Table 3, Table 4 and Table 5 suggest that the Prophet model yielded better outputs than the SARIMA model across both datasets in terms of accuracy and error metrics. For ADAS, RMSE fell from 5.731 to 2.710 and MAPE from 13.687% to 6.90%; for ADS, RMSE fell from 5.292 to 2.242 and MAPE from 13.458% to 8.850%. The differences in Theil’s U1 and MAPE further show the model’s robustness in capturing temporal dynamics. As the Prophet model consistently showed higher accuracy across both ADAS and ADS datasets, it was selected for future projection of crashes. To transparently demonstrate Prophet’s performance, a detailed month-wise error analysis was conducted over the validation period for both ADAS and ADS series, as shown in Table 6 and Table 7.
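For reference, the validation metrics compared above can be computed as in the following sketch. It uses the standard textbook definitions, which are assumed to match those given earlier in the paper, with Theil's U1 taken as the RMSE divided by the sum of the root mean squares of the actual and forecast series. The function name and the six-month values are hypothetical and are not the study's data.

```python
import math

def forecast_metrics(actual, forecast):
    """MSE, RMSE, MAE, MAPE (%) and Theil's U1 for paired observations."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    # MAPE assumes no actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    rms_forecast = math.sqrt(sum(f * f for f in forecast) / n)
    theil_u1 = rmse / (rms_actual + rms_forecast)  # bounded between 0 and 1
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape, "U1": theil_u1}

# Hypothetical six-month validation window (NOT the values in Tables 6 and 7).
actual = [50, 60, 40, 55, 62, 58]
forecast = [47, 64, 40, 53, 65, 57]
metrics = forecast_metrics(actual, forecast)
```

Since RMSE is the square root of MSE, the two always rank models identically. MAPE and Theil's U1 add scale-free views, which is useful when comparing series with different typical counts, such as ADAS and ADS.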

Table 6 and Table 7 report the forecasted crash counts together with standard month-wise error measures such as the absolute error and squared error, which allow the aggregate metrics to be traced back to individual months. The low MAE, MSE, RMSE, MAPE and Theil’s U1 values show that the Prophet model consistently provided accurate forecasts for both ADAS and ADS crash series. The optimal seasonality mode differed between datasets: the best performance was achieved for ADAS with multiplicative seasonality and for ADS with additive seasonality. This difference is consistent with the underlying data characteristics, with ADAS crashes showing seasonal variations that scale with the level of the series (multiplicative), whereas ADS crashes show seasonal fluctuations of roughly constant size (additive).
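The practical difference between the two seasonality modes can be seen in a small numerical sketch. In additive mode the seasonal term is a fixed number of crashes added to the trend. In multiplicative mode Prophet scales the trend by one plus the seasonal term. The trend levels below are hypothetical. The seasonal sizes (+4 crashes and +0.45) are of the same order as the peaks reported later for the fitted ADS and ADAS components and are used only for illustration.

```python
def additive_forecast(trend, seasonal_offset):
    """Additive mode: the seasonal term is a fixed number of crashes."""
    return trend + seasonal_offset

def multiplicative_forecast(trend, seasonal_fraction):
    """Multiplicative mode: the seasonal term is a fraction of the trend."""
    return trend * (1 + seasonal_fraction)

# Hypothetical trend levels (crashes per month) early and late in a series.
rows = []
for trend in (20.0, 40.0):
    add = additive_forecast(trend, seasonal_offset=4.0)
    mul = multiplicative_forecast(trend, seasonal_fraction=0.45)
    # Seasonal swing in crashes: constant when additive, growing when multiplicative.
    rows.append((trend, add - trend, mul - trend))
```

As the trend doubles from 20 to 40, the additive swing stays at 4 crashes while the multiplicative swing doubles from 9 to 18. A series whose peaks grow with its level is therefore better served by the multiplicative mode.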

4.4. Forecast of Future ADAS Crash Counts

The Prophet model was trained on the dataset from July 2021 to December 2023, and its accuracy was validated using predictions on the January–June 2024 validation dataset. The previous section showed that Facebook Prophet offered the best compromise between model complexity and predictions that are close to actual outcomes; it was therefore used to forecast future data. A forecast was generated for the next six months (July to December 2024), as shown in Figure 10. It indicates a rising trend with seasonal fluctuations based on historical data, and the validation predictions reasonably match actual values. The July to December 2024 forecast (red line) suggests that monthly crashes might stay high, possibly between 40 and 80. Although the confidence interval widens over the forecast horizon, reflecting growing uncertainty, it remains bounded and does not point to uncontrolled spikes. In practical terms, without specific actions such as better driver training, system improvements, or policy changes, ADAS-related crashes are likely to remain at current levels or increase slightly but predictably.

Figure 11 compares the theoretical quantiles (expected under a normal distribution) with the actual residuals from the Prophet model’s forecasts of ADAS crashes. Most points align closely with the red reference line, which indicates that the residuals are broadly consistent with normality. A slight deviation in the upper tail (right side) suggests that the model may slightly underpredict some high values. To further assess residual behavior, the Ljung–Box test was applied at lag 10, yielding a p-value of 0.01. Because the null hypothesis of this test is that the residuals are not autocorrelated up to the tested lag, a p-value of 0.01 is compatible with no statistically significant autocorrelation only at significance levels stricter than 1%; at the conventional 5% level it would point to some remaining residual autocorrelation. Subject to that caveat, the Q–Q plot and the validation metrics indicate that the model has captured the major temporal structure in the data.
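The mechanics of the Ljung–Box test can be sketched in plain Python. The statistic is Q = n(n + 2) Σ r_k² / (n − k) over lags k = 1..h, and under the null hypothesis of no autocorrelation it is compared with a chi-square distribution. The sketch makes two simplifying assumptions that should be read as such: the degrees of freedom are set to the number of lags, with no adjustment for fitted parameters, and the chi-square tail uses a closed form that is valid only for an even number of lags. The residuals are synthetic, not those of the fitted model. In practice a library routine such as `acorr_ljungbox` in statsmodels would normally be used.

```python
import math

def ljung_box_q(residuals, lags=10):
    """Ljung-Box Q = n(n+2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

def ljung_box_pvalue(residuals, lags=10):
    """p-value under H0: no autocorrelation up to `lags` (df = lags assumed)."""
    return chi2_sf_even(ljung_box_q(residuals, lags), df=lags)

# Synthetic residuals with an obvious leftover trend (NOT the model's residuals).
trending = [float(t) for t in range(30)]
p_trending = ljung_box_pvalue(trending)
```

A small p-value, as for the trending residuals here, rejects the null hypothesis and signals leftover autocorrelation. A large p-value means the test found no evidence against independent residuals.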

Figure 12 shows the trend component extracted from the Prophet model for ADAS crash data. The steadily rising blue line indicates a consistent upward trajectory in the underlying crash trend, independent of seasonal or irregular fluctuations. This trend suggests that over time, ADAS-related crashes have been increasing. Possible reasons include, but are not limited to, the broader adoption of the ADAS technology, changing traffic patterns, or system limitations. While the trend does not capture short-term variations, it provides a clear signal that the baseline risk is gradually increasing.

Figure 13 shows the yearly seasonality component of ADAS crash data from the Prophet model. Because the selected ADAS model uses multiplicative seasonality, the values are expressed as fractions of the trend rather than as crash counts. The blue line reveals recurring peaks and troughs, indicating that crash counts tend to rise sharply around January and mid-year and drop in between. The pattern exhibits two major seasonal peaks: the highest occurs around early January, reaching approximately +0.43 to +0.45, and a moderate peak in May–June reaches approximately +0.20 to +0.21. Conversely, major troughs appear around February–March (approximately −0.30) and August–September (approximately −0.15), indicating periods when seasonal effects reduce crash counts below the baseline. This cyclical pattern suggests that certain months consistently experience higher numbers of crashes, possibly due to weather, traffic volume, or behavioral factors such as holiday travel or commuting peaks. The amplitude of the seasonal effect ranges from about −0.30 to +0.45, indicating that seasonality can significantly influence monthly crash counts.

The graph (Figure 14) shows the average annual pattern of ADAS crashes, with the x-axis representing the day of year and the y-axis showing the deviation from the yearly mean crash count. Positive values suggest above-average crash activity, while negative values indicate below-average levels. The pattern exhibits a bimodal distribution with two distinct peaks. The highest peak occurs around early January (~+18) and a moderate peak appears around May–June (approximately +8). Conversely, two major troughs appear around late February–March (approximately −10 to −13) and August–September (~−7) indicating periods when crash activity falls significantly below the annual average. These patterns suggest that early January and June consistently experience higher levels of crash activity, while March and September represent relatively safe periods. The wide amplitude from roughly −13 to +18 highlights the strong influence of intra-year dynamics on crash frequency.

The observed bimodal pattern in ADAS crashes, with a peak in early January and a secondary peak in May–June, may be associated with several plausible external mechanisms. The January peak coincides with post-holiday traffic resumption, increased winter precipitation and reduced road surface friction, which may degrade sensor performance in camera and radar-based ADAS systems. Holiday travel in December–January also increases vehicle miles traveled on unfamiliar routes, potentially straining system capabilities. The May–June secondary peak may correspond to end-of-quarter vehicle delivery cycles and fleet deployment surges, higher summer traffic volumes, and increased highway driving associated with seasonal travel. The troughs in February–March and August–September may reflect post-holiday traffic normalization and a mid-year lull in deployment activity, respectively. It must be emphasized that these are hypothesized mechanisms drawn from domain knowledge and the prior literature. The present univariate framework does not permit causal attribution of these patterns to any specific external factor.

It is important to note that the seasonal patterns identified in this study represent empirically observed temporal regularities derived from univariate time-series analysis. Without the inclusion of external covariates such as monthly vehicle miles traveled (VMT), ADAS fleet size, weather indices (e.g., temperature, precipitation, visibility), or software update logs, these observed peaks and troughs cannot be causally attributed to specific external factors. The hypothesized mechanisms described above, including holiday travel patterns, seasonal weather conditions affecting sensor performance, end-of-quarter vehicle delivery cycles, and seasonal testing schedules, are plausible explanations based on domain knowledge and industry practices. However, formal attribution would require a multivariate modeling framework that explicitly incorporates such covariates. Future research incorporating these external variables would allow for more definitive causal inferences regarding the drivers of observed seasonality.

The stacked area chart (Figure 15) shows the monthly breakdown of ADAS crashes by accident type from July 2021 to July 2024. The chart reveals an upward trend in total crash volume, rising from approximately 20 to 35 crashes per month in 2021 and 2022 to peaks exceeding 65 crashes by early 2024. Three accident types dominate the composition: unknown/non-contact crashes (dark gray), which are particularly prominent during peak periods, frontal impacts (light blue) and multi-point/complex crashes (orange). Rear-end impacts (light green) and side impacts (purple) also contribute, but to a lesser degree, while corner impacts and vertical impacts remain relatively rare. Notable peaks in total crash volume occur around January 2023 (~65 crashes) and May–June 2024 (~67 crashes), with distinct seasonal fluctuations evident throughout the series. The large share of unknown crashes indicates that crash data reporting should be improved.

4.5. Forecast of Future ADS Crash Counts

Figure 16 shows the ADS crash forecast for the next six months using the Prophet model. It reveals a rising trend with seasonal fluctuations. Validation predictions closely follow actual values, suggesting the model captures temporal patterns well. The forecast for July to December 2024 suggests monthly crash counts will likely range between 20 and 40. The narrowing confidence interval reflects increased certainty in the predicted values. Unless targeted interventions such as improved driver monitoring, system recalibration or regulatory oversight are introduced, ADS-related crashes are expected to rise in the near future.

The Ljung–Box test was applied at lag 10, yielding a p-value of 0.929 (Figure 17). This high p-value suggests no significant autocorrelation, reinforcing the assumption of temporal independence and supporting the robustness of the Prophet model’s forecasts.

The Prophet trend component for ADS in Figure 18 shows a steady, near-linear increase from approximately 11.5 to 29.5 crashes per month between September 2021 and mid-2024, a roughly 2.6-fold increase. This upward trajectory suggests a growing baseline crash risk, which could be due to increased ADS deployment, improved reporting and cumulative exposure. The consistent linearity from 2023 to 2024 indicates that ADS crash frequencies may continue to rise in the near future.

The annual seasonality component of the ADS crashes in Figure 19 shows recurring fluctuations in crash risk throughout the year. There is a trimodal pattern with three distinct peaks: April–May (reaching approximately +4), August (~+3) and October (~+3.5). Conversely, three major troughs appear in January–February (approximately −4), September (a transition period, around −4), and December (~−7, the deepest dip). These seasonal patterns may correspond to operational cycles such as increased deployment during spring and late summer testing periods, environmental factors such as varying weather conditions, or behavioral trends such as seasonal traffic patterns. The amplitude of seasonal effects, ranging from approximately −7 to +4, indicates that seasonality plays a substantial role in monthly crash variation and should be considered when planning safety measures. The consistent recurrence of this trimodal pattern across multiple years suggests that these are systematic seasonal effects rather than random fluctuations.

The average annual pattern (Figure 20) reveals the typical within-year crash frequency variation, averaged across all observed years. The curve shows a complex wave-like structure with multiple peaks and troughs, indicating strong intra-year cyclicality. Three distinct peaks emerge in April (~+3.5 to +4), August (~+4, the highest annual peak) and October (~+3.5), indicating that crash counts exceed the annual average during these periods. Conversely, three major troughs occur in February–March (~−4), September (a brief dip during transition), and December (~−7, the deepest annual trough), indicating relatively safe periods. The variation ranging from approximately −7 to +4 shows that these seasonal effects are substantial and should be considered for policy planning. Notably, the trimodal pattern of spring, late summer and fall suggests that ADS crashes are influenced by multiple distinct seasonal risk factors operating at different times. Safety measures should therefore be timed to these periods rather than applied uniformly across the year.

The trimodal ADS crash pattern, with peaks in April–May, August, and October, may reflect a distinct set of operational drivers compared to ADAS. Spring testing and deployment cycles, which are common among ADS manufacturers conducting expanded road trials ahead of annual reporting periods, may contribute to elevated crash counts in April–May. The August peak could be associated with high summer traffic volumes, heat-related sensor degradation (particularly for LiDAR and thermal camera systems), and intensified operational testing. The October peak may correspond to fall fleet deployment announcements and pre-winter testing campaigns, as well as increased traffic variability from school reopening and seasonal commuting changes. The December trough, the deepest in the ADS series, may partly reflect winter operational restrictions imposed on ADS fleets, reduced deployment, and lower vehicle miles traveled. As with the ADAS analysis, these remain domain-informed hypotheses: the observed patterns are empirical temporal regularities and causal attribution requires future multivariate investigation.

As with ADAS, the trimodal seasonal pattern observed in ADS crashes represents an empirically identified temporal regularity rather than a causally established relationship. Without monthly VMT data, ADS fleet size estimates, weather indices, and deployment logs, the mechanisms proposed above remain plausible hypotheses rather than empirically validated causal factors. Future work incorporating such covariates in a multivariate Prophet model or hierarchical time-series framework would enable the disentanglement of exposure growth effects from intrinsic seasonal risk variations.

The stacked area chart (Figure 21) shows the monthly distribution of ADS crashes by accident type from July 2021 to October 2024. Frontal impacts (light orange) form the largest and most consistent base layer throughout the observation period, followed by rear-end crashes (brown), which maintain a substantial and stable contribution. Multi-point/complex crashes (red) show sporadic spikes, particularly during peak periods. Side-impact (pink) and unknown/non-contact cases (yellow) provide moderate contributions, while corner and vertical impacts remain relatively rare. Overall crash volumes show clear peaks around July 2023 and July 2024, suggesting seasonal or operational surges in risk, with total monthly counts reaching approximately 35–40 during peak periods. The chart underscores the need to prioritize mitigation strategies for high-frequency crash types, particularly frontal and rear-end collisions, while also addressing the mid-year seasonal surges.

5. Conclusions

This study aimed to assess the accuracy of univariate SARIMA and Facebook Prophet time-series forecasts for ADAS- and ADS-related crashes, which are becoming more frequent, and to identify key patterns in those crashes, including trends, seasonality, and overall structure.

The study used the NHTSA crash database of ADAS- and ADS-controlled vehicles. The dataset contained crashes from July 2021 to June 2024, of which the first 30 months (up to December 2023) were used to train both models and the final 6 months were used for validation to assess forecast accuracy. The Facebook Prophet model outperformed SARIMA in forecasting both ADAS and ADS crashes: its evaluation metrics (MSE, RMSE, MAE, MAPE, and Theil’s U1) were lower than those of the SARIMA model on both series. The Prophet model’s robustness in forecasting ADAS and ADS crashes was assessed through these evaluation metrics and residual diagnostics.

The analysis showed a steadily rising risk profile for both ADAS and ADS crashes over time, with clear seasonal patterns and structural trends. A notable finding is the identification of distinct seasonal patterns in both crash types. ADAS crashes showed a bimodal pattern, with dual peaks in early January and May–June, whereas ADS crashes displayed a trimodal pattern, with three distinct peaks in April–May, August and October. This complex seasonal structure suggests that AV crashes are influenced by multiple operational or environmental factors that recur at specific times throughout the year. The different seasonal profiles for ADAS (bimodal) and ADS (trimodal) indicate that different automation levels may face different seasonal vulnerabilities. ADS crashes peak in spring, late summer and fall, whereas ADAS crashes peak in early winter (January) and in late spring to early summer (May–June). These recurring patterns indicate that crash risks are not random but cyclically concentrated, likely influenced by temporal factors such as traffic exposure, environmental conditions, system usage patterns and seasonal testing or deployment schedules. The presence of strong seasonal components and upward trends underscores the need for heightened vigilance and targeted, level-specific safety measures, especially during high-risk periods, to mitigate foreseeable surges in AV crashes. Additionally, the crash reporting system should be improved to reduce the number of unknown crash types, enhance the reliability of safety analyses and enable more targeted intervention strategies.

Limitations: This study has several important limitations that warrant consideration. First, the relatively short time-span of the dataset (36 months) restricts the model’s ability to capture long-term structural shifts, rare seasonal anomalies, or multi-year cyclical patterns. Extended temporal coverage or the inclusion of conventional motor vehicle (CMV) crash data for comparative analysis would enhance forecast robustness. Second, the reliance on monthly aggregation may smooth over short-term spikes or intra-month variations that could provide finer-grained insights into crash dynamics. Third, and most critically, this study employs univariate time-series models that do not incorporate external covariates such as monthly vehicle miles traveled (VMT), ADAS and ADS fleet size, weather indices (e.g., temperature, precipitation, visibility), or software update logs. Consequently, the observed seasonal patterns, while empirically robust, represent temporal regularities rather than causally attributed seasonality. The absence of exposure and environmental covariates means that the identified peaks and troughs reflect observed cyclical behavior in crash counts without disentangling whether these variations arise from changes in fleet exposure, seasonal risk factors, or operational deployment patterns. Future research should adopt multivariate frameworks, such as Prophet with exogenous regressors or hierarchical time-series models, to incorporate these covariates explicitly. Such approaches would enable the decomposition of observed seasonality into distinct components: exposure-driven growth (e.g., fleet expansion), environmental risk variation (e.g., weather impacts on sensor performance), and deployment-driven fluctuations (e.g., seasonal testing cycles). This would allow for more precise causal attribution and more actionable insights for policymakers and manufacturers.

Author Contributions

Conceptualization, J.B., M.E.M. and A.H.; Methodology, J.B.; Validation, M.S.B.S. and T.C.; Formal analysis, J.B.; Resources, M.S.B.S. and T.C.; Data curation, M.E.M. and A.H.; Writing—original draft, J.B. and A.H.; Writing—review & editing, M.E.M., M.S.B.S. and T.C.; Visualization, J.B.; Supervision, M.S.B.S., A.S.H. and T.C.; Project administration, A.S.H. and T.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets presented in this article are openly available in NHTSA at https://www.nhtsa.gov/laws-regulations/standing-general-order-crash-reporting (accessed on 10 May 2026).

Acknowledgments

The authors gratefully acknowledge the National Highway Traffic Safety Administration (NHTSA) for making the dataset publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

References

WHO. Global Status Report on Road Safety 2018. Available online: https://www.who.int/publications/i/item/9789241565684 (accessed on 17 March 2025).
NHTSA. CrashStats—NHTSA—Department of Transportation. Available online: https://crashstats.nhtsa.dot.gov (accessed on 18 August 2025).
Banik, J.; Siraj, M.S.B.; Campisi, T. Analyzing Rear-End Collisions: A Comparative Study of ADS and ADAS Involvement. European Transport/Trasporti Europei. Available online: https://www.istiee.unict.it/sites/default/files/files/ET_2026_106_10.pdf (accessed on 10 March 2026).
NHTSA. Standing General Order on Crash Reporting. Available online: https://www.nhtsa.gov/laws-regulations/standing-general-order-crash-reporting (accessed on 18 August 2025).
Samadi, N.; Javid, R.; Ansaroudi, S.Z.; Dehestanimonfared, N.; Naseri, M.; Jeihani, M. Machine Learning Assessment of Crash Severity in ADS and ADAS-L2 Involved Crashes with NHTSA Data. Safety 2026, 12, 2. [Google Scholar] [CrossRef]
Cicchino, J.B. Effectiveness of forward collision warning and autonomous emergency braking systems in reducing front-to-rear crash rates. Accid. Anal. Prev. 2017, 99, 142–152. [Google Scholar] [CrossRef]
Facebook. Prophet/python/prophet/forecaster.py. GitHub. Available online: https://github.com/facebook/prophet/blob/main/python/prophet/forecaster.py (accessed on 24 September 2025).
Kalra, N.; Paddock, S.M. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transp. Res. Part A Policy Pract. 2016, 94, 182–193.
Feng, T.; Zheng, Z.; Xu, J.; Liu, M.; Li, M.; Jia, H.; Yu, X. The comparative analysis of SARIMA, Facebook Prophet, and LSTM for road traffic injury prediction in Northeast China. Front. Public Health 2022, 10, 946563.
Lavrenz, S.M.; Vlahogianni, E.I.; Gkritza, K.; Ke, Y. Time series modeling in traffic safety research. Accid. Anal. Prev. 2018, 117, 368–380.
Zhang, Y.; Liu, M. Risk Prediction and Safety Driving in Automated Driving: A Review from the Perspective of Embedded Systems. Appl. Comput. Eng. 2025, 149, 209–220.
Chen, D.; Zhang, Z.; Cheng, L.; Liu, Y.; Yang, X.T. INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation. arXiv 2026, arXiv:2502.00262.
Talebpour, A.; Mahmassani, H.S.; Bustamante, F.E. Modeling Driver Behavior in a Connected Environment: Integrated Microscopic Simulation of Traffic and Mobile Wireless Telecommunication Systems. Transp. Res. Rec. 2016, 2560, 75–86.
Kashkanov, A.; Semenov, A.; Kashkanova, A.; Kryvinska, N.; Palchevskyi, O.; Baraban, S. Estimating the effectiveness of electric vehicles braking when determining the circumstances of a traffic accident. Sci. Rep. 2023, 13, 19916.
Deretić, N.; Stanimirović, D.; Awadh, M.A.; Vujanović, N.; Djukić, A. SARIMA Modelling Approach for Forecasting of Traffic Accidents. Sustainability 2022, 14, 4403.
Sulaie, S.A. Use of ARIMA Model for Forecasting Consequences Due to Traffic Crashes in the Kingdom of Saudi Arabia. J. Road Saf. 2024, 35, 54–65.
Choo, B.C.; Razak, M.A.; Tohir, M.Z.M.; Biak, D.R.A.; Syam, S. An Accident Prediction Model Based on ARIMA in Kuala Lumpur, Malaysia, Using Time Series of Actual Accidents and Related Data. Pertanika J. Sci. Technol. 2024, 32, 1103–1122.
Cai, B.; Di, Q. Different Forecasting Model Comparison for Near Future Crash Prediction. Appl. Sci. 2023, 13, 759.
Kontopoulou, V.I.; Panagopoulos, A.D.; Kakkos, I.; Matsopoulos, G.K. A Review of ARIMA vs. Machine Learning Approaches for Time Series Forecasting in Data Driven Networks. Future Internet 2023, 15, 255.
Boye, P.; Ziggah, Y.Y.; Agyarko, K. A Short-Term Prediction Model for the Number of Registered Motor Vehicles Using Facebook Prophet Forecasting Approach. Eng. Technol. J. 2024, 9, 5650–5658.
Almaskati, D.; Kermanshachi, S.; Pamidimukkala, A. Investigating the impacts of autonomous vehicles on crash severity and traffic safety. Front. Built Environ. 2024, 10, 1383144.
Combs, T.S.; Sandt, L.S.; Clamann, M.P.; McDonald, N.C. Automated Vehicles and Pedestrian Safety: Exploring the Promise and Limits of Pedestrian Detection. Am. J. Prev. Med. 2019, 56, 1–7.
Beckers, N.; Siebert, L.C.; Bruijnes, M.; Jonker, C.; Abbink, D. Drivers of partially automated vehicles are blamed for crashes that they cannot reasonably avoid. Sci. Rep. 2022, 12, 16193.
Broughton, J. Forecasting road accident casualties in Great Britain. Accid. Anal. Prev. 1991, 23, 353–362.
Karacasu, M.; Er, A.; Bilgiç, S.; Barut, H.B. Variations in Traffic Accidents on Seasonal, Monthly, Daily and Hourly Basis: Eskisehir Case. Procedia Soc. Behav. Sci. 2011, 20, 767–775.
Wang, X.; Su, Y.; Zheng, Z.; Xu, L. Prediction and interpretive of motor vehicle traffic crashes severity based on random forest optimized by meta-heuristic algorithm. Heliyon 2024, 10, e35595.
Abdulrazaq, M.A.; Fan, W.D. Seasonal instability in the determinants of vulnerable road user crashes: A partially temporally constrained modeling approach. Accid. Anal. Prev. 2026, 224, 108277.
Wang, J.; Kaza, N.; McDonald, N.C.; Khanal, K. Socio-economic disparities in activity-travel behavior adaptation during the COVID-19 pandemic in North Carolina. Transp. Policy 2022, 125, 70–78.
NHTSA. Crash Data Systems. Available online: https://www.nhtsa.gov/data/crash-data-systems (accessed on 18 March 2025).
Theofilatos, A.; Yannis, G. A review of the effect of traffic and weather characteristics on road safety. Accid. Anal. Prev. 2014, 72, 244–256.
Hsu, C.-K. Reconsidering Seasonality, Weather, and Road Safety in Non-temperate Areas: The Case of Kaohsiung, Taiwan. Travel Behav. Soc. 2024, 34, 100710.
Nofal, F.; Saeed, A. Seasonal variation and weather effects on road traffic accidents in Riyadh City. Public Health 1997, 111, 51–55.
Box, G. Box and Jenkins: Time Series Analysis, Forecasting and Control. In A Very British Affair: Six Britons and the Development of Time Series Analysis During the 20th Century; Mills, T.C., Ed.; Palgrave Macmillan: London, UK, 2013; pp. 161–215.
IIETA. Forecasting the Spread of COVID-19 Pandemic with Prophet. Available online: https://www.iieta.org/journals/ria/paper/10.18280/ria.350202 (accessed on 26 September 2025).
Rathore, R.K.; Mishra, D.; Mehra, P.S.; Pal, O.; Hashim, A.S.; Shapi’i, A.; Ciano, T.; Shutaywi, M. Real-world model for bitcoin price prediction. Inf. Process. Manag. 2022, 59, 102968.
Dickey, D.A.; Fuller, W.A. Distribution of the Estimators for Autoregressive Time Series with a Unit Root. J. Am. Stat. Assoc. 1979, 74, 427–431.
Kwiatkowski, D.; Phillips, P.C.B.; Schmidt, P.; Shin, Y. Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? J. Econom. 1992, 54, 159–178.

Figure 1. Flow diagram of the study.

Figure 2. Box plot of monthly crash counts (ADAS).

Figure 3. Box plot of monthly crash counts (ADS).

Figure 4. Monthly crash count decomposition (ADAS).

Figure 5. Monthly crash count decomposition (ADS).

Figure 6. ACF plot for ADAS crash data (Shaded area represents the confidence interval).

Figure 7. PACF plot for ADAS crash data (Shaded area represents the confidence interval).

Figure 8. ACF plot for ADS crash data (Shaded area represents the confidence interval).

Figure 9. PACF plot for ADS crash data (Shaded area represents the confidence interval).

Figure 10. Future forecast of ADAS crashes.

Figure 11. Residual diagnostics of the Prophet model for ADAS crashes: QQ plot and Ljung–Box test results.

Figure 12. Long-term trend component of ADAS crashes.

Figure 13. Yearly seasonality pattern of ADAS crashes.

Figure 14. Average annual pattern in ADAS crashes.

Figure 15. Monthly ADAS crash composition by accident type.

Figure 16. Future forecast of ADS crashes.

Figure 17. Residual diagnostics of the Prophet model for ADS crashes: QQ plot and Ljung–Box test results.

Figure 18. Long-term trend component of ADS crashes.

Figure 19. Yearly seasonality pattern of ADS crashes.

Figure 20. Average annual pattern of ADS crashes.

Figure 21. Monthly composition of ADS crashes by accident type.

Table 1. ADF and KPSS statistics.

Series	Test	Statistic	p-Value
ADAS	ADF	−3.7526	0.0034
ADAS	KPSS	0.5329	0.0342
ADS	ADF	−2.1122	0.2396
ADS	KPSS	0.6917	0.0143
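
One common way to read Table 1 is to combine the two tests, because their null hypotheses are opposite: the ADF test takes a unit root as its null, whereas the KPSS test takes stationarity as its null. The sketch below applies that reading to the p-values in Table 1. The 0.05 significance level, the function name `classify_stationarity`, and the verdict labels are assumptions made for illustration and are not taken from the study.

```python
def classify_stationarity(adf_p: float, kpss_p: float, alpha: float = 0.05) -> str:
    """Combine ADF and KPSS p-values into a single verdict.

    ADF null hypothesis: the series has a unit root (non-stationary).
    KPSS null hypothesis: the series is stationary.
    alpha is an assumed significance level, not a value from the study.
    """
    adf_says_stationary = adf_p < alpha      # ADF rejects the unit-root null
    kpss_says_stationary = kpss_p >= alpha   # KPSS fails to reject stationarity

    if adf_says_stationary and kpss_says_stationary:
        return "stationary"
    if not adf_says_stationary and not kpss_says_stationary:
        return "non-stationary"
    if adf_says_stationary:
        # Tests disagree: ADF indicates stationarity, KPSS does not.
        return "difference-stationary"
    # Tests disagree: KPSS indicates stationarity, ADF does not.
    return "trend-stationary"


# (ADF p-value, KPSS p-value) pairs copied from Table 1.
table_1 = {"ADAS": (0.0034, 0.0342), "ADS": (0.2396, 0.0143)}
verdicts = {
    name: classify_stationarity(adf_p, kpss_p)
    for name, (adf_p, kpss_p) in table_1.items()
}
print(verdicts)
```

Under these assumptions the two tests disagree for ADAS, while both point to non-stationarity for ADS. A different significance level could change either verdict.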

Table 2. Model combinations and evaluation metrics for ADAS-SARIMA.

Parameters (p, d, q, P, D, Q, s)	AIC	BIC	MSE	RMSE	MAE	MAPE (%)	Theil’s U1
(0, 0, 2, 2, 0, 2, 12)	21.637	15.103	32.84	5.731	4.527	13.687	0.103
(0, 0, 1, 2, 0, 2, 12)	34.562	30.880	66.129	8.132	6.910	17.726	0.108
(0, 0, 0, 1, 0, 1, 12)	149.917	152.416	99.591	9.980	9.046	21.202	0.109
(0, 0, 2, 2, 0, 0, 12)	40.041	38.999	101.651	10.082	9.634	22.083	0.111
(0, 0, 2, 2, 0, 1, 12)	42.041	40.791	107.138	10.351	9.802	22.395	0.115
(0, 0, 1, 2, 0, 0, 12)	40.334	39.501	106.040	10.298	9.868	22.264	0.115
(0, 1, 2, 1, 1, 1, 12)	30.685	24.376	110.712	10.522	8.857	21.095	0.116
(0, 1, 1, 1, 1, 0, 12)	37.710	36.538	116.651	10.801	9.883	21.136	0.122
(0, 1, 2, 1, 1, 0, 12)	37.428	35.866	119.878	10.949	10.421	23.327	0.123
(0, 0, 0, 2, 0, 2, 12)	40.436	38.483	118.105	10.868	9.643	19.892	0.124

Table 3. Model combinations and evaluation metrics for ADS-SARIMA.

Parameters (p, d, q, P, D, Q, s)	AIC	BIC	MSE	RMSE	MAE	MAPE (%)	Theil’s U1
(2, 2, 0, 0, 2, 0, 12)	−31.872	−35.793	28.007	5.292	4.721	13.458	0.102
(1, 2, 1, 2, 0, 0, 12)	−31.572	−36.079	8929.956	94.498	85.066	296.944	0.980
(2, 0, 0, 2, 0, 2, 12)	−30.539	−34.835	7609.784	87.234	78.591	275.228	0.976
(2, 0, 0, 2, 0, 0, 12)	−28.149	−31.218	145.582	12.066	10.589	43.042	0.174
(1, 2, 0, 2, 0, 0, 12)	−26.799	−30.404	8929.765	94.497	85.065	296.940	0.980
(2, 0, 0, 2, 0, 1, 12)	−26.378	−30.060	7608.502	87.227	78.584	275.204	0.976
(1, 2, 0, 2, 0, 1, 12)	−24.034	−28.541	8929.868	94.498	85.065	296.943	0.980
(2, 1, 0, 2, 0, 2, 12)	−23.136	−29.446	3537.050	59.473	54.182	196.005	0.513
(2, 2, 0, 2, 0, 1, 12)	−18.762	−26.603	702.599	26.507	21.109	64.749	0.615
(2, 2, 0, 2, 0, 2, 12)	−18.287	−27.435	702.533	26.505	21.108	64.745	0.615

Table 4. Model evaluation metrics of the FB Prophet model for ADAS.

Changepoint Prior Scale	Seasonality Prior Scale	Seasonality Mode	MSE	RMSE	MAPE (%)	Theil’s U1
0.1	0.1	multiplicative	7.330	2.71	6.9	0.089
0.1	0.1	additive	23.240	4.82	11.8	0.098
0.001	0.1	additive	32.260	5.68	14.9	0.104
0.05	0.1	additive	113.159	10.638	26.46	0.108
0.01	0.1	additive	113.226	10.641	26.466	0.108
0.05	0.1	multiplicative	119.827	10.947	26.961	0.11
0.001	0.1	multiplicative	124.233	11.146	27.462	0.111
0.01	0.1	multiplicative	123.418	11.109	27.413	0.111
0.001	1	multiplicative	141.69	11.903	29.077	0.118
0.01	1	multiplicative	142.055	11.919	28.753	0.119

Table 5. Model evaluation metrics of the FB Prophet model for ADS.

Changepoint Prior Scale	Seasonality Prior Scale	Seasonality Mode	MSE	RMSE	MAPE (%)	Theil’s U1
0.05	1	additive	5.026	2.242	8.850	0.095
0.01	1	additive	23.160	4.812	14.930	0.108
0.001	1	additive	25.240	5.024	15.487	0.110
0.001	0.1	additive	52.533	7.248	21.616	0.125
0.05	1	multiplicative	53.396	7.307	25.709	0.125
0.1	0.1	additive	52.609	7.253	21.281	0.126
0.05	0.1	additive	52.679	7.258	21.263	0.126
0.01	0.1	additive	52.904	7.273	21.191	0.126
0.05	0.1	multiplicative	55.061	7.420	23.641	0.128
0.001	0.01	multiplicative	59.109	7.688	28.028	0.128

Table 6. Forecast performance statistics for ADAS on validation data.

Month–Year	Actual	Predicted	e_t	|e_t|	e_t²	e_t²/Actual²	% Error
January–2024	29	34	−5.0	5.0	25.0	0.0297	−17.24
February–2024	25	28	−3.0	3.0	9.00	0.0144	−12.00
March–2024	46	48	−2.0	2.0	4.00	0.0019	−4.35
April–2024	53	55	−2.0	2.0	4.00	0.0014	−3.77
May–2024	65	66	−1.0	1.0	1.00	0.0002	−1.54
June–2024	58	59	−1.0	1.0	1.00	0.0003	−1.72
Total	276	290		14.0	44.0	0.0480
MAE:	2.33
MSE:	7.33
RMSE:	2.71
MAPE:	6.9%
Theil’s U1:	0.089
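
The MAE, MSE and RMSE summary rows of Table 6 follow directly from the Actual and Predicted columns. As a minimal sketch, the snippet below recomputes them with the error defined as actual minus predicted, which is the sign convention the table rows show. The variable names are illustrative and not taken from the study's code.

```python
import math

# Actual and predicted ADAS crash counts for January to June 2024 (Table 6).
actual = [29, 25, 46, 53, 65, 58]
predicted = [34, 28, 48, 55, 66, 59]

# Forecast error per month: actual minus predicted.
errors = [a - p for a, p in zip(actual, predicted)]
n = len(errors)

mae = sum(abs(e) for e in errors) / n   # mean absolute error
mse = sum(e * e for e in errors) / n    # mean squared error
rmse = math.sqrt(mse)                   # root mean squared error

print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}")
```

Rounded to two decimals, these values agree with the MAE, MSE and RMSE rows reported in Table 6.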

Table 7. Forecast performance statistics for ADS on validation data.

Month–Year	Actual	Predicted	e_t	|e_t|	e_t²	e_t²/Actual²	% Error
January–2024	22	20.0	2.0	2.0	4.00	0.0083	9.09
February–2024	22	25.0	−3.0	3.0	9.00	0.0186	−13.64
March–2024	28	31.0	−3.0	3.0	9.00	0.0115	−10.71
April–2024	19	21.0	−2.0	2.0	4.00	0.0111	−10.53
May–2024	40	40.0	0.0	0.0	0.00	0.0000	0.00
June–2024	39	37.0	2.0	2.0	4.00	0.0026	5.13
Total	170	174.0		12.0	30.00	0.0521
MAE	2.0
MSE	5.0
RMSE	2.24
MAPE	8.85%
Theil’s U1	0.095
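
The ratio column of Table 7 (squared error divided by squared actual count, second from the right) can be reproduced row by row from the Actual and Predicted columns. The sketch below does this for the six validation months. Rounding each ratio to four decimals before summing is an assumption chosen to mirror how the table displays its values, and the variable names are illustrative.

```python
# Actual and predicted ADS crash counts for January to June 2024 (Table 7).
actual = [22, 22, 28, 19, 40, 39]
predicted = [20, 25, 31, 21, 40, 37]

# Squared error relative to the squared actual count, per month,
# rounded to four decimals as displayed in the table (assumed convention).
rel_sq = [round((a - p) ** 2 / a ** 2, 4) for a, p in zip(actual, predicted)]

# Column total, summed from the rounded per-month values.
rel_sq_total = round(sum(rel_sq), 4)

print(rel_sq, rel_sq_total)
```

Because each ratio is scaled by the observed count, a miss of two crashes weighs more heavily in a low-count month such as April (19 crashes) than in June (39 crashes).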

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

