1. Introduction
The nature of criminal activity is complex and constantly changing, making accurate prediction a significant challenge for experts in criminology, data science, and public policy. Unlike static datasets that might reflect unchanging phenomena, crime patterns are subject to continuous and often unpredictable evolution. This includes not only shifts influenced by broad socioeconomic conditions and demographic changes but also the rapid emergence of novel criminal methodologies, such as sophisticated cybercrimes and intricate types of financial fraud. Traditional statistical time series models, while historically valuable for capturing established temporal trends and predictable seasonalities in crime data, frequently fall short when confronted with these abrupt shifts, unforeseen events, or the profound nonlinear relationships that increasingly characterize contemporary criminal behavior. Their reliance on assumptions of linearity and stationarity limits their capacity to fully decipher the intricate causal webs driving modern crime.
Conversely, advanced machine learning models, particularly those capable of discerning highly complex and nonlinear patterns, offer significant predictive power. However, many of these algorithms operate as “black boxes,” providing accurate forecasts without revealing the underlying logic or the specific influence of individual factors on their outputs. This opacity is a substantial obstacle to actionable insight in public safety, where understanding why a risk is predicted is as important as the prediction itself, particularly for informing strategic resource allocation, policy adjustments, and even legal defensibility. The primary challenge in developing reliable crime prediction models is therefore to achieve the highest possible accuracy while maintaining interpretability.
This complex challenge is precisely what hybrid approaches like the ARIMA-ANN model aim to solve by combining statistical precision with machine learning’s adaptability. These models do not just merge methods; instead, a unified system is created that simultaneously handles the unique features of crime data. The ARIMA component effectively identifies and models linear temporal relationships and long-term trends within the data. In contrast, the ANN component is tailored to capture and learn from the complex, nonlinear residual patterns that remain unexplained by ARIMA. This integrated strategy enhances predictive accuracy by addressing both linear and nonlinear dynamics, while also allowing for more refined interpretation of the underlying factors influencing crime risk. Beyond the task of forecasting, this approach clarifies the variables most responsible for changes in model outputs and provides a robust foundation for continuous adaptation to evolving patterns in criminal activity. As a result, the methodology supports greater relevance, operational value, and transparency in data-driven decision-making for law enforcement and urban planning [
1,
2]. Recent studies have shown that human behavior can be anticipated using advanced data-driven methods. For instance, studies demonstrate how sensor data can predict changes in individual activity patterns [
3].
In recent years, classic statistical methods, such as Autoregressive Integrated Moving Average (ARIMA), have given way to machine learning techniques and mixed approaches. These new tools help capture both linear and nonlinear patterns in complex social systems [
4,
5].
The literature points out that single-method approaches struggle to explain the true complexity of crime. Crime rates are influenced by a mix of social, economic, demographic, and environmental factors [
6,
7]. While ARIMA models can track trends over time and spot seasonal effects, they struggle when data show sudden shifts, unexpected changes, or hidden nonlinearities [
8]. Artificial neural networks (ANNs), on the other hand, can catch these nonlinear patterns, although sometimes at the cost of less interpretability and a higher risk of overfitting [
9]. There is a clear movement in predictive analytics towards hybrid models that bring together the strengths of both statistical and machine learning methods, aiming for better accuracy and more reliable results [
10].
This study examines Galați County in Romania, a region that has changed considerably over the last eleven years, with shifts in society, the economy, migration, and patterns of criminal behavior. Using a detailed dataset covering January 2014 to December 2024 [
11,
12], we defined “crime risk” as a multiclass variable based on crime counts adjusted for population size. Balancing the classes corrects for historical imbalances that could otherwise skew model performance and affect policy [
13,
14].
This study systematically compares the ARIMA-ANN hybrid model with several established multiclass classification methods, using a variety of evaluation metrics and visualization tools. All steps in the data science process are thoroughly documented to ensure transparency and reproducibility. The criteria for feature selection, the approach to constructing the target variable, and the application of domain knowledge at each stage are clearly explained. The findings provide practical insights for public safety practitioners in Romania and demonstrate a methodological approach that can be adapted to similar contexts elsewhere.
The paper is structured as follows:
Section 1 reviews past research on hybrid crime forecasting and multiclass classification in criminology.
Section 2 describes our methods, data processing steps, and model design.
Section 3 presents the results, showing both graphs and tables for performance comparison.
Section 4 discusses the results.
Section 5 draws conclusions and looks at what our results mean for policy and future studies.
Over the last twenty years, research in crime prediction has grown fast. Law enforcement agencies and policymakers now use data-based approaches to select where to place prevention resources, manage operations, and build long-term safety plans [
15]. Early work in this area used statistical models like ARIMA, which could spot patterns over time and help set up prediction frameworks based on real evidence [
16]. However, since ARIMA assumes all patterns are linear, it misses sudden shocks, structural changes, and the fact that crime data often fail to remain stable over time [
8]. This limitation has led to the rise of hybrid models. A large body of work now focuses on hybrid ARIMA-ANN models. These can learn complex relationships and pick up on how outside factors relate to crime [
17,
18]. The original hybrid ARIMA-ANN approach, first suggested by Zhang [
10], uses ARIMA to model linear trends and then sends the leftover error (residuals) to an ANN, which can handle the nonlinear aspects. This two-step method has proven to improve predictions in real cases, from regional crime to city-level demand or even disease forecasting. Many studies confirm that hybrid models are good at catching sudden surges in crime during social or economic crises, or when big policy changes happen [
1].
Beyond hybrid time series models, advances in machine learning and classification have given us better tools to break down crime risk. Methods like multinomial logistic regression, decision trees, random forests, and support vector machines (SVMs) can classify crime risk into multiple categories using a range of indicators—social, economic, demographic, or environmental [
19,
20]. Adding specific variables, such as unemployment rate, economic health, city infrastructure, or weather, makes these models more generalizable and supports more detailed risk assessments [
21,
22]. Yet, criminological data often suffer from class imbalance. To fix this, careful processing steps need to be applied, such as resampling or designing balanced target variables [
23,
24,
25]. Recent comparison studies find that even though tree-based models and SVMs often perform well, hybrid ARIMA-ANN models can do better, especially when the data have time dependencies, nonlinear trends, or sudden changes [
26]. Zhang [
10] showed that the hybrid model produced better precision and recall than a classic ARIMA when forecasting monthly rates of complex social events, especially under abrupt change and nonlinear dynamics. Khashei and Bijari [
27] also found that hybrid models work well for urban risk prediction, since they blend both sequence and nonlinear patterns found in time series data. Similar results have come up in studies focused on Europe, showing that these models can adapt to regional crime risk, even when data limitations or operational conditions differ [
28]. Research also highlights the need for model evaluation and selection to be transparent and repeatable. There is a push for better validation, including nested cross-validation, tuning model settings, and using multiple performance metrics for well-grounded conclusions [
29]. Visualization tools—such as time series breakdowns, confusion matrices, or variable importance charts—have become common and help explain findings to both technical and non-technical audiences [
30]. The impact of hybrid and multiclass models in crime analytics extends beyond prediction. They are now used for operational policing, city planning, and policy action [
4,
31]. Studies from North America and Europe have reported clear drops in crime and less wasted resources once these models help set patrol routes or plan police schedules [
4,
32]. Hybrid ARIMA-ANN and ensemble machine learning models are both scalable and flexible, which makes them a good fit for local and regional authorities who need to base their actions on solid data in complex and changing settings.
There are still some challenges. Data often come from very different sources—police reports, social media, or sensors—which can be messy or incomplete. Models must also be interpretable, so decisions can be defended in court or during official checks [
33,
34]. In Romania, recent improvements in data quality mean it is now possible to apply these advanced models to full regional datasets, but this has not been explored enough. Our study fills this gap, offering a clear, empirical comparison between hybrid ARIMA-ANN and classic multiclass classifiers for monthly crime risk stratification, and sets up a method that can be used in future studies in Romania or similar settings.
2. Materials and Methods
The methodological framework for this research was designed to compare hybrid and traditional multiclass classification models and predict crime risk. It integrated best practices from statistical learning, machine learning, and criminology [
35,
36].
This study uses a new dataset from Galați County, Romania, covering the period from January 2014 to December 2024. The dataset combines data from official sources, including socioeconomic factors, environmental conditions, and crime records. Each entry contains socioeconomic and meteorological details, along with calculated indices and coefficients, as listed in Table 1.
The predictor variables used in our model—ranging from unemployment rates and income levels to tourism, trade, and climate—were selected based on their theoretical and empirical associations with crime risk. Each variable captures a distinct dimension of social, economic, or environmental stress that can influence criminal behavior. For example, high tourist activity may increase opportunity-driven crimes, while unemployment and low income reflect structural pressures linked to property crime. Seasonal climatic conditions also play a role by modulating public activity patterns. These variables, when analyzed collectively, provide a robust foundation for understanding and forecasting shifts in crime risk.
Before modeling, the dataset was checked for internal consistency, temporal continuity, and completeness, following standard statistical diagnostics for time series data. Missing or inconsistent entries were identified and addressed so that they would not affect subsequent analyses.
An essential step involved transforming the raw variable “Number of offenses” into a standardized crime rate. We calculated this rate by dividing the number of offenses by the resident population for the same period and multiplying by 100,000. This standardization allowed for reliable comparisons across different periods of time. In line with best practices, the “resident population number” variable was removed from the set of predictors because its influence was already reflected in the standardized crime rate [
37].
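The standardization step can be illustrated with a minimal pandas sketch. The column names and the three rows of data are hypothetical and serve only to show the arithmetic; they are not values from the Galați County dataset.

```python
import pandas as pd

# Hypothetical monthly records; names and values are illustrative only.
df = pd.DataFrame(
    {
        "month": ["2014-01", "2014-02", "2014-03"],
        "offenses": [450, 520, 390],
        "resident_population": [500_000, 499_000, 498_500],
    }
)

# Crime rate = offenses per 100,000 residents in the same month.
df["crime_rate"] = df["offenses"] / df["resident_population"] * 100_000

# The population is already reflected in the rate, so it is dropped
# from the columns that would later serve as predictors.
features = df.drop(columns=["resident_population"])
print(features)
```

Because every month is expressed on the same per-100,000 scale, months with different population sizes become directly comparable.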
For the classification task, the target variable “crime risk” was created by dividing monthly crime rates into five equal-sized classes: very low, low, moderate, high, and very high risk. This method, called quantile stratification, ensured balanced representation across risk categories and addressed issues related to uneven class distributions common in crime datasets [
38]. Each month was categorized based on its position within the crime rate distribution, producing five groups of similar size. These risk categories formed the basis for all subsequent multiclass analyses. Variables with low variance or high correlation (absolute correlation coefficient |r| > 0.85) were excluded or combined into composite variables, following established guidelines for mitigating multicollinearity and reducing dimensionality [
39]. The final set of predictors included socioeconomic and environmental factors, such as unemployment rates, monthly income, number of tourists, trade balance, climate conditions, and population increase. Continuous predictors underwent z-score normalization, which improves model stability and convergence [
40].
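The three preprocessing steps described above (quantile stratification of the target, correlation filtering at |r| > 0.85, and z-score normalization) can be sketched as follows. The data are synthetic and the column names are assumptions made for illustration. The sketch simply drops one variable from each highly correlated pair, whereas the study also mentions building composite variables.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_months = 132  # January 2014 to December 2024

# Synthetic stand-ins for the real series (illustrative values only).
crime_rate = pd.Series(rng.gamma(shape=9.0, scale=10.0, size=n_months))
X = pd.DataFrame(
    {
        "unemployment_rate": rng.normal(6.0, 1.0, n_months),
        "net_salary": rng.normal(3000.0, 300.0, n_months),
        "tourists": rng.normal(8000.0, 2000.0, n_months),
    }
)
# A nearly collinear variable, added to show the correlation filter at work.
X["unemployed_people"] = X["unemployment_rate"] * 2500 + rng.normal(0, 50, n_months)

# 1) Quantile stratification: five classes of (almost) equal size.
labels = ["very low", "low", "moderate", "high", "very high"]
crime_risk = pd.qcut(crime_rate, q=5, labels=labels)

# 2) Correlation filter: inspect the upper triangle of |r| and drop the
#    later variable of every pair with |r| > 0.85.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.85).any()]
X_reduced = X.drop(columns=to_drop)

# 3) z-score normalization of the remaining continuous predictors.
X_scaled = (X_reduced - X_reduced.mean()) / X_reduced.std(ddof=0)

print(crime_risk.value_counts().sort_index())
print("dropped:", to_drop)
```

In a forecasting setting, the quantile edges and the scaling statistics would normally be estimated on the training months only and then applied to later months.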
Prior to model building, exploratory data analysis was conducted to identify temporal trends, seasonal variations, and outliers. The crime rate time series was decomposed, and autocorrelation analyses were performed to determine temporal dependencies and guide the selection of appropriate lagged predictors [
5].
This preliminary analysis provided valuable findings, ensuring that subsequent models accurately reflected the data’s temporal and structural characteristics.
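As an illustration of the autocorrelation analysis, the sketch below computes the sample autocorrelation function with NumPy and flags lags that exceed an approximate 95% white-noise band. The series is synthetic, with an annual cycle built in, and the helper name `acf` is hypothetical; a library routine could be used instead.

```python
import numpy as np


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


# Synthetic monthly series with an annual cycle (illustrative only).
rng = np.random.default_rng(1)
t = np.arange(132)
series = 100 + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, size=t.size)

r = acf(series, max_lag=24)

# Approximate 95% band under the white-noise hypothesis.
bound = 1.96 / np.sqrt(series.size)
significant_lags = [k for k in range(1, 25) if abs(r[k]) > bound]
print(significant_lags)
```

Lags that stand out in such a check are natural candidates for lagged predictors and for the seasonal period of the time series model.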
Our main modeling approach utilized a hybrid ARIMA-ANN model in two steps. First, an ARIMA model captured linear trends and seasonal patterns in the crime rate time series. Then, an artificial neural network (ANN) modeled the nonlinear residual patterns that remained unexplained by the ARIMA model. Determining the ARIMA model parameters involved careful examination of autocorrelation (ACF) and partial autocorrelation (PACF) analyses, alongside minimizing the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Our diagnostic analyses consistently revealed significant short-term autocorrelation at lags of 1 to 3 months and annual seasonality at a lag of 12 months. Consequently, ARIMA models typically utilized parameter values of p and q between 1 and 2, with d set at 1, aligned with results from the Augmented Dickey-Fuller test for non-stationarity. This configuration effectively captured observed trends and seasonal structures, consistent with earlier research [
5,
10,
16].
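The ARIMA component is denoted ARIMA($p$, $d$, $q$), with orders chosen using the ACF, PACF, and AIC/BIC results, and is applied in the first modeling step to the monthly crime rate time series (standardized number of offenses per 100,000 population):
$$y_t = \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t$$
where $y_t$ represents the crime rate at time $t$ (differenced $d$ times when $d > 0$), $\phi_i$ and $\theta_j$ are the autoregressive and moving average parameters, and $\varepsilon_t$ is the residual error term.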
Once the ARIMA model has been fitted, the residuals—representing the components of the crime rate not accounted for by linear or seasonal behavior—are input to a feed-forward artificial neural network (ANN). The ANN models the nonlinear relationships between the socioeconomic predictors and the crime rate and supports better classification of the monthly crime risk category [
10]. As illustrated in Figure 1, the final prediction $\hat{y}_t$ is the sum of the ARIMA forecast $\hat{L}_t$ (linear and seasonal component of the crime rate) and a nonlinear correction $\hat{N}_t$ generated by the ANN:
$$\hat{y}_t = \hat{L}_t + \hat{N}_t$$
Following ARIMA model fitting, the residuals—indicating nonlinear patterns not explained by ARIMA—were analyzed using an ANN. The ANN was structured as a feed-forward multilayer perceptron optimized through practical validation procedures. Inputs included recent crime rates from the previous six months, the immediate ARIMA forecast, and up to four lags of the ARIMA residuals, justified by residual autocorrelation patterns. Extensive grid searches and five-fold time series cross-validation determined the optimal ANN configuration: a single hidden layer with 10 neurons, ReLU activation function, a learning rate of 0.005, L2 regularization of 0.01, and a dropout rate of 0.2 to minimize overfitting [
40]. Training stopped early when validation performance failed to improve for 50 consecutive epochs. This configuration was validated on the final 20% of the training data, which preserves temporal order and helps prevent data leakage.
To provide comparative findings, several traditional multiclass classification methods—multinomial logistic regression, decision trees, random forests, and support vector machines (SVM)—were evaluated using identical predictors. Each classifier underwent thorough cross-validation and hyperparameter optimization [
19]. Grid searches for tree-based models determined parameters such as maximum depth, minimum samples per leaf, and the number of estimators. For SVMs, kernel types and regularization strengths were tested [
20]. Performance evaluation involved stratified 5-fold cross-validation, ensuring consistent representation of risk categories across folds. Accuracy, macro-averaged precision, recall, F1-score, and AUC-ROC were measured [
41]. Additional analyses included confusion matrices, ROC curves, and feature importance plots for clearer interpretation of results.
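The benchmark evaluation can be sketched with scikit-learn as shown below. The data are synthetic, the hyperparameters are library defaults rather than the tuned values from the study, and one-vs-rest averaging is assumed for the multiclass AUC-ROC.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic predictors and a five-class target built from quantiles
# (illustrative only).
rng = np.random.default_rng(4)
n = 130
X = rng.normal(size=(n, 6))
score = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n)
y = np.digitize(score, np.quantile(score, [0.2, 0.4, 0.6, 0.8]))  # classes 0..4

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
}

scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro", "roc_auc_ovr"]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

summary = {}
for name, model in models.items():
    cv_out = cross_validate(model, X, y, cv=cv, scoring=scoring)
    summary[name] = {m: cv_out[f"test_{m}"].mean() for m in scoring}

for name, metrics in summary.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Macro averaging weights all five risk classes equally, which matches the balanced target design.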
All analyses were conducted using Python (version 3.10) and widely accepted libraries such as pandas for data handling, NumPy for numerical computations, Scikit-Learn for traditional machine learning methods, ARIMA from Statsmodels, and TensorFlow/Keras for neural networks [
42]. Data visualizations were created using matplotlib and seaborn. Code was thoroughly documented and version-controlled, ensuring complete reproducibility in line with computational social science standards [
43,
44]. Special attention was dedicated to avoiding future information in model training or selection, thereby preventing data leakage. Hyperparameter optimization occurred strictly within training subsets of each fold. Final evaluation was conducted on a separate independent test set—the last 20% of data—ensuring unbiased and reliable predictions [
45,
46].
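The leakage safeguards can be illustrated with a short sketch: a chronological hold-out of the last 20% of months, and forward-chaining validation folds inside the remaining training period. The use of scikit-learn's `TimeSeriesSplit` here is an assumption for illustration; the study does not name the splitter it used.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_months = 132
idx = np.arange(n_months)

# Chronological hold-out: the last 20% of months form the test set.
n_test = int(round(0.2 * n_months))
train_idx, test_idx = idx[:-n_test], idx[-n_test:]

# Forward-chaining folds inside the training period: every validation block
# lies strictly after the months used to fit the model in that fold.
tscv = TimeSeriesSplit(n_splits=5)
folds = [(train_idx[tr], train_idx[va]) for tr, va in tscv.split(train_idx)]

for fit_months, val_months in folds:
    print(
        f"fit on months 0..{fit_months.max()}, "
        f"validate on {val_months.min()}..{val_months.max()}"
    )
```

Hyperparameters would be chosen using these folds only, and the held-out months would be used once, for the final evaluation.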
This detailed methodological framework is designed to yield robust and actionable findings and to provide a solid foundation for evidence-based policy decisions in crime prevention and public safety. By clearly outlining each analytical step and its justification, this approach facilitates both replication and practical application of the study’s results.
3. Results
This section details the comparison between the hybrid ARIMA-ANN model and traditional multiclass classifiers, combining objective quantitative outcomes with contextual qualitative assessments informed by domain-specific knowledge. The results are based on tests using the Galați County dataset, after applying all the data preparation and target definition steps described earlier. The analysis starts with basic statistics and visual graphs and then moves on to model evaluation and interpretation of the main findings.
An initial investigation showed some significant patterns in terms of spatio-temporal dependencies and the relationships between socioeconomic predictors and the standardized crime rate.
Figure 2 plots the normalized monthly series of the main explanatory variables: the unemployment rate, the inflation-adjusted net salary, the number of tourist arrivals, and the international trade balance.
The figure shows clear evidence of synchronized peaks, especially during downturns and shocks, when the unemployment rate, crime rate, and net salary move upward together while tourist arrivals fall sharply. These results are consistent with the existing literature linking economic hardship to rising crime [
47,
48]. A seasonal effect is also present: high temperatures coincide with higher crime activity, as observed in tourist districts during the summer, so seasonal lags and dummy variables were included in the model [
49].
Defining the new multiclass target “Crime risk” produced a balanced classification problem, as shown by the temporal distribution of risk classes in Figure 3. Visually, regime transitions appear to be triggered by macroeconomic or policy events: during the COVID-19 lockdown, for example, class membership shifted abruptly from moderate or low risk to high or very high risk. The balanced five-class stratification prevents class imbalance bias in the model learning process, as can be seen from the even distribution of cases and the clearly visible regime transitions in the visualization [
50].
The time series decomposition and the autocorrelation analyses shown in
Figure 4 identified short-term autocorrelation (lags of 1 to 3 months) as well as prominent annual seasonality (lag of 12 months), which supported the use of an ARIMA model with non-seasonal and seasonal components. The PACF indicates that most of the linear signal can be handled by a simple autoregressive model of order 1 or 2 (denoted AR(1) or AR(2)), meaning that the current crime rate can be effectively predicted from one or two preceding values, with higher lags providing little additional explanatory power. These results show that the ARIMA model can capture the linear and periodic structure of the crime rate series, leaving a cleaner set of residuals for the subsequent ANN step [
51].
The performance of the hybrid ARIMA-ANN model was compared with that of classical multiclass classifiers: multinomial logistic regression, decision trees, random forests, and support vector machines. All models were trained and validated with five-fold stratified cross-validation to ensure that class balance and temporal structure were maintained. The key performance metrics (accuracy, macro-averaged precision, recall, F1-score, and AUC-ROC) are reported for each classifier in Table 2. The hybrid ARIMA-ANN model generalizes well out of sample and tends to outperform all benchmarks, with an overall accuracy of 76.4%, a macro F1-score of 0.762, and an AUC-ROC of 0.882 on the hold-out test set. The random forest and SVM models also give competitive results but perform worse than the hybrid model, especially in recall and in separating the “high” and “very high” risk classes. The multinomial logistic regression and decision tree models show lower predictive ability, with a marked decrease in precision and recall for the high-risk classes, which is consistent with the results of other studies [
23,
26,
52].
The detailed confusion matrices and ROC curves, as shown in
Figure 5 and
Figure 6, highlight the relative strength and weakness of each approach.
The confusion matrix of the hybrid model is strongly concentrated on the diagonal, with most errors occurring between neighboring classes, such as “moderate” and “high” risk, rather than between extreme classes. This is a desirable property for risk stratification models, because errors between adjacent classes have smaller operational consequences than classifying a period of very high risk as very low risk, or vice versa [
41]. The ROC analysis further supports the performance of the hybrid model, showing strong discriminative ability among classes and especially high sensitivity and specificity for “high” and “very high” risk periods.
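One way to quantify how strongly errors concentrate between neighboring classes is the share of misclassifications that fall one step off the diagonal of the confusion matrix. The sketch below computes that share for a made-up 5x5 matrix. The counts are not the study's results, and the helper name `adjacent_error_share` is hypothetical.

```python
import numpy as np


def adjacent_error_share(cm):
    """Fraction of all misclassifications that land in a neighboring class."""
    cm = np.asarray(cm, dtype=float)
    i, j = np.indices(cm.shape)
    errors = cm[i != j].sum()
    adjacent = cm[np.abs(i - j) == 1].sum()
    return adjacent / errors if errors > 0 else float("nan")


# Made-up confusion matrix (rows: true class, columns: predicted class),
# ordered from "very low" to "very high".
cm = np.array(
    [
        [5, 1, 0, 0, 0],
        [1, 4, 1, 0, 0],
        [0, 1, 3, 2, 0],
        [0, 0, 1, 4, 1],
        [0, 0, 1, 0, 4],
    ]
)

share = adjacent_error_share(cm)
accuracy = np.trace(cm) / cm.sum()
print(f"accuracy={accuracy:.3f}, adjacent error share={share:.3f}")
```

A share close to 1 means that almost all mistakes are one-class slips, which is the pattern described for the hybrid model.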
Figure 7 shows the predicted versus actual risk class trajectories for the test period (2022–2024), indicating that the predictions of the ARIMA-ANN model are temporally consistent with the ground truth labels. The hybrid model tracks the onset and end of high-risk periods, predicting crime spikes during the post-pandemic recovery and times of financial strain. Comparison with the other classifiers shows that the ARIMA-ANN model tends to make earlier predictions with fewer misses within high-risk windows, a characteristic with direct operational implications for proactive policing and resource deployment [
4].
The variable importance analysis (Figure 8) shows that unemployment rate, inflation-adjusted net salary, and international trade balance are the most important drivers of risk classification, both in the tree-based models and in the feature weights of the ANN component.
The findings are aligned with theory and evidence, which suggest that exposure to socioeconomic hardship and market volatility is predictive of crime risk profile [
53]. Furthermore, the addition of tourist arrivals and weather factors increased the precision of the model in summer and holiday periods, when the vulnerability of tourist areas to crime increases seasonally.
To clarify how the ARIMA and ANN components work together, the structure and nature of the nonlinear residuals produced by the ARIMA model were analyzed. After ARIMA fitting, the residuals (the portion of the time series not explained by linear autoregressive and seasonal effects) were tested for nonlinearity using methods such as the BDS (Brock–Dechert–Scheinkman) test and visualized via residual-vs-fitted plots. These analyses, presented in
Figure 9 and
Figure 10, consistently revealed that the ARIMA residuals displayed complex patterns, particularly around sudden macroeconomic shocks (e.g., the COVID-19 lockdown period), as well as during periods of marked changes in unemployment, net salary, or population migration. For example, in 2020–2021, the ARIMA residuals showed clear bursts of volatility and sign changes, confirming the presence of underlying nonlinear dynamics.
This validated the need for a nonlinear learner (ANN) in the second stage. Furthermore, autocorrelation of the residuals at nontrivial lags supported the inclusion of multiple residual lags as ANN inputs, allowing the neural network to capture both short memory effects and nonlinear dependencies missed by ARIMA alone.
To address the inherent ‘black box’ nature of the hybrid ARIMA-ANN model, particularly in contexts requiring transparent decision-making such as police resource allocation, advanced model interpretability techniques were incorporated. Sensitivity analysis was performed by systematically varying key socioeconomic and environmental input variables, such as ‘International trade balance Exports - Imports’, ‘actual employees number’, ‘average net salary updated with inflation’, and ‘unemployed people number’. The impact of these perturbations on the predicted ‘Crime risk’ levels was observed, showing the direct influence of individual factors on forecast outcomes. Furthermore, Explainable AI (XAI) methodologies, specifically SHAP (SHapley Additive exPlanations), were employed.
Figure 11 presents the SHAP summary plot, showing the relative importance and direction of influence of each input variable. The analysis reveals that unemployment rate, number of tourists, net income, and trade balance are consistently among the most influential features. SHAP values also confirm that the model’s predictions align with established criminological assumptions—for example, higher unemployment contributes positively to the prediction of higher crime risk categories. This post-hoc explanation technique demonstrates that while the ANN component is nonlinear, its decision process can be transparently understood, making the overall hybrid model suitable for informed, real-world decision-making in public safety.
The plot illustrates the importance and impact of various features on the hybrid ARIMA-ANN model’s ‘very high’ crime risk predictions. Each point on the plot represents a Shapley value for a specific feature and an instance in the dataset.
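The sensitivity analysis described above can be sketched as a one-at-a-time perturbation: shift a single input by a fixed amount and record the average change in the predicted class probabilities. The toy softmax model, its weights, and the feature ordering below are hypothetical stand-ins for the trained hybrid model. This sketch is not a SHAP computation.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def sensitivity(predict_proba, X, feature, delta):
    """Mean change in class probabilities when one feature is shifted by delta."""
    X_shift = X.copy()
    X_shift[:, feature] += delta
    return (predict_proba(X_shift) - predict_proba(X)).mean(axis=0)


# Toy stand-in model with three risk classes (low, moderate, high).
# Rows of W: unemployment, net salary, tourists (standardized inputs).
W = np.array(
    [
        [-1.0, 0.0, 1.0],
        [0.5, 0.0, -0.5],
        [0.0, 0.0, 0.0],
    ]
)


def toy_model(X):
    return softmax(X @ W)


rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3))

# Effect of raising unemployment by one standard deviation.
effect = sensitivity(toy_model, X, feature=0, delta=1.0)
print(np.round(effect, 3))
```

Because the probabilities of all classes sum to one, the entries of `effect` sum to zero: any probability mass gained by the high-risk class is taken from the others.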
The interpretability of the ANN’s hidden layer was further examined, with particular attention given to its capacity to model higher-order interactions between features, which are often critical in complex social contexts. SHAP interaction values were applied to the trained ANN to highlight and quantify the key feature combinations influencing predictions. Notably, a joint increase in unemployment rate and average temperature—representing combined economic and seasonal stressors during summer months—was found to have a pronounced nonlinear effect on predicted crime risk. Other significant interactions included trade balance and net salary, as well as tourists and temperature, supporting empirical observations that tourism-driven fluctuations are intensified in hotter periods. SHAP dependence and interaction plots revealed the logic embedded within the ANN’s hidden layer and enhanced model transparency for policy audiences. This interpretability step, now standard in machine learning practice, ensures that the hybrid model delivers both strong predictive performance and actionable, comprehensible mechanisms for public safety decision-making.
4. Discussion
Comparing the hybrid ARIMA-ANN model with traditional multiclass classifiers using eleven years of monthly crime risk data from Galați County revealed several valuable insights. These findings are important not only for researchers but also for local officials seeking practical tools.
First, when examining the time series for unemployment, net salary, tourist arrivals, and trade, a noticeable pattern emerges. Crime risk does not spike in isolation; it correlates with these economic and social indicators. Whenever job losses increase, wages decrease, or fewer tourists visit the county, crime risk rises. These relationships are visible in the results and align well with criminological theories that suggest that economic pressure makes communities more vulnerable. It is not just about people losing jobs; a drop in tourist numbers or external funding can create a tense local atmosphere, leading to more crime. Additionally, seasonal factors are apparent. During the warmer months and peak tourist seasons, crime risk also increases. This emphasizes that forecasting models should consider both regular cycles and external shocks, such as weather changes or shifts in tourism.
The way the target variable was structured—dividing risk into five clear categories—proved to be a wise approach.
Figure 3 shows that all five risk levels have a significant share of the data. This prevents the model from favoring one class and overlooking rare but crucial risk spikes. During significant events, such as the COVID-19 lockdown, immediate changes in these risk classes are noticeable. The distribution shifts quickly, reflecting real shocks, and the model remains responsive to these transitions. This ability to detect shifts in underlying patterns is a key advantage for real-world monitoring and prompt action.
Looking at
Figure 4, the autocorrelation and partial autocorrelation plots reveal that crime risk in one month is influenced by both preceding months and annual patterns. There is a mixture of short-term memory and longer cycles. The ARIMA part of the model effectively captures these recurring trends. For sudden changes or unusual jumps that do not fit a simple pattern, the ANN fills in the gaps. This division of tasks is significant; it enables the model to detect both obvious and subtle shifts in risk that many other models might miss.
Table 2 shows that the hybrid model outperforms the other classifiers on all metrics: accuracy, F1-score, and AUC-ROC. The advantage is substantial. While the random forest and SVM models have their strengths, the hybrid model shows greater reliability in critical situations. It excels at identifying high and very high crime risk periods. Weaker models, such as decision tree and logistic regression, fail to detect these risky windows effectively, which can lead to dangerous gaps in real situations.
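For readers who want to see how two of these metrics are obtained in a multiclass setting, the sketch below computes accuracy and a macro-averaged F1-score from a confusion matrix (rows are true classes, columns are predicted classes). Macro averaging is an assumption here, since this section does not state which averaging was used, and the matrix is a small made-up example rather than a result from Table 2.

```python
import numpy as np

def accuracy(cm):
    """Share of correctly classified samples (diagonal over total)."""
    cm = np.asarray(cm)
    return np.trace(cm) / cm.sum()

def macro_f1(cm):
    """Unweighted mean of per-class F1 scores."""
    cm = np.asarray(cm)
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    actual = cm.sum(axis=1)      # row sums: how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return f1.mean()

# Made-up two-class example: one sample of class 0 is predicted as class 1.
example_cm = np.array([[1, 1],
                       [0, 2]])
print(f"accuracy: {accuracy(example_cm):.3f}")
print(f"macro F1: {macro_f1(example_cm):.3f}")
```

Macro averaging gives every risk class the same weight, so weak performance on a rare class such as "very high" lowers the score as much as weak performance on a common one.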
The confusion matrix in Figure 5 shows that most errors made by the hybrid model occur between adjacent classes, such as “moderate” and “high.” This is important because mistaking a “moderate” crime risk month for a “high” one is less problematic than confusing a “very low” month with a “very high” one. This characteristic makes the hybrid approach much safer for everyday use by police or local planners, since it reduces the chance that resources are badly misallocated. The ROC curves in Figure 6 support this finding. The hybrid model demonstrates a strong capacity to differentiate between varying levels of crime risk, particularly during periods characterized by elevated criminal activity. This capability is essential for issuing early warnings and preventing issues before they escalate. Traditional models struggle here, particularly when trying to identify rare, high crime risk months.
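The point about adjacent-class errors can be quantified with a short sketch. It measures, for an ordinal five-class confusion matrix, what share of all misclassifications falls at each distance between the true and the predicted class. The matrix below is illustrative only and does not reproduce Figure 5; the function name `error_distance_profile` is hypothetical.

```python
import numpy as np

def error_distance_profile(cm):
    """Share of misclassifications at each ordinal distance |true - predicted|."""
    cm = np.asarray(cm)
    k = cm.shape[0]
    idx = np.arange(k)
    dist = np.abs(idx[:, None] - idx[None, :])   # distance of each cell from the diagonal
    errors = cm.sum() - np.trace(cm)
    if errors == 0:
        return {d: 0.0 for d in range(1, k)}
    return {d: cm[dist == d].sum() / errors for d in range(1, k)}

# Illustrative matrix: rows are true classes, columns are predicted classes,
# ordered from "very low" to "very high". Not taken from the study.
cm = np.array([
    [22,  4,  0,  0,  0],
    [ 3, 20,  3,  0,  0],
    [ 0,  4, 19,  3,  1],
    [ 0,  0,  3, 21,  3],
    [ 0,  0,  0,  4, 22],
])

profile = error_distance_profile(cm)
for distance, share in profile.items():
    print(f"errors {distance} class(es) apart: {share:.3f}")
```

A profile concentrated at distance 1 corresponds to the pattern described above, in which the model rarely confuses classes that are far apart on the risk scale.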
The predictions from the hybrid model shown in Figure 7 closely track actual crime risk levels over recent years, even during turbulent periods like the pandemic or economic downturns. It is evident that the ARIMA-ANN model adjusts well to changes, not just repeating past trends but also recognizing new ones as they arise. This quality makes it valuable for authorities aiming to stay proactive rather than reactive.
When examining which variables matter most (see Figure 8), the usual factors dominate: natural population change, unemployment, net salary, and trade balance. Their impact is both practical and understandable: financial stress and reduced opportunities tend to increase crime risk. Tourist arrivals and weather also significantly influence crime, especially during busy summer months or holidays.
The application of SHAP analysis provided a robust framework for identifying dominant factors influencing crime risk predictions. When analyzing the ‘very high’ crime risk class, as illustrated in Figure 11, features such as ‘unemployed people’, ‘natural population change’, and ‘trade balance’ emerged as consistently impactful drivers of crime risk, underscoring their strong influence on the model’s output. This global perspective is important for strategic policy-making, enabling authorities to focus on long-term systemic factors. Locally, SHAP explanations for specific periods demonstrated how dynamic shifts in these and other variables, often in combination, contributed to particular crime risk predictions. For example, during the COVID-19 lockdown, the model’s elevated crime risk predictions could be attributed largely to the increased contribution of ‘unemployed people’. This granular level of interpretability allows police forces to understand the specific drivers behind forecasted risks and allocate resources more judiciously. By providing this explainability, the ARIMA-ANN becomes a transparent analytical instrument, fostering greater trust and actionable insights for crime prevention and resource optimization.
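The attribution logic behind SHAP can be illustrated with an exact Shapley value computation on a toy example. The sketch below enumerates all feature coalitions directly rather than using the SHAP library, replaces absent features with baseline values, and uses a hypothetical scoring function with made-up coefficients in place of the fitted ARIMA-ANN. The feature names follow those reported above; the input values are invented.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_f = {f: x[f] if (f in subset or f == name) else baseline[f]
                          for f in names}
                without_f = {f: x[f] if f in subset else baseline[f]
                             for f in names}
                total += weight * (model(with_f) - model(without_f))
        phi[name] = total
    return phi

def toy_risk_score(v):
    # Hypothetical scoring function with one interaction term (assumption).
    return (0.5 * v["unemployed_people"]
            - 0.3 * v["natural_population_change"]
            - 0.2 * v["trade_balance"]
            + 0.1 * v["unemployed_people"] * v["trade_balance"])

# Invented standardized inputs for one month, compared with an all-zero baseline.
x = {"unemployed_people": 2.0,
     "natural_population_change": -1.0,
     "trade_balance": -0.5}
baseline = {name: 0.0 for name in x}

phi = shapley_values(toy_risk_score, x, baseline)
for name, value in phi.items():
    print(f"{name:>27}: {value:+.3f}")
```

The contributions sum to the difference between the score for the month and the score for the baseline, which is the property that lets a single prediction be decomposed into per-feature drivers.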
In summary, the hybrid ARIMA-ANN model stands out for several reasons. It effectively combines regular, predictable trends with sudden shifts that occur in real life. Clearly defined risk classes prevent it from overlooking rare but important events. The model adapts well to rapid changes, such as policy shifts or economic shocks. It remains grounded in data, relying on real economic and social variables to make predictions.
However, while the model shows promise for Galați County, it should be tested in other regions or countries with different patterns before it can be deemed a universal solution. Future research should also explore new data sources—perhaps real-time data feeds or better connections with police systems—to enhance its predictions further.
Overall, the hybrid ARIMA-ANN approach offers more than just improved accuracy. It provides a well-rounded, adaptable framework that helps public safety teams identify potential issues early and allocate their resources wisely. With further research and testing, it could set a strong benchmark for crime risk prediction, not only in Romania but anywhere detailed and timely risk classification is essential.
5. Conclusions
This research compared a hybrid ARIMA-ANN model with several well-known multiclass classifiers to determine which best predicts monthly crime risk in Galați County, Romania. Our analysis spans eleven years of data and introduces a new method for categorizing crime risk into five distinct groups. This approach helps ensure fairness and reflects real differences in the local population.
What stood out is that the ARIMA-ANN hybrid consistently outperformed every other model that was tested. This applied to all main evaluation metrics: accuracy, recall, and prediction stability. The hybrid model was especially good at identifying high-risk months that often follow economic shocks or significant social events. We believe this performance stems from the division of labor between the two components: the ARIMA component models regular cycles and trends in the crime data, while the ANN component detects irregular changes and more complex relationships between various factors.
From a policy viewpoint, these results are important. With the help of the hybrid model, authorities could go beyond just reacting to crime after it occurs. They could plan ahead, spot periods of increased risk, and allocate police or social workers more effectively. Since the model divides crime risk into clear categories, it provides decision-makers with an accurate view of where resources are most necessary. The transparent and repeatable steps in building and validating the model further support public trust, as any decisions made with this model can be explained and justified if questioned. In Galați County, our model delivered stronger and more reliable predictions than standard machine learning models. However, before making broad conclusions, the model should be tested in other counties or countries, especially where local conditions or available data may vary. This represents an important next step for future studies.
This study contributes to the field of predictive crime analytics by integrating a hybrid ARIMA and ANN model that captures both linear and complex nonlinear patterns within the data. Compared to traditional models used in previous research, such as multinomial logistic regression, decision trees, or support vector machines, this approach shows a greater ability to anticipate seasonal trends and unexpected changes in crime risk. This is particularly valuable for monthly aggregated data at the county level, where socioeconomic factors influence criminal behavior.
A key difference from earlier studies is the use of a multidimensional dataset that includes socioeconomic, demographic, and environmental indicators. This allows for more comprehensive modeling of crime risk. Unlike many existing studies that only consider raw crime data or historical frequencies, this research takes a systemic view, providing deeper insight into the relationships and interdependencies among the factors. Additionally, normalizing crime rates per capita and using a balanced five-level risk classification improve the strength and practical relevance of the findings.
Another innovative feature is the thorough comparison between the ARIMA-ANN model and standard classification techniques, using a wide range of evaluation metrics, including accuracy, precision, F1-score, and AUC-ROC. The results clearly show that the hybrid model excels in identifying complex temporal patterns and lowering misclassification rates in high-risk categories. This is important for developing proactive public safety strategies. This research pushes forward the field of crime analysis by providing not just a high-performing model, but also a repeatable, scalable, and flexible methodological framework. By effectively combining machine learning techniques with time series analysis, this study sets a new standard for applied research in social risk forecasting and serves as a valuable tool for guiding strategic decision-making in public safety and social policy areas.
The study achieved its primary objective of developing a reliable crime prediction framework that balances high predictive accuracy with interpretability. By integrating the strengths of the ARIMA-ANN hybrid approach and employing a broad set of evaluation metrics, the proposed methodology demonstrated superior performance over conventional classifiers, particularly in capturing the complex temporal dynamics and nonlinear relationships inherent in crime data. Furthermore, model transparency, achieved through systematic feature selection, residual analysis, and SHAP analysis, ensured that the results are not only robust but also explainable to public safety decision-makers.
Overall, our findings emphasize the benefits of combining various data types—like social, economic, weather, and tourism indicators—to enhance crime risk prediction. Flexible models that capture both regular trends and sudden changes hold a clear advantage, which is what the ARIMA-ANN approach offers. We believe this research lays a solid foundation for smarter, data-informed public safety planning.
As data systems improve and hybrid modeling techniques continue to evolve, we see real potential to improve crime prevention efforts and move beyond simple reactions. While this approach is still emerging, our results are promising and show a path forward for research and practice in this area.