2.1. Theoretical Background
Neighborhood effects is an important concept in geographic, public health, and social science research and is concerned with how neighborhood conditions affect social outcomes. The notion can be traced back to University of Chicago sociologists Shaw and McKay [12], who proposed the field’s oldest theoretical perspective, social disorganization, positing that neighborhood structures such as socioeconomic disadvantage, racial heterogeneity, and residential mobility prevent residents from forming social ties to regulate crime. Shaw and McKay’s work heralded a major paradigm shift away from individual-level theories of crime toward ecological models [13].
While social disorganization theory fell out of favor in the 1960s, the approach was revitalized in the 1980s by scholars in the U.S. with a renewed interest in neighborhood dynamics due to rising crime rates and urban decline. These authors updated the framework by addressing criticisms [14], testing and clarifying concepts [15,16], and expanding causal mechanisms [17,18,19].
One important extension of social disorganization theory was the concept of collective efficacy [18], which refers to residents’ ability to come together to achieve a shared desire for a safe neighborhood [20]. Collective efficacy combines social cohesion, defined as trust and sense of community between neighbors, with informal social control, which refers to residents’ ability to regulate community disorder. Subsequent research has repeatedly demonstrated that collective efficacy exerts a strong effect on community crime and violence [21,22,23].
Routine activities (RA) theory is another prominent neighborhood effects perspective and suggests that the way daily activities are organized creates opportunities for crime. The theory specifically posits that crime is more likely to occur when three factors meet in time and space: a motivated offender, an available target, and the absence of a capable guardian (e.g., an authority figure) [24]. Research in this area is concerned with temporal and spatial effects on crime and focuses on micro-geographies, including “hot spots,” such as street segments where crime occurs [25].
Pratt and Cullen [13] assessed RA theory and social disorganization theory along with other criminological frameworks in their meta-analysis of macro-level predictors and theories of crime. They found that social disorganization and resource deprivation theory, which links economic inequality with an inability to regulate behavior in accordance with social norms, had the strongest effects on crime. RA theory had a moderate effect on crime. Spano and Freilich [26] evaluated the empirical validity of RA theory in response to mixed support in existing multivariate studies. Based on a review of 33 articles, they found overall support for the theory, although nuanced analysis uncovered some limitations. For example, studies using U.S. samples were almost four times more likely to be consistent with hypothesized effects than studies using non-U.S. samples.
Based on the findings above, and the fact that we were largely dependent on the U.S. Census dataset for input, we elected to concentrate on socio-demographic and socio-economic predictors associated with social disorganization theory in our framework. However, we introduced a few predictors consistent with RA theory into our model, such as climate, given the theory’s effectiveness in the U.S. context. In addition, some social structural variables used in social disorganization research are applicable to RA theory (e.g., population characteristics influence who commits a crime and who is victimized) and previous researchers have used Census data measures to represent RA theory [27].
Predictors of crime associated with social disorganization theory can be divided into two broad categories: “static” neighborhood conditions that reflect a neighborhood’s social structural conditions [28,29] and “dynamic” neighborhood processes, such as collective efficacy or social cohesion [18,29,30,31]. Single static variables with significant effects on crime include income inequality [32,33,34,35], race/ethnic segregation [36,37,38], racial heterogeneity [39,40,41,42], residential instability [43], gender [44,45,46,47], and age [48,49,50], all taken into account in our model.
Table 1 lists major social structural predictors of crime assessed in prior reviews [29,51] and a meta-analysis [13], and indicates their effects (positive, negative, or unclear) on crime.
Multicollinearity among social structural variables is a potential challenge in regression models concerned with causal analysis of crime. This is because of strong links between many of the structural factors associated with crime [52], creating what Wilson [19] referred to as “concentration effects”. One common response is to combine correlated indicators into a single index variable. Concentrated disadvantage or “resource deprivation” [53] is one such index variable that incorporates indicators for income inequality, poverty, racial diversity, educational attainment, residential mobility, unemployment, and/or family disruption [52,54,55]. Another index variable is family disruption, which combines measures of family stability such as non-marriage, early marriage, early childbearing, parental absenteeism, widowhood, and death [56,57,58]. While we are aware of multicollinearity issues in crime research, we did not use index variables in our model, since collinearity is primarily a problem for causal inference rather than for prediction, which is the purpose of our framework.
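To illustrate how collinearity between structural indicators can be quantified, the following minimal sketch computes a variance inflation factor (VIF), a standard diagnostic that is not part of the framework described here. It assumes a model with exactly two predictors, in which case the VIF reduces to 1 / (1 - r^2), where r is their Pearson correlation. The function names, variable names, and values are hypothetical and were invented purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical neighborhood-level values (assumption: invented, not real data).
poverty_rate = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41]
unemployment_rate = [0.04, 0.06, 0.07, 0.11, 0.12, 0.15]
median_age = [34.0, 41.0, 29.0, 38.0, 33.0, 40.0]

vif_linked = vif_two_predictors(poverty_rate, unemployment_rate)
vif_unlinked = vif_two_predictors(poverty_rate, median_age)
print(f"poverty vs unemployment: VIF = {vif_linked:.1f}")
print(f"poverty vs median age:   VIF = {vif_unlinked:.2f}")
```

With these toy values, the poverty and unemployment series produce a VIF well above the common rule-of-thumb threshold of 10, whereas poverty and median age produce a value close to 1.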
Brisson and Roll [29] assessed four dynamic or process variables in their review that tend to interact with static predictors to affect crime. Assessing social cohesion, Brisson and Roll found limited evidence of a relationship between social cohesion and crime in studies on hate crimes [59] and general violence or intimate partner violence [60]. Results were mixed for informal social control, with one study showing a relationship between informal social control and a decline in delinquency rates [61] and another finding effects on anti-Black hate crime [59]. A third study, however, was unable to demonstrate a link between informal social control and general violence and intimate partner violence [60]. Research on social ties, which is a concept closely affiliated with social cohesion that looks at the number of relationships in a community, has demonstrated that effects on crime depend on the type and intensity of relationships and their influence on informal social control [42,62]. Finally, support for the effect of collective efficacy on crime is robust and the concept is applicable across urban locations. Collective efficacy has been associated with a decline in violent victimization [63], a decline in homicide [63], reduced fear of crime [64], and increased street efficacy [55].
There is a nascent rural crime literature, largely dominated by studies oriented around social disorganization theory [65]. Findings have been inconsistent, with evidence for some aspects of social disorganization but little or no support for others [66]. Consequently, it is difficult to make broad statements about crime patterns, but preliminary research indicates that variables such as poverty and family disruption affect crime differently in rural communities than in urban areas. For example, research suggests that poverty has no relationship or an inverse relationship with crime [65,67,68,69,70,71], possibly because community stability produces stronger informal social control [72]. Similarly, racial heterogeneity appears to have limited effects on social disorganization in rural settings, given the mixed results of studies. Bouffard and Muftic [67], for instance, found no association between ethnic heterogeneity and violent crime, while other scholars have found a positive relationship between heterogeneity and crimes including robbery and assault in rural counties [69] and youth violent crime [73].
Table 2 provides an overview of social structural predictors of crime in rural communities.
Due to remaining uncertainty about the mechanisms of crime in rural communities, we did not create a separate model for predicting rural crime but applied the same model to rural and urban contexts. Similarly, sparse research into suburban crime [67,70,75] meant that we were not able to develop a distinct model to predict crime in suburban settings.
In sum, based on our thorough review of the neighborhood effects literature, we selected predictors of urban crime associated with the neighborhood effects perspective, mainly social disorganization theory and, to a lesser degree, RA theory, to inform our framework. Most of these were social structural predictors that have demonstrated significant relationships with crime in prior research (these are summarized in Table 3). We subsequently drew on datasets, including the U.S. Census, to select social, economic, and demographic indicators to represent these predictors.
2.2. Related Work: ML and Crime Prediction
In this section, we review recent work on spatial crime prediction using different ML techniques, with an emphasis on methods that estimate crime rates or occurrences.
H.W. Kang and H.B. Kang [76] proposed a deep learning method based on a deep neural network (DNN) for predicting crime occurrences at the U.S. census-tract level. The authors drew on various sources of data, including crime occurrence reports and demographic and climate information. Additionally, they captured environmental context using image data from Google Street View. Their prediction model adopts a multimodal data fusion method in which the DNN is organized into four layer groups: spatial, temporal, environmental context, and joint feature representation layers. This predictive model achieves notable results in terms of accuracy. However, it was trained and tested using only real-world datasets collected from the city of Chicago, Illinois, due to data availability constraints. Thus, it cannot be used uniformly for all U.S. cities.
Also within the deep learning family of methods, Huang et al. [77] proposed a Recurrent Neural Network (RNN) for predicting spatio-temporal crime occurrences in urban areas. Their method detects dynamic crime patterns from hidden representation vectors using a hierarchical recurrent neural network. These vectors embed spatial, temporal, and categorical signals while preserving the correlations between crime occurrences and their time slots. The method was trained and evaluated using real-world datasets collected from New York City, in which crimes are recorded with their respective category, location, and timestamp. However, such a method cannot be used uniformly for all urban areas, since these kinds of data are not commonly available for other cities.
A probabilistic model based on the Bayesian paradigm was suggested by [78]. The model was conceived to predict spatial crime rates using demographic and historical crime data. It quantifies the uncertainties in the output predictions and the model parameters using a combination of two Bayesian linear regression models: a parametric model that takes into account the relationship between crime rate and location-specific factors, and a non-parametric model that addresses the spatial dependencies. It also handles inference on the regression parameters by estimating the posterior probability distribution using the Markov Chain Monte Carlo (MCMC) method. Results for three types of crime are consistent with existing theoretical criminological assumptions. In addition, the proposed model can be generalized to all of Australia, since it uses demographic census data available in nearly all locations.
Besides these efforts, we found that ensemble-learning methods have been the subject of several studies in the literature and have proven effective for spatial crime prediction. This family of ML models draws its strength from the fact that it employs multiple learning algorithms. Each algorithm works on a subset or on the whole of the dataset to produce intermediate predictions, which are then aggregated to obtain the final predictions. Examples of studies relying on ensemble-learning methods include [6,7,79].
Alves et al. [6] used a random forest regressor to predict crime in urban areas. Because this ML model is highly sensitive to its main parameters (the number of trees and the maximum depth of each tree), the authors tuned them by combining a grid search with stratified k-fold cross-validation, thereby balancing bias and variance errors. The authors also studied the relationship between crime incidents and urban indicators using various statistical tests and metrics in order to select the most important explanatory indicators. Their proposed model was trained and tested using urban indicator data from all Brazilian cities. Experiments showed that it can reach an accuracy of up to 97% on crime prediction. However, predictions concern only a single type of crime (homicides) at an aggregated city level.
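The tuning procedure described above can be sketched as follows. This is not the code of [6]: to stay dependency-free, a small k-nearest-neighbor regressor stands in for the random forest, plain (unstratified) k-fold splitting replaces the stratified variant, and the data are synthetic. All function names and the hyperparameter grid are assumptions made for illustration.

```python
import itertools
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[start::n_folds] for start in range(n_folds)]


def knn_predict(train, query, n_neighbors, weighted):
    """Predict as the (optionally distance-weighted) mean of the nearest targets."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:n_neighbors]
    if not weighted:
        return sum(y for _, y in nearest) / len(nearest)
    weights = [1.0 / (abs(x - query) + 1e-9) for x, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def cross_validated_mse(data, folds, n_neighbors, weighted):
    """Average the held-out mean squared error over all folds."""
    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [pair for i, pair in enumerate(data) if i not in held_out_set]
        squared = [
            (knn_predict(train, data[i][0], n_neighbors, weighted) - data[i][1]) ** 2
            for i in held_out
        ]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / len(fold_errors)


def grid_search(data, grid, n_folds=5, seed=0):
    """Score every hyperparameter combination on the same folds; return the best."""
    folds = k_fold_indices(len(data), n_folds, seed)
    scores = {}
    for n_neighbors, weighted in itertools.product(grid["n_neighbors"], grid["weighted"]):
        scores[(n_neighbors, weighted)] = cross_validated_mse(
            data, folds, n_neighbors, weighted
        )
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic (x, y) pairs: a noisy quadratic, invented for illustration only.
rng = random.Random(42)
data = [(x / 10.0, (x / 10.0) ** 2 + rng.gauss(0.0, 0.5)) for x in range(60)]
grid = {"n_neighbors": [1, 3, 5, 9], "weighted": [False, True]}
best_params, cv_scores = grid_search(data, grid)
print("best (n_neighbors, weighted):", best_params)
```

Scoring every combination on the same folds keeps the comparison fair, and selecting on held-out error rather than training error is what guards against choosing an overly flexible (high-variance) configuration.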
More recently, Kadar et al. [7] proposed an approach for predicting spatio-temporal crime hotspots in low population density areas. The authors focused mainly on the problem of class imbalance, which they handled through a repeated under-sampling technique. In the learning phase, their predictive model is trained using balanced sub-samples of the input dataset, which are created by randomly selecting the same number of instances from the majority and minority classes. They then adopted the random forest classifier as a base learner for predicting crime hotspots after an in-depth evaluation of other ML models. Results with an input dataset composed of different predictors, such as socio-economic, geographical, temporal, meteorological, and crime variables, showed that this approach outperforms the common baselines in predicting hotspots. However, it is designed to predict only a single type of crime: burglary.
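The balanced sub-sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [7]: the function name, the binary label encoding, and the choice to draw as many instances per class as the minority class contains are all hypothetical.

```python
import random


def balanced_subsamples(labels, n_rounds, seed=0):
    """Return n_rounds index lists, each with equal counts of both classes."""
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(labels) if label == 1]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    # Assumption: each class contributes as many instances as the smaller class has.
    per_class = min(len(positives), len(negatives))
    subsamples = []
    for _ in range(n_rounds):
        # Draw the same number of instances from each class, without replacement.
        chosen = rng.sample(positives, per_class) + rng.sample(negatives, per_class)
        rng.shuffle(chosen)
        subsamples.append(chosen)
    return subsamples


# Hypothetical labels: 1 marks a hotspot cell (rare), 0 a non-hotspot cell.
labels = [1] * 5 + [0] * 95
subsamples = balanced_subsamples(labels, n_rounds=10)
for round_id, indices in enumerate(subsamples[:3]):
    n_hotspots = sum(labels[i] for i in indices)
    print(round_id, len(indices), n_hotspots)
```

In a full pipeline, one base learner would be trained on each subsample and the resulting predictions combined; that part is omitted here.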
Another ensemble-learning predictive approach was proposed by Ingilevich and Ivanov [79], who conceived a three-step approach for predicting crime occurrences in a specific urban area. Their approach starts with a clustering step, in which the authors applied the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to study the spatial patterns of the considered crime types and to remove noise from the dataset. This is followed by a feature selection step, in which the authors applied the chi-squared test to assess the relative importance of the features. Finally, in the third step, the authors used a gradient boosting model to predict crime occurrences after comparing its performance with that of two other models: linear regression and logistic regression. The model was trained and tested using a crime incident dataset from Saint Petersburg, Russia. It outperformed the two other models in terms of accuracy for three types of street crimes.
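The feature selection step can be illustrated with a minimal sketch of the Pearson chi-squared statistic computed between each categorical feature and a crime outcome. The feature names and values are hypothetical, and ranking features by the raw statistic (rather than by p-value) is a simplification assumed here, not a detail taken from [79].

```python
from collections import Counter


def chi_squared(feature, outcome):
    """Pearson chi-squared statistic for two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, outcome))
    feature_totals = Counter(feature)
    outcome_totals = Counter(outcome)
    statistic = 0.0
    for f_value, f_count in feature_totals.items():
        for o_value, o_count in outcome_totals.items():
            # Expected cell count if feature and outcome were independent.
            expected = f_count * o_count / n
            statistic += (observed[(f_value, o_value)] - expected) ** 2 / expected
    return statistic


# Hypothetical binary data for eight locations (invented for illustration).
crime = [1, 1, 1, 0, 0, 0, 0, 1]
features = {
    "near_bar": [1, 1, 1, 1, 0, 0, 0, 0],
    "weekday": [1, 0, 1, 0, 1, 0, 1, 0],
}
scores = {name: chi_squared(values, crime) for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

In this toy example, near_bar scores 2.0 and weekday scores 0.0, so near_bar ranks first. A higher statistic indicates a stronger departure from independence between the feature and the outcome.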
Building on this previous work and on our own efforts, we propose a predictive framework that has been carefully designed to spatially predict crime occurrences at the U.S. Census Block Group level, based on the gradient boosting model.