On the Risk Assessment of Terrorist Attacks Coupled with Multi-Source Factors

Abstract: Terrorism has wreaked havoc on today's society and people. The discovery of the regularity of terrorist attacks is of great significance to the global counterterrorism strategy. In this study, we improve the traditional location recommendation algorithm by coupling it with multi-source factors and spatial characteristics. We used data on terrorist attacks in Southeast Asia from 1970 to 2016 and comprehensively considered 17 influencing factors, including socioeconomic and natural resource factors. The improved recommendation algorithm is used to build a spatial risk assessment model of terrorist attacks, and its effectiveness is tested with precision, recall, and F-Measure. The results show that, when the threshold is 0.4, the precision is as high as 88% and the F-Measure is the highest. We then assess the spatial risk of terrorist attacks in Southeast Asia through experiments. The southernmost part of the Indochina peninsula and the Philippines are high-risk areas, and the medium-risk and high-risk areas are mainly distributed along the coast. Therefore, future anti-terrorism measures should pay more attention to these areas.


Introduction
Terrorism is one of the most serious threats facing today's society and has caused great harm to people all over the world [1]. Southeast Asia is not only a key node in the "One Belt One Road" development initiative but also an area of frequent terrorist attacks. The spatial risk assessment of terrorist attacks in Southeast Asia is therefore of great significance to the implementation of both the One Belt One Road Initiative and the counterterrorism strategy. According to the statistics of the Global Terrorism Database (GTD), 1078 terrorist attacks occurred in Southeast Asia in 2016 alone, resulting in 533 deaths and causing great panic within society. A large number of scholars at home and abroad have made great efforts to solve various problems related to the threat of terrorism [2][3][4][5][6][7][8][9][10][11][12][13][14]. However, the risk assessment of terrorist attacks remains a complex and uncertain problem. On the one hand, the Internet has brought the global community closer together in all corners, sectors, and fields; as a result, the number of sensitive variables and disturbance variables related to terrorist attacks has increased to an unprecedented degree. On the other hand, because of the advancement of global digitization and the application of various advanced data collection methods, terrorist attack assessment can draw on more types and larger volumes of related data than ever before, which requires researchers to have smarter and more efficient capabilities for processing complex data.

With the achievements of artificial intelligence in many fields, Sivasamy et al., Minu et al., and Gohar et al. have introduced machine learning methods into the evaluation of terrorist attacks and have conducted a series of fruitful works [15][16][17]. Dong believes that machine learning can focus on extracting factor vectors from known information, forming pattern recognition and classification, and then use out-of-sample data to perform pattern verification and prediction [18]. Moreover, machine learning can be used to automatically re-identify factor vectors, reconstruct conflict modes, and adjust predictive output based on different data inputs. In addition, a machine learning-based terrorist attack assessment model can accommodate and integrate unstructured data, and it can find discernible patterns in cluttered and mixed data [18].

Related Works
The previous related studies primarily involved research from three aspects, as shown in Table 1. A terrorist attack prediction project led by Blair et al. used a neural network to successfully predict the conflict in Liberia in 2010 with data from 2008; the accuracy was between 0.65 and 0.74 [19]. Dong used the 2010-2016 forecast of terrorist attacks in India as an example to empirically examine the effectiveness of machine learning based on back propagation (BP) neural networks in real-life terrorist attacks. It was found that machine learning-based terrorist attack prediction paradigms, even without the support of specific social theories, have a certain ability to anticipate terrorist attacks and can discover new knowledge regarding conflicts [18]. However, these studies address only individual countries and make predictions on a national scale.

Sheehan used time-series methods to investigate the relationship between the number of global strategic armed forces-related incidents and the frequency of transnational terrorist attacks, the type of attacks, and the type of victims, with data from transnational terrorism incidents from 1993 to 2004 [20]. Sivasamy et al. proposed a new prediction method that uses the mixed average model (MABM) to fit the civilian casualty data resulting from terrorist attacks in South Asia and predicted civilian casualties in 2014 [15]. Minu et al. applied the wavelet neural network (WNN) to the nonstationary, nonlinear time-series of the monthly number of world terrorist attacks from February 1968 to January 2007; the results revealed that the WNN is the best model for analyzing the time-series of terrorist attacks [16]. These studies were based on the time-series of terrorist attacks.

Faryal et al. proposed a new classification and prediction framework to predict terrorist organizations. This framework consisted of four basic classifiers: naive Bayes (NB), K nearest neighbour (KNN), Iterative Dichotomiser 3 (ID3), and decision stump (DS). Compared with a separate classifier, this method was found to achieve a fairly good accuracy and a lower classification error rate [17]. Raghavan et al. used the hidden Markov model to model a terrorist organization's activity and detect sudden changes in the organization [21]. Adam et al. used a power-law distribution based on observations to calculate the likelihood of a single event [22]. These studies focused on the terrorist attack itself.

Scheffran's study showed that many connections and feedbacks exist among the climate system, natural resources, human security, and social stability [6]. Nevertheless, previous studies on terrorist attacks have seldom considered the multi-source factors that affect terrorist attacks. Most studies have been conducted at national or regional scales, and the research has generally started from the time-series of attacks or from the incident itself while ignoring the spatial distribution of terrorist attacks. To assess the risk of terrorist attacks in places where they have not yet occurred, we combine the clustering algorithm and the location recommendation algorithm at the grid scale and conduct research using terrorist attack data from 1970 to 2016 in Southeast Asia. Based on a comprehensive analysis of the factors of terrorist attacks, we conduct a spatial risk assessment of terrorist attacks.

In our study, the assessment process mainly includes two parts: partitioning areas and risk assessment. Partitioning areas refers to the regional division of the study area in space according to the influencing factors. In machine learning terms, this is an unsupervised learning task, and clustering is a typical unsupervised machine learning method. We selected four classical clustering algorithms: K-means, Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Self-Organizing Maps (SOM). Through experimental comparison, we obtained the most suitable method for partitioning areas and obtained the weighted grid. The risk assessment part is divided into three steps. First, the weighted factor data are used in the location recommendation algorithm to calculate the similarity between grids, and a kernel density function is built based on the severity of terrorist attacks. Then, combining the similarity with the kernel density, a score between 0 and 1 is calculated for each grid where no terrorist attack occurred. Finally, we conducted a validity test of the resulting risk assessment model of terrorist attacks.

Data Processing
Southeast Asia was chosen as the research area to conduct a spatial risk assessment of terrorist attacks. There are 11 countries in the study area, covering an area of 4.57 × 10⁶ km²: Vietnam, Laos, Cambodia, Thailand, Myanmar, Malaysia, Singapore, Indonesia, Brunei, the Philippines, and Timor-Leste. Among these countries, Laos is the only landlocked country in Southeast Asia, and Vietnam, Laos, and Myanmar border the People's Republic of China by land. Southeast Asia is an area of frequent terrorist attacks, as shown in Figures 1 and 2. From the figures, we can see that the southernmost part of Thailand and the Philippines are high-density areas for terrorist attacks. These regions have long displayed an imbalance of political and economic development, and ethnic and religious conflicts are more serious there, which is likely to breed terrorism. Therefore, the spatial risk assessment of terrorist attacks in Southeast Asia is of great significance.

A risk assessment is a quantitative evaluation of the impact or loss potential of an event or thing [23]. The spatial risk assessment of terrorist attacks assesses the location and occurrence of a terrorist attack from a spatial perspective, including, but not limited to, using the locations where terrorist attacks have occurred to assess the risk of locations where no terrorist attacks have occurred. We conducted our research from the perspective of location recommendation methods.

For data, we collected 17 types of influencing factor data covering socioeconomic factors and natural resource factors, which are shown in Table 2. Among these factors, socioeconomic factors include ethnic diversity, major drug areas, population density and nighttime lighting, accommodation outlets, catering outlets, transportation sites, religious sites, and political sites; natural resource factors include average precipitation, average temperature, terrain, the distance to the main navigable lake, the distance to the ice-free ocean, and the distance to the main navigable river. Then, standard grid spatial processing (0.1° × 0.1°) of the 17 factors and the terrorist attack data was performed, which yields 36,978 standard grids and allows all data to be analyzed at the same scale. To unify the measurement scale, this study normalizes the influencing factors. We mainly use GIS software and the Python programming language for data processing, including ArcMap 10.3 (http://pro.arcgis.com/) and Python 3.6 (https://www.python.org/).
(1) Based on GTD, the locations of terrorist attacks in Southeast Asia, as well as the numbers of casualties, can be obtained. The information on the terrorist attacks is converted into raster data on a grid with a 0.1° × 0.1° resolution. The grid serves as a spatial unit for counting the number of terrorist incidents and the total number of casualties.
(2) The raster data of five factors can be obtained from G-Econ 4.0 (a dataset of world economic activity): the distance from the main navigable lake (km), the distance from the main navigable river (km), the distance from the ice-free sea, the average precipitation (mm/a), and the average temperature (°C); subsequently, ArcMap 10.3 is used to sample the above raster data in a 0.

In addition, because the 17 factors have different units, we normalize the 17 influencing factors to unify the measurement scale and avoid differences between units. The normalization formula is given below:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

where X_norm is the normalized value, X is the original value of the factor, X_min is the minimum value of the factor, X_max is the maximum value of the factor, and n is the number of factors, each of which is normalized separately.
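The min-max normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name `min_max_normalize`, the sample values, and the choice to map a constant factor to 0.0 are assumptions made here, not details taken from the study.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant factor carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values of two factors on four grid cells (not real data).
precipitation_mm = [1200.0, 1800.0, 2400.0, 3000.0]
temperature_c = [22.0, 25.0, 27.0, 28.0]

# Each factor is normalized on its own, so different units become comparable.
norm_precip = min_max_normalize(precipitation_mm)
norm_temp = min_max_normalize(temperature_c)
```

After this step every factor lies in [0, 1], so a factor measured in kilometres no longer dominates one measured in degrees Celsius when distances between grids are computed.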

Algorithm
The spatial risk assessment of terrorist attacks assesses the location and the occurrence risk of terrorist attacks from the perspective of spatial analysis, including, but not limited to, using the locations where terrorist attacks have occurred to assess the risk of locations where no terrorist attacks have occurred. The location recommendation algorithm can spatially extract the relationship between a terrorist attack and its location, thereby supporting a scientific risk assessment. The traditional location recommendation algorithm mainly focuses on single-source factors, such as check-ins; it has rarely been used with multi-source factors. In existing research, multiple regions are usually considered as a whole, although factors such as economy and population often have different influences in different regions. Previous studies identified some spatial factors but did not integrate them into the location recommendation process for in-depth research. Building on this research, we integrate multi-source factors, spatial factors, and regional divisions into the location recommendation algorithm.
First, the Southeast Asian region is divided into sub-regions by a clustering method, and the weight of each factor is obtained by correlation analysis. Next, the location recommendation algorithm is improved by exploiting the fact that activities are spatially clustered, and the spatial risk assessment of terrorist attacks is completed using the partitioned location factors and the terrorist attack attribute data. The evaluation process of this paper mainly includes two parts: partitioning areas and risk assessment. The flow chart is shown in Figure 3.

Partitioning Areas
Because the influence of various factors on terrorist attacks is not the same in different regions, we first divide the study area spatially according to the factor data and then use correlation analysis to determine the degree of impact of each influencing factor on the terrorist attacks in each sub-area. These degrees of impact are used as the factor weights. In machine learning terms, the spatial division is an unsupervised learning task, and clustering is a typical unsupervised machine learning method. Clustering operates on large unlabeled datasets: according to the inherent similarity of the data, the dataset is divided into multiple clusters. The entities within a cluster are similar, and the entities of different clusters are not similar. A cluster is a convergence of points in the test space, such that the distance between any two points of the same cluster is less than the distance between any two points of different clusters [24]. We select four classical clustering algorithms and, through experimental comparison, obtain the method that is most suitable for the spatial division in this study.

a. K-means algorithm
K-means is a partition-based clustering method. The K-means algorithm calculates the similarity based on the average value of the data objects in a cluster and takes the average (or the centroid) of the objects in the cluster as the center of the cluster. The algorithm first randomly selects k objects among n data objects, each of which initially represents the average of one cluster. Each remaining object is assigned to the nearest cluster according to its distance from the center of each cluster. On this basis, the average of each cluster is recalculated. This process is repeated until the sum of squared errors is minimized, at which point the members of the clusters no longer change. The formula is as follows:

$$E = \sum_{j=1}^{k} \sum_{i_1 \in C_j} \left| i_1 - w_j \right|^2$$

where i_1 is the given data object and w_j is the average value of cluster C_j [25].
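The K-means procedure described above (random initial centers, nearest-center assignment, mean update, stop when memberships no longer change) can be sketched as follows. This is a minimal standard-library illustration, not the implementation used in the study; the function names, the fixed random seed, and the six toy points are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd-style K-means; returns (labels, centers, sum of squared errors)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k randomly chosen objects as initial centers
    labels = None
    for _ in range(max_iter):
        # Assign every object to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # memberships no longer change
        labels = new_labels
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


# Hypothetical two-factor grid cells forming two well-separated groups.
cells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers, sse = kmeans(cells, k=2)
```

The returned `sse` is the quantity E in the formula above, evaluated at the final centers.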

b. BIRCH algorithm
BIRCH is a comprehensive hierarchical clustering method that is commonly used for large-scale data sets. This algorithm introduces two concepts: the clustering feature (CF) and the clustering feature tree (CF-tree). These two concepts are used to summarize the clusters, and the distances between clusters are used. Balanced iterative reduction within the hierarchical method is used to reduce and cluster the data set. The BIRCH method saves memory and computes quickly, requiring only a single scan of the data set to build a tree and identify noise points. However, BIRCH does not cluster well for non-spherical clusters and high-dimensional data. In addition, the order of data input affects the results of the algorithm [26].

c. DBSCAN algorithm
DBSCAN is a representative density-based clustering algorithm. DBSCAN defines a cluster as the largest set of density-connected points and can divide regions of sufficiently high density into clusters. The algorithm requires the user to input two parameters: one is the radius (Eps), which represents the extent of a circular neighborhood centered at a given point P; the other is the minimum number of points within the neighborhood centered on the point P (MinPts). These two parameters are difficult to set because they require the user to have a general understanding of the dataset and to set them empirically [27].
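To make the roles of Eps and MinPts concrete, the following is a compact sketch of DBSCAN using only the Python standard library. It is illustrative only: the helper names, the label convention (-1 for noise), and the toy points are assumptions, and the brute-force neighborhood search would be too slow for tens of thousands of grid cells.

```python
def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    return [
        j for j, q in enumerate(points)
        if sum((a - b) ** 2 for a, b in zip(points[idx], q)) <= eps ** 2
    ]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    noise, unseen = -1, None
    labels = [unseen] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical points: two dense squares and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
db_labels = dbscan(pts, eps=1.5, min_pts=3)
```

With these toy values the isolated point has too few neighbors within Eps and is labeled as noise, which shows why both parameters have to be chosen with some knowledge of the data density.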

d. SOM algorithm
The SOM algorithm is an unsupervised learning algorithm for clustering and high-dimensional visualization. It is an artificial neural network developed by simulating how the human brain processes signals. After the model was proposed by Professor Kohonen of the University of Helsinki in Finland in 1981, it became the most widely used self-organizing neural network method. The SOM network structure consists of an input layer and a competition layer (output layer). The number of input layer neurons is n, and the competition layer is a one-dimensional or two-dimensional planar array composed of m neurons. The network is fully connected: each input node is connected with all output nodes. The SOM network can map input patterns of arbitrary dimension onto a one-dimensional or two-dimensional output layer while keeping their topology unchanged. A "competitive learning" approach is used in training. Each input sample finds the node in the competition layer that best matches it, called its activation node or "winning neuron", and the parameters of the activation node are then updated by stochastic gradient descent. At the same time, the nodes adjacent to the activation node are also updated according to their distance from it. Excitatory feedback is sent to neighboring neurons, and inhibitory feedback is sent to distant neurons. In other words, near neighbors encourage each other, and distant neighbors suppress each other [28].
We use the clustering quality indicator called the Calinski-Harabasz (CH) index to evaluate the clustering effect. The CH index is the ratio of the degree of separation to the compactness of the data set. Compactness is measured by the sum of the squared distances between the data points in each class and the class's representative point, and the degree of separation is measured by the squared distance between each representative point and the center point of the data set. The larger the CH value is, the more compact each class is, the more dispersed the classes are from one another, and the better the clustering effect is.

$$CH(K) = \frac{\sum_{i=1}^{K} n_i \, d^2(c_i, c) \,/\, (K - 1)}{\sum_{i=1}^{K} \sum_{x \in C_i} d^2(x, c_i) \,/\, (n - K)}$$

where K represents the number of clusters, n_i represents the number of data points in the i-th class C_i, d(c_i, c) represents the distance between the representative point of the i-th class and the data center c, d(x, c_i) denotes the distance between data point x and its representative point in class i, and n represents the total number of data points in the dataset [29].
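As a worked illustration of the CH index, the sketch below computes it directly from its definition, taking each class's centroid as its representative point (an assumption; the text only speaks of "representative points"). The function names and the six toy points are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def calinski_harabasz(points, labels):
    """CH index: (between-class dispersion / (K-1)) / (within-class dispersion / (n-K))."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("CH index needs 2 <= K < n")
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * squared_distance(c, overall)
        within += sum(squared_distance(p, c) for p in members)
    # Note: within == 0 (all points on their centroids) would divide by zero.
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical cells: two compact, well-separated groups.
cells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_score = calinski_harabasz(cells, [0, 0, 0, 1, 1, 1])
bad_score = calinski_harabasz(cells, [0, 1, 0, 1, 0, 1])
```

A labeling that matches the two groups scores far higher than one that mixes them, which is the property used to compare clustering algorithms and parameter settings.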
With regard to the factor weights, this study uses the maximum information coefficient (MIC) to calculate correlations and assign weights to factors. The MIC is developed on the basis of mutual information. It is suitable for exploring potential relationships between variable pairs in a data set, and it has the properties of equitability and generality.

$$MIC(X, Y \mid D) = \max_{i \times j < B(n)} \left\{ M(X, Y \mid D)_{i,j} \right\}$$

where X, Y denote the variables; n denotes the sample size; i × j < B(n) bounds the dimensions of the grid G; G indicates that the variable pairs are partitioned into an i × j grid; and M(X, Y|D)_{i,j} denotes the characteristic matrix of X and Y [30]. In this study, B(n) = n^0.6; by construction, 0 ≤ MIC ≤ 1.

Risk Assessment
The weighted factor data are used as the input for the location recommendation algorithm to calculate the similarity between grids, and a kernel density function is then built based on the severity of terrorist attacks. Finally, combining the similarity with the kernel density, a score between 0 and 1 is calculated for each grid where no terrorist attack has occurred. This score indicates the probability of a terrorist attack on the grid.

(1) Similarity Calculation
The Euclidean metric (also called the Euclidean distance) is a commonly used distance definition. It refers to the true distance between two points in an m-dimensional space, or the natural length of a vector (that is, the distance from the point to the origin). The Euclidean distance in 2D and 3D space is the actual distance between two points.
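A possible way to turn the Euclidean distance into a similarity between two grids is sketched below. The text does not state how the factor weights enter the distance or how distance is converted to similarity, so both choices here (weights multiplying the squared differences, and similarity = 1 / (1 + distance)) are assumptions made for illustration, as are the function names.

```python
import math


def weighted_euclidean(a, b, weights):
    """Euclidean distance in factor space with one non-negative weight per factor."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def similarity(a, b, weights):
    """Assumed mapping from distance to similarity in (0, 1]; identical grids give 1."""
    return 1.0 / (1.0 + weighted_euclidean(a, b, weights))


# Hypothetical two-factor grids with equal weights.
d = weighted_euclidean((0.0, 0.0), (3.0, 4.0), (1.0, 1.0))
s_same = similarity((0.2, 0.7), (0.2, 0.7), (0.5, 0.5))
s_far = similarity((0.0, 0.0), (3.0, 4.0), (1.0, 1.0))
```

Any monotonically decreasing mapping would preserve the ranking of the most similar grids, which is what the later selection step relies on.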
(2) Spatial Characteristic Analysis
Kernel density analysis is used in spatial analysis to calculate the density of elements in their surrounding neighborhoods; it fits a smooth surface over the neighborhood of each element. The value is highest at the position of the element and gradually decreases with increasing distance, reaching 0 at the search radius [31]. Kernel density analysis makes it possible to show vividly and intuitively the hot spots where geographical phenomena are concentrated. The formula for the kernel density method is given by

$$f(s) = \sum_{i=1}^{n} \frac{1}{h^2} K\left( \frac{s - x_i}{h} \right)$$

where f(s) is the kernel density at the spatial position s; h is the distance attenuation threshold, that is, the bandwidth; n is the number of element points whose distance from the position s is less than or equal to h; x_i is the i-th such element point; and K is the kernel function. The kernel function used in this study is the quadratic kernel function described in the work of Silverman. The geometric meaning of this equation is that the density value is largest at each core element x_i and decreases continuously with increasing distance from x_i, until the kernel density value drops to 0 when the distance from x_i reaches the bandwidth h.
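The kernel density calculation can be sketched as follows. The sketch assumes the two-dimensional kernel K(u) = (3/π)(1 − u²)² for u < 1 and 0 otherwise, which is the form given by Silverman for planar data; whether this exact form was used in the study is not stated. Weighting each event by a severity value mirrors the Population field mentioned for ArcMap, and the function names and sample events are hypothetical.

```python
import math


def quartic_kernel(u):
    """Assumed kernel: (3/pi) * (1 - u^2)^2 inside the bandwidth, 0 outside."""
    return (3.0 / math.pi) * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kernel_density(s, events, h):
    """Severity-weighted kernel density at position s.

    events is a list of (x, y, severity) tuples; h is the bandwidth.
    """
    total = 0.0
    for x, y, severity in events:
        dist = math.hypot(s[0] - x, s[1] - y)
        total += severity * quartic_kernel(dist / h)
    return total / (h * h)


# One hypothetical event of severity 1 at the origin, bandwidth 1.
events = [(0.0, 0.0, 1.0)]
at_event = kernel_density((0.0, 0.0), events, h=1.0)
halfway = kernel_density((0.5, 0.0), events, h=1.0)
at_edge = kernel_density((1.0, 0.0), events, h=1.0)
```

The three sample values show the behaviour described in the text: the density peaks at the event, decays with distance, and reaches 0 at the bandwidth.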
(3) Spatial Risk Assessment
We use the weighted factor data to calculate the similarity between each grid where no terrorist attack has occurred and the grids where terrorist attacks have occurred. For each grid without attacks, we then select the three attacked grids with the highest similarity to it and compute a weighted average from the kernel density values of these three grids and their corresponding similarity values. The calculated score is the likelihood of a terrorist attack occurring in a grid where no terrorist attacks have occurred. The calculation process is shown in Figure 4.
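The scoring step can be sketched as below. The exact weighting used in the study is not spelled out, so this sketch assumes a similarity-weighted average of the kernel densities of the three most similar attacked grids, with densities already rescaled to [0, 1] so that the score stays in that range. The function name and the sample numbers are hypothetical.

```python
def risk_score(similarities, densities, top_k=3):
    """Similarity-weighted average density over the top_k most similar attacked grids.

    similarities[i] is the similarity between the target grid and attacked grid i;
    densities[i] is the (normalized) kernel density of attacked grid i.
    """
    ranked = sorted(
        zip(similarities, densities), key=lambda pair: pair[0], reverse=True
    )[:top_k]
    weight_sum = sum(sim for sim, _ in ranked)
    if weight_sum == 0:
        return 0.0  # assumption: no similar attacked grid means no evidence of risk
    return sum(sim * dens for sim, dens in ranked) / weight_sum


# Hypothetical similarities and densities for four attacked grids.
sims = [0.9, 0.2, 0.6, 0.5]
dens = [0.8, 0.1, 0.4, 1.0]
score = risk_score(sims, dens)  # uses the grids with similarity 0.9, 0.6 and 0.5
```

Because the weights are normalized by their sum, the score is bounded by the smallest and largest density among the selected grids.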
(4) Evaluation Index
We use precision, recall, and the combined F-Measure of the two to evaluate the spatial risk assessment of terrorist attacks. The precision rate represents the proportion of grids with actual terrorist attacks among the grids that the model assessed as high risk. The recall rate indicates the proportion of grids assessed as high risk by the model among the grids with actual terrorist attacks. The F-Measure considers both rates together and can comprehensively reflect the evaluation performance of the model.

$$Precision = \frac{TP}{TP + FP} \tag{7}$$

In Equation (7), Precision is defined with respect to the prediction results; it indicates how many of the samples predicted as positive are true positive samples. There are two possibilities for a prediction to be positive: one is that a true positive class is predicted as positive (TP); the other is that a negative class is predicted as positive (FP).

$$Recall = \frac{TP}{TP + FN} \tag{8}$$

In Equation (8), Recall is defined with respect to the original sample. It indicates how many positive examples in the sample are correctly predicted. There are two possible outcomes for an original positive example: one is that it is predicted as positive (TP); the other is that it is predicted as negative (FN).

$$F = \frac{2 \times P \times R}{P + R} \tag{9}$$

In Equation (9), P means Precision and R means Recall. The P and R indicators sometimes contradict each other. Therefore, the F-Measure (F), which combines P and R, was selected in this study for the overall evaluation of the model.
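Equations (7) to (9) can be computed from a list of risk scores and a threshold as sketched below. The function name, the sample scores, and the ground-truth flags are hypothetical; the threshold of 0.4 is the value reported in the abstract.

```python
def evaluate(scores, attacked, threshold):
    """Return (precision, recall, f_measure) for scores thresholded into high/low risk.

    scores[i] is the risk score of grid i; attacked[i] is True if an attack
    actually occurred there.
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, a in zip(predicted, attacked) if p and a)
    fp = sum(1 for p, a in zip(predicted, attacked) if p and not a)
    fn = sum(1 for p, a in zip(predicted, attacked) if not p and a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical scores and ground truth for six test grids.
scores = [0.9, 0.5, 0.45, 0.3, 0.2, 0.7]
attacked = [True, True, False, True, False, True]
p, r, f = evaluate(scores, attacked, threshold=0.4)
```

Raising the threshold usually trades recall for precision, which is why the F-Measure is used to pick a single operating point.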

Regional Division Results
According to the collected data of 17 multi-source factors, the clustering algorithms described above are used to partition the Southeast Asian region. For the BIRCH and K-means algorithms, the number of clusters is varied from 2 to 10 to tune the parameters. The BIRCH clustering effect is best when the number of clusters is 4, and the K-means algorithm works best when the number of clusters is 2. For the DBSCAN clustering algorithm, eps (the ε-neighborhood distance threshold) and min samples (the ε-neighborhood sample-count threshold) were tuned; the clustering effect is best when eps is 0.5 and min samples is 8. For the SOM algorithm, the number of neurons was tuned, and the best effect is found when the number is 2. The tuning results for the four clustering algorithms are shown in Figure 5. By comparing the optimal parameter states of the four clustering algorithms, we found that the K-means algorithm has the highest clustering quality score. Therefore, K-means was selected for spatial division. The comparison of the clustering effects of the four algorithms is shown in Figure 6, and the result of the spatial area division is shown in Figure 7.

Spatial Characteristics
Kernel density analysis calculates the density of elements in their surrounding neighborhoods. In the kernel density tool of ArcMap 10.3, the Population field gives the counts or quantities that are spread across the landscape to create a continuous surface. This study uses ArcMap 10.3 for kernel density analysis and sets the Population field to the severity of the terrorist attack at each point (combining the number of deaths, the number of injured, and property losses). The kernel density of each grid cell based on attack severity is shown in Figure 8.
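To make the role of the Population field concrete, the sketch below evaluates a weighted kernel density at one location. It assumes a quartic kernel, which is the kernel form commonly described for this kind of tool, and it is not a reproduction of ArcMap's implementation. The severity weights are hypothetical, since the text does not state how deaths, injuries, and property losses are combined.

```python
import math


def kernel_density(cell, events, radius):
    """Weighted quartic-kernel density at `cell` = (x, y).

    events: iterable of (x, y, weight); the weight plays the role of the
    Population field, i.e. the severity of the attack at that point.
    """
    total = 0.0
    for x, y, weight in events:
        d = math.hypot(cell[0] - x, cell[1] - y)
        if d < radius:  # events outside the search radius contribute nothing
            total += (3.0 / math.pi) * weight * (1.0 - (d / radius) ** 2) ** 2
    return total / radius ** 2


# Hypothetical events: (x, y, severity weight).
events = [(0.0, 0.0, 2.0), (0.5, 0.0, 1.0), (5.0, 5.0, 10.0)]
density_at_origin = kernel_density((0.0, 0.0), events, radius=1.0)
```

A severe attack raises the surface more than a minor one at the same distance, and the third event lies outside the search radius, so it does not affect the value at the origin.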

Assessment Results
The data set was divided into two parts: one for training the evaluation model and the other for testing it. To train and test the spatial risk assessment model, 10-fold cross-validation was used: the data set is divided into ten parts, nine of which serve as training data and one as test data. Each sample in a test set receives a score between 0 and 1. The model is then evaluated at thresholds from 0.1 to 0.9, using the precision rate, recall rate, and F-Measure as evaluation indices. We ran the 10-fold cross-validation ten times and took the average as the estimate of the final model accuracy, as shown in Figure 9.
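The evaluation procedure can be sketched as repeated 10-fold cross-validation with a threshold sweep. This is an outline under assumptions: `fit` is a hypothetical callable that trains on the training folds and returns a scoring function with outputs in [0, 1] (standing in for the recommendation model), the balanced F = 2PR / (P + R) is assumed, and folds with no positive predictions or no positive samples are scored as 0.

```python
import random


def prf(y_true, y_pred):
    """Precision, recall and balanced F for boolean labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f


def repeated_cv(X, y, fit, k=10, repeats=10, seed=0):
    """Average (precision, recall, F) per threshold over repeats * k folds."""
    thresholds = [t / 10 for t in range(1, 10)]  # 0.1 ... 0.9
    rng = random.Random(seed)
    sums = {t: [0.0, 0.0, 0.0] for t in thresholds}
    for _ in range(repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        for fold in (idx[i::k] for i in range(k)):
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            score = fit([X[i] for i in train], [y[i] for i in train])
            fold_scores = [score(X[i]) for i in fold]
            fold_truth = [y[i] for i in fold]
            for t in thresholds:
                metrics = prf(fold_truth, [s > t for s in fold_scores])
                for j in range(3):
                    sums[t][j] += metrics[j]
    n = repeats * k
    return {t: tuple(v / n for v in vals) for t, vals in sums.items()}


# Toy usage: the "model" just echoes a precomputed score (hypothetical data).
X = [i / 100 for i in range(100)]
y = [x > 0.4 for x in X]
results = repeated_cv(X, y, fit=lambda X_train, y_train: (lambda x: x))
best_threshold = max(results, key=lambda t: results[t][2])
```

Selecting the threshold with the highest averaged F, as in the last line, mirrors how the operating threshold is chosen from the curves in Figure 9.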
As seen in Figure 9, as the threshold increases, the precision rate increases and the recall rate decreases, while the F-Measure first increases and then decreases. The F-Measure comprehensively represents the overall performance of the model, so this study selects the threshold with the highest F-Measure, which is 0.4; the precision at this threshold is 88%. The results of the spatial risk assessment of terrorist attacks are shown in Figure 10. High-risk areas in the figure have scores greater than 0.4, medium-risk areas have scores from 0.1 to 0.4, and low-risk areas have scores less than 0.1. Figure 10 shows that the high-risk areas of terrorist attacks in Southeast Asia are generally concentrated and multi-centered. The southernmost part of Thailand and the Philippines are high-risk areas. Conflicts between religions and ethnic groups in these regions are serious, their economic development is unbalanced, and they belong to the main drug areas, so terrorism breeds easily there. The medium-risk areas are widely distributed and include some coastal and border areas, where unbalanced economic development and ethnic and religious conflicts are quite serious and can lead to terrorist attacks. The low-risk areas are also widely distributed. There are fewer religions and ethnic groups in these regions, and terrorism is less likely to arise.
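The three risk levels follow directly from the score cut-offs stated for Figure 10. A minimal sketch, with a hypothetical function name and with scores of exactly 0.1 or 0.4 assigned to the medium class (the text gives the medium range as 0.1 to 0.4):

```python
def risk_level(score, medium=0.1, high=0.4):
    """Map an assessment score in [0, 1] to a risk class."""
    if score > high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Hypothetical grid-cell scores.
levels = [risk_level(s) for s in (0.05, 0.1, 0.25, 0.4, 0.73)]
```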

Figure 2. Southeast Asia terrorist attack death map.

Figure 5. Clustering quality of four algorithms using different parameters.

Figure 6. Comparison of the clustering quality of four algorithms.

Figure 11. Effect before and after partition.

Figure 11a-c show comparisons of the precision, recall, and F-Measure, respectively, for the model partitioned by the K-means, SOM, BIRCH, and DBSCAN algorithms and for the unpartitioned model. Figure 11a shows that the precision rate increases as the threshold increases and that the precision of the model after K-means, DBSCAN, and SOM partitioning under different thresholds is greater than

Table 2. Impact factor data.
Center for Comparative and International Studies (CIS), International Conflict Research (http://www.icr.ethz.ch/data/index)
Major drug regions: World Drug Report, 2016; Division for Policy Analysis and Public Affairs, United Nations Office on Drugs and Crime (http://www.unvienna.org/unov/en/unodc.html)
NASA's Earth Observatory (http://neo.sci.gsfc.nasa.gov/)

ISPRS Int. J. Geo-Inf. 2018, 7, 354

... 0.1° × 0.1° grid. (3) Ethnic diversity is based on the GeoEPR (National Relations Dataset); the main drug area is based on the World Drug Report and the national administrative border; nighttime lighting is based on the Earth Observation Organization; and population density and topography are based on NASA's Earth Observatory. We use ArcMap 10.3 to sample the above data in a 0.1° × 0.1° grid. (4) With respect to points of interest (POIs), we use the Google Places API to get POI data for Southeast Asia and then use ArcMap 10.3 to sample it in a 0.1° × 0.1° grid.
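Sampling point data such as POIs onto the 0.1° × 0.1° grid amounts to assigning each point to a cell and aggregating per cell. The study does this in ArcMap 10.3; the sketch below only illustrates the idea in plain Python, with hypothetical function names, an assumed grid origin at 0° longitude and 0° latitude, and made-up coordinates.

```python
import math
from collections import Counter

CELL = 0.1  # grid resolution in degrees, as stated in the text


def grid_cell(lon, lat, cell=CELL):
    """Return integer (column, row) indices of the cell containing a point."""
    # Rounding first absorbs floating-point noise when a quotient lands
    # just below or above a whole number.
    col = math.floor(round(lon / cell, 9))
    row = math.floor(round(lat / cell, 9))
    return col, row


def count_per_cell(points):
    """Count how many (lon, lat) points fall into each grid cell."""
    return Counter(grid_cell(lon, lat) for lon, lat in points)


# Hypothetical POI coordinates (longitude, latitude).
pois = [(100.31, 6.81), (100.39, 6.89), (101.0, 7.0)]
counts = count_per_cell(pois)
```

The first two points share a cell, so that cell receives a count of 2; the same cell indices can then be used to join the other factor layers sampled on the grid.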