Quantitative Research on Global Terrorist Attacks and Terrorist Attack Classification

Terrorist attacks hinder the development of a region. Before responding to terrorist attacks, we need to grade them; once the level of an attack is known, terrorist organizations can be fought more effectively. Through quantitative research on the Global Terrorism Database, this paper builds rating models for terrorist attacks, classification and discovery models for hidden or emerging terrorist organizations, alliance network models of terrorist organizations, and more, which together address the event classification problem. By studying the relevant literature and the variables of the Global Terrorism Database, this paper sorted out 25 observation variables related to the impact level (level of harm) of terrorist attacks. By establishing a mathematical model of factor analysis, 11 factors related to the impact level (level of harm) of terrorist attacks were constructed, and the variance contribution of each factor was used as its weight to calculate a comprehensive score of the impact level of each terrorist attack. Finally, the K-means clustering method was used to cluster and analyze the comprehensive scores, and the top 10 terrorist attacks with the highest impact level in the past two decades were obtained.


Introduction
A terrorist attack is an attack carried out by extremists or extremist organizations that is not limited to civilians and civilian facilities and that violates international morality. It is not only extremely destructive, directly causing huge casualties and property losses, but it also puts great psychological pressure on people, causing social turmoil, disrupting normal work and life, and thus greatly hindering social and economic development.
Terrorism is a common threat to people. Combating terrorism is the responsibility of every country and citizen. In-depth study and analysis of data related to terrorist attacks will help us deepen our understanding of terrorism and provide valuable information support for counterterrorism.
Criminology has only recently become more open to the study of terrorism and political violence [1]. Some maintain that this hesitancy is partly due to the difficulty of defining terrorism, along with challenges in measuring it [2,3]. However, the field is increasingly embracing the study of terrorism in general, and also the study of specific terrorist tactics. In particular, environmental criminology and situational crime prevention (SCP) are increasingly being applied to terrorism [4]. Grading catastrophic events such as earthquakes, traffic accidents and meteorological disasters is an important task in social management. Grading is generally done subjectively: an authoritative organization or department selects several main indicators and imposes grading standards. For example, the classification standards for traffic accidents specified in "China's Road Traffic Accident Treatment Measures" [5] are based mainly on casualties and the degree of economic loss. However, the harmfulness of terrorist attacks depends not only on casualties and economic losses, but also on timing, geography, targets and other factors [6][7][8]. Therefore, it is difficult to form a unified standard using this kind of classification method.
In this paper, we propose a comprehensive method for evaluating the harmfulness of terrorist incidents, and we use it to rank and classify them. Because terrorism affects political, economic, cultural, social, religious and other aspects of life, its complex nature makes it difficult to summarize its harmfulness with a few simple indicators. However, the extent of the harm can be summarized from the statistics of terrorist attacks by means of factor analysis. Therefore, based on the terrorist attack database, this paper first draws on the risk assessment systems established in the relevant literature, then uses factor analysis to design a model for quantitative analysis of the degree of harm of terrorist attacks, and finally applies K-means cluster analysis to the comprehensive harm scores to obtain a rating.

Terrorist Attack Analysis
The security issues facing the world today mainly involve traditional security and non-traditional security. Traditional security has always been the focus of national security policies. It involves political security, national defense security, national territorial sovereignty security, regime security and other important matters [9][10][11][12]. Since the beginning of the 21st century, it has become more urgent to guard against non-traditional security threats, including fighting terrorism, ensuring economic and financial security, combating smuggling and drug trafficking, countering piracy, and preventing the spread of large-scale transnational infectious diseases [13][14][15][16]. Non-traditional security factors are sometimes intertwined with traditional security factors, making the international security situation more complex. Although the possibility of a world war is small, security problems such as terrorist attacks have reduced the sense of security of many countries and many people [16][17][18]. Some areas have been plunged into long-term turmoil and conflict by frequent terrorist activities, which has seriously affected the sustainable development of local society. No responsible government can ignore this worrying situation. How to prevent and combat terrorism in light of national and regional conditions and the current world situation, and how to build a sustainable and secure environment and make the earth a home for peaceful development, have become common concerns of people of all countries. In response to this practical concern, this paper studies the patterns of global terrorist attacks and the distribution of terrorist organizations' networks through quantitative models, so as to provide effective clues, information and strategies for governments to jointly combat terrorist organizations and promote the peace, security and sustainable development of the international community.
As shown in Figure 1, the shades of color indicate the number of terrorist attacks in each country from 1998 to 2017. The darker the color, the more terrorist attacks a country has suffered. Terrorist attacks are a serious social problem: the greater the harm of an attack, the greater its social impact. However, the assessment of the level of danger of a terrorist attack should not be limited to the property damage and casualties it causes; other factors should also be considered. In this paper, we graded the terrorist attacks that have occurred through factor analysis and then examined the distribution of terrorist attacks across countries, in order to help countries defend national security against terrorist attacks and maintain a safe and stable environment for national development.
Terrorism refers to ideas and behavior that create social panic, endanger public security, infringe on personal and property rights, or coerce state organs and international organizations by means of violence, destruction and intimidation in order to achieve political and ideological purposes.
For a long time, terrorism, marked by its bloody violence, has caused chaos and social unrest in many parts of the world. Although only a small number of people engage in terrorist activities, the characteristics of these activities make the harm they cause far greater than that caused by ordinary violent crime, as they also affect politics, the economy, the military, diplomacy, international relations and other fields.
The main hazards of terrorism include affecting the security of neighboring countries, seriously undermining national harmony, greatly hindering the economic development and social progress of countries, damaging the image of the secular governments of some countries, causing political instability and social unrest, undermining the peace and development of the world, and so on.

Factor Analysis Model
The harmfulness of terrorism is affected by different factors, so key evaluation indicators should be identified before conducting a hazard assessment. The harm of terrorist attacks depends not only on casualties and economic losses, but also on timing, geographical location, targets and many other factors. Therefore, with this research, we want to establish a more objective and complete evaluation index system, which reflects not only the direct losses caused by terrorist attacks but also their potential harms, including political, religious and social ones. In this section, we use factor analysis, a technique of multivariate statistical analysis, to establish a risk assessment model. The results can provide a reference for terrorism hazard assessment.
Factor analysis [19,20] is a multivariate statistical analysis method. Its core idea is data transformation and dimensionality reduction: intricate variables are integrated into a few main factors for problem interpretation or comprehensive evaluation, so that most of the information in the original variables is explained by a small number of latent factors. The starting point of factor analysis is the correlation matrix of the original variables, and the resulting factors are uncorrelated with one another. The terrorist attack risk indicators are mathematically transformed into several factors, the main factors are selected according to certain criteria, and a comprehensive risk score of each terrorist attack is obtained. At the same time, factor analysis does not require index weights to be determined subjectively. Instead, the weights are obtained automatically from the observations in the sample data, so subjective influences are reduced and more objective evaluation results can be provided.

Mathematical Model
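Suppose there are p variables $x_1, x_2, x_3, \ldots, x_p$. After normalization, each variable has mean 0 and standard deviation 1. Each variable can be represented by a linear combination of k (k < p) factors $f_1, f_2, f_3, \ldots, f_k$. The mathematical model can then be established as follows [10]:

$$x_i = a_{i1} f_1 + a_{i2} f_2 + \cdots + a_{ik} f_k + \varepsilon_i, \quad i = 1, 2, \ldots, p.$$

The model can also be represented in matrix form:

$$X = AF + \varepsilon,$$

where $F = (f_1, \ldots, f_k)^T$ is the vector of common factors, $X = (x_1, \ldots, x_p)^T$ is the vector of normalized original variables, $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_p)^T$ is the part of each variable not explained by the common factors, and $A = (a_{ij})$ is the factor load matrix (i = 1, 2, …, p; j = 1, 2, …, k). The loading $a_{ij}$ is the covariance of $x_i$ and $f_j$. The larger its absolute value, the greater the dependency of $x_i$ on $f_j$.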

Factor Load
The factor load matrix can be solved by the least squares method, the maximum likelihood method, the principal axis factor method, the principal component method, etc. The principal component method is used in this paper. The calculation process is as follows:
(1) Standardize the data.
(2) Calculate the sample covariance matrix.
(3) Solve for the non-zero eigenvalues of the covariance matrix and sort them in descending order, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$, with corresponding orthonormal eigenvectors $e_i$ (i = 1, 2, …, p).
(4) Calculate the factor load matrix. Assuming k < p, then $A = (\sqrt{\lambda_1}\, e_1, \sqrt{\lambda_2}\, e_2, \ldots, \sqrt{\lambda_k}\, e_k)$.
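The four steps of the principal component method can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in this paper: the function name `pc_factor_loadings`, the synthetic data and the default of keeping eigenvalues greater than 1 are assumptions made for the example.

```python
import numpy as np

def pc_factor_loadings(X, k=None):
    """Estimate factor loadings by the principal component method.

    X: (n_samples, p) array of raw observations.
    k: number of factors to keep; if None, keep eigenvalues > 1
       (an assumed default for this sketch).
    Returns (loadings of shape (p, k), eigenvalues sorted descending).
    """
    X = np.asarray(X, dtype=float)
    # (1) Standardize each column to mean 0 and standard deviation 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # (2) Covariance of the standardized data (equals the correlation matrix).
    R = np.cov(Z, rowvar=False)
    # (3) Eigen-decomposition; eigh returns ascending order, so reorder.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if k is None:
        k = int(np.sum(eigvals > 1.0))
    # (4) A = (sqrt(lambda_1) e_1, ..., sqrt(lambda_k) e_k).
    loadings = eigvecs[:, :k] * np.sqrt(np.clip(eigvals[:k], 0.0, None))
    return loadings, eigvals

# Synthetic data (hypothetical): 6 observed variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.3 * rng.normal(size=(500, 6))

A, eigvals = pc_factor_loadings(X)
print(A.shape)
print(eigvals.round(3))
```

When all p factors are kept, the product of the loading matrix with its transpose reproduces the correlation matrix exactly; keeping only k < p columns gives the low-rank approximation on which the factor model relies.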

Factor Rotation
After the original variables are combined into a few factors, the meaning of the factors may be vague, which hinders further explanation and evaluation. Therefore, a linear combination of the initial common factors, that is, a factor rotation, is used to give each composite factor a specific meaning. Factor rotation is divided into orthogonal rotation and non-orthogonal rotation. The rotation pushes each loading on the new factors either closer to zero or farther away from zero. A loading close to zero indicates that the common factor and the variable are weakly correlated, and a loading whose absolute value is close to 1 indicates a strong correlation. Therefore, after factor rotation, the practical meaning of each common factor is more explicit. Here we used the orthogonal rotation maximum variance (varimax) method.
The maximum variance method maximizes the variance of the loadings on each common factor by rotation, so that the loading of each variable on a factor moves toward either the maximum or the minimum, leaving few or no medium-sized loadings and thus making the meaning of each factor more specific and easier to interpret.
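A minimal sketch of an orthogonal maximum variance (varimax) rotation is given here for illustration. It uses the common SVD-based iteration rather than any particular statistical package; the function name `varimax`, the stopping rule and the small loading matrix are assumptions made for the example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (p, k) loading matrix with the varimax criterion.

    Returns (rotated loadings, orthogonal rotation matrix).
    """
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    R = np.eye(k)          # start from the unrotated solution
    last = 0.0
    for _ in range(max_iter):
        L = A @ R
        # Gradient of the varimax criterion with respect to the rotation.
        G = A.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt         # nearest orthogonal matrix to the gradient
        crit = s.sum()
        if last != 0.0 and crit / last < 1.0 + tol:
            break
        last = crit
    return A @ R, R

# Hypothetical unrotated loadings for 4 variables on 2 factors.
example = np.array([[0.7, 0.5],
                    [0.6, 0.6],
                    [0.8, -0.4],
                    [0.5, -0.6]])
rotated, rotation = varimax(example)
print(rotated.round(3))
```

Because the rotation matrix is orthogonal, the communality of each variable (the sum of its squared loadings) is unchanged; only the way the explained variance is shared among the factors changes.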

Establishment of Indicator System
Terrorist attacks are rooted in historical, political, economic, cultural and religious conflicts. The network and resources of terrorist organizations and their assessment of the cost-effectiveness of an attack affect the choice of attack, which in turn affects the extent of its consequences. Therefore, using the terrorism database, the factors that affect the extent of terrorist attacks, such as the weapons used, regional characteristics, targets and death levels, can be analyzed to quantify the damage caused by the terrorist attacks that have occurred. To achieve a more scientific and rational rating of the harm of terrorist incidents, the following principles should be followed in establishing a terrorist attack risk assessment indicator system:

Representative principle
The harmfulness of terrorist activities is reflected in personal injury, property damage, social impact and so on. In selecting indicators, their typicality and representativeness should be taken into consideration, so that they can play a key role in assessing the risk of terrorism.

Scientific principles
The selected indicators have an intrinsic logical relationship with each other, ensuring that each indicator must be a concrete manifestation of a certain dimension of harm; secondly, the indicators should be coordinated to ensure that they do not overlap each other.

Comprehensiveness
The hazards of terrorism include both direct and indirect aspects, and some of the hazards are lagging and long-term. Therefore, when selecting indicators, we must consider the various links and continuing impacts of terrorism. The indicators should form a systematic whole and should be logically rigorous and coordinated.
Based on the above principles, and drawing on the literature and the variables of the Global Terrorism Database (GTD), we compiled 25 observation variables that measure the extent of harm of a terrorist attack:
Workday: The extent of harm of a terrorist attack is related to the date on which it occurs (see supplementary). An attack that occurs from Monday to Friday does more harm than one on Saturday or Sunday. Workday equals 1 when the attack date is Monday to Friday and 0 when the attack date is Saturday or Sunday.
Crtnum: The number of inclusion criteria for terrorist attacks that the event meets. The more criteria are met, the greater the harm of the terrorist attack. Crtnum = crit1 + crit2 + crit3 + doubtterr.
crit1: Political, economic, religious or social goal. 1 = "Yes", the event satisfies criterion 1; 0 = "No", the event does not meet criterion 1 or it is not explicitly indicated that criterion 1 is met.
crit2: Intention to coerce, intimidate or incite a larger audience. 1 = "Yes", the event satisfies criterion 2; 0 = "No", the event does not meet criterion 2 or it is not explicitly indicated that criterion 2 is met.
crit3: Outside the scope of international humanitarian law. 1 = "Yes", the event satisfies criterion 3; 0 = "No", the event does not meet criterion 3.
doubtterr: Doubt as to whether the incident is terrorism. 1 = "Yes", there is doubt that the incident is a terrorist act; 0 = "No", there is essentially no doubt that the incident is a terrorist act. If this variable was not recorded during data collection, the database marks it as "−9".
Relatednum: The number of related events, equal to the number of events listed in the related variable. If a terrorist incident is associated with multiple other terrorist incidents, its impact is generally more extensive and sustained, and the resulting harm is more profound.
Region_wealth: The extent of the damage caused by a terrorist attack, especially property damage, is highly correlated with the level of local economic development. 1 indicates the three developed regions of North America, Western Europe and Oceania, and 0 indicates other regions.
City_ornot: A city center has a high concentration of buildings and heavy traffic. When a terrorist attack occurs in a city center, casualties are often more serious and property losses larger, and the attack may also cause a series of problems such as traffic congestion and mass panic. City_ornot = 1 means the attack occurred in the city itself (vicinity = 0); City_ornot = 0 means it occurred outside the city (vicinity = 1).
Attacktype_num: The number of attack types used in the terrorist attack.
Weaptype_level: A grading of the degree of damage of the weapon type, divided into three levels, 0, 1 and 2. The higher the level, the greater the degree of harm; 0 means unknown and others. If the attacktype1 value in the database is 2, 3 or 7, we define attacktype_level = 3; if the attacktype1 value in the database is 1, 4, 5 or 6, we define attacktype_level = 3; if the attacktype1 value in the database is 8, we define attacktype_level = 1.
Weaptype_num: The number of weapon types used in the terrorist attack.
Targtype_level: Divided into four grades, 0, 1, 2 and 3, where 0 is other, 1 is the government, 2 is public utilities, and 3 is targets directly related to the people. The higher the level, the greater the hazard. According to this standard, if the targtype1 value in the database is 2, 3, 4 or 7, we define targtype_level = 1; if the targtype1 value is 5, 6, 8, 9, 11, 15, 16, 19 or 21, we define targtype_level = 2; if the targtype1 value is 1, 14 or 18, we define targtype_level = 3.
Targtype_num: The number of targets/victims of the terrorist attack.
Propextent_level: Divided into four levels, 0, 1, 2 and 3. The higher the level, the greater the degree of harm. Propextent_level = 0 when propextent is missing or has a value of 4; propextent_level = 1 when propextent = 3; propextent_level = 2 when propextent = 2; propextent_level = 3 when propextent = 1.
The following indicators, which also reflect the extent of a terrorist attack, are defined as in the Global Terrorism Database (GTD): extended, multiple, success (attack success), suicide (suicide attack), ishostkid (hostages or kidnapping victims), claimed (claimed responsibility), property (property loss), INT_LOG (international, logistical), INT_IDEO (international, ideological), INT_MISC (international, miscellaneous) and INT_ANY (international, any of the above).
In addition, the total number of deaths nkill, the total number of injuries nwound, and the total number of hostage/kidnapping victims nhostkid are defined as in the Global Terrorism Database (GTD). Where the original value of nkill, nwound or nhostkid is missing, it is set to 0.
For all of the above indicators, the larger the value, the greater the harm caused by the terrorist attack.
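As an illustration of how some of these derived indicators could be computed from a single GTD-style record, consider the following sketch. It covers only Workday, Crtnum, Targtype_level, Propextent_level and the zero-filling of counts. The function name `encode_event`, the sample records, the assumption that the date fields form a valid calendar date, and the choice to count a doubtterr value of −9 as 0 are all assumptions made for the example, not rules stated above.

```python
from datetime import date

# Target-type grading transcribed from the Targtype_level definition.
TARGTYPE_LEVEL = {code: 1 for code in (2, 3, 4, 7)}
TARGTYPE_LEVEL.update({code: 2 for code in (5, 6, 8, 9, 11, 15, 16, 19, 21)})
TARGTYPE_LEVEL.update({code: 3 for code in (1, 14, 18)})

# Property-damage grading; missing values and code 4 fall back to 0.
PROPEXTENT_LEVEL = {3: 1, 2: 2, 1: 3}

def encode_event(ev):
    """Derive a few indicators from a dict of GTD-style fields."""
    day = date(ev["iyear"], ev["imonth"], ev["iday"])  # assumes a valid date
    doubt = ev.get("doubtterr", 0)
    if doubt not in (0, 1):      # assumption: treat -9 (not recorded) as 0
        doubt = 0
    return {
        "workday": 1 if day.weekday() < 5 else 0,      # Monday=0 ... Sunday=6
        "crtnum": ev["crit1"] + ev["crit2"] + ev["crit3"] + doubt,
        "targtype_level": TARGTYPE_LEVEL.get(ev.get("targtype1"), 0),
        "propextent_level": PROPEXTENT_LEVEL.get(ev.get("propextent"), 0),
        "nkill": ev.get("nkill") or 0,                 # missing counts become 0
        "nwound": ev.get("nwound") or 0,
        "nhostkid": ev.get("nhostkid") or 0,
    }

# Two hypothetical records (not taken from the database).
friday_event = {"iyear": 2016, "imonth": 7, "iday": 1, "crit1": 1, "crit2": 1,
                "crit3": 1, "doubtterr": -9, "targtype1": 14, "propextent": 1,
                "nkill": None, "nwound": 3}
saturday_event = {"iyear": 2016, "imonth": 7, "iday": 2, "crit1": 1, "crit2": 1,
                  "crit3": 1, "doubtterr": 1, "targtype1": 2, "propextent": 4,
                  "nkill": 2, "nwound": None, "nhostkid": None}

print(encode_event(friday_event))
print(encode_event(saturday_event))
```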

KMO Test and Bartlett's Test of Sphericity Results
Prior to factor analysis, the raw data should be tested for suitability for factor analysis, namely with Bartlett's test of sphericity and the KMO (Kaiser-Meyer-Olkin) test.
Bartlett's test of sphericity is based on the matrix of correlation coefficients. Its null hypothesis is that the correlation coefficient matrix is an identity matrix, that is, all elements on the diagonal are 1 and all off-diagonal elements are zero. The test statistic is obtained from the determinant of the correlation coefficient matrix and approximately follows a chi-square distribution. If the chi-square value is significant (P less than or equal to 0.05), the null hypothesis is rejected, and the original variables are suitable for factor analysis. Otherwise, the null hypothesis cannot be rejected, the original variables are regarded as uncorrelated, and the data are not suitable for factor analysis.
The Kaiser-Meyer-Olkin measure of sampling adequacy is also an important indicator of the correlation between variables, and is obtained by comparing the correlation coefficients and the partial correlation coefficients between pairs of variables. KMO lies between 0 and 1. The higher the KMO, the stronger the commonality of the variables. If the partial correlation coefficients are high relative to the correlation coefficients, the KMO is low, and factor analysis cannot achieve a good data reduction effect. According to Kaiser (1974), the general criteria are as follows: 0.00-0.49, unacceptable; 0.50-0.59, miserable; 0.60-0.69, mediocre; 0.70-0.79, middling; 0.80-0.89, meritorious; 0.90-1.00, marvelous. As can be seen from Table 1, the result of Bartlett's test of sphericity is significant at the 1% level and the KMO value is 0.617, indicating that the original data are correlated and suitable for factor analysis.
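For readers who want to reproduce the two suitability checks, the following sketch computes the Bartlett chi-square statistic and the overall KMO value from a correlation matrix using their standard formulas. The function names and the small equicorrelated matrix are assumptions made for the example; obtaining a P-value additionally requires the chi-square distribution, which is not included here.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett statistic for correlation matrix R estimated from n samples.

    chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, with p(p - 1)/2 degrees of freedom.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.asarray(R, dtype=float)
    Q = np.linalg.inv(R)
    d = np.sqrt(np.diag(Q))
    partial = -Q / np.outer(d, d)            # partial correlation coefficients
    off = ~np.eye(R.shape[0], dtype=bool)    # mask of off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)

# Hypothetical correlation matrix: three variables, all pairwise r = 0.5.
R_example = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
chi2_value, dof = bartlett_sphericity(R_example, n=100)
kmo_value = kmo(R_example)
print(chi2_value, dof, kmo_value)
```

A large chi-square value relative to its degrees of freedom argues against the identity-matrix null hypothesis, and the KMO value can then be read against the Kaiser thresholds listed above.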
We can establish a terrorist attack hazard factor analysis model as shown in the Appendix Equations (A1), (A2) and (A3).

Comprehensive Factor Score
Based on the above factor analysis results (as shown in Tables A1-A5 in the appendix), we calculated the top 10 terrorist attacks in the database (as shown in Table 2). Ranked 1st and 2nd are the September 11 attacks ("9/11"), the most serious terrorist attack in the history of the United States. Estimates of the property losses caused by the incident vary. The United Nations issued a report saying that the attack caused US$200 billion in economic losses, equivalent to 2% of that year's GDP. The psychological impact of the incident on the American people was extremely far-reaching, and their sense of economic and political security was seriously weakened.
The K-means algorithm is one of the most widely used clustering algorithms. It takes the mean of all data samples in each cluster as the representative point of that cluster. The main idea of the algorithm is to divide the data set into different categories through an iterative process, so that the criterion function for evaluating clustering performance reaches its optimum and each cluster generated is internally compact and separate from the others.
The essence of the K-means algorithm is to calculate the dissimilarity between two elements. Since the calculated degree of damage of a terrorist attack is a scalar, we use the Euclidean distance to measure the dissimilarity between elements (other common dissimilarity measures include the Manhattan distance, the Minkowski distance, the Pearson coefficient, etc.). For two elements $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$, the Euclidean distance is defined as follows:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$

which reduces to $|x - y|$ for scalars. It is the geometric distance between two elements in Euclidean space and is widely used to measure the difference between two elements because it is intuitive and easy to interpret.
The specific implementation steps of the K-means algorithm are as follows:
(1) Determine an initial cluster center for each cluster, so that there are K initial cluster centers.
(2) Allocate each sample in the sample set to the nearest cluster according to the principle of minimum distance.
(3) Use the sample mean of each cluster as the new cluster center.
(4) Repeat steps (2) and (3) until the cluster centers no longer change.
Other grading studies [21] usually classify events into 5 levels. Therefore, in combination with the clustering effect plot (as shown in Figure 2), we set the number of categories to 5.
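The steps above can be sketched for scalar scores in plain Python. This is a minimal illustration rather than the implementation used in this paper: the function names, the evenly spaced initial centers and the toy scores are assumptions. The second helper turns cluster labels into grades, with grade 1 assigned to the cluster with the highest center.

```python
def kmeans_1d(scores, k, max_iter=100):
    """K-means on scalar scores. Returns (centers, labels)."""
    ordered = sorted(scores)
    n = len(ordered)
    # Step 1: initial centers spread evenly over the sorted scores
    # (an illustrative, deterministic choice).
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(max_iter):
        # Step 2: assign each score to the nearest center (|x - c| for scalars).
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in scores]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(scores, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        # Stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def grades_from_clusters(centers, labels):
    """Map cluster labels to grades: grade 1 is the cluster with the highest center."""
    order = sorted(range(len(centers)), key=lambda j: -centers[j])
    grade_of = {cluster: grade + 1 for grade, cluster in enumerate(order)}
    return [grade_of[lab] for lab in labels]

# Toy composite scores (hypothetical), clustered into 3 groups for brevity.
scores = [0.1, 0.2, 0.15, 5.0, 5.2, 9.9, 10.1]
centers, labels = kmeans_1d(scores, k=3)
print(grades_from_clusters(centers, labels))
```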

Figure 2. Cluster number and clustering effect. The abscissa is the number of clusters and the ordinate is value, where value = (within-cluster distance)/(between-cluster distance). Value decreases as the number of clusters increases, but for practical purposes the number of clusters cannot be too large, and 5 is the best choice.
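One possible reading of the ratio plotted in Figure 2 is sketched below. The caption does not specify how the two distances are computed, so the definitions used here are assumptions: the within-cluster distance is taken as the mean absolute distance of each score to its own cluster center, and the between-cluster distance as the mean absolute distance between pairs of cluster centers. The function name `cluster_value` and the toy clusters are also hypothetical.

```python
from itertools import combinations

def cluster_value(clusters):
    """Within/between distance ratio for scalar scores.

    clusters: list of at least two non-empty lists of scores.
    """
    centers = [sum(c) / len(c) for c in clusters]
    n = sum(len(c) for c in clusters)
    # Assumed definition: mean distance of every score to its own center.
    within = sum(abs(x - m) for c, m in zip(clusters, centers) for x in c) / n
    # Assumed definition: mean distance between all pairs of centers.
    pairs = list(combinations(centers, 2))
    between = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return within / between

coarse = cluster_value([[0.0, 2.0], [10.0, 12.0]])
fine = cluster_value([[0.0], [2.0], [10.0], [12.0]])
print(coarse, fine)
```

Under these definitions the ratio shrinks as clusters are split further, which matches the downward trend described in the caption and explains why the number of clusters has to be capped on practical grounds rather than by the ratio alone.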
Here, we use the above K-means algorithm to cluster the terrorist attacks according to the comprehensive score F of their degree of risk. Then, the grade scale was established, with the highest level being 1 and the lowest level being 5. The classification results are shown in Table 3.
It is easy to see from Table 3 that terrorist attacks are concentrated in levels 2, 3 and 4; that is, most terrorist attacks fall into grades II-IV. In Pakistan and India, for example, the number of terrorist attacks is large but the grade distribution is concentrated, so terrorist attacks show a certain spatial pattern. This spatial distribution is useful for prediction, and it makes it easier for governments to prevent terrorist attacks and reduce their dangers.
The distribution of terrorist attacks at all levels around the world is shown in Figure 3. Afghanistan and Pakistan are the two countries that suffered the most attacks by terrorist organizations. From Figures 4 and 5, it is easy to see that most terrorist attacks happen in border areas. The grade I terrorist attacks also happened in these places. These two countries should pay more attention to their border areas. Afghanistan should focus on combating grade II and III attacks and Pakistan on grade III and IV attacks, because in each country these two grades account for over 60% of all terrorist attacks. Frequent terrorist attacks give local residents and businesses a sense of crisis, which makes local enterprises reluctant to expand production and thus hinders local economic development. The terrorist attacks have also seriously damaged the local environment. These two countries should strengthen the protection of their borders, especially of power plants, oil wells, factories and residential areas near the border. Reducing the incidence of terrorist attacks will improve the security of businesses and residents, and these measures will promote local economic development.
As shown in Figure 6a, the total number of terrorist attacks in Australasia and Oceania is small. From 1998 to 2006, the number of terrorist attacks in Australasia and Oceania decreased. However, in 2006 the number of events reached a peak, and terrorist attacks increased sharply from 2013 to 2016. Given this trend, Australasia and Oceania should strengthen the prevention of terrorist attacks in order to ensure the security of the region.
Under political turmoil, economic difficulties will not disappear but will be exacerbated, whereas economic structural transformation and improvement in economic quality require a stable political and social environment. Countries of the Middle East and North Africa have long been subject to foreign intervention, have a single economic structure, rely heavily on the international market, and are at the edge of the global division of labor. Modern information and network technology has, to a certain extent, brought previously hidden undercurrents to the surface, making potential conflicts flammable and difficult to control. These regions should therefore strengthen national security in the following respects:
1. Improve the fairness of income distribution and enable the masses to share more development results.
2. Respond to demographic changes and employment pressures, and build an economic development safety net.
3. Promote reforms in various fields and provide strong institutional support for economic development.

Conclusions
The problem of grading terrorist attacks can be addressed by establishing a terrorist attack risk rating system based on a factor analysis model and a clustering algorithm. This paper first reviewed the relevant literature and examined the data to compile a representative index system. Given the complexity of terrorist attacks, factor analysis provided a good and efficient way to simplify the analysis. Using the factor analysis method, and starting from the dependence relationships within the correlation matrix of the research indicators, variables with overlapping information and intricate relationships were reduced to a few uncorrelated comprehensive factors. Then, we calculated each factor score and the composite score to obtain the top 10 terrorist attacks. Next, we used the K-means cluster analysis method to cluster the comprehensive scores of the hazard levels of these terrorist attacks and ranked the resulting five classes of terrorist attacks according to their comprehensive scores.

Initial Common Factor and Factor Loading Matrix
First, we analyzed all the terrorist attacks in the database from 1998 to the end of 2017. The factor loadings were obtained by the principal component method, and then the factor rotation was performed. The number of factors was determined by the criterion that the eigenvalue be greater than 1. From the factor analysis results in Table A1, it can be seen that it is appropriate to retain 11 main factors; their cumulative variance contribution is 66.2%, which achieves the purpose of dimensionality reduction. The factor load matrix and the scree plot are shown in Table A2 and Figure A1, respectively.

Factor Rotation and Factor Score Matrix
Further, we performed the factor rotation by the maximum variance (varimax) method, and from Tables A1 and A2 we calculated the factor score matrix.
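Once factor scores are available, the comprehensive score of each attack is a weighted sum of its factor scores, with each factor weighted by its variance contribution. The sketch below illustrates this. Normalizing the weights so that they sum to 1 over the retained factors is an assumption, and the function names and numbers are hypothetical.

```python
def composite_scores(factor_scores, variance_contrib):
    """Weighted sum of factor scores for each event.

    factor_scores: list of rows, one row of k factor scores per event.
    variance_contrib: variance contribution of each of the k factors.
    """
    total = sum(variance_contrib)
    weights = [v / total for v in variance_contrib]   # assumed normalization
    return [sum(w * f for w, f in zip(weights, row)) for row in factor_scores]

def top_events(scores, n):
    """Indices of the n events with the highest composite score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:n]

# Hypothetical factor scores for three events on two factors.
factor_scores = [[1.0, 0.0],
                 [0.0, 1.0],
                 [2.0, 2.0]]
variance_contrib = [30.0, 10.0]   # e.g. percent of variance explained

F = composite_scores(factor_scores, variance_contrib)
print(F)
print(top_events(F, 2))
```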

Figure 1. The number of terrorist attacks in all countries from 1998 to 2017.
(a) Different levels of terrorist attacks. (b) The number of terrorist attacks over time. (c) Proportion of terrorist attacks.

Figure 4. Distribution of terrorist attack grades in Afghanistan.

Figure 5.

Figure 6. Distribution of terrorist attack grades in Australasia and Oceania.

Figure 7. Distribution of terrorist attack grades in the Middle East and North Africa.

Table 1. KMO and Bartlett's test of sphericity results.

Table 2. Top 10 terrorist attacks and ratings.

Table 3. K-means grades for the dangers of terrorist attacks.

Table A1. Total variance of the original variables explained by the initial common factors.

Table A2. The initial factor loading matrix.