Computer Analysis of Human Belligerency

José A. Tenreiro Machado 1,†, António M. Lopes 2,*,† and Maria Eugénia Mata 3,†
1 Department of Electrical Engineering, Institute of Engineering, Polytechnic of Porto, Rua Dr. António Bernardino de Almeida, 431, 4249-015 Porto, Portugal; jtm@isep.ipp.pt
2 LAETA/INEGI, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
3 Nova SBE, Nova School of Business and Economics (Faculdade de Economia da Universidade Nova de Lisboa), Rua da Holanda, 1, 2775-405 Carcavelos, Portugal; memata@fe.unl.pt
* Correspondence: aml@fe.up.pt
† These authors contributed equally to this work.


Introduction
Wars have played a major role in human history and have long accounted for much of its violence. According to Blum [1], we presently live in a paradox of power: our means and methods of war have become potentially more devastating and, at the same time, less devastating in practice.
Campbell [2] asks what conception of war to adopt. Williams et al. [3] (p. 85) recall that Cicero defined war as contending by force, and Machiavelli [4] held that "rulers should be good if they can, but be willing to practice evil if necessary" in order to reach their goals. In the same vein, Grotius [5] (p. 18) wrote that "war is the state of contending parties, considered as such," while for Hobbes [6] war was a state of affairs. Regretting wars, Mannies and Laursen [7] prefer to describe war as a violent political disease.
How can one account for wars? The mathematical analysis of war has relied on developing and interpreting the statistical distributions of casualties [8,9]. Such distributions reveal fat tails, meaning that the size of an event is inversely related to its frequency. Such patterns can be used to predict the size distribution of future wars, with implications for sociology and general policy [10].
The Arccosine distance is important when comparing objects described by vectors with different magnitudes. The Canberra is a metric well-suited for quantifying data scattered around an origin and is very sensitive to values close to zero. The Dice, like the Arccosine and the Jaccard, is an angularly-based measure closely related to the Euclidean distance, which is the shortest distance between two points. In particular, the Jaccard has several practical applications, namely, in information retrieval, data mining, and machine learning. The Divergence measures the "distance" between two probability distributions on a statistical manifold. The Lorentzian applies the natural logarithm to the absolute differences between objects. The Manhattan distance is a rectilinear distance, also known as the taxicab norm. The Sørenson distance is close to the Canberra.
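As an illustration, the sketch below implements seven of these measures in plain Python. The formulas are the forms commonly found in the literature on distance measures and are assumed here, since they may differ in normalization from the ones adopted in this paper. All function names are hypothetical.

```python
import math

def manhattan(x, y):
    # rectilinear (taxicab) distance
    return sum(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # each term is scaled by |a| + |b|, hence the sensitivity near zero;
    # terms where both coordinates are zero are skipped (assumption)
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)

def lorentzian(x, y):
    # natural logarithm applied to the absolute differences
    return sum(math.log(1.0 + abs(a - b)) for a, b in zip(x, y))

def sorensen(x, y):
    # assumed form: sum|a - b| / sum|a + b| (undefined if the denominator is zero)
    return (sum(abs(a - b) for a, b in zip(x, y))
            / sum(abs(a + b) for a, b in zip(x, y)))

def arccosine(x, y):
    # angle between the two vectors, independent of their magnitudes
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    # clamp to [-1, 1] to guard against rounding before acos
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))

def jaccard(x, y):
    # assumed form: sum (a - b)^2 / (sum a^2 + sum b^2 - sum a*b)
    num = sum((a - b) ** 2 for a, b in zip(x, y))
    den = (sum(a * a for a in x) + sum(b * b for b in y)
           - sum(a * b for a, b in zip(x, y)))
    return num / den

def dice(x, y):
    # assumed form: sum (a - b)^2 / (sum a^2 + sum b^2)
    num = sum((a - b) ** 2 for a, b in zip(x, y))
    return num / (sum(a * a for a in x) + sum(b * b for b in y))
```

For two parallel vectors such as [1, 2, 3] and [2, 4, 6], the Arccosine distance vanishes while the Manhattan distance does not, which illustrates the insensitivity of the former to magnitude.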

Hierarchical Clustering
Let us consider a set of $N$ objects, $v_i$, $i = 1, \ldots, N$, in a $P$-dimensional real-valued space. The HC is a technique that visualizes groups of similar objects and involves three steps [17]. The first consists of defining a measure of the distance $d(v_i, v_j)$, $i, j = 1, \ldots, N$, between the objects $i$ and $j$. The second step regards the comparison of all objects and the construction of a matrix of distances, $D = [d(v_i, v_j)]$, which is symmetric, with zeros in the main diagonal. In the final step the HC algorithm produces a structure of clusters that is represented by some graphical portrait, such as a hierarchical tree or a dendrogram.

We can adopt two main techniques: (i) the agglomerative and (ii) the divisive iterative schemes. For the agglomerative, each object starts in its own cluster. Then, the successive iterations join the most similar clusters until reaching one single cluster. For the divisive scheme, all objects start in a single cluster. Then, the iterations remove the "outsiders" from the least cohesive cluster, until each object has a separate cluster.

The HC requires the definition of a linkage criterion for quantifying the dissimilarity between clusters. Based on the distances $d(v_R, v_S)$ between pairs of objects $v_R \in R$ and $v_S \in S$, in the clusters $R$ and $S$, respectively, the dissimilarity can be determined by means of a number of alternative criteria, such as the average-linkage [18]:

$$d_{av}(R, S) = \frac{1}{|R|\,|S|} \sum_{v_R \in R} \sum_{v_S \in S} d(v_R, v_S),$$

where $|R|$ and $|S|$ denote the numbers of objects in the two clusters.

For assessing the quality of the clustering, we can adopt the cophenetic coefficient $cc$ [19]. Let us assume that the objects $v_i$ and $v_j$ are described by the HC representations $t_i$ and $t_j$, respectively; then the index $cc$ is given by:

$$cc = \frac{\sum_{i<j}\left[d(v_i,v_j) - \mathrm{av}\left(d(v_i,v_j)\right)\right]\left[d(t_i,t_j) - \mathrm{av}\left(d(t_i,t_j)\right)\right]}{\sqrt{\sum_{i<j}\left[d(v_i,v_j) - \mathrm{av}\left(d(v_i,v_j)\right)\right]^2 \sum_{i<j}\left[d(t_i,t_j) - \mathrm{av}\left(d(t_i,t_j)\right)\right]^2}},$$

with $\mathrm{av}(\cdot)$ denoting average, and $d(t_i, t_j)$ is the cophenetic distance between the HC objects $t_i$ and $t_j$. We have $0 \leq cc \leq 1$, where values close to 0 and to 1 correspond to bad and good clustering of the original data, respectively.
Additionally, the Shepard chart can be used to compare the original and the cophenetic distances, so that the closer the points are to the 45 degree line, the better the result. The graphical portrait consists of a dendrogram or a tree, and the objects are the "leaves".
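The three steps can be sketched in a few lines of plain Python. The code below is a minimal, unoptimized illustration of agglomerative clustering with average linkage and of the cophenetic coefficient, assuming that a distance matrix is already available. The function names and the four toy points are hypothetical, and a production analysis would normally rely on a dedicated numerical library.

```python
import math
from itertools import combinations

def average_linkage(D):
    """Agglomerative clustering with average linkage on a symmetric matrix D.

    Returns the merge history as (members_a, members_b, height) and the
    matrix of cophenetic distances."""
    n = len(D)
    clusters = [[i] for i in range(n)]        # each object starts in its own cluster
    coph = [[0.0] * n for _ in range(n)]
    merges = []
    while len(clusters) > 1:
        best = None
        # look for the pair of clusters with the smallest average distance
        for a, b in combinations(range(len(clusters)), 2):
            R, S = clusters[a], clusters[b]
            d = sum(D[i][j] for i in R for j in S) / (len(R) * len(S))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        R, S = clusters[a], clusters[b]
        # cophenetic distance: height at which two objects are first joined
        for i in R:
            for j in S:
                coph[i][j] = coph[j][i] = d
        merges.append((R, S, d))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [R + S]
    return merges, coph

def cophenetic_coefficient(D, coph):
    """Linear correlation between original and cophenetic distances."""
    pairs = list(combinations(range(len(D)), 2))
    x = [D[i][j] for i, j in pairs]
    t = [coph[i][j] for i, j in pairs]
    mx, mt = sum(x) / len(x), sum(t) / len(t)
    num = sum((a - mx) * (b - mt) for a, b in zip(x, t))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - mt) ** 2 for b in t))
    return num / den

# toy example: four points on a line forming two obvious groups
points = [0.0, 1.0, 5.0, 6.0]
D = [[abs(p - q) for q in points] for p in points]
merges, coph = average_linkage(D)
cc = cophenetic_coefficient(D, coph)
```

In this toy case the two pairs are merged first and the final merge joins them at the average distance between the groups, so the cophenetic coefficient is close to one.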

Multidimensional Scaling
MDS is a computational method for determining and visualizing the similarities or dissimilarities (distances) between objects in a dataset [20]. The main concept is to find the key dimensions explaining the observed distances between the objects. The matrix $D = [d(v_i, v_j)]$ is the source of information of the MDS. The algorithm tries to find the positions of $M \leq P$ dimensional objects $\hat{v}_i$ (represented by points), producing a matrix $\hat{D} = [d(\hat{v}_i, \hat{v}_j)]$ that approximates the original one. Several MDS types were proposed and we can cite the metric, non-metric, and generalized versions. For the metric MDS, we have minimization of a stress cost function $S$ that aggregates the squared differences between the reproduced and the original distances, $d(\hat{v}_i, \hat{v}_j)$ and $d(v_i, v_j)$, over all pairs of objects. The Sammon criterion can be also adopted. The stress $S$ decreases monotonically with the dimension $M$. The user establishes a compromise between the two variables, and usually either $M = 2$ or $M = 3$ is adopted since such values allow a direct graphical representation. The resulting map is read by following the clusters and by checking how they reflect the relationships embedded in the original data. Consequently, the shape of the map and the dimensions of the locus are meaningless.
For assessing the "quality" of the MDS, the user can check, subjectively, whether the locus clearly displays some clusters reflecting the characteristics of the dataset. Additionally, the user can compare the original and the reproduced information stored in $D$ and $\hat{D}$, respectively. The Shepard diagram portrays $d(v_i, v_j)$ versus $d(\hat{v}_i, \hat{v}_j)$, so that a good representation corresponds to points close to the 45 degree line. Alternatively, the plot of $S$ versus $M$ indicates a good representation when we have a significant reduction of the stress. If the map is not clear the user can adopt another measure $d(v_i, v_j)$ until obtaining a suitable representation.
Similarly to the HC, the definition of an adequate distance d(v i , v j ) requires some practice and eventually a few numerical trials. We must note that the alternative distances are correct, the difference being merely in the capability of each one to capture the characteristics embedded in the dataset.
Several distances can lead to valid MDS maps and reveal the same clusters, just differing in their geometrical shapes.
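To make the role of the stress concrete, the following sketch evaluates two cost functions for a candidate configuration of points. The raw stress and the classical Sammon stress are written here in their usual textbook forms, which is an assumption, since the exact normalization adopted in this paper may differ. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def raw_stress(D, points):
    # square root of the summed squared mismatch over all pairs i < j
    return math.sqrt(sum((euclidean(points[i], points[j]) - D[i][j]) ** 2
                         for i, j in combinations(range(len(D)), 2)))

def sammon_stress(D, points):
    # classical Sammon form: each squared mismatch is divided by the original
    # distance, and the total is scaled by the sum of the original distances
    pairs = list(combinations(range(len(D)), 2))
    scale = sum(D[i][j] for i, j in pairs)
    return sum((euclidean(points[i], points[j]) - D[i][j]) ** 2 / D[i][j]
               for i, j in pairs) / scale

# original distances of a 3-4-5 right triangle
D = [[0.0, 3.0, 4.0],
     [3.0, 0.0, 5.0],
     [4.0, 5.0, 0.0]]
map_2d = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]   # M = 2: exact reproduction
map_1d = [(0.0,), (3.0,), (-4.0,)]              # M = 1: one distance is distorted
```

The two-dimensional map reproduces all distances and has zero stress, whereas the one-dimensional map cannot, which illustrates the decrease of the stress with M.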

Spectral Domain
Let us consider the time-series $X = \{x_n : n = 1, \ldots, N\}$, resulting from sampling a continuous variable $x(t)$ at the frequency $f_s$. We can express $X$ in the frequency-domain using the discrete Fourier transform, resulting in:

$$Y(k) = \mathcal{F}\{X\} = \sum_{n=1}^{N} x_n\, e^{-\imath 2\pi (k-1)(n-1)/N}, \quad k = 1, \ldots, N,$$

where $\imath = \sqrt{-1}$ and $\mathcal{F}\{\cdot\}$ is the Fourier operator. Often, we consider only the first half of the spectrum versus frequency, $f$, by considering $k = 1, \ldots, \lceil N/2 \rceil$ and $f = k \, \frac{f_s}{2} / \lceil N/2 \rceil$, where $\lceil \cdot \rceil$ stands for the ceiling function.
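A direct, unoptimized implementation of the transform and of the half spectrum is sketched below. It uses zero-based indices, as is customary in Python, so the first bin corresponds to zero frequency. The function names and the test signal are hypothetical, and a fast Fourier transform routine would be preferred for long series.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform by direct summation (O(N^2))."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def half_spectrum(x, fs):
    """Frequencies and amplitudes for the first half of the spectrum."""
    half = math.ceil(len(x) / 2)
    Y = dft(x)
    freqs = [k * (fs / 2) / half for k in range(half)]   # zero-based bin index
    amps = [abs(Y[k]) for k in range(half)]
    return freqs, amps

# a cosine with 3 cycles over 16 samples, sampled at fs = 16
x = [math.cos(2 * math.pi * 3 * n / 16) for n in range(16)]
freqs, amps = half_spectrum(x, 16.0)
peak = max(range(len(amps)), key=amps.__getitem__)
```

For this signal the amplitude peaks at the bin whose frequency equals 3, as expected.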

Entropy
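Let us consider a discrete probability distribution $P = \{p_1, p_2, \ldots, p_N\}$, with $\sum_i p_i = 1$ and $p_i \geq 0$. The Shannon entropy, $H^{(S)}$, of $P$ is given by:

$$H^{(S)} = \sum_i p_i I(p_i) = -\sum_i p_i \ln p_i, \tag{16}$$

which represents the expected value of the information content $I(p_i) = -\ln p_i$. Several generalizations of (16) have been proposed [21]. Herein, we recall the Machado, $H^{(M)}_{\alpha}$ [22], and the Machado and Lopes, $H^{(ML_1)}_{q,\alpha}$ and $H^{(ML_2)}_{q,\alpha}$ [21], formulations, derived in the framework of fractional calculus (FC). The FC generalizes the concepts of differentiation [23-26] to non-integer orders. The theory was introduced by Leibniz in the 17th century, but only recently gained popularity in applied sciences [27-33].

The Machado formulation [22] applies a fractional derivative to the Shannon information, that is, it considers $D^{\alpha} I(p_i)$, where $D^{\alpha}$ stands for the fractional derivative of order $\alpha \in \mathbb{R}$. Therefore, the concepts of fractional information and fractional entropy of order $\alpha$ can be formulated as:

$$I^{(M)}_{\alpha}(p_i) = D^{\alpha} I(p_i) = -\frac{p_i^{-\alpha}}{\Gamma(\alpha+1)} \left[\ln p_i + \bar{\psi}\right], \tag{17}$$

$$H^{(M)}_{\alpha} = \sum_i \left\{ -\frac{p_i^{-\alpha}}{\Gamma(\alpha+1)} \left[\ln p_i + \bar{\psi}\right] \right\} p_i,$$

where $\bar{\psi} = \psi(1) - \psi(1-\alpha)$, and $\Gamma(\cdot)$ and $\psi(\cdot)$ represent the gamma and digamma functions, respectively. For the case $\alpha = 0$, we verify that $H^{(M)}_{0} = H^{(S)}$.

For the Machado and Lopes formulations, we adopt a general averaging operator, instead of the linear one that is assumed for the Shannon entropy (16). Let us consider a monotonic function $f(x)$ with inverse $f^{-1}(x)$. Therefore, for a set of real values $\{x_i\}$, $i = 1, 2, \ldots$, with probabilities $\{p_i\}$, we can define a general mean [34] associated with $f(x)$ as:

$$\langle x \rangle_f = f^{-1}\left(\sum_i p_i f(x_i)\right). \tag{19}$$

Applying (19) to the Shannon entropy (16) we obtain:

$$H = f^{-1}\left(\sum_i p_i f\left(I(p_i)\right)\right),$$

where $f(x)$ is a Kolmogorov-Nagumo invertible function [35]. If the postulate of additivity for independent events is considered in (19), then only two functions $f(x)$ are possible, consisting of $f(x) = c \cdot x$ and $f(x) = c \cdot e^{(1-q)x}$, with $c$ and $q$ constants. For $f(x) = c \cdot x$ we get the ordinary mean and we verify that $H = H^{(S)}$. For $f(x) = c \cdot e^{(1-q)x}$ we have the expression:

$$H = \frac{1}{1-q} \ln\left(\sum_i p_i\, e^{(1-q) I(p_i)}\right), \tag{21}$$

leading to the Rényi entropy:

$$H^{(R)}_{q} = \frac{1}{1-q} \ln\left(\sum_i p_i^{q}\right). \tag{22}$$

If we combine (17) and (21), then we obtain:

$$H^{(ML_1)}_{q,\alpha} = \frac{1}{1-q} \ln\left\{\sum_i p_i \exp\left[(1-q)\, I^{(M)}_{\alpha}(p_i)\right]\right\}.$$

On the other hand, if we rewrite (22) as:

$$H^{(R)}_{q} = -\ln \langle p \rangle_q, \quad \langle p \rangle_q = \left(\sum_i p_i \cdot p_i^{q-1}\right)^{\frac{1}{q-1}},$$

where $\langle p \rangle_q$ is a generalized mean, then we obtain the second formulation, $H^{(ML_2)}_{q,\alpha}$ [21]. In the limit, when $\alpha \to 0$, both $H^{(ML_1)}_{q,\alpha}$ and $H^{(ML_2)}_{q,\alpha}$ reduce to the Rényi entropy (22).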

The Spans of Wars
This Section characterizes the real-world data describing the spans of wars. In Section 3.1 the adopted dataset is presented. In Section 3.2 these data are processed by computing the dissimilarity indices (1)- (27), followed by the HC and the MDS techniques for dimensionality reduction and scientific visualization. The loci are interpreted in the light of the emerging clusters.

Description of the Dataset
Wars have resulted in about 3.5 billion casualties, people who died in the battlefields or later on, as an indirect consequence or a result of those events. Since 1820, ninety-five international and intra-state wars have occurred, depending on how war is defined [36,37]. Table A1, in Appendix A, contains the database of military conflicts from the 6th century B.C. to the present date that is considered in this paper. Its construction adhered to a strict and prudent account of proceedings in order to preserve scholarly caution in synthesizing the conflicts whose consequences can be measured by the suffering that results. The selection criterion was based on the threshold of 25,000 estimated casualties, because this indicator can express not only lost human capital, but also its devastating impact on families and society in terms of pain and social disruption [38]. The database is global, as it includes all military conflicts above the defined threshold of casualties, wherever they took place. The number of casualties, namely, for ancient conflicts, is derived from historical texts by contemporary writers. For some wars, deaths due to collateral effects are included, such as those due to diseases caused by starvation and general degradation of health care. For details, please refer to the notes in Table A1.
The database contains $N = 163$ wars, where the $i$th war, $i = 1, \ldots, 163$, is characterized by means of four variables: (i) the mean (or center) time of the event, $t_i = \frac{t^b_i + t^e_i}{2}$, where $t^b_i$ and $t^e_i$ denote the starting and ending years, respectively; (ii) the time span, $T_i = t^e_i - t^b_i + 1$ (expressed in years); (iii) the number of belligerents, $B_i$; and (iv) the number of deaths, $C_i$. Therefore, the data are organized in a $163 \times 4$ array, $W = [w_{ik}]$, where $w_{ik}$, $i = 1, \ldots, 163$, $k = 1, \ldots, 4$, represents the $i$th war and its $k$th characterizing variable.

The HC Analysis and Visualization of the Spans of Wars
For applying the HC, firstly the array $W$ is normalized by the arithmetic mean, $\mu(\cdot)$, and standard deviation, $\sigma(\cdot)$, to avoid numerical saturation. This means that the columns of $W$, to be denoted by $u_k$, are converted to:

$$\tilde{u}_k = \frac{u_k - \mu(u_k)}{\sigma(u_k)}, \quad k = 1, \ldots, 4,$$

yielding a normalized array $\tilde{W}$. Secondly, the rows of $\tilde{W}$, to be denoted by $v_i$, are used for calculating the dissimilarity matrices $D_n = [d_n(v_i, v_j)]$, $n = 1, \ldots, 9$, $i, j = 1, \ldots, 163$. Finally, the matrices $D_n$ are processed through the HC for producing the loci of objects that represent the spans of wars. The agglomerative clustering and average-linkage methods are adopted.
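The preprocessing can be sketched as follows, assuming that the standard deviation is the sample one (divisor n - 1) and using the Manhattan distance as a stand-in for any of the nine measures. The three rows of the toy array are arbitrary illustrative values, not entries of Table A1, and the function names are hypothetical.

```python
import math

def zscore_columns(W):
    """Subtract the mean of each column and divide by its standard deviation."""
    n, m = len(W), len(W[0])
    out = [[0.0] * m for _ in range(n)]
    for k in range(m):
        col = [row[k] for row in W]
        mu = sum(col) / n
        # sample standard deviation (divisor n - 1): an assumption of this sketch
        sigma = math.sqrt(sum((c - mu) ** 2 for c in col) / (n - 1))
        for i in range(n):
            out[i][k] = (W[i][k] - mu) / sigma
    return out

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def distance_matrix(V, dist):
    """Symmetric matrix of pairwise distances between the rows of V."""
    n = len(V)
    return [[dist(V[i], V[j]) for j in range(n)] for i in range(n)]

# toy rows: (mean time, span in years, belligerents, deaths), arbitrary values
W = [[100.0, 3.0, 2.0, 30000.0],
     [900.0, 10.0, 4.0, 250000.0],
     [1800.0, 6.0, 3.0, 1000000.0]]
W_norm = zscore_columns(W)
D = distance_matrix(W_norm, manhattan)
```

Without the normalization, the column of deaths would dominate every distance simply because of its numerical scale.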
Although the HC trees are 2-dimensional loci, we can highlight particular aspects embedded in the data or capture distinct information provided by the HC. Herein, the HC trees will consist of two dimensions produced by the standard HC and one extra dimension, corresponding to time, $t$, or casualties, $C$. The 3rd dimension is thus obtained by means of radial basis interpolation (RBI) [39], using the information of each point and the thin-plate spline $\varphi(r) = r^2 \log r$, where $r$ stands for the Euclidean distance between the HC points in the plane. Therefore, the contour lines join points with identical values of time or of casualties. Figures 1 and 2 depict, for example, the HC trees obtained with the Jaccard and Sørenson distances, $d_6$ and $d_9$, respectively, while the 3rd dimension is calculated by interpolating either $t$ or $\ln(C)$. For the other distances the loci are of the same type. For both distances, we verify the emergence of identical clusters, $C_1$ and $C_2$, composed of sub-clusters, $C_{11}$ and $C_{12}$, and $C_{21}$ and $C_{22}$, respectively. These clusters reflect the similarities between objects, but often the interpretation of the loci is difficult, namely, in the presence of many objects. Figure 3 shows the Shepard plot for assessing the HC tree with the distance $d_6$. The chart reflects an accurate clustering of the original data, with $cc = 0.89$. For the other distances the charts are identical, and therefore, are not presented.

The MDS Analysis and Visualization of the Spans of Wars
We visualize the spans of wars using the metric MDS and the Sammon criterion (13). The inputs to the algorithm are the matrices $D_n$. Figures 4 and 5 depict, for example, the 3-dimensional MDS loci obtained with the Jaccard and Sørenson distances, $d_6$ and $d_9$, respectively. The spheres representing wars have sizes proportional to the numbers of casualties, and color is proportional to time. Figure 6a,b shows the MDS assessment charts for the Jaccard distance. Since the Shepard diagram exhibits a small scatter around the 45 degree line, we have a good fit between the initial and reproduced distances $d_6(v_i, v_j)$ and $d_6(\hat{v}_i, \hat{v}_j)$. The stress plot shows that the 3-dimensional locus is a good representation, since $M = 3$ points to the elbow of the function $S(M)$. Therefore, 3-dimensional representations give a good compromise between accuracy and readability. For the other distances we obtain charts of the same type.

The MDS Analysis and Visualization of the Spans of Wars based on a Generalized Distance
Distances (1)-(9) have their own pros and cons; that is to say, they highlight specific aspects of the data, but give lesser importance to others. Therefore, in order to embed their distinct characteristics, we propose the "generalized" distance $d_{10}$:

$$d_{10}(v_i, v_j) = \sum_{r=1}^{9} \lambda_r\, d_r(v_i, v_j), \tag{27}$$

where $\lambda_r \in \mathbb{R}$, $\sum_{r=1}^{9} \lambda_r = 1$, are weighting constants. In other words, we conjecture that the distances (1)-(9) capture distinct characteristics of the objects and that a more complete grasp of the information is obtained by using all indices complementarily. Therefore, distance (27) may lead to a multi-perspective visualization. Since we have no a priori preference for a given distance we consider in the follow-up all weights to be identical, that is, $\lambda_r = \frac{1}{9}$, $r = 1, \ldots, 9$. The sphere sizes and colors have the same meaning as in the previous MDS loci. The corresponding MDS assessment charts are omitted, since they are of the same type as those shown in Figure 6. Several experiments showed that, as expected, the generalized distance, $d_{10}$, leads to better clustering than the one revealed by the distances $\{d_1, \ldots, d_9\}$ used separately. More importantly, we note in all cases the explosion in the number of events during the recent decades, and a large scattering in the MDS plots for these events, corresponding to a multitude of distinct characteristics.
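The weighted combination is straightforward to code. The sketch below accepts any list of distance functions and defaults to identical weights; for brevity it combines only two measures instead of nine, and the function names are hypothetical.

```python
import math

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def generalized_distance(x, y, distances, weights=None):
    """Weighted sum of several distances; the weights must add up to one."""
    if weights is None:
        weights = [1.0 / len(distances)] * len(distances)   # identical weights
    if len(weights) != len(distances) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("need one weight per distance, summing to 1")
    return sum(w * d(x, y) for w, d in zip(weights, distances))

# two measures with identical weights
d_mix = generalized_distance([0.0, 0.0], [3.0, 4.0], [manhattan, euclidean])
```

Since the individual measures may have very different numerical ranges, it can be worth checking whether one of them dominates the sum before interpreting the resulting map.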

Sociological Interpretation of the Spans of Wars
Perhaps wars are as ancient as humankind, and go back for tens or hundreds of thousands of years, because they are of great value in clarifying the human responses to disputes. During the Mesolithic period, from circa 9700 B.C. to 8750 B.C., when European hunter-gatherers settled and developed more complex societies, they developed warring. However, they are not considered in Table A1, because collective memory on these wars has been lost, as writing systems were not available and oral tradition has given origin to only myths and legends.
Causes for warring in sedentary existence also include the shift to a growing population, and concentrations of assets and value in terms of resources such as livestock, which have increased the complexity of social relationships and social ranking. Cooperation in hunting, agriculture, or food sharing must be recognized as means for conflict resolution. The existence of surpluses triggered barter for common products and trade in high-value commodities. Both Europeans and Asians faced scarcity of resources in ancient times. From the 6th century B.C. to the 4th century A.D., 24 military events took place in 11 centuries, which averages to roughly one every 50 years. These 24 ancient wars form a cluster of their own (Figure 7). The establishment of collective identities required group boundaries and territory control in the Chinese, Greek, Persian, and Roman empires. The Three Kingdoms War (labeled 21), from the Han to the Jin dynasty, and the Yellow Turban Rebellion (labeled 22) are the bloodiest known conflicts since the beginning of humankind.
Splitting of empires also brought military conflicts, such as the Hunnic invasions, which put an end to the Western Roman Empire (labeled 24 in Figure 7). The flexibility that allowed individuals to move to other groups was another superior instrument for social and political arrangements. Such peaceful social mechanisms have included family alliances through marriage, and cross-group ties of kinship. However, these mechanisms did not eliminate serious conflict, namely, because of religious or ideological mass killings, such as the Reconquista from the Muslims and the Crusades (1095-1291), labeled 27 in Figure 7. Wars of territorial conquest and economic gain remained the bloodiest, such as the Mongol conquests (32 in Figure 7, in the cloud) and the Timurid conquests (35), which overshadow the European Hundred Years' War (34, in the cloud). Clouds mean that peaceful periods were short. The long durations of these wars for the control of means of production, trade routes, and raw materials oblige belligerents to invest great amounts of money and soldiers' blood, which makes it very difficult to stop fighting if victory is not clear, because the losses would have been in vain [40]. Continued fighting is the way to redeem all previous military efforts and human sacrifices.
There was also a tremendous dependency on weather conditions. Depending on rainfall, temperature, and other weather conditions, crops could flourish or be lost. Production was highly vulnerable, and standards of living could deteriorate because of scarcity, high food prices, and diet problems for a large number of people. Under those conditions, resistance to diseases could decrease, soon resulting in famine, epidemics, and higher mortality, as happened with the Eurasian Black Death epidemics (1343-1353). Riots, revolts, and warring were a common consequence of such a set of difficult conditions for human survival. Expansion to scarcely known lands, in the Modern Age, brought discovery and colonization. Wars labeled 37 to 61 were the great Modern Age wars. The Spanish conquests of Yucatan, the Inca, and the Aztec empires in the New World (labeled 39 to 41 in Figure 7) are famous, and they overshadow the European conflicts (such as the French Wars of Religion, labeled 44, and the Thirty Years' War, labeled 49) [41] owing to the spread of diseases in the New World [42]. Strategic innovations became available for military superiority, and new hopes supported the prospects of victory [43]. The old armor, pikes, and longbows became obsolete when compared to muskets and cannons, increasing human sacrifice (casualties), as Figure 7 clearly shows [43].
Environmental and economic conditions strongly conditioned "the economic history of man as a successful species" [44]. A good example of the correlation between bad weather conditions and lower production levels did happen in the 17th century in Northern Europe. The so-called Cooling age generated conditions that led to wars (labeled 49-52, 54-57, and 59 and 60). Environmental upheavals such as long rainy winters obliged an adaptation to alternative means of survival beyond colonial expansion: capitalist industry, trading, and finance. A world without water or ice is hard to visualize, and recent trends toward increasing average temperatures may bring serious problems to humankind, and military conflicts may also become more plausible and frequent [45].
Looking at contemporary history, new and sophisticated means of warfare became available in this phase. Artillery and bombing at a distance were key aspects in Napoleon's invasions and wars of conquest (labeled 64 in Figure 7). Close combat between soldiers within confined battlefields has given way to long-range weapon technologies. Wars became much bloodier in the nineteenth and twentieth centuries. Civil wars have dominated the military scene, and the Chinese Civil War of 1927-49 overshadowed the American and the Spanish Civil Wars (1861-65 and 1936-39, respectively), with labels 75 and 97.
More recently, the modern battlefields of the World Wars, labeled 89 and 99, also differed a lot, because of the air force bombing capacity in the second one (1939-1945). By earlier standards, the First World War (1914-1918) was not in fact particularly global [46]. Recent weaponry technology has evolved in two different aspects: destructiveness and distance. Each one of the most recent wars was dramatically bloody (colored yellow-orange-green in Figures 1, 2, 4, 5, and 7). Weapons with great destructive power can defeat any enemy nowadays (or at least oblige its government to sit down at the bargaining table). By operating across great distances, they present much higher effective ranges of threat.
Global-scale ambition and conquest purposes have been explanatory variables for more frequent belligerency. Such a high frequency of conflict means that it has been difficult to reap the gains that were forecast for victory [47]. Perhaps the announcements by ruling elites of messianic future benefits of victory, and their political propaganda, have frequently been persuasive enough to create popular domestic enthusiasm and support for fighting. The return to peace and normalcy carries tremendous political costs for politicians, even for victors.

Entropy Analysis of the Span of Wars
The number of casualties in each war, $C_i$, $i = 1, \ldots, N$, with $N = 163$, is distributed along the time span, $T_i = t^e_i - t^b_i + 1$, yielding the "density of casualties" time-series:

$$X(n) = \sum_{i=1}^{N} \sum_{k=t^b_i}^{t^e_i} \frac{C_i}{T_i}\, \delta(n - k),$$

where $t^b_i$ and $t^e_i$ stand for the beginning and the end of the $i$th war, respectively, and $\delta(n)$ denotes the Dirac delta function at time $n$. Figure 8 depicts the time-series $X(n)$ and its spectrum $|Y(f)|$. In the first, we note the tendency toward chaotic-like behavior, and in the second, we verify the existence of some complex behavior. Therefore, we need some more assertive mathematical and numerical tool to unveil other characteristics of the dataset.
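The construction of the time-series can be sketched as follows, assuming that the casualties of each war are spread uniformly over the years of its span and that overlapping wars add up. The function name and the two toy wars are hypothetical.

```python
def casualty_density(wars, first_year, last_year):
    """Yearly density of casualties.

    wars: iterable of (t_b, t_e, C) with beginning year, ending year, and deaths.
    Returns a list X such that X[n - first_year] is the density at year n."""
    X = [0.0] * (last_year - first_year + 1)
    for t_b, t_e, C in wars:
        T = t_e - t_b + 1                  # time span in years
        for year in range(t_b, t_e + 1):
            X[year - first_year] += C / T  # uniform share per year (assumption)
    return X

# two overlapping toy wars on a 5-year axis
X = casualty_density([(1, 4, 400.0), (3, 4, 100.0)], first_year=1, last_year=5)
```

By construction, the sum of the series equals the total number of casualties in the database.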
We start by adopting a 10-year sliding window without overlap, that is, slicing the time-series $X$ into 275 intervals, $w_i$ ($i = 1, \ldots, 275$). This minimizes issues related to the non-stationarity of the data and yields a good compromise between time discrimination and statistical significance. For the $i$th window we determine a histogram by binning the elements of $w_i$ into 10 equally spaced containers and counting the number of elements in each container. Then, we compute the Shannon and fractional entropies discussed in Sections 2.4 and 2.5, where the probabilities are estimated from the histograms of relative frequencies. The orders of the fractional entropies are set to values for which they have a higher sensitivity to the data characteristics [21].
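The windowing and histogram steps can be sketched as below for the Shannon entropy. The bins are assumed to span the minimum and maximum of each window, and a constant window is assigned to a single bin; both choices are assumptions of this sketch, and the function name is hypothetical.

```python
import math

def window_entropies(X, width=10, bins=10):
    """Shannon entropy of the histogram of each non-overlapping window."""
    H = []
    for start in range(0, len(X) - width + 1, width):
        w = X[start:start + width]
        lo, hi = min(w), max(w)
        counts = [0] * bins
        for v in w:
            # equally spaced bins between lo and hi; constant windows use bin 0
            k = 0 if hi == lo else min(int((v - lo) / (hi - lo) * bins), bins - 1)
            counts[k] += 1
        p = [c / width for c in counts if c > 0]   # relative frequencies
        H.append(-sum(pi * math.log(pi) for pi in p))
    return H

# a constant decade followed by a decade with ten distinct values
X = [0.0] * 10 + [float(v) for v in range(10)]
H = window_entropies(X)
```

The constant decade yields zero entropy, while the decade with ten distinct, evenly spread values reaches the maximum, ln 10.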
Two models are considered, where $\{\beta_1, \beta_2, \beta_3, \beta_4\} \in \mathbb{R}$ are the models' parameters, whose values are listed in Table 1. We clearly observe three periods: the first, $P_1 = [-549, 650]$, with a slow increase of the entropy until the middle followed by a slow decrease; the second, $P_2 = [650, 1400]$, with very small values of entropy; and the third, $P_3 = [1400, 2020]$, with a fast increase of the entropy. This behavior is coherent with the explosion of conflicts with large mortality rates fueled by the development of more "efficient" weapons. In fact, guided missiles and sniper fire (and drones, nowadays) allow targeting more enemies and economic resources, with lower risk of counterattacks. The aftermath of a war has always included disaster, economic disarray, and profound social unrest for the defeated partners, with unbearable costs beyond the efforts in belligerency.

More importantly, the results highlight an evolution toward increasing values of the entropy, somehow reflecting a thermodynamic behavior, seemingly aligned with the anthropological and social issues pointed out in this paper. This conclusion was also reached previously based on different concepts and tools, namely, when using HC and MDS for a collection of distances. Therefore, the results obtained by distinct techniques are compatible and seem not to depend on the type of measure or the computational approach at hand. Moreover, we verify that data-driven modeling, combining mathematical and computational tools, constitutes a relevant exploratory strategy for describing complex real-world phenomena. Indeed, we are tackling a class of systems for which a classical approach based purely on an analytical perspective would require subjective initial assumptions, possibly biasing the resulting analysis.

Discussion and Conclusions
MDS revealed the entire graphical appraisal of humankind's warring. When looking at the historical record of wars, one can see that warring is not an abnormality for humankind. From a long-term perspective, warring may be regarded as a powerful element in the economic history of humankind, as military defeat means the spoiling of economic resources (including loss of human capital) beyond social turmoil.
War has served as a mechanism of natural selection in which the fittest prevailed to acquire both mates and resources. Ravaged economies and societies have had great difficulty returning to growth, while military victory for their enemies brought enlargement of territory and more availability of economic and human resources. Warfare may be an instrument of policy for reaching economic targets from a strategic perspective, a kind of investment for expansionary purposes in a local, regional, or global context. War is a deliberate and instrumental economic choice of political and military elites for leadership. From a distinct perspective, we verified that present-day computational techniques, both for data processing and for visualization of the results, may play a key role in the study of these dramatic events. This strategy does not preclude the use of classical modeling techniques and in fact can be complemented by such descriptions. Moreover, the proposed method suggests collecting further characteristics of the events, since the algorithmic approach can easily handle a richer description involving a higher number of dimensions.
On another level, we verified an increasing number of conflicts. This produced not only large scattering in the HC and MDS plots for recent times, but also larger and growing values of entropy. Such conclusions seem to be robust and not to depend on the mathematical index or computational tool. We can ask ourselves whether human civilization is reflected by events such as wars that are manifestations of the second law of thermodynamics. The results seem to indicate that the Plato quote, "Only the dead have seen the end of war," is still going to be relevant for the years to come.

Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.