An Adversarial-Risk-Analysis Approach to Counterterrorist Online Surveillance

The Internet, with the rise of the IoT, is one of the most powerful means of propagating a terrorist threat, and at the same time the perfect environment for deploying ubiquitous online surveillance systems. This paper tackles the problem of online surveillance, which we define as the monitoring by a security agency of a set of websites through tracking and classification of profiles that are potentially suspected of carrying out terrorist attacks. We conduct a theoretical analysis in this scenario that investigates the introduction of automatic classification technology compared to the status quo involving manual investigation of the collected profiles. Our analysis starts examining the suitability of game-theoretic-based models for decision-making in the introduction of this technology. We propose an adversarial-risk-analysis (ARA) model as a novel way of approaching the online surveillance problem that has the advantage of discarding the hypothesis of common knowledge. The proposed model allows us to study the rationality conditions of the automatic suspect detection technology, determining under which circumstances it is better than the traditional human-based approach. Our experimental results show the benefits of the proposed model. Compared to standard game theory, our ARA-based model indicates in general greater prudence in the deployment of the automatic technology and exhibits satisfactory performance without having to relax crucial hypotheses such as common knowledge and therefore subtracting realism from the problem, although at the expense of higher computational complexity.


Introduction
The global threat of terrorism is currently one of the greatest challenges facing our society. Since 11 September, Western countries have been allocating more effort and resources to fight terrorism on the national and international scales. However, the resources for the increased security to counter potential large-scale attacks are limited.
In this context, the Internet is one of the most powerful means of propagating a threat with lethal effects, especially in the case of jihadist terrorism. In fact, a quantitative study [1] of 178 individuals detained in Spain between 2013 and 2016 for activities related to jihadist terrorism shows that there are two crucial factors for understanding their radicalization: (1) face-to-face or online contact with a radicalization agent; and (2) the existence of previous social links with other radicalized individuals. With the rise of the Internet of things (IoT), where billions of online objects embedded in our homes (e.g., smart grid technologies [2]), workplaces and cities will collect and analyze our data, the risk to national security is exacerbated while it opens up a new horizon for more invasive online surveillance technologies. We highlight, on the one hand, the emerging and recent deployment of vehicular ad hoc networks (VANETs) and specifically the vehicle cloud computing, a paradigm of cooperative mobile communications [3,4]. This type of networks represent a challenge in security and privacy since they require sophisticated mechanisms to protect against attacks coming from users connected to the network with identical privileges. On the other hand, wireless sensor networks are of crucial importance in the protection of critical infrastructures. The continuous monitoring of these infrastructures and the detection of malicious activity in the traffic of sensor networks requires advances that adapt to unknown attacks [5].
The revelations by NSA whistleblower Edward Snowden revealed the scale and extent of digital surveillance, particularly on the Internet, by different security and intelligence agencies [6]. In this work, we focus on the problem of online surveillance faced by a security agency that monitors a set of specific websites by tracking and classifying profiles that are potentially suspected of carrying out terrorist attacks. While there is an extensive body of research in decision-making models and risk analysis for fighting terrorism (We refer the reader to [7] for a complete review of the field.), to the best of our knowledge the problem above of online surveillance with counterterrorist purposes, understood as a game between opponents who want to maximize their benefits, has not been tackled yet. Although it is a controversial issue, our interest is to rationalize the matter from a strictly scientific point of view and, in any case, raise new questions and challenges.
The aim of this work is to conduct a theoretical analysis of the rationality conditions implied in the deployment of an online surveillance system for detecting and neutralizing potential terrorist threats on the Internet. We consider an approach for evaluating the problem based on adversarial risk analysis (ARA), whose bases are found in [8]. This approach supposes a new perspective of decision analysis, providing a robust analytical framework that is a hybrid between game theory and risk analysis. Its objective is to face precisely the risks derived from the intentional actions of intelligent adversaries, which increase security risks, and uncertain results.
We analyze the feasibility of using a technology based on an automatic suspect detection system that covers the functions of investigators who inspect certain websites. That is to say, we aim to determine under which circumstances the tracking and automatic detection model is better than the traditional model ("status quo") in which the collected user profiles are inspected manually. Our work also allows us to limit the paradox of the false positive [9], which is a controversial problem of mass surveillance systems, since our approach is selective and does not infer errors from a broad reference population. Our objective is to carry out a rigorous analysis of the problem.
Next, we summarize the major contributions of this work: • We analyze the suitability of decision-making models based on standard game theory and ARA, to tackle the problem of online surveillance. Our analysis contemplates the case of sequential defense-attack models, and examines the fulfillment of certain requirements on the defender and attacker's side. • We propose an ARA-based model to investigate the problem of online surveillance and analyze the rationality conditions of an automatic threat detection system. Our analysis constitutes a preliminary step towards the systematic application of ARA, in that it aims to establish a point of departure and connection between the analytical framework provided by ARA, a young field within risk analysis, and the problem of online surveillance.

•
We conduct an experimental evaluation of the proposed decision-making model and illustrate the typical problem solving approach used in a real case. Our evaluation methodology, in fact, may serve as a template for real problems, which would basically add modeling and computational complexities. Furthermore, we carried out a sensitivity analysis and provide a thorough comparison with a standard game-theoretic approach under assumptions of common knowledge. Our experiments showed that our ARA-based model outperforms the standard game-theoretic approach, although at the expense of more costly solutions, from a computational point of view.

•
The connection between the ARA models and online counterterrorism sheds new light on the understanding of the suitability of such decision-making models when it comes to applying them to the online surveillance problem. We also hope to illustrate the riveting intersection between the fields of ARA and threat intelligence, in an attempt towards bridging the gap between the respective communities.
The remainder of this paper is organized as follows. Section 2 provides some background on online third-party tracking and establishes our assumptions about the surveillance system. Section 3 describes the online surveillance problem tackled in this work. Section 4 examines the appropriateness of decision-making models based on standard game theory and ARA, to address the problem of online surveillance. Section 5 proposes an ARA-based model for sequential decision-making in the context of online surveillance. Section 6 conducts an experimental evaluation of the proposed model. Section 7 discusses several aspects of our model in relation to the experimental results. Finally, conclusions are drawn in Section 8.

Background and Assumptions
The purpose of online third-party tracking is behavioral advertising [10,11], that is to say, showing ads based on the user's past browsing activity. In this section, we first give a brief overview of the main actors of the advertising ecosystem. This will be necessary to understand our assumptions about the online surveillance system assumed in this work, described below in Section 2.2.

Background in Online Third-Party Tracking
The online advertising industry is composed by a considerable number of entities with very specific and complementary roles, whose ultimate aim is to display ads on websites. Publishers, advertisers, ad platforms, ad agencies, data brokers, aggregators and optimizers are some of the parties involved in those processes [12]. Despite the enormous complexity and constant evolution of the advertising ecosystem, it is usually characterized in terms of publishers, advertisers and advertising platforms [13][14][15][16][17].
In this simplified albeit comprehensive terminology, the third-party tracking and advertising is carried out as follows. As users navigate the Web and interacts with websites, they are observed by both "first parties", which are the sites the user visits directly, and "third parties", which are typically hidden trackers such as ad networks embedded on most web pages. The former parties are often known as publishers and the latter as ad platforms.
Tracking by third-parties begins with publishers embedding in their sites a link to the ad platform(s) they want to work with. The upshot is as follows: when a user retrieves one of those websites and loads it, their browser is immediately directed to all the embedded links. Then, through the use of third-party cookies, web fingerprinting or other tracking technologies, the ad platform is able to track the user's visit to this and any other site partnering with it. Third parties can learn not only the webpages visited and hence its content, but also the user's location through their IP address, and, more importantly, their web-browsing interests, also known as navigation trace.

Assumptions
In this section, we describe our assumptions about the surveillance system deployed by a security agency for detecting possible terrorist threats on websites of interest. We acknowledge that a number of security and privacy aspects would need to be considered if such an online surveillance technology were to be deployed in real practice; among those aspects, the exchange of information between the security agency and the tracking/advertising platform(s) would be critical. However, the practical details of this system and possible anti-tracking countermeasures are beyond the scope of this work. The purpose of our analysis is not to explore these details but rather to study the rationality conditions of deploying such an online surveillance technology.
First, we suppose that a security agency wants to develop a web infrastructure on which to apply an online automatic threat detection system. The websites or publishers targeted by the agency will be those that make it possible to detect threats. For example, certain web forums where ISIS recruiting messages appear with certain frequency are sites that are susceptible to being investigated by the security agency.
In addition, we suppose that it is possible to track users' activity in the target sites, or in other words, there are advertising and tracking companies operating on these sites. We acknowledge, however, that there may be sites such as those hosted on the dark web or others that are on the Internet that cannot be subject to surveillance because there are no ad platforms and tracking companies.
We assume that the agency can contract the services of the trackers available at the target sites to capture the users' visit data, which may include, among others, their activity within the site, location, IP address and web-browser fingerprints. Once properly treated, all such data may allow the agency to re-identify a given web user, possibly with the help of the Internet service provider in question.
In essence, the infrastructure assumed is based on three well-differentiated activities. In a first stage, the agency selects its target publishers and hires the services of the companies that track them to obtain the users' raw visit data. In a second optional stage, the agency exploits the data captured by the third-party trackers through an automatic system based on artificial intelligence methods (classifiers) so that, once the navigation trace of each user is extracted, it is possible to obtain a binary classification: suspicious or not suspicious. The threat detection algorithm that underlies this automatic system inevitably has certain sensitivity and specificity parameters (false positives). In a third and final stage, whether the automatic system has been deployed or not, there is an essential manual investigation of the flagged users by security experts. It should be noted that this type of architecture has two types of limited resources that are well differentiated: resources for hiring trackers and resources for the manual investigation of the collected profiles. In this work, we consider the cost of first type of resources is negligible compared to that of the latter. Figure 1 provides a conceptual depiction of the surveillance infrastructure assumed in this work.  Figure 1. Third-party tracking requires that publishers include a link to the ad platform(s) they want to partner with (1). When a user visits pages partnering with this/these ad platform(s) (2), the browser is instructed to load the URLs provided by the ad platform(s). Through the use of third-party cookies and other tracking mechanisms, the ad platform(s) can track all these visits and build a browsing profile (3). Finally, the information collected by the ad platform(s) is shared with the security agency, provided that they have an agreement (4).

The Problem of Online Surveillance
In this section, we describe the problem of online surveillance from the intrusion-detection problem posed and solved by [18] (According to [19]), intrusion detection systems (IDSs) are hardware or software systems that automate the process of monitoring events that have occurred in a computer system or network, analyzing them to detect security problems.. It is also appropriate to point out the work of Merrick and McLalay [20] on the use of scanners against smuggling of nuclear devices in cargo containers. Both works treat physical or logical security problems and assess their conditioning factors under uncertainty with the use of automatic threat detection systems. We rely on the cited works to define the problem at hand.
Suppose we are going to give support to the decision making of a security agency that has jurisdiction in a territory to prevent and neutralize attacks perpetrated by terrorists who use the Internet as a resource for carrying out their attacks. In general terms, we assume that the agency wishes to suffer the least possible harm, and, on the contrary, the terrorists want to cause the greatest possible damage. Faced with a normal Internet user, we define the suspect as a user whose digital activity can be considered a threat that must be investigated by the agency.
Suppose then that, in a certain period of time, the security agency will carry out online surveillance tasks on a series of websites that, based on expert knowledge, have been classified as susceptible to being used (propaganda, training, forums, etc.) by users who could potentially acquire the capabilities to prepare and/or carry out attacks.
To monitor these sites, the security agency uses a digital technology based on automatic detection of threats, described in Section 2, which consists of two well-defined complementary functions: automatic collection and classification of user profiles. Firstly, and based on tracking the digital activity of users who browse the target websites, the system collects the navigation traces, which result in unique user profiles. Secondly, of the profiles collected, the system is able to detect those that are potentially suspicious with certain sensitivity and specificity rates. More specifically, these classifiers are based on artificial intelligence methods. The use of the classifier is optional and in any case the system can always be supported by an "ad hoc" manual investigation by experts whose criteria we assume to be totally reliable. The classifier analyses each profile and if it considers it to be suspicious, it generates an alarm signal. Afterwards, the agency decides whether to investigate the profile based on available (limited) resources. Therefore, the agency makes decisions about whether to investigate according to the state (signal or lack of signal) of the system. However, when the system generates a signal, the agency does not know with certainty whether it is a real threat or the system has generated a false alarm. On the other hand, the suspect user's main objective is not to be detected by the surveillance system, which would imply, immediately and to simplify, the success of their actions.
The aim of the agency is to configure the system by choosing a point in its effectiveness function that minimizes the total cost of surveillance (the cost is not necessarily a monetary value but we can treat values such as image, privacy, etc., or in any case monetize them). Thus, we initially define the probability of detection α as the probability of classifying a suspect conditioned on the user really being a suspect, and the probability of a false positive β as the probability of classifying a suspect conditioned on the user not being a suspect. In a perfect surveillance system, we would suppose α = 1 and β = 0. However, and in general, online surveillance technology is such that a high value of α also implies a high value of β, due to the variability of the data associated with the normal and abnormal traces and the imprecision of the algorithms used by these types of systems.
In general terms, the navigation trace of potential suspects will depend on factors such as the benefit derived from acquiring the capacities to carry out terrorist acts of different levels; the loss that they will receive if they are captured; and the probability that they will be detected. We assume that a potential terrorist obtains a benefit b if their navigation is not detected. If it is detected, the user incurs a loss l over a non-positive net benefit of (b − l) 0. Suppose that it is reasonable to think that l = (1 + λ)b, with λ 0. The loss can take different forms depending on the nature of the terrorist potential (cost of legal persecution, reputation, intimidating effect, etc.). We denote by π the probability of the presence of a suspicious user in the set of monitored sites.
The agency complements the system with a manual investigation conducted by security experts. In general, it is expensive to always carry out manual investigations (it is obvious that it is a limited resource). When the agency does not deploy the automatic system, expert investigators must manually investigate a proportion ρ of the user profiles collected. When the system is deployed, experts can only investigate a proportion ρ 1 of the profiles that generated alarm signals and a proportion ρ 0 of the profiles that did not generate signals. The agency incurs a cost c every time the experts conduct a manual investigation. We assume that expert manual investigation always confirms or discards threats with certainty (it is 100% effective). If the agency detects a threat it will not incur any loss other than the cost c of the manual investigation. When a suspicious profile is not detected, the agency incurs a damage d. Suppose again that it is reasonable to think that c φd, with φ 1. It is usual to estimate these possible damages in the risk assessment phase before implementing and configuring the detection system. Traditionally, the quality function of a detection system is modeled through its relative operating characteristic (ROC) curve, although other evaluation functions can be appropriate, as shown in the next section. Table 1 summarizes the parameters of the problem of online surveillance defined in this section.

Symbol Description α
Probability of ASC alarm due to suspicion (true positive) β Probability of ASC alarm without suspicion (false positive) π Probability of presence of a suspicious user ρ Probability of manual investigation without using ASC ρ 1 Probability of manual investigation when the ASC generates an alarm ρ 0 Probability of manual investigation when the ASC does not generate an alarm c Cost of manual investigation; c φd, φ 1 d Damage derived from an undetected suspect φ Cost/damage coefficient of the system b Benefit for suspects not detected; l (1 + λ)b, λ 1 l Loss for suspects not detected λ Benefit/loss coefficient of the suspect

Analysis of Decision-Making Models
The terrorist attacks that occurred in Western countries in the last decades have sparked a growing interest in decision-making models and risk analysis for fighting terrorism. We refer the reader to the work in [7] for a complete review of the field.
The vast majority of this literature adopts a game-theoretic approach [21]. For example, [22], studied multi-attribute utility functions for the defender and attacker, and for simultaneous and sequential actions, to compute Nash equilibria; and [23] proposed several max-min optimization models to tackle defender-attacker, attacker-defender and defender-attacker-defender-problems. A hybrid model between game theory and risk analysis is ARA [8], a new perspective of decision analysis that differs from standard game theory in that it makes no assumptions of common knowledge.
The other mainstream literature adopts a decision-analysis approach. Among such works, we highlight the work in [24], which uses decision trees to assess man-portable air defense systems countermeasures. The recurrent problem of decision analysis, however, is the need to evaluate the likelihood of the actions of the others, which is a central issue of the Bayesian approach to games (We would like to stress that the tension between game-theoretic and decision-analytic approaches to decision-making problems with adversaries is by no means exclusive of counterterrorism models [21].).
This section is organized as follows. We first specify our requirements for the desired decision-making model in Section 4.1. Then, Section 4.3 examines the classical game-theoretic approach, and Section 4.4 analyzes ARA and verifies whether the requirements are met by these two models.

Model Requirements and Notation
In this section, we focus on standard game theory and ARA, and analyze their suitability to tackle the online surveillance problem described in Section 3. Since the aim of our analysis is to gain insight into the rationality of online surveillance with the principle of being as close as possible to reality, we define the following requirements for such a model:

•
Both opponents (intelligent, rational) want to maximize their utility.

•
There is uncertainty about the attacker's actions due to uncertainty about their utilities and probabilities.

•
The information on the evaluation of the objectives between opponents is incomplete, with the possibility of obtaining it partially through different sources that we will call intelligence (experts, historical data and/or statistical distributions).

•
And it is possible to model simultaneous and non-simultaneous (sequential) decisions.
Throughout this section, we follow the convention of using uppercase letters for random variables (r.v.s), and lowercase letters for the particular values they take on. Accordingly,p denotes approximation, estimation, as a result of Monte Carlo simulation; and p k ∼ P denotes the former is the kth iteration of the Monte Carlo simulation of the latter r.v. In the text, we drop the superindex k for notational simplicity.

Sequential Defense-Attack Model
To study the appropriateness of standard game theory and ARA, we develop first the sequential defense-attack model, which is one of the two standard counterterrorism model (Other standard models include the simultaneous defense-attack model, the sequential defense-attack-defense [25] and the sequential defense-attack with private information.). We use this model to analyze the problem that is the objective of this work. For the sake of comparison, we consider the following example of counterterrorism scenario.
Example 1 (Counterterrorism scenario). The authority of an airport (D, the defender) must decide whether to install body scanners at the security checkpoints of an airport, replacing the X-ray scanners. On the other hand, a terrorist group (A, the attacker) decides whether to try to smuggle a bomb onto an airplane. D makes the first move, so A can see if the new body scanners are in use when they arrive at the airport. Because A can observe the actions of D before deciding their move, they do not need to know their probabilities or utilities. However, D must have a distribution for A, which specifies its utilities and probabilities.
In this model, the defender makes the first move, deploys their defensive resources and makes a certain choice to position themselves against the possible terrorist threat. The attacker, after having observed this decision, evaluates their options and carries out an attack.
We assume that the defender initially has a discrete set of possible decisions D = {d 1 , d 2 , . . . , d m } and that the attacker can respond with one of their possible attacks A = {a 1 , a 2 , . . . , a p }. As a consequence of these actions, a result is produced. This result is the only uncertainty of the problem and depends probabilistically on (d, a) ∈ D × A. The decision sets can include the option to do nothing or combine several defenses or several attacks. To simplify the discussion, we consider only two possible values for the result, S = {0, 1}, which represents the failure or success of the attack. Thus, the defender and the attacker can have different probability distributions for the possibility of success, given a pair d i , a j . They can also have different utility functions.
To visualize the situation, we have built the influence diagram and the decision tree corresponding to the problem at hand. These are two decision analysis tools that help us to gain a clearer view of the sequential decisions that have to be made.
An influence diagram is a directed acyclic graph with three kinds of nodes: decision nodes, which are shown as squares; chance or uncertainty nodes, shown as circles; and value nodes, shown as hexagons. In addition, an influence diagram can have three types of arcs depending on their destination: if the arc arrives at a decision node, this indicates that the decision is made knowing the value of the predecessor; if it arrives at a random node, then the uncertainty depends on the predecessor node conditioned probability; and if it arrives at a value node, then the utility reflected in that value node depends on the values of the predecessors. Figure 2 shows the coupled influence diagram of the model. In view of this, we assume that the consequences for the defender and the attacker depend, respectively, on (d, s) and (a, s). This is then inferred to the defender's utility node by two arcs coming from the decision node, D, and the uncertainty node, S, which represents the result. The decision node representing the attacker's utility also has two arcs, which in this case come from the decision node A and the uncertainty node S. It is also reflected that the result node, S, depends, in this case probabilistically, on the defender's initial action and the attacker's response. The influence diagram arc from the node of the defender's first decision to the attacker's node reflects that the defender's choice is observed by the attacker before they decide on their attack.   We also show in Figure 2b the decision tree for this problem, clearly reflecting its sequential nature. First, a decision is made corresponding to the set D; once the attacker observes this descision, they decide whether to attack; the final result is produced as a consequence of these two actions. Note that there are two utility values in the terminal node of the tree. Each of these represents the utility that corresponds to each of the actors: one value refers to the defender's utility and the other value refers to the attacker's utility. The fact that there are several branches of each of the nodes refers to the possible decisions or results, which is the case of the chance node, which can be taken in each of them. The number of possible decisions in each decision node is not always the same and that is reflected in the decision tree.

Analysis Based on Standard Game Theory
The focus of game theory to solve the posed problem requires obtaining the utility functions of the defender u D (d, s) and attacker u A (a, s), as well as evaluating the probability of the event S|d, a for each of the participants, which we designate as p D (S|d, a) and p A (S|d, a) for the defender and the attacker, respectively. Standard game theory requires as initial assumption that the defender knows the attacker's utilities and probabilities and the attacker knows the defender's utilities and probabilities, this being common knowledge. If this happens, a solution to the problem can be obtained from the decision tree (Figure 2b) by backward induction as follows.
In node S, it is common knowledge for the two participants that the defender's expected utility associated with each pair (d, a) ∈ D × A, and the attacker's expected utility associated with (d, a) ∈ D × A, Knowing what the defender will do in decision node D, the attacker can determine what is their best attack in node A, after observing the defensive action of the defender, solving the problem This is also known by the defender due to the hypothesis of common knowledge. The defender can determine their best decision in the decision node D, solving the problem Thus, under the assumption of common knowledge, standard game theory predicts that the defender will choose d * ∈ D in node D. Then, the attacker will respond by choosing the attack a * (d * ). The pair (d * , a * (d * )) determines a solution of the game and is a Nash equilibrium.

Analysis Based on ARA
Now we abandon the assumption of common knowledge. It should be taken into account that ARA serves here as support to the defender.
To do this, we treat the attacker's behavior in node A as uncertainty from the defender's point of view and we model this added uncertainty. This is reflected in the influence diagram and the decision tree, as the attacker's decision node has become a chance node, replacing the square by a circle. Looking at the influence diagram ( Figure 3a) we now need to obtain p D (A|d), the probability that the defender will assign to the attack what the attacker will choose once they have observed every defensive move d ∈ D of the defender. The defender also needs to evaluate u D (d, s) and p D (S|d, a), already described above.
Teniendo en cuenta eso proporcionamos un análisis ARA como apoyo al defensor.   Once these data have been evaluated, the defender can solve their decision problem with backward induction considering the decision tree ( Figure 3b). Then, the defender will obtain their expected utility in the node S, ψ D (d, a), for each pair (d, a) ∈ D × A in the same way as in the previous approach. It is at this time when the defender can use the evaluation of the probability of what the attacker will do faced with each of the defender's decisions, p D (A|d), to determine their expected utility in the node A for each d ∈ D, with the expression Finally, the defender can find the decision that maximizes their expected utility in node D, solving the problem Therefore, the best strategy for the defender for the defense-attack model is to choose first d * in node D after having observed s ∈ S.
The key now is how to evaluate p D (A = a | d). To do this, ARA assumes the defender can use a statistical method if they have historical data on the attacker's behavior in similar situations. To complement this evaluation, the defender could also use expert opinions. However, as we describe in Section 5, an approach could be modeling the uncertainty that the defender has about the attacker's decision. This can be done assuming: (i) the attacker wants to maximize their expected utility; and (ii) the defender's uncertainty in evaluating this probability stems from the uncertainty that the defender has about the attacker's probabilities and utilities associated with the attacker's decision problem. In short, the evaluation is limited to analyzing the attacker's decision problem from the defender's point of view (Figure 4). The evaluation of the attacker's probabilities and utilities from the defender's perspective will be based on all the information that the defender has available, which can include previous data from similar situations and expert opinions. If the defender does not have this kind of information, they can use an uninformative or reference distribution to describe p D (A = a | d). Therefore, to obtain p D (A = a | d), the defender needs to evaluate u A (a, s) and p A (S|d, a), the attacker's utilities and probabilities, which are unknown to the defender.
15 del defensor estarán basadas en toda la información que el defensor tenga disponible, la cual puede incluir datos previos de situaciones similares y la opinión de expertos. Si el defensor no dispone de esta clase de información podría usar una distribución no informativa o de referencia para describir = | .   If the defender can access the attacker's probabilities and utilities they will learn, by the same procedure as in the game theory approach, the action that the attacker would give most expected utility, a * (d), for each d ∈ D, and therefore, p D (A = a * (d) | d) = 1. This would imply that the attacker's decision would be anticipated by the defender, and therefore they would not need to evaluate p D (A = a | d) .
We start, therefore, from the assumption that the defender does not know these two quantities, but can recognize their uncertainty about them by means of a probability distribution F = (U A (a, s) , P A (S|d, a)) and solve the attacker's decision problem using backward induction on the decision tree of Figure 4b with the expression In node A, assuming that the attacker wants to maximize their expected utility, the defender's distribution on the attacker's choice when the defender has considered their defense d is This distribution can be approximated using Monte Carlo simulation methods, generating n values, such thatp Once the defender has completed their evaluations, from these data they can solve their problem in the S node for each (d, a) ∈ D × A with the expression Then, their estimated expected utility is and finally their optimal decision is In view of our analysis of standard game theory and ARA, we regard the latter as the most appropriate model for evaluating the online surveillance problem defined in Section 3. The counterterrorism modeling based on common-knowledge assumptions entails that parts have too much information about their counterparts, which does not seem to make sense in a field in which secrecy tends to be an advantage. In a scenario where the adversary wishes to increase the risks of the defender, it seems unusual that the defender will have a full knowledge of their objectives, intentions or possible movements. Similarly, it seems unrealistic that the adversary fully knows the objectives, intentions or possible movements of the defender. Table 2 summarizes the analysis of the two models by matching the initial requirements defined in Section 4.1. As we show in next section, we rely on the analyzed ARA model to tackle the problem at hand. Table 2. Comparison between standard game theory and ARA.

Requirements
Standard Game Theory ARA opponents aim to maximize their utility uncertainty about the attacker's actions incomplete information about the evaluation of the objectives between opponents simultaneous and sequential decisions

An ARA Model for the Online Surveillance Problem
We present an ARA model to evaluate the problem of online surveillance described in Section 3. This model allows us to analyse the rationality conditions of the automatic threat detection system.
We assume that we support an agent (the agency, the defender, D) in their decision-making in relation to deploying an online surveillance system to monitor a set of selected websites, faced with the threat posed by the presence of the other agent (the suspect, the adversary, A) in the target sites. We assume that both agents operate monolithically.
According to the premises described in Section 2, we assume that the dynamics of the defender and the adversary can be described by means of a sequential defense-attack decision model represented in Figure 5 as an influence diagram coupled for the two agents.
To begin and given a set of target websites, the defender makes his initial decision d 1 = {0, 1} (0 is No, 1 is Yes) about using the technology, represented by the decision node D 1 . The adversary knows about these tracking measures and even so decides to be present in the set of sites, a = {0, 1}, represented by the decision node A. The automatic system, in the case that it is deployed (d 1 = 1), can lead to a system alarm signal, s 1 = {0, 1} , represented by the node of uncertainty S 1 shared by the defender and the adversary (if the system is not deployed, s 1 = 0 unfailingly). Depending on the previous result, the defender manually investigates the alarm, d 2 = {0, 1}, represented by the node of uncertainty D 2 , to the degree that their (limited) resources allow. All this leads to the final result of the success/failure of the two agents, s 2 = {0, 1}, represented by the node of uncertainty S 2 . We understand as success for the security agency the fact of detecting the threat and avoiding its potential actions, and failure is understood as the opposite. For the adversary, success and failure are the reverse events of the defender.  The utility u D obtained by the defender depends on the added cost of the manual investigation and the final success of the surveillance (nodes D 2 and S 2 ), on which their utility function is applied. Similarly, the utility u A obtained by the attacker depends on the added cost of access to the set of sites and the final success of the surveillance (nodes A y S 2 ), on which their utility function is applied.

The Defender's Decision Problem
We describe in this section the defender's decision problem, illustrated by an influence diagram in Figure 6, where the threat appears as a probability node A, from the point of view of D, which, given a collected profile should: • Agregar sus costes y evaluar los resultados con su función de utilidad . To solve the decision problem, D, it is necessary to evaluate the probability distributions, p D (A|d 1 ), p D (S 1 |a, d 1 ), p D (D 2 |d 1 , s 1 ) and p D (S 2 |a, d 2 ) and the utility function u D (d 2 , s 2 ). Assuming that D is capable of providing such inputs, we would proceed by applying standard decision analysis calculations based on dynamic programming to obtain the optimal decision.

1.
First, for each relevant scenario (d 2 , s 2 ), add the consequences and obtain the utility u D (d 2 , s 2 ).
In node D 2 , calculate the expected utilities: 4.
In node S 1 , calculate the expected utilities:

5.
In node A, calculate the expected utilities: 6. Finally, the decision node D 1 maximizes the expected utility and stores the corresponding optimal initial decision d * 1 .
Then, d * 1 describes the optimal decision for the defender. It should be kept in mind that we can describe the defender's optimization problem with the expression Note that of the four values required by the agency, p D (A|d 1 ) is the most problematic, insofar as it involves the defender's beliefs about the adversary's decision once they have observed the defender's initial decision d 1 . This is an evaluation that requires strategic thinking for which we propose an approach based on ARA. For this, we need to solve the adversary's decision problem, assuming uncertainty about their evaluations and propagating it to obtain its expected distribution based on the optimal presence of the adversary in the set of monitored sites. We discuss this in the following section.

The Adversary's Decision Problem
We describe the dynamics of the threat, illustrated as an influence diagram in Figure 7, according to the defender's point of view, where D 1 is now a probability node for the attacker, who must: To solve the decision problem, we assume that the adversary wants to maximize their expected utility. They therefore need to evaluate p A (S 1 |a, d 1 ), p A (D 2 |s 1 , d 1 ), p A (S 2 |d 2 , a), and u A (a, s 2 ). We cannot easily obtain these values so we model the defender's uncertainty about them with random probability distributions. Then, we can propagate this uncertainty using the standard reduction algorithm of influence diagrams and obtain the optimal and random decision a = {0, 1} for each value of d 1 = {0, 1}. This provides us with the required distribution p D (A | d 1 ) = P(A * (d 1 ) = a).

1.
Add the consequences and obtain the random utility U A (a, s 2 ), for each (a,s 2 ).
In node D 2 , calculate the expected random utilities: 4.
In node S 1 , calculate the expected random utilities: 5.
In node A, calculate the (random) optimal decision in response to each value of d 1 :
• Finalmente, agregar sus costes y obtener la utilidad correspondiente . This provides us with the required distribution p D (A = a * |d 1 )=P[A * (d 1 ) = a], assuming that the space of a is discrete. It should be kept in mind again that the reduction of the previous influence diagram can be recast as The distribution p D (A|d 1 ) can be estimated by Monte Carlo simulation. To do this, we sample n times the probabilities and utilities of the set F = {P A (S 1 |a, d 1 ), P A (D 2 |s 1 , d 1 ), P(S 2 |d 2 , a), U(a, s 2 )} (24) to obtain the optimal decision a * ∼ A * (d 1 ) in the k−th iteration of the Monte Carlo simulation, k = 1, . . . , n. Then, we can approximate p D (A|d 1 ) through |1 k n : a * k = a|/n. Note that, of the four components in F, the first three can be easily obtained. Normally, P A (S 1 |a, d 1 ) would be related to p D (S 1 |a, d 1 ) through some uncertainty, as we are referring to the results and the interaction between the attacker and defender, based on their decisions d 1 and a. This is also true for P A (S 2 |d 2 , a) with respect to p D (S 2 |d 2 , a). Regarding U A , we generally have information about the multiple interests of the adversary, which we add. However, the fourth element, P A (D 2 |s 1 , d 1 ), could require strategic thinking. In fact, the proposal presented here can be seen as a model of "level-2" thought, in which the defender optimizes their expected utility, with adverse probabilities derived from the optimization of the expected utilities (at random) of the adversary (Algorithm 1).

Algorithm 1:
Overall attacker-defender approach. 1 Initialize parameters 2 For the Adversary 3 for each d 1 , and for k = 1, . . . , n do 4 In node S 2 , and for each (a, s 1 , d 2 ) , In node D 2 , and for each (d 1 , a, s 1 ) , In node S 1 , and for each (d 1 , a) , In node A, 10 Calculate a * k (d 1 ) = arg max a ψ k A (d 1 , a) 11 end 12 Aproximate for each a,p D (A = a | d 1 ) = |{1 k n : a * k = a}|/n 13 For the Defender, calculate 14 In node S 2 , for each (a, d 2 ) , In node D 2 , for each (d 1 , s 1 ) , In node S 1 , for each (d 1 , a) , In node A, for each d 1 , In node D 1 , calculate ψ D = arg max d 1 ψ D (d 1 ).

Overall Approach
The above ideas can be integrated into a step-by-step algorithm. First, we use simulation to estimate the distribution that predicts the options of the adversarial (suspicious) presence in the monitored sites. Second, we find the optimal initial allocation d * 1 (in favor or against the deployment of technology), maximizing the defender's expected utility with respect to the distribution of the derived prediction. We assume that the intervening r.v.'s are discrete, that is, that the impacts S 1 and S 2 are classified.
In the previous scheme, d 1 would be the optimal initial defense allocation. The corresponding probabilities of adversary presence, p D (A = a|d 1 ), represent the probability of each adversary scenario after deploying d * 1 , which would help raise awareness about the state of the security. It is important to take into account that we make simulations for all possible initial allocations d 1 .
Our problem is a binary allocation on using a technology or continuing with the status quo; however, in problems where the number of these allocations/resources is too large, we could use a regression meta-model, as explained in [26], simulating some defenses, evaluating the corresponding attack probabilities and, consequently, approaching the attack probabilities in other defenses. Then, we would use that estimated attack prediction distribution to find the optimal resource allocation.

Experimental Evaluation
This section evaluates the decision-model proposed in the previous section. We used artificial data, given the absence of real, accurate information of terrorist web-browsing data and counterterrorism strategies. Therefore, we gave value to each of the evaluations that the defender must make about their own decisions and their beliefs about the adversary's decisions.
We also illustrate the proposed decision model with an example that serves to show some of the computational subtleties, as well as the typical problem solving approach used in a real case. In fact, it may serve as a template for real problems, which would basically add modeling and computational complexities. Essentially, first we structured the problem, then modeled the defender's evaluations about themselves, and, afterwards, about the adversary. In the computational phase, we simulated the adversary's problem to obtain their attack probabilities and fed them into the defender's problem to obtain the optimal defense. Finally, we carried out a sensitivity analysis. For purposes of completeness and comparison, we also provide a standard game-theoretic approach under assumptions of common knowledge.

Structure of the Problem
We begin by identifying the available resources for both the defender and the adversary. Defensive resources. We considered the defender's defensive resources to be the use of the automatic threat detection system with d 1 = 1. Otherwise, d 1 = 0.
Adversarial resources. For the adversary's resources, we took the presence of the adversary in the set of monitored sites, with a = 1. Otherwise, a = 0.
Results of the game. Finally, we must consider the results of the decisions of both agents. We assumed that the states of S 1 and S 2 are 0 or 1, which means, respectively, the success or failure of the detection in terms of the alarm signal of the automatic system (recall that, if d 1 = 0, then s 1 = 0) and the final detection of the threat after a manual investigation.

The Defender's Evaluations
Now we consider the evaluations of the beliefs and preferences for the agency, that is, p D (S 1 |a, d 1 ), p D (D 2 |d 1 , s 1 ), p D (S 2 |d 2 , a) and u D (d 2 , s 2 ), defined in Section 5. In the assumed scenario, none of them require strategic thinking. In the evaluations, we used the different parameters of the online surveillance problem defined in Section 3. Next, we examined the evaluation of the probability distributions involved as well as the utility model.

•
Evaluating p D (S 1 |a, d 1 ). S 1 represents the probability that the automatic threat detection system generates an alarm, whether there is suspicion or not. Obviously, if the system is not used, it is impossible that it generates an alarm. We established a range of values for this probability, although the defender will operate their problem with the base values. These considerations are reflected in Table 3. Table 3. p D (S 1 = 1|d 1 , a).
Evaluating p D (D 2 |d 1 , s 1 ). D 2 represents the probability of manually investigating a profile collected both when the automatic system is used and when it is not. We also established a range of values that includes the base value for the defender's problem, as shown in Table 4. Table 4. p D (D 2 = 1|d 1 , s 1 ).
Evaluating p D (S 2 |d 2 , a). S 2 represents the final success/failure of the surveillance. As described in Section 3, manual investigation was considered 100% effective in confirming or ruling out a threat. In this case, we did not use a range for the values of this probability. Table 5 shows these considerations. Table 5. p D (S 2 = 1|a, d 2 ). a = 1 a = 0 . Finally, the utility u D (d 2 , s 2 ) as a measure of the quality of the model. We opted for an exponential utility function that allowed us to order the costs v D of the defender while assuming their (constant) risk aversion. Accordingly, we define u D (d 2 , s 2 ) = − exp (c D v D ), with c D ∼ U(0, 3) and consider the parameters shown in Table 6. Table 6. v D (d 2 , s 2 ).

The Defender's Evaluations about the Adversary
The security agency also needs to evaluate p D (A|d 1 ). This requires strategic thinking, as explained in Section 3. To do this, we must put ourselves in the adversary's shoes and make assessments about their probabilities and utilities, from the defender's perspective. Next, we go through how to estimate the probability distributions of the problem at hand and the adversary's utility function.

•
Evaluating P A (S 1 |a, d 1 ). We assumed that p A (S 1 = 1|d 1 , a) is similar to p D (S 1 = 1|d 1 , a). To model our lack of knowledge about the probabilities used by the adversary in their decision problem, we added some uncertainty. In particular, we assumed that, except in cases where p D (S 1 = 1|d 1 , a) is 0 or 1, for those who suppose that the adversary's probabilities will match their beliefs P A about p A (S 1 = 1|d 1 , a) are uniform within the ranges [p min A , p max A ] of Table 7, evaluated by the defender. Table 7. p A (S 1 = 1|d 1 , a).
Then, we modeled p A as a uniform distribution between p min A and p max A . Thus, P A (S 1 |a, d 1 ), was defined by the expression with ω ∼ U (0, 1) , so that the uncertainty about ω induced uncertainty about p A to provide P A .
. We adopted the same approach as before, now based on Table 9. Table 9. p D (S 2 = 1|a, d 2 ). a = 1 a = 0 , s 2 ). Finally, for utility u A (a, s 2 ), we also opted for an exponential utility function that allowed us to order the adversary's costs v A , while we assumed their (constant) risk seeking in relation to their benefits. Thus, we defined u A (a, s 2 ) = exp (c A v A ), with c A ∼ U(0, 0.025) and consider the parameters shown in Table 10. Table 10. v A (a, s 2 ). a = 1 a = 0

Results
We solved the problem with the open-source software R (R version 3.3.3 (2017-03-06)) with an Intel Core TM processor i3-2370 CPU at 2.4 GHz, 4Gb RAM on a Windows 10 64-bit operating system. In our example, the computation time was acceptable (15-20 s per problem on average) and therefore we did not consider the implementation and its performance as the object of the analysis. In any case, it should be noted that the resolution of the problem implied a Monte Carlo simulation for each value d 1 and that in each simulation we must propagate uncertainty at different levels, which became a strong computational challenge for larger problems.
For comparative and sensitivity analysis purposes, we prepared an experiment with 1000 random scenarios for five levels c D of risk aversion of the defender, specifically c D = (0.01, 0.1, 0.5, 1.0, 3.0). In total, we obtained a set of 5000 solutions. In this way, we intended to determine how the parameters of the problem influenced the defender's optimal decision d * 1 and at the same time compared their behavior according to their level of risk aversion, between c D = 0.01 (minimum) and c D = 3.0 (maximum). In Table 11, we include the parameters that we used when the 1000 scenarios were generated for each value of c D .
The implementation of our evaluation model allowed us to obtain in each run (i) the optimal solution d * 1 for the defender; and (ii) the conditional probability p D (A|d 1 ) that the defender needed to solve their problem. As explained above, this probability was estimated using a Monte Carlo simulation with n = 10, 000 replications for each value d 1 ∈ D. Tables 12 and 13 show an extract of the results for one scenario and different levels of c D of risk aversion of the defender, and the explicit form of the probability p D (A|d 1 ) for that scenario. Table 11. Main parameters of the experimental evaluation.   Thus, for example, the probability that the adversary is present in the set of monitored websites, taking into account the possibility that the defender is monitoring their navigation, isp D (A = 1 | d 1 = 1) = 0.91. This means that, in this example, solving the defender's problem ends with the optimal solution d * 1 = 1 for levels c D of the defender's risk aversion 0.01 and 0.10 and the contrary for higher levels.

Sensitivity and Specificity Proportion of Manual Investigations Costs and Coefficients
In Figures 8 and 9, we can observe graphically some of the most relevant results. The favorable use of the system, d * 1 = 1, is given in a moderate proportion of the 1000 cases (between 30% and 38%), and in a more conservative way at a higher level c D of the defender's risk aversion. The same decreasing behavior is observed for the ratio between the expected utilities of the optimal solution and its opposite. On the other hand, we have the average values of the estimates,p D (A = 1 | d 1 ), for which we observe thatp D (A = 1 | d 1 = 1) p D (A = 1 | d 1 = 0) when d * 1 = 1, and conversely in the opposite case. All these results confirm what we intuitively assumed a priori, and the correct behavior of the calculations.    Finally, we adjusted a parametric model to the set of solutions, specifically a logistic regression of the form logit d * with the aim of determining the relationship between d * 1 and the parameters of the problem. To select the best model, we used the bestglm (Available online: https://cran.r-project.org/web/packages/bestglm/index.html (accessed on 8 May 2012)) package of R. The reason was to avoid losing information and overestimating the logit model. Table 14 shows the results of the adjustments, where between the null model and the complete model, the best model obtained is "ARA08.06" (logit model with six variables out of eight available variables, highlighted in bold). Thus, the model indicates that, a priori, we could do without the parameters β base and λ to explain the optimal decision d * 1 = 1 of the defender, while the parameter ρ base (proportion of profiles investigated manually when the system is not used), with an odds ratio = 16.56, is shown to be highly influential. The confusion matrix (The cut-off value is 0.40 for a dataset with 1608 reference cases over 5000) shown in Table 15, computed from the predictions of the "ARA08.06" model, indicates that 3703 of the 5000 scenarios (74%) would be correctly predicted, with 953 out of 1 608 (59%) referring to the cases where d * 1 = 1. At this point, we want to highlight that other parametric and/or nonparametric adjustments can be used alternatively to logistic regression. It is also possible to analyze the sensitivity of the problem from other angles, be it for example through game theory in its classical form or through differential calculus, to find solutions and optimal parameter configurations. We look at this in the following subsection, which analyzes the problem from a standard game-theoretic approach.
Comparison with Game Theory We next solved our example using the game theory approach and compared the results with the solution offered by ARA. The standard game theory approaches generally assume a common knowledge about the structure of the game (values at stake, resources available for the players, feasible assignments, etc.) as well as the utilities and probabilities of the players. In addition, the existence of objective probabilities for uncertainties is also usually assumed, in our case the results of automatic and/or manual investigations, S 1 |d 1 , a and S 2 |a, d 2 .
We assume that the conditional probabilities (S 1 |d 1 , a) and p(S 2 |a, d 2 ) derive, respectively, from p D (S 1 |d 1 , a) and p D (S 2 |a, d 2 ), that is, the defender's belief about the probability that the adversary's presence is not detected. These probabilities now represent objective non-detection probabilities, and both the defender and the adversary know them. In addition, the assumption of common knowledge ensures that the defender knows the probabilities used by the adversary when they solve their decision problem, and therefore does not need to represent the uncertainty surrounding them.
To resolve the problem we adapted the reduction algorithm of influence diagrams proposed by [27] to evaluate an influence diagram that represents a decision problem of a single agent, to solve the sequential defense-attack games formulated as multi-agent influence diagrams. We solved, in parallel, the experiment proposed for ARA when the coefficients of risk aversion and risk seeking for the defender and the adversary are, respectively, c D = (0.01, 0.1, 0.5, 1.0, 3.0) and c A = 0.0125 (remember that originally c A ∼ U(0, 0.025)). These values determine the utility functions of the defender and the adversary, given, respectively by u D (d 2 , s 2 ) and u A (a, s 2 ), which are also common knowledge. In general, we expect coincidental and opposite results that reveal the different assumptions of the methods. In Table 16, we show the same extract of results that Table 12 shows obtained by ARA, now adding the optimal solution provided by game theory. In this scenario and depending on the risk aversion level c D of the defender, ARA behaves more prudently than game theory, which constantly obtains the same solution d * GT 1 = 1.  In Figure 10, we can observe graphically some of the results obtained. The favorable use of the system, d * 1 = 1, is given in a proportion between 49% and 59% of 1000 cases, decreasing to a greater level c D of risk aversion of the defender. The same decreasing behavior is observed for the ratio between the expected utilities of the optimal solution and its opposite. Compared to ARA, the frequency of favorable use of the system is significantly less conservative. This result satisfies us because it corresponds to ARA applications solving other problems. The opposite occurs with the ratio of the expected utilities between the optimal solution and the opposite, where ARA also has decreasing but higher ratios. We understand that these differences are found in the terms used to calculate the expected utilities of the defender in node A. 44 satisface ya que se corresponde con aplicaciones de ARA resolviendo otros problemas. Al contrario ocurre con el ratio de las utilidades esperadas entre la solución óptima y la contraria, donde ARA presenta ratios también decrecientes pero superiores. Entendemos que estas diferencias se encuentran en los términos empleados para el cálculo de las utilidades esperadas del defensor en el nodo A. En la Tabla 16 reflejamos el mismo ajuste logístico usado para los resultados obtenidos con ARA. Entre el modelo nulo y el modelo completo, el mejor modelo obtenido con la Teoría de Juegos es 'GT08.05' (modelo logit de 5 variables sobre 8 disponibles). El modelo nos indica que, a priori, podríamos prescindir de los parámetros , y para explicar la decisión óptima del satisface ya que se corresponde con aplicaciones de ARA resolviendo otros problemas. Al contrario ocurre con el ratio de las utilidades esperadas entre la solución óptima y la contraria, donde ARA presenta ratios también decrecientes pero superiores. Entendemos que estas diferencias se encuentran en los términos empleados para el cálculo de las utilidades esperadas del defensor en el nodo A. En la Tabla 16 reflejamos el mismo ajuste logístico usado para los resultados obtenidos con ARA. Entre el modelo nulo y el modelo completo, el mejor modelo obtenido con la Teoría de Juegos es 'GT08.05' (modelo logit de 5 variables sobre 8 disponibles). El modelo nos indica que, a priori, podríamos prescindir de los parámetros , y para explicar la decisión óptima del (average ± deviation) for d * 1 = 1, both depending on the level c D of risk aversion of the defender.
In Table 17, we show the same logistic adjustment used for the results obtained with ARA. Between the null model and the complete model, the best model obtained with game theory is "GT08.05" (logit model with five variables out of eight available variables). The model indicates that, a priori, we could dispense with parameters α base , φ and λ to explain the defender's optimal decision, while parameter β base (false positives), with an odds ratio o f 29.88, demonstrates to be highly influential. For comparison, we also include in the table the best ARA model obtained, "ARA08.06". The confusion matrix (The cut-off value is 0.55 for a dataset with 2589 reference cases over 5000) shown in Table 18, computed from the predictions of the "GT08.05" model, indicates that 3067 of the 5000 scenarios (61%, 74% with ARA) would be correctly predicted, 1312 out of 2589 (51%, 59% with ARA) referring to the cases in which d * 1 = 1. Table 18. Confusion matrix of the "GT08.05" logit model.

Pred.
In general, we can conclude that the results obtained with ARA are more satisfactory than those obtained with game theory. In contrast to standard game theory, however, ARA provides more costly solutions from a computational point of view, but more realistic in terms of the dynamics and perspective of the game proposed between adversaries to solve the problem of online surveillance.

Discussion
We defined the problem of online surveillance based on the intrusion-detection problem posed by [18] through backward induction. The choice of methodology and model applied to this problem is one of the central ideas of this work. In this sense, we believe that the essence of the problem is adequate for the use of ARA, which arises, among other aspects, from (i) the need to address an efficient allocation of security resources for managing a terrorist threat and (ii) to improve the methods of decision analysis when the risks are derived from intentional actions by intelligent adversaries. These improvements of ARA are aimed to address the unrealistic situations such as the hypothesis of common knowledge about payments and probabilities among adversaries that the classical game theory requires, and the unpredictability of the adversaries' intentions that the standard risk analysis supposes.
On this basis, we investigated an ARA model as a novel way of proposing and solving the problem of online surveillance, where, among other aspects, we do not assume the hypothesis of common knowledge. Unlike the problem tackled by [18], we evaluated the problem of online surveillance faced by a security agency that monitors a set of specific websites by tracking and classifying profiles that are potentially suspected of carrying out terrorist attacks.
Our analysis constitutes a preliminary, theoretical step in that it aimed to establish a point of departure and connection between the analytical framework provided by ARA, a young field within risk analysis, and the problem of online surveillance with counterterrorist purposes, understood as a game between opponents who wanted to maximize their benefits.
To give consistency to our proposal, we have illustrated a feasible architecture for online surveillance based on an engine for tracking user navigation traces on monitored websites and an automatic classifier of suspects, thought of as a classification method based on artificial intelligence. However, we recognize that the implementation of such a surveillance infrastructure may pose several technical difficulties. A number of security and privacy aspects, out of the scope of this work, would need to be considered if such an infrastructure were to be implemented in real practice; among those aspects, special attention should be paid to the exchange of information between the security agency and the tracking/advertising platform(s) and the design of secure authentication schemes [28][29][30].
In this scenario, we evaluated the adoption of automatic technology compared to the status quo that involves the manual investigation of profiles. The automatic threat detection system, nonetheless, is limited to the extent given by the sensitivity and specificity parameters of the artificial-intelligence algorithm responsible for classifying the collected profiles. Specifically, the compromise between the defined parameters and the evaluations of the probabilities and payments of the agents governed by the dynamic strategy modeled with ARA determines the limitations of the automatic system. The results of our experiments indicate that the use of the automatic detection system is strongly influenced by the proportion of profiles that the security agency could investigate manually (status quo case). This is a proportion that supposedly depends in turn on the budget available for intelligence analysts.
We applied the ARA methodology, which offers the possibility of treating the problem with game theory and risk analysis approaches in a new perspective of decision analysis against intelligent adversaries, who increase the risk of security and uncertain results.
Our experimental results corroborate the benefits of the proposed model and at the same time are indicative of its potentialities.
Compared to the analysis of the problem from standard game theory, ARA indicated in general greater prudence in the deployment of the automatic system, even more at higher levels of risk aversion. In addition, the behavior of estimated conditional probabilities correctly responded to our intuitions (p D (A = 1|d 1 = 1) p D (A = 1|d 1 = 0) when d * 1 = 1, and conversely in the opposite case). From this point of view, the ARA model is more attractive than the standard game-theoretic model since it behaves satisfactorily without having to relax crucial hypotheses such as common knowledge and therefore subtracting realism from the problem.
We used a parametric model with the aim of understanding the relationship between the optimal decision d * 1 = 1 and the parameters of the problem. This can be a way of not having to execute the resolution algorithm countless times to determine the adjustment of the system parameters. This and other decisions can be part of the implementation of the online monitoring architecture (machine learning, sensitivity analysis, etc.). In addition to the better parametric adjustment of ARA over standard game theory, we observed that ρ base (proportion of profiles investigated manually in case of not using the system) is determinant for ARA while β base (false positives) was for game theory. β base depends on how good our detection system was and ρ base on the available resources. A priori the two implications could be coherent and a further debate could explore this, for example, depending on how the surveillance architecture is implemented.
Given the absence of real, accurate information of terrorist web-browsing data, our experiments were conducted on artificial data. We consider that the obtained results, although based on artificial and therefore limited data, place us positively at this starting point, as they have been satisfactorily contrasted with standard game theory. The obtained experimental results are solvent from a theoretical standpoint. Furthermore, from a practical point of view, those results provide confirmatory evidence as we could corroborate some previous intuitions we expected from ARA, as illustrated in Figures 8  and 9, particularly with regard to the predictions about the attacker p D (A|d 1 ), whose evaluation is key for ARA, and which we derived from our uncertainty about the adversary's problem and under the hypothesis that the adversaries endeavor to maximize their expected utility. The following support the validity and appropriateness of the proposed approach: (i) the model formulation used in this work is one of the basic template models employed in counterterrorism; (ii) this model has been successfully addressed by ARA; and (iii) ARA is a robust, extensively investigated theoretical framework in the literature [8,21,31].
Furthermore, ARA goes beyond the dynamics of a player facing "nature", introducing in its place intelligent adversaries in a game of rational confrontation represented by the agents' utility functions. In addition, another important advantage of ARA is its use of multi-agent influence diagrams, a graphic tool that allows clearly representing a decision problem between more than one agent. In a nutshell, we can conclude that ARA is an excellent option for modeling and solving the problem posed against the classical model of game theory.

Conclusions and Future Work
In recent years, Western countries have allocated tremendous amounts of resources to fight terrorism. As in any war, the battle occurs in various environments and the Internet, with the advent of the IoT, is one of the most powerful ones for propagating a threat and recruiting terrorists. However, at the very same time, this environment is the perfect storm for the development of ubiquitous online surveillance.
In this work, we first examined the suitability of standard game theory and ARA, to tackle the online surveillance problem in which a security agency aims at countering terrorism online by deploying an automatic threat detection system on certain target websites. Then, we proposed an ARA-based model to analyze the feasibility of using such an automatic system, and determined under which conditions said deployment is better than the traditional model in which terrorist online activity is inspected manually by agents. Experimental results show that our ARA-based model is more attractive than the standard game-theoretic model as the former behaves satisfactorily without having to relax crucial hypotheses such as common knowledge and thus subtracting realism from the problem. Specifically, experiments on artificial data showed ARA would correctly predict 74% of the 5000 simulated scenarios, 59% of them corresponding to the case d * 1 = 1. In contrast, GT would yield 61% correct prediction, with 51% of the analyzed scenarios corresponding to d * 1 = 1. Future research lines include adopting other sequences and/or introducing new intermediate decisions to be taken into account (for example, changing the uncertainty node D2 to a decision node). For this, we can use other ARA templates and model new situations, in both sequential and simultaneous game dynamics.