A User-Friendly Algorithm for Detecting the Influence of Background Risks on a Model

Background, or systematic, risks are integral parts of many systems and models in insurance and finance. These risks can, for example, be economic in nature, or they can carry more technical connotations, such as errors or intrusions, which could be intentional or unintentional. A most natural question arises from the practical point of view: is the given system really affected by these risks? In this paper we offer an algorithm for answering this question, given input-output data and appropriately constructed statistics, which rely on the order statistics of inputs and the concomitants of outputs. Even though the idea is rooted in complex statistical and probabilistic considerations, the algorithm is easy to implement and use in practice, as illustrated using simulated data.


Introduction
Actuarial, financial, and economic literature is abundant with models and analyses of background, or systematic, risks that affect decision making (cf., e.g., Finkelshtain et al. 1999; Franke et al. 2006, 2011; Nachman 1982; Pratt 1998; Guo et al. 2018; Furman et al. 2018; and references therein). Various models have been proposed, including additive, multiplicative, and more intricate ones that couple underlying losses (or, generally speaking, inputs) with background risks. For recent far-reaching contributions to this area, we refer to Perote et al. (2015), Su (2016), Su and Furman (2017a, 2017b), Semenikhine et al. (2018), Guo et al. (2018), as well as to the extensive lists of references therein.
Systems and thus their models are prone to a myriad of intentional or unintentional disruptions, which could affect inputs and/or outputs. The literature on the topic is vast, and some of the recent contributions include those tackling deliberate intrusions (e.g., Cárdenas et al. 2011; Premathilaka et al. 2013) and false data injections (e.g., Liang et al. 2017). A number of sophisticated methods have been developed for tackling the problems (e.g., Huang et al. 2016; Onoda 2016; He et al. 2017; Potluri et al. 2017), to name a few.
Whether or not these risks affect the underlying input variables and thus decision making is a problem of immense interest. From the conceptual point of view, broadly speaking, two scenarios arise. First, if it is suspected that the outputs are affected, then testing whether or not this is indeed the case falls, in a sense, within the context of regression analysis, though additional statistical challenges arise (e.g., Perote and Perote-Peña 2004; Perote et al. 2015; Chen et al. 2018; Gribkova and Zitikis 2018). The second scenario, which is the main topic of the present paper, deals with the case when it is the inputs that are possibly affected by risks.
Statistically speaking, given the input and output random variables $X$ and $Y$, respectively, which in the risk-free scenario are connected by a "transfer" function $h$ via the equation

$$Y = h(X), \tag{1}$$

we wish to have an algorithm that would tell us whether risk-free model (1) is true or the risk-contaminated one

$$Y = h(X + \delta), \tag{2}$$

where $\delta$ is an exogenous risk, sometimes called input-reading error, that directly affects the input $X$ and thus, indirectly, the output variable as well. We note that Chen et al. (2018) consider model (1) with deterministic inputs, like those to be defined in Equation (3) below. Gribkova and Zitikis (2018) explore risk-free model (1), which can be viewed as the "null hypothesis" in the context of the present paper. Hence, model (2) can be viewed as the "alternative hypothesis," and the algorithm to be constructed and illustrated in this paper will distinguish between the two hypotheses.

The rest of the paper is organized as follows. In Section 2, we lay out the foundations for assessing the presence, or absence, of input-affecting risks. In Section 3, we describe the algorithm itself. It relies on two statistics whose roles, interrelationship, and asymptotic properties are presented in Sections 4 and 5. Section 6 concludes the paper with a brief overview of main findings.

The Model
Systems are usually associated with finite-length transfer windows, say $[a, b] \subset \mathbb{R}$, and also with transfer functions $h : [a, b] \to \mathbb{R}$. Let $X_1, \dots, X_n$ be input random variables, which we assume to be pre-whitened (e.g., Box et al. 2015), that is, independent and identically distributed (iid). Denote their marginal cumulative distribution function (cdf) by $F(x)$, whose support is the transfer window $[a, b]$. Hence, the input values are always in $[a, b]$. We assume that the cdf $F(x)$ is strictly increasing on the interval $[a, b]$, with $F(a) = 0$ and $F(b) = 1$. In fact, to simplify mathematics and still cover a wide variety of applications, we assume that the cdf is continuously differentiable and its probability density function (pdf) is bounded away from 0 on the transfer window $[a, b]$.
We assume that the input-affecting risks $\delta_1, \dots, \delta_n$ are pre-whitened, that is, iid random variables, and we also assume that they are independent of the input variables $X_1, \dots, X_n$ and affect their values in the additive way. The inputs $X_i$ take values in the interval $[a, b]$, but the risks $\delta_i$, being exogenous variables, are not restricted to any domain and can therefore take any real values. Our goal in this paper is to offer a practical way for detecting whether the risks are absent or present. The following two notes relate our research to topics in the statistical literature.
First, the problem that we tackle is different from that dealing with errors-in-variables, where observations already contain errors, whereas in our case, the inputs $X_i$ are uncontaminated but possibly become contaminated while being transferred into the filter, also known as the transmission channel in the engineering literature. That is, in the errors-in-variables scenario, we would observe $X_i + \delta_i$, whereas in the present context we observe the original inputs $X_i$ and want to know whether or not they are affected by $\delta_i$.
Second, there is a connection between our research and classical regression, and we have already noted contributions by Perote and Perote-Peña (2004) and Perote et al. (2015), where we also find extensive lists of related references. Namely, given the outputs $Y_i = h(X_i + \delta_i)$ and assuming for the sake of argument that the risks $\delta_i$ are small, the Taylor formula gives the approximation $Y_i \approx h(X_i) + h'(X_i)\delta_i$, which places the input-based scenario into the output-based scenario $Y_i = h(X_i) + \varepsilon_i$, but the risks $\varepsilon_i \approx h'(X_i)\delta_i$ depend on the inputs $X_i$ via the term $h'(X_i)$. This dependence feature presents a major hurdle, which we circumvent in our following considerations and produce a user-friendly algorithm for detecting the $\delta_i$'s when they are present.
Throughout the paper we assume that the transfer function $h(x)$ has a bounded and continuous first derivative, and we also assume that the derivative is not identically equal to 0, thus ruling out the trivial case of constant transfer functions. Actually, throughout the paper we also exclude the case $h(a) = h(b)$, which causes some technical complications but is hardly of practical relevance, as we shall explain in the next section. If, however, due to some considerations we would need to depart from these conditions, then there is room for relaxing the conditions, though naturally at the expense of more complex considerations.

The Algorithm
We first elaborate on the definition of outputs. Indeed, even though the $X_i$'s are in the transfer window $[a, b]$, the affected inputs $X_i + \delta_i$ may or may not be in $[a, b]$, which is the domain of definition of the transfer function $h(x)$. Hence, the actual outputs are

$$Y_i = g(X_i + \delta_i),$$

where $g(x) = h(\max\{a, \min\{x, b\}\})$.
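As a small illustration of the clamping just described, the sketch below builds g from a given h by forcing the argument back into the transfer window. The helper name `make_g` is hypothetical, and the quadratic h on [0, 1] is borrowed from the example at the end of this section purely for concreteness.

```python
def make_g(h, a, b):
    """Return g(x) = h(max(a, min(x, b))), i.e. h applied to x clamped to [a, b]."""
    def g(x):
        return h(max(a, min(x, b)))
    return g


# Illustrative transfer function on the window [0, 1] (an assumption for this sketch).
def h(x):
    return 1.0 - (x - 0.25) ** 2


g = make_g(h, 0.0, 1.0)

# Affected inputs outside the window are mapped to the boundary values of h.
print(g(-2.0) == h(0.0), g(2.0) == h(1.0), g(0.3) == h(0.3))
```

Inputs pushed below a therefore produce h(a) and inputs pushed above b produce h(b), which is the mechanism that the two-point example in the next section relies on.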
Since the cdf of $X$ is continuous, we can uniquely order the random variables $X_1, \dots, X_n$. The resulting order statistics $X_{1:n} < \cdots < X_{n:n}$ give rise to the concomitants $Y_{1,n}, \dots, Y_{n,n}$ (e.g., David and Nagaraja 2003). Based on them and using the notation $x_+ = \max\{x, 0\}$, we define the statistics

$$A_n = \sum_{i=2}^{n} (Y_{i,n} - Y_{i-1,n})_+$$

and

$$B_n = \sum_{i=2}^{n} |Y_{i,n} - Y_{i-1,n}|,$$

and then, in turn, their ratio

$$I_n = \frac{A_n}{B_n}.$$

The algorithm, to be introduced in a moment, for detecting input-affecting risks is based on the asymptotics, when $n$ gets large, of $I_n$ and $B_n$, which we call the pivot and its supporter, thus hinting at their main and supporting roles, respectively.

Before formulating the algorithm, we make the natural assumption that the risks, when they exist, should not be so large that the performance of the system would be derailed to such an extent that it becomes unnecessary to run any algorithm. For the purpose of rigour, in the following definition we summarize the circumstances under which there is ambiguity as to the absence, or presence, of input-reading risks, and thus employing the algorithm becomes warranted.
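The pivot and its supporter can be computed from raw input-output pairs in a few lines. The sketch below assumes that A_n is the sum of the positive parts, and B_n the sum of the absolute values, of the consecutive concomitant differences, with I_n their ratio; the function name `pivot_and_supporter` is hypothetical.

```python
def pivot_and_supporter(xs, ys):
    """Return (A_n, B_n, I_n) from paired inputs xs and outputs ys."""
    # Concomitants: the outputs re-ordered according to increasing inputs.
    concomitants = [y for _, y in sorted(zip(xs, ys), key=lambda pair: pair[0])]
    increments = [cur - prev for prev, cur in zip(concomitants, concomitants[1:])]
    a_n = sum(max(d, 0.0) for d in increments)   # positive parts only
    b_n = sum(abs(d) for d in increments)        # absolute values
    i_n = a_n / b_n if b_n > 0 else float("nan")
    return a_n, b_n, i_n


# Tiny usage example with made-up data: sorted by x, the outputs are 0, 2, 1.
print(pivot_and_supporter([3.0, 1.0, 2.0], [1.0, 0.0, 2.0]))
```

Sorting the pairs by input is the only step that couples inputs and outputs; everything else operates on the re-ordered outputs alone.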
Definition 1. The presence of input-affecting risk is suspected, and thus becomes a subject for testing, when it is believed that there is a set $T \subset [a, b]$ such that the event $X \in T$ has a (strictly) positive probability and, for all $x \in T$, the random variable $g(x + \delta)$ is non-degenerate, due to the random $\delta$.
We note at the outset that Definition 1 is a user-friendly reformulation of the technical-looking condition (10) to be presented in Section 5 below, where it plays a pivotal role in setting rigorous mathematical foundations for our algorithm. In this regard, we note that the condition is tightly tied to the indefinite growth of $B_n$ when the sample size $n$ grows, as we shall see in Theorem 3 below. Hence, if the subject-matter knowledge is not sufficiently convincing for the decision maker to see whether or not the circumstances delineated by Definition 1 hold, then data-based checking of the asymptotic behaviour of $B_n$ for large $n$ should clarify the situation.
Definition 1 implies that the system's output $Y = g(X + \delta)$ varies not just because of $X$ but also because of $\delta$, assuming of course that the latter is present, that is, is not degenerate at 0. This excludes, for example, the obvious situations when $g(x + \delta) = g(a)$ for every $x$ (i.e., when $-\delta > 0$ is very large), or when $g(x + \delta) = g(b)$ for every $x$ (i.e., when $\delta > 0$ is very large). In either of these extreme cases, the decision maker would immediately see the system's malfunction because of the outputs constantly lingering on, or near, the boundaries $g(a)$ and $g(b)$, and thus no special testing would be warranted.
We are now ready to formulate the algorithm for detecting the input-affecting risk when its presence is suspected.
Case 1: The pivot $I_n$ is not approaching 1/2.
(i) If $I_n$ decisively tends to a limit other than 1/2, then we advise the decision maker about the absence of the risk.
(ii) If $I_n$ seems to tend to a limit other than 1/2 but there is some doubt as to whether this is true, then we check if the supporter $B_n$ is asymptotically bounded, and if yes, then we advise the decision maker about the absence of the risk.
Case 2: The pivot $I_n$ is approaching 1/2.
(i) If the supporter $B_n$ tends to infinity, then we advise the decision maker about the presence of the risk.
(ii) If the supporter $B_n$ is asymptotically bounded, then $h(a)$ and $h(b)$ are likely to be insufficiently different to have already triggered Case 1 above, and we thus advise the decision maker about the absence of the risk.
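The two cases can be mimicked by a crude numerical rule, sketched below. This is only a heuristic reading of the algorithm: the tolerance `tol`, the growth factor `growth`, the convention that the statistics are evaluated along doubling sample sizes, and the extra "inconclusive" outcome are all assumptions introduced here for illustration, whereas the algorithm itself relies on the decision maker's judgement of the asymptotic patterns.

```python
def advise(i_path, b_path, tol=0.05, growth=1.5):
    """Heuristic reading of Cases 1 and 2; tol and growth are illustrative assumptions.

    i_path and b_path hold I_n and B_n evaluated at sample sizes n, 2n, 4n, ...
    """
    near_half = abs(i_path[-1] - 0.5) <= tol
    # Along doubling sample sizes, an unbounded supporter keeps growing by a sizeable factor.
    b_grows = b_path[-1] >= growth * b_path[-2]
    if near_half and b_grows:
        return "risk present"   # Case 2(i)
    if near_half:
        return "risk absent"    # Case 2(ii): supporter looks bounded
    if b_grows:
        return "inconclusive"   # not covered by the algorithm: collect more data
    return "risk absent"        # Case 1


print(advise([0.48, 0.49, 0.50], [100.0, 200.0, 400.0]))
print(advise([0.10, 0.10, 0.10], [0.62, 0.62, 0.62]))
```

In practice, plotting both statistics against n, as done in the illustration that follows, is more informative than any fixed threshold.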
In the next two sections, we present rigorous results upon which the above algorithm relies. We note in passing that irrespective of whether the algorithm detects risks or not, in either case we may still wish to double-check the findings. It can also be necessary to check the system's vulnerability (e.g., Hug and Giampapa 2012; and references therein). In such cases, we can use artificially constructed inputs, such as

We conclude this section with an example that shows how the algorithm works in practice. For this, let the transfer function be $h(x) = 1 - (x - 0.25)^2$ for $x \in [0, 1]$. Furthermore, upon recalling that the (unconditional) Lomax cdf is $1 - (1 + x/\beta)^{-\alpha}$ for $x \ge 0$, with shape and scale parameters $\alpha > 0$ and $\beta > 0$, we assume that the input $X$ follows the Lomax($\alpha$, $\beta$) distribution conditioned on the transfer interval $[0, 1]$. Throughout the illustration, we set $\alpha = 1.5$ and $\beta = 1$.
Let $\delta$ follow the normal distribution with mean 0 and standard deviation $\sigma$. In the risk-free case (i.e., $\sigma = 0$), the asymptotics of $I_n$ and $B_n$ is depicted in panels (a) and (b) of Figure 2, and when $\sigma = 0.1$, their asymptotics is depicted in panels (c) and (d). We also check the performance of the algorithm when the risk $\delta$ is discrete, specifically, when it is equal to $-2$ with probability 0.7 and to 2 with probability 0.3. The asymptotics of $I_n$ and $B_n$ in this case is depicted in panels (e) and (f) of Figure 2.

We see from the left-hand panels that the pivot $I_n$ converges to a limit other than 1/2 (i.e., to the value of $I_h$ to be defined by Equation (4) in the next section) only in the risk-free case. The increasing pattern of $B_n$ in panels (d) and (f) confirms the presence of input risk in both scenarios, which was initially detected by the pivot $I_n$ (due to its convergence to 1/2) in panels (c) and (e). Note that the convergence to 1/2 in panel (e) is decisive, whereas the convergence in panel (c) may not be so well pronounced, and thus the increasing pattern of $B_n$ in panel (d) provides reassurance.
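The illustration above can be imitated with a short Monte Carlo sketch using only the Python standard library. It assumes that the pivot is the ratio of the summed positive parts to the summed absolute values of consecutive concomitant differences; the seed, the sample sizes, and all helper names are choices made here and are not taken from the paper.

```python
import random

ALPHA, BETA = 1.5, 1.0                       # Lomax shape and scale from the text


def h(x):
    return 1.0 - (x - 0.25) ** 2             # transfer function on [0, 1]


def g(x):
    return h(max(0.0, min(x, 1.0)))          # clamp the affected input to the window


def sample_input(rng):
    """Draw X from Lomax(ALPHA, BETA) conditioned on [0, 1] by inverting its cdf."""
    f1 = 1.0 - (1.0 + 1.0 / BETA) ** (-ALPHA)    # unconditional cdf evaluated at 1
    p = rng.random() * f1
    return BETA * ((1.0 - p) ** (-1.0 / ALPHA) - 1.0)


def simulate(n, draw_risk, seed=0):
    """Return (I_n, B_n) for n simulated input-output pairs."""
    rng = random.Random(seed)
    xs = [sample_input(rng) for _ in range(n)]
    ys = [g(x + draw_risk(rng)) for x in xs]
    conc = [y for _, y in sorted(zip(xs, ys), key=lambda pair: pair[0])]
    diffs = [cur - prev for prev, cur in zip(conc, conc[1:])]
    b_n = sum(abs(d) for d in diffs)
    a_n = sum(max(d, 0.0) for d in diffs)
    return a_n / b_n, b_n


def no_risk(rng):
    return 0.0                               # risk-free case (sigma = 0)


def normal_risk(rng):
    return rng.gauss(0.0, 0.1)               # normal risk with sigma = 0.1


def discrete_risk(rng):
    return -2.0 if rng.random() < 0.7 else 2.0


for name, risk in [("risk-free", no_risk), ("normal", normal_risk), ("discrete", discrete_risk)]:
    for n in (500, 5000):
        i_n, b_n = simulate(n, risk)
        print(f"{name:9s} n={n:5d}  I_n={i_n:.3f}  B_n={b_n:.2f}")
```

Comparing the printed values across the two sample sizes gives a rough numerical counterpart of the panels: a supporter that stays put in the risk-free case and one that keeps growing under either risk.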

Asymptotics of the Pivot $I_n$
We begin with the case when the input-affecting risk is absent, and thus the system is functioning properly. This is the starting point of many works (e.g., Cárdenas et al. 2011, p. 360) dealing with intrusion detection (e.g., Debar et al. 1999; Premathilaka et al. 2013), false data injections (e.g., Liang et al. 2017), and other disruptions. Recall the notation $x_+ = \max\{x, 0\}$ for any $x \in \mathbb{R}$.
Theorem 1 (Gribkova and Zitikis 2018). If $\delta$ is absent, then, when $n \to \infty$, the pivot $I_n$ converges to

$$I_h = \frac{\int_a^b (h'(x))_+ \, dx}{\int_a^b |h'(x)| \, dx}. \tag{4}$$

For another perspective on the meaning of $I_h$, we refer to Davydov and Zitikis (2017), where $I_h$ arises as the solution to an optimization problem. The importance of Theorem 1 in the present paper follows from the fact that when the cdf $F_\delta$ is non-degenerate, then (details in Section 5 below) the pivot $I_n$ converges to 1/2 when $n \to \infty$. Of course, the limit 1/2 can also manifest when $\delta$ is absent, that is, in the context of Theorem 1, but this can happen only when $h(a) = h(b)$. Indeed, as it is easy to check using the equations $|x| = x_+ + x_-$ and $x = x_+ - x_-$ with $x_- = \max\{-x, 0\}$, we have $I_h = 1/2$ if and only if $h(a) = h(b)$. The latter property is, however, an exception rather than the rule: it manifests in such cases when, for example, the system is down and thus $h(x)$ takes the same value irrespective of $x \in [a, b]$. Hence, unless explicitly noted otherwise, throughout the paper we assume

$$h(a) \neq h(b), \tag{5}$$

as we have already mentioned earlier.
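The equivalence between $I_h = 1/2$ and $h(a) = h(b)$ can be written out in a few lines. The computation below is a sketch that assumes $I_h$ is the ratio of the integral of $(h'(x))_+$ to the integral of $|h'(x)|$ over $[a, b]$, and it uses only the two identities for $|x|$ and $x$ quoted above.

```latex
\begin{aligned}
\int_a^b |h'(x)|\,dx &= \int_a^b (h'(x))_+\,dx + \int_a^b (h'(x))_-\,dx, \\
h(b) - h(a) = \int_a^b h'(x)\,dx &= \int_a^b (h'(x))_+\,dx - \int_a^b (h'(x))_-\,dx, \\
\text{hence}\quad I_h &= \frac{1}{2} + \frac{h(b) - h(a)}{2\int_a^b |h'(x)|\,dx}.
\end{aligned}
```

For the illustrative transfer function $h(x) = 1 - (x - 0.25)^2$ on $[0, 1]$, we have $h(1) - h(0) = -0.5$ and $\int_0^1 |h'(x)|\,dx = 0.0625 + 0.5625 = 0.625$, so under the same assumption $I_h = 1/2 - 0.5/1.25 = 0.1$.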
We next discuss how to check whether or not the risk $\delta$ is degenerate. Naturally, in order to detect anomalies, the original state of the system has to be in reasonable working order (cf., e.g., Cárdenas et al. 2011, p. 360). Gribkova and Zitikis (2018) have put forward an argument in favour of the following definition.

Definition 2. A system is in reasonable working order whenever, in the absence of input-affecting risk (i.e., when $\delta = 0$ almost surely), the sequence $B_n$ is asymptotically bounded in probability. In mathematical terms, we write this as $B_n = O_P(1)$ when $n \to \infty$.
Given that in the absence of input-affecting risk we are exploring the asymptotic behaviour of the pivot $I_n$, which is the ratio of $A_n$ and $B_n$, both of which are asymptotically bounded in probability, the requirement $B_n = O_P(1)$ is natural. This can be seen from the following argument involving the mean-value theorem: for some $\xi_i$ between $X_{i-1:n}$ and $X_{i:n}$, where As a side-note, the right-hand side of bound (6) implies that, if needed, the boundedness of the first derivative of the transfer function can be relaxed and the system can still remain in reasonable working order, as per Definition 2.

We next present an example that shows what happens with the system when the input-affecting risk is present, that is, when the cdf $F_\delta$ is non-degenerate. Before starting the example, we recall (David and Nagaraja 2003) that the concomitants $Y_{1,n}, \dots, Y_{n,n}$ can be written as follows:

$$Y_{i,n} = g(X_{i:n} + \delta_{[i]}), \tag{7}$$

where $\delta_{[i]}$ is the random variable among $\delta_1, \dots, \delta_n$ that corresponds to $X_{i:n}$. As noted by David and Nagaraja (2003, p. 145), the random variables $\delta_{[1]}, \dots, \delta_{[n]}$ are iid and follow the cdf $F_\delta$ of the original risk $\delta$.
Example 1. Let $\delta$ take value $c > 0$ with probability $p \in (0, 1)$ and $-c$ with probability $1 - p$, and let $c \ge b - a$. The latter assumption implies that irrespective of the value of $X_{i:n}$, the value of $X_{i:n} + \delta_{[i]}$ is above $b$ with probability $p$ and below $a$ with probability $1 - p$. Hence, the concomitant $Y_{i,n}$ is equal to $h(b)$ with probability $p$ and to $h(a)$ with probability $1 - p$. Since each concomitant can take only two values, which implies Since the variables $\delta_{[1]}, \dots, \delta_{[n]}$ are iid and follow the same cdf $F_\delta$ as the original $\delta$, the mean From this we conclude that if $p$ is neither 0 nor 1, which we assume, and if $h(b) \neq h(a)$, which we also assume, then $B_n \overset{P}{\to} \infty$. Combining statements (8) and (9), we have $I_n = A_n/B_n \overset{P}{\to} 1/2$ when $n \to \infty$, which in turn implies that the system is affected by the risk. This concludes Example 1.
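The mechanics of Example 1 can be traced with a short worked computation. It is a sketch that assumes $A_n$ and $B_n$ are the sums of the positive parts and of the absolute values, respectively, of the consecutive concomitant differences $Y_{i,n} - Y_{i-1,n}$, $i = 2, \dots, n$, and that $h(a) \neq h(b)$; the intermediate lines need not coincide with statements (8) and (9).

```latex
\begin{aligned}
P\big(Y_{i,n} \neq Y_{i-1,n}\big) &= 2p(1-p), \\
E[B_n] &= (n-1)\, 2p(1-p)\, |h(b) - h(a)|, \\
2A_n - B_n &= \sum_{i=2}^{n} (Y_{i,n} - Y_{i-1,n}) = Y_{n,n} - Y_{1,n}, \\
I_n &= \frac{1}{2} + \frac{Y_{n,n} - Y_{1,n}}{2B_n}.
\end{aligned}
```

The first two lines use the independence of $\delta_{[1]}, \dots, \delta_{[n]}$, and the third uses $2x_+ - |x| = x$. Since $|Y_{n,n} - Y_{1,n}| \le |h(b) - h(a)|$ in this example, the last line shows that the pivot is pulled to 1/2 as soon as the supporter grows without bound.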
The above example has been constructed to show, in a somewhat dramatic way, what happens when the input-affecting risk pushes the input outside the transfer window, but the same conclusion can be reached under much weaker assumptions on $\delta$, as we shall show in the next section.

Growth of the Supporter $B_n$
The following general result plays a major role in the justification of the earlier presented algorithm.
Theorem 2 (Gribkova and Zitikis 2018). Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be independent copies of a generic random pair $(X, Y)$, with $X$ having continuous cdf and $Y$ having finite second moment. If where $x_s = F^{-1}(s)$ is the $s$th percentile of $X$, and $F^{-1}_{g(x_s+\delta)}(t)$ denotes the quantile function of the random variable $g(x_s + \delta)$.
Condition (10) arises naturally, but its formulation is not user friendly. Remarkably, its meaning is very simple and has already been conveyed in Definition 1. Before proving Theorem 3, we next illuminate the meaning of condition (10) by revisiting Example 1 through the lens of the condition.
Example 2. Let $\delta$ take value $c > 0$ with probability $p \in (0, 1)$ and $-c$ with probability $1 - p$, and let $c \ge b - a$. Since for every $s \in (0, 1)$ we have $x_s = F^{-1}(s) \in [a, b]$, the random variable $g(x_s + \delta)$ has the probability distribution

$$g(x_s + \delta) = \begin{cases} h(a) & \text{with probability } 1 - p, \\ h(b) & \text{with probability } p. \end{cases}$$
To obtain its quantile function, we start with the case $h(a) \le h(b)$ and have the formula Consequently, Analogous calculations when $h(a) \ge h(b)$ give the answer $p(1 - p)(h(a) - h(b))$, thus establishing the equation irrespective of the values of $h(a)$ and $h(b)$. We can therefore conclude that as long as $h(b) \neq h(a)$ and $p$ is neither 0 nor 1, condition (10) is satisfied. Thus, we have $B_n \overset{P}{\to} \infty$ according to Theorem 3.
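The calculation behind Example 2 can be sketched as follows for the case $h(a) \le h(b)$. The sketch assumes that condition (10) is built from the Gini-type integral of the quantile function against the weight $2t - 1$; that form is an assumption made here because it reproduces the answer $p(1 - p)(h(a) - h(b))$ quoted for the opposite case, and the exact statement of the condition may differ, for instance by a constant factor.

```latex
\begin{aligned}
F^{-1}_{g(x_s+\delta)}(t) &=
\begin{cases}
h(a), & 0 < t \le 1-p, \\
h(b), & 1-p < t < 1,
\end{cases} \\
\int_0^1 F^{-1}_{g(x_s+\delta)}(t)\,(2t-1)\,dt
&= h(a)\int_0^{1-p} (2t-1)\,dt + h(b)\int_{1-p}^{1} (2t-1)\,dt \\
&= -p(1-p)\,h(a) + p(1-p)\,h(b) \\
&= p(1-p)\,\big(h(b) - h(a)\big).
\end{aligned}
```

Here $\int_0^{1-p}(2t-1)\,dt = (1-p)^2 - (1-p) = -p(1-p)$, and the two pieces of the weight integrate to zero over $(0, 1)$. The result does not depend on $s$ and is strictly positive whenever $h(a) \neq h(b)$ and $0 < p < 1$.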
Proof of Theorem 3. We first show that if condition (10) is satisfied, then $B_n \overset{P}{\to} \infty$. We start with the bound Since the transfer function $h(x)$ has a bounded derivative on the interval $[a, b]$, the function $g(x)$ is Lipschitz continuous on the entire real line, that is, $|g(x) - g(y)| \le \|h'\| \, |x - y|$ for all $x, y \in \mathbb{R}$, with $\|h'\|$ denoting the supremum of $|h'(x)|$ over $x \in [a, b]$. Continuing with bound (11), we have because (i) the inputs $X_i$ and the risks $\delta_i$ are independent, (ii) the inputs $X_i$ have the same cdf $F$, and (iii) the risks $\delta_i$ have the same cdf $F_\delta$. Hence, if the expectation on the right-hand side of bound (12) does not vanish, then we must have $B_n \overset{P}{\to} \infty$ when $n \to \infty$. The proof of the converse (i.e., if $B_n \overset{P}{\to} \infty$, then condition (13) is satisfied) follows from the same arguments but now with "+" instead of "−" and the reversed inequalities in bounds (11) and (12).
The right-most equation holds due to the well-known representation of the Gini mean difference as a Choquet integral (e.g., Giorgi 1993; Yitzhaki and Schechtman 2013; Furman et al. 2017; and references therein). We conclude with the note that the Gini mean difference is known to be (strictly) positive whenever the underlying random variable is non-degenerate, which in our case is $g(x + \delta)$. Hence, by assuming non-degeneracy of $g(x + \delta)$ for every $x \in T \subseteq [a, b]$ such that $P[X \in T] > 0$, we arrive at condition (13) and thus, in turn, at (10). The proof of Theorem 3 is finished.
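The positivity property of the Gini mean difference used at the end of the proof is easy to check numerically on a sample. The sketch below computes the sample version in two ways: directly from all pairs, and from the sorted sample with weights 2i − n − 1, which is the standard finite-sample counterpart of integrating a quantile function against a weight of the form 2t − 1. The function names are hypothetical and the data are made up.

```python
from itertools import combinations


def gini_mean_difference(values):
    """Average of |z_i - z_j| over all pairs i < j."""
    pairs = list(combinations(values, 2))
    return sum(abs(u - v) for u, v in pairs) / len(pairs)


def gini_via_order_statistics(values):
    """Same quantity computed from the sorted sample with weights 2i - n - 1."""
    z = sorted(values)
    n = len(z)
    weighted = sum((2 * i - n - 1) * zi for i, zi in enumerate(z, start=1))
    return 2.0 * weighted / (n * (n - 1))


sample = [1.0, 2.0, 4.0]
print(gini_mean_difference(sample), gini_via_order_statistics(sample))
print(gini_mean_difference([3.0, 3.0, 3.0]))   # a degenerate sample
```

Both routines agree on any sample, and the value is zero exactly when all observations coincide, mirroring the non-degeneracy requirement of Definition 1.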

Concluding Notes
The need for an algorithm that distinguishes between the "null hypothesis" $Y = g(X)$ and the "alternative" $Y = g(X + \delta)$ for exogenous background risk $\delta$ arises in many problems of economics, insurance, and finance. In the present paper, we have developed a user-friendly algorithm for distinguishing between the aforementioned two hypotheses. The algorithm is based on the asymptotic behaviour of two statistics: the pivot $I_n$ and its supporter $B_n$, which are constructed using the order statistics of inputs and the corresponding concomitants of outputs. We have supplemented our theoretical considerations with illustrative examples, graphs, and discussions, thus facilitating the use of the algorithm in practice.
As we have noted in the Introduction, practical considerations give rise to alternatives which couple X and δ not just in the additive way but possibly in a more intricate way, which we generally formulate as Y = g(X, δ).In this regard we also note that X and δ might be dependent random variables, and even multivariate ones, thus giving rise to a highly non-trivial follow-up problem.
Figure 2. The risk-detection algorithm via the asymptotics of the pivot $I_n$ and its supporter $B_n$, with the horizontal line in the three left-hand panels at the height of $I_h$ defined by Equation (4). Panels: (a) $I_n$ in the risk-free case; (b) $B_n$ in the risk-free case; (c) $I_n$ for normal(0, 0.1) risk; (d) $B_n$ for normal(0, 0.1) risk; (e) $I_n$ for discrete ±2 risk; (f) $B_n$ for discrete ±2 risk.
We know from statement (6) and the arguments around it that if $\delta$ is degenerate, then $B_n \overset{P}{\to} \infty$ cannot be true. In the next theorem, we give a necessary and sufficient condition for $B_n \overset{P}{\to} \infty$.