A Framework for Analysis and Prediction of Operational Risk Stress

Abstract: A model for financial stress testing and stability analysis is presented. Given operational risk loss data within a time window, short-term projections are made using Loess fits to sequences of lognormal parameters. The projections can be scaled by a sequence of risk factors, derived from economic data in response to international regulatory requirements. Historic and projected loss data are combined using a lengthy nonlinear algorithm to calculate a capital reserve for the upcoming year. The model is embedded in a general framework, in which arrays of risk factors can be swapped in and out to assess their effect on the projected losses. Risk factor scaling is varied to assess the resilience and stability of financial institutions to economic shock. Symbolic analysis of projected losses shows that they are well-conditioned with respect to risk factors. Specific reference is made to the effect of the 2020 COVID-19 pandemic. For a 1-year projection, the framework indicates a requirement for an increase in regulatory capital of approximately 3% for mild stress, 8% for moderate stress, and 32% for extreme stress. The proposed framework is significant because it is the first formal methodology to link financial risk with economic factors in an objective way without recourse to correlations.


Introduction
Every year, financial institutions (banks, insurance companies, financial advisors, etc.) have to demonstrate that they are resilient to adverse economic conditions. To do that, they are required to calculate what level of capital reserves would be necessary for the upcoming year. The requirements are specified in central bank publications such as from the Bank of England (BoE) [1], the European Central Bank (ECB) [2], or the Federal Reserve Bank (Fed) [3]. The following quote is taken from the introduction in the BoE's stress test guidance [1]: "The main purpose of the stress-testing framework is to provide a forward-looking, quantitative assessment of the capital adequacy of the UK banking system as a whole, and individual institutions within it." The particular terms forward-looking and quantitative are important for the analysis presented in this paper. Our stress framework incorporates both of those requirements.
The regulations concentrate on what financial instruments should be included and on the operational principles involved (data security, data collections, time deadlines, etc.). They say nothing about how stress testing should be conducted. The purpose of this paper is to provide a general-purpose framework that details not only how stress testing can be done but also a mathematical basis and practical steps for stress testing. The methodology presented is based on a principle that, under conditions of economic stress, there is a need to retain sufficient capital to withstand potential shocks to the banking system, whether or not any direct causal relationship exists between economic conditions and financial performance. Retaining capital is therefore effectively an insurance.

Operational Risk: A Brief Overview
The context considered in this paper is operational risk, which arises from adverse events that result in monetary loss. Operational risk may be summarised in the following definition from the Bank for International Settlements [4]: "The risk of loss resulting from inadequate or failed internal processes, people and systems or from external events". Each operational risk loss is a charge against the profit on a balance sheet and is fixed in time (although subsequent error corrections do occur). Operational risk falls in the category of non-financial risk and is therefore distinct from the principal components of financial risk: market, credit, and liquidity. The essential distinction is that financial risk arises from investment and trading, whereas non-financial risk arises from anything else (reputation, regulatory environment, legal events, conduct, physical events, decision-making, etc.). A useful categorisation of events that result in operational risk may be found at https://www.wallstreetmojo.com/operational-risks/ (accessed on 6 January 2021): human and technical error, fraud, uncontrollable events (such as physical damage and power outages), and process flow (procedural errors). The only way to manage operational risk is by preventing adverse events from occurring (e.g., acting within the law) or by minimising the effects if they do occur (e.g., mitigating the amount of fraud). When losses do occur, their values range from a few pounds (or dollars or euros) to multi-millions.

Acronyms and Abbreviations
The following acronyms and abbreviations are used in this paper. Most are in common usage in the field of operational risk.
• GoF: Goodness-of-Fit (in the context of statistical distributions)
• TNA: The Transformed Normal 'A' goodness-of-fit test, discussed in reference [5]
Less used acronyms and abbreviations are introduced within the main text. The categorisations introduced in Section 1.3 are only used incidentally.

Operational Risk: Categorisation
OpRisk capital is usually calculated in terms of value-at-risk (VaR [6]). VaR is defined as the maximum monetary amount expected to be lost over a given time horizon at a pre-defined confidence level. The 99.9% confidence level is specified worldwide as the standard benchmark for evaluating financial risk. VaR at 99.9% is often referred to as regulatory capital.
In this paper, the term capital is used to mean "value-at-risk of historic OpRisk losses at 99.9%" and a 3-month (i.e., 1/4-year) time horizon is referred to as a "quarter".
OpRisk losses are classified worldwide into 7 categories, as specified originally in a data collection exercise from the Bank for International Settlements (the "Basel Committee") [7]. The categories are Internal Fraud (IF); External Fraud (EF); Employment Practices and Workplace Safety (EPWS); Clients, Products, and Business Practices (CPBP); Damage to Physical Assets (DPA); Business Disruption and System Failures (BDSF); and Execution, Delivery, and Process Management (EDPM). Of these, CPBP is often treated in a different way from the others because it tends to contain exceptionally large losses, which distort calculated capital values unacceptably. See, for example, [8] or [9]. Consequently, that category is excluded from this analysis. Losses in the others are aggregated to provide sufficiently large samples. In this paper, that aggregation is referred to as risk category nonCPBP.

Operational Risk: Measurement
The most common procedure for measuring VaR is a Monte Carlo method known as the Loss Distribution Approach (LDA) [10], which is a convolution of frequency and severity distributions. After fitting a (usually fat-tailed) severity distribution to the data, there are three steps in the LDA. First, a single draw is made from a Poisson distribution with a parameter equal to the mean annual loss frequency. That determines the number of draws, N, for the next step. In step 2, N draws are taken from the fitted severity distribution. The sum of those draws represents a potential total loss for the upcoming year. Steps 1 and 2 are repeated multiple times (typically 1 million), giving a population of total annual losses, L. In step 3, for a P% confidence level, the minimum loss that is larger than the Pth percentile of L is identified and is designated "P% VaR". The LDA process is applicable for any severity distribution, which is why it is in common use. However, each complete Monte Carlo analysis can take up to 10 minutes if sampling is slow and the sample size is large.
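The three LDA steps can be sketched in Python (the paper's own calculations used R; function names and parameter values below are illustrative only, and the pure-Python Poisson sampler is our own stand-in for a library routine):

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's method for a Poisson draw (adequate for moderate rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1

def lda_var(mu, sigma, annual_freq, confidence=0.999, n_sims=100_000, seed=42):
    """Loss Distribution Approach: convolve a Poisson frequency distribution
    with a lognormal severity; return VaR at the given confidence level."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n = poisson_draw(rng, annual_freq)  # step 1: annual loss count
        # step 2: sum of n severity draws = one simulated annual total loss
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n)))
    totals.sort()
    # step 3: smallest simulated annual loss at or above the P-th percentile
    return totals[min(int(confidence * len(totals)), len(totals) - 1)]
```

Because the tail percentile drives the result, VaR at 99.9% is necessarily at least as large as VaR at 99% on the same simulated population.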

Literature Review
We first summarise general approaches to stress testing and then review correlation relationships, since they are relevant for the Fed's stress testing methodology.

Stress Testing Regulatory History
The early history of stress testing is traced in a general publication from the Financial Stability Institute (part of the Bank for International Settlements) [11]. Stress testing appeared in the early 1990s and was mainly developed by individual banks for internal risk management and to evaluate a bank's trading activities [12]. A more coherent approach in the context of trading portfolios started with the 1996 "Basel Committee" regulations [13]. Regulation increased markedly following the 2008 financial crisis, mainly aimed at specifying what institutions and products should be regulated. The 2018 Bank for International Settlements publication [14] reiterated the aims and objectives of stress testing but without specific details on how tests should be conducted.

Financial Stress Testing Approaches
Regulatory guidance in the UK and EU documents [1] and [2] is circulated to financial institutions with historic and projected economic data on, among others, GDP, residential property prices, commercial real estate prices, the unemployment rate, foreign exchange rates, and interest rates. The data are intended to be used in conjunction with financial data, but no guidance is given as to how. There is no widely accepted categorisation for stress testing methods. Axtria Inc. [15] partitions approaches into "Parameter Stressing" and "Risk Driver Stressing"; our approach corresponds to a mixture of the two. Otherwise, approaches may be classified by their mathematical or statistical approach.

Forward Stress Testing
The basis of "Forward" stress testing is set out explicitly by Drehmann [16]. The process comprises four stages: scenario generation, developing risk factors, calculating exposures, and measuring the resulting risk. Stages 2 and 3 of that sequence may be summarised by Equation (1), V_t = V(x_t, λ_t), in which V represents an algorithm that calculates V_t, the capital at time t. The arguments of V are two vectors, both at time t: losses x_t and stress factors λ_t. The precise way in which they are used remains to be defined.
A more specific version of Equation (1) will be given in Section 3, where the subject of this paper is introduced.

Reverse Stress Testing
In reverse stress testing [17], a desired level of stress in the target metric is decided in advance, and the data transformations necessary to achieve that stress level are calculated. The overall procedure can be cast as an optimisation problem (Equation (2)): find the scenario ω, taken from a set of possible scenarios Ω, that minimises the difference between V̂, the target value of a metric (such as capital), and E(V(t) | ω), the expected value of the metric at time t conditional on ω. In the context of operational risk, a binary search is an efficient optimisation method.
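The binary search can be sketched in Python; `capital_fn` is a hypothetical stand-in for the capital calculation as a function of a scalar stress scaling, assumed monotonically increasing on the bracketing interval:

```python
def reverse_stress(capital_fn, target, lo=0.0, hi=10.0, tol=1e-6):
    """Binary search for the stress scaling at which capital_fn reaches
    the target capital. Assumes capital_fn is monotonically increasing
    in the scaling and that the target is bracketed by [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if capital_fn(mid) < target:
            lo = mid      # target lies above: search the upper half
        else:
            hi = mid      # target lies at or below: search the lower half
    return (lo + hi) / 2.0
```

For a VaR-type metric each evaluation of `capital_fn` is stochastic, so in practice one would average several Monte Carlo runs per evaluation to keep the search stable.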

Bayesian Stress Testing
A less-used stress testing methodology that is conceptually different from those already mentioned uses Bayesian nets and is discussed in [18]. Bayes nets use a predetermined network representing all events within a given scenario. They rely on the factorisation of the joint distribution of n variables given in Equation (3): f(x_1, x_2, ..., x_n) = ∏_i f(x_i | π_i), in which f(x_i | π_i) denotes the distribution of x_i given its parent nodes π_i. The conditional probability of the root node coincides with its marginal.
In practice, it is quick and easy to amend conditional probabilities in Bayes nets. However, they are often very difficult to quantify and structural changes to the network necessitate probability re-calculations.
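The chain-rule factorisation, and the ease of amending conditional probabilities, can be illustrated with a two-node toy net in Python (node names and probability values are hypothetical, not taken from [18]):

```python
# Toy Bayes net: Shock -> Loss. Names and probabilities are hypothetical.
p_shock = {True: 0.1, False: 0.9}                      # root node: marginal
p_loss_given_shock = {True: {True: 0.6, False: 0.4},   # CPT of Loss given Shock
                      False: {True: 0.05, False: 0.95}}

def joint(shock, loss):
    """f(shock, loss) = f(shock) * f(loss | shock), per the chain rule."""
    return p_shock[shock] * p_loss_given_shock[shock][loss]

# Amending a conditional probability is a single table update ...
p_loss_given_shock[True] = {True: 0.7, False: 0.3}
# ... whereas adding or removing a node would change the factorisation
# itself, forcing the probability re-calculations mentioned in the text.
```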

Correlation of Operational Risk Losses with Economic Data
Regulatory guidance stresses the need for banks to maintain adequate capital reserves in the event of economic downturn. Consequently, some effort has been made to seek correlations between financial and economic data. In the context of OpRisk, those efforts have largely failed. Prior to the 2008 financial crisis, there was a lack of data, and authors had to use either aggregations from multiple banks [19], proxy risk data [20], or a restricted range of economic data [21]. In all those cases, only a few isolated correlations were found in particular cases. Post-2008, the situation had not changed. Three papers by Cope et al. [22][23][24] reported similar isolated results. The third of these [24] is significant because it summarises the situation prior to 2013. The main findings are listed below.

1. There were increases in OpRisk losses during the 2008 crisis period, mostly for external fraud and process management.
2. There was a decrease in internal fraud during the same period.
3. In each risk category considered, there was a notable decline in OpRisk losses between 2002 and 2011, with an upwards jump in 2008.
4. A conclusion that the effects of an economic shock on OpRisk losses are not persistent.
This last point is important because it shows that the one case considered resulted in a case of shock followed by lengthy relaxation to an "ambient" state. Correlations were not explicit.
Very little has been reported in the years 2014-2018. Financial Stability Reports [2,25-27] give an overall assessment that the UK banking system is resilient to stressed economic conditions, despite the COVID-19 pandemic in the latter case. Each implies a link between OpRisk losses and economic stress, but correlations were not made. Curti et al. [28] summarised the situation in recent years with the comment that researchers "have struggled to find meaningful relationships between operational losses and the macroeconomy".

The US Federal Reserve Board's Regression Model
The degree of detail in the stress testing documentation from the Fed [3,29] is very different from that of their European counterparts. The Dodd-Frank Act Stress Test specifies details of the loss distribution model to be used and, significantly, how a regression of OpRisk losses against economic features should be made. The model is specified by the Fed to be implemented by regulated firms with their own data. There are four components, the last three of which are:
2. A predictor of future losses to account for potential significant and infrequent events;
3. A linear regression using a set of macroeconomic factors as explanatory variables, with industry-aggregated operational-risk losses as the regressand;
4. Projected losses allocated to firms based on their size.
The third stage shows that an assumption of economic factor/OpRisk loss correlation is an integral component of the Fed's model. Evidence in the literature and our own evidence (see Section 4.4) do not support this view. Our alternative is described in Section 3.

Operational Risk Stress Testing in Practice
Regulatory documentation continues to say very little about what banks should actually do in practice (as in [1]). The only "directives" are to consider macroeconomic scenarios and to review correlations of economic scenarios with OpRisk losses. ECB "directives" are similar. The only specific requirement in [2] is clause 389, which specifies that OpRisk predictions under stressed conditions should be not less than an average of historic OpRisk losses and that there should be no reduction relative to the current year.
In practice, banks use a wide range of methods to account for stressed conditions. Overall, the approach to OpRisk stress testing is similar to that used for credit and market risk stress testing. The major steps are scenario design, followed by model implementation, followed by predictions, and lastly feedback with changed conditions (as in [11]). There is often a concentration on data quality and data availability. For OpRisk management, the feedback component can provide indications on what conditions might reduce the severity or frequency of OpRisk losses. For example, a likely increase in fraud might prompt a bank to tighten fraud-prevention measures.
Some OpRisk-specific practices are listed below.
• Many banks take the BoE guidance literally and base predictions on economic scenario correlations, with the addition of the ECB lower limit. That approach would not be adequate for severe stress.
• Others use the LDA method described in Section 1.4 and supplement historic OpRisk losses with additional "synthetic losses" that represent scenarios. That approach necessitates a reduction in the array of economic data (as supplied by a regulatory authority) to a short list of impacts that can be used in an LDA process. The idea is to consider specific events that could lead to a consequential total loss via all viable pathways. Fault Tree Analysis is a common technique used to trace and value those paths. The task is usually performed by specialist subject matter experts, and a discussion may be found in [31]. There is a brief description of the "Impact-Horizon" form of such scenarios in Section 3.3. The horizon would be set to 1 year, for which the interpretation is that at least one loss with the stated impact or more is expected in the next year.
• A common way to implement known but as yet unrealised OpRisk liabilities (such as anticipated provisions) is to regard the liability as a realised loss and to recalculate capital on that basis. The same often applies to "near miss" losses, which did not actually result in an OpRisk loss but could have done with high probability. An example is a serious software error that was identified just as new software was about to be installed and was then corrected just before installation.
More generally, some initiatives are under way that may change OpRisk stress testing procedures in the future [32]. The Dutch banks ABN, ING, and Rabobank, in conjunction with the US quantum developer Zapata Computing, are jointly exploring the use of quantum computing for stress testing. If successful, it should be possible to run many more Monte Carlo analyses. Independently, aiming to improve CCAR, the American Bankers Association (ABA) is seeking to standardize the way in which risk drivers are set for a variety of OpRisk situations (e.g., cyber crime). The ABA sees this as a solution to their view that the Fed consistently overshoots banks' own loss forecasts.

Method: Forward Stress Framework
The subject of this paper is a generalised framework that combines OpRisk losses and stress factors in an ordered way based on relevant stress sources. We refer to it as the Forward Stress Framework (FSF). Its development is prompted by the lack of evidence of correlations between OpRisk losses and economic factors. Instead, we propose that a prudent strategy is to build capital reserves to withstand possible effects of economic stress but without assuming a need for correlations.
The important aspect of "prediction" in the context of stress testing is to compare unstressed losses with losses that have been stressed by appropriate scenarios (such as worsening economic conditions, increasing cyber crime, decreasing reputation, etc.). Although predictions under stressed scenarios can be viewed in isolation, it is the change relative to the corresponding unstressed predictions that matters. Therefore, we provide "no stress" predictions to compare with the stressed cases.
The FSF can be described by Equation (4), which is a more precise form of the generalised "Forward" stress testing Equation (1): V̂_{t+1} = V(λ_t · x̂_t, x_t). The function V takes two arguments. The first is the scalar product of the stressed predicted losses x̂_t and the stress factor vector λ_t; the second is the vector of historic losses x_t. The result, for the next time slot t + 1, is the capital V̂_{t+1}.
The stress factor vector λ t is an element-wise product of J (> 0) individual stress factors λ t,j , each derived from an appropriate source.

Forward Stress Framework: Operation
We attempt to predict the future losses x̂_t using time series derived from historic data. In [33], we found that predicting 3 months into the future was sufficiently accurate in most cases. Suppose that n quarters t_1, t_2, ..., t_n are available and that "windows" of r consecutive quarters are taken, giving n − r + 1 such windows. A window starting at time t with length r quarters ends at quarter t + r − 1. Within that window, we assume that losses are lognormally distributed, and we estimate lognormal fit parameters µ_t and σ_t. The window advances by one quarter at a time, and successive parameter estimates for each window define the time series used for predictions of future data. Figure 1 shows the window starting at time t within the total n quarters. The yellow highlight shows a window spanning the period t to t + r − 1, for which the fitted lognormal parameters are µ_t and σ_t. Windows that use only historic data are indicated in black, and windows that use some predicted data are indicated in blue.
Algorithm 1 shows the overall FSF process, projecting m quarters into the future.

Algorithm 1: The FSF process.
1. Fit a lognormal distribution to successive windows of length r using historic data, so deriving µ- and σ-parameter time series, both of length n − r + 1.
2. For each quarter t from n to n + m:
   (a) Predict lognormal parameters µ_t and σ_t.
   (b) Generate a random sample, x̂_t, of size d_t (the number of days in quarter t) from a lognormal distribution with parameters µ_t and σ_t.
   (c) Stress the generated losses using a stress factor vector, so deriving λ_t · x̂_t.
   (d) Calculate capital, V̂_{t+1}, using the stressed losses with losses from the preceding r − 1 quarters.
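The FSF loop can be sketched in Python (the paper's implementation is in R; here the prediction step (a) is a naive refit of the last window rather than the Loess/mean-reversion predictor of Section 3.1, the window length is fixed at 4 quarters, and all names are ours):

```python
import math
import random
import statistics

def fit_lognormal(losses):
    """MLE lognormal fit: mean and (population) stdev of log-losses."""
    logs = [math.log(x) for x in losses]
    return statistics.mean(logs), statistics.pstdev(logs)

def fsf_project(quarters, m, stress_factors, days_per_quarter=91, seed=1):
    """Project m quarters ahead from a list of historic quarters
    (each a list of daily losses), scaling each projected quarter
    by its stress factor and feeding it back into the window."""
    rng = random.Random(seed)
    history = [list(q) for q in quarters]
    projections = []
    for t in range(m):
        # (a) Naive stand-in for parameter prediction: refit the last window.
        mu, sigma = fit_lognormal([x for q in history[-4:] for x in q])
        # (b) Generate one quarter of losses from the fitted lognormal.
        generated = [rng.lognormvariate(mu, sigma) for _ in range(days_per_quarter)]
        # (c) Stress element-wise by this quarter's stress factor.
        stressed = [stress_factors[t] * x for x in generated]
        projections.append(stressed)
        # (d) The stressed quarter joins the window for the next iteration
        #     (capital would be calculated here, e.g., with an LDA).
        history.append(stressed)
    return projections
```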

Parameter Prediction
The last window that can be formed from the historic data spans the period t n−r+1 to t n . The corresponding time series are then used to predict the next pair of lognormal parameters µ n+1 and σ n+1 . The prediction is made using Loess fits to the µ and σ time series plus a "mean reversion" term to counter the trend continuation tendency of Loess. Therefore, if L s (X) represents the Loess function with span parameter s applied to a time series argument X and E(X) is the expected value of the argument X, then we can define a linear combination operator αL s + (1 − α)E (α ∈ [0, 1]) that acts on X. We found that α = 0.5 produced credible results. With that operator, the µ and σ predictors for quarter n + 1 are given in Equation (5).
The predictors µ̂_{n+1} and σ̂_{n+1} can then be used to generate a random sample of lognormally distributed losses for quarter t_{n+1}. The sample size is set to the number of days, d_{n+1}, in that quarter. Capital can then be calculated for the combination of the losses generated for quarter t_{n+1} (which can be stressed) with historic losses from the preceding r − 1 quarters. The window then advances by one quarter at a time.

Stress Factor Calculation
It is assumed that historic stress-type values are supplied externally (for example, by a regulatory authority). It is likely that predicted stress type values will also be supplied.
To avoid problems of infinite relative change in a stress type caused by a move from a zero value in the previous quarter, all features are first normalised (on a feature-by-feature basis) to the range [1, 10]. The precise range is not important, provided that it does not include zero. To calculate the stress factors λ_t in Equation (1), the starting point for stress type j is a time series of m values: S_j = {S_{j,n+1}, S_{j,n+2}, ..., S_{j,n+m}} (with j = 1, 2, ..., J). Denote the normalisation function by N and the normalised stress type by S̃_j = N(S_j). The individual stress factors λ_{j,t} are then calculated from the quarter-on-quarter changes in the normalised values (the first two parts of Equation (6)), and their product over j gives the stress factor vector (the third part of Equation (6)).
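Our reading of the calculation can be sketched as follows (the quarter-on-quarter ratio form of the individual factors is our interpretation of Equation (6); function names are ours):

```python
def normalise(series, lo=1.0, hi=10.0):
    """Map a series linearly onto [lo, hi]; the range excludes zero
    by construction, so later ratios are always finite."""
    mn, mx = min(series), max(series)
    if mx == mn:
        return [(lo + hi) / 2.0] * len(series)
    return [lo + (hi - lo) * (s - mn) / (mx - mn) for s in series]

def stress_factors(stress_types):
    """Per-quarter stress factor: product over stress types of the
    quarter-on-quarter ratio of normalised values."""
    normed = [normalise(s) for s in stress_types]
    n_quarters = len(stress_types[0])
    factors = []
    for t in range(1, n_quarters):
        lam = 1.0
        for s in normed:
            lam *= s[t] / s[t - 1]   # relative change for one stress type
        factors.append(lam)
    return factors
```

A flat stress type contributes a factor of 1 (no stress); a rising one contributes a factor above 1. The floor of 0.9 and cap of 3.0 mentioned in Section 3.4 would be applied to the resulting factors.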

Theoretical Basis of the FSF
The theoretical basis of the FSF is a second order Taylor approximation for capital. In practice, the implementation is encapsulated by Equation (6). Each term in it represents some part of the calculation.
The Taylor expansion is set up in the following way, using an extension of the notation introduced in Section 3 that better fits the usual "Taylor" notation. Let V(x, t) be the capital at time t for a set of losses x subject to a set of J stress factors S = {S_1, S_2, ..., S_J}. Then, a second-order Taylor expansion of V(x, t) is given by Equation (7), in which O(3) denotes all terms of order 3 or more in the increments δx_j and δt.
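For reference, a plausible reconstruction of Equation (7), assembled from the terms discussed in the following paragraphs (the exact typesetting in the original may differ):

```latex
V(\mathbf{x} + \delta\mathbf{x},\, t + \delta t) = V(\mathbf{x}, t)
  + \sum_{j=1}^{J} \frac{\partial V}{\partial x_j}\,\delta x_j
  + \frac{\partial V}{\partial t}\,\delta t
  + \frac{1}{2}\sum_{j=1}^{J}\sum_{k=1}^{J}
      \frac{\partial^2 V}{\partial x_j \partial x_k}\,\delta x_j\,\delta x_k
  + \sum_{j=1}^{J} \frac{\partial^2 V}{\partial x_j \partial t}\,\delta x_j\,\delta t
  + \frac{1}{2}\frac{\partial^2 V}{\partial t^2}\,(\delta t)^2
  + O(3)
```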
The first-order term in Equation (7) represents the effect of a change in V due to a change in data. That data change is calculated using the Loess estimators for the period δt. The result is the termx t in Equation (4). The two steps are as follows:

1.
A calculation of the estimated lognormal parametersμ δt andσ δt , the details of the discrete version of which are in Equation (5). Nominally, δt is one quarter.

2.
Generation of a random sample of size d δt lognormal losses usingμ δt andσ δt , where d δt is the number of days in the period δt.
The other first-order term in Equation (7), the term in δt, represents the effect of a change in V due to a time progression of stress. It uses the generated losses x̂_t and is another two-step process.
The second-order terms in Equation (7) represent the effect of a pair of stress types on V. Each corresponds to the product of a pair of stress factors λ_{j,t} λ_{k,t} in Equation (6). In practice, more than two stress types are possible, and in that case, the interpretation of these second-order terms is a recursive application of product pairs. The second-order partial derivative terms involving time, δx_j δt and (δt)^2, are treated as non-material and are not modelled in Equation (6). Similarly, the third- and higher-order terms are also treated as non-material.

Stress Factor Types
The FSF was originally designed to accommodate the economic data provided by the Bank of England [1]. The BoE data is supplied in spreadsheet form, from which stress factors can be derived using the method described in Section 3.2. Consequently, a similar pattern was adopted for data from other sources. The BoE data comprises time series (both historic and predicted) for economic variables such as GDP, unemployment, and oil price for the UK and for other jurisdictions (the EU, the USA, China, etc.). Practitioners can select whichever they consider to be relevant and can add others if they choose. A lower limit of 0.9 can be imposed on stress factor values to prevent excessive "de-stressing". Similarly, an upper limit of 3.0 prevents excessive stress. In principle, any data series can be used to calculate stress factors, provided that the data are relevant. We have considered, for example, cyber crime using the Javelin database [34,35] and the UK Office for National Statistics [36]. With increasing awareness of climate change, Haustein's HadCRUT4-CW (Anthropogenic Warming Index) data [37] on global warming are available. In practice, global warming stress factors are only marginally greater than 1 and thus supply virtually no stress in the short term.
OpRisk scenarios are usually expressed in "Impact-Horizon" form, such as an impact of 50 million and a horizon of 25 years. This would normally be expressed as "50 million, 1-in-25". To fit the FSF paradigm, this format must be translated to the required stress factor format (Equation (6)). Appendix C suggests a method to do that by calculating a probability that, for a 1-in-H scenario with impact M, there will be at least one loss of value M or more in the next H years.
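One plausible form of that translation, assuming the 1-in-H scenario implies Poisson arrivals of qualifying losses at a rate of 1/H per year (this assumption is ours, not necessarily that of Appendix C):

```python
import math

def prob_at_least_one(horizon_years, annual_rate=None):
    """P(at least one loss of the stated impact or more within the horizon),
    under a Poisson arrival model. For a 1-in-H scenario the default
    annual rate is 1/H, giving P = 1 - exp(-1) over the full horizon."""
    rate = annual_rate if annual_rate is not None else 1.0 / horizon_years
    return 1.0 - math.exp(-rate * horizon_years)
```

With a 1-year horizon (as in Section 2.5), the same formula gives the probability that at least one such loss occurs in the next year.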
The stress factor format makes it easy to define user-specified stress factors in order to test particular stress scenarios. To test an annual scenario in which model parameters are changed by a factor s, a single-quarter stress factor of approximately s^{1/4} is a reasonable starting point. As an illustration, the results for an extreme economic stress scenario are given in Section 4.2.1.
Stress factors that manipulate data directly can also be accommodated. Dedicated programming is the simplest way to accomplish this task, although it is possible to supply suitable formulae in a spreadsheet. Appendix B shows how it can be done. The results for an example in which only the largest losses are scaled are given in Section 4.2.1.

Choice of Distribution
The analysis in the preceding sections has concentrated on the LogNormal distribution. In principle, an alternative fat-tailed distribution could be used. The issue of which to use is not simply a matter of using a best-fit distribution. Some distributions provide better differentiation between predicted losses under "no stress" and stressed economic conditions. That is largely due to the value of the empirical standard deviation in a series of identical runs. The principal source of variation is the stochastic components inherent in the FSF. The LogNormal distribution has several advantages. The empirical standard deviation in a series of identical runs is small compared to some others (Pareto, for example). Therefore, it is possible to note that a mildly stressed economic case does result in a corresponding mildly inflated capital when compared to applying no stress.
In practice, the evaluation of LogNormal distribution parameters and ordinates is fast and reliable. Instances of failure to converge in parameter estimation are extremely rare. The Pareto (in our case, the type I variety) distribution is particularly notable because it tends to produce a significantly higher VaR than most others. Therefore, a Pareto distribution is unlikely to be a good predictor of future capital. However, if used as a discriminator between predictions using stresses and unstressed losses, it has potential. Section 4.3 contains an evaluation of commonly used fat-tailed distributions.

Results
The results for predictions and correlations follow the notes on data and implementation.

Data and Implementation
Two OpRisk loss data sets were used. The first was extracted from a dedicated OpRisk database and spans the period from January 2010 to December 2019. The Basel risk class CPBP was excluded because of distortions introduced by extreme losses and in accordance with BoE directives; we refer to the result as nonCPBP. The second is a control data set, randomly generated from a lognormal(10, 2) distribution, with one simulated total loss per day for the same period.
Of the supplied BoE economic stress types, the following were considered "relevant". Lagged variables were not used. All are used with two sets of economic data supplied by the BoE. Base data were intended to model mild stress, and ACS (Annual Cyclical Scenario) was intended to model more severe (but not extreme) stress. In addition, an extreme data set was proposed to model extreme economic conditions such as mass unemployment, negative GDP, and severely reduced household income. The COVID-19 scenario models the (surprising) effect on OpRisk losses in the period January-July 2020. There is evidence, from Risk.net [38], that there has been a 50% reduction in OpRisk losses. This is attributed to much reduced transaction volumes. Some increase in fraud was noted, with a slight increase in customer complaints due to Internet and bank branch access problems. The COVID-19 scenario was constructed using simulated transaction levels that represent a 50% reduction in transaction volume in year 1, rising to 70% in year 2.
All calculations were conducted using the R statistical programming language (https://www.r-project.org/ (accessed on 6 January 2021)), with particular emphasis on the lubridate and dplyr packages for date manipulation and data selection, respectively. Mathematica version 12 was used for graphics and dynamic illustrations. Table 1 gives the mean (m) and standard deviation (s) of 25 independent runs in which 1-year and 2-year predictions are made for the nonCPBP and control data under the base and ACS economic scenarios. Empirically, the resulting capital distributions are normally distributed (Figure 2 shows an example); therefore, 95% confidence limits may be calculated using m ± 1.96s. As a rough guide, the confidence limits vary from the means by about 25% for nonCPBP and 17% for control.

Projections: Economic and Scenario Stress
In this section, we show the projection results for both the nonCPBP and control data sets under the BoE's economic stress scenarios (base and ACS). Two scenario projections are also shown. The extreme scenario models a severe downturn in economic conditions, much harsher than those implied by the ACS case; it was derived by exaggerating the ACS data. The stress factors for the base, ACS, and extreme cases were calculated using Equation (6). The upper quartile scenario models an extreme condition by inflating the largest losses by 25%; the largest losses are known to have a significant effect on regulatory capital (see [8]). Technically, the stress factors for the upper quartile case were supplied by reading a spreadsheet containing instructions to manipulate the selected losses as required. Figure 5 shows the 2-year nonCPBP projection details (quarters 21-29) for all cases. The downward trend for the COVID-19 scenario and the steep rise in the extreme economic scenario are both prominent.
The plots in Figure 5 illustrate an inherent volatility in the historic data. In all cases, the predicted profiles have the same shape, and the effect of the base case is small. The effect of the ACS case is only significant in year 2. The extreme scenario is much more pronounced than the upper quartile scenario, in which only the most extreme losses are inflated. The downward trend for the COVID-19 scenario only becomes marked in the second year. Figure 6 shows the equivalent projections for the control data. Those plots give a better understanding of the effect of the various economic cases, since the "historic" data profile is essentially non-volatile. The profiles follow the same pattern as for the nonCPBP data.

Table 2 shows a three-year summary in terms of percentage changes relative to a "no change" state for the base, ACS, extreme, and COVID-19 cases. For the first year, only a mild increase is indicated, effectively showing that the BoE's economic projections are relatively modest. Appendix A gives a further view of the nonCPBP projections in the form of confidence surfaces. Appendix D gives a first-order approximation for the confidence bounds for capital, implemented in symbolic terms using Mathematica; the results show that those bounds are well-conditioned with respect to the stress factor parameter, λ. Table 3 shows the percentage changes using global warming data [37] as the source of stress. The changes relative to the "no stress" case are very small, showing that global warming does not make a significant contribution as a stress factor.

Results Using Other Distributions
To test the effect of distributions other than LogNormal on the FSF, the analysis in Section 3 was repeated, substituting a range of alternative distributions for the LogNormal. It is well known in OpRisk circles that, even if more than one distribution fits the data (in the sense that a goodness-of-fit test is passed), the calculated capitals can be very different. Therefore, two comparisons were made. First, the percentage changes in calculated capitals under stressed and unstressed conditions after 1 and 2 years were noted. Second, the 5-year data windows in Figure 1 were tested for goodness-of-fit (GoF) to selected distributions.
The criteria to be evaluated for each distribution tested are as follows:

1. Does the model predict a sufficient distinction between the "no stress" case and stresses due to the base and ACS scenarios in both 1 and 2 years?
2. As many as possible of the data sets tested should pass an appropriate GoF test.
3. Are predicted capitals consistent with the capital calculated using empirical losses only?

We expect capitals predicted using a distribution to be greater than capitals calculated using empirical losses only, since the empirical capital is limited by the maximum loss. Capitals calculated from distributions are theoretically not bounded above. However, they should not be "too high": as a rule of thumb, 5 times the empirical capital would be "too high". Table 4 shows the results of a comparison of predictions in the "no stress", base, and ACS cases.

Table 4. One- and two-year percentage changes in capital relative to "no stress" using the base and ACS economic scenarios with nonCPBP data. The percentage changes indicate increasing capital from year to year and greater increases for strong economic stress (ACS) relative to mild stress (base).

Table 4 shows that the LogNormal and Weibull distributions both provide a reliable and sufficient distinction between stressed and unstressed conditions. They also provide accurate parameter estimates and run quickly. The LogNormal year 2 prediction is much larger than the Weibull equivalent, as might be expected under severe stress. However, year 2 predictions are less reliable for all distributions.

The response from the GoF tests shows that LogNormal fits are the most appropriate. Twenty-one 5-year windows are available in the construct summarised in Figure 1, and the TNA GoF test was applied to each. Details of the TNA test may be found in [5]. This test was formulated specifically for OpRisk data and has two main advantages over alternatives. First, it is independent of the number of losses. Second, the value of the test statistic at confidence level c%, T_c, is a direct and intuitive measure of the quality of the fit: zero indicates a perfect fit. The 5% two-tailed critical value is 0.0682; therefore, values of T_c in the range (0, 0.0682) are GoF passes at 5%. Table 5 shows a comparison of the GoF statistics for the range of fat-tailed distributions tested. The values presented show that the LogNormal distribution is a very good fit for all 21 of the 5-year periods and that other distributions are viable. In particular, the Gamma and LogGamma distributions are also good fits for all of those periods. Others are poorer fits, either because of the number of satisfactory fits or because of the GoF statistics themselves. The mean TNA value for the Gamma distribution exceeds that for the LogNormal, making it less preferable. Although the mean TNA value for the LogGamma distribution is less than the corresponding LogNormal value, 13 of the 21 LogNormal passes were also significant at 1%, compared with only 3 for LogGamma. The column "Consistency with Empirical VaR" shows whether the VaR estimates are "too small" or "too large" (criterion 3 at the start of this subsection); distributions that produce such VaRs can be rejected. In addition, parameter estimation for LogNormal distributions is much more straightforward than for LogGamma distributions. Therefore, LogNormal is the preferred distribution. Table 5. Goodness-of-fit for historic nonCPBP data.
Column "GoF passes at 5%" shows the number of GoF passes out of 21. Column "Mean TNA value" shows the mean of the corresponding values of the TNA statistic. "N"/"Y" in column "Consistency with Empirical VaR" means that the calculated VaR is inconsistent/consistent with the empirical VaR.

Therefore, the LogNormal distribution is optimal for use in the FSF because it provides the best fit to the empirical data and an appropriate distinction between the unstressed and stressed cases.
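The pass/fail rule for the TNA statistic can be sketched as follows. The threshold 0.0682 is the 5% critical value quoted above; the example statistics are hypothetical, and the TNA statistic itself is defined in [5].

```python
def gof_passes(t_values, critical=0.0682):
    """Count TNA statistics that pass the 5% two-tailed GoF test.

    Values in (0, critical) are passes; zero would indicate a perfect fit.
    """
    return sum(1 for t in t_values if 0 < t < critical)

# Hypothetical TNA values for five data windows (not the paper's statistics)
t_stats = [0.010, 0.050, 0.070, 0.030, 0.002]
n_pass = gof_passes(t_stats)  # 0.070 exceeds 0.0682, so 4 of the 5 pass
```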

Correlation Results
Correlations between economic indicators (quoted quarterly) and nonCPBP OpRisk losses (quarterly loss totals) were assessed using Spearman rank correlation, which is more suitable when the proposed regression relationship is not known to be near-linear. The statistical hypotheses for the theoretical correlation coefficient ρ were null, ρ = 0, and alternative, ρ ≠ 0. The list below shows the results.

• Only two indicators (Bank.Rate and Volatility.Index) were not significantly correlated with nonCPBP loss frequency.
• Eleven significant frequency correlations were at a confidence level of 99% or more.
• Two significant frequency correlations were at a confidence level of between 95% and 99%.
The explanation of these results is straightforward. Many of the historic economic time series show a marked trend with respect to time, and so do the aggregated loss frequency time series. In contrast, the aggregated severity-based time series do not. Associating two trending series inevitably results in a significant correlation.
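The Spearman statistic used here can be computed with the standard d-squared formula. A minimal stdlib-only sketch, assuming no tied values; the two quarterly series are hypothetical, not the paper's data.

```python
def ranks(xs):
    """Ranks 1..n (assumes no tied values)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical quarterly economic indicator and quarterly OpRisk loss totals
indicator = [1.2, 1.5, 1.1, 1.8, 2.0, 2.2, 2.1, 2.5]
losses = [10.0, 12.0, 9.0, 15.0, 14.0, 18.0, 17.0, 21.0]
rho = spearman_rho(indicator, losses)  # strongly positive for these trending series
```

Two trending series produce a large rho, which is exactly the spurious-correlation effect described above.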

Correlation Persistence
The conclusion from the literature review in Section 2 was that, if correlations between economic factors and OpRisk losses exist, they do not persist. We can confirm that conclusion using our data. With a rolling window of length five years, Spearman's rank correlation coefficient was calculated for each of 14 economic factors over the five-year periods with start quarters 1, 2, ..., 21, giving a total of 294 correlations. Table 6 shows that only a small percentage of severity correlations persist for 5 years, whereas a high percentage of frequency correlations do. Correlations therefore depend critically on the time period selected. The two significant severity correlations both occur within the five-year periods starting at quarters 1 and 2, a period in which data collection was less reliable than in later quarters. The significant frequency correlations are more numerous in the five-year periods starting at quarters 8 and 9, which is when a marked decrease in loss frequency started to become apparent. The correlations that do exist arise from associating pairs of trending time series without any explicit justification for those associations.

Table 7 summarises the data in Table 6. It shows the percentage of 5-year periods in which significant correlations were observed for loss severity and frequency. The difference between severity and frequency correlations is striking. Table 7. Percentage of significant (out of 294) 5-year rolling window correlations: nonCPBP and control data. There are few severity correlations but many frequency correlations.
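The rolling-window procedure can be sketched as follows: one Spearman coefficient per 20-quarter (5-year) window. The series here are hypothetical; the paper's construct gives 21 windows from 40 quarters.

```python
def spearman(x, y):
    """Spearman rho via the d-squared formula (assumes no tied values)."""
    n = len(x)
    rx = {v: i + 1 for i, v in enumerate(sorted(x))}
    ry = {v: i + 1 for i, v in enumerate(sorted(y))}
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

def rolling_rhos(econ, losses, window=20):
    """One rho per 5-year (20-quarter) window, start quarters 1, 2, ..."""
    return [spearman(econ[s:s + window], losses[s:s + window])
            for s in range(len(econ) - window + 1)]

# Hypothetical quarterly series over 40 quarters (not the paper's data)
econ = [i + (0.3 if i % 2 else -0.3) for i in range(40)]
loss = [40 - i + (0.2 if i % 3 else -0.2) for i in range(40)]
rhos = rolling_rhos(econ, loss)  # 21 windows
# A correlation "persists" only if it remains significant in every window.
```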

Inappropriate Correlations
We caution against calculating correlations of OpRisk losses with potential stress types for which no relationship with those losses is apparent. Many such series can be found in the World Bank commodity database [39], and many of them have significant correlations with the nonCPBP severity data. The following commodities were correlated with the nonCPBP OpRisk losses at 5% significance: agricultural raw materials; Australian and South African coal; and Australian and New Zealand beef, lamb, groundnuts, sugar, uranium, milk, and chana. Furthermore, the commodity data for aluminium and cotton were positively correlated at 1% significance with the same nonCPBP data. The variety and number of commodities in these lists possibly indicate some sort of underlying factor, as yet unknown. Nevertheless, twelve significant correlations out of 85 is many more than would be expected by chance at 5% significance: one would expect 5% of 85, approximately 4.
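The "more than expected by chance" claim can be checked with a binomial tail probability: under the null, each of the 85 tests is significant with probability 0.05. A stdlib-only sketch:

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more false positives."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 85 correlation tests at the 5% level: about 0.05 * 85 ~ 4 significant
# results are expected by chance; observing 12 is far less likely.
expected = 0.05 * 85
tail = prob_at_least(12, 85, 0.05)  # well below 0.05
```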

Discussion
Overall, OpRisk losses are event-driven and are subject to particular economic shocks. Consequently, and perhaps conveniently, the need to look for correlations between OpRisk losses and stress types can be removed. A general principle should be to only seek correlations if causal factors can be identified. Therefore, doubt must be cast on the validity of the Fed's CCAR model [29], which uses correlations between loss severity and economic factors to predict regulatory capital.
The Basel Committee has recently issued further guidance on measuring the resilience of the banking system to economic shock [40], largely in response to the COVID-19 pandemic. Whilst the general measures proposed (robust risk management, anticipation of capital requirements, vulnerability assessment, etc.) are sensible, that guidance is notable in that it still does not say how capital should be calculated. The proposed FSF method enables capital to be calculated as a quantified response to economic or other data. Although the COVID-19 pandemic represents a very severe economic downturn, indications are that it will not inflate OpRisk losses.

Using calculated OpRisk capital for another purpose represents a significant departure from accepted practice in financial risk. Hitherto, the reserve for each risk type has been applied only to that risk type. For financial prudence, we recommend a pooling of reserves, each calculated independently in response to external conditions.

In Section 4.4, we established that very few significant 5-year severity correlations between OpRisk losses and economic factors can be found, and those that do exist do not persist. This finding is politically sensitive, since the US CCAR process depends on the existence of such severity correlations. Informally, national regulatory authorities imply that correlations do exist by supplying economic data in the context of financial risk regulation. It is possible that a significant and persistent correlation could be found for an aggregation of all national OpRisk loss severities with economic factors. However, such aggregations are known only to national regulatory authorities, and they would not disclose the data. Regulations do not dwell on loss frequencies, which are explainable, provided that risk controls become increasingly stringent.

Economic Effects of Operational Risk: Intuitions
The economic effects on operational risk are not always clear, which is the fundamental reason why only sporadic OpRisk/economic factor correlations are detected: OpRisk losses are driven by physical events, behavioural factors, and policy decisions. Causal relationships can be suggested but are not backed by evidence. The examples below illustrate these points.
DPA could increase with social unrest in stressed economic circumstances, but correlations have not been observed. Damage to cabling because of building work, fire, or flooding is much more likely. The same applies for other event-driven risk classes. BDSF, for example, is more likely to be affected by software problems. Anecdotal evidence suggests that EF (but not IF) does increase in stressed economic circumstances. The suggested reason is that fraudsters spot more opportunities. Informally, the amount of fraud can be governed solely by the extent of anti-fraud measures. Some is tolerated because customers disapprove of anti-fraud measures that are too severe. The same principle applies to CPBP and EPWS, which are largely controlled by a bank's response to customers and employees. CPBP and EPWS are driven by the bank's operational procedures (such as the number of telephone operators employed) and its policies (such as its response to illegal activity).
The COVID-19 pandemic has had a very significant effect on economic indicators, but the effects on OpRisk losses will not become apparent until the first quarter of 2021. It is likely that provisions will be put in place for safety measures in bank branches and for IT infrastructure to enable "working from home". Economic scenarios for stress testing that incorporate COVID-19 effects will probably appear in early 2022.

Overall Assessment of the FSF
The following list is a brief summary of the advantages and disadvantages of our proposed framework.

1. Advantages:
(a) It is flexible: stress factors can be added or removed easily.
(b) Correlation is not assumed.
(c) Using projected losses eliminates idiosyncratic shocks, since it assumes a "business as usual" environment.
(d) Generating projected losses allows for modification of all or some of the projected losses, which is more flexible than modifying single capital values.
(e) The degree of stress for any given stress type can be geared to achieve any (reasonable) required amount of retained capital. This is an objective way to calculate capital, based on factors that could affect capital.
(f) The framework is able to detect the relative effects of the stress types considered. Some have little effect (e.g., global warming), whereas others have a significant effect (e.g., increasing the largest loss).

2. Disadvantages:
(a) It is not entirely straightforward to include stresses that act on parts of the projected data only. Specialised procedures must be used to generate the necessary stress factors.
(b) Generating projected losses requires a loss frequency of approximately 250 per year; accuracy is lost with fewer losses. For that reason, the aggregate risk class nonCPBP was used to obtain an overall view of the effect of stress.
(c) Special treatment is needed for stresses that have a decreasing trend with time but are thought to increase capital. Decreasing stress values would result in stress factors that are less than 1. We suggest replacing a stress factor λ_i,t by 1/λ_i,t in such cases. Stresses in this category have to be identified in advance of calculating stress factors.
(d) There is no objective way to decide which stress types should be used.
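The suggested 1/λ replacement for decreasing stresses can be sketched as follows. This is an illustrative ratio-to-baseline factor only; the paper's Equation (6) defines the actual stress factor.

```python
def stress_factor(current, baseline, decreasing_trend=False):
    """Illustrative stress factor as a ratio to a baseline value.

    For stresses that trend downwards over time but are thought to increase
    capital, the raw factor lambda is replaced by 1/lambda, as suggested above,
    so that the resulting factor exceeds 1.
    """
    lam = current / baseline
    return 1.0 / lam if decreasing_trend else lam

rising = stress_factor(1.2, 1.0)                           # 1.2
falling = stress_factor(0.8, 1.0, decreasing_trend=True)   # inverted, exceeds 1
```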

Guidance for Practitioners, Modellers, and Regulation
The results presented in Sections 4.2, 4.2.1, and 4.2.2 have highlighted points of note for regulation, for modelling, and in practice.

Regulation
European regulators are unlikely to specify precisely how stress testing should be conducted, and the Fed is unlikely to change its overall approach to the CCAR process. However, regulators can note that significant correlations cannot always be found. European banks are therefore likely to remain free to implement whatever stress testing method they deem appropriate. The FSF provides such a method that avoids explicit use of non-significant correlations.

Modelling
A primary issue is fitting a distribution to data. Distribution parameters are typically calculated using the Maximum Likelihood method. The major problems are estimating suitable initial parameter values, speed of convergence, and failed convergence (in which case, the initial estimates have to be used). Alternative parameter calculation methods sometimes produce very different parameter values. An example is the R packages Pareto and EnvStats, both of which can be used to model a Pareto distribution: the latter is very slow and can produce parameter values that result in VaR values many orders of magnitude greater than the largest empirical losses.
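For the LogNormal, Maximum Likelihood has a closed form (the mean and standard deviation of the log-losses), which sidesteps the convergence issues above. A stdlib-only sketch on simulated control-style data:

```python
import math
import random

def fit_lognormal(losses):
    """Closed-form MLE for a lognormal: mean and sd of the log-losses.

    Distributions without closed-form MLEs need numerical optimisation, where
    poor initial values can slow or break convergence; moment-based estimates
    like these are a common source of starting points.
    """
    logs = [math.log(x) for x in losses]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

random.seed(1)
sample = [random.lognormvariate(10, 2) for _ in range(1000)]
mu, sigma = fit_lognormal(sample)  # recovers roughly (10, 2)
```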
A further problem to be avoided is that some distributions can generate huge losses in random samples, skewing the VaR upwards to such an extent that a few sample points completely dominate it. This problem occurs for the Pareto, LogLogistic, and G&H distributions and is exacerbated in small samples. A solution is to detect and remove those items from the sample.
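The detect-and-remove step can be sketched as a cap on simulated losses before taking the empirical quantile. The cap of 5 times a hypothetical largest empirical loss mirrors the rule of thumb stated earlier; all values here are illustrative.

```python
import random

def monte_carlo_var(sample, level=0.999, cap=None):
    """Empirical VaR from a simulated sample, optionally discarding
    implausibly large draws (those above `cap`) so that a handful of
    extreme sample points cannot dominate the estimate."""
    if cap is not None:
        sample = [x for x in sample if x <= cap]
    sample = sorted(sample)
    return sample[int(level * len(sample)) - 1]

random.seed(7)
sims = [random.paretovariate(1.1) for _ in range(100_000)]  # heavy-tailed draws
raw_var = monte_carlo_var(sims)
# Cap at 5x a hypothetical largest empirical loss of 100
trimmed_var = monte_carlo_var(sims, cap=5 * 100.0)
```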

Practice
If model parameter choices are left to practitioners, details such as which distribution to fit and how many Monte Carlo cycles to use can affect the results. Even if those choices are made automatically, practitioners should be aware that producing curves such as those in Figures 5 and 6 is time-consuming. Repeat runs are needed to obtain consistent results, and we recommend at least 25, for which several hours are likely to be needed. Practitioners should also note the spread of predictions and always report results in terms of confidence bounds (similar to those in Figure 3). They should be aware of two further points. First, long-term predictions are less reliable than short-term predictions. Second, the choice of stress data (economic, fraud, etc., or user-defined) must be appropriate.

Conclusions
The principal conclusions are as follows:

1. Prediction of OpRisk capital based on economic factor correlations is unsafe, as statistically significant correlations do not always exist. The proposed FSF provides a viable alternative to naive use of correlations in the context of OpRisk/economic factor stress testing.

2. The FSF works by calculating stress factors based on changes in economic factors (or any other stress type) and by applying them to projected OpRisk losses. As such, it acknowledges that OpRisk capital should increase in response to economic stress.

3. The FSF is flexible and responsive. It allows for investigation using appropriate time series (not only the BoE economic data) as well as user-defined scenarios.

4. The FSF has a disadvantage in that it requires a minimal volume of data to generate the necessary samples and predictions. Therefore, using individual Basel risk classes is not always feasible.

Further Work
Further work on reverse stress testing is already well advanced and uses the same sampling and prediction techniques described in the FSF algorithm (see Algorithm 1). Suitable risk factors can be found quickly and reliably using either a binary search or Bayesian optimisation. We also intend to expand the suggestion in Appendix C to recast OpRisk scenarios from their usual "severity + horizon" form into the stress factor format used in the FSF.

Conflicts of Interest: The author declares no conflict of interest.

Appendix A

Figure A1 shows upper and lower confidence surfaces for the 2-year capital projections using the FSF. Each surface comprises a projected capital calculation for particular values of a stress factor λ ∈ [1, 2] (representing zero to 100 percent stress) and a quarter. The main features are listed below.

1. Very uneven surfaces due to data volatility.
2. Increasing capital with increasing time, apart from quarters 27 and 28, when there were much reduced losses.
3. Small overall increases in capital with increasing stress.
4. Non-divergent upper and lower confidence surfaces, indicating a stable stochastic error component.

... not empty. Denote the number of elements in that set by n_M. Next, calculate a parameter µ, which is the expected number of losses exceeding αM in the time window of y years. That number serves as the parameter for a Poisson model (Equation (A1)). The Poisson probability p that there will be at least one loss of impact M in the horizon period H is then p = 1 − e^(−µ). The probability p can then be expressed as a quarterly (hence the factor 4) stress factor λ. The same value λ should then be used at each time step in the FSF process.
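The Poisson exceedance step can be sketched as follows. The sketch assumes the standard form p = 1 − exp(−µ) with µ scaled from the observed window to the horizon; the paper's exact Equations (A1) onwards, including the quarterly λ conversion, are not reproduced here, and the loss values are hypothetical.

```python
import math

def exceedance_probability(losses, M, alpha, window_years, horizon_years):
    """Poisson probability of at least one loss of impact M in the horizon.

    n_M losses exceed alpha*M in the observation window, so the Poisson rate
    over the horizon is taken as mu = n_M * horizon_years / window_years,
    giving p = 1 - exp(-mu) (assumed standard form; see the text's Eq. (A1)).
    """
    n_M = sum(1 for x in losses if x > alpha * M)   # count of exceedances
    mu = n_M * horizon_years / window_years         # expected exceedances in H
    return 1 - math.exp(-mu)

# Hypothetical losses over a 5-year window; 1-year horizon, alpha = 0.8
p = exceedance_probability([5e6, 2e7, 9e7, 1.2e8], M=1e8, alpha=0.8,
                           window_years=5, horizon_years=1)
```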

Appendix D. Sensitivity of Capital with Respect to the Stress Factor
Symbolic computation can be used effectively to study the sensitivity of capital with respect to the stress factor, which is controlled by the parameter λ t of Equation (6). In particular, we consider the confidence bounds of Appendix A. In order to simplify the notation slightly, the subscript t in the stress factor λ t is omitted in what follows. It should be understood that the symbol "λ" refers to the stress factor applied at some time t and that the same applies for other parameters introduced.
Rao [41] has shown that, if Q is a random variable that represents an observation q of the p-quantile of a distribution with density f(·), derived from a sample of size n, then Q has the normal distribution given by Equation (A4):

Q ∼ N(q, p(1 − p)/(n f(q)²))  (A4)

Therefore, the upper and lower symmetric c% confidence bounds are given, respectively, by C_U = q + z_c R and C_L = q − z_c R, where R is the Rao standard error derived from Equation (A4) and z_c is the appropriate normal ordinate for the percentage point c (for example, 1.96 when c = 95%). It is easy to derive symbolic first-order approximations for C_U and C_L: for a small change δλ in λ, the corresponding first-order changes in C_U and C_L are δλ ∂C_U/∂λ and δλ ∂C_L/∂λ, respectively. Now consider the case of a random variable that has a lognormal mixture distribution with mixture parameter r and density f(q) = rφ(q, m, s) + (1 − r)φ(λq, m, s); r ∈ (0, 1), λ > 0 (φ is the normal density function). This density corresponds to the control data discussed in Section 4.1.
Using Mathematica, if the Rao variance (Equation (A4)) is defined in a procedure Rao[m_, s_, lambda_, r_, u_, n_, p_], then the implementation of Equations (A5) and (A6) is as follows:

R = Rao[m, s, lambda, r, u, n, p]
CU = q + zc R
dCU = dlambda D[CU, lambda]
CL = q - zc R
dCL = dlambda D[CL, lambda]

The expressions for C_U and C_L derived in Mathematica are summarised in Equation (A7). Both C_U and C_L are dominated by the additive term q.

∂R/∂λ = (2π p(1 − p)/n) (q(r − 1)/s) [g(λq)(s² − m + log(λq))/(h²λ²)]  (A8)

The right-hand-most bracket gathers all terms that contain the parameter λ and, therefore, determines the λ-variation of ∂R/∂λ and, hence, of ∂C_U/∂λ and ∂C_L/∂λ. With parameter values consistent with those normally observed, ∂R/∂λ ∼ O(10²) or O(10³), depending on the value of λ. This term is several orders of magnitude less than the magnitude of C_U and C_L (typically O(10⁷)). Therefore, and especially when δλ is small, the incremental derivative component makes only a marginal change to the capital confidence limits. Consequently, C_U and C_L are well-conditioned with respect to λ.